### Description

In this exercise, you will use the Custom Vision service to train an object detection model that can detect and locate three classes of fruit (apple, banana, and orange) in an image.

### Prerequisites

- Create an Azure Custom Vision resource
    - (Technically, this creates two resources: one for training and one for prediction.)
- Create a Custom Vision project in the Custom Vision portal
    - You could do this in code via the SDK: [Link Here](https://learn.microsoft.com/en-us/azure/ai-services/custom-vision-service/quickstarts/object-detection?tabs=windows%2Cvisual-studio&pivots=programming-language-python)
- Upload and tag images (supervised learning)
    - Similarly, you could also do this via code (refer to the link above)
    - After the images have been uploaded, select the first one to open it.
    - Hold the mouse over any object in the image until an automatically detected region is displayed
    - When the region surrounds the object, add a new tag with the appropriate object type (apple, banana, or orange)

- Train the model (or wait until you upload files using the SDK)
    - Instructor note: I did most of the prework, leaving one image to be tagged in the Studio

- Look at the model's performance and test the model with some images you find online

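- Create your `.env` file with the values the notebook reads: `CUSTOM_IMAGE_PREDICT_ENDPOINT`, `CUSTOM_IMAGE_PREDICT_KEY`, and `ModelName`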

### Use the Custom Vision SDK to upload images

You can use the UI in the Custom Vision portal to tag your images, but many AI development teams use other tools that generate files containing information about tags and object regions in images. In scenarios like this, you can use the Custom Vision training API to upload tagged images to the project.

#### You Will Need
- The Project ID (from the project settings in the Custom Vision portal; you can also fetch it with the SDK)
- The Custom Vision endpoint and key for the training resource

Then refer to the `04b. Add_tagged_images.ipynb` notebook.
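
As a rough illustration of the kind of file a tagging tool might produce, the sketch below parses a small JSON document into per-image region tuples and checks that every box stays inside the image. The JSON layout, the `image11.jpg` filename, and the `load_regions` helper are assumptions made for this example; your tool's format may differ. Each tuple carries the same four normalized values that the training SDK expects when you build its region objects.

```python
import json

# Hypothetical annotation content; the real layout depends on your tagging tool.
SAMPLE = """
{
  "files": [
    {
      "filename": "image11.jpg",
      "tags": [
        {"tag": "orange", "left": 0.501782656, "top": 0.141307935,
         "width": 0.30014348, "height": 0.421263933}
      ]
    }
  ]
}
"""


def load_regions(text):
    """Return {filename: [(tag, left, top, width, height), ...]} from JSON text."""
    regions = {}
    for entry in json.loads(text)["files"]:
        boxes = []
        for t in entry["tags"]:
            left, top = t["left"], t["top"]
            width, height = t["width"], t["height"]
            # Normalized values are fractions of the image size, so the box
            # must start inside the image and must not run past its edges.
            if min(left, top, width, height) < 0.0:
                raise ValueError(f"negative value in {entry['filename']}")
            if left + width > 1.0 or top + height > 1.0:
                raise ValueError(f"box leaves the image in {entry['filename']}")
            boxes.append((t["tag"], left, top, width, height))
        regions[entry["filename"]] = boxes
    return regions


regions = load_regions(SAMPLE)
for filename, boxes in regions.items():
    for tag, left, top, width, height in boxes:
        print(filename, tag, left, top, width, height)
```

Validating the regions before uploading means a malformed box fails locally with a clear message instead of during the upload call.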

### Bounding Box Field Definitions

---

**1. `left`**  
This is the **normalized horizontal coordinate** of the top-left corner of the bounding box.  
- Measured as a fraction of the total image width.  
- Example: `left = 0.5` means the box starts halfway across the image horizontally.  
- So `left = 0.117563143` means about 11.76% of the way across the image from the left edge.

---

**2. `top`**  
This is the **normalized vertical coordinate** of the top-left corner of the bounding box.  
- Measured as a fraction of the total image height.  
- Example: `top = 0.141307935` means the box starts about 14.13% down from the top edge.

---

**3. `width`**  
This is the **normalized width** of the bounding box.  
- Expressed as a fraction of the image’s total width.  
- Example: `width = 0.3` means the box spans 30% of the total image width.

---

**4. `height`**  
This is the **normalized height** of the bounding box.  
- Expressed as a fraction of the image’s total height.  
- Example: `height = 0.421263933` means the box is about 42.13% of the total image height.

---

### Putting It Together

If your image is **1000 pixels wide** and **800 pixels tall**, then for the `"orange"` box:

- `left = 0.501782656` → pixel **501.78** (of 1000) from the left  
- `top = 0.141307935` → pixel **113.05** (of 800) from the top  
- `width = 0.30014348` → box width **300.14 pixels**  
- `height = 0.421263933` → box height **337.01 pixels**
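
The same arithmetic can be sketched in a few lines of Python. The `to_pixel_box` helper and the dictionary layout are hypothetical names chosen for this illustration; the 1000 × 800 image size is the one assumed in the example above.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized bounding box to pixel values."""
    return {
        # Horizontal values scale with the image width.
        "left": box["left"] * image_width,
        "width": box["width"] * image_width,
        # Vertical values scale with the image height.
        "top": box["top"] * image_height,
        "height": box["height"] * image_height,
    }


orange = {
    "left": 0.501782656,
    "top": 0.141307935,
    "width": 0.30014348,
    "height": 0.421263933,
}

pixels = to_pixel_box(orange, image_width=1000, image_height=800)
for name in ("left", "top", "width", "height"):
    print(f"{name}: {pixels[name]:.2f}")
```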


### Test A Model

- Train the model first, using the Train option in the Custom Vision portal.
- In the Custom Vision portal, when training has finished, review the Precision, Recall, and mAP performance metrics - these measure the prediction accuracy of the object detection model, and should all be high.
- At the top right of the page, click Quick Test, and then in the Image URL box, type `https://aka.ms/test-fruit` and click the quick test image (➔) button.
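
To make the Precision and Recall metrics concrete, here is a minimal sketch that computes them from counts of correct detections, false alarms, and missed objects. The `precision_recall` function and the example counts are made up for illustration; the portal computes its own figures from your training data. mAP is not computed here, since it averages precision over recall levels and tags.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts."""
    predicted = true_positives + false_positives  # everything the model flagged
    actual = true_positives + false_negatives     # everything really in the images
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 8 correct boxes, 2 false alarms, 4 missed fruit.
precision, recall = precision_recall(8, 2, 4)
print(f"precision={precision:.2f} recall={recall:.2f}")
```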

#### Use the SDK to call the object detector from a client application

Publish the object detection model


- In the Custom Vision portal, on the Performance page, click 🗸 Publish to publish the trained model with the following settings:
    - Model name: fruit-detector
    - Prediction Resource: The prediction resource you created previously which ends with “-Prediction” (not the training resource).
- Make sure to note the prediction service endpoint and key.

In [1]:
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials
from matplotlib import pyplot as plt
from PIL import Image, ImageDraw, ImageFont
import numpy as np
import os
from dotenv import load_dotenv

In [2]:
load_dotenv()

True

In [None]:
prediction_endpoint = os.getenv('CUSTOM_IMAGE_PREDICT_ENDPOINT')
prediction_key = os.getenv('CUSTOM_IMAGE_PREDICT_KEY')
project_id = "97462dc8-2998-4044-951f-ed14d6ad89d9"
model_name = os.getenv('ModelName')
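
Note that `os.getenv` returns `None` when a variable is missing, so a typo in the `.env` file only shows up later as an authentication or request error. One possible guard, sketched below, is a small helper that fails immediately. The `require_env` name and the `EXAMPLE_SETTING` variable are hypothetical and exist only for this illustration.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable {name}; check your .env file")
    return value


# Stand-in variable so the example runs without a real .env file.
os.environ["EXAMPLE_SETTING"] = "example-value"
print(require_env("EXAMPLE_SETTING"))
```

In the notebook, the same helper could wrap the lookups for `CUSTOM_IMAGE_PREDICT_ENDPOINT`, `CUSTOM_IMAGE_PREDICT_KEY`, and `ModelName`.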

In [None]:
credentials = ApiKeyCredentials(in_headers={"Prediction-key": prediction_key})
prediction_client = CustomVisionPredictionClient(endpoint=prediction_endpoint, credentials=credentials)

In [None]:
# Detect objects in the image
image_file = 'images/Object Detection/test_images/produce.jpg'
#print('Detecting objects in', image_file)
with open(image_file, mode="rb") as image_data:
    results = prediction_client.detect_image(project_id, model_name, image_data)

In [None]:
results

In [None]:
# Loop over each prediction
for prediction in results.predictions:
    # Get each prediction with a probability > 50%
    if (prediction.probability*100) > 50:
        print(prediction.tag_name)
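
The same filtering can be wrapped in a reusable function with an adjustable threshold. This is only a sketch: `confident_predictions` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that mimic the `tag_name` and `probability` attributes used in the loop above, so the example runs without calling the service.

```python
from types import SimpleNamespace


def confident_predictions(predictions, threshold=0.5):
    """Keep predictions above the threshold, most confident first."""
    kept = [p for p in predictions if p.probability > threshold]
    return sorted(kept, key=lambda p: p.probability, reverse=True)


# Made-up predictions standing in for results.predictions.
sample = [
    SimpleNamespace(tag_name="orange", probability=0.81),
    SimpleNamespace(tag_name="banana", probability=0.32),
    SimpleNamespace(tag_name="apple", probability=0.97),
]

for p in confident_predictions(sample):
    print(f"{p.tag_name}: {p.probability:.1%}")
```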

#### We can even annotate our images

In [None]:
def save_tagged_image(source_path, detected_objects):
    #Load the image using Pillow
    image = Image.open(source_path)
    # Pillow reports size as (width, height); this also works for grayscale images
    w, h = image.size
    # Create a figure for the results
    fig = plt.figure(figsize=(8, 8))
    plt.axis('off')

    # Display the image with boxes around each detected object
    draw = ImageDraw.Draw(image)
    lineWidth = int(w/100)
    color = 'magenta'
    for detected_object in detected_objects:
        # Only show objects with a > 50% probability
        if (detected_object.probability*100) > 50:
            # Box coordinates and dimensions are proportional - convert to absolutes
            left = detected_object.bounding_box.left * w 
            top = detected_object.bounding_box.top * h 
            height = detected_object.bounding_box.height * h
            width =  detected_object.bounding_box.width * w
            # Draw the box
            points = ((left,top), (left+width,top), (left+width,top+height), (left,top+height),(left,top))
            draw.line(points, fill=color, width=lineWidth)
            # Add the tag name and probability
            plt.annotate(detected_object.tag_name + ": {0:.2f}%".format(detected_object.probability * 100),(left,top), backgroundcolor=color)
    plt.imshow(image)
    outputfile = 'images/Object Detection/test_images/annotated_test_image.jpg'
    fig.savefig(outputfile)
    print('Results saved in', outputfile)

In [None]:
# Create and save an annotated image
save_tagged_image(image_file, results.predictions)