# Lesson 33 Object Detection & More
NOTE: This lesson assumes Python 3.12.12 so that you can use `label-studio`.

**However**, all YOLO models work just fine on the latest 3.14 version of Python. For using Roboflow to annotate images, I recommend Python 3.14.

![CLDvS](https://gperdrizet.github.io/FSA_devops/assets/lesson_33_images/CLDvS.png)

- **Classification** *(Is there a cat or no cat?)*
    - You could perform multi-class or multi-label too!
        - **MC**: [0.90: "Cat", 0.04: "Dog", 0.01: "Rat"]
        - **ML**: {0.99 : 'Animal', 0.7 : 'Cat'}
---
- **Localization** *(Where is the Cat?)*
    - You could even find multiple Cats!! 
        - Cat_1 : [x1, x2, y1, y2]
        - Cat_2: [x3, x4, y3, y4]...
---
- **Object Detection** *(Where are the Cats? Oh, yeah I can locate dogs too!)*
    - Obj_1: 
        - Localization: [x1, x2, y1, y2]
        - Classification: [0.90: "Cat", 0.04: "Dog", 0.01: "Rat"] : "Cat"
    - Obj_2: 
        - Localization: [x3, x4, y3, y4]
        - Classification: [0.01: "Cat", 0.98: "Dog", 0.01: "Rat"] : "Dog"
    - Obj_3: 
        - Localization: [x5, x6, y5, y6]
        - Classification: [0.00: "Cat", 0.95: "Dog", 0.05: "Rat"] : "Dog"
---
- **Image Segmentation** *(I'm creating a scrap-book of cute animals!)*
    - =^..^= 
        - Classification: [0.90: "Cat", 0.04: "Dog", 0.01: "Rat"] : "Cat" 
    - ˁ(OᴥO)ˀ
        - Classification: [0.01: "Cat", 0.98: "Dog", 0.01: "Rat"] : "Dog"
    - ) _ _ __/°°¬      
        - Classification: [0.00: "Cat", 0.95: "Dog", 0.05: "Rat"] : "Dog"
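
The Object Detection entries above pair a box with a set of class scores and then keep the top-scoring class. A minimal sketch of that idea in plain Python follows; the `Detection` class and the box coordinates are made up for illustration and are not part of any library.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container: one located object plus its class scores."""
    box: tuple    # placeholder coordinates, invented for this example
    scores: dict  # class name -> probability

    @property
    def label(self):
        # The predicted class is the one with the highest score
        return max(self.scores, key=self.scores.get)


detections = [
    Detection((10, 20, 110, 140), {"Cat": 0.90, "Dog": 0.04, "Rat": 0.01}),
    Detection((150, 30, 260, 170), {"Cat": 0.01, "Dog": 0.98, "Rat": 0.01}),
    Detection((300, 40, 380, 120), {"Cat": 0.00, "Dog": 0.95, "Rat": 0.05}),
]

labels = [d.label for d in detections]
counts = Counter(labels)  # how many objects of each class were found
print(labels, dict(counts))
```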

## Classification and Localization

### Classification - Qualities 
#### What is in this Image?
Captcha: Select all the images with Muffins.

![Muffins](https://gperdrizet.github.io/FSA_devops/assets/lesson_33_images/select_all_muffings.png)

Dataset Example:
https://universe.roboflow.com/jacob-solawetz/flowers_classification

- Think back to our custom CNN trained on the CIFAR-10 dataset:
    - Classifies an image by the object it most likely contains
    - Only predicts: Are any ___ present?
    - Predictions contain NO spatial coordinates
       
- Models could be trained to perform multi-label predictions

       - ['car', 'red'] --> 'red car'
       - ['lizard', 'green'] --> 'green lizard'
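
One common way to get multi-class versus multi-label behavior is to change how the raw model outputs are turned into probabilities: softmax makes the classes compete, while a per-class sigmoid scores each label independently. The sketch below uses only the standard library; the logits are invented numbers, and the 0.5 threshold is an assumption rather than a rule.

```python
import math


def softmax(logits):
    """Multi-class: scores compete and sum to 1, so one label wins."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Multi-label: each score is an independent yes/no probability."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


# Hypothetical raw model outputs (logits), made up for illustration
mc_labels = ["Cat", "Dog", "Rat"]
mc_probs = softmax([4.0, 1.0, -0.5])
mc_prediction = mc_labels[mc_probs.index(max(mc_probs))]

ml_labels = ["Animal", "Cat", "Red"]
ml_probs = sigmoid([4.6, 0.85, -3.0])
ml_prediction = [name for name, p in zip(ml_labels, ml_probs) if p >= 0.5]

print(mc_prediction, [round(p, 2) for p in mc_probs])
print(ml_prediction, [round(p, 2) for p in ml_probs])
```

Note that the multi-label probabilities do not need to sum to 1, which is why an image can be both 'Animal' and 'Cat' at the same time.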


### Localization - Quantities
#### Where in this Image?
Captcha: Select all the images which contain traffic lights.

![Overthinking](https://gperdrizet.github.io/FSA_devops/assets/lesson_33_images/localize_traffic_lights.png)

- Imagine creating a dataset of one object type, labeled with object coordinates.
    - An image could contain multiple instances of the SAME object:

            Ball_1, Ball_2, ... , Ball_N-1, Ball_N

    - Train a model to predict the coordinates of the object(s) within an image.
    - Compare predictions to the 'true label' coordinates.
- Limitations:
    - Only predicts : Where are ___?
    - Predictions contain NO class information
- Advantages:
    - Model will find all the locations of an object within an image.
    - If you add up how many locations are found, then the model implicitly learns to 'count'.

            Object A : [x1, x2, y1, y2]
            Object B : [x1, x2, y1, y2]  +
            Object C : [x1, x2, y1, y2]  +
            _______________________________
            3 Object Instances in the Image
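
A common way to compare a predicted box to its 'true label' box is Intersection over Union (IoU): overlap area divided by combined area. The sketch below is one possible version, assuming boxes are stored in corner order `(x1, y1, x2, y2)` (the list above writes them as `[x1, x2, y1, y2]`) and assuming predictions are already paired with their true boxes. All coordinates are invented.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up 'true label' and predicted boxes, paired by position
true_boxes = [(10, 10, 50, 50), (60, 20, 100, 60), (120, 30, 160, 70)]
pred_boxes = [(12, 12, 50, 50), (60, 20, 100, 60), (300, 300, 340, 340)]

scores = [iou(p, t) for p, t in zip(pred_boxes, true_boxes)]
instance_count = len(pred_boxes)  # 'counting' is just the number of boxes found

print([round(s, 4) for s in scores], instance_count)
```

An IoU of 1.0 is a perfect match and 0.0 means the boxes do not overlap at all.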


## Detection, Segmentation, Pose Estimation, etc.

### Object Detection
- Detection:
    Cutting out a rectangular area that contains an object using a "Guillotine"
    - Tries to say "here is an object" as quickly as possible
    - Fast; High "*Accuracy*", with Low Area "*Precision*"

#### Dataset used to train YOLO : https://cocodataset.org/#explore

https://docs.ultralytics.com/tasks/detect/

Dataset Example: 
https://universe.roboflow.com/bookspines/book-spines-fi8nq

In [None]:
from ultralytics import YOLO    
import torch
import gc # Garbage Collector

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

In [None]:
# Load a Model
model = YOLO(model='yolo26m.pt')    # Change the size letter [n, s, m, l, x] for a model with more parameters
                                    # Gain 'better' metric performance at the cost of speed; bigger is slower
                                    # Here we use the 'Medium' sized model: m
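
The weight filenames used in this lesson follow a pattern: a model family, a size letter, an optional task suffix, then `.pt`. The helper below is a hypothetical convenience for building those strings, based only on the filenames that appear in this notebook. It does not check that a given combination is actually available for download.

```python
SIZES = ("n", "s", "m", "l", "x")  # smallest to largest

# Suffixes seen in the filenames used in this lesson
TASK_SUFFIX = {
    "detect": "",
    "segment": "-seg",
    "pose": "-pose",
    "classify": "-cls",
    "obb": "-obb",
}


def weights_name(size, task="detect", family="yolo26"):
    """Build a weights filename such as 'yolo26m-seg.pt' (hypothetical helper)."""
    if size not in SIZES:
        raise ValueError(f"size must be one of {SIZES}, got {size!r}")
    return f"{family}{size}{TASK_SUFFIX[task]}.pt"


print(weights_name("m"))
print(weights_name("m", task="segment"))
print(weights_name("n", task="obb", family="yolov8"))
```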

In [None]:
# Inference (Make Predictions)
results = model(source="https://gperdrizet.github.io/FSA_devops/assets/lesson_33_images/select_all_muffings.png", show=True, save=True)

In [None]:
del model
torch.cuda.empty_cache()
gc.collect()

In [None]:
# Access the results
for result in results:
    xywh = result.boxes.xywh  # center-x, center-y, width, height
    xywhn = result.boxes.xywhn  # normalized
    xyxy = result.boxes.xyxy  # top-left-x, top-left-y, bottom-right-x, bottom-right-y
    xyxyn = result.boxes.xyxyn  # normalized
    names = [result.names[cls.item()] for cls in result.boxes.cls.int()]  # class name of each box
    confs = result.boxes.conf  # confidence score of each box
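
The four box formats in the cell above describe the same rectangle in different ways. As a rough sketch of how they relate, the functions below convert between center format (`xywh`) and corner format (`xyxy`) and normalize by image size. These are plain-Python illustrations with invented numbers, not the library's own conversion code.

```python
def xywh_to_xyxy(box):
    """(center-x, center-y, width, height) -> (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) -> (center-x, center-y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box, img_w, img_h):
    """Scale to the 0-1 range: x-like values by width, y-like values by height.

    Works for both formats because positions 0 and 2 are horizontal
    quantities and positions 1 and 3 are vertical quantities.
    """
    a, b, c, d = box
    return (a / img_w, b / img_h, c / img_w, d / img_h)


# Made-up box in a 200 x 160 pixel image
box_xywh = (100, 80, 40, 20)
box_xyxy = xywh_to_xyxy(box_xywh)
box_xywhn = normalize(box_xywh, img_w=200, img_h=160)

print(box_xyxy, box_xywhn)
```

Normalized coordinates are handy because they stay the same when the image is resized.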


### Object Segmentation

- Segmentation:
    Carefully cutting out exact objects using an "X-Acto Knife"
    - Tries to precisely locate an instance of an object within an image.
    - Slower; High "*Accuracy*", with High Area "*Precision*"
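
To get a feel for the difference in area "precision", the sketch below builds a toy circular mask on a pixel grid and compares how many pixels the mask covers with how many its bounding box covers. The circle and grid size are arbitrary choices for illustration; real masks come from the model.

```python
def circle_mask(size, cx, cy, radius):
    """Toy 'segmentation mask': True where a pixel lies inside the circle."""
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 for x in range(size)]
        for y in range(size)
    ]


def mask_area(mask):
    """Number of pixels that belong to the object."""
    return sum(sum(row) for row in mask)


def bounding_box(mask):
    """Tightest (x1, y1, x2, y2) rectangle around the mask (x2, y2 exclusive)."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, on in enumerate(row) if on]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


mask = circle_mask(size=100, cx=50, cy=50, radius=30)
x1, y1, x2, y2 = bounding_box(mask)
box_px = (x2 - x1) * (y2 - y1)
mask_px = mask_area(mask)
ratio = mask_px / box_px  # fraction of the box that is really the object

print(f"box: {box_px} px, mask: {mask_px} px, object fills {ratio:.0%} of the box")
```

For a round object, roughly a quarter of the detection box is background, which is the area a segmentation mask trims away.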

https://docs.ultralytics.com/tasks/segment/

Dataset Example: https://universe.roboflow.com/atheer-algarni-gvico/car-parts-human-in-the-loop

In [None]:
# Load a Segmentation Model
seg_model = YOLO(model='yolo26m-seg.pt')    # Change the size letter in the middle [n, s, m, l, x] for a model with more parameters
                                            # Gain 'better' metric performance at the cost of speed; bigger is slower
                                            # Here we use the 'Medium' sized model: m

In [None]:
seg_results = seg_model(source = "https://www.youtube.com/watch?v=W3zPhqBXaoM", 
                        save = True, 
                        project = "Cattle")     # Save = True to get a visual of what the heck is going on with predictions
                                                # Can easily view with mpv (or whatever method you use to view .avi file types)

In [None]:
# seg_results = seg_model(source=0, 
#                         save=True, 
#                         project = "Video Cam")  # Save = True to get a visual of what the heck is going on with predictions
#                                                 # Can easily view with mpv (or whatever method you use to view .avi file types)
#                                                 # Source of 0 means use the webcam

In [None]:
del seg_model
torch.cuda.empty_cache()
gc.collect()

### Pose Estimation

In [None]:
# Load Pose Model
model = YOLO(model='yolo26m-pose.pt')   # Change the size letter in the middle [n, s, m, l, x] for a model with more parameters
                                        # Gain 'better' metric performance at the cost of speed; bigger is slower
                                        # Here we use the 'Medium' sized model: m

In [None]:
model.track(source="https://www.youtube.com/watch?v=YzcawvDGe4Y", 
            save=True, 
            project="Busy Street")

In [None]:
model.track(source="https://www.youtube.com/watch?v=2bYX8QcZ8mc", 
            save=True, 
            project="Griddy")

In [None]:
del model
torch.cuda.empty_cache()
gc.collect()

## Training your own YOLO Model - CIFAR-10
### Classification is the easiest kind of dataset to get up and running
Look into the Ultralytics docs on datasets and callbacks:


https://docs.ultralytics.com/datasets/

https://docs.ultralytics.com/usage/callbacks/

Arbitrary keyword arguments for training configuration. Common options include:
- data (str): Path to dataset configuration file.
- epochs (int): Number of training epochs.
- batch (int): Batch size for training.
- imgsz (int): Input image size.
- device (str): Device to run training on (e.g., 'cuda', 'cpu').
- workers (int): Number of worker threads for data loading.
- optimizer (str): Optimizer to use for training.
- lr0 (float): Initial learning rate.
- patience (int): Epochs to wait for no observable improvement for early stopping of training.
- augmentations (list[Callable]): List of augmentation functions to apply during training.
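
The `patience` option is easier to picture with a small example. The function below is a simplified sketch of the general early-stopping idea (stop once the validation score has not improved for `patience` epochs in a row). It is not the library's implementation, and the scores are invented.

```python
def early_stop_epoch(val_scores, patience):
    """Return the epoch index where training would stop, or None if it never stops early."""
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores):
        if score > best:
            # New best score: remember it and reset the counter
            best = score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation scores, one per epoch (higher is better)
val_scores = [0.50, 0.62, 0.70, 0.69, 0.70, 0.68, 0.71]

print(early_stop_epoch(val_scores, patience=3))
print(early_stop_epoch(val_scores, patience=5))
```

With a small `patience` the run stops before the late improvement at the last epoch; a larger value waits long enough to see it.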

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n-cls.pt")  # load a pretrained model (recommended for training)

# Train the model (timing note: 1:33 with a batch of 8192; this call uses batch=2048)
results = model.train(data="cifar10", epochs=3, imgsz=32, batch=2048, device=device)

# Export the model
model.export(format="onnx")

In [None]:
del model
torch.cuda.empty_cache()
gc.collect()

In [None]:
# Load a model
model = YOLO("yolov8n-obb.pt")  # load a pretrained model (recommended for training, but could modify)

# Train the model (uncomment once the dataset is downloaded).
# Download the zip file from the link below, unzip it, and point the `data` parameter to the .yaml file:
# https://universe.roboflow.com/bookspines/book-spines-fi8nq/dataset/6/download
# results = model.train(data="./datasets/Books/data.yaml", epochs=3, device=device)

In [None]:
# Export the model
# model.export(format="onnx")