# Getting Started
**Make sure to clear the cell output before checking in to GitHub**
## Setup environment
If the kernel picker in the top right of this window does not list a .venv (Python 3.xx) environment, the first thing to do is set up the Python kernel environment:
- Go to View->Command Palette, then run the Python: Create Environment... command
- Select this Python kernel in the top right of this window as your running environment
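
To confirm that the notebook is really running on the new kernel, a quick check like the one below can help. It is a minimal sketch that uses only the standard library; the helper name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Python version:", sys.version.split()[0])
print("Inside a virtual environment:", in_virtualenv())
```

If the last line prints False, the notebook is probably still attached to the system interpreter rather than the .venv kernel.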

## Inference
For inference testing, we use a pretrained YOLO model from Ultralytics. We need to do the following:
- Pip install the ultralytics library into our environment
- Select the model you want to use. This can be the latest v11 or an older but smaller model like v5. There are many different models. For instance, YOLOv5xu is an advanced version integrating an anchor-free, objectness-free split head for an improved accuracy-speed trade-off, while YOLOv5x6u is a larger model with a focus on higher accuracy.
- Run the predict function. See the documentation for information on the parameters
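
For long videos, the predict step can be run in streaming mode so that results are handled one frame at a time instead of being kept in memory. The sketch below assumes a loaded Ultralytics model is passed in; `count_detections` is a hypothetical helper name, and it only tallies how many detections of each class were seen.

```python
from collections import Counter


def count_detections(model, source, conf=0.2):
    """Count detections per class name, processing one frame at a time."""
    counts = Counter()
    # stream=True makes predict() return a generator of Results objects,
    # so frames are consumed one by one instead of accumulating in RAM.
    for result in model.predict(source, conf=conf, stream=True):
        for class_id in result.boxes.cls.tolist():
            counts[result.names[int(class_id)]] += 1
    return dict(counts)


# Usage in the notebook (requires ultralytics and the input video):
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")
# print(count_detections(model, "input/input_video.mp4"))
```

Passing the model as an argument keeps the helper independent of which weights file is loaded.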

## Notes
We use the tennis ball dataset on RoboFlow to fine-tune our YOLO model. With the model as originally pretrained, the results contain no tennis ball detections, since it was trained on data that does not include tennis balls.
We don't use the RoboFlow model because of its client-server architecture, which would require us to run a server in the cloud. With Ultralytics, you can download the trained model and include it in your application.

In [3]:
%pip install ultralytics
import shutil

# This script uses the YOLOv5 model to perform object detection on a video file.
from ultralytics import YOLO 

# Load a pretrained YOLOv5 model from Ultralytics
# model = YOLO('yolov5nu')

# Load a pretrained YOLOv8 model, which allows for partially pretrained classes
# model = YOLO('yolov8n.pt')  # YOLOv8 nano model

# Load the model that we trained earlier from its saved weights
# You can also use 'yolov8x' for a larger pretrained model with more parameters
model = YOLO("../tennis_ball_train/models/train/weights/last_v5.pt")  # our fine-tuned weights
# model = YOLO("../tennis_ball_train/models/train/weights/yolov8.mlpackage")

# conf is the confidence threshold: detections below it are filtered out
# conf=0.2 is a good starting point, but you can adjust it based on your needs
# save=True saves the output video with the detections drawn on it
results = model.predict('input/input_video.mp4', conf=0.2, save=True)

print(model)

Note: you may need to restart the kernel to use updated packages.

inference results will accumulate in RAM unless `stream=True` is passed, causing potential out-of-memory
errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example:
    results = model(source=..., stream=True)  # generator of Results objects
    for r in results:
        boxes = r.boxes  # Boxes object for bbox outputs
        masks = r.masks  # Masks object for segment masks outputs
        probs = r.probs  # Class probabilities for classification outputs

video 1/1 (frame 1/214) /Users/dztran/Projects/TennisMotion/tennis_session/tennis_ball_inference/input/input_video.mp4: 384x640 1 player-back, 1 player-front, 25.9ms
video 1/1 (frame 2/214) /Users/dztran/Projects/TennisMotion/tennis_session/tennis_ball_inference/input/input_video.mp4: 384x640 1 player-back, 1 player-front, 1 tennis-ball, 25.0ms
video 1/1 (frame 3/214) /Users/dztran/Projects/TennisMo

## Interpret inference results
The results are a list of inference results, one per video frame. The class names are in result.names. For the stock pretrained model they look like this (0 = person, and so on):
```
Class names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
```
As you can see, there is no tennis ball among the object labels.
Since this model was trained on a large set of object classes, there are many labels. The bounding boxes for each frame are in result.boxes, with one entry per detected object, such as:
```
xywh: tensor([[551.2131, 841.1858, 144.2126, 180.8654]])
xywhn: tensor([[0.2871, 0.7789, 0.0751, 0.1675]])
xyxy: tensor([[479.1068, 750.7531, 623.3194, 931.6185]])
xyxyn: tensor([[0.2495, 0.6951, 0.3246, 0.8626]])
```
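
The four formats describe the same box. As a worked example, the sketch below converts the xywh row (center x, center y, width, height) into corner coordinates and then normalizes them. The 1920x1080 image size is taken from the orig_shape value (1080, 1920) in the notebook output; the function names are made up for this example.

```python
def xywh_to_xyxy(box):
    """Convert (center x, center y, width, height) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def normalize_xyxy(box, image_width, image_height):
    """Scale pixel corner coordinates into the 0..1 range."""
    x1, y1, x2, y2 = box
    return (x1 / image_width, y1 / image_height,
            x2 / image_width, y2 / image_height)


xywh = (551.2131, 841.1858, 144.2126, 180.8654)
xyxy = xywh_to_xyxy(xywh)
xyxyn = normalize_xyxy(xyxy, image_width=1920, image_height=1080)

# Compare these with the xyxy and xyxyn rows listed above.
print([round(v, 4) for v in xyxy])
print([round(v, 4) for v in xyxyn])
```
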
The bounding box format we will use is **xyxy**: the top-left x, y and the bottom-right x, y.
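
To pull only the boxes of one class out of the results, something along these lines should work. It is a sketch, not the notebook's own code: `boxes_for_class` is a hypothetical helper, and it assumes each result exposes names, boxes.cls, boxes.conf and boxes.xyxy as shown in the printed output.

```python
def boxes_for_class(results, class_name, min_conf=0.0):
    """Return (frame_index, [x1, y1, x2, y2], confidence) for one class."""
    found = []
    for frame_index, result in enumerate(results):
        class_ids = result.boxes.cls.tolist()
        confidences = result.boxes.conf.tolist()
        corners = result.boxes.xyxy.tolist()
        for class_id, conf, xyxy in zip(class_ids, confidences, corners):
            # Map the numeric class id back to its label before comparing.
            if result.names[int(class_id)] == class_name and conf >= min_conf:
                found.append((frame_index, xyxy, conf))
    return found


# Usage in the notebook, after model.predict(...) has run:
# ball_boxes = boxes_for_class(results, "tennis-ball", min_conf=0.2)
```

The frame index is zero-based, so add 1 to match the frame numbers in the prediction log.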


In [4]:

# Print the results of the object detection; results holds one entry per frame
# Print the class names known to the model
print("Class names:", results[0].names)
print("Bounding boxes for the first frame:")
# Print the bounding boxes of every detected object in every frame
for i, result in enumerate(results):
    print(f"Frame {i+1}:")
    for box in result.boxes:
        print(box)

Class names: {0: 'player-back', 1: 'player-front', 2: 'tennis-ball'}
Bounding boxes for the first frame:
Frame 1:
ultralytics.engine.results.Boxes object with attributes:

cls: tensor([1.])
conf: tensor([0.9147])
data: tensor([[4.7657e+02, 7.4755e+02, 6.2646e+02, 9.3606e+02, 9.1471e-01, 1.0000e+00]])
id: None
is_track: False
orig_shape: (1080, 1920)
shape: torch.Size([1, 6])
xywh: tensor([[551.5142, 841.8048, 149.8863, 188.5043]])
xywhn: tensor([[0.2872, 0.7794, 0.0781, 0.1745]])
xyxy: tensor([[476.5710, 747.5526, 626.4573, 936.0569]])
xyxyn: tensor([[0.2482, 0.6922, 0.3263, 0.8667]])
ultralytics.engine.results.Boxes object with attributes:

cls: tensor([0.])
conf: tensor([0.7920])
data: tensor([[1.0363e+03, 2.0342e+02, 1.1002e+03, 3.0834e+02, 7.9205e-01, 0.0000e+00]])
id: None
is_track: False
orig_shape: (1080, 1920)
shape: torch.Size([1, 6])
xywh: tensor([[1068.2876,  255.8786,   63.9141,  104.9232]])
xywhn: tensor([[0.5564, 0.2369, 0.0333, 0.0972]])
xyxy: tensor([[1036.3306,  203.41