# This notebook is for the `data-stems-joe` potato stems data, Session 09 (2021-07-06)

`data-stems-joe` [Github repository](https://github.com/weharris/data-stems-joe/tree/main)

Ed Harris

2022-01-23

<br/>

---

The purpose of this notebook is to train a computer vision detection model to count and classify stems from potato plant images. 

- [official Yolov5 custom data tutorial](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data)
- [Kaggle tutorial version](https://www.kaggle.com/ultralytics/yolov5/notebook)

<br/>

Sections:

- 00 Setup environment and requirements
- 01 Data - potato stem detection
- 02 Train Yolov5 model
- 03 Test Yolov5 model

---

# 00 Setup environment and requirements

In [1]:
# trash terminal commands
%pwd

'/home/studio-lab-user/data-stems-joe'

In [2]:
# Clone YOLOv5 and install its dependencies
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
%pip install -qr requirements.txt # install dependencies
# %pip install -q roboflow

import torch
import os
from IPython.display import Image, clear_output  # to display images

# Getting some version errors from the requirements.txt
# Might be ok
print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

Cloning into 'yolov5'...
remote: Enumerating objects: 10897, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 10897 (delta 0), reused 0 (delta 0), pack-reused 10893
Receiving objects: 100% (10897/10897), 11.00 MiB | 42.50 MiB/s, done.
Resolving deltas: 100% (7527/7527), done.
/home/studio-lab-user/data-stems-joe/yolov5
You should consider upgrading via the '/home/studio-lab-user/.conda/envs/default/bin/python -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.
Setup complete. Using torch 1.10.1+cu102 (Tesla T4)


# 01 Data - potato stem detection

To train our custom model, we need a dataset of representative images with bounding-box annotations around the objects we want to detect, and the dataset must be in YOLOv5 format.
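As a rough illustration of what "YOLOv5 format" means for the annotations: each image gets a text file with one line per object, written as `class x_center y_center width height`, with coordinates normalized to the 0-1 range. The sketch below converts a pixel-space box to such a line. The function name `to_yolo_line` and the example box and image size are made up for illustration and are not taken from this dataset.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space bounding box to one YOLOv5 label line.

    The result is "class x_center y_center width height", with all four
    box values normalized by the image width and height.
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a 200 x 400 px box in a 1000 x 800 px image, class 0
example_line = to_yolo_line(0, 100, 200, 300, 600, 1000, 800)
print(example_line)
```

Class ids are zero-indexed, so a dataset with two stem classes would use ids `0` and `1`.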


# 02 Train Yolov5 model

Here, we are able to pass a number of arguments:
- **img:** define input image size
- **batch:** determine batch size
- **epochs:** define the number of training epochs. (Note: 3000+ epochs are common here!)
- **data:** path to the dataset YAML file (here `../yolo-files/2021-07-06-sess09.yaml`)
- **weights:** specify a path to weights to start transfer learning from. Here we choose the generic COCO pretrained checkpoint.
- **cache:** cache images for faster training
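For reference, the file passed to `--data` is a small YAML file describing where the images live and which classes exist. The sketch below builds such a file as a string, assuming the key layout used in the YOLOv5 custom data tutorial (`train`, `val`, `nc`, `names`). The helper name `make_dataset_yaml`, the directory paths, and the class names are placeholders, not the contents of `2021-07-06-sess09.yaml`.

```python
def make_dataset_yaml(train_dir, val_dir, class_names):
    """Return the text of a minimal YOLOv5-style dataset YAML file."""
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_dir}\n"        # folder with training images
        f"val: {val_dir}\n"            # folder with validation images
        f"nc: {len(class_names)}\n"    # number of classes
        f"names: [{names}]\n"          # class names, in class-id order
    )


# Placeholder paths and class names, for illustration only
yaml_text = make_dataset_yaml("../images/train", "../images/val", ["class_a", "class_b"])
print(yaml_text)
```

Writing `yaml_text` to a `.yaml` file and passing that path to `--data` would follow the same pattern as the training command used here.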

In [3]:
# trash commands
%pwd

'/home/studio-lab-user/data-stems-joe/yolov5'

In [4]:
!python train.py --img 1280 --rect --batch 16 --epochs 30 --data ../yolo-files/2021-07-06-sess09.yaml --weights yolov5s.pt --cache

[34m[1mtrain: [0mweights=yolov5s.pt, cfg=, data=../yolo-files/2021-07-06-sess09.yaml, hyp=data/hyps/hyp.scratch.yaml, epochs=30, batch_size=16, imgsz=1280, rect=True, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), 2.17 KiB | 2.17 MiB/s, done.
From https://github.com/ultralytics/yolov5
   2e5c67e..9a8ebe6  master     -> origin/master
[34m[1mgithub: [0m⚠️ YOLOv5 is out of date 

# Evaluate Custom YOLOv5 Detector Performance
Training losses and performance metrics are saved to TensorBoard and also to a log file.

If you are new to these metrics, the one you want to focus on is `mAP_0.5` - learn more about mean average precision [here](https://blog.roboflow.com/mean-average-precision/).
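The `0.5` in `mAP_0.5` is an intersection-over-union (IoU) threshold: a predicted box counts as a correct detection only if it overlaps a ground-truth box of the same class with an IoU of at least 0.5. The sketch below shows the IoU calculation for two boxes given as `(x_min, y_min, x_max, y_max)`. It is a simplified illustration, not YOLOv5's own implementation, and the example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle (if any)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, true_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, true_box) >= threshold


# Made-up boxes: the prediction is shifted by half a box width
pred = (0, 0, 10, 10)
truth = (5, 0, 15, 10)
print(iou(pred, truth), is_match(pred, truth))
```

In this example the overlap is half of each box, which gives an IoU of one third, so the prediction would not count as a match at the 0.5 threshold.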

In [None]:
# Start tensorboard
# Launch after you have started training
# logs save in the folder "runs"
%load_ext tensorboard
%tensorboard --logdir runs

# Run Inference With Trained Weights
Run inference with the trained checkpoint (`best.pt`) on the contents of the `test/images` folder downloaded from Roboflow.

In [None]:
# Requires `dataset.location` to be set to the dataset root (it is not defined earlier in this notebook)
!python detect.py --weights runs/train/exp/weights/best.pt --img 1280 --conf 0.1 --source {dataset.location}/test/images

In [None]:
# Display inference results for all test images

import glob
from IPython.display import Image, display

# Path is relative to the yolov5 directory; assumes JPG output
for image_name in glob.glob('runs/detect/exp/*.jpg'):
    display(Image(filename=image_name))
    print("\n")
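Since the goal of this notebook is to count and classify stems, one option is to run `detect.py` with its `--save-txt` flag, which writes one text file per image with one line per detection, starting with the class id. The sketch below tallies detections per class for each file. It assumes the label files sit in a folder such as `runs/detect/exp/labels`. The helper name `count_detections` and the demo file contents are invented for illustration.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_detections(labels_dir):
    """Return {image_name: Counter({class_id: count})} for each .txt file."""
    counts = {}
    for label_file in sorted(Path(labels_dir).glob("*.txt")):
        # The first token on each non-empty line is the predicted class id
        class_ids = [
            int(line.split()[0])
            for line in label_file.read_text().splitlines()
            if line.strip()
        ]
        counts[label_file.stem] = Counter(class_ids)
    return counts


# Demo with a made-up label file: two detections of class 0, one of class 1
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "plant_01.txt").write_text(
        "0 0.5 0.5 0.1 0.2\n"
        "1 0.3 0.4 0.1 0.2\n"
        "0 0.7 0.6 0.1 0.2\n"
    )
    demo_counts = count_detections(tmp)

print(demo_counts)
```

With real output, the call would look like `count_detections("runs/detect/exp/labels")`, and the total stem count per image is the sum of that image's counter values.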

# Conclusion and Next Steps

Congratulations! You've trained a custom YOLOv5 model to recognize your custom objects.

To improve your model's performance, we recommend first iterating on your dataset's coverage and quality. See this guide for [model performance improvement](https://github.com/ultralytics/yolov5/wiki/Tips-for-Best-Training-Results).

To deploy your model to an application, see this guide on [exporting your model to deployment destinations](https://github.com/ultralytics/yolov5/issues/251).

Once your model is in production, you will want to continually iterate and improve on your dataset and model via [active learning](https://blog.roboflow.com/what-is-active-learning/).

In [None]:
# Export your model's weights for future use
# Note: the google.colab module is only available on Google Colab
from google.colab import files
files.download('./runs/train/exp/weights/best.pt')