<a href="https://colab.research.google.com/github/ShrimpCryptid/deepsea-detector/blob/main/notebooks/Model%20Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Model Training
#### Peyton Lee and Neha Nagvekar, 5/31/22

Trains a YOLOv5x model on the provided annotations, starting from the pretrained FathomNet YOLOv5x weights.



## Environment Setup
We need to download or install the following:
- The [YOLOv5](https://github.com/ultralytics/yolov5) repository.
- The [FathomNet YOLOv5x model](https://github.com/fathomnet/models) that supplies our pretrained weights.
- The [custom RoboFlow dataset](https://app.roboflow.com/uwrov-2022-ml-challenge/deepsea-detect--mate-2022-ml-challenge) we put together, edited from data provided by NOAA Ocean Exploration!

In [None]:
BASE_MODEL_URL = "https://zenodo.org/record/5539915/files/mbari-mb-benthic-33k.pt?download=1"
BASE_MODEL_PATH = "/content/mbari-mb-benthic-33k.pt"

NEW_MODEL = "test_training_datasetv1_modelv1.pt"
NEW_MODEL_PATH = "/content/yolov5/models/{}".format(NEW_MODEL)

# NOTE: You may need to go to
# https://app.roboflow.com/uwrov-2022-ml-challenge/deepsea-detect--mate-2022-ml-challenge
# and generate your own download code. Make sure to select the YOLOv5 export option. 
ROBOFLOW_DATA_URL = ""
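
Because `ROBOFLOW_DATA_URL` starts out empty, the dataset download later in this notebook fails in a confusing way if the link is never filled in. A minimal sketch of an early check follows. `check_download_url` is a hypothetical helper, not part of the original notebook, and it only inspects the string rather than contacting the server.

```python
from urllib.parse import urlparse


def check_download_url(url):
    """Return True if `url` looks like a usable http(s) download link.

    Hypothetical helper for illustration; it does not verify that the
    link actually points at a dataset.
    """
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# In the notebook this would be called with ROBOFLOW_DATA_URL.
if not check_download_url(""):
    print("ROBOFLOW_DATA_URL is empty or malformed; paste your Roboflow export link first.")
```

Running a check like this right after the configuration cell surfaces the problem before any downloads start.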

In [None]:
# Download the YOLOv5 repository.
!git clone https://github.com/ultralytics/yolov5

# Install the base requirements for YOLOv5, plus wandb for optional
# Weights & Biases experiment tracking.
!pip install -r yolov5/requirements.txt wandb


In [None]:
# Download the MBARI FathomNet YOLOv5 model.
# The URL is quoted because it contains "?", and -L follows any redirects.
!curl -L "{BASE_MODEL_URL}" -o {BASE_MODEL_PATH}

# Make a copy of the base model; training starts from this copy.
!cp {BASE_MODEL_PATH} {NEW_MODEL_PATH}
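
A failed `curl` download tends to leave a small error page on disk instead of a model, and that only surfaces later as a cryptic loading error. A minimal sketch of a sanity check follows. `download_looks_complete` is a hypothetical helper, and the 1 MB threshold is an assumption chosen for illustration, not a property of the FathomNet file.

```python
import os


def download_looks_complete(path, min_bytes=1_000_000):
    """Return True if `path` exists and is at least `min_bytes` long.

    The default threshold is an assumption: real model checkpoints are
    far larger than an HTML error page.
    """
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes


# Example with a path that does not exist; in the notebook, pass BASE_MODEL_PATH.
print(download_looks_complete("/nonexistent/model.pt"))
```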

In [None]:
# Set up the directory structure and download our dataset from RoboFlow.
!mkdir -p datasets
# Download and extract the RoboFlow data. The URL is quoted because export
# links contain characters such as "?" that the shell would otherwise interpret.
!curl -L "{ROBOFLOW_DATA_URL}" -o datasets/roboflow.zip
!cd datasets && unzip roboflow.zip && rm roboflow.zip

In [None]:
!ls datasets/
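
The listing should show `train`, `valid`, and `test` folders, each with an `images` subfolder, since those are the paths written into the dataset `.yaml` in this notebook. As a sketch, the same check can be done programmatically. `missing_split_dirs` is a hypothetical helper name.

```python
from pathlib import Path


def missing_split_dirs(dataset_root, splits=("train", "valid", "test")):
    """Return the expected `<split>/images` directories that are absent.

    The split names mirror the paths used in this notebook's dataset
    .yaml; the function itself is for illustration only.
    """
    root = Path(dataset_root)
    return [str(root / s / "images") for s in splits
            if not (root / s / "images").is_dir()]


missing = missing_split_dirs("datasets")
print("Missing directories:", missing if missing else "none")
```

An empty result means the extracted archive has the layout the training step expects.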

In [None]:
# Add the RoboFlow dataset as a .yaml file so YOLO knows where to find the data.
relative_dataset_path = "../datasets"
yaml_contents = """train: {0}/train/images
val: {0}/valid/images
test: {0}/test/images

nc: 9
names: ['annelida', 'arthropoda', 'cnidaria', 'echinodermata', 'fish', 'mollusca', 'other-invertebrates', 'porifera', 'unidentified-biology']""".format(relative_dataset_path)

yaml_file_name = "DeepseaDetectorDataset.yaml"
yaml_file_path = "yolov5/data/{}".format(yaml_file_name)
# Opening in 'w' mode creates the file if it does not exist yet.
with open(yaml_file_path, 'w') as file:
    file.write(yaml_contents)

!cat {yaml_file_path}
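
The `nc` value must equal the number of entries in `names`, and hand-editing both is an easy way to get them out of sync. One possible alternative, sketched here, derives `nc` from the class list. `build_dataset_yaml` is a hypothetical helper, and it assumes class names are plain strings without quote characters, so that Python's list formatting is also valid YAML.

```python
def build_dataset_yaml(dataset_root, class_names):
    """Build YOLOv5-style dataset YAML text from a list of class names.

    Hypothetical helper: `nc` is computed from the list, so it cannot
    drift out of sync with `names`.
    """
    lines = [
        f"train: {dataset_root}/train/images",
        f"val: {dataset_root}/valid/images",
        f"test: {dataset_root}/test/images",
        "",
        f"nc: {len(class_names)}",
        f"names: {class_names!r}",
    ]
    return "\n".join(lines) + "\n"


CLASS_NAMES = ['annelida', 'arthropoda', 'cnidaria', 'echinodermata', 'fish',
               'mollusca', 'other-invertebrates', 'porifera', 'unidentified-biology']
print(build_dataset_yaml("../datasets", CLASS_NAMES))
```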

## Running the Training
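
The `freeze` setting used in this section stops the first N layers from being updated, so only the remaining layers learn from our data. As a rough, framework-free sketch of the idea, the hypothetical helper below splits parameter names into frozen and trainable groups. It assumes parameter names begin with `model.<layer index>.`, which is an assumption about how YOLOv5 names its parameters, and the sample names are made up.

```python
def split_frozen(param_names, freeze):
    """Split parameter names into (frozen, trainable) lists.

    A parameter counts as frozen when its name starts with
    "model.<i>." for some layer index i in range(freeze).
    """
    prefixes = tuple(f"model.{i}." for i in range(freeze))
    frozen = [n for n in param_names if n.startswith(prefixes)]
    trainable = [n for n in param_names if not n.startswith(prefixes)]
    return frozen, trainable


# Made-up parameter names, for illustration only.
names = ["model.0.conv.weight", "model.23.cv1.conv.weight", "model.24.m.0.weight"]
frozen, trainable = split_frozen(names, freeze=24)
print("frozen:", frozen)
print("trainable:", trainable)
```

In an actual PyTorch training loop, the frozen parameters would have `requires_grad` set to `False` so the optimizer leaves them untouched.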

In [None]:
# Feel free to tweak these to suit your needs!
batch_size = 48
freeze = 24  # Freezes this many layers, exclusive. (Layers 0-23 will be frozen.)
image_size = 640  # Training image size in pixels. YOLOv5 resizes and pads inputs to fit.
weights = NEW_MODEL_PATH  # The model weights to train.
data = yaml_file_name  # The name of the dataset .yaml file.
epochs = 12  # The number of full passes over the training dataset.

# This command is where the magic happens!
!cd yolov5; python3 train.py --batch {batch_size} --freeze {freeze} --weights {weights} \
 --data {data} --epochs {epochs} --cache --img {image_size}

# Follow the prompts to set up a Weights & Biases (W&B) experiment if you'd like.
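
Once training finishes, YOLOv5 writes checkpoints into a run folder rather than back to the `--weights` file. A minimal sketch for locating the newest result follows. It assumes the default layout `runs/train/<run name>/weights/best.pt` inside the `yolov5` directory, and `latest_best_weights` is a hypothetical helper name.

```python
from pathlib import Path


def latest_best_weights(runs_dir="yolov5/runs/train"):
    """Return the newest `weights/best.pt` under `runs_dir`, or None.

    Assumes the default YOLOv5 output layout; adjust `runs_dir` if
    training was started with --project or --name.
    """
    candidates = list(Path(runs_dir).glob("*/weights/best.pt"))
    if not candidates:
        return None
    # Pick the checkpoint with the most recent modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


print(latest_best_weights())
```

The returned path can then be copied somewhere persistent, since files under `/content` are lost when a Colab session ends.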