# Demo: training a YOLOv4 object-detection model for berry detection

In [12]:
import os

from datadir import datadir

# 0. Create a training dataset using the generate_training_dataset.ipynb example script (or use the dataset in data/training as an example)
### The detection dataset must be located in a folder with two subfolders (train, valid). Each subfolder contains pairs of input (.png) and output (.txt) files sharing the same name (e.g. img0.png and img0.txt).
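
Since every image needs a label file with the same name, a quick check can catch missing pairs before the training files are generated. The following is a minimal sketch, not part of the original pipeline: `find_unpaired_images` is a hypothetical helper, and the demonstration runs on a throwaway folder rather than on the real dataset.

```python
import os
import tempfile


def find_unpaired_images(dataset_dir):
    """Return the .png files in dataset_dir that lack a same-named .txt file."""
    names = set(os.listdir(dataset_dir))
    unpaired = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext == '.png' and stem + '.txt' not in names:
            unpaired.append(name)
    return unpaired


# Small self-contained demonstration on a throwaway folder (not the real dataset).
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ['img0.png', 'img0.txt', 'img1.png']:
        open(os.path.join(demo_dir, name), 'w').close()
    demo_result = find_unpaired_images(demo_dir)

print(demo_result)  # ['img1.png']
```

In this notebook, the helper could be called once on the train subfolder and once on the valid subfolder; an empty list means every image has its label file.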

# 1. Generate the files needed to train a YOLOv4 model

In [2]:
output_dir = datadir  # where these files will be saved

## train.txt and valid.txt

In [6]:
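for dataset in ['train', 'valid']:
    image_dir = os.path.join(datadir, 'training', 'det_dataset', dataset)
    # keep only the images; the .txt label files are found by name at training time
    files = [f for f in os.listdir(image_dir) if f.endswith('.png')]
    with open(os.path.join(output_dir, dataset + '.txt'), 'w') as out:
        for f in files:
            # one image path per line, e.g. train/img0.png
            out.write('{}/{}\n'.format(dataset, f))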

## classes.names

In [10]:
with open(output_dir + '/classes.names', 'w') as out:
  out.write('berry') # the class name doesn't matter since there is only one class

## training.data

In [11]:
with open(output_dir + '/training.data', 'w') as out:
  out.write('classes = 1\ntrain = train.txt\nvalid = valid.txt\nnames = classes.names\nbackup = backup')
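
The string written above expands to five `key = value` lines: the number of classes, the two image lists, the class-name file, and the folder where weights are saved. As an illustration only, the sketch below parses the same string back into a dictionary so the contents can be inspected; `parse_data_file` is a hypothetical helper and not a darknet function.

```python
def parse_data_file(text):
    """Parse 'key = value' lines into a dict (hypothetical helper, not part of darknet)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or '=' not in line:
            continue  # skip blank or malformed lines
        key, value = line.split('=', 1)
        options[key.strip()] = value.strip()
    return options


# Same string as the one written to training.data in the cell above.
content = 'classes = 1\ntrain = train.txt\nvalid = valid.txt\nnames = classes.names\nbackup = backup'
options = parse_data_file(content)
for key, value in options.items():
    print(key, '->', value)
```

All values are relative names, which suggests that the listed files and the backup folder are expected to sit side by side in the training directory described in step 2.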

# 2. Training

On a Linux server with a GPU and darknet installed (https://github.com/AlexeyAB/darknet):

1) Create a directory containing:

- train.txt file
- valid.txt file
- classes.names file
- training.data file
- detection.cfg file (contained in examples/data/model)
- the train and valid subfolders (from step 0.)
- a folder called "backup"
    
2) Run the following command, replacing each /.../ with the correct directory:

    /.../darknet/darknet detector train /.../training.data /.../detection.cfg -map
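
Before launching a long training run, it can help to confirm that the directory from step 1) is complete and to assemble the command programmatically. The sketch below is one possible way to do this; `missing_entries` and `build_train_command` are hypothetical helpers, and the paths passed to them are placeholders rather than real locations.

```python
import os
import tempfile

# Entries listed in step 1); the names come from the text above.
REQUIRED_ENTRIES = [
    'train.txt', 'valid.txt', 'classes.names', 'training.data',
    'detection.cfg', 'train', 'valid', 'backup',
]


def missing_entries(training_dir):
    """Return the required files/folders that are absent from training_dir."""
    return [name for name in REQUIRED_ENTRIES
            if not os.path.exists(os.path.join(training_dir, name))]


def build_train_command(darknet_bin, data_file, cfg_file):
    """Assemble the argument list for the training command shown above."""
    return [darknet_bin, 'detector', 'train', data_file, cfg_file, '-map']


# Demonstration on a throwaway folder that only contains a backup subfolder.
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, 'backup'))
    demo_missing = missing_entries(demo_dir)

# Placeholder paths, to be replaced with the real locations.
demo_command = build_train_command('darknet', 'training.data', 'detection.cfg')
print(demo_missing)
print(' '.join(demo_command))
```

The resulting list could be passed to `subprocess.run(demo_command, cwd=training_dir, check=True)`. Running from the training directory is an assumption here, motivated by the relative paths written to train.txt and valid.txt.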


Notes: 

- Training is very slow without a GPU. Instead of using a Linux server, this script can be adapted to run on Google Colab, which provides GPU access (there are many tutorials online for YOLOv4 training).

- The source code in darknet/src/detector.c can be modified to change how often the model is saved and how often the mAP is computed.

- mAP (mean average precision) is computed regularly on the validation dataset. The weights with the best mAP are automatically saved in the backup folder as a ..._best.weights file.
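
Once training has finished, the best weights can be located by their suffix. The sketch below is a minimal example with a hypothetical helper, `find_best_weights`; the file names used in the demonstration are made up, since the actual prefix depends on the training run.

```python
import glob
import os
import tempfile


def find_best_weights(backup_dir):
    """Return the sorted paths of files ending in _best.weights in backup_dir."""
    return sorted(glob.glob(os.path.join(backup_dir, '*_best.weights')))


# Demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ['example_best.weights', 'example_other.weights']:
        open(os.path.join(demo_dir, name), 'w').close()
    demo_found = [os.path.basename(p) for p in find_best_weights(demo_dir)]

print(demo_found)  # ['example_best.weights']
```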