# Run Sound

This notebook trains and tests the audio model, including data preparation and the data split.

To start, make sure you have installed all requirements:

```
pip install -r requirements.txt
```


In [1]:
%cd audio_data_engine

/home/guests/msg/Magalhaes/audio_data_engine


## Create dataset

In [2]:
!python create_dataset.py -h

usage: create_dataset.py [-h] [--datasets DATASETS] [--labels LABELS]
                         [--output_dir OUTPUT_DIR]

Create a dataset and splits for cross validation

optional arguments:
  -h, --help            show this help message and exit
  --datasets DATASETS   text file with the paths to all csv files to be used
                        by the final dataset (default:
                        ../rethink/data/all.dataset)
  --labels LABELS       file with expected labels (default:
                        ../rethink/data/schreder.names)
  --output_dir OUTPUT_DIR
                        directory to store the resulting csv file (default:
                        /media/magalhaes/sound/datasets)


In [None]:
!python create_dataset.py 
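
The help text above describes `--datasets` as a text file listing the CSV files that make up the final dataset. The following is a minimal sketch of that idea, not the actual logic of `create_dataset.py`: it assumes one CSV path per line and a shared header across files. The helper names and the `path`/`label` columns are hypothetical and used only for the demo.

```python
import csv
import tempfile
from pathlib import Path


def read_dataset_list(list_path):
    """Return the CSV paths named in the list file, skipping blank lines."""
    lines = Path(list_path).read_text().splitlines()
    return [Path(line.strip()) for line in lines if line.strip()]


def merge_csv_files(csv_paths):
    """Concatenate the rows of several CSV files that share the same header."""
    rows = []
    for path in csv_paths:
        with open(path, newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


# Demo on throwaway files; the column names are illustrative assumptions.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "a.csv").write_text("path,label\nclip1.wav,car\n")
    (tmp / "b.csv").write_text("path,label\nclip2.wav,siren\n")
    (tmp / "demo.dataset").write_text(f"{tmp / 'a.csv'}\n\n{tmp / 'b.csv'}\n")
    merged = merge_csv_files(read_dataset_list(tmp / "demo.dataset"))
```

In this sketch, `merged` holds one dictionary per row, in the order the CSV files appear in the list file.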

## Preprocess audios

To speed up training, we preprocess the audio files: we extract the mel spectrograms and save them in a `.npy` file.
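
As background on what the number of mel filterbanks controls, here is a small sketch that computes the center frequencies of a mel filterbank. It assumes the HTK-style mel formula, which may differ from the variant used by `preprocess.py`. The values `n_mels=8` and `sample_rate=16000` are illustrative, not the script's defaults.

```python
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    """Convert Hz to mel (HTK-style formula; an assumption for this sketch)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, sample_rate: int) -> List[float]:
    """Center frequencies (Hz) of n_mels filters spanning 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2)
    # n_mels filters need n_mels + 2 equally spaced mel points;
    # the two outermost points are filter edges, not centers.
    step = top / (n_mels + 1)
    return [mel_to_hz(step * i) for i in range(1, n_mels + 1)]


# Illustrative values only.
centers = mel_center_frequencies(n_mels=8, sample_rate=16000)
```

Each center corresponds to one row of the spectrogram, so a larger `n_mels` gives a taller image. The centers are equally spaced in mel but spread further apart in Hz as frequency increases.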

In [None]:
%cd ../rethink

In [5]:
!python preprocess.py -h

usage: preprocess.py [-h] [--sample_rate SAMPLE_RATE] [--n_mels N_MELS]
                     [--image_length IMAGE_LENGTH] [--duration DURATION]
                     [--effi_extractor] [--export_extractor] [--waveform_only]
                     [--labels_path LABELS_PATH] [--output_dir OUTPUT_DIR]
                     [--device DEVICE]
                     csv_file

Preprocesses audio files into 3 channel spectrograms and stores the result in
an npy file

positional arguments:
  csv_file              csv file with the dataset

optional arguments:
  -h, --help            show this help message and exit
  --sample_rate SAMPLE_RATE
                        Sample rate
  --n_mels N_MELS       Number of mel filterbanks which defines the height of
                        the spectrogram
  --image_length IMAGE_LENGTH
                        Width of 3 channel spectrogram after preprocessing
  --duration DURATION   Duration of audios (larger audios will clipped and
                        small

In [None]:
!python preprocess.py <PATH_TO_DATASET_CSV> --export_extractor --device 0

## Dev Train

The config file specifies the training parameters, including the pretrained weights.
This command trains the model twice and computes the cross-validation evaluation.
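
Training twice is consistent with a two-fold cross validation, although that reading is an assumption: the real splits come from the dataset step and `train.py` may handle them differently. The sketch below shows the general technique with hypothetical helper names. Each fold serves once as the validation set and the per-fold scores are averaged.

```python
import random


def k_fold_indices(n_samples, k=2, seed=0):
    """Return k (train, val) index pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


def cross_validate(n_samples, train_and_score, k=2):
    """Call train_and_score(train, val) once per fold and average the scores."""
    scores = [train_and_score(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# Placeholder scorer standing in for a real training run (hypothetical).
splits = k_fold_indices(10, k=2)
mean_score = cross_validate(10, lambda train, val: len(val) / 10, k=2)
```

With `k=2` the model is trained twice, each time on one half of the data and validated on the other half.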

In [None]:
!python train.py \
  --config_path ./config/schreder_yolov7_tiny_distilled.json \
  --exp_name "distilled-yolov7-tiny_$(date +%s)"

Alternatively, you can run the default finetuning script (you might need to change some values):

```
cd run
./train.sh ./config/schreder_yolov7_tiny_distilled.json distilled-yolov7-tiny
```

## Full train

In [None]:
!python train.py \
  --config_path ./config/schreder_yolov7_tiny_distilled_full.json \
  --exp_name "distilled-yolov7-tiny_$(date +%s)"

## Evaluate

To evaluate, first set the `checkpoint` parameter in the config file to the path of the checkpoint you want to evaluate.
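
If you prefer not to edit the file by hand, the parameter could be set with a few lines of Python. This is a sketch that assumes `checkpoint` is a top-level key in the JSON config. The helper `set_checkpoint`, the `epochs` key, and the checkpoint path in the demo are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def set_checkpoint(config_path, checkpoint_path, output_path=None):
    """Set the `checkpoint` key of a JSON config and write the result.

    Writes to output_path when given, otherwise overwrites config_path.
    Assumes `checkpoint` is a top-level key in the config.
    """
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())
    config["checkpoint"] = str(checkpoint_path)
    target = Path(output_path) if output_path else config_path
    target.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo on a throwaway config; keys and paths are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo_config.json"
    demo.write_text(json.dumps({"checkpoint": None, "epochs": 1}))
    updated = set_checkpoint(demo, "runs/example/best.pth")
    reloaded = json.loads(demo.read_text())
```

Passing `output_path` keeps the original config untouched, which can be useful when evaluating several checkpoints with the same settings.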

In [None]:
!python evaluate.py \
  --config_path ./config/schreder_yolov7_tiny.json