# Training the YOLOv3 image recognition model

## Initializing the environment

To set up the virtualenv environment, please refer to the README.

## Creating the dataset

We build our dataset by applying image augmentation to the following [dataset images](https://github.com/marquezo/darknet/tree/master/duckiestuff).

Using the script defined in the __data_fetch__ Python file, we can create the desired number of images. Do not forget to __clean your output folders__ before running it again.
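
Cleaning the output folders by hand is easy to forget. Below is a minimal sketch of a helper that empties them before each run. `clean_output_dirs` is a hypothetical name and not part of `lib`; the example paths are the ones used in the cells of this notebook.

```python
import shutil
from pathlib import Path


def clean_output_dirs(*dirs):
    """Delete each directory (if present) and recreate it empty."""
    for d in dirs:
        path = Path(d)
        if path.exists():
            shutil.rmtree(path)  # removes the folder and everything in it
        path.mkdir(parents=True)  # start again from an empty folder


# Example call (paths taken from the cells in this notebook):
# clean_output_dirs("../data/img_aug_trainset", "../data/lab_aug_trainset")
```

Note that this deletes everything inside the given folders, so it should only ever be pointed at generated data.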

In [4]:
import lib

In [2]:
# Creating the trainset
lib.create_voc_augmented_database(images_dir = "../data/trainset", 
                                  output_dir = "../data/img_aug_trainset", 
                                  label_output_dir = "../data/lab_aug_trainset", 
                                  sample_size = 100)

Executing Pipeline:   0%|          | 0/100 [00:00<?, ? Samples/s]

Initialised with 378 image(s) found.
Output directory set to ../data/trainset/../img_aug_trainset.

Processing <PIL.Image.Image image mode=RGB size=640x480 at 0x7F54185D9250>: 100%|██████████| 100/100 [00:02<00:00, 49.59 Samples/s]


../data/img_aug_trainset


In [4]:
# Creating the validset
lib.create_voc_augmented_database(images_dir = "../data/validset", 
                                  output_dir = "../data/img_aug_validset", 
                                  label_output_dir = "../data/lab_aug_validset", 
                                  sample_size = 100)

Executing Pipeline:   0%|          | 0/100 [00:00<?, ? Samples/s]

Initialised with 21 image(s) found.
Output directory set to ../data/validset/../img_aug_validset.

Processing <PIL.Image.Image image mode=RGB size=640x480 at 0x7FAE274FE590>: 100%|██████████| 100/100 [00:02<00:00, 49.29 Samples/s]


../data/img_aug_validset


Next, download the pretrained base weights.

In [14]:
!wget http://download1139.mediafire.com/cxpeithdjlpg/l1b96fk7j18yi7v/backend.h5

--2019-11-19 16:07:04--  http://download1139.mediafire.com/cxpeithdjlpg/l1b96fk7j18yi7v/backend.h5
Resolving download1139.mediafire.com (download1139.mediafire.com)... 205.196.122.80
Connecting to download1139.mediafire.com (download1139.mediafire.com)[205.196.122.80]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 248671664 (237M) [application/x-hdf]
Saving to: "backend.h5.1"


2019-11-19 16:25:40 (218 KB/s) -- "backend.h5.1" saved [248671664/248671664]
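
The `.1` suffix in the log shows that a `backend.h5` file was already present, so the 237 MB file was downloaded a second time. Below is a minimal sketch of a guard that could be checked before downloading. The expected size is the byte count reported in the wget log above, and `needs_download` is a hypothetical helper name.

```python
from pathlib import Path

# Byte count reported by wget in the log above.
EXPECTED_SIZE = 248671664


def needs_download(path, expected_size=EXPECTED_SIZE):
    """Return True if the file is missing or its size does not match."""
    p = Path(path)
    return not p.is_file() or p.stat().st_size != expected_size
```

The `!wget` cell would then only need to be run when `needs_download("backend.h5")` returns `True`.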



## Example training

Do not forget to fetch the external submodule as well. Then copy the sample config file `ext/keras-yolo3/config.json` into the working directory; we will edit this copy to set our custom preferences.

In [2]:
!git submodule update --init

In [4]:
# BE CAREFUL: this overwrites an existing config.json in the current directory
!cp ext/keras-yolo3/config.json .

### Creating the configuration

Here we set the most important options. They are the following:

__IMPORTANT: CUSTOMIZE IT TO YOUR OWN PATHS__

* labels: the labels defined in the xml files, one label belongs to one class
* anchors: width/height pairs of the anchor (prior) boxes that the detection layers use as reference shapes for the predicted bounding boxes
* train_image_folder: the folder where the training images are stored
* train_annot_folder: the folder where the training labels are stored
* saved_weights_name: the name of the file where the weights are saved
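
As a sketch of how these settings could be filled in programmatically instead of by hand: the snippet below assumes that `config.json` keeps `labels` and `anchors` in a `model` section and the folder and weight settings in a `train` section. That layout is an assumption, so check it against your own copy first. `update_config` and `update_config_file` are hypothetical helpers.

```python
import json


def update_config(config, labels, image_dir, annot_dir, weights_name):
    """Return the config dict with the dataset-specific fields set.

    Assumes a {"model": {...}, "train": {...}} layout (an assumption,
    verify it against your own config.json).
    """
    model = config.setdefault("model", {})
    train = config.setdefault("train", {})
    model["labels"] = list(labels)
    train["train_image_folder"] = image_dir
    train["train_annot_folder"] = annot_dir
    train["saved_weights_name"] = weights_name
    return config


def update_config_file(path, **settings):
    """Load a JSON config, apply update_config, and write it back."""
    with open(path) as fh:
        config = json.load(fh)
    update_config(config, **settings)
    with open(path, "w") as fh:
        json.dump(config, fh, indent=4)


# Example call; "duckie.h5" is only a placeholder file name:
# update_config_file("config.json",
#                    labels=["bot", "duckie", "greensign", "yellowsign"],
#                    image_dir="../data/img_aug_trainset/",
#                    annot_dir="../data/lab_aug_trainset/",
#                    weights_name="duckie.h5")
```

All other keys in the file are left untouched, so the anchors and training hyperparameters keep their existing values.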

Hints for fine-tuning (generating anchors, ...) can be found in [this repo](https://github.com/experiencor/keras-yolo3).
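
Anchors are commonly generated by clustering the width/height pairs of the labelled boxes with k-means, using IoU instead of Euclidean distance to decide which cluster a box belongs to. The sketch below illustrates that idea in plain Python. It is not the script from the linked repo, and `iou_wh` and `kmeans_anchors` are hypothetical names. It assumes every box has a positive width and height.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at one corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs into k anchors by highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # start from k randomly chosen boxes
    for _ in range(iterations):
        # Assign each box to the centroid it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        # Move each centroid to the mean of its cluster (keep it if empty).
        updated = []
        for centroid, members in zip(centroids, clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                updated.append((w, h))
            else:
                updated.append(centroid)
        if updated == centroids:
            break  # centroids no longer move
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])  # small to large
```

Here `boxes` is a list of `(width, height)` tuples collected from the annotation files. YOLOv3 uses 9 anchors (3 per detection scale), so `k=9` would be the usual choice.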

Finally, run the example training.

In [2]:
!python ext/keras-yolo3/train.py -c config.json

Using TensorFlow backend.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Seen labels: 	{'yellowsign': 322, 'greensign': 117, 'duckie': 228, 'bot': 131}

Given labels: 	['bot', 'duckie', 'greensign', 'yellowsign']

Training on: 	['bot', 'duckie', 'greensign', 'yellowsign']

2019-11-19 16:49:25.207115: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compile
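
The `Seen labels` line in the log above is a per-class count of the annotated objects. Below is a sketch of how such a count could be reproduced directly from the annotation folder to check the class balance before training. It assumes Pascal VOC style XML files in which each `<object>` element has a `<name>` child; `count_voc_labels` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_voc_labels(annot_dir):
    """Count <object><name> entries across all XML files in a folder."""
    counts = Counter()
    for xml_file in sorted(Path(annot_dir).glob("*.xml")):
        root = ET.parse(str(xml_file)).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name.strip()] += 1
    return counts


# Example call (folder name taken from the dataset cells above):
# print(count_voc_labels("../data/lab_aug_trainset"))
```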

### Evaluation of the prediction

#### Tensorboard

One way to evaluate the training is the __TensorBoard__ utility, which is installed together with TensorFlow. With it we can analyze the logged training metrics and look for ways to improve the model.

![Tensorboard](../docs/tensorboard.bmp)

TensorBoard can also visualize the whole network architecture. YOLOv3 has a large number of layers, so the image below shows only a small part of it.

![YOLOv3 architecture in Tensorboard](../docs/yolo-architecture.bmp)

To start TensorBoard inside the notebook, call the following magic functions.

In [7]:
%load_ext tensorboard
%tensorboard --logdir logs

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


ERROR: Could not find `tensorboard`. Please ensure that your PATH
contains an executable `tensorboard` program, or explicitly specify
the path to a TensorBoard binary by setting the `TENSORBOARD_BINARY`
environment variable.

This did not work for me, so I used the following workaround: I located the installed tensorboard package and called its main script directly with the required parameters.

To find where the package is installed, use the __pip__ command; the `Location` field of its output is the `site-packages` folder that contains `tensorboard/main.py`:

```bash
pip show tensorboard
```

In [None]:
!python ./.venv/lib/python3.7/site-packages/tensorboard/main.py --logdir=logs
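
A hard-coded `site-packages` path stops working as soon as the Python version or the virtualenv location changes. As a sketch of a more portable variant of the same workaround, the interpreter can be asked to run the module by name. This assumes that the `tensorboard.main` module seen in the path above is importable from the active environment; `tensorboard_command` is a hypothetical helper.

```python
import sys


def tensorboard_command(logdir="logs"):
    """Build the command that runs TensorBoard with the current interpreter."""
    # `python -m tensorboard.main` resolves the module through the import
    # system, so no site-packages path has to be hard-coded.
    return [sys.executable, "-m", "tensorboard.main", "--logdir", logdir]


# To launch it from Python instead of a shell cell:
# import subprocess
# subprocess.run(tensorboard_command("logs"), check=True)
```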

#### Creating predictions on real images

In [9]:
!python ext/keras-yolo3/predict.py -c config.json -i "../data/testset/102_000309.jpg"

Using TensorFlow backend.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
2019-11-19 23:11:27.822191: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-11-19 23:11:27.840660: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2195555000 Hz
2019-11-19 23:11:27.841086: I tensorflow/compiler/xla/service/servi

The output is written to the __output__ folder. As the image below shows, there is still a lot of room to improve our solution.

![Suboptimal solution](output/102_000309.jpg)
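
A single image says little about overall quality. Below is a sketch of how the same `predict.py` call could be repeated for every image in the test folder, reusing only the `-c` and `-i` flags shown above. `build_predict_commands` is a hypothetical helper, and the `*.jpg` pattern is an assumption based on the file name used in the example.

```python
import sys
from pathlib import Path


def build_predict_commands(image_dir, config="config.json",
                           script="ext/keras-yolo3/predict.py"):
    """Return one predict.py command per .jpg file in image_dir."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    return [[sys.executable, script, "-c", config, "-i", str(img)]
            for img in images]


# To run them one after another:
# import subprocess
# for cmd in build_predict_commands("../data/testset"):
#     subprocess.run(cmd, check=True)
```

Each command starts a new Python process and reloads the model, so this is slow but needs no changes to the prediction script.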