# YOLOv4 - How to build your own image dataset

A custom image dataset for training YOLO must contain:

* Images of the objects we want to detect, already annotated (with their annotation files)
* The obj.data and obj.names files
* A customized .cfg file
* A train.txt file (test.txt is optional)

There are two main ways to gather the images:
* Download them from a dataset or repository such as the Open Images Dataset, a Google dataset that provides images for more than 600 different classes. Currently the most practical way to download images from the Open Images Dataset is the [OIDv4 Toolkit](https://github.com/EscVM/OIDv4_ToolKit).
* Download the images of the object manually and label them with an annotation tool to obtain the .txt annotation files. This process is manual and can be very time-consuming, so first check whether the class you need is available in the Open Images Dataset, and collect images by hand only when there is no easier way to get them.

# Collecting and annotating the object images for training

## Step 1 - Cloning the toolkit repository

In [1]:
!git clone https://github.com/EscVM/OIDv4_ToolKit.git

Cloning into 'OIDv4_ToolKit'...
remote: Enumerating objects: 422, done.
remote: Total 422 (delta 0), reused 0 (delta 0), pack-reused 422
Receiving objects: 100% (422/422), 34.08 MiB | 34.28 MiB/s, done.
Resolving deltas: 100% (146/146), done.


## Step 2 - Entering the toolkit directory

In [2]:
ls

OIDv4_ToolKit/  sample_data/


In [3]:
cd OIDv4_ToolKit/

/content/OIDv4_ToolKit


In [4]:
ls

classes.txt  images/  LICENSE  main.py  modules/  README.md  requirements.txt


## Step 3 - Installing all required libraries

In [5]:
!pip3 install -r requirements.txt

Collecting awscli
  Downloading awscli-1.20.48-py3-none-any.whl (3.7 MB)
Collecting botocore==1.21.48
  Downloading botocore-1.21.48-py3-none-any.whl (7.9 MB)
Collecting colorama<0.4.4,>=0.2.5
  Downloading colorama-0.4.3-py2.py3-none-any.whl (15 kB)
Collecting docutils<0.16,>=0.10
  Downloading docutils-0.15.2-py3-none-any.whl (547 kB)
Collecting s3transfer<0.6.0,>=0.5.0
  Downloading s3transfer-0.5.0-py3-none-any.whl (79 kB)
Collecting urllib3
  Downloading urllib3-1.26.7-py2.py3-none-any.whl (138 kB)
Collecting jmespath<1.0.0,>=0.7.1
  Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Installing collected packages: urllib3, jmespath, botocore, s3transfer, docutils, co

## Step 4 - Downloading the images

### Training image set

- https://storage.googleapis.com/openimages/web/index.html

In [6]:
!python main.py downloader --classes Apple Coffee_cup Horse --type_csv train --limit 500 --multiclasses 1

		   ___   _____  ______            _    _    
		 .'   `.|_   _||_   _ `.         | |  | |   
		/  .-.  \ | |    | | `. \ _   __ | |__| |_  
		| |   | | | |    | |  | |[ \ [  ]|____   _| 
		\  `-'  /_| |_  _| |_.' / \ \/ /     _| |_  
		 `.___.'|_____||______.'   \__/     |_____|

             _____                    _                 _             
            (____ \                  | |               | |            
             _   \ \ ___  _ _ _ ____ | | ___   ____  _ | | ____  ____ 
            | |   | / _ \| | | |  _ \| |/ _ \ / _  |/ || |/ _  )/ ___)
            | |__/ / |_| | | | | | | | | |_| ( ( | ( (_| ( (/ /| |    
            |_____/ \___/ \____|_| |_|_|\___/ \_||_|\____|\____)_|    

    [INFO] | Downloading ['Apple', 'Coffee cup', 'Horse'] together.
   [ERROR] | Missing the class-descriptions-boxable.csv file.
[DOWNLOAD] | Do you want to download the missing file? 
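
The command above passes `Coffee_cup` with an underscore, and the toolkit's log then reports the class as `Coffee cup`. As a small sketch, the same argument list can be assembled in Python so that multi-word class names are always written that way. The helper name `build_downloader_args` is hypothetical; the flags are only the ones already used in this notebook.

```python
def build_downloader_args(classes, split, limit):
    """Build the argument list for `python main.py downloader ...`.

    Hypothetical helper: it only mirrors the flags used in this notebook.
    """
    # Multi-word class names are written with underscores on the command line.
    cli_classes = [name.strip().replace(" ", "_") for name in classes]
    return [
        "python", "main.py", "downloader",
        "--classes", *cli_classes,
        "--type_csv", split,
        "--limit", str(limit),
        "--multiclasses", "1",
    ]


args = build_downloader_args(["Apple", "Coffee cup", "Horse"], "train", 500)
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run(args)` from inside the toolkit directory, or simply printed and pasted into a notebook cell.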

### Validation image set

In [None]:
!python main.py downloader --classes Apple Coffee_cup Horse --type_csv test --limit 100 --multiclasses 1

		   ___   _____  ______            _    _    
		 .'   `.|_   _||_   _ `.         | |  | |   
		/  .-.  \ | |    | | `. \ _   __ | |__| |_  
		| |   | | | |    | |  | |[ \ [  ]|____   _| 
		\  `-'  /_| |_  _| |_.' / \ \/ /     _| |_  
		 `.___.'|_____||______.'   \__/     |_____|

             _____                    _                 _             
            (____ \                  | |               | |            
             _   \ \ ___  _ _ _ ____ | | ___   ____  _ | | ____  ____ 
            | |   | / _ \| | | |  _ \| |/ _ \ / _  |/ || |/ _  )/ ___)
            | |__/ / |_| | | | | | | | | |_| ( ( | ( (_| ( (/ /| |    
            |_____/ \___/ \____|_| |_|_|\___/ \_||_|\____|\____)_|    

    [INFO] | Downloading ['Apple', 'Coffee cup', 'Horse'] together.
   [ERROR] | Missing the test-annotations-bbox.csv file.
[DOWNLOAD] | Do you want to download the missing file? [Y/n]

## Step 5 - Converting the annotation files

### 1. Put the classes in the classes.txt file

In [None]:
!cat classes.txt

Apple
Orange
Light switch


In [None]:
!echo -e 'Apple\nCoffee cup\nHorse' > classes.txt
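
YOLO label files refer to classes by number, so the order of the lines in `classes.txt` is worth keeping stable. The sketch below writes the same three names and reads them back as a name-to-id mapping. It assumes the converter uses the zero-based line position as the class id, which is a common convention but is not confirmed by this notebook; both function names are hypothetical.

```python
from pathlib import Path


def write_classes(path, classes):
    """Write one class name per line, as the `echo -e` cell does."""
    Path(path).write_text("\n".join(classes) + "\n", encoding="utf-8")


def read_class_ids(path):
    """Map each class name to its zero-based line position (assumed id)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    names = [line.strip() for line in lines if line.strip()]
    return {name: index for index, name in enumerate(names)}
```

For example, after `write_classes("classes.txt", ["Apple", "Coffee cup", "Horse"])`, calling `read_class_ids("classes.txt")` maps `Apple` to 0, `Coffee cup` to 1 and `Horse` to 2.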

### 2. Download the converter_annotations.py file and add it to the directory

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
!unzip /content/gdrive/MyDrive/YoloRecursos/recursos/TreinamentoYOLO.zip -d /content/

Archive:  /content/gdrive/MyDrive/YoloRecursos/recursos/TreinamentoYOLO.zip
   creating: /content/TreinamentoYOLO/
  inflating: /content/__MACOSX/._TreinamentoYOLO  
  inflating: /content/TreinamentoYOLO/.DS_Store  
  inflating: /content/__MACOSX/TreinamentoYOLO/._.DS_Store  
  inflating: /content/TreinamentoYOLO/converter_annotations.py  
  inflating: /content/__MACOSX/TreinamentoYOLO/._converter_annotations.py  
  inflating: /content/TreinamentoYOLO/gerar_test.py  
  inflating: /content/__MACOSX/TreinamentoYOLO/._gerar_test.py  
  inflating: /content/TreinamentoYOLO/gerar_train.py  
  inflating: /content/__MACOSX/TreinamentoYOLO/._gerar_train.py  


In [None]:
ls

classes.txt  LICENSE  modules/  README.md
images/      main.py  OID/      requirements.txt


In [None]:
!cp /content/TreinamentoYOLO/converter_annotations.py ./

### 3. Run the conversion script



In [None]:
!python converter_annotations.py

Subdiretorio atual: train
Convertendo os annotations para a classe:  Apple_Coffee cup_Horse
100% 1500/1500 [00:51<00:00, 29.17it/s]
Subdiretorio atual: test
Convertendo os annotations para a classe:  Apple_Coffee cup_Horse
100% 300/300 [00:08<00:00, 36.75it/s]
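
The contents of `converter_annotations.py` are not shown in this notebook, so the following is only a sketch of what such a conversion usually involves, not the script itself. It assumes each toolkit label line has the form `<class name> <left> <top> <right> <bottom>` in pixels, and it produces the Darknet/YOLO form `<class id> <x_center> <y_center> <width> <height>`, with the four box values divided by the image size. The function name and the sample values are hypothetical.

```python
def oid_to_yolo(line, class_ids, image_width, image_height):
    """Convert one OID label line to one Darknet/YOLO label line.

    Assumed input:  "<class name> <left> <top> <right> <bottom>" in pixels.
    Output:         "<class id> <x_center> <y_center> <width> <height>",
                    with the four box values normalized to the 0..1 range.
    """
    parts = line.split()
    # The class name may contain spaces ("Coffee cup"), so the box is
    # always the last four tokens and the name is everything before them.
    name = " ".join(parts[:-4])
    left, top, right, bottom = (float(value) for value in parts[-4:])

    x_center = (left + right) / 2 / image_width
    y_center = (top + bottom) / 2 / image_height
    width = (right - left) / image_width
    height = (bottom - top) / image_height

    return f"{class_ids[name]} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


class_ids = {"Apple": 0, "Coffee cup": 1, "Horse": 2}
print(oid_to_yolo("Coffee cup 100 50 300 250", class_ids, 400, 500))
# → 1 0.500000 0.300000 0.500000 0.400000
```

Taking the last four tokens as the box, rather than splitting on the first space, is what keeps two-word classes such as `Coffee cup` from breaking the parse.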


## Step 6 - Compressing the dataset

In [None]:
cd OID/Dataset/train/

/content/OIDv4_ToolKit/OID/Dataset/train


In [None]:
ls

obj/


In [None]:
!zip -r ../../../obj.zip obj -x obj/Label/*

  adding: obj/ (stored 0%)
  adding: obj/4cf071a2ee293223.txt (deflated 63%)
  adding: obj/db94f36f05251520.txt (deflated 13%)
  adding: obj/34c4e5b123e9f33f.jpg (deflated 0%)
  adding: obj/564a56f8382c2da8.jpg (deflated 1%)
  adding: obj/3ec4e43819ec8da1.txt (deflated 27%)
  adding: obj/7430bf655977fef6.txt (deflated 25%)
  adding: obj/0891fe602db686af.txt (deflated 21%)
  adding: obj/45bfcfe7f9da0ea9.txt (deflated 38%)
  adding: obj/1a946487ffafd61f.jpg (deflated 0%)
  adding: obj/09ed54b36eaa5316.txt (deflated 52%)
  adding: obj/ab883867d67dfdd9.txt (deflated 25%)
  adding: obj/4e9d8b593b08ee5c.txt (deflated 66%)
  adding: obj/703cfb1d44fbb1ae.txt (deflated 22%)
  adding: obj/76a7147ee88a430e.txt (deflated 18%)
  adding: obj/e83f363b1ef1bc2f.jpg (deflated 0%)
  adding: obj/14bb3c45ae1df75c.txt (deflated 35%)
  adding: obj/e90fc43dda1a199c.txt (deflated 40%)
  adding: obj/00730e60ab276b06.jpg (deflated 1%)
  adding: obj/007a0bec00a90a66.jpg (deflated 0%)
  adding: obj/e5586089e48cae7
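
The `-x obj/Label/*` option keeps the original toolkit labels out of the archive, so only the images and the converted .txt files are packed. A rough Python equivalent using the standard `zipfile` module is sketched below; the function name is hypothetical and, unlike `zip -r`, it does not store an entry for the directory itself.

```python
import zipfile
from pathlib import Path


def zip_without_labels(source_dir, zip_path, skip_dir="Label"):
    """Zip `source_dir`, leaving out everything under `<source_dir>/<skip_dir>`."""
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            relative = path.relative_to(source)
            # Skip directory entries and anything inside the Label folder.
            if path.is_dir() or relative.parts[0] == skip_dir:
                continue
            # Store entries as "obj/<file>", matching the `zip -r` layout.
            archive.write(path, (Path(source.name) / relative).as_posix())
```

Called as `zip_without_labels("obj", "../../../obj.zip")` from the `train` directory, it would produce an archive with the same `obj/...` entry names as the cell above.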

In [None]:
ls

obj/


In [None]:
cd ../../../../

/content


In [None]:
ls

gdrive/  __MACOSX/  OIDv4_ToolKit/  sample_data/  TreinamentoYOLO/


In [None]:
cd OIDv4_ToolKit/

/content/OIDv4_ToolKit


In [None]:
!cp ./obj.zip /content/gdrive/MyDrive/YoloRecursos/recursos

In [None]:
ls

valid/


In [None]:
cd OID/Dataset/test/

[Errno 2] No such file or directory: 'OID/Dataset/test/'
/content/OIDv4_ToolKit/OID/Dataset/test


In [None]:
!zip -r ../../../../valid.zip valid -x valid/Label/*

  adding: valid/ (stored 0%)
  adding: valid/2f7a596e42ad0bbc.txt (deflated 22%)
  adding: valid/414a7712b1d6d9e0.txt (deflated 32%)
  adding: valid/b9e080ff9a0269f2.txt (deflated 22%)
  adding: valid/e63672389c71b155.jpg (deflated 0%)
  adding: valid/55b64dbeec86d55c.jpg (deflated 0%)
  adding: valid/9c77de3793fce148.txt (deflated 40%)
  adding: valid/9fc593bacf733807.txt (deflated 13%)
  adding: valid/20e04b5395c1ee8c.txt (deflated 14%)
  adding: valid/049f7d67144f3e13.txt (deflated 24%)
  adding: valid/547ce4cf05f6e827.jpg (deflated 0%)
  adding: valid/53f14863bd9d7e5b.jpg (deflated 0%)
  adding: valid/82a0e0a3645b31af.txt (deflated 28%)
  adding: valid/6e1b12b9e0c2ac41.jpg (deflated 0%)
  adding: valid/7c9c7d314498fc0e.txt (deflated 24%)
  adding: valid/096ca56a5c51f6d6.jpg (deflated 0%)
  adding: valid/71db4bdad912bd52.jpg (deflated 0%)
  adding: valid/e6966c820300665d.txt (deflated 29%)
  adding: valid/1843a54af7b109e3.txt (deflated 39%)
  adding: valid/1df6e430c821910b.jpg (defl

In [None]:
cd /content/

/content


In [None]:
ls

gdrive/  __MACOSX/  OIDv4_ToolKit/  sample_data/  TreinamentoYOLO/  valid.zip


In [None]:
!cp ./valid.zip /content/gdrive/MyDrive/YoloRecursos/recursos

# Editing the configuration files required for training

In [None]:
ls

gdrive/  __MACOSX/  OIDv4_ToolKit/  sample_data/  TreinamentoYOLO/  valid.zip


In [None]:
pwd

'/content'

In [None]:
!git clone https://github.com/AlexeyAB/darknet

Cloning into 'darknet'...
remote: Enumerating objects: 15308, done.
remote: Total 15308 (delta 0), reused 0 (delta 0), pack-reused 15308
Receiving objects: 100% (15308/15308), 13.69 MiB | 4.46 MiB/s, done.
Resolving deltas: 100% (10399/10399), done.


In [None]:
cd darknet/

/content/darknet


In [None]:
!make

mkdir -p ./obj/
mkdir -p backup
chmod +x *.sh
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -c ./src/image_opencv.cpp -o obj/image_opencv.o
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -c ./src/http_stream.cpp -o obj/http_stream.o
./src/http_stream.cpp: In member function ‘bool JSON_sender::write(const char*)’:
                 int n = _write(client, outputbuf, outlen);
                     ^
./src/http_stream.cpp: In function ‘void set_track_id(detection*, int, float, float, float, int, int, int)’:
         for (int i = 0; i < v.size(); ++i) {
                         ~~^~~~~~~~~~
     for (int old_id = 0; old_id < old_dets.size(); ++old_id) {

## Step 7 - Defining the configuration files

### Changes to the .cfg

In [None]:
!cp cfg/yolov4.cfg /content/gdrive/MyDrive/YoloRecursos/recursos/yolov4_custom.cfg
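
The cell above only copies `yolov4.cfg`; the edits themselves are not shown in this notebook. As a hedged sketch, the values usually changed for a custom dataset can be derived from the number of classes, following the guidance in the AlexeyAB/darknet README: `classes` in each `[yolo]` section, `filters = (classes + 5) * 3` in the `[convolutional]` section just before each `[yolo]`, `max_batches = classes * 2000` (not below 6000 and not below the number of training images), and `steps` at 80% and 90% of `max_batches`. The function name is hypothetical.

```python
def custom_cfg_values(num_classes, num_train_images=0):
    """Suggest .cfg values for a custom dataset (a sketch, not a cfg editor)."""
    # max_batches: classes * 2000, with floors of 6000 and the image count.
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    return {
        "classes": num_classes,                # in every [yolo] section
        "filters": (num_classes + 5) * 3,      # in the conv layer before each [yolo]
        "max_batches": max_batches,
        "steps": (max_batches * 8 // 10, max_batches * 9 // 10),
    }


print(custom_cfg_values(3, 1500))
# → {'classes': 3, 'filters': 24, 'max_batches': 6000, 'steps': (4800, 5400)}
```

With the three classes and 1500 training images used here, this gives `filters=24`, `max_batches=6000` and `steps=4800,5400`, which still have to be typed into `yolov4_custom.cfg` by hand.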

### Changes to obj.names and obj.data

In [None]:
!touch obj.names
!touch obj.data

In [None]:
!cp obj.names /content/gdrive/MyDrive/YoloRecursos/recursos/obj.names
!cp obj.data /content/gdrive/MyDrive/YoloRecursos/recursos/obj.data
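
`touch` creates both files empty, and their contents are not shown in this notebook. The sketch below writes what these files typically hold in a Darknet setup: `obj.names` with one class name per line, and `obj.data` with the class count and the paths Darknet reads during training. The function name and every default path (`data/train.txt`, `data/test.txt`, `data/obj.names`, `backup/`) are assumptions to adapt to your own layout.

```python
from pathlib import Path


def write_obj_files(target_dir, classes,
                    train_list="data/train.txt",
                    valid_list="data/test.txt",
                    names_path="data/obj.names",
                    backup_dir="backup/"):
    """Write obj.names and obj.data into `target_dir` (paths are assumptions)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # obj.names: one class per line, in the same order as classes.txt.
    (target / "obj.names").write_text("\n".join(classes) + "\n", encoding="utf-8")

    # obj.data: class count plus the file locations used for training.
    data_lines = [
        f"classes = {len(classes)}",
        f"train = {train_list}",
        f"valid = {valid_list}",
        f"names = {names_path}",
        f"backup = {backup_dir}",
    ]
    (target / "obj.data").write_text("\n".join(data_lines) + "\n", encoding="utf-8")
```

For this dataset the call would be `write_obj_files(".", ["Apple", "Coffee cup", "Horse"])`, run before the two `cp` commands above.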

## Step 8 - Generating the train.txt and test.txt files

In [None]:
pwd

'/content/darknet'

In [None]:
cd ..

/content


In [None]:
cd OIDv4_ToolKit/

/content/OIDv4_ToolKit


In [None]:
!unzip obj.zip -d ./data

Archive:  obj.zip
   creating: ./data/obj/
  inflating: ./data/obj/4cf071a2ee293223.txt  
  inflating: ./data/obj/db94f36f05251520.txt  
  inflating: ./data/obj/34c4e5b123e9f33f.jpg  
  inflating: ./data/obj/564a56f8382c2da8.jpg  
  inflating: ./data/obj/3ec4e43819ec8da1.txt  
  inflating: ./data/obj/7430bf655977fef6.txt  
  inflating: ./data/obj/0891fe602db686af.txt  
  inflating: ./data/obj/45bfcfe7f9da0ea9.txt  
  inflating: ./data/obj/1a946487ffafd61f.jpg  
  inflating: ./data/obj/09ed54b36eaa5316.txt  
  inflating: ./data/obj/ab883867d67dfdd9.txt  
  inflating: ./data/obj/4e9d8b593b08ee5c.txt  
  inflating: ./data/obj/703cfb1d44fbb1ae.txt  
  inflating: ./data/obj/76a7147ee88a430e.txt  
  inflating: ./data/obj/e83f363b1ef1bc2f.jpg  
  inflating: ./data/obj/14bb3c45ae1df75c.txt  
  inflating: ./data/obj/e90fc43dda1a199c.txt  
  inflating: ./data/obj/00730e60ab276b06.jpg  
  inflating: ./data/obj/007a0bec00a90a66.jpg  
  inflating: ./data/obj/e5586089e48cae77.jpg  
  inflating: ./da

In [None]:
!unzip /content/valid.zip -d ./data

Archive:  /content/valid.zip
   creating: ./data/valid/
  inflating: ./data/valid/2f7a596e42ad0bbc.txt  
  inflating: ./data/valid/414a7712b1d6d9e0.txt  
  inflating: ./data/valid/b9e080ff9a0269f2.txt  
  inflating: ./data/valid/e63672389c71b155.jpg  
  inflating: ./data/valid/55b64dbeec86d55c.jpg  
  inflating: ./data/valid/9c77de3793fce148.txt  
  inflating: ./data/valid/9fc593bacf733807.txt  
  inflating: ./data/valid/20e04b5395c1ee8c.txt  
  inflating: ./data/valid/049f7d67144f3e13.txt  
  inflating: ./data/valid/547ce4cf05f6e827.jpg  
  inflating: ./data/valid/53f14863bd9d7e5b.jpg  
  inflating: ./data/valid/82a0e0a3645b31af.txt  
  inflating: ./data/valid/6e1b12b9e0c2ac41.jpg  
  inflating: ./data/valid/7c9c7d314498fc0e.txt  
  inflating: ./data/valid/096ca56a5c51f6d6.jpg  
  inflating: ./data/valid/71db4bdad912bd52.jpg  
  inflating: ./data/valid/e6966c820300665d.txt  
  inflating: ./data/valid/1843a54af7b109e3.txt  
  inflating: ./data/valid/1df6e430c821910b.jpg  
  inflating: 

In [None]:
pwd

'/content/OIDv4_ToolKit'

In [None]:
!python /content/TreinamentoYOLO/gerar_train.py

In [None]:
!python /content/TreinamentoYOLO/gerar_test.py
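
Neither `gerar_train.py` nor `gerar_test.py` is shown in this notebook. A minimal sketch of the usual approach follows, assuming each script lists every image in its folder and writes one relative path per line. The function name, the accepted extensions and the `data/obj` and `data/valid` prefixes are assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, list_path, prefix):
    """Write `<prefix>/<image name>` for every image in `image_dir`, one per line."""
    names = sorted(
        path.name
        for path in Path(image_dir).iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text(
        "".join(f"{prefix}/{name}\n" for name in names), encoding="utf-8"
    )
    return len(names)
```

Run from `/content/OIDv4_ToolKit`, `write_image_list("data/obj", "data/train.txt", "data/obj")` and `write_image_list("data/valid", "data/test.txt", "data/valid")` would produce the two lists; the annotation .txt files in the same folders are skipped because only image extensions are accepted.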

In [None]:
cd data

/content/OIDv4_ToolKit/data


In [None]:
pwd

'/content/OIDv4_ToolKit/data'

In [None]:
!cp train.txt /content/gdrive/MyDrive/YoloRecursos/recursos/train.txt

In [None]:
!cp test.txt /content/gdrive/MyDrive/YoloRecursos/recursos/test.txt
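
As a final check, the list from the start of this notebook can be compared against what actually landed in the Drive folder. The sketch below is only an illustration: the function name is hypothetical, and the set of required names (including `obj.zip`, which holds the annotated images) is an assumption based on the files copied in the steps above. `test.txt` is left out because it is optional.

```python
from pathlib import Path

# Files copied to the Drive folder in the earlier steps (test.txt is optional).
REQUIRED = ("obj.zip", "obj.data", "obj.names", "train.txt")


def missing_items(folder):
    """Return the required items that are not present in `folder`."""
    root = Path(folder)
    missing = [name for name in REQUIRED if not (root / name).is_file()]
    # Any customized .cfg file is accepted, whatever its name.
    if not any(root.glob("*.cfg")):
        missing.append("*.cfg")
    return missing
```

An empty list from `missing_items("/content/gdrive/MyDrive/YoloRecursos/recursos")` means every item on the checklist is in place.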