# YOLOv4 - How to build your own image dataset

Our custom image dataset for training with YOLO must contain:

* Images of the objects we want to recognize, already labeled (with their annotation files)
* The obj.data and obj.names files
* A customized .cfg file
* A train.txt file (test.txt is optional)
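
Because every training image needs a matching annotation file, a quick consistency check can catch gaps before training starts. The following is a minimal sketch, assuming each image sits next to a `.txt` file with the same base name (the usual darknet layout). The function name `find_unlabeled_images` and the set of image extensions are illustrative and not part of any tool used here.

```python
from pathlib import Path

# Assumption: these are the image extensions present in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(image_dir):
    """Return the sorted names of images that have no same-named .txt file."""
    image_dir = Path(image_dir)
    missing = []
    for path in sorted(image_dir.iterdir()):
        is_image = path.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not path.with_suffix(".txt").exists():
            missing.append(path.name)
    return missing
```

An empty list means every image in the directory has an annotation file.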

There are two main ways to gather the images:
* Download them from a dataset or repository such as the Open Images Dataset, a Google dataset that provides images for more than 600 different classes.
  The most practical way to download images from the Open Images Dataset today is the [OIDv4 Toolkit](https://github.com/EscVM/OIDv4_ToolKit).
* Download images of the object manually and label them with an annotation tool to obtain the .txt annotation files. This process is manual and can be very time-consuming, so we recommend it only when the class you want to detect is not available in the Open Images Dataset and there is no other easy way to obtain the images.

# Collecting and labeling the object images for training

## Step 1 - Cloning the tool's repository

In [1]:
!git clone https://github.com/EscVM/OIDv4_ToolKit.git

Cloning into 'OIDv4_ToolKit'...
remote: Enumerating objects: 422, done.
remote: Total 422 (delta 0), reused 0 (delta 0), pack-reused 422
Receiving objects: 100% (422/422), 34.08 MiB | 34.72 MiB/s, done.
Resolving deltas: 100% (146/146), done.


## Step 2 - Entering the tool's directory

In [2]:
ls

OIDv4_ToolKit/  sample_data/


In [3]:
cd OIDv4_ToolKit/

/content/OIDv4_ToolKit


In [4]:
ls

classes.txt  images/  LICENSE  main.py  modules/  README.md  requirements.txt


## Step 3 - Installing all required libraries

In [5]:
!pip3 install -r requirements.txt

Collecting awscli
  Downloading https://files.pythonhosted.org/packages/61/6e/0536697825b514b3b9e69d3ae3beb0f4abcdd6d04a047c1f715f2a5e0bec/awscli-1.18.223-py2.py3-none-any.whl (3.5MB)
Collecting s3transfer<0.4.0,>=0.3.0
  Downloading https://files.pythonhosted.org/packages/ea/43/4b4a1b26eb03a429a4c37ca7fdf369d938bd60018fc194e94b8379b0c77c/s3transfer-0.3.4-py2.py3-none-any.whl (69kB)
Collecting docutils<0.16,>=0.10
  Downloading https://files.pythonhosted.org/packages/22/cd/a6aa959dca619918ccb55023b4cb151949c64d4d5d55b3f4ffd7eee0c6e8/docutils-0.15.2-py3-none-any.whl (547kB)
Collecting colorama<0.4.4,>=0.2.5; python_version != "3.4"
  Downloading https://files.pythonhosted.org/packages/c9/dc/45cdef1b4d119eb96316b3117e6d5708a08029992b2fee2c143c7a0a5cc5/colorama-0.4.3-py2.py3-none-any.whl


## Step 4 - Downloading the images

### Training image set

- https://storage.googleapis.com/openimages/web/index.html

In [6]:
!python main.py downloader --classes Apple Coffee_cup Horse --type_csv train --limit 500 --multiclasses 1 --yes

		   ___   _____  ______            _    _    
		 .'   `.|_   _||_   _ `.         | |  | |   
		/  .-.  \ | |    | | `. \ _   __ | |__| |_  
		| |   | | | |    | |  | |[ \ [  ]|____   _| 
		\  `-'  /_| |_  _| |_.' / \ \/ /     _| |_  
		 `.___.'|_____||______.'   \__/     |_____|

             _____                    _                 _             
            (____ \                  | |               | |            
             _   \ \ ___  _ _ _ ____ | | ___   ____  _ | | ____  ____ 
            | |   | / _ \| | | |  _ \| |/ _ \ / _  |/ || |/ _  )/ ___)
            | |__/ / |_| | | | | | | | | |_| ( ( | ( (_| ( (/ /| |    
            |_____/ \___/ \____|_| |_|_|\___/ \_||_|\____|\____)_|    

    [INFO] | Downloading ['Apple', 'Coffee cup', 'Horse'] together.
   [ERROR] | Missing the class-descriptions-boxable.csv file.
[DOWNLOAD] | Automatic download.
...72%, 0 MB, 613

### Validation image set

In [7]:
!python main.py downloader --classes Apple Coffee_cup Horse --type_csv test --limit 100 --multiclasses 1 --yes

		   ___   _____  ______            _    _    
		 .'   `.|_   _||_   _ `.         | |  | |   
		/  .-.  \ | |    | | `. \ _   __ | |__| |_  
		| |   | | | |    | |  | |[ \ [  ]|____   _| 
		\  `-'  /_| |_  _| |_.' / \ \/ /     _| |_  
		 `.___.'|_____||______.'   \__/     |_____|

             _____                    _                 _             
            (____ \                  | |               | |            
             _   \ \ ___  _ _ _ ____ | | ___   ____  _ | | ____  ____ 
            | |   | / _ \| | | |  _ \| |/ _ \ / _  |/ || |/ _  )/ ___)
            | |__/ / |_| | | | | | | | | |_| ( ( | ( (_| ( (/ /| |    
            |_____/ \___/ \____|_| |_|_|\___/ \_||_|\____|\____)_|    

    [INFO] | Downloading ['Apple', 'Coffee cup', 'Horse'] together.
   [ERROR] | Missing the test-annotations-bbox.csv file.
[DOWNLOAD] | Automatic download.
...100%, 49 MB, 58921 K

## Step 5 - Converting the annotation files

### 1. Put the classes in the classes.txt file

In [9]:
!echo -e 'Apple\nCoffee cup\nHorse' > classes.txt

In [10]:
!cat classes.txt

Apple
Coffee cup
Horse
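
On the command line the toolkit receives multi-word classes with an underscore (`Coffee_cup`), while `classes.txt` holds the name with a space (`Coffee cup`), as the download log above also shows. A small sketch of producing the file from the command-line names, with hypothetical helper names:

```python
def cli_to_class_name(cli_name):
    """Turn a command-line class name such as 'Coffee_cup' into 'Coffee cup'."""
    return cli_name.replace("_", " ")


def write_classes_file(cli_names, path="classes.txt"):
    """Write one class name per line and return the names that were written."""
    names = [cli_to_class_name(name) for name in cli_names]
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("".join(name + "\n" for name in names))
    return names
```

Calling `write_classes_file(["Apple", "Coffee_cup", "Horse"])` would write the same three lines as the `echo` command.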

### 2. Download the converter_annotations.py file and add it to the directory

In [11]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [12]:
!unzip /content/gdrive/My\ Drive/YOLO/recursos/TreinamentoYOLO.zip -d /content/

Archive:  /content/gdrive/My Drive/YOLO/recursos/TreinamentoYOLO.zip
   creating: /content/TreinamentoYOLO/
  inflating: /content/__MACOSX/._TreinamentoYOLO  
  inflating: /content/TreinamentoYOLO/.DS_Store  
  inflating: /content/__MACOSX/TreinamentoYOLO/._.DS_Store  
  inflating: /content/TreinamentoYOLO/converter_annotations.py  
  inflating: /content/__MACOSX/TreinamentoYOLO/._converter_annotations.py  
  inflating: /content/TreinamentoYOLO/gerar_test.py  
  inflating: /content/__MACOSX/TreinamentoYOLO/._gerar_test.py  
  inflating: /content/TreinamentoYOLO/gerar_train.py  
  inflating: /content/__MACOSX/TreinamentoYOLO/._gerar_train.py  


In [13]:
ls

classes.txt  LICENSE  modules/  README.md
images/      main.py  OID/      requirements.txt


In [14]:
!cp /content/TreinamentoYOLO/converter_annotations.py ./

### 3. Run the conversion script



In [15]:
!python converter_annotations.py

Subdiretorio atual: test
Convertendo os annotations para a classe:  Apple_Coffee cup_Horse
100% 300/300 [00:07<00:00, 39.02it/s]
Subdiretorio atual: train
Convertendo os annotations para a classe:  Apple_Coffee cup_Horse
100% 1498/1498 [00:55<00:00, 26.92it/s]
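
The contents of `converter_annotations.py` are not shown here. As a rough illustration of what such a conversion involves, the sketch below assumes the OIDv4 ToolKit label format (class name followed by left, top, right and bottom in pixels) and produces the darknet format (class index followed by center x, center y, width and height, each normalized by the image size). The function name `oid_box_to_yolo` is hypothetical, and the real script may differ in its details.

```python
def oid_box_to_yolo(class_id, left, top, right, bottom, img_w, img_h):
    """Convert one pixel-coordinate box into a darknet annotation line."""
    x_center = (left + right) / 2.0 / img_w   # box center, as a fraction of width
    y_center = (top + bottom) / 2.0 / img_h   # box center, as a fraction of height
    width = (right - left) / img_w            # box width, as a fraction of width
    height = (bottom - top) / img_h           # box height, as a fraction of height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box in a 400x500 image, for the class at index 1.
print(oid_box_to_yolo(1, 100, 50, 300, 250, 400, 500))
# prints: 1 0.500000 0.300000 0.500000 0.400000
```

The class index is the zero-based position of the class in `classes.txt`, so it has to match the order later used in `obj.names`.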


## Step 6 - Compressing the dataset

In [16]:
cd OID/Dataset/train/

/content/OIDv4_ToolKit/OID/Dataset/train
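
The `ls` output in this step shows the directory created by the downloader, `Apple_Coffee cup_Horse`, while the `zip` command operates on a directory named `obj`. The directory therefore has to be renamed before compressing; the notebook cell that did so is not shown. A minimal sketch of that rename in Python, with a hypothetical helper name:

```python
from pathlib import Path


def rename_class_dir(split_dir, old_name, new_name):
    """Rename split_dir/old_name to split_dir/new_name and return the new path."""
    split_dir = Path(split_dir)
    target = split_dir / new_name
    if target.exists():
        # Refuse to overwrite an existing directory.
        raise FileExistsError(f"{target} already exists")
    (split_dir / old_name).rename(target)
    return target
```

For the training split this would be `rename_class_dir(".", "Apple_Coffee cup_Horse", "obj")`; the same applies to the test split, whose directory appears as `valid` later in this step.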


In [17]:
ls

'Apple_Coffee cup_Horse'/


In [18]:
!zip -r ../../../obj.zip obj -x obj/Label/*

  adding: obj/ (stored 0%)
  adding: obj/5e4ee9d90b0775be.jpg (deflated 0%)
  adding: obj/279e1b52e1e6be9c.jpg (deflated 2%)
  adding: obj/323a0f59b094903e.txt (deflated 32%)
  adding: obj/b3660026e8f65824.txt (deflated 16%)
  adding: obj/27706823aee8d883.jpg (deflated 0%)
  adding: obj/15f908c12e4954ac.jpg (deflated 0%)
  adding: obj/8c54130a698c7d72.jpg (deflated 0%)
  adding: obj/55e7d9ee66c71c38.txt (deflated 42%)
  adding: obj/ed2a0f74f658b538.txt (deflated 41%)
  adding: obj/1568480b95303576.txt (deflated 60%)
  adding: obj/9f0ba82bbf640416.txt (deflated 19%)
  adding: obj/6322a187a4586023.jpg (deflated 0%)
  adding: obj/2964b41560eb1c97.jpg (deflated 1%)
  adding: obj/376c707f4b56aa27.txt (deflated 47%)
  adding: obj/2b9682f4c8dbf8b4.jpg (deflated 0%)
  adding: obj/45d4389ce580445e.jpg (deflated 0%)
  adding: obj/103ca136b9f488fe.jpg (deflated 1%)
  adding: obj/12aecb6181d5493f.txt (deflated 44%)
  adding: obj/79c9bd9ebb337d9a.jpg (deflated 0%)
  adding: obj/7df80a84f8f2d4c5.txt

In [19]:
ls

obj/


In [20]:
cd ../../../../

/content


In [21]:
ls

gdrive/  __MACOSX/  OIDv4_ToolKit/  sample_data/  TreinamentoYOLO/


In [22]:
cd OIDv4_ToolKit/

/content/OIDv4_ToolKit


In [23]:
!cp ./obj.zip /content/gdrive/My\ Drive/YOLO/recursos

In [25]:
cd OID/Dataset/test/

/content/OIDv4_ToolKit/OID/Dataset/test


In [26]:
ls

valid/


In [27]:
!zip -r ../../../../valid.zip valid -x valid/Label/*

  adding: valid/ (stored 0%)
  adding: valid/13c317f605023811.txt (deflated 37%)
  adding: valid/a741179a0cf5aab0.txt (deflated 11%)
  adding: valid/5c41d0cacc8754c6.txt (deflated 33%)
  adding: valid/6c6bd55413d228ed.txt (deflated 33%)
  adding: valid/572e46e0829206fb.txt (deflated 8%)
  adding: valid/be5ba48c501a7ea5.txt (deflated 27%)
  adding: valid/07ef43face9d169b.jpg (deflated 1%)
  adding: valid/caec3e08aeb9899d.jpg (deflated 0%)
  adding: valid/32d0b8bec208e66b.jpg (deflated 0%)
  adding: valid/b6024eea086419a9.txt (deflated 56%)
  adding: valid/48c64de65f175030.txt (deflated 53%)
  adding: valid/50bb62d664042d33.jpg (deflated 3%)
  adding: valid/8e0486b14d9e5ce9.txt (deflated 30%)
  adding: valid/b457dcb5712a6f52.jpg (deflated 0%)
  adding: valid/96dc858fc4e47379.txt (deflated 39%)
  adding: valid/836faef04adc1e8f.jpg (deflated 0%)
  adding: valid/21824a91ec537703.txt (deflated 64%)
  adding: valid/f9cf7cec793a2988.jpg (deflated 0%)
  adding: valid/d3384be251a3e467.jpg (defla

In [28]:
cd /content/

/content


In [29]:
ls

gdrive/  __MACOSX/  OIDv4_ToolKit/  sample_data/  TreinamentoYOLO/  valid.zip


In [30]:
!cp ./valid.zip /content/gdrive/My\ Drive/YOLO/recursos

# Editing the configuration files required for training

In [31]:
ls

gdrive/  __MACOSX/  OIDv4_ToolKit/  sample_data/  TreinamentoYOLO/  valid.zip


In [32]:
pwd

'/content'

In [33]:
!git clone https://github.com/AlexeyAB/darknet

Cloning into 'darknet'...
remote: Enumerating objects: 14691, done.
remote: Total 14691 (delta 0), reused 0 (delta 0), pack-reused 14691
Receiving objects: 100% (14691/14691), 13.27 MiB | 19.33 MiB/s, done.
Resolving deltas: 100% (9995/9995), done.


In [34]:
cd darknet/

/content/darknet


In [35]:
!make

mkdir -p ./obj/
mkdir -p backup
chmod +x *.sh
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -c ./src/image_opencv.cpp -o obj/image_opencv.o
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -c ./src/http_stream.cpp -o obj/http_stream.o
./src/http_stream.cpp: In member function ‘bool JSON_sender::write(const char*)’:
                 int n = _write(client, outputbuf, outlen);
                     ^
./src/http_stream.cpp: In function ‘void set_track_id(detection*, int, float, float, float, int, int, int)’:
         for (int i = 0; i < v.size(); ++i) {
                         ~~^~~~~~~~~~
     for (int old_id = 0; old_id < old_dets.size(); ++old_id) {

## Step 7 - Defining the configuration files

### Changes to the .cfg

In [37]:
!cp cfg/yolov4.cfg /content/gdrive/My\ Drive/YOLO/recursos/yolov4_custom.cfg
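
The copied `yolov4_custom.cfg` still has to be edited for the custom classes. The sketch below computes the values usually recommended in the AlexeyAB/darknet README for custom training: `classes` in each `[yolo]` layer, `filters=(classes + 5) * 3` in the `[convolutional]` layer right before each `[yolo]` layer, `max_batches` of `classes * 2000` (but not below 6000 or the number of training images), and `steps` at 80% and 90% of `max_batches`. The helper `yolo_cfg_values` is hypothetical; it only calculates the numbers and does not edit the file.

```python
def yolo_cfg_values(num_classes, num_train_images=0):
    """Return the .cfg values suggested for training on num_classes classes."""
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    return {
        "classes": num_classes,
        "filters": (num_classes + 5) * 3,
        "max_batches": max_batches,
        # Integer arithmetic keeps the 80% and 90% marks exact.
        "steps": (max_batches * 8 // 10, max_batches * 9 // 10),
    }


print(yolo_cfg_values(3, 1498))
# prints: {'classes': 3, 'filters': 24, 'max_batches': 6000, 'steps': (4800, 5400)}
```

The call above assumes 1498 training images, the count reported by the conversion step, together with the three classes used in this notebook.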

### Changes to obj.names and obj.data

In [38]:
!touch obj.names
!touch obj.data

In [39]:
!cp obj.names /content/gdrive/My\ Drive/YOLO/recursos/obj.names
!cp obj.data /content/gdrive/My\ Drive/YOLO/recursos/obj.data
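
The two files created with `touch` are empty and still need content. A minimal sketch, assuming the layout described in the AlexeyAB/darknet README: `obj.names` lists one class name per line, in the same order as the class indices used in the annotations, and `obj.data` points to the image lists, the names file and a backup directory. The helper `write_obj_files` and every default path are assumptions to adapt to your own setup.

```python
from pathlib import Path


def write_obj_files(class_names, out_dir=".", train_list="data/train.txt",
                    valid_list="data/test.txt", names_file="data/obj.names",
                    backup_dir="backup/"):
    """Write obj.names and obj.data into out_dir and return both paths."""
    out_dir = Path(out_dir)
    names_path = out_dir / "obj.names"
    data_path = out_dir / "obj.data"
    # One class per line; the line order defines the class indices.
    names_path.write_text("".join(n + "\n" for n in class_names), encoding="utf-8")
    data_path.write_text(
        f"classes = {len(class_names)}\n"
        f"train = {train_list}\n"
        f"valid = {valid_list}\n"
        f"names = {names_file}\n"
        f"backup = {backup_dir}\n",
        encoding="utf-8",
    )
    return names_path, data_path
```

For this notebook the call would be `write_obj_files(["Apple", "Coffee cup", "Horse"])`, keeping the same class order as `classes.txt`.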

## Step 8 - Generating the train.txt and test.txt files

In [40]:
pwd

'/content/darknet'

In [41]:
cd ..

/content


In [42]:
cd OIDv4_ToolKit/

/content/OIDv4_ToolKit


In [43]:
!unzip obj.zip -d ./data

Archive:  obj.zip
   creating: ./data/obj/
  inflating: ./data/obj/5e4ee9d90b0775be.jpg  
  inflating: ./data/obj/279e1b52e1e6be9c.jpg  
  inflating: ./data/obj/323a0f59b094903e.txt  
  inflating: ./data/obj/b3660026e8f65824.txt  
  inflating: ./data/obj/27706823aee8d883.jpg  
  inflating: ./data/obj/15f908c12e4954ac.jpg  
  inflating: ./data/obj/8c54130a698c7d72.jpg  
  inflating: ./data/obj/55e7d9ee66c71c38.txt  
  inflating: ./data/obj/ed2a0f74f658b538.txt  
  inflating: ./data/obj/1568480b95303576.txt  
  inflating: ./data/obj/9f0ba82bbf640416.txt  
  inflating: ./data/obj/6322a187a4586023.jpg  
  inflating: ./data/obj/2964b41560eb1c97.jpg  
  inflating: ./data/obj/376c707f4b56aa27.txt  
  inflating: ./data/obj/2b9682f4c8dbf8b4.jpg  
  inflating: ./data/obj/45d4389ce580445e.jpg  
  inflating: ./data/obj/103ca136b9f488fe.jpg  
  inflating: ./data/obj/12aecb6181d5493f.txt  
  inflating: ./data/obj/79c9bd9ebb337d9a.jpg  
  inflating: ./data/obj/7df80a84f8f2d4c5.txt  
  inflating: ./da

In [44]:
!unzip /content/valid.zip -d ./data

Archive:  /content/valid.zip
   creating: ./data/valid/
  inflating: ./data/valid/13c317f605023811.txt  
  inflating: ./data/valid/a741179a0cf5aab0.txt  
  inflating: ./data/valid/5c41d0cacc8754c6.txt  
  inflating: ./data/valid/6c6bd55413d228ed.txt  
  inflating: ./data/valid/572e46e0829206fb.txt  
  inflating: ./data/valid/be5ba48c501a7ea5.txt  
  inflating: ./data/valid/07ef43face9d169b.jpg  
  inflating: ./data/valid/caec3e08aeb9899d.jpg  
  inflating: ./data/valid/32d0b8bec208e66b.jpg  
  inflating: ./data/valid/b6024eea086419a9.txt  
  inflating: ./data/valid/48c64de65f175030.txt  
  inflating: ./data/valid/50bb62d664042d33.jpg  
  inflating: ./data/valid/8e0486b14d9e5ce9.txt  
  inflating: ./data/valid/b457dcb5712a6f52.jpg  
  inflating: ./data/valid/96dc858fc4e47379.txt  
  inflating: ./data/valid/836faef04adc1e8f.jpg  
  inflating: ./data/valid/21824a91ec537703.txt  
  inflating: ./data/valid/f9cf7cec793a2988.jpg  
  inflating: ./data/valid/d3384be251a3e467.jpg  
  inflating: 

In [45]:
pwd

'/content/OIDv4_ToolKit'

In [46]:
!python /content/TreinamentoYOLO/gerar_train.py

In [47]:
!python /content/TreinamentoYOLO/gerar_test.py
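
The scripts `gerar_train.py` and `gerar_test.py` are not reproduced here. A script of this kind typically writes one image path per line; the sketch below is an assumption about that behavior, not the actual script. The function name `write_image_list` and the path prefix are illustrative.

```python
from pathlib import Path


def write_image_list(image_dir, list_path, prefix):
    """Write '<prefix>/<image name>' for every .jpg in image_dir, one per line."""
    image_dir = Path(image_dir)
    names = sorted(p.name for p in image_dir.iterdir() if p.suffix.lower() == ".jpg")
    lines = [f"{prefix}/{name}" for name in names]
    Path(list_path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return lines
```

Under these assumptions, `write_image_list("data/obj", "data/train.txt", "data/obj")` would produce the training list and `write_image_list("data/valid", "data/test.txt", "data/valid")` the test list.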

In [48]:
cd data

/content/OIDv4_ToolKit/data


In [49]:
pwd

'/content/OIDv4_ToolKit/data'

In [50]:
!cp train.txt /content/gdrive/My\ Drive/YOLO/recursos/train.txt

In [51]:
!cp test.txt /content/gdrive/My\ Drive/YOLO/recursos/test.txt