# Computer Vision Lab

In this lab I try to demonstrate a little of how a computer can be made to see. To do so, I use the [**ultralytics**](https://docs.ultralytics.com/pt) library.

## What We Did

- Model training
  - Trained models
    - Detection of **Name, Date of Birth, Date of Issue, RG and CPF** on **identity documents**

In [1]:
from ultralytics import YOLO

### Model Training

The images used for training were taken from the [BID Dataset](https://github.com/ricardobnjunior/Brazilian-Identity-Document-Dataset).

The image annotation tool was [CVAT.ai](https://www.cvat.ai/).

I selected 100 images of RG documents, choosing those in which the document was in the *correct* orientation, and annotated the required fields. I then set aside 70 images for training and 30 for testing and validation.
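A 70/30 split like the one described above can be reproduced with a small script. This is a minimal sketch, not the procedure actually used in this lab: the function name `split_dataset`, the fixed seed and the example file names are assumptions made for illustration.

```python
import random


def split_dataset(image_names, n_train, seed=0):
    """Shuffle the image names reproducibly and split them into train/validation lists."""
    names = sorted(image_names)          # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)   # local RNG: does not touch the global random state
    return names[:n_train], names[n_train:]


# Hypothetical file names standing in for the 100 annotated RG images.
images = [f"rg_{i:03d}.jpg" for i in range(100)]
train, valid = split_dataset(images, n_train=70)
print(len(train), len(valid))
```

Fixing the seed keeps the split stable between runs, so validation metrics stay comparable across experiments.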

The generated models were:

- RG back-side (verso) detection model: **\dataset\documentos-pessoais\rg-verso\modelo_20240522**

---
Using [**YOLOv8**](https://docs.ultralytics.com/modes/), we can train a model on our own images fairly easily by following this pipeline:
- **Train**: Fine-tune a model on custom data.
- **Val**: Check the model's performance.
- **Predict**: Run the model on different data and see the outcome of training.
- **Export**: Export the model for later use.
- **Track**: Detect and track objects in real time.
- **Benchmark**: Evaluate the model in different environments.

My goal is not to explore the whole framework, but to try a few things out.
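As a rough illustration of how these modes map onto the Python API, the sketch below chains the four that this notebook exercises. The helper `run_pipeline` and its arguments are hypothetical; it only assumes that the object passed in exposes `train`, `val`, `predict` and `export` methods, as the Ultralytics `YOLO` class does.

```python
def run_pipeline(model, data_yaml, sample_image, epochs=30):
    """Run train -> val -> predict -> export on a YOLO-like model object."""
    model.train(data=data_yaml, epochs=epochs)   # Train: fine-tune on the custom dataset
    metrics = model.val()                        # Val: evaluate on the validation split
    predictions = model.predict(sample_image)    # Predict: run inference on a new image
    exported_path = model.export()               # Export: save the model for later use
    return metrics, predictions, exported_path
```

Passing the model in as an argument keeps the helper independent of how the model was built, so the same steps can be reused with other weights.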


#### Train 

[Training](https://docs.ultralytics.com/modes/train/) simply means having the model learn to detect the objects we annotated in our images. This mode accepts a number of hyperparameters and settings, including:

- model
- data
- epochs
- time
- patience
- batch
- imgsz
- device
- project
- name
- exist_ok
- pretrained
- optimizer
- verbose
- seed
- single_cls
- rect
- resume
- freeze
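To make a few of these settings concrete, here is a hedged sketch that collects some of them in a dictionary before passing them to `model.train`. The values are illustrative assumptions rather than the configuration used in this lab, and the helper `build_train_args` and the `dataset.yaml` placeholder are hypothetical.

```python
def build_train_args(data_yaml, **overrides):
    """Return keyword arguments for model.train(), with a few illustrative values."""
    args = {
        "data": data_yaml,   # path to the dataset YAML
        "epochs": 30,        # number of passes over the training set
        "patience": 10,      # stop early after this many epochs without improvement
        "batch": 16,         # images per batch
        "imgsz": 640,        # input image size
        "device": "cpu",     # e.g. "cpu" or a GPU index such as 0
        "seed": 0,           # fixed seed for reproducibility
        "exist_ok": False,   # do not overwrite an existing run directory
    }
    args.update(overrides)   # caller-supplied settings take precedence
    return args


train_args = build_train_args("dataset.yaml", epochs=50, name="rg-verso")
print(train_args)
# With a loaded model, these would be used as: model.train(**train_args)
```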

In [2]:
model = YOLO("yolov8m.yaml").load("yolov8m.pt")  # Build from a YAML and transfer the weights

Transferred 475/475 items from pretrained weights


Now we train the model using the project's dataset YAML.
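The dataset YAML itself is not shown in this notebook. The sketch below builds the text of such a file in the usual Ultralytics detection format. The `train` and `valid` folder names and the five class names are taken from the training log, while the `images` subfolders, the class order and the helper name `dataset_yaml_text` are assumptions.

```python
# Class names as they appear in the validation table; the index order is assumed.
CLASS_NAMES = ["nome_pessoa", "rg", "cpf", "data_nascimento", "data_expedicao"]


def dataset_yaml_text(root, class_names):
    """Build the text of a detection dataset YAML (path, train, val, names)."""
    lines = [
        f"path: {root}",         # dataset root directory
        "train: train/images",   # training images, relative to path (assumed layout)
        "val: valid/images",     # validation images, relative to path (assumed layout)
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


text = dataset_yaml_text("datasets/documentos-pessoais/rg-verso", CLASS_NAMES)
print(text)
```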

In [3]:
result = model.train(data="./datasets/documentos-pessoais/rg-verso/documentos-pessoais-rg-verso.yaml", epochs=30)

New https://pypi.org/project/ultralytics/8.2.19 available  Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.2.14  Python-3.11.4 torch-2.3.0+cpu CPU (Intel Xeon Gold 5118 2.30GHz)
[34m[1mengine\trainer: [0mtask=detect, mode=train, model=yolov8m.yaml, data=./datasets/documentos-pessoais/rg-verso/documentos-pessoais-rg-verso.yaml, epochs=30, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train10, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False,

[34m[1mtrain: [0mScanning \\BR004\Usuários$\rodrigo.goncalves\Estudos\Projetos\rgbrain-lab-ocr\rgbrain_lab_ocr\datasets\documentos-pessoais\rg-verso\train\labels... 70 images, 0 backgrounds, 0 corrupt: 100%|██████████| 70/70 [00:01<00:00, 61.77it/s]


[34m[1mtrain: [0mNew cache created: \\BR004\Usurios$\rodrigo.goncalves\Estudos\Projetos\rgbrain-lab-ocr\rgbrain_lab_ocr\datasets\documentos-pessoais\rg-verso\train\labels.cache


[34m[1mval: [0mScanning \\BR004\Usuários$\rodrigo.goncalves\Estudos\Projetos\rgbrain-lab-ocr\rgbrain_lab_ocr\datasets\documentos-pessoais\rg-verso\valid\labels... 30 images, 0 backgrounds, 0 corrupt: 100%|██████████| 30/30 [00:00<00:00, 69.79it/s]


[34m[1mval: [0mNew cache created: \\BR004\Usurios$\rodrigo.goncalves\Estudos\Projetos\rgbrain-lab-ocr\rgbrain_lab_ocr\datasets\documentos-pessoais\rg-verso\valid\labels.cache
Plotting labels to runs\detect\train10\labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.001111, momentum=0.9) with parameter groups 77 weight(decay=0.0), 84 weight(decay=0.0005), 83 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to [1mruns\detect\train10[0m
Starting training for 30 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       1/30         0G      2.429      5.159      1.602         55        640: 100%|██████████| 5/5 [09:38<00:00, 115.61s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:42<00:00, 42.46s/it]

                   all         30        141      0.144     0.0778      0.061     0.0256






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       2/30         0G      1.253      2.745      1.084         47        640: 100%|██████████| 5/5 [07:09<00:00, 85.82s/it] 
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:25<00:00, 25.53s/it]


                   all         30        141      0.251      0.229       0.13     0.0824

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       3/30         0G      1.096      1.867     0.9998         55        640: 100%|██████████| 5/5 [04:25<00:00, 53.18s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:25<00:00, 25.97s/it]


                   all         30        141      0.657      0.562      0.664      0.422

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       4/30         0G     0.9688      1.247     0.9643         48        640: 100%|██████████| 5/5 [03:33<00:00, 42.63s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:24<00:00, 24.59s/it]


                   all         30        141      0.793       0.79      0.869      0.508

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       5/30         0G     0.9775      1.132     0.9677         42        640: 100%|██████████| 5/5 [03:50<00:00, 46.15s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:21<00:00, 21.73s/it]


                   all         30        141      0.844      0.882       0.94       0.68

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       6/30         0G     0.9603      1.066      0.964         42        640: 100%|██████████| 5/5 [04:57<00:00, 59.41s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:25<00:00, 25.32s/it]


                   all         30        141      0.897      0.867      0.967      0.639

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       7/30         0G     0.8929     0.8843     0.9491         44        640: 100%|██████████| 5/5 [02:56<00:00, 35.24s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:22<00:00, 22.63s/it]


                   all         30        141      0.961      0.899      0.979      0.732

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       8/30         0G     0.9007     0.7741     0.9248         53        640: 100%|██████████| 5/5 [03:36<00:00, 43.33s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:26<00:00, 26.32s/it]

                   all         30        141       0.95      0.926       0.98      0.702






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       9/30         0G      0.891     0.7913     0.9376         41        640: 100%|██████████| 5/5 [03:05<00:00, 37.10s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:22<00:00, 22.73s/it]


                   all         30        141      0.947      0.947      0.972      0.714

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      10/30         0G     0.8589     0.6875     0.9129         67        640: 100%|██████████| 5/5 [03:00<00:00, 36.05s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:21<00:00, 21.32s/it]

                   all         30        141      0.933      0.915      0.975      0.725






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      11/30         0G     0.8445     0.6393      0.924         55        640: 100%|██████████| 5/5 [02:44<00:00, 32.85s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:20<00:00, 20.68s/it]


                   all         30        141      0.958      0.985       0.98      0.729

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      12/30         0G     0.8769     0.6533     0.9291         48        640: 100%|██████████| 5/5 [03:05<00:00, 37.02s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:18<00:00, 18.26s/it]


                   all         30        141      0.956      0.977      0.983      0.736

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      13/30         0G     0.8197     0.6418     0.9077         44        640: 100%|██████████| 5/5 [02:42<00:00, 32.56s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:20<00:00, 20.47s/it]

                   all         30        141      0.978      0.969      0.983      0.733






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      14/30         0G     0.8718      0.639     0.9073         37        640: 100%|██████████| 5/5 [03:53<00:00, 46.61s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:29<00:00, 29.08s/it]

                   all         30        141      0.976       0.98      0.982      0.772






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      15/30         0G     0.8256     0.5904     0.9031         51        640: 100%|██████████| 5/5 [03:21<00:00, 40.37s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:19<00:00, 19.10s/it]

                   all         30        141      0.952      0.984      0.985      0.769






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      16/30         0G     0.9024     0.6082     0.9044         41        640: 100%|██████████| 5/5 [03:01<00:00, 36.36s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:20<00:00, 20.63s/it]

                   all         30        141      0.972      0.984      0.984      0.751






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      17/30         0G      0.974     0.6622     0.9372         47        640: 100%|██████████| 5/5 [03:12<00:00, 38.52s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:20<00:00, 20.82s/it]

                   all         30        141       0.99      0.979      0.983      0.751






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      18/30         0G     0.8743     0.6119     0.8935         54        640: 100%|██████████| 5/5 [03:33<00:00, 42.77s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:23<00:00, 23.90s/it]

                   all         30        141       0.99      0.986      0.984      0.723






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      19/30         0G     0.9087     0.6113     0.9233         34        640: 100%|██████████| 5/5 [03:59<00:00, 47.96s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:22<00:00, 22.45s/it]


                   all         30        141      0.975      0.985      0.983      0.733

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      20/30         0G     0.8452     0.5982     0.9375         30        640: 100%|██████████| 5/5 [03:45<00:00, 45.20s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:20<00:00, 20.94s/it]


                   all         30        141      0.955       0.99      0.983      0.752
Closing dataloader mosaic

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      21/30         0G     0.8651     0.5647     0.9409         29        640: 100%|██████████| 5/5 [03:35<00:00, 43.20s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:24<00:00, 24.28s/it]


                   all         30        141       0.98      0.989      0.984      0.758

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      22/30         0G     0.8143     0.5556      0.934         29        640: 100%|██████████| 5/5 [05:35<00:00, 67.06s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:28<00:00, 29.00s/it]


                   all         30        141       0.99      0.992      0.983      0.758

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      23/30         0G     0.8216     0.5373     0.9308         30        640: 100%|██████████| 5/5 [04:01<00:00, 48.32s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:26<00:00, 26.06s/it]


                   all         30        141       0.99      0.989      0.983      0.756

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      24/30         0G     0.8104     0.5153     0.8994         28        640: 100%|██████████| 5/5 [03:59<00:00, 47.88s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:24<00:00, 24.24s/it]


                   all         30        141      0.987      0.994      0.983       0.76

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      25/30         0G     0.8209     0.5044     0.9085         28        640: 100%|██████████| 5/5 [03:46<00:00, 45.23s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:26<00:00, 26.95s/it]


                   all         30        141      0.988      0.994      0.983      0.756

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      26/30         0G      0.827     0.5042     0.9132         29        640: 100%|██████████| 5/5 [05:12<00:00, 62.57s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:28<00:00, 28.71s/it]


                   all         30        141      0.989      0.994      0.983      0.772

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      27/30         0G     0.7906     0.4749     0.8995         28        640: 100%|██████████| 5/5 [04:51<00:00, 58.23s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:29<00:00, 29.65s/it]


                   all         30        141       0.99      0.994      0.984      0.775

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      28/30         0G     0.8109     0.4816     0.8951         29        640: 100%|██████████| 5/5 [05:34<00:00, 66.83s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:30<00:00, 30.76s/it]


                   all         30        141       0.99      0.994      0.984      0.777

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      29/30         0G     0.7519     0.4684     0.8897         28        640: 100%|██████████| 5/5 [03:41<00:00, 44.33s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:26<00:00, 26.98s/it]

                   all         30        141       0.99      0.994      0.984      0.776






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      30/30         0G     0.7376     0.4492     0.8756         28        640: 100%|██████████| 5/5 [03:30<00:00, 42.14s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:22<00:00, 22.38s/it]

                   all         30        141       0.99      0.994      0.984      0.777






30 epochs completed in 2.400 hours.
Optimizer stripped from runs\detect\train10\weights\last.pt, 52.0MB
Optimizer stripped from runs\detect\train10\weights\best.pt, 52.0MB

Validating runs\detect\train10\weights\best.pt...
Ultralytics YOLOv8.2.14  Python-3.11.4 torch-2.3.0+cpu CPU (Intel Xeon Gold 5118 2.30GHz)
YOLOv8m summary (fused): 218 layers, 25842655 parameters, 0 gradients, 78.7 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:21<00:00, 21.20s/it]


                   all         30        141       0.99      0.994      0.984      0.778
           nome_pessoa         30         29      0.966          1      0.965      0.784
                    rg         30         30      0.998          1      0.995      0.781
                   cpf         30         21      0.997          1      0.995      0.774
       data_nascimento         30         31      0.997      0.968       0.97      0.747
        data_expedicao         30         30      0.996          1      0.995      0.803
Speed: 2.4ms preprocess, 606.3ms inference, 0.0ms loss, 0.7ms postprocess per image
Results saved to runs\detect\train10
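The per-class table at the end of the log reports precision (P) and recall (R) but not their harmonic mean. As a small worked example, the sketch below parses rows copied from that table and computes F1 = 2PR / (P + R). The helpers `parse_row` and `f1_score` are hypothetical and assume the whitespace-separated column layout shown above.

```python
SUMMARY_ROWS = """\
all 30 141 0.99 0.994 0.984 0.778
nome_pessoa 30 29 0.966 1 0.965 0.784
rg 30 30 0.998 1 0.995 0.781
cpf 30 21 0.997 1 0.995 0.774
data_nascimento 30 31 0.997 0.968 0.97 0.747
data_expedicao 30 30 0.996 1 0.995 0.803
"""


def parse_row(line):
    """Split one summary row into its class name and numeric columns."""
    name, images, instances, p, r, map50, map50_95 = line.split()
    return {
        "class": name,
        "images": int(images),
        "instances": int(instances),
        "P": float(p),
        "R": float(r),
        "mAP50": float(map50),
        "mAP50-95": float(map50_95),
    }


def f1_score(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


rows = [parse_row(line) for line in SUMMARY_ROWS.splitlines()]
for row in rows:
    print(f'{row["class"]:<16} F1 = {f1_score(row["P"], row["R"]):.3f}')
```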


Next, we check whether the model achieved good results.

In [4]:
metrics = model.val()

metrics.box.map  # mAP50-95 averaged over all classes

Ultralytics YOLOv8.2.14  Python-3.11.4 torch-2.3.0+cpu CPU (Intel Xeon Gold 5118 2.30GHz)
YOLOv8m summary (fused): 218 layers, 25842655 parameters, 0 gradients, 78.7 GFLOPs


[34m[1mval: [0mScanning \\BR004\Usuários$\rodrigo.goncalves\Estudos\Projetos\rgbrain-lab-ocr\rgbrain_lab_ocr\datasets\documentos-pessoais\rg-verso\valid\labels.cache... 30 images, 0 backgrounds, 0 corrupt: 100%|██████████| 30/30 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:33<00:00, 16.93s/it]


                   all         30        141      0.991      0.994      0.984      0.779
           nome_pessoa         30         29      0.966          1      0.965      0.784
                    rg         30         30      0.998          1      0.995      0.782
                   cpf         30         21      0.997          1      0.995      0.772
       data_nascimento         30         31      0.997      0.968      0.969      0.744
        data_expedicao         30         30      0.995          1      0.995      0.812
Speed: 3.0ms preprocess, 1003.2ms inference, 0.0ms loss, 1.5ms postprocess per image
Results saved to runs\detect\train102


0.7787683478425725
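The value returned by `metrics.box.map` is the mAP50-95 averaged over classes. A quick check, using the rounded per-class figures copied from the validation table above, gives approximately the same number:

```python
# Per-class mAP50-95 values copied from the validation table (rounded to 3 decimals).
map50_95_per_class = {
    "nome_pessoa": 0.784,
    "rg": 0.782,
    "cpf": 0.772,
    "data_nascimento": 0.744,
    "data_expedicao": 0.812,
}

mean_map = sum(map50_95_per_class.values()) / len(map50_95_per_class)
print(f"{mean_map:.4f}")  # close to the value reported by metrics.box.map
```

The small difference from the reported value comes from the table being rounded to three decimals.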

With all of that in place, it is time to look at the results.

In [5]:
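# Run inference on a list of images (a single example here)
results = model(['datasets/documentos-pessoais/rg-verso/examples/00029411_in.jpg'])

for result in results:
    boxes = result.boxes          # bounding boxes of the detected fields
    masks = result.masks          # segmentation masks (None for a detection model)
    keypoints = result.keypoints  # pose keypoints (None for a detection model)
    probs = result.probs          # classification probabilities (None for a detection model)
    obb = result.obb              # oriented bounding boxes (None for a detection model)
    result.show()                 # display the annotated image
    result.save("teste.jpg")      # write the annotated image to disk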


0: 448x640 1 nome_pessoa, 1 rg, 1 cpf, 1 data_nascimento, 1 data_expedicao, 2109.5ms
Speed: 17.0ms preprocess, 2109.5ms inference, 51.0ms postprocess per image at shape (1, 3, 448, 640)
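To use the detections programmatically rather than just viewing them, the boxes can be turned into plain Python records. This is a sketch: `boxes_to_records` is a hypothetical helper, and it relies on the `xyxy`, `conf` and `cls` attributes of `result.boxes` and on `result.names`, which Ultralytics result objects expose.

```python
def boxes_to_records(result):
    """Convert one detection result into a list of dicts (label, confidence, box)."""
    boxes = result.boxes
    records = []
    for xyxy, conf, cls_id in zip(boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()):
        records.append({
            "label": result.names[int(cls_id)],    # class id -> class name
            "confidence": round(conf, 3),
            "box": [round(v, 1) for v in xyxy],    # x1, y1, x2, y2 in pixels
        })
    # Highest-confidence detections first
    return sorted(records, key=lambda rec: rec["confidence"], reverse=True)
```

With the `results` list from the cell above, `boxes_to_records(results[0])` would return one record per detected field, which could then be used to crop each region from the image.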


Finally, we save the model by exporting it.

In [6]:
model.export()

Ultralytics YOLOv8.2.14  Python-3.11.4 torch-2.3.0+cpu CPU (Intel Xeon Gold 5118 2.30GHz)

[34m[1mPyTorch:[0m starting from 'runs\detect\train10\weights\best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 9, 8400) (49.6 MB)

[34m[1mTorchScript:[0m starting export with torch 2.3.0+cpu...
[34m[1mTorchScript:[0m export success  26.2s, saved as 'runs\detect\train10\weights\best.torchscript' (99.1 MB)

Export complete (36.2s)
Results saved to [1m\\BR004\Usurios$\rodrigo.goncalves\Estudos\Projetos\rgbrain-lab-ocr\rgbrain_lab_ocr\runs\detect\train10\weights[0m
Predict:         yolo predict task=detect model=runs\detect\train10\weights\best.torchscript imgsz=640  
Validate:        yolo val task=detect model=runs\detect\train10\weights\best.torchscript imgsz=640 data=./datasets/documentos-pessoais/rg-verso/documentos-pessoais-rg-verso.yaml  
Visualize:       https://netron.app


'runs\\detect\\train10\\weights\\best.torchscript'
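The exported file can later be loaded for inference without repeating the training. The sketch below is an assumption about how that might look: `find_exported_weights` and `load_exported` are hypothetical helpers, the run directory is the one printed in the log, and loading relies on `YOLO` accepting a path to an exported model.

```python
from pathlib import Path


def find_exported_weights(run_dir, suffix=".torchscript"):
    """Return the exported weight files found under <run_dir>/weights, sorted by name."""
    return sorted(Path(run_dir, "weights").glob(f"*{suffix}"))


def load_exported(path):
    """Load an exported model for inference (requires the ultralytics package)."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it
    return YOLO(str(path), task="detect")  # task given explicitly for exported formats


candidates = find_exported_weights("runs/detect/train10")
print(candidates)  # empty list if the run directory does not exist on this machine
```

If a file is found, `load_exported(candidates[0])` would return a model that can be called on images in the same way as the trained one.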