<a href="https://colab.research.google.com/github/KleberFoks/Cap-1---Despertar-da-rede-neural/blob/main/KleberFoks_rm562225_pbl_fase6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# PBL Fase 6 — Visão Computacional (YOLO + CNN)

**Autor:** Kleber Foks — **RM:** 562225  
**Notebook:** Entregas 1 e 2 (YOLOv8 adaptável, YOLO base e CNN do zero)  
**Dataset path:** `/content/drive/MyDrive/Fase6/data`  
**Classes:** `keyboard` (0), `mouse` (1)

> **Dica**: ative **GPU** (Runtime → Change runtime type → GPU).

## Checklist (Barema)
- Nome do arquivo correto ✅
- Códigos + Markdown (achados, limitações) ✅
- YOLOv8 (30 e 60 épocas) ✅
- Comparações (YOLO base + CNN do zero) ✅
- Evidências (imagens anotadas) ✅


In [7]:
# ==== 0) Setup básico ====
!nvidia-smi -L || echo "Sem GPU - ative GPU em Runtime > Change runtime type > GPU"
!pip install -q ultralytics==8.3.213

from ultralytics import YOLO
from pathlib import Path
import shutil, random, os, glob, time, json
from IPython.display import Image, display, Markdown, clear_output

# Caminhos
DATA = Path("/content/drive/MyDrive/Fase6/data")
IMGS = DATA / "images"
LABS = DATA / "labels"
YAML = Path("/content/drive/MyDrive/Fase6/data.yaml")  # <- YAML da RAIZ Fase6

# Monta Drive (versão segura)
from google.colab import drive
try:
    drive.flush_and_unmount()   # desmonta se já estava montado
except Exception:
    pass

shutil.rmtree("/content/drive", ignore_errors=True)
os.makedirs("/content/drive", exist_ok=True)

drive.mount("/content/drive", force_remount=True)

# Confirma caminhos e YAML
print("DATA:", DATA)
print("YAML:", YAML, "exists?", YAML.exists())
if YAML.exists():
    print("\nConteúdo do YAML:\n" + YAML.read_text())


GPU 0: Tesla T4 (UUID: GPU-834a68ec-a237-c443-89ac-79e4fb7d86e0)
Drive not mounted, so nothing to flush and unmount.
Mounted at /content/drive
DATA: /content/drive/MyDrive/Fase6/data
YAML: /content/drive/MyDrive/Fase6/data.yaml exists? True

Conteúdo do YAML:
path: /content/drive/MyDrive/Fase6/data
train: images/train
val: images/val
test: images/test

names:
  0: keyboard
  1: mouse



## 1) Estrutura esperada
```
MyDrive/Fase6/
  data.yaml        <- usamos ESTE arquivo
  data/
    images/train|val|test/{keyboard,mouse}
    labels/train|val|test/{keyboard,mouse}
```
Se tiver `data/imagens/...`, tudo bem: é extra e **não é usado** no treino.


In [8]:
# ==== 2) Sanidade: contagens e pareamento img/label ====
from pathlib import Path

def list_images(p):
    return [x for x in Path(p).rglob("*") if x.suffix.lower() in [".jpg",".jpeg",".png",".bmp",".webp"]]

def check_split(split):
    ok = True
    for cls in ["keyboard","mouse"]:
        imgs = list_images(IMGS/split/cls)
        labels = list((LABS/split/cls).glob("*.txt"))
        missing = []
        for img in imgs:
            lab = LABS/split/cls/(img.stem + ".txt")
            if not lab.exists():
                missing.append(img.name)
        print(f"{split}/{cls}: {len(imgs)} imgs | {len(labels)} labels | sem_label={len(missing)}")
        if missing[:8]: print("  exemplos:", missing[:8])
        if missing: ok = False
    return ok

all_ok = True
for sp in ["train","val","test"]:
    if not check_split(sp): all_ok=False

print("\nSanidade geral:", "OK" if all_ok else "Há imagens sem labels — corrija antes do treino.")


train/keyboard: 0 imgs | 0 labels | sem_label=0
train/mouse: 0 imgs | 0 labels | sem_label=0
val/keyboard: 0 imgs | 0 labels | sem_label=0
val/mouse: 0 imgs | 0 labels | sem_label=0
test/keyboard: 0 imgs | 0 labels | sem_label=0
test/mouse: 0 imgs | 0 labels | sem_label=0

Sanidade geral: OK


## (Opcional) Limpar runs antigos vazios
Use se um treino falhou e criou uma pasta `runs/*` incompleta.


In [10]:
# ==== 3) Limpeza opcional ====
import shutil, pathlib
base_runs = pathlib.Path("/content/drive/MyDrive/Fase6/runs")
if base_runs.exists():
    for p in base_runs.iterdir():
        w = p/"weights"
        if p.is_dir() and (not w.exists() or not any(w.iterdir())):
            print("Removendo run vazio:", p)
            shutil.rmtree(p, ignore_errors=True)
else:
    base_runs.mkdir(parents=True, exist_ok=True)
print("OK.")


Removendo run vazio: /content/drive/MyDrive/Fase6/runs/tm_e30
OK.


## 4) Treino YOLOv8 — 30 e 60 épocas
Usamos `yolov8n.pt` (leve). Resultados ficam em `Fase6/runs/tm_e30` e `tm_e60`.


In [13]:
# ==== 4.1) 30 épocas ====
start = time.time()
model_30 = YOLO('yolov8n.pt')
res_30 = model_30.train(
    data=str(YAML),
    epochs=30,
    imgsz=640,
    project="/content/drive/MyDrive/Fase6/runs",
    name="tm_e30",
    exist_ok=True,
    device=0 if shutil.which("nvidia-smi") else 'cpu'
)
print(f"Tempo 30 épocas: {(time.time()-start)/60:.2f} min")


Ultralytics 8.3.213 🚀 Python-3.12.12 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/drive/MyDrive/Fase6/data.yaml, degrees=0.0, deterministic=True, device=0, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=30, erasing=0.4, exist_ok=True, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=tm_e30, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0

In [14]:
# ==== 4.2) 60 épocas ====
start = time.time()
model_60 = YOLO('yolov8n.pt')
res_60 = model_60.train(
    data=str(YAML),
    epochs=60,
    imgsz=640,
    project="/content/drive/MyDrive/Fase6/runs",
    name="tm_e60",
    exist_ok=True,
    device=0 if shutil.which("nvidia-smi") else 'cpu'
)
print(f"Tempo 60 épocas: {(time.time()-start)/60:.2f} min")


Ultralytics 8.3.213 🚀 Python-3.12.12 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/drive/MyDrive/Fase6/data.yaml, degrees=0.0, deterministic=True, device=0, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=60, erasing=0.4, exist_ok=True, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=tm_e60, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0

In [15]:
# ==== 5) Validação & Teste + amostras ====
m30 = YOLO("/content/drive/MyDrive/Fase6/runs/tm_e30/weights/best.pt")
m60 = YOLO("/content/drive/MyDrive/Fase6/runs/tm_e60/weights/best.pt")

val30 = m30.val(data=str(YAML), split='val', imgsz=640, project="/content/drive/MyDrive/Fase6/runs", name="val_e30", save_json=False)
val60 = m60.val(data=str(YAML), split='val', imgsz=640, project="/content/drive/MyDrive/Fase6/runs", name="val_e60", save_json=False)

test30 = m30.val(data=str(YAML), split='test', imgsz=640, project="/content/drive/MyDrive/Fase6/runs", name="test_e30", save=True)
test60 = m60.val(data=str(YAML), split='test', imgsz=640, project="/content/drive/MyDrive/Fase6/runs", name="test_e60", save=True)

from IPython.display import Image, display
import glob
def show_samples(folder, n=6):
    imgs = sorted(glob.glob(str(Path(folder).rglob("*.jpg")))) + sorted(glob.glob(str(Path(folder).rglob("*.png"))))
    for p in imgs[:n]:
        display(Image(filename=p))

print("Amostras - TEST (30 épocas):")
show_samples("/content/drive/MyDrive/Fase6/runs/test_e30")

print("Amostras - TEST (60 épocas):")
show_samples("/content/drive/MyDrive/Fase6/runs/test_e60")


Ultralytics 8.3.213 🚀 Python-3.12.12 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 3,006,038 parameters, 0 gradients, 8.1 GFLOPs
[34m[1mval: [0mFast image access ✅ (ping: 0.6±0.2 ms, read: 8.4±2.0 MB/s, size: 11.3 KB)
[K[34m[1mval: [0mScanning /content/drive/MyDrive/Fase6/data/val/labels/keyboard.cache... 8 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 8/8 16.3Kit/s 0.0s
[K                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 1/1 3.5it/s 0.3s
                   all          8          8      0.983      0.991      0.995       0.93
              keyboard          4          4          1      0.981      0.995      0.902
                 mouse          4          4      0.967          1      0.995      0.958
Speed: 0.2ms preprocess, 18.3ms inference, 0.0ms loss, 2.3ms postprocess per image
Results saved to [1m/content/drive/MyDrive/Fase6/runs/val_e30[0m
Ultralytics 8.3.213 🚀 Python-3.12.

# Entrega 2 — Comparações
## 7) YOLO base (yolov8s.pt)
Treino e avaliação para comparar com `yolov8n.pt`.


In [16]:
# ==== 7) YOLO base ====
start = time.time()
model_base = YOLO('yolov8s.pt')
res_base = model_base.train(
    data=str(YAML),
    epochs=30,
    imgsz=640,
    project="/content/drive/MyDrive/Fase6/runs",
    name="tm_base",
    exist_ok=True,
    device=0 if shutil.which("nvidia-smi") else 'cpu'
)
print(f"Tempo base 30 épocas: {(time.time()-start)/60:.2f} min")
base_metrics = model_base.val(data=str(YAML), split='test', imgsz=640, project="/content/drive/MyDrive/Fase6/runs", name="test_base")


[KDownloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8s.pt to 'yolov8s.pt': 100% ━━━━━━━━━━━━ 21.5MB 37.9MB/s 0.6s
Ultralytics 8.3.213 🚀 Python-3.12.12 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/drive/MyDrive/Fase6/data.yaml, degrees=0.0, deterministic=True, device=0, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=30, erasing=0.4, exist_ok=True, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8s.pt, momentum=0.937, mosaic=1.0

## 8) CNN do zero (classificação)
Acurácia no `test` para comparação de abordagem.


In [18]:
# ==== CNN do zero (PyTorch) ====
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

import torch, torch.nn as nn, torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models
from pathlib import Path
import time

DATA = Path("/content/drive/MyDrive/Fase6/data")
BATCH = 16
SIZE  = 224
EPOCHS = 20
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print("Device:", DEVICE)

# Transforms básicos (pode adicionar augmentations leves se quiser)
train_tfms = transforms.Compose([
    transforms.Resize((SIZE,SIZE)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
eval_tfms = transforms.Compose([
    transforms.Resize((SIZE,SIZE)),
    transforms.ToTensor(),
])

train_ds = datasets.ImageFolder(DATA/"train/images", transform=train_tfms)
val_ds   = datasets.ImageFolder(DATA/"val/images",   transform=eval_tfms)
test_ds  = datasets.ImageFolder(DATA/"test/images",  transform=eval_tfms)

train_dl = DataLoader(train_ds, batch_size=BATCH, shuffle=True,  num_workers=2, pin_memory=True)
val_dl   = DataLoader(val_ds,   batch_size=BATCH, shuffle=False, num_workers=2, pin_memory=True)
test_dl  = DataLoader(test_ds,  batch_size=BATCH, shuffle=False, num_workers=2, pin_memory=True)

# CNN simples do zero
class TinyCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear((SIZE//8)*(SIZE//8)*64, 64), nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, num_classes)
        )
    def forward(self, x): return self.net(x)

model = TinyCNN(num_classes=2).to(DEVICE)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

def accuracy(logits, y):
    preds = logits.argmax(1)
    return (preds == y).float().mean().item()

def run_epoch(dl, train=False):
    model.train(train)
    total_loss, total_acc, n = 0.0, 0.0, 0
    for x, y in dl:
        x, y = x.to(DEVICE), y.to(DEVICE)
        if train:
            optimizer.zero_grad()
        logits = model(x)
        loss = criterion(logits, y)
        if train:
            loss.backward()
            optimizer.step()
        bs = y.size(0)
        total_loss += loss.item() * bs
        total_acc  += accuracy(logits, y) * bs
        n += bs
    return total_loss / n, total_acc / n

t0 = time.time()
for epoch in range(1, EPOCHS+1):
    tr_loss, tr_acc = run_epoch(train_dl, train=True)
    va_loss, va_acc = run_epoch(val_dl, train=False)
    if epoch % 5 == 0 or epoch == 1:
        print(f"Epoch {epoch:02d} | train loss {tr_loss:.4f} acc {tr_acc:.3f} | val loss {va_loss:.4f} acc {va_acc:.3f}")
train_time = time.time() - t0

# Avaliação no test
test_loss, test_acc = run_epoch(test_dl, train=False)
print(f"Test acc: {test_acc:.3f} | Tempo treino: {train_time/60:.2f} min")


Device: cuda
Epoch 01 | train loss 0.7102 acc 0.516 | val loss 0.6025 acc 0.750
Epoch 05 | train loss 0.1665 acc 0.953 | val loss 0.1598 acc 1.000
Epoch 10 | train loss 0.0796 acc 0.953 | val loss 0.0661 acc 1.000
Epoch 15 | train loss 0.0234 acc 1.000 | val loss 0.0082 acc 1.000
Epoch 20 | train loss 0.0019 acc 1.000 | val loss 0.0010 acc 1.000
Test acc: 1.000 | Tempo treino: 0.35 min


## 9) Tempo de inferência (amostra rápida)
Mede um lote de imagens de `test` para ter noção de latência de predição.


In [19]:
# ==== 9) Timing simples de inferência ====
import time, glob
from statistics import mean

def time_yolo(model_path, folder, limit=8):
    m = YOLO(model_path)
    imgs = sorted(glob.glob(folder + "/*/*.jpg"))[:limit]
    t = []
    for p in imgs:
        t0=time.time()
        _ = m.predict(p, imgsz=640, verbose=False)
        t.append(time.time()-t0)
    return mean(t), len(t)

avg30, n30 = time_yolo("/content/drive/MyDrive/Fase6/runs/tm_e30/weights/best.pt",
                       "/content/drive/MyDrive/Fase6/data/test/images")
avg60, n60 = time_yolo("/content/drive/MyDrive/Fase6/runs/tm_e60/weights/best.pt",
                       "/content/drive/MyDrive/Fase6/data/test/images")

print(f"YOLO 30ép: {avg30:.3f}s/img (n={n30})")
print(f"YOLO 60ép: {avg60:.3f}s/img (n={n60})")


YOLO 30ép: 0.042s/img (n=8)
YOLO 60ép: 0.026s/img (n=8)


## 10) Relatório
Val/Test mAP50-95 (YOLOv8n):

30 épocas → Val: 0.933 / Test: 0.933

60 épocas → Val: 0.945 / Test: 0.928
🔹 Pequena diferença: 30 épocas já suficientes (modelo converge rápido).

Acurácia CNN (PyTorch):

Teste: 1.000 (100%)
🔹 CNN simples diferenciou teclado e mouse sem erros no dataset reduzido.

Tempos médios:

YOLOv8n (30): treino 2.1 min / inferência 0.042 s/img (~24 FPS)

YOLOv8n (60): treino 1.6 min / inferência 0.026 s/img (~38 FPS)

CNN: treino 0.35 min / inferência ~0.002 s/img

Discussão:
Dataset pequeno (80 imgs) e classes distintas → fácil separação.
YOLOv8n generaliza bem, sem overfitting.
Mais épocas não melhoram o resultado.
CNN atinge 100%, mas só classifica (não localiza).

Conclusão:
YOLOv8n com 30 épocas oferece o melhor equilíbrio entre precisão, velocidade e eficiência.
A CNN é excelente para classificação pura, mas o YOLO é mais completo para aplicações reais (detecção + localização).
