## Real-time car detection with neural networks

---

In this stage I take a partially prepared dataset and use it to train the model.

Dataset used: https://www.kaggle.com/datasets/sshikamaru/car-object-detection/data

I placed the training data in the "samochody/images/train/" directory and the test data in "samochody/images/test/".
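The conversion and training steps later in this notebook assume that matching `labels/` folders exist next to the `images/` folders. A minimal sketch of creating that layout is shown below; the helper name `make_dataset_dirs` is hypothetical and not part of the original notebook.

```python
import os

def make_dataset_dirs(root):
    """Create images/ and labels/ subfolders for the train and test splits."""
    created = []
    for kind in ("images", "labels"):
        for split in ("train", "test"):
            path = os.path.join(root, kind, split)
            os.makedirs(path, exist_ok=True)  # no error if the folder already exists
            created.append(path)
    return created
```

Calling `make_dataset_dirs("samochody")` would prepare the four folders used in this notebook without touching any files that are already there.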

Architecture used: https://github.com/ultralytics/yolov5

---

### I start by importing the required libraries

In [3]:
import torch
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import os
from ultralytics import YOLO
import PIL.Image  # import the submodule explicitly so that PIL.Image.open is always available

First I load a new, untrained model with the YOLOv5s architecture.

In [10]:
# Print the architecture definition; the context manager closes the file automatically
with open('yolov5s.yaml', "r", encoding="utf-8") as y:
    print(y.read())

model = YOLO("yolov5s.yaml")

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Parameters
nc: 80 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
anchors:
  - [10, 13, 16, 30, 33, 23] # P3/8
  - [30, 61, 62, 45, 59, 119] # P4/16
  - [116, 90, 156, 198, 373, 326] # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [
    [-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
    [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
    [-1, 3, C3, [128]],
    [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
    [-1, 6, C3, [256]],
    [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
    [-1, 9, C3, [512]],
    [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
    [-1, 3, C3, [1024]],
    [-1, 1, SPPF, [1024, 5]], # 9
  ]

# YOLOv5 v6.0 head
head: [
    [-1, 1, Conv, [512, 1, 1]],
    [-1, 1, nn.Upsample, [None, 2, "nearest"]],
    [[-1, 6], 1, Concat, [1]], # cat backbone P4
    [-1, 3, C3, [512, False]], # 13

    [-1, 1, Conv, [256, 1, 1]],
    [-1, 1, nn.Upsample, [None, 2, 

Next I need to prepare the dataset in the format expected for training.

In this case the bounding boxes for the training data are already provided; they only need to be converted to the right format.

In [6]:
csv_path = "samochody\\train_solution_bounding_boxes (1).csv"
df = pd.read_csv(csv_path)
df.head()

|   | image | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|
| 0 | vid_4_1000.jpg | 281.259045 | 187.035071 | 327.727931 | 223.225547 |
| 1 | vid_4_10000.jpg | 15.163531 | 187.035071 | 120.329957 | 236.43018 |
| 2 | vid_4_10040.jpg | 239.192475 | 176.764801 | 361.968162 | 236.43018 |
| 3 | vid_4_10020.jpg | 496.483358 | 172.363256 | 630.02026 | 231.539575 |
| 4 | vid_4_10060.jpg | 16.63097 | 186.54601 | 132.558611 | 238.386422 |


For the training data:

In [None]:
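# Sub-directories relative to the dataset root "samochody"
it = "/images/train/"
lt = "/labels/train/"

for _, row in df.iterrows():
    image_path = "samochody" + it + row['image']
    xmin, ymin, xmax, ymax = row[['xmin', 'ymin', 'xmax', 'ymax']]

    # The image size is needed to normalize pixel coordinates to the 0-1 range
    with PIL.Image.open(image_path) as img:
        img_width, img_height = img.size

    # YOLO label format: class x_center y_center width height (all normalized)
    x_center = ((xmin + xmax) / 2) / img_width
    y_center = ((ymin + ymax) / 2) / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height

    # Append mode: an image with several cars has several CSV rows, one label line each.
    # Re-running this cell appends duplicate lines, so clear labels/train beforehand.
    txt_path = "samochody" + lt + os.path.splitext(row['image'])[0] + ".txt"
    with open(txt_path, 'a') as f:
        f.write(f"0 {x_center} {y_center} {width} {height}\n")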
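The conversion above can be checked in isolation on a box whose result is easy to verify by hand. The sketch below wraps the same four formulas in a function; the name `to_yolo` and the 400x200 example image are hypothetical and chosen only for illustration.

```python
def to_yolo(xmin, ymin, xmax, ymax, img_width, img_height):
    """Convert a pixel-space corner box to a normalized YOLO (cx, cy, w, h) tuple."""
    x_center = ((xmin + xmax) / 2) / img_width
    y_center = ((ymin + ymax) / 2) / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height
    return x_center, y_center, width, height

# Hypothetical box covering the middle half of a 400x200 image
box = to_yolo(100, 50, 300, 150, img_width=400, img_height=200)
print(box)  # (0.5, 0.5, 0.5, 0.5)
```

All four values should stay between 0 and 1 for any box that lies inside the image; a value outside that range usually points to swapped coordinates or a wrong image size.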

For the test data I created the bounding boxes with label-studio, because the original dataset does not include them.

Next I create the dataset configuration file.

In [9]:
# Print the dataset configuration; the context manager closes the file automatically
with open('dataset.yaml', "r", encoding="utf-8") as c:
    print(c.read())

# moj_dataset.yaml
path: samochody/
train: images/train/
val: images/test/
nc: 1  
names: ['samochód']  # Class names
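Before training it can help to check how many images in a split actually have labels, since images without a label file are treated as background. The sketch below assumes that each image in `images/<split>/` is matched to a `.txt` file with the same stem in `labels/<split>/`; the helper name `count_labeled` and the list of extensions are hypothetical.

```python
import os

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions

def count_labeled(images_dir, labels_dir):
    """Return (labeled, background) image counts for one split."""
    labeled = background = 0
    for name in sorted(os.listdir(images_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTS:
            continue  # skip files that are not images
        label_path = os.path.join(labels_dir, stem + ".txt")
        # An image counts as labeled only if its label file exists and is non-empty
        if os.path.isfile(label_path) and os.path.getsize(label_path) > 0:
            labeled += 1
        else:
            background += 1
    return labeled, background
```

For example, `count_labeled("samochody/images/train", "samochody/labels/train")` gives two numbers that can be compared with the "images" and "backgrounds" counts reported when the training data is scanned.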


Next I move on to training the model.

In [None]:
results = model.train(data="dataset.yaml", epochs=10, batch=8)

Ultralytics 8.3.91  Python-3.12.8 torch-2.6.0+cu126 CUDA:0 (NVIDIA GeForce GTX 970, 4096MiB)
[34m[1mengine\trainer: [0mtask=detect, mode=train, model=yolov5s.yaml, data=dataset.yaml, epochs=10, time=None, patience=100, batch=8, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train2, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True

[34m[1mtrain: [0mScanning C:\Users\Adam\Desktop\venv\pytorch\samochody\labels\train.cache... 355 images, 646 backgrounds, 0 corrupt: 100%|██████████| 1001/1001 [00:00<?, ?it/s]




[34m[1mval: [0mScanning C:\Users\Adam\Desktop\venv\pytorch\samochody\labels\test... 175 images, 83 backgrounds, 0 corrupt: 100%|██████████| 175/175 [00:00<00:00, 224.65it/s]

[34m[1mval: [0mNew cache created: C:\Users\Adam\Desktop\venv\pytorch\samochody\labels\test.cache





Plotting labels to runs\detect\train2\labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.002, momentum=0.9) with parameter groups 69 weight(decay=0.0), 76 weight(decay=0.0005), 75 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 4 dataloader workers
Logging results to [1mruns\detect\train2[0m
Starting training for 10 epochs...
Closing dataloader mosaic

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       1/10      2.14G      3.704      13.67      3.549          0        640: 100%|██████████| 126/126 [01:02<00:00,  2.00it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.77it/s]
                   all        175        156          0          0          0          0

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       2/10      2.17G      2.449      5.386      2.597          0        640: 100%|██████████| 126/126 [01:01<00:00,  2.07it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.63it/s]
                   all        175        156      0.263      0.359      0.215     0.0546

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       3/10      2.19G      2.331      3.364      2.476          1        640: 100%|██████████| 126/126 [01:00<00:00,  2.09it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.83it/s]
                   all        175        156      0.319      0.455      0.289       0.11

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       4/10      2.24G      2.033       2.58       2.14          0        640: 100%|██████████| 126/126 [01:00<00:00,  2.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.85it/s]
                   all        175        156      0.425      0.378      0.324      0.121

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       5/10      2.27G      2.056       2.18      2.179          1        640: 100%|██████████| 126/126 [01:00<00:00,  2.09it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.84it/s]
                   all        175        156      0.451      0.395      0.335      0.108

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       6/10      2.27G      1.991      1.808      2.112          1        640: 100%|██████████| 126/126 [01:00<00:00,  2.09it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.99it/s]
                   all        175        156      0.506      0.525      0.481       0.18

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       7/10      2.27G      1.904      1.759      1.996          1        640: 100%|██████████| 126/126 [01:00<00:00,  2.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.98it/s]
                   all        175        156      0.579      0.519      0.512      0.192

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       8/10      2.27G      1.772       1.69      1.881          1        640: 100%|██████████| 126/126 [01:00<00:00,  2.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  5.08it/s]
                   all        175        156      0.579      0.455      0.469       0.19

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       9/10      2.27G      1.736      1.621      1.857          0        640: 100%|██████████| 126/126 [01:00<00:00,  2.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  5.05it/s]
                   all        175        156      0.714      0.513      0.589      0.243

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      10/10       2.3G      1.701      1.403      1.826          1        640: 100%|██████████| 126/126 [01:00<00:00,  2.08it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  5.04it/s]
                   all        175        156        0.7      0.513      0.624      0.271

10 epochs completed in 0.183 hours.
Optimizer stripped from runs\detect\train2\weights\last.pt, 18.5MB
Optimizer stripped from runs\detect\train2\weights\best.pt, 18.5MB

Validating runs\detect\train2\weights\best.pt...
Ultralytics 8.3.91  Python-3.12.8 torch-2.6.0+cu126 CUDA:0 (NVIDIA GeForce GTX 970, 4096MiB)
YOLOv5s summary (fused): 84 layers, 9,111,923 parameters, 0 gradients, 23.8 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 11/11 [00:02<00:00,  4.88it/s]
                   all        175        156      0.701      0.513      0.622      0.271
Speed: 0.3ms preprocess, 8.3ms inference, 0.0ms loss, 1.1ms postprocess per image
Results saved to runs\detect\train2
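The final validation of `best.pt` reports almost the same numbers as epoch 10, which suggests that the last epoch was also the best one. The sketch below reproduces that selection from the logged values, assuming the checkpoint is ranked by a weighted sum of mAP50 and mAP50-95. The 0.1 and 0.9 weights are an assumption about the library's default and should be checked against the installed Ultralytics version.

```python
# (epoch, mAP50, mAP50-95) copied from the validation rows of the training log
metrics = [
    (1, 0.0, 0.0), (2, 0.215, 0.0546), (3, 0.289, 0.11), (4, 0.324, 0.121),
    (5, 0.335, 0.108), (6, 0.481, 0.18), (7, 0.512, 0.192), (8, 0.469, 0.19),
    (9, 0.589, 0.243), (10, 0.624, 0.271),
]

def fitness(map50, map50_95):
    """Assumed ranking score: 10% mAP50 plus 90% mAP50-95."""
    return 0.1 * map50 + 0.9 * map50_95

# Pick the epoch with the highest assumed fitness score
best_epoch = max(metrics, key=lambda m: fitness(m[1], m[2]))[0]
print(best_epoch)  # 10
```

Because both metrics were still rising at epoch 10, more epochs are likely to help, which is consistent with continuing the training afterwards.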


I save the model.

In [None]:
model.save("yolo_moj_model.pt")

I will run further training with the trening.py script.

In [11]:
# Print the training script; the context manager closes the file automatically
with open('trening.py', "r", encoding="utf-8") as t:
    print(t.read())

if __name__ == '__main__':
	import torch
	import pandas as pd
	import numpy as np
	from matplotlib import pyplot as plt
	from ultralytics import YOLO
	import datetime

	obecna_data = datetime.datetime.now()
	sformatowana_data = obecna_data.strftime("%d%m%Y-%H%M")
	nazwa_pliku =  "yolo_moj_model.pt"

	model = YOLO(nazwa_pliku)
	#device = torch.device('cpu')
	#model.to(device)
	bname = "yolo_moj_model_BACKUP" + sformatowana_data + ".pt"

	print(bname)

	model.save(bname) # create a backup just in case

	# training parameters
	e = 20
	b = 4

	# training
	model.train(data="dataset.yaml", epochs=e, batch=b)
	# end of training

	model.save(nazwa_pliku) # overwrite the previous model file
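The backup name in the script is built from the current date and time in day-month-year order. A minimal sketch of that naming step with a fixed timestamp is shown below; the function name `backup_name` and the example date are hypothetical.

```python
import datetime

def backup_name(base, when):
    """Build a backup file name: base + 'DDMMYYYY-HHMM' + '.pt'."""
    return base + when.strftime("%d%m%Y-%H%M") + ".pt"

# Hypothetical fixed timestamp so that the result is reproducible
example = backup_name("yolo_moj_model_BACKUP", datetime.datetime(2025, 3, 15, 14, 5))
print(example)  # yolo_moj_model_BACKUP15032025-1405.pt
```

Since the timestamp has minute resolution, two runs started within the same minute would produce the same backup name. A year-first pattern such as `%Y%m%d-%H%M` would additionally make the backups sort chronologically by file name.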


---