# Clone the repo
Get the repo and submodules

In [1]:
!git clone https://github.com/nuwandda/yolov7-logo.git

Cloning into 'yolov7-logo'...
remote: Enumerating objects: 188, done.
remote: Counting objects: 100% (117/117), done.
remote: Compressing objects: 100% (90/90), done.
remote: Total 188 (delta 57), reused 73 (delta 25), pack-reused 71 (from 1)
Receiving objects: 100% (188/188), 78.76 MiB | 23.67 MiB/s, done.
Resolving deltas: 100% (73/73), done.


Download submodules.

In [2]:
%cd yolov7-logo/
!git submodule update --init

/content/yolov7-logo
Submodule 'src/yolov7' (https://github.com/WongKinYiu/yolov7.git) registered for path 'src/yolov7'
Cloning into '/content/yolov7-logo/src/yolov7'...
Submodule path 'src/yolov7': checked out '8c0bf3f78947a2e81a1d552903b4934777acfa5f'


Install the necessary packages.

In [3]:
!pip install -r src/requirements.txt

Collecting protobuf<4.21.3 (from -r src/requirements.txt (line 14))
  Downloading protobuf-4.21.2-cp37-abi3-manylinux2014_x86_64.whl.metadata (540 bytes)
Collecting thop (from -r src/requirements.txt (line 37))
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl.metadata (2.7 kB)
Collecting jedi>=0.16 (from ipython->-r src/requirements.txt (line 35))
  Downloading jedi-0.19.2-py2.py3-none-any.whl.metadata (22 kB)
Downloading protobuf-4.21.2-cp37-abi3-manylinux2014_x86_64.whl (407 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 407.8/407.8 kB 9.5 MB/s eta 0:00:00
Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 41.5 MB/s eta 0:00:00
Installing collected packages: protobuf, jedi, thop
  Attempting uninstall: protobuf
    Found existing installation: protobuf 5.29.5
    Un

# Download the dataset
Run the **getFlickr.sh** file to download the dataset.

In [4]:
!sh data/getFlickr.sh

--2026-01-16 13:30:35--  http://image.ntua.gr/iva/datasets/flickr_logos/flickr_logos_27_dataset.tar.gz
Resolving image.ntua.gr (image.ntua.gr)... 147.102.11.1
Connecting to image.ntua.gr (image.ntua.gr)|147.102.11.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 101827904 (97M) [application/x-gzip]
Saving to: ‘data/flickr_logos_27_dataset.tar.gz’


2026-01-16 13:30:48 (7.87 MB/s) - ‘data/flickr_logos_27_dataset.tar.gz’ saved [101827904/101827904]

flickr_logos_27_dataset/
flickr_logos_27_dataset/flickr_logos_27_dataset_distractor_set_urls.txt
flickr_logos_27_dataset/flickr_logos_27_dataset_training_set_annotation.txt
flickr_logos_27_dataset/flickr_logos_27_dataset_query_set_annotation.txt
flickr_logos_27_dataset/flickr_logos_27_dataset_images.tar.gz
tar (child): data/flickr_logos_27_dataset/flickr_logos_27_dataset_images.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is 
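
The log above suggests that the outer archive was unpacked into the working directory while the inner image archive was then looked up under `data/`. As a minimal sketch, assuming that layout, both archives can be unpacked explicitly from Python. The helper name `extract_nested` and the choice to unpack each inner archive next to where it was extracted are assumptions for illustration, not what `getFlickr.sh` does.

```python
import tarfile
from pathlib import Path


def extract_nested(outer_archive, dest_dir):
    """Unpack outer_archive into dest_dir, then every *.tar.gz found inside it.

    Hypothetical helper; the layout is inferred from the log above.
    Returns the list of inner archives that were unpacked.
    """
    outer_path = Path(outer_archive).resolve()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Step 1: the outer archive goes under dest_dir, not the current directory.
    with tarfile.open(outer_path, "r:gz") as outer:
        outer.extractall(dest)

    # Step 2: unpack each nested archive next to where it was extracted,
    # skipping the outer archive itself if it happens to live in dest_dir.
    inner_archives = [
        p for p in sorted(dest.rglob("*.tar.gz")) if p.resolve() != outer_path
    ]
    for inner_path in inner_archives:
        with tarfile.open(inner_path, "r:gz") as inner:
            inner.extractall(inner_path.parent)
    return inner_archives
```

Under these assumptions, calling `extract_nested("data/flickr_logos_27_dataset.tar.gz", "data")` would leave the annotation files and the unpacked images under `data/flickr_logos_27_dataset/`.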

# Prepare data
Now that we have our dataset, we need to convert the annotations into the format expected by YOLOv7. YOLOv7 also expects the data to be organized in a specific directory layout; otherwise it cannot parse the directories.
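
As a rough illustration of the conversion, the sketch below turns one pixel-space box, given by its corners, into the normalized `class x_center y_center width height` line used in YOLO label files. The function name `corners_to_yolo_line` and the example numbers are made up for illustration; `src/convert_annotations.py` remains the actual converter.

```python
def corners_to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Format one box as a YOLO label line (all values relative to image size)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box covering the middle of a 400x200 image.
line = corners_to_yolo_line(0, 100, 50, 300, 150, img_w=400, img_h=200)
print(line)  # 0 0.500000 0.500000 0.500000 0.500000
```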

In [10]:
!python src/convert_annotations.py --dataset flickr27

100% 1079/1079 [00:05<00:00, 182.23it/s]


To check that the conversion is correct, run the script again with the `--plot` flag.

In [17]:
!python src/convert_annotations.py --dataset flickr27 --plot

100% 1079/1079 [00:03<00:00, 273.28it/s]
Traceback (most recent call last):
  File "/content/yolov7-logo/src/convert_annotations.py", line 246, in <module>
    main()
  File "/content/yolov7-logo/src/convert_annotations.py", line 236, in main
    assert os.path.exists(image_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
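
Independently of the `--plot` option, a single label line can be checked by hand by mapping it back to pixel corners and comparing the result with the original annotation. The sketch below assumes the usual YOLO label layout (`class x_center y_center width height`, all normalized); the name `yolo_line_to_corners` and the sample values are hypothetical.

```python
def yolo_line_to_corners(line, img_w, img_h):
    """Convert one YOLO label line back to (class_id, xmin, ymin, xmax, ymax) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Undo the normalization by the image size.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    xmin, xmax = round(xc - w / 2), round(xc + w / 2)
    ymin, ymax = round(yc - h / 2), round(yc + h / 2)
    return int(class_id), xmin, ymin, xmax, ymax


# Hypothetical label line for a 640x480 image.
result = yolo_line_to_corners("3 0.500000 0.250000 0.200000 0.100000", 640, 480)
print(result)  # (3, 256, 96, 384, 144)
```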


In [13]:
%%writefile src/convert_annotations.py
import os
import xml.etree.ElementTree as ET
from tqdm import tqdm
from PIL import Image, ImageDraw
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
import argparse
import re
import cv2  # used in main() to read images and draw boxes

def get_class_names_logodet(path):
    classes = {}
    class_number = 0
    for folder in glob(path + '/*/', recursive = True):
        for subfolder in glob(folder + '/*/', recursive = True):
            class_name = subfolder.split('/')[-2]
            classes[class_name] = class_number
            class_number += 1
    return classes

def get_class_names_yaml_logodet(path):
    classes = []
    for folder in glob(path + '/*/', recursive = True):
        for subfolder in glob(folder + '/*/', recursive = True):
            classes.append(subfolder.split('/')[-2])
    return classes

def get_annotations_logodet(path):
    annotations = []
    for folder in glob(path + '/*/', recursive = True):
        for subfolder in glob(folder + '/*/', recursive = True):
            for xml in glob(subfolder + '*.xml'):
                annotations.append(xml)
    return annotations

def get_class_names(path):
    classes = {}
    class_number = -1
    current_class = ''
    with open(path) as f:
        lines = f.readlines()
        class_name = ''
        for line in lines:
            class_name = str(line.split(' ')[1])

            if current_class != class_name:
                class_number += 1
            classes[class_name] = class_number
            current_class = class_name
    return classes

def get_image_paths(path):
    annotations = []
    for image in glob(path + '/*.jpg'):
        annotations.append(image)
    return annotations

def extract_info_from_annotations(line):
    class_name = str(line.split(' ')[1])
    xmin = int(line.split(' ')[3])
    ymin = int(line.split(' ')[4])
    xmax = int(line.split(' ')[5])
    ymax = int(line.split(' ')[6])

    return class_name, xmin, ymin, xmax, ymax

# Convert a box given as top-left corner plus width and height (in pixels)
# to YOLO format: center and size, normalized by the image dimensions.
def convert_bbox_coco_to_yolo(img_width, img_height, x_min, y_min, w, h):
    x_center = (x_min + w / 2) / img_width
    y_center = (y_min + h / 2) / img_height
    w_normalized = w / img_width
    h_normalized = h / img_height
    return x_center, y_center, w_normalized, h_normalized

def main():
    parser = argparse.ArgumentParser(
        description="Convert annotations to YOLO format and visualize them."
    )
    parser.add_argument(
        "--dataset",
        type=str,
        default="flickr27",
        help="Dataset name (e.g., 'flickr27').",
    )
    parser.add_argument(
        "--plot",
        action="store_true",
        help="Plot bounding boxes on images and save them.",
    )
    args = parser.parse_args()

    dataset_path = os.path.join("data", args.dataset)
    image_path = os.path.join(dataset_path, "images")
    yolo_labels_path = os.path.join(dataset_path, "labels")
    os.makedirs(yolo_labels_path, exist_ok=True)

    train_annotation_file = os.path.join(
        dataset_path, f"{args.dataset}_training_set_annotation.txt"
    )
    query_annotation_file = os.path.join(
        dataset_path, f"{args.dataset}_query_set_annotation.txt"
    )

    all_annotations = {}
    class_ids = {}  # class name -> integer id, in order of first appearance

    # Read both annotation files; latin-1 avoids decode errors on non-UTF-8 bytes.
    for annotation_file, desc in (
        (train_annotation_file, "Processing Training Annotations"),
        (query_annotation_file, "Processing Query Annotations"),
    ):
        with open(annotation_file, "r", encoding='latin-1') as file:
            annotation_list = file.read().split("\n")[:-1]
        for annotation_line in tqdm(annotation_list, desc=desc):
            # Same field layout as extract_info_from_annotations():
            # 0 = image file, 1 = class name, 3-6 = box corners.
            parts = annotation_line.split(" ")
            if len(parts) < 7:
                continue  # no box coordinates on this line
            img_filename = parts[0]
            class_label = class_ids.setdefault(parts[1], len(class_ids))
            x_min, y_min, x_max, y_max = map(int, parts[3:7])

            all_annotations.setdefault(img_filename, []).append(
                (class_label, x_min, y_min, x_max, y_max)
            )

    for img_filename, annotations in tqdm(all_annotations.items(), desc="Converting to YOLO and Plotting"):
        try:
            img_path = os.path.join(image_path, img_filename)
            img = cv2.imread(img_path)
            if img is None:
                print(f"Warning: Could not read image {img_path}. Skipping.")
                continue

            img_height, img_width, _ = img.shape

            label_filename = os.path.splitext(img_filename)[0] + ".txt"
            label_filepath = os.path.join(yolo_labels_path, label_filename)

            with open(label_filepath, "w") as f:
                for class_label, x_min, y_min, x_max, y_max in annotations:
                    w = x_max - x_min
                    h = y_max - y_min
                    x_center, y_center, w_normalized, h_normalized = \
                        convert_bbox_coco_to_yolo(img_width, img_height, x_min, y_min, w, h)
                    f.write(f"{class_label} {x_center:.6f} {y_center:.6f} {w_normalized:.6f} {h_normalized:.6f}\n")

            if args.plot:
                plot_img_path = os.path.join(dataset_path, "plots", img_filename)
                os.makedirs(os.path.dirname(plot_img_path), exist_ok=True)
                img_copy = img.copy()
                for class_label, x_min, y_min, x_max, y_max in annotations:
                    cv2.rectangle(img_copy, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
                    cv2.putText(img_copy, str(class_label), (x_min, y_min - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
                cv2.imwrite(plot_img_path, img_copy)

        except Exception as e:
            print(f"Error processing {img_filename}: {e}")

if __name__ == "__main__":
    main()

Overwriting src/convert_annotations.py


Next, we need to partition the dataset into train, validation, and test sets. These will contain 80%, 10%, and 10% of the data, respectively.
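
To make the 80/10/10 partition concrete, here is a minimal standard-library sketch. It is not the code in `src/prepare_data.py` (which, judging by the traceback in this section, relies on scikit-learn's `train_test_split`); the function name `split_dataset`, the seed, and the file names are assumptions for illustration.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=1):
    """Shuffle items reproducibly and cut them into train/val/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Hypothetical file names standing in for the dataset images.
names = [f"img_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))  # 800 100 100
```

In practice the same shuffled order should be applied to images and their label files so that each pair ends up in the same set.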

In [12]:
!python src/prepare_data.py --dataset flickr27

Traceback (most recent call last):
  File "/content/yolov7-logo/src/prepare_data.py", line 95, in <module>
    main()
  File "/content/yolov7-logo/src/prepare_data.py", line 69, in main
    train_images, val_images, train_annotations, val_annotations = train_test_split(images, annotations, test_size = 0.2, random_state = 1)
                                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/sklearn/utils/_param_validation.py", line 216, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/sklearn/model_selection/_split.py", line 2851, in train_test_split
    n_train, n_test = _validate_shuffle_split(
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/sklearn/model_selection/_split.py", line 2481, in _validate_shuffle_split
    raise ValueError(
ValueErro

# Start training

If you want to train on the GPU, two changes to the code are needed before running the training command below; otherwise it raises an error when executed.

Edit ```yolov7/utils/loss.py``` as follows.

Change line 685 to:
```
from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda'))
```

Add the following at line 757:
```
fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda'))
```

In [8]:
!python src/yolov7/train.py --img-size 640 --cfg src/cfg/training/yolov7.yaml --hyp data/hyp.scratch.yaml --batch 2 --epoch 300 --data data/logo_data_flickr.yaml --weights src/yolov7_training.pt --workers 2 --name yolo_logo_det --device 0

2026-01-16 13:31:00.618515: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2026-01-16 13:31:00.623227: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2026-01-16 13:31:00.638294: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1768570260.664485    1542 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1768570260.671697    1542 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1768570260.692190    1542 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linkin
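
The first lines of the log above report that no CUDA drivers were found, even though the command passes `--device 0`. As a small sketch, the training command can be assembled in Python so that the `--device` value is chosen in one place. The helper name `build_train_command` is hypothetical, and treating a visible `nvidia-smi` binary as a sign of a usable GPU is only a heuristic assumption. YOLOv7's `train.py` accepts `cpu` as a `--device` value.

```python
import shutil


def build_train_command(device=None):
    """Return the train.py argument list from the cell above (hypothetical helper)."""
    if device is None:
        # Assumption: a visible nvidia-smi binary means GPU 0 can be used.
        device = "0" if shutil.which("nvidia-smi") else "cpu"
    return [
        "python", "src/yolov7/train.py",
        "--img-size", "640",
        "--cfg", "src/cfg/training/yolov7.yaml",
        "--hyp", "data/hyp.scratch.yaml",
        "--batch", "2",
        "--epoch", "300",
        "--data", "data/logo_data_flickr.yaml",
        "--weights", "src/yolov7_training.pt",
        "--workers", "2",
        "--name", "yolo_logo_det",
        "--device", device,
    ]


print(" ".join(build_train_command("cpu")))
```

The returned list can be passed to `subprocess.run` or joined into a single line for a notebook `!` cell.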