# The Mapillary Traffic Sign Dataset for Detection and Classification on a Global Scale - paper review

## Challenges
- traffic signs are easily confused with other object classes in street scenes
- reflections, low-light conditions, damage, and occlusion
- fine-grained classification
- traffic signs are relatively small in size

## Dataset Statistics
- images: 52,453 fully annotated, 47,547 partially annotated
- sign categories: 313 + 1 (other sign)
- total signs: 257 543

| train  | dev   | test   |
|--------|-------|--------|
| 36 589 | 5 320 | 10 544 |

- distribution plots are present in the paper

## Annotation Process
The annotations were produced by 15 experts trained for this task, and the authors monitored annotation quality continuously. Each image was seen by at least two annotators. To further validate the quality, they ran a separate annotation experiment on a smaller subset of images and cross-checked the results, which showed only minor differences.

### 1. Selection
The images were selected to meet the following criteria:
- uniform geographical distribution of images around the world (weighted by continent population)
- coverage of different image qualities, captured under varying conditions
- as many signs as possible per image
- compensation for the long-tailed distribution of potential traffic sign classes

### 2. Annotation
The annotation pipeline consisted of 3 steps:
1. Image Approval: because the pre-selection was automatic, the annotators first checked that each image fulfilled the dataset criteria.
2. Sign Localization: the bounding boxes were pre-generated automatically. The annotators were asked to verify and adjust the bounding boxes to fit all traffic signs in the image.
3. Sign Classification: the annotators were asked to provide the correct class label for each sign (identified by its bounding box). This was not trivial with 313 classes, so the signs were pre-annotated automatically using a proposal network.

## Baseline
- Faster R-CNN with ResNet50 and ResNet101 backbones
- two tasks: detection only and detection + classification
- ResNet50: 83.4 mAP over all 313 classes
- their best-performing approach used a two-stage pipeline: 1. binary object detection, 2. multi-class classification using a decoupled shallow classification network
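
The two-stage idea can be illustrated with a minimal sketch. The `detect` and `classify` callables are hypothetical stand-ins for the binary detector and the shallow classification network (they are not the paper's models), and the image is a plain nested list so that the example runs without extra dependencies.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels
Image = Sequence[Sequence[int]]  # rows of pixel values


def two_stage_predict(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[Sequence[int]]], str],
) -> List[Tuple[Box, str]]:
    """Run class-agnostic detection first, then classify every cropped box."""
    predictions = []
    for box in detect(image):
        xmin, ymin, xmax, ymax = box
        # Stage 2 only sees the crop, so the classifier is decoupled from the detector.
        crop = [row[xmin:xmax] for row in image[ymin:ymax]]
        predictions.append((box, classify(crop)))
    return predictions


# Toy stand-ins (assumptions for illustration only).
def toy_detect(image: Image) -> List[Box]:
    return [(1, 1, 3, 3)]


def toy_classify(crop) -> str:
    return "bright-sign" if sum(map(sum, crop)) > 0 else "dark-sign"


toy_image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
print(two_stage_predict(toy_image, toy_detect, toy_classify))
```

With a NumPy image the crop would simply be `image[ymin:ymax, xmin:xmax]`; the point of the sketch is only that the classifier never needs to know how the boxes were produced.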


In [2]:
import os
import numpy as np
import json
import shutil
from tqdm import tqdm
import matplotlib.pyplot as plt
import cv2

In [13]:
PATH = "C:\\Users\\tlust\\Downloads\\mtsd"
DOUBLE_STEP = False

# input files
splits_path = os.path.join(PATH, "splits")
images_path = os.path.join(PATH, "images")
annotations_path = os.path.join(PATH, "annotations")

# output files
detect_path = os.path.join(PATH, "yolov8", "detect")
coco_path = os.path.join(PATH, "coco")
cls_path = os.path.join(PATH, "yolov8", "classify")

# YOLOv8

## Detection dataset
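With `DOUBLE_STEP = False` (single-step), the detector is trained directly on the full sign taxonomy. With `DOUBLE_STEP = True`, every sign is mapped to a single `traffic-sign` class (binary detection) and the fine-grained classification is handled by a separate model.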
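
The label mapping behind `DOUBLE_STEP` can be sketched as a small helper. This assumes that MTSD labels follow the `category--name--gN` pattern (for example `regulatory--stop--g1`) and that `other-sign` is the only label without a suffix; the function names are illustrative and not part of the notebook.

```python
import re


def strip_group_suffix(label: str) -> str:
    """Drop a trailing --gN group suffix; labels without one are returned unchanged."""
    parts = label.split("--")
    if len(parts) > 1 and re.fullmatch(r"g\d+", parts[-1]):
        return "--".join(parts[:-1])
    return label


def to_detection_label(label: str, double_step: bool) -> str:
    """Map an annotation label to the class name used by the detector."""
    if double_step:
        # Two-step pipeline: the detector only separates signs from background.
        return "traffic-sign"
    return strip_group_suffix(label)


print(to_detection_label("regulatory--stop--g1", double_step=False))
print(to_detection_label("other-sign", double_step=False))
print(to_detection_label("regulatory--stop--g1", double_step=True))
```

Checking the suffix explicitly keeps `other-sign` intact, whereas blindly dropping the last `--` segment would turn it into an empty class name.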

In [4]:
labels = []

# statistics
rejected = 0
total = 0
sign_distr = {}

for split in ['train', 'val']:
    print("Processing {} split...".format(split))

    # 0. create output directories if they do not exist
    out_dir = os.path.join(detect_path, split)
    os.makedirs(os.path.join(out_dir, "images"), exist_ok=True)
    os.makedirs(os.path.join(out_dir, "labels"), exist_ok=True)

    with open(os.path.join(splits_path, split + ".txt")) as f:
        ids = f.readlines()

    for id in tqdm(ids, total=len(ids)):
        total += 1    

        # 1. set and validate paths
        id = id.strip()
        img_path = os.path.join(images_path, f"{id}.jpg")
        ann_path = os.path.join(annotations_path, f"{id}.json")
        out_img_path = os.path.join(out_dir, "images", f"{id}.jpg")
        out_ann_path = os.path.join(out_dir, "labels", f"{id}.txt")  

        # 1.2. skip if image or annotation does not exists
        if (not os.path.exists(img_path)) or (not os.path.exists(ann_path)):
            rejected += 1
            continue

        # 2. copy the image
        shutil.copy(img_path, out_img_path)

        # 3. create YOLOv8 annotation
        with open(ann_path, 'r') as f:
            ann = json.load(f)
            
        # open in write mode so that re-running the cell does not duplicate annotations
        with open(out_ann_path, "w") as f:
            for obj in ann['objects']:
                # 3.1 get label index
                if DOUBLE_STEP:
                    obj['label'] = 'traffic-sign'
                    # in case of double detect only most general label
                    #cat = obj['label'].split('--')
                    #if len(cat) == 1 and cat[0] == 'other-sign':
                    #    continue
                    #obj['label'] = f"{cat[0]}--{cat[1]}"
                else:
                    # remove the --gN suffix; 'other-sign' has no suffix and stays unchanged
                    parts = obj['label'].split('--')
                    if len(parts) > 1:
                        obj['label'] = '--'.join(parts[:-1])

                if obj['label'] not in labels:
                    labels.append(obj['label'])
                label = labels.index(obj['label'])

                # 3.2 set sign distribution
                sign_distr[label] = sign_distr[label] + 1 if label in sign_distr else 1

                # 3.3 get bounding box
                bbox = obj['bbox']
                x_center = np.clip(((bbox['xmin'] + bbox['xmax']) / 2) / ann['width'], 0, 1)
                y_center = np.clip(((bbox['ymin'] + bbox['ymax'] ) / 2) / ann['height'], 0, 1)
                width = np.clip((bbox['xmax'] - bbox['xmin']) / ann['width'], 0, 1)
                height = np.clip((bbox['ymax'] - bbox['ymin']) / ann['height'], 0, 1)
                obj_ann = f"{label} {x_center} {y_center} {width} {height} \n"

                # 3.4 write annotation
                if width > 0 and height > 0:
                    f.write(obj_ann)

# 4. create dataset.yaml (write mode, so re-runs do not append duplicate entries)
with open(os.path.join(detect_path, "dataset.yaml"), "w") as f:
    f.write(f"path: {detect_path}\n")
    f.write(f"train: {os.path.join('train', 'images')}\n")
    f.write(f"val: {os.path.join('val', 'images')}\n")
    f.write("names:\n")
    for ix, label in enumerate(labels):
        f.write(f"  {ix}: {label}\n")

Processing train split...


100%|██████████| 36589/36589 [01:21<00:00, 447.39it/s]


Processing val split...


100%|██████████| 5320/5320 [00:12<00:00, 441.32it/s]


Processing test split...


100%|██████████| 10544/10544 [00:00<00:00, 16688.36it/s]
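
The bounding-box conversion in step 3.3 can be checked in isolation. The sketch below mirrors the notebook's arithmetic: it takes pixel corners (`xmin`, `ymin`, `xmax`, `ymax`) and produces the normalized `class x_center y_center width height` line that YOLO label files expect. The helper name and the six-decimal formatting are choices made for this example.

```python
from typing import Dict, Optional


def _clip01(value: float) -> float:
    """Clamp a normalized coordinate to the [0, 1] range."""
    return min(max(value, 0.0), 1.0)


def mtsd_bbox_to_yolo_line(
    class_id: int, bbox: Dict[str, float], img_w: int, img_h: int
) -> Optional[str]:
    """Convert pixel corners to a normalized YOLO label line, or None if degenerate."""
    x_center = _clip01((bbox["xmin"] + bbox["xmax"]) / 2 / img_w)
    y_center = _clip01((bbox["ymin"] + bbox["ymax"]) / 2 / img_h)
    width = _clip01((bbox["xmax"] - bbox["xmin"]) / img_w)
    height = _clip01((bbox["ymax"] - bbox["ymin"]) / img_h)
    if width <= 0 or height <= 0:
        # Boxes with xmax <= xmin or ymax <= ymin carry no area and are skipped.
        return None
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


box = {"xmin": 100, "ymin": 50, "xmax": 300, "ymax": 150}
print(mtsd_bbox_to_yolo_line(3, box, img_w=1000, img_h=500))
```

Returning `None` for degenerate boxes makes it easy to count a sign only when its line is actually written, so the statistics stay in step with the label files.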


## Classification dataset
When the two-stage pipeline is used (`DOUBLE_STEP = True`), build a classification dataset that contains only the cropped signs, stored in one directory per label.

In [4]:
if DOUBLE_STEP:
    labels = []

    # statistics
    rejected = 0
    total = 0
    sign_distr = {}


    for split in ['train', 'val', 'test']:
        print("Processing {} split...".format(split))
   
        with open(os.path.join(splits_path, split + ".txt")) as f:
            ids = f.readlines()

        for id in tqdm(ids, total=len(ids)):
            total += 1

            # 1. set and validate paths
            id = id.strip()
            img_path = os.path.join(images_path, f"{id}.jpg")
            ann_path = os.path.join(annotations_path, f"{id}.json")
    
            # 1.2. skip if image or annotation does not exists
            if (not os.path.exists(img_path)) or (not os.path.exists(ann_path)):
                rejected += 1
                continue

            # 2. load the image (cv2.imread returns None when the file cannot be decoded)
            img = cv2.imread(img_path)
            if img is None:
                rejected += 1
                continue

            # 3. extract traffic sign and create classification dataset                
            with open(ann_path, 'r') as f:
                ann = json.load(f)

            for obj in ann['objects']:
                # 3.1 skip if other-sign
                if obj['label'] == 'other-sign':
                    continue
                # remove --gN suffix
                obj['label'] = '--'.join(obj['label'].split('--')[:-1])

                # 3.2 get sign path and create directory if not exists
                sign_dir = os.path.join(cls_path, split, obj['label'])
                if not os.path.exists(sign_dir):
                    os.makedirs(sign_dir)

                # 3.3 increment sign counter
                if obj['label'] not in labels:
                    labels.append(obj['label'])
                label = labels.index(obj['label'])            
                sign_distr[label] = sign_distr[label] + 1 if label in sign_distr else 1

                # 3.4 get bounding box
                sign = img[
                    int(obj['bbox']['ymin']):int(obj['bbox']['ymax']), 
                    int(obj['bbox']['xmin']):int(obj['bbox']['xmax'])
                ]

                # 3.5 save sign (skip empty crops)
                if sign.shape[0] > 0 and sign.shape[1] > 0 and sign.shape[2] > 0:
                    sign_path = os.path.join(sign_dir, f"{id}_{obj['key']}.jpg")
                    cv2.imwrite(sign_path, sign)

Processing train split...


100%|██████████| 36589/36589 [29:34<00:00, 20.62it/s]


Processing val split...


100%|██████████| 5320/5320 [04:20<00:00, 20.39it/s]


Processing test split...


100%|██████████| 10544/10544 [00:00<00:00, 20023.83it/s]
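
The notebook slices the image directly with the annotation coordinates. NumPy treats a negative index as counting from the end of the axis, so a box that starts slightly outside the frame would yield a wrong or empty crop. Whether MTSD contains such boxes is an assumption; the hypothetical helper below simply clamps the corners to the image before slicing.

```python
from typing import Dict, Optional, Tuple


def clamp_crop_bounds(
    bbox: Dict[str, float], img_w: int, img_h: int
) -> Optional[Tuple[int, int, int, int]]:
    """Return integer (x0, y0, x1, y1) inside the image, or None for an empty crop."""
    x0 = max(0, min(int(bbox["xmin"]), img_w))
    y0 = max(0, min(int(bbox["ymin"]), img_h))
    x1 = max(0, min(int(bbox["xmax"]), img_w))
    y1 = max(0, min(int(bbox["ymax"]), img_h))
    if x1 <= x0 or y1 <= y0:
        return None
    return x0, y0, x1, y1


bounds = clamp_crop_bounds(
    {"xmin": -5.2, "ymin": 10.7, "xmax": 50.9, "ymax": 2000.0}, img_w=100, img_h=80
)
print(bounds)
```

With an image loaded by `cv2.imread`, the crop would then be `img[y0:y1, x0:x1]`, using `img.shape[1]` and `img.shape[0]` as width and height.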


## View dataset statistics

In [None]:
print("Images - total: {}".format(total))
print("Images - rejected: {}".format(rejected) + " ({:.2f}%)".format(rejected / total * 100))
# sign_distr.values() is a dict view, so use the built-in sum rather than np.sum
print("Signs: {}".format(sum(sign_distr.values())))
for ix, value in sign_distr.items():
    print(f"{labels[ix]}: {value}")

# optionally remove 'other-sign' from the statistics before plotting
sign_distr_cpy = sign_distr.copy()
#sign_distr_cpy[1] = 0 # other-sign
plt.bar(labels, list(sign_distr_cpy.values()))
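
With several hundred classes a full bar chart is hard to read, so a short textual summary of the most frequent classes can help. The sketch below uses `collections.Counter` keyed by label name rather than the notebook's index-keyed `sign_distr`; the function name and the toy labels are assumptions for illustration.

```python
from collections import Counter
from typing import List, Tuple


def summarize_distribution(
    counts: Counter, top: int = 3
) -> Tuple[int, List[Tuple[str, int, float]]]:
    """Return the total sign count and (label, count, share) for the most common labels."""
    total = sum(counts.values())
    rows = [(label, n, n / total) for label, n in counts.most_common(top)]
    return total, rows


# Toy data standing in for the labels collected while converting the dataset.
toy_counts = Counter(
    ["other-sign", "regulatory--stop", "other-sign", "regulatory--yield", "other-sign", "regulatory--stop"]
)
total, rows = summarize_distribution(toy_counts, top=2)
print(f"Signs: {total}")
for label, n, share in rows:
    print(f"{label}: {n} ({share:.1%})")
```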

# COCO

In [5]:
! pip install globox


In [17]:
from globox import AnnotationSet

# create dirs if they do not exist
img_dir = os.path.join(coco_path, "images")
ann_dir = os.path.join(coco_path, "annotations")
os.makedirs(ann_dir, exist_ok=True)
os.makedirs(img_dir, exist_ok=True)

for split in ['train', 'val']:
    yolo_img_dir = os.path.join(detect_path, split, "images")

    # copy images
    for img_name in os.listdir(yolo_img_dir):
        shutil.copy(
            os.path.join(yolo_img_dir, img_name),
            os.path.join(img_dir, img_name)
        )

    # convert annotations
    yolo = AnnotationSet.from_yolo_v5(
        folder=os.path.join(detect_path, split, "labels"),
        image_folder=yolo_img_dir
    )
    yolo.save_coco(os.path.join(ann_dir, split + ".json"), auto_ids=True)
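
After the conversion it is worth sanity-checking the generated JSON. The sketch below relies only on the standard COCO keys (`images`, `annotations`, `categories`) and reports basic counts plus any annotations that point at a missing image. The helper name and the in-memory toy document are assumptions for illustration; in practice the dictionary would come from `json.load` on one of the files written above.

```python
from collections import Counter
from typing import Any, Dict


def summarize_coco(coco: Dict[str, Any]) -> Dict[str, Any]:
    """Count images, annotations, and per-category boxes in a COCO detection dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    image_ids = {img["id"] for img in coco["images"]}
    per_category = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    orphans = sum(1 for ann in coco["annotations"] if ann["image_id"] not in image_ids)
    return {
        "images": len(image_ids),
        "annotations": len(coco["annotations"]),
        "per_category": dict(per_category),
        "orphan_annotations": orphans,
    }


# Tiny in-memory example in COCO layout (bbox is [x, y, width, height] in pixels).
toy_coco = {
    "images": [{"id": 0, "file_name": "a.jpg", "width": 100, "height": 80}],
    "categories": [{"id": 0, "name": "traffic-sign"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 0, "bbox": [10, 10, 20, 20]},
        {"id": 1, "image_id": 7, "category_id": 0, "bbox": [0, 0, 5, 5]},
    ],
}
summary = summarize_coco(toy_coco)
print(summary)
```

A non-zero `orphan_annotations` value would indicate that some label files were converted without their matching image.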