# nuImages devkit tutorial

Welcome to the nuImages tutorial.
This demo assumes the database itself is available at `/data/sets/nuimages`, and loads a mini version of the dataset.

## A Gentle Introduction to nuImages

In this part of the tutorial, let us go through a top-down introduction of our database. Our dataset is structured as a relational database with tables, tokens and foreign keys. The tables are the following:

1. `log` - Log from which the sample was extracted.
2. `sample` - An annotated camera image with an associated timestamp and past and future images and pointclouds.
3. `sample_data` - An image or pointcloud associated with a sample.
4. `ego_pose` - The vehicle ego pose and timestamp associated with a sample_data.
5. `sensor` - General information about a sensor, e.g. `CAM_BACK_LEFT`.
6. `calibrated_sensor` - Calibration information of a sensor in a log.
7. `category` - Taxonomy of object and surface categories (e.g. `vehicle.car`, `flat.driveable_surface`). 
8. `attribute` - Property of an object that can change while the category remains the same.
9. `object_ann` - Bounding box and mask annotation of an object (e.g. car, adult).
10. `surface_ann` - Mask annotation of a surface (e.g. `flat.driveable surface` and `vehicle.ego`).

The database schema is visualized below. For more information see the [schema page](https://github.com/nutonomy/nuscenes-devkit/blob/master/docs/schema_nuimages.md).
![](https://www.nuscenes.org/public/images/nuimages-schema.svg)

## Google Colab (optional)

<br>
<a href="https://colab.research.google.com/github/nutonomy/nuscenes-devkit/blob/master/python-sdk/tutorials/nuimages_tutorial.ipynb">
    <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" align="left">
</a>
<br>
    
If you are running this notebook in Google Colab, you can uncomment the cell below and run it; everything will be set up nicely for you. Otherwise, manually set up everything.

In [3]:
!mkdir -p /data/sets/nuimages  # Make the directory to store the nuImages dataset in.
!wget https://www.nuscenes.org/data/nuimages-v1.0-mini.tgz  # Download the nuImages mini split.

!tar -xf nuimages-v1.0-mini.tgz -C /data/sets/nuimages  # Uncompress the nuImages mini split.
!pip install nuscenes-devkit &> /dev/null  # Install nuImages.

mkdir: cannot create directory ‘/data’: Permission denied
--2024-04-23 18:35:16--  https://www.nuscenes.org/data/nuimages-v1.0-mini.tgz
Resolving www.nuscenes.org (www.nuscenes.org)... 2600:9000:21c7:bc00:1b:5ef:6040:93a1, 2600:9000:21c7:2a00:1b:5ef:6040:93a1, 2600:9000:21c7:3200:1b:5ef:6040:93a1, ...
Connecting to www.nuscenes.org (www.nuscenes.org)|2600:9000:21c7:bc00:1b:5ef:6040:93a1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 117929607 (112M) [application/x-tar]
Saving to: ‘nuimages-v1.0-mini.tgz’


2024-04-23 18:35:17 (84,9 MB/s) - ‘nuimages-v1.0-mini.tgz’ saved [117929607/117929607]

tar: /data/sets/nuimages: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now


## Initialization
To initialize the dataset class, we run the code below. We can change the dataroot parameter if the dataset is installed in a different folder. We can also omit it to use the default setup. These will be useful further below.

In [29]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
from nuimages import NuImages

#nuim = NuImages(dataroot='/home/robolab/Downloads/nuimages-v1.0-all-metadata', version='v1.0-train', verbose=True, lazy=True)
#nuim = NuImages(dataroot='/home/robolab/Downloads/nuimages-v1.0-all-metadata', version='v1.0-test', verbose=True, lazy=True)
nuim = NuImages(dataroot='/home/robolab/Downloads/nuimages-v1.0-all-metadata', version='v1.0-val', verbose=True, lazy=True)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
Loading nuImages tables for version v1.0-val...
Done loading in 0.000 seconds (lazy=True).


## Tables

As described above, the NuImages class holds several tables. Each table is a list of records, and each record is a dictionary. For example the first record of the category table is stored at:

In [30]:
nuim.category[0]

Loaded 25 category(s) in 0.001s,


{'token': '63a94dfa99bb47529567cd90d3b58384',
 'name': 'animal',
 'description': 'All animals, e.g. cats, rats, dogs, deer, birds.'}

To see the list of all tables, simply refer to the `table_names` variable:

In [31]:
nuim.table_names

['attribute',
 'calibrated_sensor',
 'category',
 'ego_pose',
 'log',
 'object_ann',
 'sample',
 'sample_data',
 'sensor',
 'surface_ann']

### Indexing

Since all tables are lists of dictionaries, we can use standard Python operations on them. A very common operation is to retrieve a particular record by its token. Since this operation takes linear time, we precompute an index that helps to access a record in constant time.

Let us select the first image in this dataset version and split:

In [32]:
sample_idx = 0
sample = nuim.sample[sample_idx]
sample

Loaded 16445 sample(s) in 0.017s,


{'token': '0001c87ad88847fb860b100a9ae77fc7',
 'timestamp': 1533193855887005,
 'log_token': 'e61048eefd0c4277b4bc077ef019e439',
 'key_camera_token': '4ce15c90a085476db9023d85206e4042'}

## Rendering

To render an image we use the `render_image()` function. We can see the boxes and masks for each object category, as well as the surface masks for ego vehicle and driveable surface. We use the following colors:
- vehicles: orange
- bicycles and motorcycles: red
- pedestrians: blue
- cones and barriers: gray
- driveable surface: teal / green

At the top left corner of each box, we see the name of the object category (if `with_category=True`). We can also set `with_attributes=True` to print the attributes of each object (note that we can only set `with_attributes=True` to print the attributes of each object when `with_category=True`). In addition, we can specify if we want to see surfaces and objects, or only surfaces, or only objects, or neither by setting `with_annotations` to `all`, `surfaces`, `objects` and `none` respectively.

Let us make the image bigger for better visibility by setting `render_scale=2`. We can also change the line width of the boxes using `box_line_width`. By setting it to -1, the line width adapts to the `render_scale`. Finally, we can render the image to disk using `out_path`.

In [36]:
def bbox2yolo(sizee, boxx):
    dw = 1. / sizee[0]
    dh = 1. / sizee[1]
    x_center = (boxx[0] + boxx[1]) / 2.0
    y_center = (boxx[2] + boxx[3]) / 2.0
    width = boxx[1] - boxx[0]
    height = boxx[3] - boxx[2]
    
    x_center = x_center * dw
    width = width * dw
    y_center = y_center * dh
    height = height * dh


    
    
    
    return (x_center, y_center, width, height)

#now get the 'category_name' and assign a number instead 
classes = ['barrier', 'bicycle', 'bus', 'car', 'construction_vehicle', 'motorcycle', 'pedestrian', 'traffic_cone', 'trailer', 'truck']
#make a dictionary where each class got a number
class_dict = {classes[i]: i for i in range(len(classes))}
print(class_dict)

{'barrier': 0, 'bicycle': 1, 'bus': 2, 'car': 3, 'construction_vehicle': 4, 'motorcycle': 5, 'pedestrian': 6, 'traffic_cone': 7, 'trailer': 8, 'truck': 9}


In [37]:
#let's make a directory for the labels and images
#labels_dir = '/home/robolab/Downloads/yolov9/train/labels' #for the train
#images_dir = '/home/robolab/Downloads/yolov9/train/images'

#labels_dir = '/home/robolab/Downloads/yolov9/test/labels' #for the test
#images_dir = '/home/robolab/Downloads/yolov9/test/images'  

labels_dir = '/home/robolab/Downloads/yolov9/val/labels' #for the val
images_dir = '/home/robolab/Downloads/yolov9/val/images'  

import numpy as np
import torch
import torchvision.transforms as T
import concurrent.futures
#now make it if it doesn't exist
import os
if not os.path.exists(labels_dir):
    os.makedirs(labels_dir)
if not os.path.exists(images_dir):
    os.makedirs(images_dir)


def process_sample(i):
    sample = nuim.sample[i]
    key_camera_token = sample['key_camera_token']
    
    # Render image using PyTorch
    im, obj = nuim.render_image_2(
        key_camera_token,
        annotation_type='objects',
        with_category=True,
        with_attributes=True,
        box_line_width=-1,
        render_scale=5
    )
    
    w, h = im.size
    
    # Convert image to PyTorch tensor
    transform = T.ToTensor()
    im_tensor = transform(im).unsqueeze(0).cuda()
    
    labels_data = []
    
    for obj_info in obj:
        xmin, ymin, xmax, ymax = obj_info['bbox']
        
        # Convert bbox to YOLO format
        bb = bbox2yolo((w, h), (xmin, xmax, ymin, ymax))
        
        category_name = obj_info['category_name']
        
        for class_name in classes:
            if class_name in category_name:
                labels_data.append(f"{class_dict[class_name]} {' '.join(map(str, bb))}\n")
                break
                
    with open(os.path.join(labels_dir, f"{i}.txt"), 'w') as f:
        f.writelines(labels_data)
    
    im.save(os.path.join(images_dir, f"{i}.png"))

# Set device to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

amount_samples = 8721810 # change to amount of samples in data, or just way to much and run until error

# Parallel processing
with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(process_sample, range(amount_samples))  

Loaded 213185 sample_data(s) in 0.732s,
Loaded 213185 sample_data(s) in 2.947s,
Loaded 213185 sample_data(s) in 3.477s,
Loaded 213185 sample_data(s) in 4.786s,
Loaded 213185 sample_data(s) in 4.882s,
Loaded 213185 sample_data(s) in 4.811s,
Loaded 213185 sample_data(s) in 8.840s,
Loaded 213185 sample_data(s) in 9.425s,
Loaded 213185 sample_data(s) in 9.585s,
Loaded 213185 sample_data(s) in 11.262s,
Loaded 213185 sample_data(s) in 12.321s,
Loaded 213185 sample_data(s) in 12.320s,
Loaded 213185 sample_data(s) in 12.449s,
Loaded 213185 sample_data(s) in 12.878s,
Loaded 213185 sample_data(s) in 13.045s,
Loaded 213185 sample_data(s) in 14.554s,
Loaded 213185 sample_data(s) in 15.934s,
Loaded 213185 sample_data(s) in 16.009s,
Loaded 213185 sample_data(s) in 16.153s,
Loaded 213185 sample_data(s) in 16.775s,
Loaded 213185 sample_data(s) in 16.850s,
Loaded 213185 sample_data(s) in 16.789s,
Loaded 213185 sample_data(s) in 16.947s,
Loaded 213185 sample_data(s) in 17.725s,
Loaded 213185 sample_data

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

#now get the 'category_name' and assign a number instead 
classes = ['barrier', 'bicycle', 'bus', 'car', 'construction_vehicle', 'motorcycle', 'pedestrian', 'traffic_cone', 'trailer', 'truck']
#make a dictionary where each class got a number
class_dict = {classes[i]: i for i in range(len(classes))}




def yolo2bbox(img_size, yolo_box):
    """
    Convert YOLO format bounding box to bounding box coordinates.
    """
    x, y, w, h = yolo_box
    dw = 1. / img_size[0]
    dh = 1. / img_size[1]
    xmin = (x - w / 2.0) / dw
    xmax = (x + w / 2.0) / dw
    ymin = (y - h / 2.0) / dh
    ymax = (y + h / 2.0) / dh
    return xmin, ymin, xmax, ymax

# Randomly select and visualize a few images with bounding boxes
num_images_to_display = 5
selected_indices = np.random.choice(range(50), num_images_to_display, replace=False)  # change to 93000 on real dataset

for i in selected_indices:
    image_path = os.path.join(images_dir, f"{i}.png")
    label_path = os.path.join(labels_dir, f"{i}.txt")
    
    # Load image
    image = Image.open(image_path)

    #now show the image without the bounding boxes first
    plt.imshow(image)
    plt.show()
    

    w, h = image.size
    
    # Load labels
    with open(label_path, 'r') as f:
        lines = f.readlines()
    
    fig, ax = plt.subplots(1)
    
    # Display the image
    ax.imshow(image)
    
    # Plot bounding boxes
    for line in lines:
        parts = line.strip().split()
        class_id = int(parts[0])
        yolo_box = tuple(map(float, parts[1:]))
        
        class_name = [key for key, value in class_dict.items() if value == class_id][0]
        xmin, ymin, xmax, ymax = yolo2bbox((w, h), yolo_box)
        
        # Create a Rectangle patch
        rect = patches.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin, linewidth=1, edgecolor='r', facecolor='none')
        
        # Add the patch to the Axes
        ax.add_patch(rect)
        
        # Annotate the bounding box with class name
        plt.text(xmin, ymin - 5, class_name, color='r', fontsize=8, ha='center', va='bottom')
    
    plt.title(f"Image {i}")
    plt.show()