## Generate data list info

We follow the format of `nuscenes` dataset: 

nuscenes_database/xxxxx.bin: point cloud data included in each 3D bounding box of the training dataset

nuscenes_infos_train.pkl: training dataset, a dict contains two keys: metainfo and data_list. metainfo contains the basic information for the dataset itself, such as categories, dataset and info_version, while data_list is a list of dict, each dict (hereinafter referred to as info) contains all the detailed information of single sample as follows:

`info[‘sample_idx’]`: The index of this sample in the whole dataset.

info[‘token’]: Sample data token.

info[‘timestamp’]: Timestamp of the sample data.

info[‘ego2global’]: The transformation matrix from the ego vehicle to global coordinates. (4x4 list)

info[‘lidar_points’]: A dict containing all the information related to the lidar points.

- info[‘lidar_points’][‘lidar_path’]: The filename of the lidar point cloud data.

- info[‘lidar_points’][‘num_pts_feats’]: The feature dimension of point.

- info[‘lidar_points’][‘lidar2ego’]: The transformation matrix from this lidar sensor to ego vehicle. (4x4 list)

info[‘lidar_sweeps’]: A list contains sweeps information (The intermediate lidar frames without annotations)

- info[‘lidar_sweeps’][i][‘lidar_points’][‘data_path’]: The lidar data path of i-th sweep.
    
- info[‘lidar_sweeps’][i][‘lidar_points’][‘lidar2ego’]: The transformation matrix from this lidar sensor to ego vehicle. (4x4 list)
    
- info[‘lidar_sweeps’][i][‘lidar_points’][‘ego2global’]: The transformation matrix from the ego vehicle to global coordinates. (4x4 list)
    
- info[‘lidar_sweeps’][i][‘lidar2sensor’]: The transformation matrix from the main lidar sensor to the current sensor (for collecting the sweep data). (4x4 list)
    
- info[‘lidar_sweeps’][i][‘timestamp’]: Timestamp of the sweep data.
    
- info[‘lidar_sweeps’][i][‘sample_data_token’]: The sweep sample data token.

info[‘images’]: A dict contains six keys corresponding to each camera: 'CAM_FRONT', 'CAM_FRONT_RIGHT', 'CAM_FRONT_LEFT', 'CAM_BACK', 'CAM_BACK_LEFT', 'CAM_BACK_RIGHT'. Each dict contains all data information related to corresponding camera.

- info[‘images’][‘CAM_XXX’][‘img_path’]: The filename of the image.
    
- info[‘images’][‘CAM_XXX’][‘cam2img’]: The transformation matrix recording the intrinsic parameters when projecting 3D points to each image plane. (3x3 list)
    
- info[‘images’][‘CAM_XXX’][‘sample_data_token’]: Sample data token of image.
    
- info[‘images’][‘CAM_XXX’][‘timestamp’]: Timestamp of the image.
    
- info[‘images’][‘CAM_XXX’][‘cam2ego’]: The transformation matrix from this camera sensor to ego vehicle. (4x4 list)
    
- info[‘images’][‘CAM_XXX’][‘lidar2cam’]: The transformation matrix from lidar sensor to this camera. (4x4 list)

info[‘instances’]: It is a list of dict. Each dict contains all annotation information of single instance. For the i-th instance:

- info[‘instances’][i][‘bbox_3d’]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, w, h, yaw) order.

- info[‘instances’][i][‘bbox_label_3d’]: A int indicate the label of instance and the -1 indicate ignore.

- info[‘instances’][i][‘velocity’]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), a list has shape (2.).

- info[‘instances’][i][‘num_lidar_pts’]: Number of lidar points included in each 3D bounding box.

- info[‘instances’][i][‘num_radar_pts’]: Number of radar points included in each 3D bounding box.

- info[‘instances’][i][‘bbox_3d_isvalid’]: Whether each bounding box is valid. In general, we only take the 3D boxes that include at least one lidar or radar point as valid boxes.

info[‘cam_instances’]: It is a dict containing keys 'CAM_FRONT', 'CAM_FRONT_RIGHT', 'CAM_FRONT_LEFT', 'CAM_BACK', 'CAM_BACK_LEFT', 'CAM_BACK_RIGHT'. For vision-based 3D object detection task, we split 3D annotations of the whole scenes according to the camera they belong to. For the i-th instance:

- info[‘cam_instances’][‘CAM_XXX’][i][‘bbox_label’]: Label of instance.

- info[‘cam_instances’][‘CAM_XXX’][i][‘bbox_label_3d’]: Label of instance.

- info[‘cam_instances’][‘CAM_XXX’][i][‘bbox’]: 2D bounding box annotation (exterior rectangle of the projected 3D box), a list arrange as [x1, y1, x2, y2].

- info[‘cam_instances’][‘CAM_XXX’][i][‘center_2d’]: Projected center location on the image, a list has shape (2,), .

- info[‘cam_instances’][‘CAM_XXX’][i][‘depth’]: The depth of projected center.

- info[‘cam_instances’][‘CAM_XXX’][i][‘velocity’]: Velocities of 3D bounding boxes (no vertical measurements due to inaccuracy), a list has shape (2,).

- info[‘cam_instances’][‘CAM_XXX’][i][‘attr_label’]: The attr label of instance. We maintain a default attribute collection and mapping for attribute classification.

- info[‘cam_instances’][‘CAM_XXX’][i][‘bbox_3d’]: List of 7 numbers representing the 3D bounding box of the instance, in (x, y, z, l, h, w, yaw) order.

info[‘pts_semantic_mask_path’]：The filename of the lidar point cloud semantic segmentation annotation.


In [17]:
import os
from pathlib import Path
from os import path as osp
import pickle
import numpy as np
import mmcv
from PIL import Image
import json

def get_id_cross_reference(root_dir, cross_file):
    """
    Get the cross reference of patient_id between LIDC data and synthetic data
    """
    cross_dict = dict()

    cross_file = f'{root_dir}/{cross_file}'
    with open(cross_file, 'r') as file: 
        for line in file: 
            lidc_ref, data_ref = line.strip().split(",")
            cross_dict[data_ref.strip()] = lidc_ref
    
    return cross_dict



def get_cam_data(root_dir, images_dir, patient_id):
    """
    Get image path for each camera
    """

    images_dir = f'{root_dir}/{images_dir}/Patient{patient_id:04}'

    info = dict()

    for cam in range(10):
        info_cam = dict()

        cam_name = (f'CAM_{cam:02}').upper()
        info_cam['data_path'] = f'{images_dir}/Image_{cam:02}.png'
        info_cam['type'] = cam_name
        info_cam['sample_data_token'] = f'{patient_id:03}cam{cam:02}'
        info_cam['timestamp'] = 0

        # Get cam intrinsic value
        cam_intrinsic_file = f'{root_dir}/Cams/Patient{patient_id:04}/{patient_id:04}_{cam:02}.txt'
        with open(cam_intrinsic_file, 'r') as file:
            for line in file:
                cam_intrinsic = [float(value) for value in line.strip().split(',')[:9]]
        info_cam['cam_intrinsic'] = np.array(cam_intrinsic).reshape((3, 3))

        info[cam_name] = info_cam
    
    return info


def get_3d_annotation(root_dir, anno3d_dir, patient_id):
    """
    Get 3d annotation from raw .txt file
    Return gt_boxed and gt_label
    For LIDC we only have 1 label
    """

    anno_3d_file = f'{root_dir}/{anno3d_dir}/Patient_{patient_id:04}_bbox3d.txt'

    gt_boxes = []
    with open(anno_3d_file, "r") as file:
        for line in file:

            # Convert the comma-separated values into floats
            # the coordinates in the txt file were stored as Y, X, Z (as well as the extension)
            # we need to convert it to X, Y, Z, dX, dY, dZ
            row = [float(value) for value in line.strip().split(",")]
            row = np.array(row)[[1, 0, 2, 4, 3, 5]]
            row = np.append(row, 0)  # add yaw value

            gt_boxes.append(row)

    
    gt_boxes = np.array(gt_boxes)
    gt_names = np.array(['nodule'] * len(gt_boxes), dtype='<U32')

    return gt_boxes, gt_names


def get_2d_annotation(root_dir, anno2d_dir, patient_id):
    """
    Get 2d annotation from raw .txt file
    """
    anno_2d_dir = f'{root_dir}/{anno2d_dir}/Patient{patient_id:04}'

    cam_instances = dict()

    for cam in range(10):
        bbox2d_file_path = f'{anno_2d_dir}/Cam_{cam:02}_bbox2d.txt'
        cam_name = (f'CAM_{cam:02}:').upper()

        with open(bbox2d_file_path, "r") as file:
            cam_instance = []
            for line in file:
                instance_data = {}
                row = [value for value in line.strip().split(",")]
                instance_data['bbox'] = row[:4]
                instance_data['bbox_label'] = 1
                instance_data['bbox_label_3d'] = 1

            cam_instance.append(instance_data)
        cam_instances[cam_name] = cam_instance
    return cam_instances


In [20]:
# MAIN

# Arguments
root_path = 'data/lidc'
# Root directory to start walking
# since we're in sandbox folder. 
root_dir = f'../{root_path}'

info_prefix = 'lidc'
version = 'v1.0'
dataset_name = ' lidc'
out_dir = root_dir
images_dir = 'Images'
anno_3d_dir = 'Labels3d'
anno_2d_dir = 'Labels2d'


db_info_save_path = osp.join(out_dir, f'{info_prefix}_dbinfos.pkl')
info_train_path = osp.join(out_dir, f'{info_prefix}_infos_train.pkl')
info_train_2d_anno_path = osp.join(out_dir, f'{info_prefix}_infos_train_2d_anno.coco.json')
info_val_path = osp.join(out_dir, f'{info_prefix}_infos_val.pkl')
error_log_path = osp.join(out_dir, f'{info_prefix}_error_logs.txt')



In [None]:

# Get the cross reference
cross_file = 'patients_processed.txt'
cross_ref = get_id_cross_reference(root_dir, cross_file)
nb_patient = len(cross_ref)


# Initalize list
lidc_infos_train = dict()  # Store all data info
logs = dict()  # Store error logs (if any)


# Build metainfo of the dataset
metadata = {
    'categories': {'normal': 0,'nodule': 1},
    'dataset': 'lidc',
    'version': 'v1.0',
    'info_version': '1.0'
}
lidc_infos_train['metadata'] = metadata


# Build the ground truth 3D database
infos = []  # Store all datalist (inside db_infos)

# !TODO: shuffle and split the data. 


for i in range(nb_patient):

    info_data = dict()

    info_data['token'] = i

    try: 
        # Build patient_id meta data 
        info_data['sample_id'] = i
        info_data['lidc_id_ref'] = cross_ref[str(i)]

        # Build cams data
        info_data['cams'] = get_cam_data(root_dir, images_dir=images_dir, patient_id=i)

        # Build 3D annotation data
        info_data['gt_boxes'], info_data['gt_names'] = get_3d_annotation(root_dir, anno_3d_dir, i)

        # Set valid flag = True b.c we dont have any lidars_ptd or radars_pts
        info_data['valid_flag'] = np.array([True] * len(info_data['gt_boxes']))

        # Build 2D annotation paths
        # info_data['cam_instances'] = get_2d_annotation(root_dir, anno_2d_dir, i)

        # Write to datalist
        infos.append(info_data)

    except Exception as e: 
        logs[f'Patient_{i:04}'] = str(e)
        continue

lidc_infos_train['infos'] = infos

# Write to disk
with open(info_train_path, 'wb') as f:
    pickle.dump(lidc_infos_train, f)

# Write error log file
with open(error_log_path, 'w') as f:
    for key, value in logs.items():
        f.write(f'{key}, {value}\n')


# Generate 2D Annotation file

We will store the anno_2d file in COCO format to match with current MV2D data structure. 

In [35]:
# 2D annotation path
anno_2d_dir = f'{root_dir}/Labels2d'
anno_2d_path = Path(anno_2d_dir)


anno_2d_files = [file for file in anno_2d_path.rglob('*.txt') 
             if file.is_file() and len(file.relative_to(anno_2d_path).parts) == 2]
anno_2d_files = sorted(anno_2d_files)


lidc_infos_train_2d_anno = dict()


# Create label data
lidc_infos_train_2d_anno['categories'] = [{'id': 0, 'name': 'normal'}, {'id': 1, 'name': 'nodule'}]

# Build Annotation data

annotations = []
images = []
anno_id = 0

for path in anno_2d_files:

    image = dict()

    # Get info about image
    # multiple annotations for 1 image are recorded seperately
    patient_id = path.parts[-2][-4:]
    cam_id = path.parts[-1].split('_')[1]
    # TODO: check if image file exist
    file_name = f'{root_dir}/{images_dir}/Patient{patient_id}/Image_{cam_id}.png'
    image['file_name'] = file_name
    image['id'] = f'{patient_id}cam{cam_id}'

    # Get info about cam intrinsic
    cam_intrinsics = get_cam_data(root_dir, images_dir, int(patient_id))
    image['cam_intrinsic'] = cam_intrinsics[f'CAM_{cam_id}']
    image['width'] = 1024
    image['height'] = 1024

    images.append(image)
    

    # Get info about bbox 3D
    # TODO: check if the nb_bbox_2d = nb_bbox_3d
    bbox_3ds, _ = get_3d_annotation(root_dir, anno_3d_dir, int(patient_id))

    # Get info about bbox
    with open(path, 'r') as file:
        for idx, line in enumerate(file):
            anno2d = dict()

            row = [value for value in line.strip().split(",")]
            x, y, dx, dy = [float(x) for x in row[:4]]
            category_name = row[4]

            anno2d['file_name'] = file_name
            anno2d['image_id'] = f'{patient_id}cam{cam_id}'
            anno2d['area'] = dx * dy
            anno2d['category_name'] = category_name
            anno2d['category_id'] = 1
            anno2d['bbox'] = [x, y, dx, dy]
            anno2d['iscrowd'] = 0
            anno2d['bbox_cam3d'] = bbox_3ds[idx]
            # TODO: check if we need center2d info - list of 3
            anno2d['center2d'] = [0, 0, 0]
            anno2d['id'] = anno_id

            # Additional information
            # subtlety, internalStructure, calcification, sphericity, margin, lobulation, spiculation, texture, malignancy
            add_info = ['subtlety', 'internalStructure', 'calcification', 'sphericity', 'margin', 'lobulation', 'spiculation', 'texture', 'malignancy']
            for i, info_type in enumerate(add_info, start=5):
                anno2d[info_type] = row[i]

            # Write info to list
            annotations.append(anno2d)
            anno_id += 1

lidc_infos_train_2d_anno['annotations'] = annotations
lidc_infos_train_2d_anno['images'] = images




In [36]:
# lidc_infos_train_2d_anno

# Write to disk
class NumpyEncoder(json.JSONEncoder):
    '''Handle n.array when writing to JSON'''
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()  # Convert ndarray to list
        return super(NumpyEncoder, self).default(obj)
    
with open(info_train_2d_anno_path, 'w') as f:
    # pickle.dump(lidc_infos_train_2d_anno, f)
    json.dump(lidc_infos_train_2d_anno, f, cls=NumpyEncoder, indent=4)

# Test data infos all

In [4]:
with open(info_train_path, 'rb') as f:
    lidc_infos_train = pickle.load(f)

len(lidc_infos_train['infos'])

791

In [5]:
with open(info_train_2d_anno_path, 'rb') as f:
    lidc_infos_train_2d_anno = pickle.load(f)

len(lidc_infos_train_2d_anno['annotations'])

61048

## Pretty-print annotation file

In [None]:
# Hierarchy for lidc_infos_train

from rich.console import Console
from rich.tree import Tree

def dict_to_tree(data, tree=None, parent_key=None):
    if tree is None:
        tree = Tree("[bold blue]Root[/bold blue]")

    for key, value in data.items():
        # Special case for 'infos' key
        if key == "infos" and isinstance(value, list) and value:
            branch = tree.add(f"[bold yellow]{key}[/bold yellow] [full list of patient, only show infos for Patient0000]")
            if isinstance(value[0], dict):
                dict_to_tree(value[0], branch, key)
            else:
                branch.add(str(value[0]))
        elif isinstance(value, dict):
            branch = tree.add(f"[bold yellow]{key}[/bold yellow]")
            dict_to_tree(value, branch, key)
        else:
            tree.add(f"{key}: {value}")

    return tree

test = lidc_infos_train['infos'][0]
console = Console(record=True)
tree = dict_to_tree(lidc_infos_train)
console.print(tree)

# Save to a file
with open("lidc_infos_tree_output.txt", "w") as f:
    f.write(console.export_text())

In [37]:
# Hierarchy for lidc_infos_train_2d_anno

def dict_to_tree_2d(data, tree=None, parent_key=None):
    if tree is None:
        tree = Tree("[bold blue]Root[/bold blue]")

    for key, value in data.items():
        # Special case for 'infos' key
        if key == "annotations" and isinstance(value, list) and value:
            branch = tree.add(f"[bold yellow]{key}[/bold yellow] (full list of annotations, only show the 1st item)")
            branch = branch.add(f"annotation[0]")
            if isinstance(value[0], dict):
                dict_to_tree(value[0], branch, key)
            else:
                branch.add(str(value[0]))
        elif key == "images" and isinstance(value, list) and value:
            branch = tree.add(f"[bold yellow]{key}[/bold yellow] (full list of images, only show the 1st item)")
            branch = branch.add(f"images[0]")
            if isinstance(value[0], dict):
                dict_to_tree(value[0], branch, key)
            else:
                branch.add(str(value[0]))
        elif isinstance(value, dict):
            branch = tree.add(f"[bold yellow]{key}[/bold yellow]")
            dict_to_tree(value, branch, key)
        else:
            tree.add(f"{key}: {value}")

    return tree

# test = lidc_infos_train['infos'][0]
console = Console(record=True)
tree = dict_to_tree_2d(lidc_infos_train_2d_anno)
console.print(tree)

# Save to a file
with open("lidc_infos_2d_anno_tree_output.txt", "w") as f:
    f.write(console.export_text())

In [29]:
lidc_infos_train_2d_anno

{'categories': [{'id': 1, 'name': 'nodule'}, {'id': 0, 'name': 'none'}],
 'annotations': [{'file_name': '../data/LIDC_Projection_Dataset/Images/Patient0000/Image_00.png',
   'image_id': '0000cam00',
   'area': 3780.0,
   'category_name': 'nodule',
   'category_id': 1,
   'bbox': [369.0, 567.0, 54.0, 70.0],
   'iscrowd': 0,
   'bbox_cam3d': array([34.453125, 59.0625  , 28.828125, 48.75    , 28.125   , 17.5     ,
           0.      ]),
   'center2d': [0, 0, 0],
   'id': 0,
   'subtlety': '5',
   'internalStructure': '1',
   'calcification': '6',
   'sphericity': '3',
   'margin': '3',
   'lobulation': '3',
   'spiculation': '4',
   'texture': '5',
   'malignancy': '5'},
  {'file_name': '../data/LIDC_Projection_Dataset/Images/Patient0000/Image_00.png',
   'image_id': '0000cam00',
   'area': 3312.0,
   'category_name': 'nodule',
   'category_id': 1,
   'bbox': [371.0, 569.0, 46.0, 72.0],
   'iscrowd': 0,
   'bbox_cam3d': array([30.234375, 61.171875, 29.53125 , 51.25    , 29.53125 , 15.    

In [43]:
anno_2d_all = [d for d in lidc_infos_train_2d_anno['annotations'] if 'Patient0000/Image_00.png' in d['file_name']]
len(anno_2d_all)
p00_bboxes = [d['bbox'] for d in anno_2d_all]
p00_bboxes

[[369.0, 567.0, 54.0, 70.0],
 [371.0, 569.0, 46.0, 72.0],
 [371.0, 569.0, 52.0, 66.0],
 [363.0, 569.0, 60.0, 66.0]]

In [None]:
p00_bboxes_convert = np.array([[d[0], d[1], d[0] + d[2], d[1] - d[3]] for d in p00_bboxes])
p00_bboxes_convert

array([[369., 567., 423., 497.],
       [371., 569., 417., 497.],
       [371., 569., 423., 503.],
       [363., 569., 423., 503.]])

In [30]:
test_data = lidc_infos_train['infos'][0]
test_img = test_data['cams']['CAM_01']['data_path']
test_data

{'token': 0,
 'sample_id': 0,
 'lidc_id_ref': 'LIDC-IDRI-0001',
 'cams': {'CAM_00': {'data_path': '../data/LIDC_Projection_Dataset/Images/Patient0000/Image_00.png',
   'type': 'CAM_00',
   'sample_data_token': '000cam00',
   'timestamp': 0,
   'cam_intrinsic': array([[ 600.      ,    0.      , -400.390625],
          [   0.      ,  600.      , -400.390625],
          [   0.      ,    0.      ,    1.      ]])},
  'CAM_01': {'data_path': '../data/LIDC_Projection_Dataset/Images/Patient0000/Image_01.png',
   'type': 'CAM_01',
   'sample_data_token': '000cam01',
   'timestamp': 0,
   'cam_intrinsic': array([[ 600.      ,    0.      , -400.390625],
          [   0.      ,  600.      , -400.390625],
          [   0.      ,    0.      ,    1.      ]])},
  'CAM_02': {'data_path': '../data/LIDC_Projection_Dataset/Images/Patient0000/Image_02.png',
   'type': 'CAM_02',
   'sample_data_token': '000cam02',
   'timestamp': 0,
   'cam_intrinsic': array([[ 600.      ,    0.      , -400.390625],
       

In [59]:
# mmcv.imshow(test_img)
test_img = test_anno2d['file_name']
x, y, dx, dy = test_anno2d['bbox']
test_bbox = np.array([x, y, x + dx, y - dy]).reshape((1, 4))
img = Image.open(test_img)
width, height = img.size
# width, height
# img

# mmcv.imshow(test_img)
mmcv.imshow_bboxes(test_img, p00_bboxes_convert, colors='red')

array([[[127, 127, 127],
        [127, 127, 127],
        [127, 127, 127],
        ...,
        [127, 127, 127],
        [127, 127, 127],
        [127, 127, 127]],

       [[127, 127, 127],
        [127, 127, 127],
        [127, 127, 127],
        ...,
        [127, 127, 127],
        [127, 127, 127],
        [127, 127, 127]],

       [[126, 126, 126],
        [126, 126, 126],
        [126, 126, 126],
        ...,
        [127, 127, 127],
        [126, 126, 126],
        [126, 126, 126]],

       ...,

       [[128, 128, 128],
        [128, 128, 128],
        [128, 128, 128],
        ...,
        [129, 129, 129],
        [128, 128, 128],
        [128, 128, 128]],

       [[129, 129, 129],
        [129, 129, 129],
        [129, 129, 129],
        ...,
        [129, 129, 129],
        [129, 129, 129],
        [129, 129, 129]],

       [[129, 129, 129],
        [129, 129, 129],
        [129, 129, 129],
        ...,
        [129, 129, 129],
        [129, 129, 129],
        [129, 129, 129]]