# Explicitly Split KITTI into Train and Val

The KITTI object dataset ships with a single `training` folder that contains both the train and val frames. This short example notebook shows how to use `KittiObjectDataset` to separate the train and val splits into their own folders. You do not need to run this example to work with KITTI, but it can help you understand how the data are organized.

In [1]:
import avstack
import avapi
import os
import shutil
from tqdm import tqdm

Cannot import rss library


## Split data

In [23]:
KITTI_obj_base = os.path.realpath("../data/KITTI/object")  # use your own path or add a symbolic link
KDM = avapi.kitti.KittiObjectDataset(KITTI_obj_base, 'training')  # the original "training" has both "train" and "val"

folders = ["train_only", "val_only"]  # these directories will be created
imset_names = ["train.txt", "val.txt"]  # these were downloaded with the ImageSet download
subfolders = ['velodyne', 'image_2', 'image_3', 'planes', 'label_2', 'calib']
exts       = ['.bin',     '.png',    '.png',    '.txt',   '.txt',    '.txt']

copy_data = False  # if true, actually copies. If false, makes a symbolic link for each file

for imset, fol in zip(imset_names, folders):
    os.makedirs(os.path.join(KITTI_obj_base, fol), exist_ok=True)
    for sub in subfolders:
        os.makedirs(os.path.join(KITTI_obj_base, fol, sub), exist_ok=True)  # create under the dataset root, not the working directory

    # Read imagesets from downloaded files to know which files
    with open(os.path.join(KITTI_obj_base, 'ImageSets', imset), 'r') as f:
        idxs_this_split = [int(s.strip()) for s in f.readlines()]

    # Write image sets to the new name
    fol_split = fol.split('/')
    write_str = '\n'.join(['%06d'%i for i in idxs_this_split])
    with open(os.path.join(KITTI_obj_base, *fol_split[:-1], 'ImageSets', fol_split[-1] + '.txt'), 'w') as f:
        f.write(write_str)
              
    # Copy or symlink the data
    for sub, ext in zip(subfolders, exts):
        subdir = os.path.join(KITTI_obj_base, fol, sub)
        print(f'{"Copying" if copy_data else "Making symbolic links with"} {subdir}')
        os.makedirs(subdir, exist_ok=True)
        for idx in tqdm(idxs_this_split):
            fname = '%06d'%idx + ext
            src = os.path.join(KDM.split_path, sub, fname)
            dest = os.path.join(subdir, fname)
            if copy_data:
                shutil.copy2(src, dest)
            else:
                os.symlink(src, dest)

Making symbolic links with /data/spencer/KITTI/object/train_only/velodyne
100%|█████████████████████████████████████████████████| 3712/3712 [00:00<00:00, 173915.42it/s]

Making symbolic links with /data/spencer/KITTI/object/train_only/image_2
100%|█████████████████████████████████████████████████| 3712/3712 [00:00<00:00, 174531.49it/s]

Making symbolic links with /data/spencer/KITTI/object/train_only/image_3
100%|█████████████████████████████████████████████████| 3712/3712 [00:00<00:00, 175473.72it/s]

Making symbolic links with /data/spencer/KITTI/object/train_only/planes
100%|█████████████████████████████████████████████████| 3712/3712 [00:00<00:00, 174850.99it/s]

Making symbolic links with /data/spencer/KITTI/object/train_only/label_2
100%|█████████████████████████████████████████████████| 3712/3712 [00:00<00:00, 175628.11it/s]

Making symbolic links with /data/spencer/KITTI/object/train_only/calib
100%|█████████████████████████████████████████████████| 3712/3712 [00:00<00:00, 174945.29it/s]

Making symbolic links with /data/spencer/KITTI/object/val_only/velodyne
100%|█████████████████████████████████████████████████| 3769/3769 [00:00<00:00, 175320.87it/s]

Making symbolic links with /data/spencer/KITTI/object/val_only/image_2
100%|█████████████████████████████████████████████████| 3769/3769 [00:00<00:00, 174751.08it/s]

Making symbolic links with /data/spencer/KITTI/object/val_only/image_3
100%|█████████████████████████████████████████████████| 3769/3769 [00:00<00:00, 174569.68it/s]

Making symbolic links with /data/spencer/KITTI/object/val_only/planes
100%|█████████████████████████████████████████████████| 3769/3769 [00:00<00:00, 172262.22it/s]

Making symbolic links with /data/spencer/KITTI/object/val_only/label_2
100%|█████████████████████████████████████████████████| 3769/3769 [00:00<00:00, 172282.87it/s]

Making symbolic links with /data/spencer/KITTI/object/val_only/calib
100%|█████████████████████████████████████████████████| 3769/3769 [00:00<00:00, 175953.12it/s]
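
One limitation of the loop above is that `os.symlink` raises `FileExistsError` when the destination already exists, so the cell cannot be re-run as written. Below is a minimal sketch of a re-runnable variant that skips files already in place. The helper name `populate_split` and the toy directory tree are assumptions made for illustration; they are not part of `avapi`.

```python
import os
import shutil
import tempfile


def populate_split(src_root, dest_root, idxs, subfolders, exts, copy_data=False):
    """Copy or symlink the frames in `idxs` from src_root/<sub> to dest_root/<sub>.

    Hypothetical helper (not part of avapi). Destinations that already
    exist are skipped, so calling it twice does not raise FileExistsError.
    Returns the number of files created by this call.
    """
    n_created = 0
    for sub, ext in zip(subfolders, exts):
        subdir = os.path.join(dest_root, sub)
        os.makedirs(subdir, exist_ok=True)
        for idx in idxs:
            fname = "%06d" % idx + ext
            src = os.path.join(src_root, sub, fname)
            dest = os.path.join(subdir, fname)
            if os.path.lexists(dest):  # also true for dangling symlinks
                continue
            if copy_data:
                shutil.copy2(src, dest)
            else:
                os.symlink(src, dest)
            n_created += 1
    return n_created


# Toy stand-in for the KITTI layout: four fake label files under "training"
with tempfile.TemporaryDirectory() as base:
    src_root = os.path.join(base, "training")
    os.makedirs(os.path.join(src_root, "label_2"))
    for i in range(4):
        with open(os.path.join(src_root, "label_2", "%06d.txt" % i), "w") as f:
            f.write("dummy\n")

    dest_root = os.path.join(base, "train_only")
    first_run = populate_split(src_root, dest_root, [0, 2], ["label_2"], [".txt"])
    second_run = populate_split(src_root, dest_root, [0, 2], ["label_2"], [".txt"])
    linked = sorted(os.listdir(os.path.join(dest_root, "label_2")))

print(first_run, second_run, linked)
```

With the real data, `src_root` would presumably be `KDM.split_path` and `dest_root` the `train_only` or `val_only` folder under `KITTI_obj_base`.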


## Test Splits

In [25]:
KDM_train = avapi.kitti.KittiObjectDataset(KITTI_obj_base, 'train_only')
print(f'Train has {len(KDM_train.frames)} frames')
KDM_val = avapi.kitti.KittiObjectDataset(KITTI_obj_base, 'val_only')
print(f'Val has {len(KDM_val.frames)} frames')

Train has 3712 frames
Val has 3769 frames
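
Frame counts alone do not show whether the two splits overlap. As a further sanity check, the image-set files can be compared directly. The sketch below uses two hypothetical helpers, `read_imageset` and `summarize_splits`, and runs on small made-up index files rather than the real KITTI image sets.

```python
import os
import tempfile


def read_imageset(path):
    """Return the frame indices listed one per line in an image-set file."""
    with open(path, "r") as f:
        return [int(line.strip()) for line in f if line.strip()]


def summarize_splits(train_ids, val_ids):
    """Count frames per split, frames shared by both, and distinct frames overall."""
    train, val = set(train_ids), set(val_ids)
    return {
        "train": len(train),
        "val": len(val),
        "overlap": len(train & val),
        "total": len(train | val),
    }


# Toy image-set files in the same zero-padded, one-index-per-line format
with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train_only.txt")
    val_path = os.path.join(tmp, "val_only.txt")
    with open(train_path, "w") as f:
        f.write("000000\n000003\n000007")
    with open(val_path, "w") as f:
        f.write("000001\n000002\n000004")
    summary = summarize_splits(read_imageset(train_path), read_imageset(val_path))

print(summary)
```

Pointed at the `train_only.txt` and `val_only.txt` files written to `ImageSets` earlier, an overlap of 0 would mean the splits are disjoint, in which case the total would be 3712 + 3769 = 7481 frames.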
