## Preprocessing the data

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [0]:
%cd /content/drive/My Drive
!mkdir -p project
%cd project

/content/drive/My Drive
/content/drive/My Drive/project


**Download the KITTI dataset**

In [0]:
!wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
!wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip

--2019-11-30 23:28:38--  https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.74.191
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.74.191|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12569945557 (12G) [application/zip]
Saving to: ‘data_object_image_2.zip’


2019-11-30 23:32:03 (58.7 MB/s) - ‘data_object_image_2.zip’ saved [12569945557/12569945557]

--2019-11-30 23:32:04--  https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.74.60
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.74.60|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5601213 (5.3M) [application/zip]
Saving to: ‘data_object_label_2.zip’


2019-11-30 23:32:05 (47.2 MB/s) - ‘data_object_label_2.zip’ sa
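
A 12 GB download onto a mounted Drive can be interrupted without an obvious error, so it may be worth confirming the archive sizes before unzipping. The following is a minimal sketch, assuming the byte counts reported in the `Length:` lines of the `wget` log above are the expected sizes; the helper name `find_incomplete` and the `EXPECTED_SIZES` dictionary are illustrative and not part of the original notebook.

```python
import os

# Byte counts copied from the "Length:" lines of the wget log (assumed correct).
EXPECTED_SIZES = {
    "data_object_image_2.zip": 12569945557,
    "data_object_label_2.zip": 5601213,
}

def find_incomplete(expected_sizes, directory="."):
    """Return the archive names that are missing or whose size differs."""
    incomplete = []
    for name, size in expected_sizes.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or os.path.getsize(path) != size:
            incomplete.append(name)
    return incomplete
```

Calling `find_incomplete(EXPECTED_SIZES)` from the `project` directory should return an empty list when both archives are fully downloaded. A size check does not detect corrupted content, only truncated or missing files.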

In [0]:
%%capture
!unzip data_object_image_2.zip
!unzip data_object_label_2.zip

In [0]:
!mkdir -p kitti_data
!mkdir -p kitti_data/checkpoints
%cd kitti_data

/content/drive/My Drive/project/kitti_data


**Create a list of full path names of images and labels**

In [0]:
!find '/content/drive/My Drive/project/training/image_2/' -name "*.png" | sort > images.txt
!find '/content/drive/My Drive/project/training/label_2/' -name "*.txt" | sort > labels.txt
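
The split in the next step pairs line *i* of `images.txt` with line *i* of `labels.txt`, so the two sorted lists need to line up exactly. A minimal sketch of a sanity check follows, assuming each image and its label share the same file stem (for example `000000.png` and `000000.txt`); the function name `check_pairs` is hypothetical.

```python
from pathlib import Path

def check_pairs(img_list_file, label_list_file):
    """Return the (image, label) path pairs whose file stems do not match."""
    with open(img_list_file) as f:
        imgs = f.read().splitlines()
    with open(label_list_file) as f:
        labels = f.read().splitlines()
    if len(imgs) != len(labels):
        raise ValueError(
            f"{len(imgs)} images but {len(labels)} labels; lists cannot be paired"
        )
    # Compare names without directory or extension.
    return [(i, l) for i, l in zip(imgs, labels) if Path(i).stem != Path(l).stem]
```

`check_pairs("images.txt", "labels.txt")` should return an empty list if every image has a label at the same position.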

**Split into training and validation sets**

In [0]:
import random

def train_val_split(img_file, ytrue_file, train_scale, val_scale):
    """Given two files listing image paths and ground-truth label paths,
    shuffle the pairs and split them into a training set and a validation set.
    """
    # The `with` blocks close the files automatically.
    with open(img_file) as imgs:
        img_names = imgs.read().splitlines()
    with open(ytrue_file) as ytrues:
        ytrue_names = ytrues.read().splitlines()

    # Both lists must line up one-to-one before they are paired.
    assert len(img_names) == len(ytrue_names)

    # Shuffle images and labels together so each pair stays aligned.
    shuffled = list(zip(img_names, ytrue_names))
    random.shuffle(shuffled)
    img_names, ytrue_names = zip(*shuffled)

    train_end_idx = int(len(img_names) * train_scale)
    val_end_idx = int(len(img_names) * (train_scale + val_scale))

    # Generate the train set
    with open("img_train.txt", 'w') as img_train:
        img_train.write("\n".join(img_names[0:train_end_idx]))
    with open("ytrue_train.txt", 'w') as ytrue_train:
        ytrue_train.write("\n".join(ytrue_names[0:train_end_idx]))
    # Generate the validation set
    with open("img_val.txt", 'w') as img_val:
        img_val.write("\n".join(img_names[train_end_idx:val_end_idx]))
    with open("ytrue_val.txt", 'w') as ytrue_val:
        ytrue_val.write("\n".join(ytrue_names[train_end_idx:val_end_idx]))
    print("Training set and validation set split")

In [0]:
train_val_split("images.txt", "labels.txt", 0.8, 0.2)

Training set and validation set split
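
Because `random.shuffle` is called without a seed, rerunning the cell produces a different split each time, which can make later results hard to compare. One possible variation is sketched below, assuming a fixed seed is acceptable for this project; the function `split_pairs` and its `seed` parameter are illustrative additions, not part of the original notebook.

```python
import random

def split_pairs(img_names, ytrue_names, train_scale, val_scale, seed=0):
    """Shuffle (image, label) pairs with a fixed seed and split them.

    Returns two lists of (image, label) tuples: train and validation.
    """
    if len(img_names) != len(ytrue_names):
        raise ValueError("image and label lists differ in length")
    pairs = list(zip(img_names, ytrue_names))
    # A private Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(pairs)
    train_end = int(len(pairs) * train_scale)
    val_end = int(len(pairs) * (train_scale + val_scale))
    return pairs[:train_end], pairs[train_end:val_end]
```

With the same inputs and the same `seed`, the function returns the same split on every run, and the two returned lists never share a pair.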


**Download KITTI raw data for the video demo**

In [0]:
%cd /content/drive/My Drive/project/
!wget https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0009/2011_09_26_drive_0009_extract.zip

/content/drive/My Drive/project
--2019-11-30 23:48:53--  https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0009/2011_09_26_drive_0009_extract.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.73.187
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.73.187|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2664742329 (2.5G) [application/zip]
Saving to: ‘2011_09_26_drive_0009_extract.zip’


2019-11-30 23:49:41 (53.0 MB/s) - ‘2011_09_26_drive_0009_extract.zip’ saved [2664742329/2664742329]



In [0]:
%%capture
!unzip 2011_09_26_drive_0009_extract.zip

In [0]:
%cd /content/drive/My Drive/project/kitti_data
!find '/content/drive/My Drive/project/2011_09_26/2011_09_26_drive_0009_extract/image_02/' -name "*.png" | sort > images_video.txt
!mkdir -p video_img
!mkdir -p patch_matching_img

/content/drive/My Drive/project/kitti_data
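
For a video demo the frames in `images_video.txt` should form an unbroken sequence. The following is a rough sketch of a gap check, assuming the frame files are named with zero-padded integer indices (such as `0000000000.png`), which is also what makes the lexicographic `sort` above match frame order; the helper name `missing_frames` is hypothetical.

```python
from pathlib import Path

def missing_frames(frame_paths):
    """Return frame indices absent between the first and last listed frame."""
    # Assumes every file stem is an integer frame index.
    indices = sorted(int(Path(p).stem) for p in frame_paths)
    if not indices:
        return []
    present = set(indices)
    return [i for i in range(indices[0], indices[-1] + 1) if i not in present]
```

Reading `images_video.txt` with `open(...).read().splitlines()` and passing the result to `missing_frames` gives an empty list when no frame is skipped.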
