## Notebook setup

1. Set the accelerator type to `GPU`.
1. Mount the `kaggle-l5kit-110` data volume.
1. Turn off internet access, which is required if you want to submit the notebook (all dependencies are in the data volume anyway).

## Install `l5kit`

In [None]:
!pip install --no-index -q --use-feature=2020-resolver -f ../input/kaggle-l5kit-110 l5kit 

Note that a plain `pip install torch` with internet access (i.e. an install from the PyPI index) pulls the CUDA 10.2 build by default, which isn't compatible with the CUDA 10.1 that comes with the Kaggle GPU Docker image. The `kaggle-l5kit-110` dataset has been set up to provide the CUDA 10.1 build, so you can proceed with no issues.
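
One quick way to tell which CUDA build a torch wheel targets is the local version suffix in its version string (for example `+cu101`). The helper below is a minimal sketch: `cuda_tag` is a hypothetical name, and it assumes the `+cuXYZ` suffix convention, so an untagged version string tells you nothing either way.

```python
import re
from typing import Optional


def cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA version encoded in a torch version string, if any.

    Assumes the '+cuXYZ' local-version suffix, e.g. '1.6.0+cu101' -> '10.1'.
    Returns None for CPU-only or untagged builds.
    """
    match = re.search(r"\+cu(\d+)$", version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version, everything before it the major.
    return f"{digits[:-1]}.{digits[-1]}"


print(cuda_tag("1.6.0+cu101"))  # tagged CUDA 10.1 build
print(cuda_tag("1.6.0"))        # no tag: cannot tell from the string alone
```

In the notebook this could be applied to `torch.__version__`; `torch.version.cuda` reports the same information directly.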

We can check that the GPU and CUDA version now work correctly.

In [None]:
!nvcc --version

In [None]:
import l5kit
import torch
import torchvision
l5kit.__version__, torch.__version__, torchvision.__version__, torch.cuda.is_available()

## Package imports and setup

In [None]:
import os
import l5kit
import torch
import zarr
import pandas as pd
import numpy as np
from l5kit.data import LocalDataManager, ChunkedDataset
from l5kit.dataset import AgentDataset, EgoDataset

### Set env variable for data and initialize DataManager

In [None]:
os.environ["L5KIT_DATA_FOLDER"] = "../input/lyft-motion-prediction-autonomous-vehicles"

In [None]:
dm = LocalDataManager()
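
The `dm.require(key)` calls used in the rest of the notebook resolve a dataset key against the folder named in `L5KIT_DATA_FOLDER`. The sketch below imitates that lookup with the standard library only. `resolve_key` and the `DEMO_DATA_FOLDER` variable are hypothetical stand-ins written for illustration, not l5kit's actual implementation.

```python
import os
import tempfile
from pathlib import Path


def resolve_key(key: str, env_var: str = "L5KIT_DATA_FOLDER") -> str:
    """Join `key` onto the root folder from the environment and check it exists."""
    root = Path(os.environ[env_var])
    path = root / key
    if not path.exists():
        raise FileNotFoundError(f"{key} not found under {root}")
    return str(path)


# Demo against a throwaway folder so the snippet runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "scenes", "sample.zarr"))
    os.environ["DEMO_DATA_FOLDER"] = tmp
    found = resolve_key("scenes/sample.zarr", env_var="DEMO_DATA_FOLDER")
    print(found.endswith("sample.zarr"))
```

Failing early on a missing key is the useful part: a wrong mount point shows up here rather than deep inside a dataset constructor.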

## Datasets

Open each dataset and print a brief summary of it.

In [None]:
%%time
sample_dataset = ChunkedDataset(dm.require('scenes/sample.zarr'))
sample_dataset.open()
print(sample_dataset)

In [None]:
%%time
train_dataset = ChunkedDataset(dm.require('scenes/train.zarr'))
train_dataset.open()
print(train_dataset)

In [None]:
%%time
val_dataset = ChunkedDataset(dm.require('scenes/validate.zarr'))
val_dataset.open()
print(val_dataset)

In [None]:
%%time
test_dataset = ChunkedDataset(dm.require('scenes/test.zarr'))
test_dataset.open()
print(test_dataset)

It seems that the average scene duration differs between the datasets: 10.0s in test versus 24.8s in sample, train, and validate.
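
The durations can be cross-checked from the frame intervals of each scene. The sketch below assumes frames are sampled at 10 Hz (matching the 0.1 s step in the configuration further down) and that intervals are half-open `[start, end)`. The `frame_index_interval` values are invented for the example, not taken from the real data.

```python
import numpy as np

FRAME_DT = 0.1  # assumed seconds between consecutive frames (10 Hz)

# Toy [start, end) frame intervals for three scenes; the values are invented.
frame_index_interval = np.array([[0, 248], [248, 497], [497, 597]])

frames_per_scene = frame_index_interval[:, 1] - frame_index_interval[:, 0]
seconds_per_scene = frames_per_scene * FRAME_DT

print(frames_per_scene)
print(seconds_per_scene)
```

Under these assumptions a 24.8s scene corresponds to roughly 248 frames and a 10.0s scene to 100 frames.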

In [None]:
del train_dataset
del val_dataset

## Explore sample dataset

In [None]:
sample_dataset.scenes

In [None]:
sample_dataset.frames

In [None]:
sample_dataset.tl_faces

In [None]:
sample_dataset.agents

### Agent labels

In [None]:
agents_df = pd.DataFrame.from_records(sample_dataset.agents, columns = ['centroid', 'extent', 'yaw', 'velocity', 'track_id', 'label_probabilities'])
agents_df

What are their track ids?

In [None]:
agents_df.track_id.value_counts()

Classifications?

In [None]:
from l5kit.data import PERCEPTION_LABELS

agents_df['label_probabilities'].map(np.argmax).map(lambda i: PERCEPTION_LABELS[i]).value_counts()
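
To make the `argmax` lookup above concrete, here is a toy version of the same mapping. The label names and probabilities are invented; the real label list is `PERCEPTION_LABELS` from `l5kit.data`.

```python
import numpy as np

# Hypothetical label names; the real list lives in l5kit.data.PERCEPTION_LABELS.
TOY_LABELS = ["UNKNOWN", "CAR", "CYCLIST", "PEDESTRIAN"]

# One row of class probabilities per agent (invented values).
label_probabilities = np.array([
    [0.05, 0.90, 0.03, 0.02],
    [0.10, 0.10, 0.10, 0.70],
    [0.00, 1.00, 0.00, 0.00],
])

# argmax along the class axis gives the index of the most likely label.
predicted = [TOY_LABELS[i] for i in label_probabilities.argmax(axis=1)]
print(predicted)
```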

And how confident are the labels (highest class probability, floored to one decimal)?

In [None]:
agents_df['label_probabilities'].map(np.max).map(lambda p: int(p * 10) / 10.).value_counts()
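
The expression `int(p * 10) / 10.` truncates each probability down to one decimal place, so the counts form a coarse histogram of label confidence. A minimal sketch with invented values (`floor_to_tenth` is a hypothetical helper name):

```python
from collections import Counter


def floor_to_tenth(p: float) -> float:
    """Truncate a probability in [0, 1] down to one decimal place."""
    return int(p * 10) / 10.0


max_probs = [0.95, 0.99, 0.55, 0.51, 1.0]  # invented values
histogram = Counter(floor_to_tenth(p) for p in max_probs)
print(sorted(histogram.items()))
```

Only a probability of exactly 1.0 lands in the top bin; everything from 0.9 up to just below 1.0 is counted under 0.9.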

### Scenes

In [None]:
scene_df = pd.DataFrame.from_records(sample_dataset.scenes, columns=('frame_index_interval', 'host', 'start_time', 'end_time'))
scene_df

Scene hosts?

In [None]:
scene_df['host'].value_counts()

### Frames

In [None]:
frames_df = pd.DataFrame.from_records(sample_dataset.frames, columns = ['timestamp', 'agent_index_interval', 'traffic_light_faces_index_interval', 'ego_translation', 'ego_rotation'])
frames_df
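
The `agent_index_interval` column is what ties frames to agents: agents sit in one flat array, and each frame records which slice of it belongs to that frame. The sketch below rebuilds this layout with tiny structured arrays. All values are invented, `agents_in_frame` is a hypothetical helper, and the interval is assumed to be half-open `[start, end)`.

```python
import numpy as np

# Toy structured arrays mimicking the layout; only the fields needed here.
agents = np.array(
    [(10,), (11,), (12,), (10,), (13,)],
    dtype=[("track_id", np.uint64)],
)
frames = np.array(
    [([0, 3],), ([3, 5],)],
    dtype=[("agent_index_interval", np.int64, (2,))],
)


def agents_in_frame(frame_idx: int) -> np.ndarray:
    """Slice the flat agents array with the frame's [start, end) interval."""
    start, end = frames[frame_idx]["agent_index_interval"]
    return agents[start:end]


print(agents_in_frame(0)["track_id"])
print(agents_in_frame(1)["track_id"])
```

The same indirection applies to `traffic_light_faces_index_interval` and, one level up, to `frame_index_interval` in the scenes table.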

In [None]:
tl_faces_df = pd.DataFrame.from_records(sample_dataset.tl_faces, columns = ['face_id', 'traffic_light_id', 'traffic_light_face_status'])
tl_faces_df

In [None]:
del sample_dataset

## Explore Test Dataset

In [None]:
test_dataset.scenes

In [None]:
test_dataset.frames

In [None]:
test_dataset.agents

In [None]:
test_dataset.tl_faces

### Test Scenes

In [None]:
scene_df = pd.DataFrame.from_records(test_dataset.scenes, columns=('frame_index_interval', 'host', 'start_time', 'end_time'))
scene_df

### Test Agents (sample 50000)

In [None]:
agents_df = pd.DataFrame.from_records(zarr.array(test_dataset.agents[:50000]), 
                                      columns = ['centroid', 'extent', 'yaw', 'velocity', 'track_id', 'label_probabilities'])
agents_df

And their tracks

In [None]:
agents_df.track_id.value_counts()

### Test Frames (sample 10000)

In [None]:
frames_df = pd.DataFrame.from_records(zarr.array(test_dataset.frames[:10000]), columns = ['timestamp', 'agent_index_interval', 'traffic_light_faces_index_interval', 'ego_translation', 'ego_rotation'])
frames_df

## AgentDataset

In [None]:
CONFIG_DATA = {
    "format_version": 4,
    "model_params": {
        "model_architecture": "resnet50",
        # max is 99, but set to 101 nevertheless
        "history_num_frames": 101,
        "history_step_size": 1,
        "history_delta_time": 0.1,
        "future_num_frames": 50,
        "future_step_size": 1,
        "future_delta_time": 0.1,
    },
    "raster_params": {
        "raster_size": [256, 256],
        "pixel_size": [0.5, 0.5],
        "ego_center": [0.25, 0.5],
        "map_type": "py_semantic",
        "satellite_map_key": "aerial_map/aerial_map.png",
        "semantic_map_key": "semantic_map/semantic_map.pb",
        "dataset_meta_key": "meta.json",
        "filter_agents_threshold": 0.5,
        "disable_traffic_light_faces": False,
    },
    "test_dataloader": {
        "key": "scenes/test.zarr",
        "batch_size": 16,
        "shuffle": False,
        "num_workers": 4,
    },
}
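
A few quantities follow directly from these settings. The arithmetic below is a sketch that copies the numbers from the config and assumes the usual reading of the parameters: `pixel_size` in metres per pixel and `ego_center` as a fraction of the raster width and height.

```python
raster_size = (256, 256)   # pixels
pixel_size = (0.5, 0.5)    # assumed metres per pixel
ego_center = (0.25, 0.5)   # assumed fraction of raster width / height

# Ground area covered by one raster, in metres.
extent_m = tuple(n * s for n, s in zip(raster_size, pixel_size))

# Pixel at which the target is anchored.
center_px = tuple(int(n * c) for n, c in zip(raster_size, ego_center))

# Time spans implied by the frame counts and the 0.1 s step.
history_seconds = 101 * 0.1
future_seconds = 50 * 0.1

print(extent_m, center_px, round(history_seconds, 1), future_seconds)
```

So each raster covers 128 m by 128 m with the target a quarter of the way in from one edge, and the model is asked for 5 seconds of future trajectory.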

In [None]:
test_mask = np.load("../input/lyft-motion-prediction-autonomous-vehicles/scenes/mask.npz")["arr_0"]
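
The `"arr_0"` key is the default name `np.savez` gives to an array passed positionally, which is presumably how the mask file was written. The round trip below is a sketch with an invented five-element mask and an in-memory buffer in place of the real file.

```python
import io

import numpy as np

# Toy boolean mask: True marks agents that should be predicted (invented values).
toy_mask = np.array([False, True, False, True, True])

# Arrays passed positionally to np.savez are stored as arr_0, arr_1, ...
buffer = io.BytesIO()
np.savez(buffer, toy_mask)
buffer.seek(0)

loaded = np.load(buffer)["arr_0"]
selected_indices = np.nonzero(loaded)[0]
print(loaded.dtype, selected_indices)
```

`np.nonzero` gives the positions of the selected entries, a handy sanity check on how many agents a mask keeps.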

In [None]:
from l5kit.rasterization import build_rasterizer

rast = build_rasterizer(CONFIG_DATA, dm)

In [None]:
agent_dataset = AgentDataset(
    CONFIG_DATA, test_dataset, rast, agents_mask=test_mask
)

In [None]:
len(agent_dataset)

In [None]:
from tqdm.notebook import tqdm
from itertools import islice

items = []
track_ids = []
for i in tqdm(islice(agent_dataset, 20)):
    track_ids.append(i['track_id'])
    items.append(i)
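
`islice` is used above because each item is built on access, so peeking at the first 20 should not touch the rest of the dataset. The sketch below shows that behaviour with a hypothetical `ToyDataset` that records which indices were actually built.

```python
from itertools import islice


class ToyDataset:
    """Hypothetical map-style dataset that records which indices were built."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.built = []

    def __len__(self) -> int:
        return self.size

    def __getitem__(self, index: int) -> dict:
        if index >= self.size:
            raise IndexError(index)
        self.built.append(index)
        return {"track_id": index % 3}


dataset = ToyDataset(1000)
first_items = list(islice(dataset, 5))

print(len(first_items), dataset.built)
```

Python falls back to calling `__getitem__` with 0, 1, 2, ... when an object has no `__iter__`, which is why a map-style dataset can be passed to `islice` directly.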

In [None]:
len(track_ids), len(set(track_ids))

In [None]:
items[0].keys()

In [None]:
items[0]['track_id']

In [None]:
[len(item['history_availabilities']) for item in items]

**Notice how the trailing 2 items (101 - 99) are empty.**
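
When using the history, the availability flags can serve as a mask so that padded steps are ignored. A minimal sketch with invented arrays, assuming padded steps carry an availability of 0:

```python
import numpy as np

# Toy stand-ins for one item's history arrays (invented values).
history_positions = np.array(
    [[0.0, 0.0], [-0.5, 0.1], [-1.0, 0.2], [0.0, 0.0], [0.0, 0.0]]
)
history_availabilities = np.array([1.0, 1.0, 1.0, 0.0, 0.0])

# Keep only the steps flagged as available; padded steps are dropped.
valid = history_availabilities > 0
observed_positions = history_positions[valid]

print(int(valid.sum()), observed_positions.shape)
```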

In [None]:
items[0]['history_positions']

In [None]:
items[0]['history_yaws']