# <span style="color:blue"> Lyft 3D Object Detection for Autonomous Vehicles </span>

<br>

<img src="https://s3-prod.crainsnewyork.com/s3fs-public/MAIN-Lyft%20pink%20cars_Buck%20Ennis_i_i.jpg" height="500" width="500"> 
 
**Self-driving technology** presents a rare opportunity to improve the quality of life in many of our communities. Avoidable collisions, single-occupant commuters, and vehicle emissions are choking cities, while infrastructure strains under rapid urban growth. Autonomous vehicles are expected to redefine transportation and unlock a myriad of societal, environmental, and economic benefits. You can apply your data analysis skills in this competition to advance the state of self-driving technology.

![](https://storage.googleapis.com/kaggle-media/competitions/Lyft-Kaggle/Kaggle-01.png)


**This dataset** aims to democratize access to such data, and foster innovation in higher-level autonomy functions for everyone, everywhere. By conducting a competition, we hope to encourage the research community to focus on hard problems in this space—namely, 3D object detection over semantic maps.

In **this competition**, you will build and optimize algorithms based on a large-scale dataset. This dataset features the raw sensor camera inputs as perceived by a fleet of multiple, high-end, autonomous vehicles in a restricted geographic area. 


##  References

- [Lyft: Quick EDA and creating useful files](https://www.kaggle.com/xhlulu/lyft-quick-eda-and-creating-useful-files) by @xhlulu
- **[Official Devkit for the public 2019 Lyft Level 5 AV Dataset](https://github.com/lyft/nuscenes-devkit) by @iglovikov**

# Data


You will need the **LIDAR, image**, map and data files for both train and test (```test_images.zip```, ```test_lidar.zip```, etc.). You may also need the train.csv, which includes the sample annotations in the form expected for submissions. The ```sample_submission.csv``` file contains all of the sample ```Ids``` for the test set.

The data files (```test_data.zip```, ```train_data.zip```) are in JSON format.

<br>

- **train_data.zip** and **test_data.zip** - contain JSON files with multiple tables. The most important is ```sample_data.json```, which contains the primary identifiers used in the competition, as well as links to key image / lidar information.
    
- **train_images.zip** and **test_images.zip** - contain .jpeg files corresponding to samples in ```sample_data.json```
- **train_lidar.zip** and **test_lidar.zip** - contain .bin files corresponding to samples in ```sample_data.json```
- **train_maps.zip** and **test_maps.zip** - contain maps of the entire sample area.
- **train.csv** - contains all ```sample_tokens``` in the train set, as well as annotations in the required format for all train set objects.
- **sample_submission.csv** - contains all ```sample_tokens``` in the test set, with empty predictions.
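
As a rough illustration of the annotation format in ```train.csv```, the sketch below splits one annotation string into per-object records. It assumes each object is written as eight space-separated values (`center_x center_y center_z width length height yaw class_name`); that layout, the sample string, and the helper name `parse_annotation_string` are assumptions made for illustration, so check them against the actual file before relying on them.

```python
from typing import Dict, List, Union

# Assumed field layout of one object in the annotation string of train.csv.
FIELDS = ["center_x", "center_y", "center_z", "width", "length", "height", "yaw", "class_name"]


def parse_annotation_string(s: str) -> List[Dict[str, Union[float, str]]]:
    """Split a space-separated annotation string into one dict per object."""
    values = s.split()
    if len(values) % len(FIELDS) != 0:
        raise ValueError("expected a multiple of %d values, got %d" % (len(FIELDS), len(values)))
    objects = []
    for start in range(0, len(values), len(FIELDS)):
        chunk = values[start:start + len(FIELDS)]
        # The first seven values are numeric, the last one is the class label.
        record = {name: float(v) for name, v in zip(FIELDS[:-1], chunk[:-1])}
        record["class_name"] = chunk[-1]
        objects.append(record)
    return objects


# Hypothetical string with two objects (values invented for illustration).
example = "10.0 20.0 -1.5 1.9 4.5 1.6 0.3 car 12.5 18.0 -1.2 0.7 0.8 1.7 1.2 pedestrian"
parsed = parse_annotation_string(example)
print(len(parsed), parsed[0]["class_name"], parsed[1]["yaw"])
```

Raising on a value count that is not a multiple of eight makes a wrong layout assumption fail early instead of silently shifting fields.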


In [None]:
!ls ../input/3d-object-detection-for-autonomous-vehicles

# Lyft Level 5 AV dataset and nuScenes devkit tutorial

### <span style="color:red"> IMPORTANT </span>

> This is a modification of the official devkit tutorial: https://github.com/lyft/nuscenes-devkit . I modified the code so that it runs here, and I'll add new material. Check the official repository for more information.


Welcome to the Level 5 AV dataset & nuScenes SDK tutorial!

This notebook is based on the original nuScenes tutorial notebook (https://www.nuscenes.org/) and was adjusted for the Level 5 AV dataset.

## Introduction to the dataset structure

In this part of the tutorial, let us go through a top-down introduction of our database. Our dataset comprises the following elemental building blocks:

1. `scene` - A 25-45 second snippet of a car's journey.
2. `sample` - An annotated snapshot of a scene at a particular timestamp.
3. `sample_data` - Data collected from a particular sensor.
4. `sample_annotation` - An annotated instance of an object within our interest.
5. `instance` - Enumeration of all object instances we observed.
6. `category` - Taxonomy of object categories (e.g. vehicle, human). 
7. `attribute` - Property of an instance that can change while the category remains the same.
8. `visibility` - (currently not used)
9. `sensor` - A specific sensor type.
10. `calibrated_sensor` - Definition of a particular sensor as calibrated on a particular vehicle.
11. `ego_pose` - Ego vehicle poses at a particular timestamp.
12. `log` - Log information from which the data was extracted.
13. `map` - Map data that is stored as binary semantic masks from a top-down view.

Let's get started! Make sure that you have a local copy of a dataset (for download instructions, see https://level5.lyft.com/dataset/). Then, adjust `dataroot` below to point to your local dataset path. If everything is set up correctly, you should be able to execute the following cell successfully.

In [None]:
!pip install lyft-dataset-sdk

In [None]:
# Load the SDK
%matplotlib inline
from lyft_dataset_sdk.lyftdataset import LyftDataset

Thanks @rishabhiitbhu for this comment: https://www.kaggle.com/seshurajup/lyft-level-5-av-dataset-notebook-from-github#625566

In [None]:
# Load the dataset
# Adjust the dataroot parameter below to point to your local dataset path.
# The correct dataset path contains at least the following four folders (or similar): images, lidar, maps, v1.0.1-train
!ln -s /kaggle/input/3d-object-detection-for-autonomous-vehicles/train_images images
!ln -s /kaggle/input/3d-object-detection-for-autonomous-vehicles/train_maps maps
!ln -s /kaggle/input/3d-object-detection-for-autonomous-vehicles/train_lidar lidar

In [None]:
level5data = LyftDataset(data_path='.', json_path='/kaggle/input/3d-object-detection-for-autonomous-vehicles/train_data', verbose=True)

### 1. Scene

Let's take a look at the scenes that we have in the loaded database. The original tutorial used a sample dataset with a single scene; the competition train set loaded here lists many more.

In [None]:
level5data.list_scenes()

Let's look at a scene's **metadata**

In [None]:
my_scene = level5data.scene[0]
my_scene

### 2. Sample

We define `sample` as an ***annotated keyframe of a scene at a given timestamp***. A keyframe is a frame where the time-stamps of data from all the sensors should be very close to the time-stamp of the sample it points to.

Now, let us look at the first annotated sample in this scene.

In [None]:
my_sample_token = my_scene["first_sample_token"]
# my_sample_token = level5data.get("sample", my_sample_token)["next"]  # proceed to next sample

level5data.render_sample(my_sample_token)

Let's examine its **metadata** (click `output`)

In [None]:
my_sample = level5data.get('sample', my_sample_token)
my_sample

A useful method is `list_sample()`, which lists all `sample_data` keyframes and `sample_annotation` records associated with a `sample`. We discuss both in detail in the following parts.

In [None]:
level5data.list_sample(my_sample['token'])

Instead of looking at camera and lidar data separately, we can also project the lidar pointcloud into camera images:

In [None]:
level5data.render_pointcloud_in_image(sample_token = my_sample["token"],
                                      dot_size = 1,
                                      camera_channel = 'CAM_FRONT')

### 3. Sample_data

The dataset contains data that is collected from a full sensor suite. Hence, for each snapshot of a scene, we provide references to a family of data that is collected from these sensors. 

We provide a `data` key to access these:

In [None]:
my_sample['data']

Notice that the keys are referring to the different sensors that form our sensor suite. Let's take a look at the metadata of a `sample_data` taken from `CAM_FRONT`.

In [None]:
sensor_channel = 'CAM_FRONT'  # also try this e.g. with 'LIDAR_TOP'
my_sample_data = level5data.get('sample_data', my_sample['data'][sensor_channel])
my_sample_data

We can also render the `sample_data` at a particular sensor. 

In [None]:
level5data.render_sample_data(my_sample_data['token'])

### 4. Sample_annotation

`sample_annotation` refers to any ***bounding box defining the position of an object seen in a sample***. All location data is given with respect to the global coordinate system. Let's examine an example from our `sample` above.
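
To make the box representation more concrete, here is a minimal sketch that turns an annotation into the four ground-plane corners of its box in global coordinates. It assumes a record with `translation` (box center), `size` in the order width, length, height, and `rotation` as a unit quaternion `[w, x, y, z]` whose heading is a rotation about the z axis; the record values and helper names are hypothetical, so compare them with a real `sample_annotation` record first.

```python
import math
from typing import List, Sequence, Tuple


def yaw_from_quaternion(q: Sequence[float]) -> float:
    """Heading angle (rotation about the z axis) of a unit quaternion [w, x, y, z]."""
    w, x, y, z = q
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


def footprint_corners(translation: Sequence[float],
                      size: Sequence[float],
                      rotation: Sequence[float]) -> List[Tuple[float, float]]:
    """Four ground-plane corners of a box in global x/y coordinates."""
    cx, cy, _ = translation
    width, length, _ = size  # assumed order: width, length, height
    yaw = yaw_from_quaternion(rotation)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx, dy in [(1, 1), (1, -1), (-1, -1), (-1, 1)]:
        # Offset in the box frame: length along the heading, width across it.
        lx, ly = dx * length / 2.0, dy * width / 2.0
        # Rotate the offset by the heading and shift it to the box center.
        corners.append((cx + lx * cos_y - ly * sin_y, cy + lx * sin_y + ly * cos_y))
    return corners


# Hypothetical annotation: a 2 m wide, 4 m long box at (10, 5), turned 90 degrees about z.
ann = {
    "translation": [10.0, 5.0, 0.0],
    "size": [2.0, 4.0, 1.5],
    "rotation": [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)],
}
corners = footprint_corners(ann["translation"], ann["size"], ann["rotation"])
print([(round(x, 3), round(y, 3)) for x, y in corners])
```

The sketch ignores roll and pitch, which is a simplification that only holds for boxes that stay level with the ground.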

In [None]:
my_annotation_token = my_sample['anns'][16]
# Look the annotation up through the dataset object; my_sample_data is a plain dict
my_annotation = level5data.get('sample_annotation', my_annotation_token)
my_annotation

We can also render an annotation to have a closer look.

In [None]:
level5data.render_annotation(my_annotation_token)

### 5. Instance

Object instances are the objects that need to be detected or tracked by an AV (e.g. a particular vehicle or pedestrian). Let us examine the metadata of an instance.

In [None]:
my_instance = level5data.instance[100]
my_instance

We generally track an instance across different frames in a particular scene. However, we do not track them across different scenes. In this example, we have 16 annotated samples for this instance across a particular scene.

In [None]:
instance_token = my_instance['token']
level5data.render_instance(instance_token)

An instance record takes note of its first and last annotation token. Let's render them

In [None]:
print("First annotated sample of this instance:")
level5data.render_annotation(my_instance['first_annotation_token'])

In [None]:
print("Last annotated sample of this instance")
level5data.render_annotation(my_instance['last_annotation_token'])

### 6. Category

A `category` is the object assignment of an annotation. Let's look at the category table we have in our database. The table contains the taxonomy of different object categories and also lists the subcategories (delineated by a period).

In [None]:
level5data.list_categories()

A category record contains the name and the description of that particular category.

In [None]:
level5data.category[2]

### 7. Attribute

An `attribute` is a property of an instance that may change throughout different parts of a scene while the category remains the same. Here we list the provided attributes and the number of annotations associated with a particular attribute.

In [None]:
level5data.list_attributes()

Let's take a look at an example of how an attribute may change over one scene.

In [None]:
for my_instance in level5data.instance:
    first_token = my_instance['first_annotation_token']
    last_token = my_instance['last_annotation_token']
    nbr_samples = my_instance['nbr_annotations']
    current_token = first_token

    i = 0
    found_change = False
    while current_token != last_token:
        current_ann = level5data.get('sample_annotation', current_token)
        current_attr = level5data.get('attribute', current_ann['attribute_tokens'][0])['name']

        if i == 0:
            pass
        elif current_attr != last_attr:
            print("Changed from `{}` to `{}` at timestamp {} out of {} annotated timestamps".format(last_attr, current_attr, i, nbr_samples))
            found_change = True

        next_token = current_ann['next']
        current_token = next_token
        last_attr = current_attr
        i += 1

### 8. Sensor

The Level 5 dataset consists of data collected from our full sensor suite, which consists of:
- 1 x LIDAR (up to three in the final dataset)
- 7 x cameras

In [None]:
level5data.sensor

Every `sample_data` has a record on which `sensor` the data is collected from (note the "channel" key)

In [None]:
level5data.sample_data[10]

### 9. Calibrated_sensor

`calibrated_sensor` consists of the definition of a particular sensor (lidar/camera) as calibrated on a particular vehicle. Let us look at an example.

In [None]:
level5data.calibrated_sensor[0]

Note that the `translation` and the `rotation` parameters are given with respect to the ego vehicle body frame. 

### 10. ego_pose

`ego_pose` contains information about the location (encoded in `translation`) and the orientation (encoded in `rotation`) of the ego vehicle body frame, with respect to the global coordinate system.
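
Taken together, the `calibrated_sensor` and `ego_pose` records describe a chain of frames: sensor frame, then ego vehicle body frame, then global frame. The sketch below walks a single point through that chain. It assumes `rotation` is a unit quaternion `[w, x, y, z]` and `translation` is `[x, y, z]`, applied as p' = R p + t; the record values are invented and the helper names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rotate(q: Sequence[float], v: Sequence[float]) -> Vec3:
    """Rotate vector v by the unit quaternion q = [w, x, y, z]."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + (q_vec x t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def apply_pose(record: dict, point: Sequence[float]) -> Vec3:
    """Map a point from the child frame into the parent frame: p' = R p + t."""
    rx, ry, rz = rotate(record["rotation"], point)
    tx, ty, tz = record["translation"]
    return (rx + tx, ry + ty, rz + tz)


# Hypothetical records: sensor mounted 1 m forward and 2 m up, no rotation;
# ego vehicle at (100, 50, 0), turned 90 degrees about the z axis.
calibrated_sensor = {"translation": [1.0, 0.0, 2.0], "rotation": [1.0, 0.0, 0.0, 0.0]}
ego_pose = {
    "translation": [100.0, 50.0, 0.0],
    "rotation": [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)],
}

point_sensor = (5.0, 0.0, 0.0)
point_ego = apply_pose(calibrated_sensor, point_sensor)  # sensor frame -> ego frame
point_global = apply_pose(ego_pose, point_ego)           # ego frame -> global frame
print(point_ego, tuple(round(c, 3) for c in point_global))
```

Going the other way (global to sensor) means subtracting the translation first and then rotating by the inverse quaternion, in reverse order of the frames.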

In [None]:
level5data.ego_pose[0]

### 11. log

The `log` table contains log information from which the data was extracted. A `log` record corresponds to one journey of our ego vehicle along a predefined route. Let's check the number of logs and the metadata of a log.

In [None]:
print("Number of `logs` in our loaded database: {}".format(len(level5data.log)))

In [None]:
level5data.log[0]

Notice that it contains a variety of information such as the date and location of the log collected. It also gives out information about the map from where the data was collected. Note that one log can contain multiple non-overlapping scenes.

### 12. Map

Map information is currently stored in a 2D rasterized image. Let's check the number of maps and metadata of a map.

In [None]:
print("There are {} map masks in the loaded dataset".format(len(level5data.map)))

In [None]:
level5data.map[0]

### Memory!!

The map can e.g. be displayed in the background of top-down views:
> I don't run this, we need more RAM...

In [None]:
sensor_channel = 'LIDAR_TOP'
#my_sample_data = level5data.get('sample_data', my_sample['data'][sensor_channel])
# The following call can be slow and requires a lot of memory
#level5data.render_sample_data(my_sample_data['token'], underlay_map = True)

## Dataset and Devkit Basics

Let's get a bit **technical.**

The `LyftDataset` class (called `NuScenes` in the original devkit tutorial) holds several tables. Each table is a list of records, and each record is a dictionary. For example, the first record of the category table is stored at:

In [None]:
level5data.category[0]

The category table is simple: it holds the fields `name` and `description`. It also has a `token` field, which is a unique record identifier. Since the record is a dictionary, the token can be accessed like so:

In [None]:
cat_token = level5data.category[0]['token']
cat_token

If you know the `token` for any record in the DB you can retrieve the record by doing

In [None]:
level5data.get('category', cat_token)

_As you can notice, we have recovered the same record!_

OK, that was easy. Let's try something harder. Let's look at the `sample_annotation` table.

In [None]:
level5data.sample_annotation[0]

This also has a `token` field (they all do). In addition, it has several fields of the format [a-z]*\_token, _e.g._ instance_token. These are foreign keys in database speak, meaning they point to another table. 
Using `level5data.get()` we can grab any of these in constant time.
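
One way such constant-time lookups can work is a per-table dictionary from token to list position, built once when the tables are loaded. The toy class below sketches that idea; `TinyTables` and its records are hypothetical and are not the SDK's actual implementation.

```python
from typing import Dict, List


class TinyTables:
    """Toy stand-in for the dataset object: tables are lists of dict records."""

    def __init__(self, tables: Dict[str, List[dict]]):
        self.tables = tables
        # One token -> list-position map per table, built once up front.
        self._token2ind = {
            name: {rec["token"]: i for i, rec in enumerate(records)}
            for name, records in tables.items()
        }

    def get(self, table_name: str, token: str) -> dict:
        """Return the record with the given token via a dictionary lookup."""
        return self.tables[table_name][self._token2ind[table_name][token]]


# Toy records (tokens and names invented for illustration).
db = TinyTables({
    "category": [{"token": "c1", "name": "car"}, {"token": "c2", "name": "bus"}],
    "instance": [{"token": "i1", "category_token": "c2"}],
})

# Follow a foreign key: instance -> category.
inst = db.get("instance", "i1")
cat = db.get("category", inst["category_token"])
print(cat["name"])
```

The index costs one pass over each table at load time, after which following a `*_token` field is a single dictionary access.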

Note that in our dataset, we don't provide `num_lidar_pts` and set it to `-1` to indicate this.

In [None]:
one_instance = level5data.get('instance', level5data.sample_annotation[0]['instance_token'])
one_instance

This points to the `instance` table. This table enumerates the object _instances_ we have encountered in each scene. This way we can connect all annotations of a particular object.

If you look carefully at the tables, you will see that the sample_annotation table points to the instance table, 
but the instance table doesn't list all annotations that point to it. 

So how can we recover all sample_annotations for a particular object instance? There are two ways:

1. `Use level5data.field2token()`. Let's try it:

In [None]:
ann_tokens = level5data.field2token('sample_annotation', 'instance_token', one_instance['token'])

This returns a list of the tokens of all sample_annotation records with `'instance_token'` == `one_instance['token']`. Let's store these in a set for now.

In [None]:
ann_tokens_field2token = set(ann_tokens)

ann_tokens_field2token

The `level5data.field2token()` method is generic and can be used in any similar situation.
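
Conceptually, a query like this has to scan the whole table and compare one field per record. A minimal sketch of that idea on toy records follows; the helper name `field2token_sketch` and the data are hypothetical, and the SDK's real implementation may differ.

```python
from typing import Any, List


def field2token_sketch(table: List[dict], field: str, query: Any) -> List[str]:
    """Return the tokens of all records whose `field` equals `query` (linear scan)."""
    return [rec["token"] for rec in table if rec[field] == query]


# Toy annotation table (tokens invented for illustration).
annotations = [
    {"token": "a1", "instance_token": "i1"},
    {"token": "a2", "instance_token": "i2"},
    {"token": "a3", "instance_token": "i1"},
]
matches = field2token_sketch(annotations, "instance_token", "i1")
print(matches)
```

Because every call touches every record, repeated queries over a large table are better served by an index that is built once.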

2. For certain situations, we provide some reverse indices in the tables themselves. This is one such example.

The instance record has a field `first_annotation_token` which points to the first annotation in time of this instance. 
Recovering this record is easy.

In [None]:
ann_record = level5data.get('sample_annotation', one_instance['first_annotation_token'])
ann_record

Now we can traverse all annotations of this instance using the "next" field. Let's try it. 

In [None]:
ann_tokens_traverse = set()
ann_tokens_traverse.add(ann_record['token'])
while not ann_record['next'] == "":
    ann_record = level5data.get('sample_annotation', ann_record['next'])
    ann_tokens_traverse.add(ann_record['token'])

Finally, let's assert that we recovered the same ann_records as we did using level5data.field2token:

In [None]:
print(ann_tokens_traverse == ann_tokens_field2token)

## Reverse indexing and short-cuts

The dataset tables are normalized, meaning that each piece of information is only given once.
For example, there is one `map` record for each `log` record. Looking at the schema you will notice that the `map` table has a `log_token` field, but that the `log` table does not have a corresponding `map_token` field. But there are plenty of situations where you have a `log`, and want to find the corresponding `map`! So what to do? You can always use the `level5data.field2token()` method, but that is slow and inconvenient. The devkit therefore adds reverse mappings for some common situations including this one.

Further, there are situations where one needs to go through several tables to get a certain piece of information. 
Consider, for example, the category name of a `sample_annotation`. The `sample_annotation` table doesn't hold this information, since the category is an instance-level constant. Instead, the `sample_annotation` table points to a record in the `instance` table. This, in turn, points to a record in the `category` table, where the `name` field finally stores the required information.

Since it is quite common to want to know the category name of an annotation, a `category_name` field is added to the `sample_annotation` table during initialization of the `LyftDataset` class (`NuScenes` in the original devkit).

In this section, we list the short-cuts and reverse indices that are added to the `LyftDataset` class during initialization. These are all created in the `LyftDataset.__make_reverse_index__()` method.

### Reverse indices
The devkit adds two reverse indices by default.
* A `map_token` field is added to the `log` records.
* The `sample` records have shortcuts to all `sample_annotations` for that record as well as `sample_data` key-frames. See the `level5data.list_sample()` method in the previous section for more details on this.
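
The sketch below shows, on toy tables, how a shortcut and a reverse index of this kind could be built in a single pass over the annotations. The table contents and the function name are hypothetical; the field names `sample_token`, `instance_token`, `category_token`, `anns`, and `category_name` follow the ones used in this notebook.

```python
from typing import Dict, List

# Toy tables (tokens and names invented for illustration).
category = [{"token": "c1", "name": "car"}, {"token": "c2", "name": "pedestrian"}]
instance = [{"token": "i1", "category_token": "c1"}, {"token": "i2", "category_token": "c2"}]
sample = [{"token": "s1"}, {"token": "s2"}]
sample_annotation = [
    {"token": "a1", "sample_token": "s1", "instance_token": "i1"},
    {"token": "a2", "sample_token": "s1", "instance_token": "i2"},
    {"token": "a3", "sample_token": "s2", "instance_token": "i1"},
]


def by_token(table: List[dict]) -> Dict[str, dict]:
    """Index a table by its record tokens."""
    return {rec["token"]: rec for rec in table}


def add_shortcuts_and_reverse_indices() -> None:
    cat, inst, samp = by_token(category), by_token(instance), by_token(sample)
    for rec in sample:
        rec["anns"] = []
    for ann in sample_annotation:
        # Shortcut: resolve annotation -> instance -> category once and store the name.
        ann["category_name"] = cat[inst[ann["instance_token"]]["category_token"]]["name"]
        # Reverse index: let each sample list the annotations that point at it.
        samp[ann["sample_token"]]["anns"].append(ann["token"])


add_shortcuts_and_reverse_indices()
print(sample[0]["anns"], sample_annotation[2]["category_name"])
```

The stored tables stay normalized on disk; the extra fields exist only in memory after loading.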

### Shortcuts

The sample_annotation table has a "category_name" shortcut.

_Using shortcut:_

In [None]:
catname = level5data.sample_annotation[0]['category_name']

_Not using shortcut:_

In [None]:
ann_rec = level5data.sample_annotation[0]
inst_rec = level5data.get('instance', ann_rec['instance_token'])
cat_rec = level5data.get('category', inst_rec['category_token'])

print(catname == cat_rec['name'])

The sample_data table has "channel" and "sensor_modality" shortcuts:

In [None]:
# Shortcut
channel = level5data.sample_data[0]['channel']

# No shortcut
sd_rec = level5data.sample_data[0]
cs_record = level5data.get('calibrated_sensor', sd_rec['calibrated_sensor_token'])
sensor_record = level5data.get('sensor', cs_record['sensor_token'])

print(channel == sensor_record['channel'])

## Data Visualizations

We provide list and rendering methods. These are meant both as convenience methods during development, and as tutorials for building your own visualization methods. They are implemented in the `LyftDatasetExplorer` class (`NuScenesExplorer` in the original devkit), with shortcuts through the `LyftDataset` class itself.

### List methods
There are three list methods available.

1. `list_categories()` lists all categories, counts and statistics of width/length/height in meters and aspect ratio.

In [None]:
level5data.list_categories()

2. `list_attributes()` lists all attributes and counts.

In [None]:
level5data.list_attributes()

3. `list_scenes()` lists all scenes in the loaded DB.

In [None]:
level5data.list_scenes()

### Render

First, let's plot a lidar point cloud in an image. Lidar allows us to accurately map the surroundings in 3D.
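
The last step of such an overlay is a pinhole projection of 3D points onto the image plane. The sketch below covers only that step and assumes the points are already expressed in the camera frame (x right, y down, z forward) and that the 3x3 intrinsic matrix is known; the matrix values and the helper name are made up for illustration. Bringing lidar points into the camera frame additionally requires the sensor and ego pose transforms.

```python
from typing import List, Sequence, Tuple


def project_to_image(points_cam: Sequence[Sequence[float]],
                     intrinsic: Sequence[Sequence[float]],
                     min_depth: float = 0.1) -> List[Tuple[float, float, float]]:
    """Project camera-frame points to pixels; returns (u, v, depth) per kept point."""
    pixels = []
    for x, y, z in points_cam:
        if z < min_depth:
            continue  # behind the camera or too close to it
        # Multiply by the intrinsic matrix, then divide by depth.
        u = (intrinsic[0][0] * x + intrinsic[0][1] * y + intrinsic[0][2] * z) / z
        v = (intrinsic[1][0] * x + intrinsic[1][1] * y + intrinsic[1][2] * z) / z
        pixels.append((u, v, z))
    return pixels


# Hypothetical intrinsics: focal length 1000 px, principal point (960, 540).
K = [
    [1000.0, 0.0, 960.0],
    [0.0, 1000.0, 540.0],
    [0.0, 0.0, 1.0],
]
points = [(1.0, 0.5, 10.0), (0.0, 0.0, -5.0)]  # the second point lies behind the camera
pixels = project_to_image(points, K)
print(pixels)
```

Keeping the depth next to each pixel makes it possible to color the dots by distance and to discard pixels that fall outside the image bounds.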

In [None]:
my_sample = level5data.sample[10]
level5data.render_pointcloud_in_image(my_sample['token'], pointsensor_channel='LIDAR_TOP')

We can also plot all annotations across all sample data for that sample.

In [None]:
my_sample = level5data.sample[20]

# The rendering command below is commented out because it tends to crash in notebooks
# level5data.render_sample(my_sample['token'])

Or if we only want to render a particular sensor, we can specify that.

In [None]:
level5data.render_sample_data(my_sample['data']['CAM_FRONT'])

Additionally we can aggregate the point clouds from multiple sweeps to get a denser point cloud.
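
How the SDK merges sweeps internally is not shown in this notebook. As a rough idea of what aggregation involves, the sketch below moves the points of each sweep into a common reference frame before concatenating them. It is simplified to 2D poses (`x`, `y`, `yaw`), and all names and values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def to_global(pose: Dict[str, float], pt: Point) -> Point:
    """Map a point from the frame described by `pose` into the global frame."""
    c, s = math.cos(pose["yaw"]), math.sin(pose["yaw"])
    return (pose["x"] + c * pt[0] - s * pt[1], pose["y"] + s * pt[0] + c * pt[1])


def from_global(pose: Dict[str, float], pt: Point) -> Point:
    """Map a global point into the frame described by `pose` (inverse transform)."""
    dx, dy = pt[0] - pose["x"], pt[1] - pose["y"]
    c, s = math.cos(pose["yaw"]), math.sin(pose["yaw"])
    return (c * dx + s * dy, -s * dx + c * dy)


def aggregate_sweeps(sweeps: List[dict], ref_pose: Dict[str, float]) -> List[Point]:
    """Express the points of every sweep in the reference frame and merge them."""
    merged = []
    for sweep in sweeps:
        for pt in sweep["points"]:
            merged.append(from_global(ref_pose, to_global(sweep["pose"], pt)))
    return merged


# Hypothetical data: the vehicle moved 2 m forward between the two sweeps.
ref_pose = {"x": 2.0, "y": 0.0, "yaw": 0.0}
sweeps = [
    {"pose": {"x": 0.0, "y": 0.0, "yaw": 0.0}, "points": [(5.0, 0.0)]},  # earlier sweep
    {"pose": ref_pose, "points": [(3.0, 1.0)]},                          # current sweep
]
merged = aggregate_sweeps(sweeps, ref_pose)
print(merged)
```

Static objects line up after this correction, while moving objects leave a trail because their position changes between sweeps.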

In [None]:
level5data.render_sample_data(my_sample['data']['LIDAR_TOP'], nsweeps=5)

We can even render a specific annotation.

In [None]:
level5data.render_annotation(my_sample['anns'][22])

Finally, we can render a full scene as a video. There are two options here:
1. `level5data.render_scene_channel()` renders the video for a particular channel (press ESC to exit).
2. `level5data.render_scene()` renders the video for all surround view camera channels.

**NOTE: These methods use OpenCV for rendering, which doesn't always play nicely with IPython notebooks. If you experience any issues, please run these lines from the command line.**

In [None]:
#my_scene_token = level5data.scene[0]["token"]
#level5data.render_scene_channel(my_scene_token, 'CAM_FRONT')

There is also the method `level5data.render_scene()`, which renders the video for all camera channels.

In [None]:
#level5data.render_scene(my_scene_token)

Finally, let us visualize all scenes on the map for a particular location.

In [None]:
#level5data.render_egoposes_on_map(log_location='Palo Alto')

# Continue

I'll keep adding stuff here, mainly EDA and Visualization. If you want to learn more about this, please check the post
[Some important links to get started](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/108613#latest-625428)


- https://medium.com/@SmartLabAI/3d-object-detection-from-lidar-data-with-deep-learning-95f6d400399a
- https://github.com/timzhang642/3D-Machine-Learning
- https://towardsdatascience.com/the-state-of-3d-object-detection-f65a385f67a8


Also you can read about resources and experiences here [[new quota] Competition Expectations + experiences](https://www.kaggle.com/c/3d-object-detection-for-autonomous-vehicles/discussion/108609#latest-625543)