Lyft Dataset SDK

Welcome to the devkit for the Lyft Level 5 AV dataset! This devkit helps you visualise and explore the dataset.

Release Notes

This devkit is based on a version of the nuScenes devkit.

Getting Started


You can use pip to install lyft-dataset-sdk:

pip install -U lyft_dataset_sdk
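
To confirm that the install succeeded, you can query the installed distribution from Python. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_version` is hypothetical and not part of the SDK, and the sketch assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "lyft_dataset_sdk") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing.

    `dist_name` defaults to the name used in the pip command above; that the
    metadata is registered under exactly this name is an assumption.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "lyft_dataset_sdk is not installed")
```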

To get the latest version of the code before it is released on PyPI, install the library from GitHub:

pip install -U git+

Dataset Download

Go to to download the Lyft Level 5 AV Dataset.

The dataset is also available as part of the Lyft 3D Object Detection for Autonomous Vehicles Challenge.

Utils for converting LEVEL5 data into Kitti format

To convert the data, run:
python -m lyft_dataset_sdk.utils.export_kitti nuscenes_gt_to_kitti --lyft_dataroot ${DS_PTH} --table_folder ${TBL_PTH}
See the help ( python -m lyft_dataset_sdk.utils.export_kitti nuscenes_gt_to_kitti --help ) for more information.
After converting, you can render the results with:
python -m lyft_dataset_sdk.utils.export_kitti render_kitti
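
If you prefer to drive the conversion from a Python script, the same command line can be assembled as an argument list and passed to `subprocess`. The sketch below only builds and prints the command. The helper name `build_export_command` and the example paths are hypothetical, and the flags are the ones shown above.

```python
import shlex
import sys
from typing import List


def build_export_command(lyft_dataroot: str, table_folder: str) -> List[str]:
    """Build the argument list for the KITTI export command shown above."""
    return [
        sys.executable, "-m", "lyft_dataset_sdk.utils.export_kitti",
        "nuscenes_gt_to_kitti",
        "--lyft_dataroot", lyft_dataroot,
        "--table_folder", table_folder,
    ]


if __name__ == "__main__":
    # Placeholder paths; replace them with your own dataset locations.
    cmd = build_export_command("/path/to/dataset", "/path/to/dataset/tables")
    print(shlex.join(cmd))
    # To actually run the conversion (requires the SDK and the data):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Passing a list rather than a single shell string avoids quoting problems when the paths contain spaces.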

Tutorial and Reference Model

Check out the tutorial and reference model README.

Dataset structure

The dataset consists of JSON files:

  1. scene.json - A 25-45 second snippet of a car's journey.
  2. sample.json - An annotated snapshot of a scene at a particular timestamp.
  3. sample_data.json - Data collected from a particular sensor.
  4. sample_annotation.json - An annotated instance of an object of interest.
  5. instance.json - Enumeration of all object instances we observed.
  6. category.json - Taxonomy of object categories (e.g. vehicle, human).
  7. attribute.json - Property of an instance that can change while the category remains the same.
  8. visibility.json - (currently not used)
  9. sensor.json - A specific sensor type.
  10. calibrated_sensor.json - Definition of a particular sensor as calibrated on a particular vehicle.
  11. ego_pose.json - Ego vehicle pose at a particular timestamp.
  12. log.json - Log information from which the data was extracted.
  13. map.json - Map data that is stored as binary semantic masks from a top-down view.

With the schema.
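
The tables listed above can also be read without the SDK, which is sometimes handy for quick inspection. The following is a minimal sketch, not the SDK's own loader. It assumes, as in the nuScenes schema this devkit is based on, that each file holds a JSON list of records with a unique `token` field, that a scene record has a `first_sample_token` field, and that a sample record has a `next` field that is empty for the last sample. The helper names `load_table` and `iter_scene_samples` are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Iterator

Table = Dict[str, dict]


def load_table(table_folder: Path, name: str) -> Table:
    """Load <name>.json and index its records by token.

    Assumes the file is a JSON list of dicts, each with a "token" key.
    """
    with open(table_folder / f"{name}.json", encoding="utf-8") as f:
        records = json.load(f)
    return {record["token"]: record for record in records}


def iter_scene_samples(scene: dict, samples: Table) -> Iterator[dict]:
    """Yield the samples of a scene in order.

    Assumes scene["first_sample_token"] points at the first sample and
    sample["next"] is an empty string after the last one.
    """
    token = scene["first_sample_token"]
    while token:
        sample = samples[token]
        yield sample
        token = sample["next"]
```

For example, `load_table(Path(table_folder), "sample")` returns a dictionary keyed by sample token, which can then be passed to `iter_scene_samples` together with a record from the scene table.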

Data Exploration Tutorial

To get started with the Lyft Dataset SDK, run the tutorial using Jupyter Notebook.


We would be happy to accept issue reports and pull requests from the community.

To create a pull request, follow our contributing guide.