# LeRobot Dataset Loading Tutorial

This notebook demonstrates how to use the `LeRobotDataset` class for handling and processing robotic datasets from Hugging Face. It covers:

- Viewing dataset metadata and exploring properties
- Loading a full dataset or a subset of its episodes from the hub
- Accessing frames by episode number
- Using advanced features like timestamp-based frame selection
- Batching frames with a PyTorch DataLoader

## Import Required Libraries

First, let's import all the necessary libraries for working with LeRobot datasets.

In [1]:
from pprint import pprint

import torch
from huggingface_hub import HfApi

import lerobot
from lerobot.common.datasets.lerobot_dataset import (
    LeRobotDataset,
    LeRobotDatasetMetadata,
)

## Discover Available Datasets

Let's explore what datasets are available, including both built-in datasets and community hub datasets. We'll specifically look for Kuka robot datasets.

In [3]:
# Get all available datasets (built-in + community hub)
hub_api = HfApi()
all_datasets = lerobot.available_datasets + [
    info.id
    for info in hub_api.list_datasets(task_categories="robotics", tags=["LeRobot"])
]
kuka_datasets = [repo_id for repo_id in all_datasets if "kuka" in repo_id.lower()]
pprint(f"Total datasets: {len(all_datasets)}, Kuka datasets: {kuka_datasets}")

# Or simply explore them in your web browser directly at:
# https://huggingface.co/datasets?other=LeRobot

('Total datasets: 9823, Kuka datasets: '
 "['lerobot/stanford_kuka_multimodal_dataset', "
 "'lerobot/stanford_kuka_multimodal_dataset', "
 "'aliberts/stanford_kuka_multimodal_dataset', 'IPEC-COMMUNITY/kuka_lerobot', "
 "'zjc020603/kuka_single_arm', 'chuanmew/kuka_lerobot']")

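The output above lists `lerobot/stanford_kuka_multimodal_dataset` twice, presumably because the same repo id appears both in `lerobot.available_datasets` and in the hub query. A minimal sketch of removing such duplicates while keeping the original order, using the printed list as plain data (the variable names here are illustrative):

```python
# Repo ids copied from the output above; the first entry appears twice.
kuka_datasets = [
    "lerobot/stanford_kuka_multimodal_dataset",
    "lerobot/stanford_kuka_multimodal_dataset",
    "aliberts/stanford_kuka_multimodal_dataset",
    "IPEC-COMMUNITY/kuka_lerobot",
    "zjc020603/kuka_single_arm",
    "chuanmew/kuka_lerobot",
]

# dict preserves insertion order, so this drops repeats without reordering.
unique_kuka_datasets = list(dict.fromkeys(kuka_datasets))

for repo in unique_kuka_datasets:
    print(repo)
```

The same `dict.fromkeys` call could be applied to `all_datasets` before filtering, which would also make the total count reflect unique repositories.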

## Explore Dataset Metadata

Before downloading the actual data, we can examine the dataset's metadata to understand its structure and properties.

In [4]:
# Let's take this one for this example
repo_id = "lerobot/stanford_kuka_multimodal_dataset"
# We can have a look and fetch its metadata to know more about it:
ds_meta = LeRobotDatasetMetadata(repo_id)

# By instantiating just this class, you can quickly access useful information about the content and the
# structure of the dataset without downloading the actual data yet (only the metadata files, which are
# lightweight).
print(f"Total number of episodes: {ds_meta.total_episodes}")
print(
    f"Average number of frames per episode: {ds_meta.total_frames / ds_meta.total_episodes:.3f}"
)
print(f"Frames per second used during data collection: {ds_meta.fps}")
print(f"Robot type: {ds_meta.robot_type}")
print(f"keys to access images from cameras: {ds_meta.camera_keys=}\n")

Total number of episodes: 3000
Average number of frames per episode: 49.995
Frames per second used during data collection: 20
Robot type: unknown
keys to access images from cameras: ds_meta.camera_keys=['observation.images.image']
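The figures printed above follow from simple arithmetic on the metadata fields. Here is a minimal sketch using plain numbers rather than a live `LeRobotDatasetMetadata` object: the episode count and fps come from the output above, and the total frame count (149985) is the value this notebook reports when the full dataset is loaded. The variable names are illustrative only.

```python
# Values taken from the notebook outputs, not fetched from the hub.
total_episodes = 3000
total_frames = 149_985
fps = 20

# Average episode length in frames, as printed by the metadata cell.
avg_frames = total_frames / total_episodes

# Dividing a frame count by fps converts it to seconds.
avg_seconds = avg_frames / fps
total_minutes = total_frames / fps / 60

print(f"Average frames per episode: {avg_frames:.3f}")
print(f"Average episode duration: {avg_seconds:.3f} s")
print(f"Total recorded time: {total_minutes:.1f} min")
```

An average just under 50 frames suggests that most episodes last about 2.5 seconds at 20 fps, with a few slightly shorter ones.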



In [5]:
print("Tasks:")
print(ds_meta.tasks)
print("Features:")
pprint(ds_meta.features)

# You can also get a short summary by simply printing the object:
print(ds_meta)

Tasks:
{0: 'insert the peg into the hole'}
Features:
{'action': {'dtype': 'float32',
            'names': {'motors': ['motor_0',
                                 'motor_1',
                                 'motor_2',
                                 'motor_3',
                                 'motor_4',
                                 'motor_5',
                                 'motor_6']},
            'shape': (7,)},
 'episode_index': {'dtype': 'int64', 'names': None, 'shape': (1,)},
 'frame_index': {'dtype': 'int64', 'names': None, 'shape': (1,)},
 'index': {'dtype': 'int64', 'names': None, 'shape': (1,)},
 'next.done': {'dtype': 'bool', 'names': None, 'shape': (1,)},
 'next.reward': {'dtype': 'float32', 'names': None, 'shape': (1,)},
 'observation.images.image': {'dtype': 'video',
                              'names': ['height', 'width', 'channel'],
                              'shape': (128, 128, 3),
                              'video_info': {'has_audio': False,
              

## Load Dataset Subsets

You can load specific episodes from the dataset instead of downloading the entire dataset.

In [6]:
# You can then load the actual dataset from the hub.
# Either load any subset of episodes:
dataset = LeRobotDataset(repo_id, episodes=[0, 10, 11, 23])

# And see how many frames you have:
print(f"Selected episodes: {dataset.episodes}")
print(f"Number of episodes selected: {dataset.num_episodes}")
print(f"Number of frames selected: {dataset.num_frames}")

Selected episodes: [0, 10, 11, 23]
Number of episodes selected: 4
Number of frames selected: 200


## Load Complete Dataset

Alternatively, you can load the entire dataset at once.

In [7]:
# Or simply load the entire dataset:
dataset = LeRobotDataset(repo_id)
print(f"Number of episodes selected: {dataset.num_episodes}")
print(f"Number of frames selected: {dataset.num_frames}")

# The previous metadata class is contained in the 'meta' attribute of the dataset:
print(dataset.meta)

# LeRobotDataset actually wraps an underlying Hugging Face dataset
# (see https://huggingface.co/docs/datasets for more information).
print(dataset.hf_dataset)

Number of episodes selected: 3000
Number of frames selected: 149985
LeRobotDatasetMetadata({
    Repository ID: 'lerobot/stanford_kuka_multimodal_dataset',
    Total episodes: '3000',
    Total frames: '149985',
    Features: '['observation.images.image', 'observation.state', 'action', 'timestamp', 'episode_index', 'frame_index', 'next.reward', 'next.done', 'index', 'task_index']',
})',

Dataset({
    features: ['observation.state', 'action', 'timestamp', 'episode_index', 'frame_index', 'next.reward', 'next.done', 'index', 'task_index'],
    num_rows: 149985
})


## Access Individual Frames and Episodes

LeRobot datasets are PyTorch-compatible, so you can iterate through them and access individual frames. Let's examine frames from the first episode.

In [8]:
# LeRobotDataset also subclasses the PyTorch Dataset class, so everything you know from working with PyTorch
# datasets applies, such as indexing and iterating.
# __getitem__ returns a single frame by its global index. Since our datasets are also structured by
# episodes, you can look up the frame index range of any episode using episode_data_index. Here, we access
# the frame indices associated with the first episode:
episode_index = 0
from_idx = dataset.episode_data_index["from"][episode_index].item()
to_idx = dataset.episode_data_index["to"][episode_index].item()

# Then we grab all the image frames from the first camera:
camera_key = dataset.meta.camera_keys[0]
frames = [dataset[idx][camera_key] for idx in range(from_idx, to_idx)]

# The objects returned by the dataset are all torch.Tensors
print(type(frames[0]))
print(frames[0].shape)

# Since we're using PyTorch, the shape follows its channel-first convention (c, h, w).
# We can compare this shape with the information available for that feature
pprint(dataset.features[camera_key])
# In particular:
print(dataset.features[camera_key]["shape"])
# This shape is stored as (h, w, c), the channel-last layout that most image libraries use.

<class 'torch.Tensor'>
torch.Size([3, 128, 128])
{'dtype': 'video',
 'names': ['height', 'width', 'channel'],
 'shape': (128, 128, 3),
 'video_info': {'has_audio': False,
                'video.codec': 'av1',
                'video.fps': 20.0,
                'video.is_depth_map': False,
                'video.pix_fmt': 'yuv420p'}}
(128, 128, 3)
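To make the `from`/`to` lookup concrete, here is a minimal sketch of how such an index could be derived from per-episode lengths. It is plain Python with made-up episode lengths; `build_episode_index` is a hypothetical helper, not part of LeRobot, and the real `episode_data_index` holds tensors rather than lists.

```python
from itertools import accumulate


def build_episode_index(episode_lengths):
    """Map each episode to its [from, to) range of global frame indices."""
    ends = list(accumulate(episode_lengths))  # exclusive end of each episode
    starts = [0] + ends[:-1]                  # each episode starts where the previous one ended
    return {"from": starts, "to": ends}


# Three hypothetical episodes of 50, 50 and 49 frames.
index = build_episode_index([50, 50, 49])

episode = 1
frame_ids = range(index["from"][episode], index["to"][episode])
print(f"Episode {episode} spans frames {frame_ids.start}..{frame_ids.stop - 1}")
```

Treating `to` as exclusive matches the `range(from_idx, to_idx)` call in the cell above, so the number of frames in an episode is simply `to - from`.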


## Advanced: Loading Temporal Data

For many machine learning applications, you need to load the history of past observations or trajectories of future actions. LeRobot datasets support loading previous and future frames using timestamp differences.

In [None]:
# For many machine learning applications we need to load the history of past observations or trajectories of
# future actions. Our datasets can load previous and future frames for each key/modality, using timestamp
# differences relative to the currently loaded frame. For instance:
delta_timestamps = {
    # loads 4 images: 1 second before current frame, 500 ms before, 200 ms before, and current frame
    camera_key: [-1, -0.5, -0.20, 0],
    # loads 6 state vectors: 1.5 seconds before, 1 second before, ... 200 ms, 100 ms, and current frame
    "observation.state": [-1.5, -1, -0.5, -0.20, -0.10, 0],
    # loads 64 action vectors: current frame, 1 frame in the future, 2 frames, ... 63 frames in the future
    "action": [t / dataset.fps for t in range(64)],
}
# Note that in any case, these delta_timestamps values need to be multiples of (1/fps) so that, when added to
# any frame's timestamp, they still yield a valid timestamp.

dataset = LeRobotDataset(repo_id, delta_timestamps=delta_timestamps)
print(f"\n{dataset[0][camera_key].shape=}")  # (4, c, h, w)
print(f"{dataset[0]['observation.state'].shape=}")  # (6, state_dim)
print(f"{dataset[0]['action'].shape=}\n")  # (64, action_dim)
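The multiple-of-(1/fps) requirement is easier to see once each delta is converted into a whole number of frames. The sketch below is plain Python and does not call LeRobot: `deltas_to_frame_offsets` and `query_indices` are hypothetical helpers, and the clamp-and-flag handling of offsets that fall outside the episode is an assumption about how boundaries could be treated, not a statement of the library's exact behavior.

```python
def deltas_to_frame_offsets(deltas, fps, tol=1e-4):
    """Convert time deltas in seconds to integer frame offsets, rejecting non-multiples of 1/fps."""
    offsets = []
    for delta in deltas:
        frames = delta * fps
        nearest = round(frames)
        if abs(frames - nearest) > tol:
            raise ValueError(f"{delta} s is not a multiple of 1/{fps} s")
        offsets.append(nearest)
    return offsets


def query_indices(current_idx, offsets, ep_start, ep_end):
    """Clamp requested indices to the episode range [ep_start, ep_end) and flag the clamped ones."""
    raw = [current_idx + offset for offset in offsets]
    clamped = [min(max(i, ep_start), ep_end - 1) for i in raw]
    is_pad = [i < ep_start or i >= ep_end for i in raw]
    return clamped, is_pad


fps = 20
camera_offsets = deltas_to_frame_offsets([-1, -0.5, -0.20, 0], fps)
print(camera_offsets)

# Frame 5 of a hypothetical 50-frame episode: two requests fall before the episode start.
indices, is_pad = query_indices(5, camera_offsets, ep_start=0, ep_end=50)
print(indices, is_pad)
```

At 20 fps one frame lasts 0.05 s, so a delta such as 0.03 would be rejected by the check above.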

## PyTorch DataLoader Integration

LeRobot datasets are fully compatible with PyTorch DataLoaders and samplers, making them easy to integrate into your machine learning training pipelines.

In [None]:
# Finally, our datasets are fully compatible with PyTorch dataloaders and samplers because they are just
# PyTorch datasets.
dataloader = torch.utils.data.DataLoader(
    dataset,
    num_workers=0,
    batch_size=32,
    shuffle=True,
)

for batch in dataloader:
    print(f"{batch[camera_key].shape=}")  # (32, 4, c, h, w)
    print(f"{batch['observation.state'].shape=}")  # (32, 6, state_dim)
    print(f"{batch['action'].shape=}")  # (32, 64, action_dim)
    break
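For a sense of scale, the number of batches in one pass over the data follows from the dataset size and the batch size. Here is a minimal sketch in plain Python, assuming the 149985 frames reported when the full dataset was loaded and the DataLoader default of `drop_last=False`; `batch_sizes` is a hypothetical helper, not a PyTorch function.

```python
def batch_sizes(num_frames, batch_size, drop_last=False):
    """Return the size of every batch produced in one pass over the dataset."""
    full_batches, remainder = divmod(num_frames, batch_size)
    sizes = [batch_size] * full_batches
    if remainder and not drop_last:
        sizes.append(remainder)  # final, smaller batch
    return sizes


sizes = batch_sizes(149_985, 32)
print(f"Batches per epoch: {len(sizes)}, last batch size: {sizes[-1]}")
```

Because the last batch can be smaller than `batch_size`, code that assumes a fixed leading dimension of 32 may need `drop_last=True` in the DataLoader.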

## Summary

This notebook demonstrated the key features of LeRobot datasets:

1. **Dataset Discovery**: Finding available datasets on Hugging Face Hub
2. **Metadata Exploration**: Understanding dataset structure without downloading data
3. **Flexible Loading**: Loading complete datasets or specific episodes
4. **Frame Access**: Accessing individual frames and episodes
5. **Temporal Data**: Loading sequences of past/future observations and actions
6. **PyTorch Integration**: Using datasets with PyTorch DataLoaders for training

LeRobot datasets provide a powerful and flexible way to work with robotic datasets, whether you're doing research, training models, or exploring robotic data.