# Guide to Loading Datasets for Inference


## 1. LeRobot Format

* This tutorial shows how to load data in LeRobot format with our dataloader.
* The examples use LIBERO datasets such as `libero_spatial_no_noops_1.0.0_lerobot`, which are already converted to LeRobot format.
* To convert your own dataset, refer to [Gr00t's LeRobot.md](LeRobot_compatible_data_schema.md).

In [None]:
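# Download the dataset videos from the Hugging Face Hub.
# The repo ID is a positional argument and file filters are passed with --include.
# NOTE: the repo ID is kept as given; prefix it with its namespace if the Hub requires one.
!huggingface-cli download libero_10_no_noops_1.0.0_lerobot --repo-type dataset --include "*.mp4"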

- Metadata

In [3]:
from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata

from eo.data.lerobot_dataset import LeRobotDataset

meta = LeRobotDatasetMetadata(
    repo_id="libero_spatial_no_noops_1.0.0_lerobot",
    root="../demo_data/libero_spatial_no_noops_1.0.0_lerobot",
)

The dataset you requested (libero_spatial_no_noops_1.0.0_lerobot) is in 2.0 format.
While current version of LeRobot is backward-compatible with it, the version of your dataset still uses global
stats instead of per-episode stats. Update your dataset stats to the new format using this command:
```
python -m lerobot.datasets.v21.convert_dataset_v20_to_v21 --repo-id=libero_spatial_no_noops_1.0.0_lerobot
```

If you encounter a problem, contact LeRobot maintainers on [Discord](https://discord.com/invite/s3KuuzsPFb)
or open an [issue on GitHub](https://github.com/huggingface/lerobot/issues/new/choose).



In [5]:
!which ffmpeg

/mnt/shared-storage-user/eorobotics-shared/miniconda3/envs/eo/bin/ffmpeg


In [4]:
dataset = LeRobotDataset(
    repo_id="libero_spatial_no_noops_1.0.0_lerobot",
    root="../demo_data/libero_spatial_no_noops_1.0.0_lerobot",
    episodes=[0],
    delta_timestamps={k: [i / meta.fps for i in range(0, 32)] for k in ["action"]}
)
dataset[0]

Generating train split: 110 examples [00:00, 14961.20 examples/s]


* set delta action mode for libero_spatial_no_noops_1.0.0_lerobot ...


{'observation.images.image': tensor([[[0.4235, 0.4235, 0.4235,  ..., 0.4157, 0.4157, 0.4157],
          [0.4235, 0.4235, 0.4235,  ..., 0.4157, 0.4157, 0.4157],
          [0.4235, 0.4235, 0.4235,  ..., 0.4157, 0.4157, 0.4157],
          ...,
          [0.7882, 0.7647, 0.7647,  ..., 0.6980, 0.6980, 0.6980],
          [0.7882, 0.7647, 0.7608,  ..., 0.6980, 0.6980, 0.6980],
          [0.7843, 0.7608, 0.7608,  ..., 0.6980, 0.6980, 0.6980]],
 
         [[0.4157, 0.4157, 0.4157,  ..., 0.4078, 0.4078, 0.4078],
          [0.4157, 0.4157, 0.4157,  ..., 0.4078, 0.4078, 0.4078],
          [0.4157, 0.4157, 0.4157,  ..., 0.4078, 0.4078, 0.4078],
          ...,
          [0.7137, 0.6902, 0.6902,  ..., 0.6392, 0.6353, 0.6353],
          [0.7137, 0.6902, 0.6863,  ..., 0.6392, 0.6353, 0.6353],
          [0.7098, 0.6863, 0.6863,  ..., 0.6392, 0.6353, 0.6353]],
 
         [[0.4000, 0.4000, 0.4000,  ..., 0.3922, 0.3922, 0.3922],
          [0.4000, 0.4000, 0.4000,  ..., 0.3922, 0.3922, 0.3922],
          ... (output truncated)
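
The `delta_timestamps` argument above requests 32 action steps starting at the current frame. Near the end of an episode, fewer than 32 real steps remain. The sketch below illustrates one common way to handle this: convert the time offsets to frame indices, clamp them to the episode boundaries, and record which entries were clamped. This is an assumption modelled on how upstream LeRobot handles boundaries, not a description of what `eo.data.lerobot_dataset` does internally; the helper name `query_indices`, the `fps` value, and the episode length are all hypothetical.

```python
def query_indices(current_idx, delta_timestamps, fps, ep_start, ep_end):
    """Map time offsets (seconds) to frame indices, clamped to one episode.

    Returns (indices, is_pad). is_pad[i] is True when the requested frame
    fell outside [ep_start, ep_end) and was clamped to the boundary.
    """
    indices, is_pad = [], []
    for dt in delta_timestamps:
        target = current_idx + round(dt * fps)  # offset in whole frames
        clamped = max(ep_start, min(ep_end - 1, target))
        indices.append(clamped)
        is_pad.append(clamped != target)
    return indices, is_pad


fps = 10  # hypothetical; the notebook reads the real value from meta.fps
offsets = [i / fps for i in range(32)]  # same expression as in the cell above

# Frame 90 of a hypothetical 100-frame episode: only 10 real steps remain.
indices, is_pad = query_indices(90, offsets, fps, ep_start=0, ep_end=100)
print(indices[:12])
print(sum(is_pad), "of", len(is_pad), "steps are padding")
```

A padding mask of this kind lets a training or evaluation loop ignore the repeated final frame instead of treating it as a real action.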

## 2. Specific Robot Keys

In [None]:
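from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata

from eo.data.lerobot_dataset import LeRobotDataset

repo_id = "libero_10_no_noops_1.0.0_lerobot"
root = "/nvme/eorobotics-oss/DATA/libero_10_no_noops_1.0.0_lerobot"

# Load the metadata of this dataset so that `meta.fps` matches the data being read
meta = LeRobotDatasetMetadata(repo_id=repo_id, root=root)

dataset = LeRobotDataset(
    repo_id=repo_id,
    root=root,
    episodes=[0],
    # Restrict the sample to the listed video, state, and action features
    select_video_keys=["observation.images.image"],
    select_state_keys=["observation.state"],
    select_action_keys=["action"],
    # Request a 32-step action chunk starting at the current frame
    delta_timestamps={"action": [i / meta.fps for i in range(32)]},
)

dataset[0].keys()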
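
The output of `dataset[0].keys()` is not shown here, so a quick check that the selected keys are actually present can be useful. The following is a minimal sketch of such a check; `missing_keys` is a hypothetical helper, and `fake_sample` is a stand-in dictionary rather than a real sample from the dataset.

```python
def missing_keys(sample, expected_keys):
    """Return the expected keys that are absent from a sample dict, in order."""
    return [key for key in expected_keys if key not in sample]


# The three keys selected in the cell above.
expected = ["observation.images.image", "observation.state", "action"]

# Stand-in for dataset[0]; a real sample would hold tensors instead of None.
fake_sample = {"observation.images.image": None, "action": None}

absent = missing_keys(fake_sample, expected)
print(absent)
```

With a real dataset, `missing_keys(dataset[0], expected)` returning an empty list would indicate that every selected key made it into the sample.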

## 3. Multi-robot Dataset

In [None]:
from eo.data.lerobot_dataset import MultiLeRobotDataset
from eo.data.schema import LerobotConfig

# Large-scale training uses YAML data configs; here the configs are built inline
multi_dataset = MultiLeRobotDataset(
    data_configs=[
        LerobotConfig(
            repo_id="libero_10_no_noops_1.0.0_lerobot",
            root="/nvme/eorobotics-oss/DATA",
            episodes=[0],
        ),
        LerobotConfig(
            repo_id="libero_spatial_no_noops_1.0.0_lerobot",
            root="/nvme/eorobotics-oss/DATA",
            episodes=[0],
        ),
    ],
    # NOTE: `delta_timestamps` is constructed automatically from the metadata and `chunk_size`
    chunk_size=16,
)
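
The note on `chunk_size` says that `delta_timestamps` is derived from each dataset's metadata. A plausible reading, following the explicit expression used in the earlier sections, is sketched below. The helper `build_delta_timestamps` and both frame rates are hypothetical; the actual construction inside `MultiLeRobotDataset` may differ.

```python
def build_delta_timestamps(fps, chunk_size, keys=("action",)):
    """Return time offsets (seconds) for `chunk_size` consecutive frames per key."""
    return {key: [i / fps for i in range(chunk_size)] for key in keys}


# Two datasets recorded at different (hypothetical) frame rates.
dt_10hz = build_delta_timestamps(fps=10, chunk_size=16)
dt_20hz = build_delta_timestamps(fps=20, chunk_size=16)

# Both chunks contain 16 steps, but they span different durations.
print(len(dt_10hz["action"]), dt_10hz["action"][-1])
print(len(dt_20hz["action"]), dt_20hz["action"][-1])
```

Deriving the offsets per dataset keeps the number of action steps fixed across robots even when their recording frequencies differ.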

## 4. Multimodal Datasets

In [None]:
from eo.data.multim_dataset import MultimodaDataset
from eo.data.schema import MMDatasetConfig

multim_dataset = MultimodaDataset(
    data_configs=[
        MMDatasetConfig(
            json_path="demo_data/refcoco/refcoco.jsonl",
            vision_base_path="demo_data/refcoco",
        ),
    ]
)

len(multim_dataset)
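
Because the annotations are read from a `.jsonl` file (one JSON object per line), it can help to validate the file before passing its path to `MMDatasetConfig`. The sketch below is a generic JSON Lines reader written with the standard library only; `read_jsonl` is a hypothetical helper, and the `id` field in the demo records is a placeholder, since the record schema expected by the dataset is not described here.

```python
import io
import json


def read_jsonl(stream):
    """Parse a JSON Lines stream into a list of objects, skipping blank lines."""
    records = []
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Report the offending line so a broken file is easy to fix.
            raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
    return records


# In-memory stand-in for a .jsonl file; the fields are placeholders.
demo = io.StringIO('{"id": 1}\n\n{"id": 2}\n')
records = read_jsonl(demo)
print(len(records))
```

For a file on disk, the same function can be called as `read_jsonl(open(path, encoding="utf-8"))`.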

In [None]:
from eo.data.lerobot_dataset import MultiLeRobotDataset
from eo.data.multim_dataset import MultimodaDataset
from eo.data.schema import LerobotConfig, MMDatasetConfig

lerobot_dataset = MultiLeRobotDataset(
    data_configs=[
        LerobotConfig(
            repo_id="libero_spatial_no_noops_1.0.0_lerobot",
            root="/mnt/shared-storage-user/eorobotics-shared/EO-1/getting_started/demo_data",
        )
    ]
)

multim_dataset = MultimodaDataset(
    data_configs=[
        MMDatasetConfig(
            json_path="demo_data/libero_spatial_mmu.jsonl",
        ),
    ],
    meta_dataset=lerobot_dataset,
)

multim_dataset[0]