Merged
173 changes: 172 additions & 1 deletion website/docs/tutorial/data-collection.mdx
@@ -3,4 +3,175 @@ title: Data collection
sidebar_position: 2
---

```mdx-code-block
import BlockImage from '@site/src/components/BlockImage';
```

# Data collection

## VR setup

### One-time setup for VR teleoperation

Create a [Meta Quest Developer account](https://developers.meta.com/horizon/) and install [Meta Quest Developer Hub](https://developers.meta.com/horizon/documentation/unity/ts-mqdh/).

Download the teleoperation APK from the releases of the companion VR package.
{/* add package link */}
Sideload the APK onto your Quest 3 via Developer Hub.

### Per-session setup

1. Put on the headset and launch the teleoperation app. After the app is launched, you should see the following screen in the headset.
<BlockImage src="tutorial/data-collection/in_vr_default.jpg" alt="Quest headset" width="90%" />

2. Press the left controller menu button — a settings panel will appear.
<BlockImage src="tutorial/data-collection/in_vr_setting.jpg" alt="VR menu" width="90%" />

3. Enter the IP address of your PC host and the port (default: `5006`).

4. Verify communication from the PC host:
`nc -lu 5006`

5. Tape over the center-of-eye sensor on the headset so that it stays activated. This prevents the display from sleeping while the headset hangs around your neck.
Before taping the sensor:
<BlockImage src="tutorial/data-collection/headset_uncover.png" alt="Headset sensor uncovered" width="90%" />

After taping the sensor:
<BlockImage src="tutorial/data-collection/headset_covered.png" alt="Headset sensor covered" width="90%" />

6. Take off the headset and hang it around your neck — you will operate with the controllers while the headset rests there.
<BlockImage src="tutorial/data-collection/how_to_wear_headset.png" alt="Headset around neck" width="90%" />
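
If `nc` is not available on the PC host, the check in step 4 can also be done with a short script. The following is a minimal sketch, assuming only what the steps above imply (the app sends UDP traffic to the host on port `5006`). The function names are illustrative and not part of any package, and the payload is reported by size only because its format is not described here.

```python
import socket
from typing import Optional, Tuple


def open_udp_listener(host: str = "0.0.0.0", port: int = 5006) -> socket.socket:
    """Bind a UDP socket on the port entered in the headset settings panel."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(
    sock: socket.socket, timeout_s: float = 10.0
) -> Optional[Tuple[bytes, Tuple[str, int]]]:
    """Wait for a single datagram; return (payload, sender) or None on timeout."""
    sock.settimeout(timeout_s)
    try:
        payload, sender = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return payload, sender


def check_headset_link(port: int = 5006, timeout_s: float = 10.0) -> bool:
    """Report whether any datagram arrived on `port` within the timeout."""
    with open_udp_listener(port=port) as sock:
        result = receive_one(sock, timeout_s)
    if result is None:
        print(f"No UDP packet on port {port} within {timeout_s} s")
        return False
    payload, sender = result
    print(f"Received {len(payload)} bytes from {sender[0]}:{sender[1]}")
    return True
```

Call `check_headset_link()` while the teleoperation app is running. If it reports no packet, re-check the IP address and port entered in the settings panel, and make sure that nothing else (including `nc`) is already bound to the port.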

## Data collection in MuJoCo environment

Even without the physical robot, you can go through the full data collection process in the MuJoCo environment.
This lets you learn the workflow and the data format before collecting data on the real robot.
It is also a good way to test your VR teleoperation setup and confirm that everything works before moving to the real robot.

### Setup for MuJoCo environment

```bash
cd path/to/openarm_vr_teleop
uv sync
source .venv/bin/activate
dora build config/dataflow-mujoco.yaml --uv
```

### Teleoperation in MuJoCo environment

```bash
dora run config/dataflow-mujoco.yaml --uv
```

If VR teleoperation is working properly, you should see the robot arm moving in the MuJoCo environment when you operate the VR controllers.

The data collection web UI also opens automatically, and you can check the data collection status at `http://localhost:8000`.

<BlockImage src="tutorial/data-collection/mujoco_datacollection.gif" alt="MuJoCo teleoperation" />

Left: MuJoCo environment. Right: data collection web UI.
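
If the browser tab does not open, a quick reachability check can tell you whether the web UI is being served at all. This is a minimal sketch using only the Python standard library. The function name is illustrative, and it only confirms that some HTTP server answers at the address; it does not inspect the page.

```python
import urllib.error
import urllib.request


def web_ui_reachable(url: str = "http://localhost:8000", timeout_s: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, whatever the status code."""
    # Bypass any system proxy settings: the UI is served on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_s):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False
```

A `False` result while `dora run` is still running suggests that the web UI has not started yet or is listening on a different port.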

### How to control data collection with VR controllers

Use the A and B buttons on the right VR controller to control data collection.

<BlockImage src="tutorial/data-collection/quest_controller_right.jpg" alt="Controller button" width="40%" />

After the program starts, the web UI opens.
Go to `http://localhost:8000` to check the data collection status and control the data collection.

<BlockImage src="tutorial/data-collection/data_collection_start_quit.png" alt="UI top" width="80%" />
When the web UI looks like the image above, press the A button on the right VR controller to start recording.

* **A** : start recording
* **B** : stop recording and quit the program

<BlockImage src="tutorial/data-collection/data_collection_success_fail.png" alt="UI recording" width="80%" />

While recording, the web UI looks like the image above.
If you complete the task successfully, press the A button again to stop recording and mark the episode as a success.
If you fail the task, press the B button to stop recording and mark the episode as a failure.

* **A** : mark as success
* **B** : mark as failure
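
The two button mappings above amount to a small state machine. The sketch below is illustrative only and is not the recorder's actual implementation. The class and state names are hypothetical, and it assumes that the program returns to the idle screen after an episode is marked, which the text above does not state explicitly.

```python
from enum import Enum
from typing import List


class State(Enum):
    IDLE = "idle"            # waiting: A starts recording, B quits
    RECORDING = "recording"  # recording: A marks success, B marks failure
    QUIT = "quit"            # program ended


class EpisodeController:
    """Hypothetical model of the A/B button behavior described above."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.results: List[str] = []  # one entry per finished episode

    def press(self, button: str) -> State:
        if button not in ("A", "B"):
            raise ValueError(f"unknown button: {button!r}")
        if self.state is State.IDLE:
            self.state = State.RECORDING if button == "A" else State.QUIT
        elif self.state is State.RECORDING:
            self.results.append("success" if button == "A" else "failure")
            # Assumption: the program goes back to idle after each episode.
            self.state = State.IDLE
        # Presses after QUIT are ignored.
        return self.state
```

Under that assumption, pressing A, A, A, B, B records one successful episode, records one failed episode, and then quits.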

### Check the collected data

The default data path is set in `config/dataflow-mujoco.yaml` as `DIRECTORY: "test_data/mujoco_collection"`.
To change the data path, edit the value of `DIRECTORY` in the config file.

```yaml
- id: recorder
  build: pip install -e node/dora-openarm-dataset-recorder
  path: dora-openarm-dataset-recorder
  env:
    METADATA_FILE: "../test_metadata.yaml"
    DIRECTORY: "test_data/mujoco_collection"
```

After finishing the data collection, you can check the collected data in the specified directory. The data will be stored in the OpenArmDataset format.
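
To confirm that a session actually wrote something, you can summarize the output directory. This is a minimal sketch that makes no assumption about the internal layout of the OpenArmDataset format: it only counts files by extension. The function name is illustrative.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def summarize_directory(root: str) -> Dict[str, int]:
    """Count files under `root` by extension (files without one are keyed as "")."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"{root} is not a directory")
    counts = Counter(p.suffix for p in root_path.rglob("*") if p.is_file())
    return dict(counts)
```

For the default configuration, call `summarize_directory("test_data/mujoco_collection")` from the directory where you ran `dora run`. An empty result means that no episode was saved.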

If you don't have the real robot, skip the next section and move to the dataset format conversion section to convert the collected data to the LeRobot dataset format for training the policy.

## Data collection on the real robot

First, make sure that the CAN interface is up and running, the cameras are set up properly, and VR teleoperation is working. Then run the commands below to start data collection.
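
The CAN part of this checklist can be scripted on Linux. The sketch below is illustrative, and the interface name `can0` is an assumption; use whatever name your setup assigns. It reads the interface flags from sysfs and tests the `IFF_UP` bit, which shows only that the interface has been brought up, not that the bus is error-free.

```python
from pathlib import Path

IFF_UP = 0x1  # Linux interface flag: interface is administratively up


def is_interface_up(name: str = "can0", sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the network interface `name` exists and has IFF_UP set."""
    flags_file = Path(sysfs_root) / name / "flags"
    if not flags_file.is_file():
        # The interface does not exist (driver not loaded, adapter unplugged).
        return False
    # sysfs exposes the flags as a hexadecimal string such as "0x40081".
    flags = int(flags_file.read_text().strip(), 16)
    return bool(flags & IFF_UP)
```

If `is_interface_up("can0")` returns `False`, bring the interface up before running the dataflow.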

### Start data collection

```bash
uv run dora build config/dataflow.yaml --uv
uv run dora run config/dataflow.yaml --uv
```

As with MuJoCo data collection, you can specify the output directory by modifying `config/dataflow.yaml`.

If the commands run successfully, the arm moves to its initial position. Before you can control it, align your VR controllers with the arm by holding the trigger button.

Once the alignment completes, you can fully control the arm with the VR controllers.

{/*
add photo of the real robot data collection.
*/}

## Dataset format conversion to LeRobot dataset format

After collecting the dataset, convert it to the LeRobot dataset format to train the policy.

```bash
git clone https://github.com/enactic/openarm_dataset.git
cd openarm_dataset
uv sync
uv run openarm-dataset-convert path/to/collected_dataset_path path/to/output_path --format lerobot_v2.1
```

LeRobot Dataset v2.1 format file structure:

```
output_path/
├── data/
│   └── chunk-000/
│       ├── episode_000000.parquet
│       ├── episode_000001.parquet
│       └── ...
├── meta/
│   ├── info.json
│   ├── episodes.jsonl
│   ├── episodes_stats.jsonl
│   ├── tasks.jsonl
│   └── stats.jsonl
└── videos/
    └── chunk-000/
        ├── observation.images.wrist_right/
        ├── observation.images.wrist_left/
        ├── observation.images.ceiling/
        ├── observation.images.head_left/
        └── observation.images.head_right/
            ├── episode_000000.mp4
            ├── episode_000001.mp4
            └── ...
```
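
After a conversion, a quick sanity check is to compare the output against the structure listed above. This is a minimal sketch and not part of `openarm_dataset`: the function names are illustrative, it checks only a subset of the entries from the tree, and it assumes that `meta/episodes.jsonl` holds one JSON object per episode.

```python
import json
from pathlib import Path
from typing import List

# Subset of the entries from the tree above, relative to the output path.
EXPECTED_ENTRIES = [
    "data/chunk-000",
    "videos/chunk-000",
    "meta/info.json",
    "meta/episodes.jsonl",
    "meta/tasks.jsonl",
]


def missing_entries(output_path: str) -> List[str]:
    """Return the expected entries that do not exist under `output_path`."""
    root = Path(output_path)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


def count_episodes(output_path: str) -> int:
    """Count the JSON lines in meta/episodes.jsonl (assumed one per episode)."""
    episodes_file = Path(output_path) / "meta" / "episodes.jsonl"
    count = 0
    with episodes_file.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                json.loads(line)  # raises ValueError if a line is not valid JSON
                count += 1
    return count
```

An empty list from `missing_entries("path/to/output_path")` means that the checked entries are present, and `count_episodes` should then match the number of episodes you recorded.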

The converter currently supports only the LeRobot Dataset v2.1 format; support for v3.0 is planned.
To convert a dataset to v3.0 in the meantime, see the article below:
[LeRobot Dataset v3.0 format conversion](https://huggingface.co/docs/lerobot/porting_datasets_v3#migrating-from-dataset-v21)