# Examples for Behavior Cloning Dataset Generation and Usage in Training

This notebook contains examples demonstrating how to use the dataset generation and training pipeline functions.

1. Start by running the first cell below.
2. Then proceed in order. Some cells can be skipped, depending on their content.

In [1]:
%load_ext autoreload
%autoreload 2

import os
import sys
from pathlib import Path
sys.path.append(str(Path(os.getcwd()).parent.absolute()))

dataset_dir_path = Path('D:/bc-train-data-test')

### Generating a Bimanual Behavior Cloning Dataset

Here, we generate a dataset of `(BimanualObs, BimanualAction)` samples for the "pass block" task using `policy.PrivilegedPolicy`.
This takes a while, so I've implemented a mechanism for resuming data generation between sessions (see the `resume=True` argument below).
The resulting output files can be read back with the `BimanualDataset` and `HumanReadableBimanualDataset` classes, as shown later.

In [7]:
from train.dataset import generate_bimanual_dataset

generate_bimanual_dataset(
  save_dir=dataset_dir_path,
  total_sample_count=10000,
  max_steps_per_rollout=600,
  skip_frames=2,
  camera_dims=(128, 128),
  resume=True
)

Bimanual dataset save directory is set to `D:\bc-train-data-test`.
Resuming from sample 9893/10000.


Attempting rollout 70.:  65%|██████▌   | 390/600 [00:38<00:20, 10.17it/s]


 - Rollout succeeded. Saved 107 samples at 2025-10-25 14:45:06.829538. (10000/10000)
Finished generating 10000 samples in D:\bc-train-data-test.
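The resume behavior seen in the output above can be pictured with a small standalone sketch. This is not the implementation of `generate_bimanual_dataset`; the function name `generate_resumable`, the `metadata.json` file, and the truncation of the final rollout to hit the target count exactly are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def generate_resumable(save_dir: Path, total: int, rollout_fn, resume: bool = True) -> list:
    """Hypothetical resumable generation loop; returns the recorded rollout lengths."""
    meta_path = save_dir / "metadata.json"  # assumed file name, not the project's real layout
    if resume and meta_path.exists():
        # Pick up where the previous session stopped.
        rollout_lengths = json.loads(meta_path.read_text())["rollout_lengths"]
    else:
        rollout_lengths = []

    while sum(rollout_lengths) < total:
        samples = rollout_fn()
        if not samples:
            continue  # failed rollout: discard it and try again
        remaining = total - sum(rollout_lengths)
        # Assumption: the last rollout is truncated so the total is hit exactly.
        rollout_lengths.append(min(len(samples), remaining))
        # Persist progress after every rollout so an interrupted session can resume.
        meta_path.write_text(json.dumps({"rollout_lengths": rollout_lengths}))
    return rollout_lengths


with tempfile.TemporaryDirectory() as tmp:
    fake_rollout = lambda: [0] * 100  # stand-in for a successful 100-sample rollout
    first_session = generate_resumable(Path(tmp), total=250, rollout_fn=fake_rollout)
    second_session = generate_resumable(Path(tmp), total=400, rollout_fn=fake_rollout)
```

The second call starts from the 250 samples recorded by the first one instead of regenerating them, which mirrors the "Resuming from sample 9893/10000" message.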


### Viewing a Rollout from the Generated Dataset

In [None]:
from robot.visualize import save_frames_to_video
from train.dataset import HumanReadableBimanualDataset

dataset = HumanReadableBimanualDataset(dataset_dir_path)
rollout_number = 7
rollout_length = dataset.metadata.rollout_lengths[rollout_number]
rollout_start = sum(dataset.metadata.rollout_lengths[:rollout_number])
observations = [dataset[i][0] for i in range(rollout_start, rollout_start + rollout_length)]

os.makedirs('out', exist_ok=True)
left_wrist_video_path = f'out/left_wrist_rollout_{rollout_number}.mp4'
right_wrist_video_path = f'out/right_wrist_rollout_{rollout_number}.mp4'
save_frames_to_video([observation.visual[0].detach().numpy() for observation in observations], left_wrist_video_path)
save_frames_to_video([observation.visual[1].detach().numpy() for observation in observations], right_wrist_video_path)
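The cell above locates a rollout inside the flat dataset by summing the lengths of all earlier rollouts. A minimal sketch of that indexing, using made-up rollout lengths and a hypothetical helper `rollout_indices`, precomputes every start offset once with a running sum:

```python
from itertools import accumulate

rollout_lengths = [120, 95, 107]  # made-up lengths, for illustration only

# starts[k] is the flat dataset index of the first sample of rollout k.
starts = [0, *accumulate(rollout_lengths)][:-1]


def rollout_indices(rollout_number: int) -> range:
    """Flat dataset indices covered by the given rollout."""
    start = starts[rollout_number]
    return range(start, start + rollout_lengths[rollout_number])


print(starts)              # [0, 120, 215]
print(rollout_indices(1))  # range(120, 215)
```

Precomputing the offsets only matters when many rollouts are looked up; for a single rollout the `sum(...)` in the cell above is equivalent.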



In [6]:
from IPython.display import Video
Video(left_wrist_video_path, width=400, height=400)

In [7]:
from IPython.display import Video
Video(right_wrist_video_path, width=400, height=400)

### Training Pipeline

Here we demonstrate the suite of utilities in the `train` subpackage.

1. `train.train_utils.Logs`
  - Utility class for organizing training output files. We should use this going forward to remain organized and expand its capabilities as needed.
2. `train.dataset.HumanReadableBimanualDataset`
  - Dataset implementation for reading behavior cloning data.
3. `train.trainer.BCTrainer`
  - A class for training an arbitrary model for behavior cloning. We should more-or-less keep this class's core logic as-is.

Each of these utilities currently has some TODO comments in its code, so take a look. In summary:

1. `Logs` doesn't currently do much logging. The `Jobs` class in the same file still needs log piping implemented.
2. The dataset currently just uses the numpy-based `robot.sim` dataclasses for observations and actions. It should start using the torch-based versions defined in `dataset.py` and move the `.npy` data to the GPU.
3. `BCTrainer` isn't very configurable right now. The optimizer and other details should be constructor arguments that we can configure dynamically with Hydra or something similar.

In [3]:
import os
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from robot.sim import JOINT_OBSERVATION_SIZE
from train.dataset import HumanReadableBimanualDataset, TensorBimanualAction, TensorBimanualObs
from train.train_utils import Logs
from train.trainer import BCTrainer, BimanualActor

# example bimanual actor class
class ExampleModel(BimanualActor):
  def __init__(self, observation_size: int, action_size: int, hidden_size: int = 256):
    super().__init__()
    self.layers = nn.Sequential(
      nn.Linear(observation_size, hidden_size),
      nn.ReLU(),
      nn.Linear(hidden_size, hidden_size),
      nn.ReLU(),
      nn.Linear(hidden_size, action_size)
    )

  def forward(self, obs: TensorBimanualObs) -> TensorBimanualAction:
    # flatten each sample's camera frames and append the joint positions (qpos)
    x = torch.cat((obs.visual.reshape(obs.visual.shape[0], -1), obs.qpos.array), dim=-1)
    return TensorBimanualAction(self.layers(x))

# load dataset and set up trainer
BATCH_SIZE = 256
EPOCHS = 10
CHECKPOINT_FREQUENCY = 1
dataset = HumanReadableBimanualDataset(dataset_dir_path)
dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, collate_fn=HumanReadableBimanualDataset.collate_fn)
os.makedirs('out/training-output', exist_ok=True)
logs = Logs('out/training-output')
new_job = logs.create_new_job(tag='example')

# instantiate model
input_size = dataset.metadata.observation_size - JOINT_OBSERVATION_SIZE  # exclude qvel observation
output_size = dataset.metadata.action_size
model = ExampleModel(input_size, output_size)
if torch.cuda.is_available():
  print('Using CUDA.')
  model = model.cuda()
else:
  print('Using CPU.')

# train with behavior cloning objective
trainer = BCTrainer(dataloader, checkpoint_frequency=CHECKPOINT_FREQUENCY, job=new_job)
trainer.train(model, EPOCHS)

Using CUDA.
Training model for 10 epochs.


Epoch 0: 100%|██████████| 40/40 [02:07<00:00,  3.18s/it]


 - Epoch 0 loss: 504.3760


Epoch 1: 100%|██████████| 40/40 [00:01<00:00, 31.21it/s]


 - Epoch 1 loss: 15.9850


Epoch 2: 100%|██████████| 40/40 [00:01<00:00, 35.22it/s]


 - Epoch 2 loss: 11.6654


Epoch 3: 100%|██████████| 40/40 [00:01<00:00, 34.61it/s]


 - Epoch 3 loss: 10.7553


Epoch 4: 100%|██████████| 40/40 [00:01<00:00, 31.77it/s]


 - Epoch 4 loss: 10.0038


Epoch 5: 100%|██████████| 40/40 [00:01<00:00, 34.72it/s]


 - Epoch 5 loss: 9.4808


Epoch 6: 100%|██████████| 40/40 [00:01<00:00, 35.18it/s]


 - Epoch 6 loss: 9.2211


Epoch 7: 100%|██████████| 40/40 [00:01<00:00, 30.88it/s]


 - Epoch 7 loss: 9.4972


Epoch 8: 100%|██████████| 40/40 [00:01<00:00, 35.10it/s]


 - Epoch 8 loss: 8.9548


Epoch 9: 100%|██████████| 40/40 [00:01<00:00, 34.40it/s]


 - Epoch 9 loss: 9.2741
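The `40/40` in each progress bar follows from the dataset size and batch size: with 10000 samples, `BATCH_SIZE = 256`, and the `DataLoader` default of keeping the final partial batch (`drop_last=False`), each epoch has 40 batches. The helper name `batches_per_epoch` below is only illustrative.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches a DataLoader yields per epoch for a fixed-size dataset."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


num_batches = batches_per_epoch(10_000, 256)
last_batch_size = 10_000 - (num_batches - 1) * 256

print(num_batches)      # 40
print(last_batch_size)  # 16
```

The last batch of each epoch therefore holds only 16 samples, so any per-batch loss average weights those samples more heavily than the rest.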
