## Robomimic Get Started Tutorial

This notebook implements a simple training loop without the extensive features offered in robomimic such as logging and hyperparameter sweeping. Please refer to the [repository](https://github.com/ARISE-Initiative/robomimic) and the [documentation](https://robomimic.github.io/docs/introduction/overview.html) for the full set of features and the rest of the pipeline.

This notebook includes the following tutorials:

1. Set up the robomimic development environment
2. Download a task-specific dataset
3. Create a naive behavior cloning policy
4. Set up a simple training loop
5. Run policy training
6. Visualize the trained policy

### 0. Use GPU to accelerate training

To use a GPU runtime, click "Runtime" in the top navigation bar, choose "Change runtime type", and select GPU as the hardware accelerator.
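
To confirm that the runtime change took effect, a quick check along the following lines may help. This is a minimal sketch: `gpu_status` is a hypothetical helper name, and it only assumes that PyTorch, if installed, exposes `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def gpu_status():
    """Return a short description of the accelerator visible to this runtime."""
    try:
        import torch  # may not be installed yet at this point in the notebook
    except ImportError:
        return "torch is not installed, so CUDA availability cannot be checked"
    if torch.cuda.is_available():
        return "CUDA is available: " + torch.cuda.get_device_name(0)
    return "CUDA is not available; training will fall back to the CPU"


print(gpu_status())
```

If the message reports that CUDA is not available, the runtime type is probably still set to CPU.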

In [None]:
import os
# First, we need to decide where to host the runtime storage
USE_GDRIVE_STORAGE = True

if not USE_GDRIVE_STORAGE:
    # Option 1: use the colab runtime storage. All trained models and downloaded
    # data will disappear after you disconnect from the runtime.
    WS_DIR = "/content/"
    os.system("git clone https://github.com/ARISE-Initiative/robomimic")
    os.system("git clone https://github.com/ARISE-Initiative/robosuite.git")

else:
    # Option 2: use your google drive as the runtime storage. You need to grant
    # permission for the colab runtime to access your google drive. You also
    # need to decide on a workspace for robomimic
    from google.colab import drive
    drive.mount('/content/drive')
    WS_DIR = "/content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector" # this should be the absolute path, e.g., "/content/drive/MyDrive/my-ws/"
    assert os.path.exists(WS_DIR)

%cd $WS_DIR

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector


In [None]:
!pip install -e robomimic/

import sys
import os
sys.path.append('./robomimic/')

Obtaining file:///content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic
  Preparing metadata (setup.py) ... done
Installing collected packages: robomimic
  Attempting uninstall: robomimic
    Found existing installation: robomimic 0.3.0
    Uninstalling robomimic-0.3.0:
      Successfully uninstalled robomimic-0.3.0
  Running setup.py develop for robomimic
Successfully installed robomimic-0.3.0


In [None]:
import robomimic

print(robomimic.__file__)

/content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic/robomimic/__init__.py


In [None]:
!pip install -e robosuite/
# !pip install robosuite==1.4.1

import sys
import os
sys.path.append('./robosuite/')

Obtaining file:///content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robosuite
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: robosuite
  Building editable for robosuite (pyproject.toml) ... done
  Created wheel for robosuite: filename=robosuite-1.4.1-0.editable-py3-none-any.whl size=6869 sha256=52068876bb062a8b90c6588c76e19253a92b1421c8f599577585f6432c96462b
  Stored in directory: /tmp/pip-ephem-wheel-cache-7j4q9fuf/wheels/b8/40/b8/096f511417c3cc13eb2fcaad2752f45dd9f8b38d1ed70b1487
Successfully built robosuite
Installing collected packages: robosuite
  Attempting uninstall: robosuite
    Found existing installation: robosuite 1.4.1
    Uninstalling robosuite-1.4.1:
      Successfully uninstalled ro

In [None]:
!python /content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic/robomimic/scripts/setup_macros.py

/content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic/robomimic/macros_private.py already exists! 
overwrite? (y/n)
y
REMOVING
copied /content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic/robomimic/macros.py
to /content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic/robomimic/macros_private.py


In [None]:
# load the private macros if they exist; otherwise warn that setup_macros.py should be run
try:
    from robomimic.macros_private import *
except ImportError:
    from robomimic.utils.log_utils import log_warning
    import robomimic
    log_warning(
        "No private macro file found!"\
        "\nIt is recommended to use a private macro file"\
        "\nTo setup, run: python {}/scripts/setup_macros.py".format(robomimic.__path__[0])
    )

In [None]:
import robosuite

print(robosuite.__path__[0])

/content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/./robosuite/robosuite


### 1. Set up development environment

The main dependencies of robomimic are
- torch
- numpy
- h5py
- robosuite
- mujoco
- tensorboardX
- egl_probe
- matplotlib


The full list is included in the requirements.txt file in the repo.
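
A quick way to see which of these dependencies are importable in the current runtime is sketched below. The helper `missing_modules` is hypothetical, and the import names in `DEPENDENCIES` are assumed to match the package names listed above.

```python
import importlib.util

# Import names assumed to correspond to the dependencies listed above.
DEPENDENCIES = [
    "torch", "numpy", "h5py", "robosuite",
    "mujoco", "tensorboardX", "egl_probe", "matplotlib",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(DEPENDENCIES)
print("missing modules:", missing if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.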

Select US keyboard

In [None]:
# Install mujoco and its system dependencies
import os

# install all system dependencies for mujoco-py
!sudo apt install curl git libgl1-mesa-dev libgl1-mesa-glx libglew-dev \
         libosmesa6-dev software-properties-common net-tools unzip vim \
         virtualenv wget xpra xserver-xorg-dev libglfw3-dev patchelf

# install mujoco (the official Python bindings, not the older mujoco-py package)
!pip install mujoco

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
net-tools is already the newest version (1.60+git20181103.0eebece-1ubuntu5).
wget is already the newest version (1.21.2-2ubuntu1).
libglew-dev is already the newest version (2.2.0-4).
libglfw3-dev is already the newest version (3.3.6-1).
patchelf is already the newest version (0.14.3-1).
virtualenv is already the newest version (20.13.0+ds-2).
xpra is already the newest version (3.1-1build5).
curl is already the newest version (7.81.0-1ubuntu1.13).
git is already the newest version (1:2.34.1-1ubuntu1.9).
libgl1-mesa-dev is already the newest version (23.0.4-0ubuntu1~22.04.1).
libosmesa6-dev is already the newest version (23.0.4-0ubuntu1~22.04.1).
software-properties-common is already the newest version (0.99.22.7).
unzip is already the newest version (6.0-26ubuntu3.1).
vim is already the newest version (2:8.2.3995-1ubuntu2.10).
xserver-xorg-dev is already the newest version (2:21.1.4-2ubunt

In [None]:
# (Optional) test robomimic installation by running a dummy training loop
!python robomimic/examples/train_bc_rnn.py --debug


{
    "algo_name": "bc",
    "experiment": {
        "name": "robosuite_bc_rnn_example",
        "validate": true,
        "logging": {
            "terminal_output_to_txt": false,
            "log_tb": true,
            "log_wandb": false,
            "wandb_proj_name": "debug"
        },
        "save": {
            "enabled": true,
            "every_n_seconds": null,
            "every_n_epochs": 1,
            "epochs": [],
            "on_best_validation": false,
            "on_best_rollout_return": false,
            "on_best_rollout_success_rate": true
        },
        "epoch_every_n_steps": 3,
        "validation_epoch_every_n_steps": 3,
        "env": null,
        "additional_envs": null,
        "render": false,
        "render_video": false,
        "keep_all_videos": false,
        "video_skip": 25,
        "rollout": {
            "enabled": true,
            "n": 2,
            "horizon": 10,
            "rate": 1,
            "warmstart": 0,
            "terminate

## 2. Download demonstration dataset for a task

For robomimic tasks, we organize the demonstration datasets by
- task name (e.g., lift)
- data source (ph - proficient human, mh - multi human, mg - machine-generated)
- observation type (low_dim or image)

For more details on the dataset structure, see the [robomimic documentation](https://robomimic.github.io/docs/datasets/robomimic_v0.1.html) and the [dataset tutorial](https://github.com/ARISE-Initiative/robomimic/blob/master/examples/notebooks/datasets.ipynb).


Here we demonstrate downloading the proficient human (`ph`) dataset with low-dimensional (`low_dim`) observation for the `lift` task.
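
The three-level organization (task, data source, observation type) maps naturally onto a nested dictionary lookup. The sketch below uses a hypothetical miniature registry, `TOY_REGISTRY`, with a placeholder URL; it is not the real `DATASET_REGISTRY`, and `lookup_url` is an illustrative helper rather than a robomimic function.

```python
# Hypothetical miniature registry; the URL is a placeholder, not a real dataset link.
TOY_REGISTRY = {
    "lift": {
        "ph": {
            "low_dim": {"url": "https://example.com/lift/ph/low_dim.hdf5"},
        },
    },
}


def lookup_url(registry, task, dataset_type, hdf5_type):
    """Resolve task -> data source -> observation type, with a readable error."""
    try:
        return registry[task][dataset_type][hdf5_type]["url"]
    except KeyError as err:
        raise KeyError(
            "no entry for {}/{}/{} (missing key: {})".format(
                task, dataset_type, hdf5_type, err
            )
        ) from None


print(lookup_url(TOY_REGISTRY, "lift", "ph", "low_dim"))
```

Wrapping the lookup this way makes a mistyped task or data source name fail with a message that names the missing key.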



In [None]:
import os
import json
import h5py
import numpy as np

import robomimic
import robomimic.utils.file_utils as FileUtils

# the dataset registry can be found at robomimic/__init__.py
from robomimic import DATASET_REGISTRY

# set download folder and make it
download_folder = os.path.join(WS_DIR, "robomimic_data")
os.makedirs(download_folder, exist_ok=True)

# download the dataset
task = "lift"
dataset_type = "ph"
hdf5_type = "low_dim"
FileUtils.download_url(
    url=DATASET_REGISTRY[task][dataset_type][hdf5_type]["url"],
    download_dir=download_folder,
)

# enforce that the dataset exists
dataset_path = os.path.join(download_folder, "low_dim_v141.hdf5")
assert os.path.exists(dataset_path)

low_dim_v141.hdf5: 21.7MB [00:02, 8.94MB/s]                            


## 3. Build a simple behavior cloning model

The model follows the default hyperparameters defined in `robomimic/config/bc_config.py`.
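
For readers new to behavior cloning, the idea is to fit a policy to demonstrated (observation, action) pairs by supervised regression. The following toy sketch is not robomimic's BC implementation: it assumes a one-dimensional linear policy and a mean-squared-error loss, and `train_toy_bc` is a hypothetical name.

```python
def train_toy_bc(demos, lr=0.1, steps=500):
    """Fit action = w * obs + b to demonstration pairs with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(demos)
    for _ in range(steps):
        # gradients of the mean squared error between predicted and demonstrated actions
        grad_w = sum(2.0 * (w * obs + b - act) * obs for obs, act in demos) / n
        grad_b = sum(2.0 * (w * obs + b - act) for obs, act in demos) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy "expert" that always acts as action = 2 * obs + 1
demos = [(obs, 2.0 * obs + 1.0) for obs in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, b = train_toy_bc(demos)
print("learned w = {:.3f}, b = {:.3f}".format(w, b))
```

The learned parameters should end up close to the expert's slope and offset; the model built below replaces the linear map with a neural network and the scalar observation with the selected observation keys.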

In [None]:
# import all utility functions

import numpy as np

import torch
from torch.utils.data import DataLoader

import robomimic
import robomimic.utils.obs_utils as ObsUtils
import robomimic.utils.torch_utils as TorchUtils
import robomimic.utils.test_utils as TestUtils
import robomimic.utils.file_utils as FileUtils
import robomimic.utils.train_utils as TrainUtils
from robomimic.utils.dataset import SequenceDataset

from robomimic.config import config_factory
from robomimic.algo import algo_factory

In [None]:
def get_example_model(dataset_path, device):
    """
    Use a default config to construct a BC model.
    """

    # default BC config
    config = config_factory(algo_name="bc")

    # read config to set up metadata for observation modalities (e.g. detecting rgb observations)
    ObsUtils.initialize_obs_utils_with_config(config)

    # read dataset to get some metadata for constructing model
    # all_obs_keys determines what observations we will feed to the policy
    shape_meta = FileUtils.get_shape_metadata_from_dataset(
        dataset_path=dataset_path,
        all_obs_keys=sorted((
            "robot0_eef_pos",  # robot end effector position
            "robot0_eef_quat",   # robot end effector rotation (in quaternion)
            "robot0_gripper_qpos",   # parallel gripper joint position
            "object",  # object information
        )),
    )

    # make BC model
    model = algo_factory(
        algo_name=config.algo_name,
        config=config,
        obs_key_shapes=shape_meta["all_shapes"],
        ac_dim=shape_meta["ac_dim"],
        device=device,
    )
    return model

In [None]:
device = TorchUtils.get_torch_device(try_to_use_cuda=True)
model = get_example_model(dataset_path, device=device)

print(model)



using obs modality: low_dim with keys: ['object', 'robot0_eef_pos', 'robot0_gripper_qpos', 'robot0_eef_quat']
using obs modality: rgb with keys: []
using obs modality: depth with keys: []
using obs modality: scan with keys: []
ObservationKeyToModalityDict: action not found, adding action to mapping with assumed low_dim modality!
BC (
  ModuleDict(
    (policy): ActorNetwork(
        action_dim=7
  
        encoder=ObservationGroupEncoder(
            group=obs
            ObservationEncoder(
                Key(
                    name=object
                    shape=[10]
                    modality=low_dim
                    randomizer=None
                    net=None
                    sharing_from=None
                )
                Key(
                    name=robot0_eef_pos
                    shape=[3]
                    modality=low_dim
                    randomizer=None
                    net=None
                    sharing_from=None
                )
          

## 4. Build a simple training loop

Here we build a simple data loader pipeline and a training loop. Note that this code snippet is only instructional and is a stripped-down version of robomimic's main training loop (`robomimic/scripts/train.py`).
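
The data loader below samples fixed-length windows from each demonstration, with `seq_length=10` and `pad_seq_length=True`. The sketch below illustrates the idea on plain lists. It assumes that, with padding enabled, every timestep can start a window and short windows are filled by repeating the last item; `count_sequences` and `padded_window` are hypothetical helpers, not part of `SequenceDataset`.

```python
def count_sequences(demo_lengths, seq_length, pad_seq_length):
    """Count how many windows can be sampled from demos of the given lengths."""
    total = 0
    for length in demo_lengths:
        if pad_seq_length:
            total += length  # every timestep can start a (possibly padded) window
        else:
            total += max(length - seq_length + 1, 0)  # only full windows count
    return total


def padded_window(demo, start, seq_length):
    """Return seq_length items from `start`, repeating the last item as padding."""
    chunk = demo[start:start + seq_length]
    return chunk + [demo[-1]] * (seq_length - len(chunk))


print(count_sequences([5, 3], seq_length=3, pad_seq_length=True))
print(count_sequences([5, 3], seq_length=3, pad_seq_length=False))
print(padded_window([0, 1, 2, 3, 4], start=3, seq_length=3))
```

Under this assumption, padding lets the timesteps near the end of each trajectory contribute training samples instead of being dropped.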

In [None]:
"""
WARNING: This code snippet is only for instructive purposes, and is missing several useful
         components used during training such as logging and rollout evaluation.
"""
def get_data_loader(dataset_path):
    """
    Get a data loader to sample batches of data.
    Args:
        dataset_path (str): path to the dataset hdf5
    """
    dataset = SequenceDataset(
        hdf5_path=dataset_path,
        obs_keys=(                      # observations we want to appear in batches
            "robot0_eef_pos",
            "robot0_eef_quat",
            "robot0_gripper_qpos",
            "object",
        ),
        dataset_keys=(                  # can optionally specify more keys here if they should appear in batches
            "actions",
            "rewards",
            "dones",
        ),
        load_next_obs=True,
        frame_stack=1,
        seq_length=10,                  # length-10 temporal sequences
        pad_frame_stack=True,
        pad_seq_length=True,            # pad last obs per trajectory to ensure all sequences are sampled
        get_pad_mask=False,
        goal_mode=None,
        hdf5_cache_mode="all",          # cache dataset in memory to avoid repeated file i/o
        hdf5_use_swmr=True,
        hdf5_normalize_obs=False,
        filter_by_attribute=None,       # can optionally provide a filter key here
    )
    print("\n============= Created Dataset =============")
    print(dataset)
    print("")

    data_loader = DataLoader(
        dataset=dataset,
        sampler=None,       # no custom sampling logic (uniform sampling)
        batch_size=100,     # batches of size 100
        shuffle=True,
        num_workers=0,
        drop_last=True      # don't provide last batch in dataset pass if it's less than 100 in size
    )
    return data_loader


def run_train_loop(model, data_loader, num_epochs=50, gradient_steps_per_epoch=100):
    """
    Note: this is a stripped-down version of @TrainUtils.run_epoch and the train loop
    in the train function in train.py. Logging and evaluation rollouts were removed.
    Args:
        model (Algo instance): instance of Algo class to use for training
        data_loader (torch.utils.data.DataLoader instance): torch DataLoader for
            sampling batches
        num_epochs (int): number of training epochs to run
        gradient_steps_per_epoch (int): number of gradient steps to take in each epoch
    """
    # ensure model is in train mode
    model.set_train()

    for epoch in range(1, num_epochs + 1): # epoch numbers start at 1

        # iterator for data_loader - it yields batches
        data_loader_iter = iter(data_loader)

        # record losses
        losses = []

        for _ in range(gradient_steps_per_epoch):

            # load next batch from data loader
            try:
                batch = next(data_loader_iter)
            except StopIteration:
                # data loader ran out of batches - reset and yield first batch
                data_loader_iter = iter(data_loader)
                batch = next(data_loader_iter)

            # process batch for training
            input_batch = model.process_batch_for_training(batch)

            # forward and backward pass
            info = model.train_on_batch(batch=input_batch, epoch=epoch, validate=False)

            # record loss
            step_log = model.log_info(info)
            losses.append(step_log["Loss"])

        # do anything model needs to after finishing epoch
        model.on_epoch_end(epoch)

        print("Train Epoch {}: Loss {}".format(epoch, np.mean(losses)))
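
The `try` / `except StopIteration` block in the loop above lets an epoch take a fixed number of gradient steps even when that number exceeds the number of batches in one pass over the data. The pattern in isolation, with a plain list standing in for the data loader (`take_with_restart` is a hypothetical helper):

```python
def take_with_restart(make_iterator, num_steps):
    """Draw num_steps items, creating a fresh iterator whenever one is exhausted."""
    iterator = make_iterator()
    items = []
    for _ in range(num_steps):
        try:
            item = next(iterator)
        except StopIteration:
            # ran out of items: start a new pass and take its first item
            iterator = make_iterator()
            item = next(iterator)
        items.append(item)
    return items


batches = ["b0", "b1", "b2"]
print(take_with_restart(lambda: iter(batches), 7))
# → ['b0', 'b1', 'b2', 'b0', 'b1', 'b2', 'b0']
```

With a PyTorch `DataLoader` created with `shuffle=True`, each new iterator draws a fresh permutation, so the restarted pass does not repeat the previous batch order.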


## 5. Run policy training

We now train the model using the data loader and the training loop defined above. Note that this simple training loop does not save checkpoints. For model checkpointing, take a look at the full-featured [training loop](https://github.com/ARISE-Initiative/robomimic/blob/master/robomimic/scripts/train.py#L290) and the [documentation](https://robomimic.github.io/docs/tutorials/viewing_results.html).

In [None]:
# get dataset loader
data_loader = get_data_loader(dataset_path=dataset_path)

# run training loop
run_train_loop(model=model, data_loader=data_loader, num_epochs=50, gradient_steps_per_epoch=100)

SequenceDataset: loading dataset into memory...
100%|██████████| 200/200 [00:00<00:00, 225.93it/s]
SequenceDataset: caching get_item calls...
100%|██████████| 9666/9666 [00:01<00:00, 5159.49it/s]

SequenceDataset (
	path=/content/drive/MyDrive/01_research/ICRA2023-cybersecurity/error-awareness-detector/robomimic_data/low_dim_v141.hdf5
	obs_keys=('robot0_eef_pos', 'robot0_eef_quat', 'robot0_gripper_qpos', 'object')
	seq_length=10
	filter_key=none
	frame_stack=1
	pad_seq_length=True
	pad_frame_stack=True
	goal_mode=none
	cache_mode=all
	num_demos=200
	num_sequences=9666
)

Train Epoch 1: Loss 0.15710235111415385
Train Epoch 2: Loss 0.1130978761613369
Train Epoch 3: Loss 0.08212806325405836
Train Epoch 4: Loss 0.06557119220495224
Train Epoch 5: Loss 0.05618283927440643
Train Epoch 6: Loss 0.05050862692296505
Train Epoch 7: Loss 0.046122589204460385
Train Epoch 8: Loss 0.04354785908013582
Train Epoch 9: Loss 0.04141165215522051
Train Epoch 10: Loss 0.03935236385092139
Train Epoch 11: Loss 

## 6. Evaluate and visualize trained policy

Here we execute the trained policy `model` in a simulated environment and play the rollout video.
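
Conceptually, a rollout resets the environment, repeatedly asks the policy for an action, steps the environment, and accumulates reward and success information. The sketch below shows that loop on a hypothetical one-dimensional toy task. `ToyReachEnv`, `toy_policy`, and `toy_rollout` are illustrative names and do not reproduce robomimic's `run_rollout`; only the result keys mirror the ones it logs.

```python
class ToyReachEnv:
    """Hypothetical 1-D task: move a point from 0 until it reaches `target`."""

    def __init__(self, target=5):
        self.target = target
        self.pos = 0

    def reset(self):
        self.pos = 0
        return {"pos": self.pos}

    def step(self, action):
        self.pos += action
        reward = 1.0 if self.pos == self.target else 0.0
        done = False  # this toy task never terminates on its own
        return {"pos": self.pos}, reward, done

    def is_success(self):
        return self.pos == self.target


def toy_policy(obs, target=5):
    """Step toward the target, then hold position."""
    if obs["pos"] < target:
        return 1
    if obs["pos"] > target:
        return -1
    return 0


def toy_rollout(policy, env, horizon):
    """Run one episode and summarize it with Return, Horizon, and Success_Rate."""
    obs = env.reset()
    total_reward, success, steps = 0.0, False, 0
    for steps in range(1, horizon + 1):
        action = policy(obs)
        obs, reward, done = env.step(action)
        total_reward += reward
        success = success or env.is_success()
        if done:
            break
    return {"Return": total_reward, "Horizon": steps, "Success_Rate": float(success)}


result = toy_rollout(toy_policy, ToyReachEnv(target=5), horizon=20)
print(result)
# → {'Return': 16.0, 'Horizon': 20, 'Success_Rate': 1.0}
```

In this toy example the episode runs for the full horizon, so the return keeps growing after the target is first reached.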

In [None]:
# create simulation environment

import robomimic.utils.env_utils as EnvUtils

env_meta = FileUtils.get_env_metadata_from_dataset(dataset_path)

env = EnvUtils.create_env_from_metadata(
    env_meta=env_meta,
    env_name=env_meta["env_name"],
    render=False,
    render_offscreen=True,
    use_image_obs=False,
)

Created environment with name Lift
Action size is 7


In [None]:
from robomimic.algo import RolloutPolicy
from robomimic.utils.train_utils import run_rollout
import imageio

# create a thin wrapper around the model to interact with the environment
policy = RolloutPolicy(model)

# create a video writer
video_path = "rollout.mp4"
video_writer = imageio.get_writer(video_path, fps=20)

# run rollout
rollout_log = run_rollout(
    policy=policy,
    env=env,
    horizon=200,
    video_writer=video_writer,
    render=False
)

video_writer.close()
# print rollout results
print(rollout_log)

{'Return': 51.0, 'Horizon': 200, 'Success_Rate': 1.0}


In [None]:
# visualize rollout video

from IPython.display import HTML
from base64 import b64encode

# read the video bytes, closing the file handle when done
with open(video_path, "rb") as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f"""
<video width=400 controls>
      <source src="{data_url}" type="video/mp4">
</video>
""")