# A deep dive into robomimic datasets (D4RL)

This notebook shows, through Python code examples, how to work with robomimic datasets. It assumes that you have installed `robomimic` and `d4rl` (which should be on the `offline_study` branch).

## Download dataset

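We are going to work with a D4RL dataset that has been converted to be compatible with `robomimic`. The cell below can download a robomimic dataset when `download_dataset` is set to `True`; otherwise it points to a locally converted D4RL file.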

In [12]:
import os
import json
import h5py
import numpy as np

import robomimic
import robomimic.utils.file_utils as FileUtils

# the dataset registry can be found at robomimic/__init__.py
from robomimic import DATASET_REGISTRY

download_dataset = False


if download_dataset:
    # set download folder and make it
    download_folder = "/tmp/robomimic_ds_example"
    os.makedirs(download_folder, exist_ok=True)

    # download the dataset
    task = "lift"
    dataset_type = "ph"
    hdf5_type = "low_dim"
    FileUtils.download_url(
        url=DATASET_REGISTRY[task][dataset_type][hdf5_type]["url"], 
        download_dir=download_folder,
    )

    # enforce that the dataset exists
    dataset_path = os.path.join(download_folder, "low_dim_v141.hdf5")
    assert os.path.exists(dataset_path)
else:
    dataset_path = "/home/wjung85/Repo/projects/robomimic-safediffusion/datasets/d4rl/converted/walker2d_medium_expert_v2.hdf5"

## Read quantities from dataset

Next, let's demonstrate how to read different quantities from the dataset. There are scripts such as `scripts/get_dataset_info.py` that can help you easily understand the contents of a dataset, but in this example, we'll break down how to do this directly.

First, let's take a look at the number of demonstrations in the file.

In [14]:
# open file
f = h5py.File(dataset_path, "r")

# each demonstration is a group under "data"
demos = list(f["data"].keys())
num_demos = len(demos)

print("hdf5 file {} has {} demonstrations".format(dataset_path, num_demos))

hdf5 file /home/wjung85/Repo/projects/robomimic-safediffusion/datasets/d4rl/converted/walker2d_medium_expert_v2.hdf5 has 2190 demonstrations


Next, let's list all of the demonstrations, along with the number of state-action pairs in each demonstration.

In [15]:
# each demonstration is named "demo_#" where # is a number.
# Let's put the demonstration list in increasing episode order
inds = np.argsort([int(elem[5:]) for elem in demos])
demos = [demos[i] for i in inds]

for ep in demos:
    num_actions = f["data/{}/actions".format(ep)].shape[0]
    print("{} has {} samples".format(ep, num_actions))

demo_0 has 1000 samples
demo_1 has 1000 samples
demo_2 has 1000 samples
demo_3 has 848 samples
demo_4 has 1000 samples
demo_5 has 909 samples
demo_6 has 1000 samples
demo_7 has 598 samples
demo_8 has 765 samples
demo_9 has 1000 samples
demo_10 has 980 samples
demo_11 has 1000 samples
demo_12 has 668 samples
demo_13 has 1000 samples
demo_14 has 346 samples
demo_15 has 754 samples
demo_16 has 1000 samples
demo_17 has 1000 samples
demo_18 has 1000 samples
demo_19 has 1000 samples
demo_20 has 457 samples
demo_21 has 1000 samples
demo_22 has 591 samples
demo_23 has 1000 samples
demo_24 has 1000 samples
demo_25 has 1000 samples
demo_26 has 1000 samples
demo_27 has 1000 samples
demo_28 has 1000 samples
demo_29 has 496 samples
demo_30 has 1000 samples
demo_31 has 1000 samples
demo_32 has 1000 samples
demo_33 has 100 samples
demo_34 has 667 samples
demo_35 has 1000 samples
demo_36 has 1000 samples
demo_37 has 1000 samples
demo_38 has 1000 samples
demo_39 has 658 samples
demo_40 has 1000 samples
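The integer sort in the cell above matters because h5py typically lists group members in name order, which places `demo_10` before `demo_2`. Here is a minimal sketch of the same idea on a plain list, followed by a small length summary. The key list is hypothetical, and the lengths are copied from the first four rows of the output above rather than read from the file.

```python
# Hypothetical demo keys in name (lexicographic) order.
demo_keys = ["demo_0", "demo_1", "demo_10", "demo_2", "demo_21", "demo_3"]


def episode_index(key):
    """Return the integer suffix of a key shaped like 'demo_<n>'."""
    return int(key.rsplit("_", 1)[1])


# Sort by the numeric suffix instead of by the raw string.
sorted_keys = sorted(demo_keys, key=episode_index)
print(sorted_keys)

# Lengths of the first four demonstrations, taken from the printed output.
lengths = {"demo_0": 1000, "demo_1": 1000, "demo_2": 1000, "demo_3": 848}
total = sum(lengths.values())
shortest = min(lengths, key=lengths.get)
print("total samples: {}, shortest demo: {}".format(total, shortest))
```

Splitting on the last underscore is slightly more robust than slicing with `elem[5:]`, since it does not depend on the length of the prefix.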

Now, let's dig into a single trajectory to take a look at some of the quantities in each demonstration.

In [16]:
# look at first demonstration
demo_key = demos[0]
demo_grp = f["data/{}".format(demo_key)]

# Each observation is a dictionary that maps modalities to numpy arrays, and
# each action is a numpy array. Let's print the observations and actions for the 
# first 5 timesteps of this trajectory.
for t in range(5):
    print("timestep {}".format(t))
    obs_t = dict()
    # each observation modality is stored as a subgroup
    for k in demo_grp["obs"]:
        obs_t[k] = demo_grp["obs/{}".format(k)][t] # numpy array
    act_t = demo_grp["actions"][t]
    
    # pretty-print observation and action using json
    obs_t_pp = { k : obs_t[k].tolist() for k in obs_t }
    print("obs")
    print(json.dumps(obs_t_pp, indent=4))
    print("action")
    print(act_t)

timestep 0
obs
{
    "flat": [
        1.2497172355651855,
        0.0020957256201654673,
        0.004812030587345362,
        -0.00031527638202533126,
        0.003084913594648242,
        -0.004972065798938274,
        -0.004506053868681192,
        0.003609082428738475,
        0.0035024546086788177,
        -0.0017032271716743708,
        -0.0036123539321124554,
        -0.0028234506025910378,
        -0.0013433537678793073,
        -0.0002545023453421891,
        0.0006495127454400063,
        -0.0011706806253641844,
        -0.004901927430182695
    ]
}
action
[-0.24307711  0.1645457  -0.34831366 -0.62253886 -0.9168731  -0.8515803 ]
timestep 1
obs
{
    "flat": [
        1.250348448753357,
        -0.014134379103779793,
        -0.014496925286948681,
        0.003145417897030711,
        -0.021036669611930847,
        -0.02088284306228161,
        -0.008337358012795448,
        -0.044759467244148254,
        -0.5076229572296143,
        0.15382137894630432,
        -3.9686379432

In [17]:
# we can also grab multiple timesteps at once directly, or even the full trajectory at once
first_ten_actions = demo_grp["actions"][:10]
print("shape of first ten actions {}".format(first_ten_actions.shape))
all_actions = demo_grp["actions"][:]
print("shape of all actions {}".format(all_actions.shape))

shape of first ten actions (10, 6)
shape of all actions (1000, 6)


In [18]:
# the trajectory also contains the next observations under "next_obs", 
# for convenient use in a batch (offline) RL pipeline. Let's verify
# that "next_obs" and "obs" are offset by 1.
for k in demo_grp["obs"]:
    # obs_{t+1} == next_obs_{t}
    assert(np.allclose(demo_grp["obs"][k][1:], demo_grp["next_obs"][k][:-1]))
print("success")

success
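To make the offset relationship concrete, here is a small sketch on synthetic arrays (not dataset contents). It builds `obs` and `next_obs` from one underlying state sequence, repeats the check from the cell above, and assembles `(obs, action, next_obs)` transitions the way an offline RL pipeline might consume them. The dimensions 17 and 6 are chosen to match the walker2d shapes printed earlier; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, obs_dim, act_dim = 6, 17, 6

# Synthetic trajectory with T + 1 states so that every step has a successor.
states = rng.normal(size=(T + 1, obs_dim)).astype(np.float32)
obs = states[:-1]       # s_0 ... s_{T-1}
next_obs = states[1:]   # s_1 ... s_T
actions = rng.uniform(-1.0, 1.0, size=(T, act_dim)).astype(np.float32)

# Same check as above: obs_{t+1} == next_obs_{t}
assert np.allclose(obs[1:], next_obs[:-1])

# One (s, a, s') tuple per timestep.
transitions = [(obs[t], actions[t], next_obs[t]) for t in range(T)]
print(len(transitions), transitions[0][0].shape, transitions[0][1].shape)
```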


In [19]:
# we also have "done" and "reward" information stored in each trajectory.
# In robosuite datasets the rewards are sparse and indicate task completion at
# that timestep; D4RL locomotion datasets such as this one generally store
# dense per-step rewards instead.
dones = demo_grp["dones"][:]
rewards = demo_grp["rewards"][:]
print("dones")
print(dones)
print("")
print("rewards")
print(rewards)

dones
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 
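A common next step with the `rewards` array is to compute the discounted return-to-go for every timestep of a single trajectory. The sketch below does this with a backward pass. The function name `discounted_returns`, the discount `gamma`, and the toy reward values are assumptions for illustration, not part of robomimic.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for each timestep of one trajectory."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


toy_rewards = np.array([1.0, 0.0, 2.0])
toy_returns = discounted_returns(toy_rewards, gamma=0.5)
print(toy_returns)
```

With the dataset open, the same function could be applied to `demo_grp["rewards"][:]`. It treats the array as one episode and does not look at `dones`.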

In [26]:
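# each demonstration also contains metadata
num_samples = demo_grp.attrs["num_samples"] # number of samples in this trajectory
# robosuite datasets also store the MuJoCo XML under "model_file"; this converted
# D4RL file has no such attribute, so the lookup below raises the KeyError shown.
mujoco_xml_file = demo_grp.attrs["model_file"] # mujoco XML file for this demonstration
print(mujoco_xml_file)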

KeyError: "Unable to synchronously open attribute (can't locate attribute: 'model_file')"
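One way to avoid this error is to check whether an attribute exists before reading it. The sketch below builds a tiny in-memory HDF5 file with a robomimic-like layout (the file name and contents are made up for illustration) and reads an optional attribute using the mapping-style `in` and `.get` operations that h5py attribute managers support.

```python
import h5py
import numpy as np

# In-memory HDF5 file; backing_store=False means nothing is written to disk.
with h5py.File("toy_dataset.hdf5", "w", driver="core", backing_store=False) as toy:
    demo = toy.create_group("data/demo_0")
    demo.create_dataset("actions", data=np.zeros((5, 6), dtype=np.float32))
    demo.attrs["num_samples"] = 5

    num_samples_toy = int(demo.attrs["num_samples"])
    # "model_file" was never written, so both lookups report it as absent.
    has_model = "model_file" in demo.attrs
    model_xml = demo.attrs.get("model_file", None)
    attr_names = sorted(demo.attrs.keys())

print(num_samples_toy, has_model, model_xml, attr_names)
```

Listing `demo_grp.attrs.keys()` on the real file is a quick way to see which metadata a converted dataset actually carries.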

Finally, let's take a look at some global metadata present in the file. The hdf5 file stores environment metadata which is a convenient way to understand which simulation environment (task) the dataset was collected on. 

In [20]:
env_meta = json.loads(f["data"].attrs["env_args"])
# note: we could also have used the following function:
# env_meta = FileUtils.get_env_metadata_from_dataset(dataset_path=dataset_path)
print("==== Env Meta ====")
print(json.dumps(env_meta, indent=4))
print("")

==== Env Meta ====
{
    "env_name": "walker2d-medium-expert-v2",
    "type": 2,
    "env_kwargs": {}
}
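Since `env_args` is stored as a JSON string, it can be parsed and sanity-checked with the standard library alone. The sketch below uses a string that mirrors the metadata printed above. The list of required keys is an assumption based on that output, not an official schema.

```python
import json

# JSON string mirroring the metadata printed above.
toy_env_args = '{"env_name": "walker2d-medium-expert-v2", "type": 2, "env_kwargs": {}}'
toy_env_meta = json.loads(toy_env_args)

# Keys we expect based on the printed output (assumed, not an official schema).
required_keys = ("env_name", "type", "env_kwargs")
missing_keys = [k for k in required_keys if k not in toy_env_meta]
if missing_keys:
    raise KeyError("env metadata is missing keys: {}".format(missing_keys))

print("{} (type {}) with {} extra kwargs".format(
    toy_env_meta["env_name"], toy_env_meta["type"], len(toy_env_meta["env_kwargs"])
))
```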



## Visualizing demonstration trajectories

Finally, let's play some of these demonstrations back in the simulation environment to easily visualize the data that was collected.

It turns out that the environment metadata stored in the hdf5 allows us to easily create a simulation environment that is consistent with the way the dataset was collected!

In [22]:
import robomimic.utils.env_utils as EnvUtils

# create simulation environment from environment metadata
env = EnvUtils.create_env_from_metadata(
    env_meta=env_meta, 
    render=True,             # on-screen rendering enabled; set to False on headless machines
    render_offscreen=True,   # off-screen rendering to support rendering video frames
)

Created environment with name walker2d-medium-expert-v2
Action size is 6
    No environment version found in dataset!
    Cannot verify if dataset and installed environment versions match


  logger.warn(f"Box bound precision lowered by casting to {self.dtype}")


In [23]:
import imageio
download_folder = "/home/wjung85/Repo/projects/robomimic-safediffusion/datasets/d4rl"
video_path = os.path.join(download_folder, "playback.mp4")
video_writer = imageio.get_writer(video_path, fps=20)

# init_state = f["data/{}/states".format(demo_key)][0]
# initial_state_dict = dict(states=init_state)

# reset to initial state
env.reset()

# playback actions one by one, and render frames
actions = f["data/{}/actions".format(demo_key)][:]
for t in range(actions.shape[0]):
    env.step(actions[t])
    video_img = env.render(mode="rgb_array", height=512, width=512, camera_name="agentview")
    video_writer.append_data(video_img)

video_writer.close()

Found 4 GPUs for rendering. Using device 0.


In [24]:
demo_key = "demo_1"


In [12]:
import robomimic.utils.obs_utils as ObsUtils

# We normally need to make sure robomimic knows which observations are images (for the
# data processing pipeline). This is usually inferred from your training config, but
# since we are just playing back demonstrations, we just need to initialize robomimic
# with a dummy spec.
dummy_spec = dict(
    obs=dict(
            low_dim=["robot0_eef_pos"],
            rgb=[],
        ),
)
ObsUtils.initialize_obs_utils_with_obs_specs(obs_modality_specs=dummy_spec)



using obs modality: low_dim with keys: ['robot0_eef_pos']
using obs modality: rgb with keys: []


In [13]:
import imageio

# prepare to write playback trajectories to video
video_path = os.path.join(download_folder, "playback.mp4")
video_writer = imageio.get_writer(video_path, fps=20)

In [14]:
def playback_trajectory(demo_key):
    """
    Simple helper function to playback the trajectory stored under the hdf5 group @demo_key and
    write frames rendered from the simulation to the active @video_writer.
    """
    
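    # robosuite datasets store the ground-truth simulator states under the "states" key.
    # We will use the first one, along with the model xml, to reset the environment to
    # the initial configuration before playing back actions. Note that a dataset without
    # "states" or "model_file" (see the KeyError earlier) cannot use this helper as written.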
    init_state = f["data/{}/states".format(demo_key)][0]
    model_xml = f["data/{}".format(demo_key)].attrs["model_file"]
    initial_state_dict = dict(states=init_state, model=model_xml)
    
    # reset to initial state
    env.reset_to(initial_state_dict)
    
    # playback actions one by one, and render frames
    actions = f["data/{}/actions".format(demo_key)][:]
    for t in range(actions.shape[0]):
        env.step(actions[t])
        video_img = env.render(mode="rgb_array", height=512, width=512, camera_name="agentview")
        video_writer.append_data(video_img)
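The helper above assumes that every demonstration has a `states` dataset and a `model_file` attribute. A minimal sketch of a more defensive variant is shown below. The function `build_initial_state` is hypothetical (it is not part of robomimic), and plain dictionaries stand in for the hdf5 group and its attributes, which support the same `in` and indexing operations.

```python
def build_initial_state(demo_data, demo_attrs):
    """Return a dict suitable for a reset_to-style call, or None if the
    demonstration stores no simulator states."""
    if "states" not in demo_data:
        return None
    initial_state = {"states": demo_data["states"][0]}
    # Only include the model XML when the dataset provides it.
    if "model_file" in demo_attrs:
        initial_state["model"] = demo_attrs["model_file"]
    return initial_state


# Stand-ins for a robosuite-style demo and a converted D4RL-style demo.
robosuite_like = build_initial_state(
    {"states": [[0.0, 1.0], [0.1, 1.1]]}, {"model_file": "<mujoco/>"}
)
d4rl_like = build_initial_state({"actions": [[0.0]]}, {})
print(robosuite_like, d4rl_like)
```

A caller could then use `env.reset_to(state)` when a state is returned and fall back to `env.reset()` otherwise. With a plain reset the starting configuration is not the recorded one, so replayed actions may drift away from the original trajectory.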

In [15]:
# playback the first 5 demos
for ep in demos[:5]:
    print("Playing back demo key: {}".format(ep))
    playback_trajectory(ep)

# done writing video
video_writer.close()

Playing back demo key: demo_0
Playing back demo key: demo_1
Playing back demo key: demo_2
Playing back demo key: demo_3
Playing back demo key: demo_4


In [16]:
# view the trajectories!
from IPython.display import Video
Video(video_path, embed=True)