# Getting started: Display MoGaze data with humoro

This notebook describes how to get started using the [MoGaze dataset](https://humans-to-robots-motion.github.io/mogaze/) with our pybullet-based library [humoro](https://github.com/PhilippJKratzer/humoro).


## Installation
The installation is tested on Ubuntu 18.04.

The packages python3 and python3-pip need to be installed, and pip needs to be upgraded (skip this step if Python is already installed):
```bash
sudo apt install python3
sudo apt install python3-pip
python3 -m pip install --upgrade pip --user
```

Parts of the software use Qt5, which can be installed using:
```bash
sudo apt install qt5-default
```

Clone the repository:
```bash
git clone https://github.com/PhilippJKratzer/humoro.git
```

The requirements can be installed using:
```bash
cd humoro
python3 -m pip install -r requirements.txt --user
```
    
Finally, you can install humoro system-wide using:
```bash
sudo python3 setup.py install
```

Download the dataset files:
```bash
wget pkratzer.net/mogaze.zip
unzip mogaze.zip
```
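
Before continuing, it can help to confirm that the files used later in this notebook were unpacked where the notebook expects them. The following is a minimal sketch: the helper name `missing_files` is hypothetical, and the directory `humoro/mogaze` is an assumption taken from the paths used in the cells below.

```python
from pathlib import Path

# File names referenced later in this notebook.
EXPECTED_FILES = [
    "p1_1_human_data.hdf5",
    "p2_1_human_data.hdf5",
    "p1_1_object_data.hdf5",
    "p1_1_gaze_data.hdf5",
    "p1_1_segmentations.hdf5",
    "scene.xml",
]


def missing_files(data_dir, names=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_file()]


# Assumed location, matching the paths used in the cells below.
print("missing:", missing_files("humoro/mogaze"))
```

An empty list means all files referenced in this notebook were found.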


## Playback Human data
Let's first take a closer look at the human data alone. A trajectory can be loaded from file as follows:


In [1]:
from humoro.trajectory import Trajectory

full_traj = Trajectory()
full_traj.loadTrajHDF5("humoro/mogaze/p2_1_human_data.hdf5")

The trajectory contains a data array, a description of the joints and some fixed joints for scaling:


In [2]:
print("The data has dimension timeframe, state_size:")
print(full_traj.data.shape)
print("")
print("This is a list of jointnames (from the urdf) corresponding to the state dimensions:")
print(list(full_traj.description))
print("")
print("Some joints are used for scaling the human and do not change over time")
print("They are available in a dictionary:")
print(full_traj.data_fixed)
print(full_traj.data[100])

The data has dimension timeframe, state_size:
(164399, 66)

This is a list of jointnames (from the urdf) corresponding to the state dimensions:
['baseTransX', 'baseTransY', 'baseTransZ', 'baseRotX', 'baseRotY', 'baseRotZ', 'pelvisRotX', 'pelvisRotY', 'pelvisRotZ', 'torsoRotX', 'torsoRotY', 'torsoRotZ', 'neckRotX', 'neckRotY', 'neckRotZ', 'headRotX', 'headRotY', 'headRotZ', 'linnerShoulderRotX', 'linnerShoulderRotY', 'linnerShoulderRotZ', 'lShoulderRotX', 'lShoulderRotY', 'lShoulderRotZ', 'lElbowRotX', 'lElbowRotY', 'lElbowRotZ', 'lWristRotX', 'lWristRotY', 'lWristRotZ', 'rinnerShoulderRotX', 'rinnerShoulderRotY', 'rinnerShoulderRotZ', 'rShoulderRotX', 'rShoulderRotY', 'rShoulderRotZ', 'rElbowRotX', 'rElbowRotY', 'rElbowRotZ', 'rWristRotX', 'rWristRotY', 'rWristRotZ', 'lHipRotX', 'lHipRotY', 'lHipRotZ', 'lKneeRotX', 'lKneeRotY', 'lKneeRotZ', 'lAnkleRotX', 'lAnkleRotY', 'lAnkleRotZ', 'lToeRotX', 'lToeRotY', 'lToeRotZ', 'rHipRotX', 'rHipRotY', 'rHipRotZ', 'rKneeRotX', 'rKneeRotY', 'rKneeR
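
Because the description lists one joint name per state dimension, it can be turned into a lookup table from name to column index. The sketch below uses a shortened stand-in description and made-up values rather than real MoGaze data; the helper names `joint_series` and `base_translation` are hypothetical and not part of humoro.

```python
# Stand-in for list(full_traj.description); only the first six names are used here.
description = ["baseTransX", "baseTransY", "baseTransZ",
               "baseRotX", "baseRotY", "baseRotZ"]

# Two made-up frames with one value per state dimension (not real MoGaze data).
data = [
    [0.10, 0.20, 0.90, 0.00, 0.00, 1.50],
    [0.11, 0.21, 0.90, 0.00, 0.01, 1.52],
]

# Map each joint name to its column index in the state vector.
joint_index = {name: i for i, name in enumerate(description)}


def joint_series(frames, name):
    """Return the values of one state dimension over all frames."""
    col = joint_index[name]
    return [frame[col] for frame in frames]


def base_translation(frames, t):
    """Return (x, y, z) of the base at frame t."""
    return tuple(frames[t][joint_index[k]]
                 for k in ("baseTransX", "baseTransY", "baseTransZ"))


print(joint_series(data, "baseRotZ"))
print(base_translation(data, 1))
```

With the real trajectory, and assuming `full_traj.data` is a NumPy array, the same index can be used for column slicing, e.g. `full_traj.data[:, joint_index["baseRotZ"]]`.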

To play the trajectory using the pybullet player, we spawn a human and add the trajectory to it:

In [3]:
from humoro.player_pybullet import Player
pp = Player()
pp.spawnHuman("Human1")
pp.addPlaybackTraj(full_traj, "Human1")

argv[0]=
startThreads creating 1 threads.
starting thread 0
started thread 0 
argc=3
argv[0] = --unused
argv[1] = 
argv[2] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA GeForce RTX 4080/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 525.147.05
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
pthread_getconcurrency()=0
Version = 3.3.0 NVIDIA 525.147.05
Vendor = NVIDIA Corporation
Renderer = NVIDIA GeForce RTX 4080/PCIe/SSE2
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0 
MotionThreadFunc thread started


pybullet build time: Nov 28 2023 23:48:36


A specific frame can be displayed:

In [4]:
# pp.showFrame(3000)



Or a sequence of frames can be played using:

In [5]:
# pp.play(duration=360, startframe=3000)
# pp.play()

A Qt5 widget (pp.play_controls()) is also available for fast-forwarding and skipping through the file. It also offers some options for segmenting the data, which are explained in the Segmentations section.

## Playback multiple humans at the same time
It is often useful to display multiple human trajectories at the same time, for example to show the output of a prediction alongside the ground truth.

This is achieved by spawning a second human and adding a trajectory to it. A trajectory also has an attribute *startframe*, which tells the player at which frame the trajectory starts.



In [6]:
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from tqdm import tqdm

from humoro.trajectory import Trajectory
from model.individual_TF import IndividualTF
from model.gen_model import GPT
from model.data_utils import *
from model.train_utils import *
from model.transformer.batch import subsequent_mask

# Release cached GPU memory left over from earlier runs.
torch.cuda.empty_cache()


joint_dims = 66
seq_len = 50
target_offset = 50
step_size = 1
hidden_size = 64
batch_size = 64

# Use the GPU if one is available, otherwise fall back to the CPU.
# The device variable is used below for the model and the data.
is_cuda = torch.cuda.is_available()
device = torch.device("cuda" if is_cuda else "cpu")

criterion = nn.MSELoss()

with torch.no_grad():
    # TF_model = IndividualTF(enc_inp_size=joint_dims*2, dec_inp_size=(joint_dims*2)+(joint_dims//3), dec_out_size=joint_dims*2, device=device)
    # TF_model.load_state_dict(torch.load('model/trained_model_data/TF_1_small_statedict.pt'))
    # TF_model.eval()

    # obs_dataset = generate_data_from_hdf_file("humoro/mogaze/p1_1_human_data.hdf5", seq_len, target_offset, step_size, use_vel=True)
    # train_loader = DataLoader(obs_dataset, batch_size=batch_size, num_workers=0, shuffle=False)
    # model_pred = []
    # for x, label in train_loader:
    #     x, label = x.to(device).float(), label.to(device).float()

    #     target = label[:, :-1, :]
    #     # target = x[:, :-1, :]
    #     target_c = torch.ones((target.shape[0], target.shape[1], (target.shape[2]//2)//3)).to(device).float()
    #     target = torch.cat((target, target_c), -1)
    #     start_of_seq = torch.zeros((target.shape[0], 1, target.shape[2])).to(device)
    #     start_of_seq[:, :, -1] = 1

    #     dec_inp = torch.cat((start_of_seq, target), 1)
    #     src_att = torch.ones((x.shape[0], 1, x.shape[1])).to(device).float()
    #     trg_att = subsequent_mask(dec_inp.shape[1]).repeat(dec_inp.shape[0],1,1).to(device).float()
        
    #     out = TF_model(x, dec_inp, src_att, trg_att)
    #     model_pred.extend(out[:, -1][:, :66].cpu())
    GT_model = GPT(n_layer=6, n_head=6, n_embd=192, vocab_size=joint_dims, block_size=seq_len*2-1, pdrop=0.1, device=device)
    # map_location lets weights saved on a GPU load on CPU-only machines as well
    GT_model.load_state_dict(torch.load('model/trained_model_data/GT_iso2_direct_statedict.pt', map_location=device))
    GT_model.eval()

    obs_dataset = generate_data_from_hdf_file("humoro/mogaze/p2_1_human_data.hdf5", seq_len, target_offset, step_size, use_vel=False)
    train_loader = DataLoader(obs_dataset, batch_size=batch_size, num_workers=0, shuffle=False)
    print(len(train_loader.dataset))
    model_pred_lst = []
    for x, label in (train_loader):
        x, label = x.to(device).float(), label.to(device).float()
        x_noised = x + torch.normal(mean=0, std=2.5, size=x.shape, device=device)
        # out, tot = GT_model.generate(x, 50, do_sample=False)
        out, _ = GT_model(x)
        model_pred_lst.extend(out.cpu())
        loss = criterion(out, label)
        change = criterion(out, x)
        print(loss.cpu(), change.cpu())
    # x, label = next(iter(train_loader))
    # x, label = x.to(device).float(), label.to(device).float()
    # out = GT_model.generate(x, 5000, do_sample=False)
    # model_pred.extend(out[:, -1].cpu())


number of parameters: 2.70M
164299
tensor(0.0856) tensor(0.0884)
tensor(0.0333) tensor(0.2039)
tensor(0.0128) tensor(0.0200)
tensor(0.0129) tensor(0.0150)
tensor(0.0335) tensor(0.0242)
tensor(0.1304) tensor(0.1087)
tensor(0.0288) tensor(0.0476)
tensor(0.0460) tensor(0.3403)
tensor(0.0179) tensor(0.0835)
tensor(0.0415) tensor(0.1186)
tensor(0.0216) tensor(0.2310)
tensor(0.0153) tensor(0.0101)
tensor(0.0858) tensor(0.0718)
tensor(0.1284) tensor(0.0857)
tensor(0.0457) tensor(0.1589)
tensor(0.1011) tensor(0.1490)
tensor(0.0239) tensor(0.0231)
tensor(0.0025) tensor(0.0026)
tensor(0.0022) tensor(0.0022)
tensor(0.0018) tensor(0.0019)
tensor(0.0017) tensor(0.0018)
tensor(0.0018) tensor(0.0018)
tensor(0.0018) tensor(0.0018)
tensor(0.0043) tensor(0.0019)
tensor(0.0084) tensor(0.0081)
tensor(0.0115) tensor(0.0160)
tensor(0.0186) tensor(0.0263)
tensor(0.0219) tensor(0.0567)
tensor(0.0247) tensor(0.3053)
tensor(0.0106) tensor(0.0358)
tensor(0.0195) tensor(0.0391)
tensor(0.0

In [8]:
# model_pred = np.array(model_pred_lst)[::50, :, 0:66]
# model_pred = model_pred.reshape(model_pred.shape[0]*model_pred.shape[1], model_pred.shape[2])
print(np.array(model_pred_lst).shape)
# print(len(pred_tot_lst[0][0]))
# print(np.array(pred_tot_lst).shape)
model_pred = np.array(model_pred_lst)[:, -1, 0:66]

predicted_traj = Trajectory(model_pred, full_traj.description, full_traj.startframe, full_traj.data_fixed)



# pp.spawnHuman("TrueFuture", color=[0., 1., 0., 1.])
# # this extracts a subtrajectory from the full trajectory:
# sub_traj = full_traj.subTraj(3100, 8100)
# sub_traj.startframe = 3000
# # we change the startframe of the sub_traj,
# # thus the player will play it at a different time:
# pp.addPlaybackTraj(sub_traj, "TrueFuture")

pp.spawnHuman("P2_future", color=[0.5, 0.5, 0., 1.])
p2_traj = Trajectory()
p2_traj.loadTrajHDF5("humoro/mogaze/p2_1_human_data.hdf5")
p2_sub_traj = p2_traj.subTraj(3100, 8100)
p2_sub_traj.startframe = 3000
pp.addPlaybackTraj(p2_sub_traj, "P2_future")

print(p2_traj.data.shape)

pp.spawnHuman("PredFuture", color=[0., 0., 1., 1.])
# this extracts a subtrajectory from the predicted trajectory:
pred_sub_traj = predicted_traj.subTraj(3000, 8000)
# we set the startframe of the subtrajectory,
# so the player starts playing it at frame 3000:
pred_sub_traj.startframe = 3000
pp.addPlaybackTraj(pred_sub_traj, "PredFuture")

pp.play(duration=5000, startframe=3000)

(164299, 50, 66)
(164399, 66)
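
The interplay of a subtrajectory and its *startframe* in the cell above can be illustrated with a toy model. This is a sketch under stated assumptions, not humoro's implementation: `ToyTrajectory`, `sub_traj` and `frame_at` are hypothetical names, the end index is assumed to be exclusive, and the player is assumed to show row `f - startframe` at global frame `f`.

```python
class ToyTrajectory:
    """Hypothetical stand-in for a trajectory: a list of frames plus a start frame."""

    def __init__(self, data, startframe=0):
        self.data = data
        self.startframe = startframe

    def sub_traj(self, start, end):
        """Frames [start, end) on the global timeline, as a new trajectory."""
        lo = start - self.startframe
        hi = end - self.startframe
        return ToyTrajectory(self.data[lo:hi], startframe=start)

    def frame_at(self, global_frame):
        """Frame shown at a global player frame, or None outside the trajectory."""
        local = global_frame - self.startframe
        if 0 <= local < len(self.data):
            return self.data[local]
        return None


# Each "frame" is just its own index, so shifts are easy to read off.
full = ToyTrajectory(list(range(10000)))

# Cut out frames 3100..8099, then shift them so playback starts at frame 3000.
future = full.sub_traj(3100, 8100)
future.startframe = 3000

print(future.frame_at(3000))  # data recorded at frame 3100, shown at frame 3000
```

Under these assumptions, shifting *startframe* by 100 frames makes the player show data recorded 100 frames later next to the unshifted trajectory.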


## Loading Objects
There is a helper function to directly spawn the objects and add the playback trajectories to the player:

In [None]:
from humoro.load_scenes import autoload_objects
obj_trajs, obj_names = autoload_objects(pp, "humoro/mogaze/p1_1_object_data.hdf5", "humoro/mogaze/scene.xml")
pp.play(duration=360, startframe=3000)

You can access the object trajectories and names:

In [None]:
print("objects:")
print(obj_names)
print("data shape for first object:")
print(obj_trajs[0].data.shape)  # 7 dimensions: 3 pos + 4 quaternion rotation

objects:
['table', 'cup_red', 'laiva_shelf', 'vesken_shelf', 'plate_blue', 'jug', 'goggles', 'plate_green', 'plate_red', 'cup_green', 'cup_blue', 'red_chair', 'cup_pink', 'plate_pink', 'bowl', 'blue_chair']
data shape for first object:
(53899, 7)
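
Each row of an object trajectory holds 3 position values followed by 4 quaternion values, as noted in the comment above. A minimal sketch for splitting such a row is shown below; `ObjectPose` and `split_pose` are hypothetical names, the example row is made up, and the component order inside the quaternion is not specified here, so it is passed through unchanged.

```python
from typing import NamedTuple, Sequence, Tuple


class ObjectPose(NamedTuple):
    position: Tuple[float, float, float]
    quaternion: Tuple[float, float, float, float]


def split_pose(row: Sequence[float]) -> ObjectPose:
    """Split a 7-dimensional object state into position and rotation quaternion."""
    if len(row) != 7:
        raise ValueError(f"expected 7 values, got {len(row)}")
    return ObjectPose(tuple(row[:3]), tuple(row[3:]))


# Made-up row: an object at (1.0, 2.0, 0.75) with the identity rotation,
# written with the scalar part last (an assumption for this example only).
pose = split_pose([1.0, 2.0, 0.75, 0.0, 0.0, 0.0, 1.0])
print(pose.position)
print(pose.quaternion)
```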


## Loading Gaze
The gaze can be loaded as follows. Only a trajectory of gaze direction points is loaded; the start point comes from the "goggles" object.

In [None]:
from humoro.gaze import load_gaze
gaze_traj = load_gaze("humoro/mogaze/p1_1_gaze_data.hdf5")
pp.addPlaybackTrajGaze(gaze_traj)

(53900, 3)


In [None]:
pp.play(duration=360, startframe=3000)

If you want to use the raw gaze data, the direction points need to be rotated by the calibration rotation:

In [None]:
print("calibration rotation quaternion:")
print(gaze_traj.data_fixed['calibration'])

calibration rotation quaternion:
[-0.31145353  0.37690775 -0.56556629  0.66413253]
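
Rotating a direction point by a quaternion can be sketched in plain Python as follows. The function `quat_rotate` is a hypothetical helper, not part of humoro, and the component order (x, y, z, w) is an assumption based on pybullet's convention; the example direction point is made up. Only the rotation itself is shown here.

```python
import math


def quat_rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q = (x, y, z, w)."""
    x, y, z, w = q
    # t = 2 * (q_vec x v)
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    # v' = v + w * t + q_vec x t
    return (
        v[0] + w * tx + (y * tz - z * ty),
        v[1] + w * ty + (z * tx - x * tz),
        v[2] + w * tz + (x * ty - y * tx),
    )


# Calibration quaternion printed above, assumed to be in (x, y, z, w) order.
calibration = (-0.31145353, 0.37690775, -0.56556629, 0.66413253)

# Made-up raw direction point, for illustration only.
raw_direction = (0.0, 0.0, 1.0)
rotated = quat_rotate(calibration, raw_direction)
print(rotated)
```

A rotation preserves the length of the vector, which is a quick sanity check when applying this to real gaze data.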


## Segmentations
The following loads a small Qt5 application that displays a time axis with the segmentations. The file is segmented according to when an object moves.

Note that once the QApplication is open, no new objects can be spawned in pybullet.


In [None]:
pp.play_controls("humoro/mogaze/p1_1_segmentations.hdf5")

TypeError: setSpacing(self, spacing: int): argument 1 has unexpected type 'float'

The label "null" means that no object is moving at that moment (e.g. when the human moves towards an object to pick it up).

It is also possible to use the segmentation file directly. It contains elements of the form (startframe, endframe, label):

In [None]:
import h5py
with h5py.File("humoro/mogaze/p1_1_segmentations.hdf5", "r") as segfile:
    # print first 5 segments:
    for i in range(5):
        print(segfile["segments"][i])

(0, 2027, b'null')
(2027, 2429, b'plate_red')
(2430, 2817, b'null')
(2818, 3252, b'plate_green')
(3253, 3673, b'null')
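
The labels are stored as byte strings, so they usually need decoding before further use. The sketch below works on the five segments printed above, copied into a plain list; the helper names `object_segments` and `durations` are hypothetical.

```python
# The first five segments, copied from the output above.
segments = [
    (0, 2027, b"null"),
    (2027, 2429, b"plate_red"),
    (2430, 2817, b"null"),
    (2818, 3252, b"plate_green"),
    (3253, 3673, b"null"),
]


def object_segments(segs):
    """Decode the labels and keep only segments in which an object moves."""
    result = []
    for start, end, label in segs:
        name = label.decode("utf-8")
        if name != "null":
            result.append((start, end, name))
    return result


def durations(segs):
    """Return (label, endframe - startframe) for each decoded segment."""
    return [(name, end - start) for start, end, name in segs]


moving = object_segments(segments)
print(moving)
print(durations(moving))
```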


## Kinematics
Pybullet can be used to compute positions from the joint angle trajectory. We provide a small helper class, which can be used like this:

In [None]:
from humoro.kin_pybullet import HumanKin
kinematics = HumanKin()
kinematics.set_state(full_traj, 100)  # set state at frame 100
print("position of right wrist:")
wrist_id = kinematics.inv_index["rWristRotZ"]
pos = kinematics.get_position(wrist_id)
print(pos)

argv[0]=
position of right wrist:
[0.20642142 0.12073594 0.76067251]


The Jacobian can be retrieved with:

In [None]:
print(kinematics.get_jacobian(wrist_id))

[[ 5.55111512e-17  7.60672507e-01 -1.20735945e-01  1.00000000e+00
  -2.22044605e-16 -1.11022302e-16  1.00000000e+00 -2.22044605e-16
  -1.11022302e-16  6.24500451e-17 -2.50631895e-02 -1.08006761e-01
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  1.64496211e-03
   1.18043989e-03  1.00134685e-01 -1.07010249e-01 -9.99354955e-01
   3.05702598e-02 -1.88449707e-02  1.12179663e-02  3.00429302e-01
  -1.08263749e-01  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
   0.00000
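
In general, a Jacobian relates joint velocities to the Cartesian velocity of a point on the kinematic chain. The sketch below illustrates this on a planar two-link arm rather than on the human model; the arm, its link lengths and all function names are illustrative assumptions and not part of humoro. The analytic Jacobian is checked against central finite differences of the forward kinematics.

```python
import math


def fk(q, l1=1.0, l2=1.0):
    """End-effector position (x, y) of a planar two-link arm."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return (x, y)


def jacobian(q, l1=1.0, l2=1.0):
    """Analytic 2x2 Jacobian d(x, y) / d(q0, q1)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def numeric_jacobian(q, eps=1e-6):
    """Jacobian estimated by central finite differences of fk."""
    cols = []
    for j in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += eps
        q_minus[j] -= eps
        p, m = fk(q_plus), fk(q_minus)
        cols.append([(p[i] - m[i]) / (2 * eps) for i in range(2)])
    # cols[j] holds column j; transpose into rows.
    return [[cols[j][i] for j in range(len(q))] for i in range(2)]


def cartesian_velocity(J, qdot):
    """Cartesian velocity J @ qdot for joint velocities qdot."""
    return [sum(J[i][j] * qdot[j] for j in range(len(qdot)))
            for i in range(len(J))]


q = (0.3, 0.7)
print(jacobian(q))
print(cartesian_velocity(jacobian(q), [0.1, -0.2]))
```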
