---
## **<p style="text-align: center; text-decoration: underline;">DATA CHALLENGE</p>**
# **<p style="text-align: center;">HUMAN MOTION GENERATION (HMG): Text-To-Motion</p>**
---

> *2025*.

---

![examples](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fimg.clipart-library.com%2F2%2Fclip-motions%2Fclip-motions-6.png&f=1&nofb=1&ipt=0747ffa645bb5f7798e8a2d44499b28f1156ce0e83b1b300fabfed4c6ab1fdf2&ipo=images)

### ■ **Overview**
In this data challenge, you will explore the intersection of natural language processing (NLP) and human motion synthesis by working on text-to-motion and motion-to-text tasks using the HumanML3D dataset. This dataset contains 3D human motion sequences paired with rich textual descriptions, enabling models to learn bidirectional mappings between language and motion.

#### **I. Main Task: Text-to-Motion Generation**
- **Text-to-Motion:** Develop a model that generates a human motion sequence from a textual description.

#### **II. Dataset Overview:**
- HumanML3D includes 14,616 motion samples across diverse actions (walking, dancing, sports) and 44,970 text annotations.
- Data includes skeletal joint positions, rotations, and fine-grained textual descriptions.

<img src="https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fproduction-media.paperswithcode.com%2Fdatasets%2F446194c5-ce59-43eb-b4cb-570a7a4d0cd9.png&f=1&nofb=1&ipt=b2edbe3251cab88e26a7f9d4e765c811b2cc890dc2ace7f7456baeca076b115b&ipo=images" alt="description" style="width:800px; height:600px;" />

The provided dataset contains the following components:

- 1. `motions` Folder: Contains `.npy` files, each representing a sequence of body poses. Each file has a shape of `(T, N, d)`, where:
  - `T`: Number of frames in the sequence (varies across sequences).
  - `N`: Number of joints in the body (22 in this case).
  - `d`: Dimension of each joint (3D coordinates: `x`, `y`, `z`).

- 2. `texts` Folder: Contains `.txt` files, each providing **three or four textual descriptions** of the corresponding motion sequence, one per line. Each description is followed by `#` and the part-of-speech (POS) tag of every word, written as `word/TAG`.
> Example: *"a person jump hop to the right#a/DET person/NOUN jump/NOUN hop/NOUN to/ADP the/DET right/NOUN#"*

- 3. File Lists
    - **`train.txt`**: List of motion files for training.
    - **`val.txt`**: List of motion files for validation.
    - **`test.txt`**: List of motion files for testing.
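
A line in a `texts` file, such as the example above, can be split on `#` to recover the caption and its `word/TAG` pairs. The following is a minimal sketch under that assumption; the function name `parse_description` is hypothetical, and any fields after the POS string are ignored.

```python
def parse_description(line):
    """Split one annotation line into (caption, [(word, pos_tag), ...])."""
    fields = line.strip().split('#')
    caption = fields[0].strip()
    tagged = fields[1].split() if len(fields) > 1 else []
    tokens = []
    for tok in tagged:
        # Each token looks like "word/TAG"; rpartition splits on the last '/'.
        word, sep, tag = tok.rpartition('/')
        if sep:  # skip tokens that carry no tag
            tokens.append((word, tag))
    return caption, tokens


example = "a person jump hop to the right#a/DET person/NOUN jump/NOUN hop/NOUN to/ADP the/DET right/NOUN#"
caption, tokens = parse_description(example)
print(caption)      # a person jump hop to the right
print(tokens[:2])   # [('a', 'DET'), ('person', 'NOUN')]
```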


#### **III. Evaluation Metrics**

**Mean Squared Error (MSE):** MSE measures the quality of the generated data as the average squared difference between:
- Real Data: The ground truth motion sequences.
- Generated Data: The motion sequences produced by the model.

> **Interpretation:**
> - Lower MSE: The generated motions are more similar to the ground truth motions.
> - Higher MSE: The generated motions are less similar to the ground truth motions.
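
Written out, and assuming the mean is taken over every sequence, frame, joint, and coordinate (which is what a plain `.mean()` over the difference array computes), the metric is as follows. Here $M$ is the number of test sequences (a symbol introduced only for this formula), $T$, $N$, and $d$ are as defined for the `motions` folder, $x$ is the ground truth, and $\hat{x}$ is the prediction.

$$
\mathrm{MSE} = \frac{1}{M\,T\,N\,d} \sum_{m=1}^{M} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{k=1}^{d} \left( x_{m,t,n,k} - \hat{x}_{m,t,n,k} \right)^2
$$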

Submit your solution as a CSV file.

For each ID in the motion test set (`test.txt`), predict the corresponding motion and flatten it into one row of 100 × 22 × 3 = 6,600 values. The file must contain a header row and follow this format:

| id      | f_0 | f_1 | f_2 | f_3 | ... | f_6599                                |
|---------|-----|-----|-----|-----|-----|---------------------------------------|
| 011645 | 0.946255 | 0.057217 | 0.862470  | -0.058685 | ... | 0.862470         |
| M008704 | 0.953142 | 0.067002 | 0.874826 | -0.062547 | ... | 0.048156         |
| M011136 | 0.951201 | 0.048156 | 0.864462 | -0.058623 | ... | 0.060903         |
| M009730 | 0.861586 | 0.060903 | 0.779133 | -0.061908 | ... | 0.937443         |
| 003104 | 0.937443 | 0.057995 | 0.852613  | -0.064890 | ... | 0.852613         |
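
Assuming each motion is flattened in NumPy's default row-major order (which is what `flatten()` uses when called without arguments), column `f_k` corresponds to one (frame, joint, coordinate) triple. The sketch below shows that mapping in plain Python; the helper names `feature_index` and `feature_position` are hypothetical.

```python
N_FRAMES, N_JOINTS, N_DIMS = 100, 22, 3  # submission shape used in this challenge


def feature_index(frame, joint, coord):
    """Column number k of f_k for a given (frame, joint, coord), row-major order."""
    return (frame * N_JOINTS + joint) * N_DIMS + coord


def feature_position(k):
    """Inverse mapping: the (frame, joint, coord) stored in column f_k."""
    frame, rest = divmod(k, N_JOINTS * N_DIMS)
    joint, coord = divmod(rest, N_DIMS)
    return frame, joint, coord


print(feature_index(0, 0, 0))    # 0
print(feature_index(99, 21, 2))  # 6599
print(feature_position(66))      # (1, 0, 0)
```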

You can generate your submission file using pandas as follows (the same code is provided as a runnable cell below):

    import pandas as pd

    test_motion_ids = []     # list of test motion ids
    pred_test_motions = ...  # your predicted motions: numpy array of shape (1000, 100, 22, 3), /!\ in the same order as the ids!
    submission_data = []
    for motion_id, pred_motion in zip(test_motion_ids, pred_test_motions):
        # pred_motion has shape (100, 22, 3); pred_motion.flatten() has shape (100 * 22 * 3,)
        submission_data.append([motion_id] + list(pred_motion.flatten()))

    submission_df = pd.DataFrame(submission_data, columns=['id'] + [f'f_{i}' for i in range(pred_motion.flatten().shape[0])])
    submission_df.to_csv('./submission.csv', index=False)
    submission_df.head()
    
    
#### **References**

- Jiang, B., Chen, X., Liu, W., Yu, J., Yu, G., & Chen, T. (2023). MotionGPT: Human motion as a foreign language. Advances in Neural Information Processing Systems, 36, 20067-20079.
- Zhu, W., Ma, X., Ro, D., Ci, H., Zhang, J., Shi, J., ... & Wang, Y. (2023). Human motion generation: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- Xu, L., Song, Z., Wang, D., Su, J., Fang, Z., Ding, C., ... & Wu, W. (2023). ActFormer: A GAN-based transformer towards general action-conditioned 3D human motion generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 2228-2238).

### **Animation Demo**

In [3]:
import os
from os.path import join as pjoin
from tqdm import tqdm
import numpy as np

import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.animation import FuncAnimation, PillowWriter
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
import mpl_toolkits.mplot3d.axes3d as p3

# Define the kinematic tree for connecting joints
kinematic_tree = [
    [0, 2, 5, 8, 11], 
    [0, 1, 4, 7, 10], 
    [0, 3, 6, 9, 12, 15], 
    [9, 14, 17, 19, 21], 
    [9, 13, 16, 18, 20]
]

def plot_3d_motion(save_path, joints, title, figsize=(10, 10), fps=120, radius=4):
    # Split the title if it's too long
    title_sp = title.split(' ')
    if len(title_sp) > 10:
        title = '\n'.join([' '.join(title_sp[:10]), ' '.join(title_sp[10:])])

    def init():
        ax.set_xlim3d([-radius / 2, radius / 2])
        ax.set_ylim3d([0, radius])
        ax.set_zlim3d([0, radius])
        fig.suptitle(title, fontsize=20)
        ax.grid(False)  # positional form works across Matplotlib versions (`b=` was renamed to `visible=`)

    def plot_xzPlane(minx, maxx, miny, minz, maxz):
        # Plot a plane XZ
        verts = [
            [minx, miny, minz],
            [minx, miny, maxz],
            [maxx, miny, maxz],
            [maxx, miny, minz]
        ]
        xz_plane = Poly3DCollection([verts])
        xz_plane.set_facecolor((0.5, 0.5, 0.5, 0.5))
        ax.add_collection3d(xz_plane)

    # Reshape the joints data
    data = joints.copy().reshape(len(joints), -1, 3)
    # fig = plt.figure(figsize=figsize)
    # ax = p3.Axes3D(fig)
    fig = plt.figure(figsize=figsize)
    ax = fig.add_subplot(111, projection='3d')
    init()

    # Compute min and max values for the data
    MINS = data.min(axis=0).min(axis=0)
    MAXS = data.max(axis=0).max(axis=0)

    # Define colors for the kinematic tree
    colors = ['red', 'blue', 'black', 'red', 'blue',  
              'darkblue', 'darkblue', 'darkblue', 'darkblue', 'darkblue',
              'darkred', 'darkred', 'darkred', 'darkred', 'darkred']

    frame_number = data.shape[0]

    # Adjust the height offset
    height_offset = MINS[1]
    data[:, :, 1] -= height_offset
    trajec = data[:, 0, [0, 2]]

    # Center the data
    data[..., 0] -= data[:, 0:1, 0]
    data[..., 2] -= data[:, 0:1, 2]

    def update(index):
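        # Clear existing lines and collections (iterate over copies, since
        # removing artists while iterating the live lists can skip elements)
        for line in list(ax.lines):
            line.remove()
        for collection in list(ax.collections):
            collection.remove()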

        # Update the view
        ax.view_init(elev=120, azim=-90)
        ax.dist = 7.5

        # Plot the XZ plane
        plot_xzPlane(MINS[0] - trajec[index, 0], MAXS[0] - trajec[index, 0], 0, MINS[2] - trajec[index, 1], MAXS[2] - trajec[index, 1])

        # Plot the trajectory
        if index > 1:
            ax.plot3D(trajec[:index, 0] - trajec[index, 0], np.zeros_like(trajec[:index, 0]), trajec[:index, 1] - trajec[index, 1], linewidth=1.0, color='blue')

        # Plot the kinematic tree
        for i, (chain, color) in enumerate(zip(kinematic_tree, colors)):
            linewidth = 4.0 if i < 5 else 2.0
            ax.plot3D(data[index, chain, 0], data[index, chain, 1], data[index, chain, 2], linewidth=linewidth, color=color)
        # Hide axis labels
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_zticklabels([])

    # Create the animation
    ani = FuncAnimation(fig, update, frames=frame_number, interval=1000 / fps, repeat=False)

    # Save the animation
    ani.save(save_path, fps=fps)
    plt.close()

    print(f'Animation saved to {save_path}!')

In [5]:
## /!\ Caution: paths to the data -> replace these with your own paths
motion_data_dir = 'motions/'
text_data_dir = 'texts/' 

## list all files in the folder
npy_files = sorted(os.listdir(motion_data_dir))

## pick a random motion file
npy_file = np.random.choice(npy_files)

## read npy motion file
motion_data = np.load(os.path.join(motion_data_dir, npy_file))
print('shape', motion_data.shape)

## get the corresponding titles for the given motion
titles = []
with open('{}{}.txt'.format(text_data_dir, npy_file.split('.')[0])) as f:
    descriptions = f.readlines()
    for desc in descriptions:
        titles.append(desc.split('#')[0].capitalize())

print('Descriptions:')
print('- '+'\n- '.join(titles))

## pick a random title
title = np.random.choice(titles)

## create & save animation
save_path = './animation.gif'
plot_3d_motion(save_path, motion_data, title=title, figsize=(10, 6), fps=30, radius=4)

MovieWriter ffmpeg unavailable; using Pillow instead.


shape (100, 22, 3)
Descriptions:
- A person holds onto something while walking up and down stairs.
- A person walking up stairs and back down
- A person walks up and down a set of stairs while holding a rail.
Animation saved to ./animation.gif!


### **Evaluation Metric `MSE`**

In [4]:
import numpy as np

## generate random values
real_motions = np.random.rand(1000, 100, 22, 3)
gen_motions  = np.random.rand(1000, 100, 22, 3)

def calc_mse(real_motions, gen_motions):
    return ((real_motions - gen_motions) ** 2).mean()

## calculate MSE between real and generated motions
calc_mse(real_motions, gen_motions)

### **Code to generate your submission `.csv` file**

In [4]:
import pandas as pd
import numpy as np

## /!\ Warning: placeholder values -> replace this with your actual predictions
test_motion_ids = np.arange(0, 1000).astype('str') # list of test motions ids
pred_test_motions = np.random.rand(1000, 100, 22, 3) ## pred_test_motions [numpy array] is your predicted motions #shape: (1000, 100, 22, 3), /!\ in the same order as the ids !

## flatten predicted motions to put them in a dataframe
submission_data = []
for motion_id, pred_motion in zip(test_motion_ids, pred_test_motions):
    submission_data.append([motion_id] + list(pred_motion.flatten()))   ## pred_motion # shape (100, 22, 3) => pred_motion.flatten() # shape (100 * 22 * 3)

## create a dataframe
submission_df = pd.DataFrame(submission_data, columns=['id'] + [f'f_{i}' for i in range(pred_motion.flatten().shape[0])])
## save dataframe to csv file
submission_df.to_csv('./submission.csv', index=False)
submission_df.head()

|   | id | f_0 | f_1 | f_2 | f_3 | f_4 | f_5 | f_6 | f_7 | f_8 | ... | f_6590 | f_6591 | f_6592 | f_6593 | f_6594 | f_6595 | f_6596 | f_6597 | f_6598 | f_6599 |
|---|----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| 0 | 0 | 0.062372 | 0.782929 | 0.932561 | 0.268971 | 0.928693 | 0.050069 | 0.455797 | 0.708632 | 0.456635 | ... | 0.386544 | 0.817677 | 0.706218 | 0.201805 | 0.315434 | 0.555705 | 0.424128 | 0.350925 | 0.379205 | 0.114162 |
| 1 | 1 | 0.141122 | 0.247577 | 0.576992 | 0.234968 | 0.209217 | 0.380946 | 0.048948 | 0.946984 | 0.306057 | ... | 0.082685 | 0.829796 | 0.290592 | 0.374106 | 0.557875 | 0.745328 | 0.94954 | 0.690417 | 0.362414 | 0.719611 |
| 2 | 2 | 0.599332 | 0.106621 | 0.369512 | 0.515412 | 0.491028 | 0.824067 | 0.105524 | 0.070823 | 0.220741 | ... | 0.114125 | 0.80293 | 0.482955 | 0.129433 | 0.450277 | 0.315319 | 0.09086 | 0.342644 | 0.989188 | 0.956314 |
| 3 | 3 | 0.958448 | 0.852979 | 0.806553 | 0.78184 | 0.238227 | 0.561971 | 0.183774 | 0.554731 | 0.530597 | ... | 0.437458 | 0.705695 | 0.846515 | 0.751005 | 0.522021 | 0.605127 | 0.209029 | 0.096653 | 0.914674 | 0.06934 |
| 4 | 4 | 0.347421 | 0.292765 | 0.758639 | 0.596535 | 0.379543 | 0.34688 | 0.875404 | 0.981555 | 0.781264 | ... | 0.42126 | 0.216263 | 0.691867 | 0.967057 | 0.073013 | 0.664676 | 0.714526 | 0.974186 | 0.838095 | 0.547007 |
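
Before uploading, it can help to check that the CSV has the expected header and row width. The following is a sketch that uses only the standard library; `check_submission` is a hypothetical helper, not part of the challenge tooling, and the 6,600-feature width is taken from the submission format described earlier.

```python
import csv
import io

N_FEATURES = 100 * 22 * 3  # 6600 values per flattened motion


def check_submission(lines):
    """Return the number of data rows, raising ValueError on a malformed file.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    reader = csv.reader(lines)
    expected_header = ['id'] + [f'f_{i}' for i in range(N_FEATURES)]
    header = next(reader, None)
    if header != expected_header:
        raise ValueError('header must be id, f_0, ..., f_6599')
    n_rows = 0
    for row in reader:
        if len(row) != len(expected_header):
            raise ValueError(f'row {n_rows} has {len(row)} fields')
        for value in row[1:]:
            float(value)  # raises ValueError if a value is not numeric
        n_rows += 1
    return n_rows


# Tiny in-memory example: a header plus two rows of zeros
header_line = ','.join(['id'] + [f'f_{i}' for i in range(N_FEATURES)])
row_line = ','.join(['000001'] + ['0.0'] * N_FEATURES)
demo = io.StringIO('\n'.join([header_line, row_line, row_line]) + '\n')
print(check_submission(demo))  # 2
```

With a real file, the same function can be called as `with open('./submission.csv', newline='') as f: check_submission(f)`.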
