Question on Motion Representation in the First-Stage Model #1

@f15hhhh

Hello! Thank you for open-sourcing this fascinating work! This project is very clear and has helped me a lot!
I noticed in the code that the first-stage model appears to use a “global root + local joints” motion representation. For example, when computing the reconstruction loss, the root is reconstructed in world coordinates, while all other joints are predicted in root-local coordinates and then transformed back to world space.
The relevant snippet is:

ric_data_gt = recover_from_ric(target_inv_trans, joints_num=self.njoints, abs_3d=self.conf.abs_3d)
ric_data_pred = recover_from_ric(model_output_inv_trans, joints_num=self.njoints, abs_3d=self.conf.abs_3d)

if self.lambda_ric_pos:
    terms["ric_pos_loss"] = self.abs_joints_loss_humanml3d(ric_data_gt, ric_data_pred, mask)

and the helper function:

def recover_from_ric(data, joints_num, abs_3d=False):
    r_rot_quat, r_pos = recover_root_rot_pos(data, abs_3d=abs_3d)  # global root rotation & position
    positions = data[..., 4:(joints_num - 1) * 3 + 4]              # local joint offsets
    positions = positions.view(positions.shape[:-1] + (-1, 3))

    # rotate local joints by inverse root rotation
    positions = qrot(qinv(r_rot_quat[..., None, :]).expand(positions.shape[:-1] + (4,)), positions)

    # add root X/Z translation
    positions[..., 0] += r_pos[..., 0:1]
    positions[..., 2] += r_pos[..., 2:3]

    # concatenate global root and transformed joints
    positions = torch.cat([r_pos.unsqueeze(-2), positions], dim=-2)
    return positions
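
To check my reading of the helper, here is a simplified single-frame sketch of the transform as I understand it. It is not the repository's code: it assumes the root rotation is a pure yaw about the Y axis, it uses plain Python tuples instead of tensors, and `rotate_about_y` and `local_to_global` are hypothetical names I made up for illustration.

```python
import math


def rotate_about_y(point, angle):
    """Rotate a 3D point (x, y, z) about the Y axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def local_to_global(root_pos, root_yaw, local_joints):
    """Hypothetical single-frame stand-in for recover_from_ric.

    root_pos:     global root position (x, y, z)
    root_yaw:     root rotation about Y in radians (assumed yaw-only)
    local_joints: non-root joint positions expressed in the root frame
    """
    joints = [tuple(root_pos)]  # the root is already global
    for p in local_joints:
        # apply the inverse root rotation, mirroring qinv(...) in the snippet
        x, y, z = rotate_about_y(p, -root_yaw)
        # add only the root X/Z translation; Y is kept as stored
        joints.append((x + root_pos[0], y, z + root_pos[2]))
    return joints


frame = local_to_global(
    root_pos=(1.0, 0.9, 2.0),
    root_yaw=math.pi / 2,
    local_joints=[(0.0, 1.5, 0.3)],
)
```

If this reading is right, the world position of every non-root joint depends on the root pose, which is why I am unsure how an end-effector trajectory can be specified without one.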

However, the paper states that the first-stage model uses global positions for all key joints. Could you clarify whether I have missed an additional preprocessing step, or whether part of the pipeline has not been released yet?
Additionally, if the model is indeed trained with the "global root + local joints" representation, how is trajectory-conditioned generation for arbitrary end-effectors handled, especially when the root trajectory is not given?
Looking forward to your insights, and thanks again for sharing the project!
