Number of motions seems less than the real one #29

Open
rd20karim opened this issue Jul 25, 2023 · 4 comments

@rd20karim

Hello, I have a question regarding the method used to compute the number of motions. While investigating your MotionDataset and dataloader, it appears that the number of motions during training is counted as the total number of motion snippets across the training subset, divided by the batch size. I understand the reasoning behind this approach, but it leads to a significant reduction in the effective data size. Specifically, the number of motions in the training set was initially 23,384. After removing motions shorter than the window_size of 64 frames, it decreased to 20,942, and it was further reduced to 14,435 training motions by the aforementioned method.
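
For illustration, here is a minimal sketch of the counting behaviour I am describing (counting snippets as the number of valid window start positions is my assumption, and the names are illustrative rather than your exact code):

```python
import numpy as np

def reported_loader_length(motion_lengths, window_size=64, batch_size=128):
    """What len(train_loader) would report under snippet-based counting."""
    # Drop motions shorter than one window (23,384 -> 20,942 in my run).
    kept = [n for n in motion_lengths if n >= window_size]
    # Count snippets per motion; treating them as the number of valid window
    # start positions is an assumption, not necessarily the exact rule used.
    snippet_counts = [n - window_size for n in kept]
    total_snippets = int(np.cumsum([0] + snippet_counts)[-1])
    # The loader length is the total snippet count divided by the batch size,
    # which is much smaller than the raw number of motions in the split.
    return total_snippets // batch_size
```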

Your clarification on this behavior would be greatly appreciated. Thank you.

@EricGuo5513
Owner

EricGuo5513 commented Jul 25, 2023 via email

@rd20karim
Author

The final value of cumsum is the total number of motion snippets for a given split:

[screenshot: cumsum computation in the dataset code]

For example, when the dataloader is constructed for validation, the real number of motions is 1300, but displaying len(val_loader) shows 911:

[screenshot: len(val_loader) output]

The __len__ method of the batch sampler is called (line 240), which returns roughly 116698 / 128 ≈ 911; this is the value reported as the number of motions when training the VQ-VAE on HumanML3D:

[screenshot: the batch sampler's __len__ method]

Also, the total number of motions here is counted before the augmentation process, so the mirrored motions are not included.
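
For concreteness, the numbers line up as follows, assuming __len__ simply floor-divides the final cumsum value by the batch size:

```python
total_val_snippets = 116_698   # final cumsum value for the validation split
batch_size = 128

print(total_val_snippets // batch_size)  # -> 911, the value len(val_loader) shows
# versus the raw count of 1300 validation motions (mirrored motions excluded)
```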

@EricGuo5513
Owner

EricGuo5513 commented Jul 27, 2023 via email

@rd20karim
Author

Okay, thanks for clarifying. So you trained the autoencoder on individually sampled 64-frame snippets drawn from all the training data.
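
If so, I imagine the snippet lookup works roughly like the sketch below; the searchsorted-based mapping is my assumption, not necessarily your implementation:

```python
import numpy as np

def get_snippet(motions, cumsum, idx, window_size=64):
    """Map a flat snippet index back to a 64-frame window.

    `cumsum` is the running total of snippet counts per motion, as in the
    earlier sketch; this lookup is an assumption about how individual windows
    could be drawn, not necessarily the repository's code.
    """
    # Find which motion the flat index falls into, then the frame offset.
    motion_id = int(np.searchsorted(cumsum, idx, side="right")) - 1
    offset = idx - cumsum[motion_id]
    return motions[motion_id][offset: offset + window_size]
```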

Regarding your work on text generation using TM2T with HumanML3D, the preprocessing was limited to filtering out motions with fewer than 3 text descriptions, and the motion length was constrained to be between 40 and 200 frames. I wanted to know whether I am missing any other details. Is there any constraint on the maximum text length generated during inference or training?
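
To make sure I have the preprocessing right, this is the filter rule I have in mind (whether the bounds are inclusive is my guess):

```python
def keep_motion(num_descriptions, num_frames,
                min_descriptions=3, min_frames=40, max_frames=200):
    """Keep a motion only if it has enough text descriptions and its length
    falls inside the allowed frame range (thresholds restated from above)."""
    return (num_descriptions >= min_descriptions
            and min_frames <= num_frames <= max_frames)
```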

I was also curious about how the provided split was generated, because changing the split seems to significantly impact the overall performance.
