Dataset processing #10

Closed
maherr13 opened this issue Apr 10, 2022 · 4 comments

Comments

@maherr13

I have a few questions regarding the dataset processing pipeline:

  • In the generate_clips script, why is the start index 80?
  • Why are there 13 records in each clip labeled idle in the train/test split file?
  • Are there any parameters I would need to adjust when creating my own dataset?

By the way, there is an error in 3_2_split_train_val_test.py: it names the validation samples "val", while the model searches for records labeled "dev".

@Garfield-Finch
Collaborator

Garfield-Finch commented Apr 11, 2022

Thank you for your interest in our work.

  1. In our dataset, there is always an introductory segment that we want to discard. We will change that part into a parameter the user can set. Thank you for that.
  2. That is to make sure that the training set and the test set do not overlap.
  3. As far as we know, every parameter that you should set yourself is marked "required=True" in the argument parser. It should be fine to go with our default settings for the rest. If you find anything crucial that we missed, please follow up on this issue.
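
As a rough sketch of what point 1 could look like once the start index is a setting: the argument names (`--base_path`, `--start_index`) and the helper `clip_starts` below are hypothetical, not the repository's actual code, and the unit of the index (frames, intervals, ...) is left open. Only the default of 80 and the use of "required=True" come from this thread.

```python
import argparse


def build_parser():
    # Hypothetical parser; argument names are illustrative, not the repo's.
    parser = argparse.ArgumentParser(description="Generate clips (sketch)")
    # Options the user must always provide are marked required=True.
    parser.add_argument("--base_path", required=True, help="dataset root")
    # The hard-coded 80 becomes a default that can be overridden.
    parser.add_argument("--start_index", type=int, default=80,
                        help="number of leading items to skip (introduction)")
    return parser


def clip_starts(num_items, clip_len, start_index):
    # Start positions of consecutive, non-overlapping clips after the skipped intro.
    return list(range(start_index, num_items - clip_len + 1, clip_len))


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--base_path", "data/my_speaker"])
starts = clip_starts(num_items=400, clip_len=64, start_index=args.start_index)
```

For a dataset without an introduction, passing `--start_index 0` would keep the leading items instead of skipping them.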

Thank you for pointing out the "val"/"dev" mismatch. It is an incompatibility between our naming rules and those of the original dataset from "Speech2Gesture".
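
Until the script is fixed, one possible workaround is to rename the label in the generated split file. The sketch below assumes the split file is a CSV whose split label lives in a column called `dataset`; that column name and the `interval_id` column in the toy data are assumptions, so adjust them to whatever your file actually contains.

```python
import csv
import io


def rename_split_label(rows, column="dataset", old="val", new="dev"):
    # Return copies of the rows with the split label renamed; other rows unchanged.
    return [
        {**row, column: new} if row.get(column) == old else dict(row)
        for row in rows
    ]


# Tiny in-memory example standing in for the real split file.
sample = "dataset,interval_id\ntrain,a\nval,b\ntest,c\n"
rows = list(csv.DictReader(io.StringIO(sample)))
fixed = rename_split_label(rows)
labels = [row["dataset"] for row in fixed]
```

The function returns new rows rather than editing in place, so the original file contents can be kept as a backup before writing the renamed rows out with `csv.DictWriter`.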

@maherr13
Author

Thank you for the explanation.

I have a question regarding data collection: from your experiments, how much data (in hours) do I need to collect to get good results in general, and results as good as Oliver's specifically?

@ShenhanQian
Owner

Here are the lengths of our training sequences for your reference:

| Subject | Length (hours) |
|---------|----------------|
| Oliver  | 11             |
| Kubinec | 3              |
| Luo     | 7              |
| Xing    | 2              |

Besides the length, the variation and quality of the poses may also strongly influence the results. So we suggest collecting videos with expressive gestures like Oliver's and visually checking that the detected keypoints are correct and stable.
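
As a complement to the visual check (not a replacement for it), a simple numeric screen can point at frames worth looking at first. The sketch below is only an assumption about how one might do this: it takes keypoints as per-frame lists of (x, y) pixel coordinates, measures the mean displacement between consecutive frames, and flags frames above an arbitrary threshold. The function names and the threshold value are hypothetical.

```python
import math


def jitter_per_frame(frames):
    # frames: list of frames, each a list of (x, y) keypoints in pixels.
    # Returns the mean keypoint displacement between consecutive frames.
    jitter = []
    for prev, curr in zip(frames, frames[1:]):
        dists = [math.hypot(cx - px, cy - py)
                 for (px, py), (cx, cy) in zip(prev, curr)]
        jitter.append(sum(dists) / len(dists))
    return jitter


def unstable_frames(frames, threshold):
    # Indices of frames whose mean jump from the previous frame exceeds threshold.
    return [i + 1 for i, j in enumerate(jitter_per_frame(frames)) if j > threshold]


# Toy sequence: two keypoints, with a sudden jump at frame 2.
frames = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(1.0, 0.0), (10.0, 11.0)],
    [(31.0, 40.0), (40.0, 51.0)],
    [(31.0, 41.0), (41.0, 51.0)],
]
flagged = unstable_frames(frames, threshold=20.0)
```

A large jump can also be a genuinely fast gesture or a camera cut, so flagged frames are candidates for inspection rather than confirmed detection errors.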

@maherr13
Author

Many thanks, @ShenhanQian @Garfield-Finch.
