
Hello, author. Is it necessary to distinguish between training-set and test-set features when extracting multi-modal features? #25

Open
LiMxStar opened this issue Jan 4, 2022 · 1 comment


LiMxStar commented Jan 4, 2022

I noticed that all the features extracted by your code go into one place: the features for every video are written directly into a single HDF5 file, without distinguishing between the training set and the test set. I hope you can spare some of your valuable time to answer this question.

v-iashin (Owner) commented Jan 4, 2022

Hi, issue starter!

It is easier than you think. The features are extracted for the whole video, regardless of which dataset split the video belongs to. During training, we simply trim the feature stack according to the start and end timestamps of each segment. At test time, you can download the predictions of the proposal generator from BAFCG.
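If it helps, here is a minimal sketch of that trimming step, assuming the features are stored in an HDF5 file keyed by video id and that `feats_per_sec` matches the temporal resolution of your extractor (both names are illustrative, not the repo's exact API):

```python
import h5py
import numpy as np

def trim_features(h5_path, video_id, start_sec, end_sec, feats_per_sec):
    """Slice the whole-video feature stack down to one [start_sec, end_sec] segment."""
    with h5py.File(h5_path, 'r') as f:
        feats = f[video_id][...]  # (T, D): features for the entire video
    # Convert timestamps (seconds) into feature-frame indices.
    start_idx = int(np.floor(start_sec * feats_per_sec))
    end_idx = int(np.ceil(end_sec * feats_per_sec))
    return feats[start_idx:end_idx]  # (T_segment, D)

# e.g. segment = trim_features('audio_features.hdf5', 'v_abc123', 12.4, 20.1, feats_per_sec=2.0)
```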

Note, the code might not be adapted for the test set; only for the train and two validation sets.

How do we distinguish the modalities? Well, we have separate files for the audio, speech, and visual features; each of them is uni-modal.
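For example, loading them could look roughly like this (the file names below are placeholders for illustration; each modality lives in its own file):

```python
import h5py

# One HDF5 file per modality; these paths are made-up placeholders.
MODALITY_FILES = {
    'audio':  'audio_features.hdf5',
    'speech': 'speech_features.hdf5',
    'visual': 'visual_features.hdf5',
}

def load_modalities(video_id):
    """Return a dict of per-modality feature stacks for a single video."""
    feats = {}
    for name, path in MODALITY_FILES.items():
        with h5py.File(path, 'r') as f:
            feats[name] = f[video_id][...]
    return feats
```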
