After obtaining the final temporal feature representation X #39

Closed
DungVo1507 opened this issue Aug 22, 2021 · 4 comments

Comments

@DungVo1507

Thanks for viewing my issue, @tianyu0207
I have 4 questions that I hope you can explain:

  1. After obtaining X, have the snippets been divided into two groups, normal and abnormal?
  2. In the Select Top-k snippets stage, do you select k snippets in total across the normal and abnormal groups, or k snippets from each group?
  3. Assuming k = 3, if a video has fewer than 3 abnormal or normal snippets, how will RTFM choose?
  4. When the input is a normal video, how will the RTFM-enabled Snippet Classifier Learning stage classify it?
@tianyu0207
Owner

Hi,

  1. After obtaining the video, its snippets are divided into 32 segments. Each segment is represented by a 2048-dimensional feature vector. We don't change the order of the snippets.
  2. We select the snippets with the top-k feature magnitudes from each normal video and each abnormal video to obtain hard normals and pseudo abnormals.
  3. Assuming k = 3, if a video has fewer than 3 abnormal or normal snippets, RTFM will still choose the top 3. This may include some false snippets, but in our experiments we noticed that our approach is robust enough to handle this.
  4. Each batch has the same number of normal and abnormal videos. Hence, there is an equal number of samples from the two classes during the classifier learning stage.
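
The top-k selection in points 2 and 3 can be illustrated with a minimal sketch. It is not the repository's code: the helper names `feature_magnitudes` and `select_top_k` are hypothetical, the magnitude is assumed to be the L2 norm of each snippet's feature vector, and the toy video uses 8-dimensional features instead of 2048 to keep the example small.

```python
import math

def feature_magnitudes(video):
    """Return the L2 norm of each snippet's feature vector (assumed magnitude)."""
    return [math.sqrt(sum(x * x for x in snippet)) for snippet in video]

def select_top_k(video, k=3):
    """Return the indices of the k snippets with the largest magnitude,
    ordered from largest to smallest magnitude."""
    mags = feature_magnitudes(video)
    ranked = sorted(range(len(mags)), key=lambda i: mags[i], reverse=True)
    return ranked[:k]

# Toy video: 32 snippets, 8 features each (hypothetical values).
video = [[0.1] * 8 for _ in range(32)]
video[5] = [1.0] * 8   # the only clearly "abnormal-looking" snippet
video[20] = [0.5] * 8  # moderately large magnitude
video[7] = [0.2] * 8   # slightly above the background

# Top-k always returns k snippets, even if fewer than k are truly abnormal.
top3 = select_top_k(video, k=3)
print(top3)
```

In this toy video only snippet 5 stands out strongly, yet three indices are still returned, which is the situation described in point 3: the selection may include false snippets.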

@tianyu0207
Owner

  1. pseudo

Hard normals are the snippets that are similar to abnormal events. Pseudo abnormal means that some of the selected snippets may not actually be abnormal: we try to select abnormal instances from the abnormal bag, but there are no snippet-level labels. I don't quite understand your second question, sorry.

@DungVo1507
Author

Thank you so much @tianyu0207,
The second question means: you say each batch will have the same number of normal and abnormal videos, so the number of normal and abnormal videos in the dataset should be equal, right?
If each batch has the same number of normal and abnormal videos, is the drawing of how RTFM works that I attached below correct?
(attached image: RTFM)

I hope you will reply!
Appreciate your support!

@tianyu0207
Owner

Having the same number of normal and abnormal videos in each batch does not necessarily mean that the dataset contains an equal number of each. You just sample evenly for each batch.

Hi, I reckon your figure is correct.
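
The even per-batch sampling described above can be sketched as follows. This is only an illustration under stated assumptions, not the repository's data loader: `balanced_batches` and the video names are hypothetical, and each batch simply draws the same number of videos from each class regardless of how imbalanced the dataset is.

```python
import random

def balanced_batches(normal_videos, abnormal_videos, per_class, num_batches, seed=0):
    """Yield (normal, abnormal) pairs of lists with `per_class` videos each.

    Each class is sampled independently, so the two lists in a batch have
    equal length even when the dataset itself is imbalanced.
    """
    rng = random.Random(seed)
    for _ in range(num_batches):
        normal = rng.sample(normal_videos, per_class)
        abnormal = rng.sample(abnormal_videos, per_class)
        yield normal, abnormal

# Imbalanced toy dataset: 10 normal videos, 4 abnormal videos.
normal_videos = [f"normal_{i}" for i in range(10)]
abnormal_videos = [f"abnormal_{i}" for i in range(4)]

batches = list(balanced_batches(normal_videos, abnormal_videos,
                                per_class=2, num_batches=5))
for normal, abnormal in batches:
    print(normal, abnormal)
```

With this scheme the smaller class is simply revisited more often across batches, so the classifier sees the two classes in equal proportion during training.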
