After obtaining the final temporal feature representation X #39

DungVo1507 · 2021-08-22T15:40:43Z

Thanks for viewing my issue, @tianyu0207
I have 4 questions that I hope you can explain:

After obtaining X, the snippets have been divided into two groups, normal and abnormal, right?
In the Select Top-k snippets stage, do you select k snippets in total across the normal and abnormal groups, or k snippets from each group?
Assuming k = 3, if a video has fewer than 3 abnormal or normal snippets, which snippets will RTFM choose?
When the input is a normal video, how will the RTFM-enabled Snippet Classifier Learning stage classify it?

tianyu0207 · 2021-08-23T06:16:14Z

Hi,

Each video is divided into 32 segments (snippets), and each segment is a 2048-dimensional feature vector. We don't change the order of the snippets.
We select the top-k snippets by feature magnitude from each normal video and each abnormal video, which gives the hard normals and the pseudo abnormals, respectively.
Assuming k = 3, if a video has fewer than 3 abnormal or normal snippets, RTFM will still choose the top 3. This may include some false snippets, but in our experiments we found the approach robust enough to handle this.
Each batch has the same number of normal and abnormal videos, so there is an equal number of samples from the two classes during the classifier learning stage.
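
The top-k selection described in the answers above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the RTFM repository: the function name topk_by_magnitude, the use of the L2 norm as the "magnitude", and the toy 5 x 2 input (instead of 32 x 2048) are all assumptions made for the example.

```python
import math

def topk_by_magnitude(features, k=3):
    """Return (indices, mean_magnitude) of the k snippets with the largest L2 norm.

    features: list of T snippet feature vectors (each a list of floats).
    Hypothetical helper for illustration; not taken from the RTFM code base.
    """
    # Assumption: "magnitude" is the L2 norm of each snippet feature vector.
    magnitudes = [math.sqrt(sum(x * x for x in f)) for f in features]
    # Sort snippet indices by magnitude, largest first, and keep the first k.
    order = sorted(range(len(features)), key=lambda i: magnitudes[i], reverse=True)
    top = order[:k]
    mean_mag = sum(magnitudes[i] for i in top) / len(top)
    return top, mean_mag

# Toy video with 5 snippets of dimension 2 (the thread describes 32 x 2048).
video = [[0.1, 0.0], [3.0, 4.0], [0.0, 1.0], [6.0, 8.0], [0.3, 0.4]]
top_idx, mean_mag = topk_by_magnitude(video, k=3)
```

Note that the function always returns k snippets, whether or not k of them are truly abnormal. That matches the answer to the third question: with no snippet-level labels, the selection cannot know how many snippets are really abnormal, so some selected snippets may be false ones.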

tianyu0207 · 2021-08-23T12:26:14Z

pseudo

Hard normals are the normal snippets that look similar to abnormal events. Pseudo abnormal means that some of the selected snippets may not actually be abnormal, because we select the abnormal instances from the abnormal bag and there are no snippet-level labels. I don't quite understand your second question, sorry.

DungVo1507 · 2021-08-24T07:43:20Z

Thank you so much, @tianyu0207.
My second question is: you say each batch will have the same number of normal and abnormal videos, so should the number of normal and abnormal videos in the dataset be equal?
If each batch has the same number of normal and abnormal videos, is the drawing of how RTFM works that I attached below correct?

I hope you will reply!
Appreciate your support!

tianyu0207 · 2021-08-31T01:19:21Z

Each batch having the same number of normal and abnormal videos does not necessarily mean the dataset has an equal number of each. You just sample evenly for each batch.

Hi, I reckon your figure is correct.
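
To illustrate sampling evenly per batch from an unbalanced dataset, here is a minimal sketch in plain Python. It is not the data loader from the RTFM repository: the function balanced_batches, its parameters, and the video IDs are hypothetical, and drawing without replacement within a batch is an assumption made for the example.

```python
import random

def balanced_batches(normal_videos, abnormal_videos, per_class, num_batches, seed=0):
    """Yield (normal, abnormal) pairs with per_class videos from each class.

    Hypothetical helper: the two lists may have different lengths, yet every
    batch still contains the same number of normal and abnormal videos.
    """
    rng = random.Random(seed)
    for _ in range(num_batches):
        # Draw each class independently, without replacement within a batch.
        normal = rng.sample(normal_videos, per_class)
        abnormal = rng.sample(abnormal_videos, per_class)
        yield normal, abnormal

# Unbalanced toy dataset: 10 normal videos, 4 abnormal videos.
normal_ids = [f"normal_{i}" for i in range(10)]
abnormal_ids = [f"abnormal_{i}" for i in range(4)]
batches = list(balanced_batches(normal_ids, abnormal_ids, per_class=2, num_batches=3))
```

In this sketch, videos from the smaller class simply reappear across batches more often than videos from the larger class.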

tianyu0207 closed this as completed Aug 31, 2021
