
About step t = 1 #5

Closed
linshuheng6 opened this issue Jun 18, 2019 · 8 comments

Comments

@linshuheng6

linshuheng6 commented Jun 18, 2019

@jx-zhong-for-academic-purpose
Owner

jx-zhong-for-academic-purpose commented Jun 23, 2019

We directly use the video-level label as the supervision signal for each snippet. To be specific, you can refer to https://github.com/yjxiong/temporal-segment-networks/blob/master/data/ucf101_splits/trainlist01.txt to understand the input format. In fact, we make no modification to the implementations of TSN and C3D at the first step; therefore, we only briefly introduce the first step, and the detailed implementation is exactly the same as their original implementations.
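For concreteness, here is a minimal sketch (my own, not code from this repo) of what "video-level label as snippet-level supervision" means. The list-file format mirrors the TSN split file linked above, and the `num_snippets` default is an assumption:

```python
# Sketch only -- not the repo's code. Assumes a TSN-style split file with
# one video per line: "<video_path> <label>", where the label here is a
# binary anomaly flag (0 = normal video, 1 = anomalous video).

def video_to_snippet_labels(list_file, num_snippets=7):
    """Broadcast each video-level label to all of its snippets (step t = 1).

    Every snippet simply inherits its parent video's label, so snippets of
    anomalous videos may carry a noisy label 1 even when they are normal.
    """
    with open(list_file) as f:
        for line in f:
            path, label = line.split()
            for i in range(num_snippets):
                yield path, i, int(label)
```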

@jx-zhong-for-academic-purpose
Owner

You understand correctly.
As for the details:

  1. Not exactly the same as the original TSN; we mainly utilize hyper-parameters from its TPAMI 2018 version, https://arxiv.org/abs/1705.02953. As I recall, it should be 7 or 9 segments.
  2. Not the same: the input unit of TSN is 5 (or 10?) frames, while that of C3D is 16 frames. For short videos (e.g., UCSD Peds, with only about 100 frames) this matters more, and using the same number for both is not a good choice.
  3. We simply duplicate the snippet-level ground truth into frame-level scores, as the authors of UCF-Crime do in https://github.com/WaqasSultani/AnomalyDetectionCVPR2018/blob/master/Evaluate_Anomaly_Detector.m; a rough sketch of this duplication follows below.
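As an illustration of item 3, here is a hedged sketch (mine, not from either repo) of duplicating snippet-level scores over frames before computing a frame-level AUC, in the spirit of the linked `Evaluate_Anomaly_Detector.m`:

```python
import numpy as np

# Sketch only: evenly partition a video's frames among its snippets and
# copy each snippet-level score to every frame that snippet covers.
def snippet_to_frame_scores(snippet_scores, num_frames):
    snippet_scores = np.asarray(snippet_scores, dtype=float)
    idx = np.linspace(0, len(snippet_scores), num=num_frames,
                      endpoint=False).astype(int)
    return snippet_scores[idx]

# Example: 3 snippet scores spread over a 9-frame video.
print(snippet_to_frame_scores([0.1, 0.8, 0.2], 9))
# -> [0.1 0.1 0.1 0.8 0.8 0.8 0.2 0.2 0.2]
# The frame-level AUC is then computed against frame-level 0/1 ground
# truth, e.g. with sklearn.metrics.roc_auc_score(frame_gt, frame_scores).
```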

@poweryin

@jx-zhong-for-academic-purpose Hi, I have two questions.

  1. When t = 1, each snippet is input together with its corresponding video-level label; in other words, the snippets of normal videos come with label 0 and the snippets of abnormal videos come with label 1. However, I suspect that for the snippets of abnormal videos the video-level label 1 is not actually used (i.e., only the normal snippets of normal videos and their label 0 are used), because those labels would affect the parameter update of the classifier. I don't know if I understand this right. Looking forward to your reply.
  2. Is this pre-trained classifier a feature-extraction module or an anomaly-detection module, and what data is used for pre-training?
     Looking forward to your reply.
     Thanks
     Best wishes

@jx-zhong-for-academic-purpose
Owner

  1. All labels are taken into consideration, which may introduce predictive noise; how to clean that noise is one of the key contributions of this paper. (A toy sketch of this noisy step-1 supervision follows below.)
  2. We have elaborated on this in the experiment section of our paper.
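To make the "all labels, possibly noisy" point concrete, here is a toy PyTorch-style sketch (assumed names, not the repo's code) of one step-1 update in which every snippet is supervised by its video-level label:

```python
import torch
import torch.nn.functional as F

# Sketch only. `classifier` maps snippet features (B, D) to one logit each;
# `video_labels` are 0/1 labels copied from each snippet's parent video.
def step1_update(classifier, optimizer, snippet_feats, video_labels):
    logits = classifier(snippet_feats).squeeze(-1)          # (B,)
    # Snippets of anomalous videos may in fact be normal, so some of these
    # labels are noisy; cleaning them is the job of the GCN noise cleaner.
    loss = F.binary_cross_entropy_with_logits(logits, video_labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```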

@poweryin

Thank you very much for your patience. I'm very glad to receive your timely reply!

  1. That is, when t = 1, the snippets and their corresponding video-level labels are input into the classifier, and a rough probability estimate is obtained. When t >= 2, the video-level labels given in the first step are no longer used. Is that so?
  2. Sorry, I overlooked this detail. I thought that after pre-training the feature-extraction module, the classifier was also pre-trained on the UCF-Crime dataset with video-level labels. Now I see that there is no need to pre-train on UCF-Crime before t = 1.
     Looking forward to your reply.
     Thanks
     Best wishes

@jx-zhong-for-academic-purpose
Owner

Good Luck~

@jx-zhong-for-academic-purpose
Owner

  1. Yes, as shown in Fig. 1. (A toy sketch of the alternating loop follows below.)
  2. Pre-training can boost the performance, as many researchers have pointed out.
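For readers following along, here is a toy, self-contained sketch of the alternating scheme in Fig. 1 (every name and both stand-in models are hypothetical, not this repo's API): t = 1 trains on duplicated video-level labels, and each later round retrains on labels refined by the cleaner.

```python
import numpy as np

rng = np.random.default_rng(0)
snippet_feats = rng.normal(size=(100, 16))            # fake snippet features
labels = rng.integers(0, 2, size=100).astype(float)   # duplicated video labels

def train_and_score(feats, labels):
    """Stand-in for training the action classifier and scoring snippets."""
    w = np.linalg.lstsq(feats, labels, rcond=None)[0]  # toy linear fit
    return feats @ w

def clean(scores):
    """Stand-in for the GCN label-noise cleaner: re-threshold the scores."""
    return (scores > scores.mean()).astype(float)

for t in range(1, 4):
    scores = train_and_score(snippet_feats, labels)    # step-t classifier
    labels = clean(scores)                             # refined labels for t+1
```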

@poweryin

Thank you for your patience. It resolved my confusion; thank you very much.
Best wishes
