How do you make predictions for videos? #7

avijit9 · 2020-02-14T11:31:34Z

How do you make a prediction for a video? What threshold do you usually choose? I am referring to the following passage in the paper:

After training the 3C-Net, the CLS module (see Fig. 2
and Eq. 2) is used to compute the action-class scores (pmf)
at the video-level using the final T-CAM, for the action classification task


naraysa · 2020-02-15T09:21:37Z

There is no threshold. Once the net has computed the final T-CAM (t x num_class), we apply top-k pooling over time to get a k x num_class matrix, which is then averaged temporally. The resulting vector of size num_class is passed through a softmax to obtain the class-wise scores of the video.
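
For readers following along, here is a minimal sketch of the steps described above. It is not the repository's code: the function names, the toy T-CAM, and the value of `k` are assumptions made for illustration, and it reads "top-k pooling over time" as selecting the k largest activations of each class independently. Plain Python lists are used to keep it dependency-free; in practice this would be a tensor operation.

```python
import math

def topk_mean_per_class(tcam, k):
    """Average the k largest temporal activations of each class.

    tcam: list of t rows (time steps), each a list of num_class floats.
    Returns a list of num_class floats.
    """
    num_class = len(tcam[0])
    k = min(k, len(tcam))  # guard for videos with fewer than k time steps
    means = []
    for c in range(num_class):
        # Sort this class's activations over time, largest first.
        column = sorted((row[c] for row in tcam), reverse=True)
        means.append(sum(column[:k]) / k)
    return means

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def video_scores(tcam, k):
    """Class-wise scores of a video: softmax of the top-k temporal means."""
    return softmax(topk_mean_per_class(tcam, k))

# Toy T-CAM with t = 4 time steps and num_class = 3 (made-up values).
tcam = [
    [2.0, -1.0, 0.0],
    [4.0, -3.0, 0.0],
    [0.0, -2.0, 0.0],
    [-2.0, -4.0, 0.0],
]
scores = video_scores(tcam, k=2)  # k = 2 is an arbitrary choice here
```

The softmax output sums to 1, so it ranks classes against each other but does not by itself say how many classes are present.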

avijit9 · 2020-02-15T10:50:35Z

And then how do you find out which classes are present in the video from the scores?

naraysa · 2020-02-16T09:19:14Z

The softmax is only used for the mAP computation. To find the classes present, we do not apply the softmax described above. Instead, we take every label whose top-k mean is greater than 0 as a category present in the video.
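
A minimal sketch of this rule, under the same caveats as any illustration: the function name, the toy values, and `k` are assumptions rather than the repository's code, and the top-k mean is taken per class over the time axis.

```python
def present_classes(tcam, k):
    """Return indices of classes whose top-k temporal mean is > 0.

    tcam: list of t rows (time steps), each a list of num_class floats.
    No softmax is applied; the raw top-k means are compared against 0.
    """
    k = min(k, len(tcam))  # guard for videos with fewer than k time steps
    present = []
    for c in range(len(tcam[0])):
        # k largest activations of class c over time.
        top = sorted((row[c] for row in tcam), reverse=True)[:k]
        if sum(top) / k > 0:
            present.append(c)
    return present

# Toy T-CAM with t = 4 time steps and num_class = 3 (made-up values).
tcam = [
    [2.0, -1.0, 0.0],
    [4.0, -3.0, 0.0],
    [0.0, -2.0, 0.0],
    [-2.0, -4.0, 0.0],
]
detected = present_classes(tcam, k=2)  # k = 2 is an arbitrary choice here
```

Because the comparison is strict, a class whose top-k mean is exactly 0 is not reported as present. Several classes can pass the test at once, and none may pass for a given video.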
