
UCF101 Training from Scratch #11

Closed
ardasnck opened this issue Dec 9, 2016 · 12 comments

@ardasnck commented Dec 9, 2016

Thank you very much for your work on C3D.

Would it be possible to provide some information about training on UCF101 from scratch instead of fine-tuning? It would be very helpful to have a graph, or at least some numbers, showing the test accuracy/loss at each epoch so that we can compare our ongoing training.

Thanks.

@hx173149 (Owner)

Hi @ardasnck
I have been a bit busy these days; I think I can do the evaluation next week.

@ardasnck (Author)

@hx173149 Sure! I can't reproduce the paper's results with my own TensorFlow implementation, so if you can get similar results after your evaluation, it would be great to add your train-from-scratch implementation to this repository.

@hx173149 (Owner)

@ardasnck If you want to match the paper's accuracy, you have to fine-tune from Sports-1M; the paper says as much. You can also refer to issue #2; I tried it myself, and without fine-tuning I only got 33% accuracy.
Cheers

@ardasnck (Author) commented Dec 13, 2016

@hx173149 Yeah, I know issue #2 and have also read the official C3D documentation and the paper's discussion of fine-tuning. But my question is specifically about training from scratch (not fine-tuning). I actually got 40% accuracy when training from scratch, while you mentioned you only reached 33%. This https://docs.google.com/document/d/1-QqZ3JHd76JfimY4QKqOojcEaf5g3JS0lNh-FHTxLag states that they reached 45%, so I was wondering what could explain the difference. Another observation: the loss value in the TensorFlow implementation is clearly higher than in the Caffe implementation during training...
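One factor worth ruling out when comparing loss values (an assumption, not something confirmed in this thread): a TensorFlow total loss that includes an L2 weight-decay term will naturally read higher than Caffe's plain cross-entropy. A minimal TF1-style sketch of logging the two terms separately, where `logits` and `labels` are illustrative placeholders rather than this repo's actual variables:

```python
import tensorflow as tf

# Illustrative placeholders: network output over UCF101's 101 classes,
# and ground-truth class indices.
logits = tf.placeholder(tf.float32, [None, 101])
labels = tf.placeholder(tf.int64, [None])

cross_entropy = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
# Assumes the model registered its L2 losses in the standard collection.
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
total_loss = cross_entropy + (tf.add_n(reg_losses) if reg_losses else 0.0)

tf.summary.scalar("cross_entropy", cross_entropy)  # compare this curve to Caffe
tf.summary.scalar("total_loss", total_loss)
```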

@hx173149 (Owner)

Hi @ardasnck, I should have some free time in the next few days; I will reproduce my result once more... Have you ever tried the Caffe version of the code? Can it reach 45% accuracy when training from scratch? I am curious about this too.
PS: I can't open the URL you mentioned above.
Cheers

@ardasnck (Author) commented Dec 20, 2016

Hi @hx173149. I updated the link once again, but I'm not sure what's happening with it...
As for training from scratch: yes, I ran the Caffe version of the code on my machine and got 42.88% accuracy (note that I used batch size 16 because of my GPU capacity). I also edited my own TensorFlow implementation (some minor changes) and got 42.64%. I believe this shows it works as it should.
PS: In case the link doesn't work again, I was referring to the C3D User Guide document that the author provides on his project page.

@hx173149 (Owner)

Hi @ardasnck
There are 13318 videos in the UCF101 dataset. I used 11318 videos for training and 2000 for testing, and I get 50% top-1 accuracy after 8000 iterations with a batch size of 64.
Here are my train-from-scratch top-1 accuracy, cross-entropy, and total loss (cross entropy + regularization loss) curves:
[Images: top-1 accuracy curve, cross-entropy curve, total-loss curve]
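For reference, a minimal sketch of how such a random 11318/2000 split by whole video could be produced. `all_videos.txt` and the output filenames are assumptions for illustration, not necessarily how the lists in this repo were built:

```python
import random

# Splitting by whole video (not by clip) keeps train and test disjoint.
random.seed(0)  # fixed seed so the split is reproducible

with open("all_videos.txt") as f:
    videos = [line.strip() for line in f if line.strip()]  # expect 13318 paths

random.shuffle(videos)
test_videos, train_videos = videos[:2000], videos[2000:]   # 2000 / 11318

with open("test.list", "w") as f:
    f.write("\n".join(test_videos) + "\n")
with open("train.list", "w") as f:
    f.write("\n".join(train_videos) + "\n")
```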

@ardasnck (Author)

Dear @hx173149,
Thank you very much for the detailed feedback. It's great that you reached 50% top-1 accuracy. Did you use the same train/test split as the original Caffe implementation? The paper claims 45% accuracy, and when I ran their code on my own machine (batch size 16) I got 42.9%.

@gy2256 commented Feb 1, 2017

Hello,

I also want to train from scratch, but I am fairly new to deep learning, especially 3D ConvNets. Could you briefly explain the training mechanism? My understanding is that you feed in 16 frames as input together with a label to perform supervised learning. But do you use all of the frames for training? I would really appreciate it if you could briefly explain the whole data preparation and training process.

(I am trying to rewrite everything in Keras. So far I have defined the nets, but I do not know how to prepare the video data.)

@hx173149 (Owner) commented Feb 6, 2017

Hello @gyang1011
My training mechanism is like this:
First, I choose 64 samples randomly for each iteration.
Then I slice a 3.2-second clip (about 16 frames) randomly from each sample for training.
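A minimal sketch of that sampling scheme, assuming each video has already been decoded into a `(num_frames, H, W, 3)` array with at least 16 frames; `load_video`, `video_paths`, and `labels` are placeholders, not this repo's actual API:

```python
import random
import numpy as np

CLIP_LEN = 16    # frames per clip
BATCH_SIZE = 64  # videos sampled per iteration

def sample_clip(frames):
    """Slice one random contiguous 16-frame clip from a decoded video."""
    start = random.randint(0, frames.shape[0] - CLIP_LEN)
    return frames[start:start + CLIP_LEN]

def next_batch(video_paths, labels, load_video):
    """Pick 64 videos at random, then one random clip from each."""
    idx = random.sample(range(len(video_paths)), BATCH_SIZE)
    clips = np.stack([sample_clip(load_video(video_paths[i])) for i in idx])
    return clips, np.array([labels[i] for i in idx])  # clips: (64, 16, H, W, 3)
```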

@LongLong-Jing

@ardasnck @hx173149 @gyang1011
I trained this network and got 33% on split 1 of UCF101, and I think 33% is in fact what this 8-conv-layer network should get when trained from scratch. In the C3D paper, the authors use a 5-conv-layer network (not the 8-conv-layer one) for their train-from-scratch experiments, which is how they get 45% on UCF101. This means the network trained from scratch and the one pre-trained on Sports-1M have different architectures!

@hx173149 (Owner)

@LongLong-Jing I think you are right. Or maybe there are some duplicate samples between my train list and test list; I am not very sure.
