frame-level labeling and feature extraction output #12
I find out that the …
@HYPJUDY …
Hi @buaa-luzhi , I labeled the training set according to #2, but I haven't reproduced the results reported in the paper, so I am not sure whether I missed something.
Hi @HYPJUDY , …
Hi @buaa-luzhi , you can find the annotations on the THUMOS14 website. For example, TH14_Temporal_annotations_test contains the temporal annotations of the action instances in the test videos of the 20 classes.
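For reference, each line in those annotation files pairs a video name with the start and end time (in seconds) of one action instance. The line below is illustrative (the times are made up), but the spacing matches what the labeling script later in this thread parses:

```
video_test_0000004  12.5 18.2
```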
Hi @HYPJUDY , in Ambiguous_val.txt, are all the video frame labels assigned 0? Thanks for your reply, you are very warm-hearted!
Hi @buaa-luzhi , let me illustrate the class labeling by sharing my one-hot labeling code for the THUMOS14 test data (all 213 test videos that are not entirely background videos):

### gen_oneshot_label_for_test_action_video.py ###
import os
import numpy as np
import math

# background (#0) + thumos 20 classes (#1 to #20) + ambiguous class (#21)
classname = ['BaseballPitch', 'BasketballDunk', 'Billiards', 'CleanAndJerk', 'CliffDiving', 'CricketBowling',
             'CricketShot', 'Diving', 'FrisbeeCatch', 'GolfSwing', 'HammerThrow', 'HighJump', 'JavelinThrow', 'LongJump',
             'PoleVault', 'Shotput', 'SoccerPenalty', 'TennisSwing', 'ThrowDiscus', 'VolleyballSpiking', 'Ambiguous']
classlabel = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
filepath = os.path.expanduser('~/TH14_Temporal_Annotations_Test/annotations/annotation/')
inputdir = os.path.expanduser('~/THUMOS14_frame/test/')  # a list of folders; each folder contains all frames for one video
sr = 25  # frame rate (frames per second) used when extracting frames

action_video_dict = {}
totalframe = 0
for v in sorted(os.listdir(inputdir)):
    imglist = sorted(os.listdir(os.path.join(inputdir, v)))
    print(v, len(imglist))
    totalframe = totalframe + len(imglist)
    vlabel = np.zeros(len(imglist))  # init: all frames are background (0)
    action_video_dict[v] = vlabel
print('total frame number:', totalframe)

'''
The CDC authors used the following method for labeling:
the ground truth used during testing for evaluation is a multi-label over 21 classes,
but during training we simply use one-hot labels and only treat frames (that belong
to Diving but not to CliffDiving) as Diving frames.
During prediction, all frames predicted as CliffDiving will also be set as Diving
to form the multi-label prediction.
In one-hot labeling, we therefore need to assign the Diving class first and then CliffDiving.
In this order, the CliffDiving labels will overwrite part of the Diving labels.
Otherwise, all CliffDiving labels would be overwritten by Diving labels.
'''
# for i in range(21):  # wrong order: Diving would overwrite CliffDiving
for i in range(20, -1, -1):
    with open(filepath + classname[i] + '_test.txt', 'r') as f:
        lines = f.readlines()
    for line in lines:
        # annotation line: '<video_name>  <start_sec> <end_sec>'; split(' ') keeps
        # the empty token from the double space after the video name, so the
        # start/end times land at indices 2 and 3
        line = line.strip().split(' ')
        s = int(float(line[2]) * sr)
        e = int(math.ceil(float(line[3]) * sr))  # e.g. 27.3 -> 28
        v_len = len(action_video_dict[line[0]])
        if e > v_len:  # some THUMOS14 annotations are wrong!
            print(line[0], v_len, s, e, classname[i], line[2], line[3])
            continue  # ignore or manually remove the wrong annotations
        for j in range(s, e + 1):  # frames are numbered from #1
            if j == 0:
                j = 1
            action_video_dict[line[0]][j - 1] = classlabel[i]  # but the array index starts from #0

with open('oneshot_label_test_action_video.lst', 'w') as fw:
    for v in sorted(action_video_dict.keys()):
        fw.write(v)
        for i in range(len(action_video_dict[v])):
            fw.write(" %d" % action_video_dict[v][i])
        fw.write('\n')

This code will generate oneshot_label_test_action_video.lst.
Each line contains the video name followed by one label per frame (0 = background, 1-20 = action classes, 21 = Ambiguous); for example, the first line gives the per-frame labels of the first test video.
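If you want to sanity-check the generated file, here is a minimal sketch that reads it back into a dict; parse_label_lst is my own hypothetical helper, not part of the repo:

```python
import numpy as np

def parse_label_lst(path):
    # each line: '<video_name> <label_frame_1> <label_frame_2> ...'
    labels = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            labels[parts[0]] = np.array(parts[1:], dtype=int)
    return labels

labels = parse_label_lst('oneshot_label_test_action_video.lst')
for v in sorted(labels):
    print(v, len(labels[v]), 'frames,', int((labels[v] > 0).sum()), 'action frames')
```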
@HYPJUDY Hi, have you reproduced the performance in the paper using the model convdeconv-TH14_iter_24390?
Hi @sunnyxiaohu, yes, I have.
@HYPJUDY Hi, how many segments (*.bin) have you generated for test.lst? Officially there should be 36182, but I only get 36177.
Hi @sunnyxiaohu , I got exactly the same number (36177) as you. It's ok.
@HYPJUDY I also get 36177 segments. I found that my extracted frames are always fewer than the author's, by exactly one frame: for example, for test_video_0000004 the author has 846 frames while I only have 845. By the way, can you reproduce the THUMOS14 results from the paper? My result is 2% lower. I don't know why; could you give me a suggestion?
Hi @gss-ucas , that's normal. I got a lower temporal action localization mAP but a higher per-frame labeling mAP. Following are my results FYI: [results posted as an image, not preserved here]
Hi @HYPJUDY , it seems that it's easy to get "nan" when extracting probabilities and training. Have you also met this problem? If yes, how did you solve it?
Hi @sunnyxiaohu , that's true. I met "nan" when training with action windows (at least one frame belonging to an action) and too many ambiguous labels (21). I don't know why (maybe a zero division is not handled in the code?), but testing with a snapshot from an iteration before the "nan" appears can still give results (with a lower mAP, of course).
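I can't confirm where the CDC Caffe code might divide by zero, but for anyone debugging, here is a minimal numpy sketch of the two usual NaN sources in a per-frame softmax cross-entropy (exp() overflow and log(0)) and the standard guards. It illustrates the failure mode rather than reproducing the repo's loss layer:

```python
import numpy as np

def safe_frame_xent(logits, labels, eps=1e-12):
    # logits: (num_frames, num_classes) raw scores
    # labels: (num_frames,) integer class ids
    shifted = logits - logits.max(axis=1, keepdims=True)  # guard 1: avoid exp() overflow
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    p_true = probs[np.arange(len(labels)), labels]
    p_true = np.clip(p_true, eps, 1.0)  # guard 2: avoid log(0) = -inf
    return -np.log(p_true).mean()

# example: 4 frames, 22 classes (background 0, actions 1-20, ambiguous 21)
rng = np.random.default_rng(0)
print(safe_frame_xent(rng.normal(size=(4, 22)), np.array([0, 21, 21, 8])))
```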
Hi @HYPJUDY , I am confused about how to make the training set. Referring to gen_test_bin_and_list.py, I kept the bin files that contain at least 1 action frame and got 11906 bins. In the test results I found that the frame number of each action is imbalanced. Can you share your code for making the training set?
Hi @qioooo , you can try generating bins without the limitation of "at least 1 action frame". I got 32746 bins for the THUMOS14 validation set. When I kept only the action windows (at least 1 action frame), I got 11894 bins and the performance was worse. The code is similar to the code I commented on Sep 14 in this issue. I didn't count the frame number of each action.
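To make the "no limitation" variant concrete, here is a hedged sketch of the windowing logic only; the 32-frame window length follows the CDC paper, and the actual .bin serialization done by gen_test_bin_and_list.py is not reproduced here:

```python
import numpy as np

def gen_windows(frame_label_dict, window=32):
    # frame_label_dict: {video_name: per-frame label array},
    # e.g. the action_video_dict built by the labeling script above
    windows = []
    for v in sorted(frame_label_dict):
        labels = frame_label_dict[v]
        for k in range(len(labels) // window):  # drop the incomplete tail window
            s = k * window
            windows.append((v, s, s + window, labels[s:s + window]))  # keep background windows too
    return windows

demo = {'video_demo': np.array([0] * 40 + [8] * 30 + [0] * 30)}
wins = gen_windows(demo)
print(len(wins), 'windows,', sum((w[3] > 0).any() for w in wins), 'contain action frames')
```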
Hi, is there any backup for the claim that the author uses one-hot encoding for the training data?
Hi @HYPJUDY , I tried to reproduce the THUMOS14 results, but when I ran step1_gen_test_metadata.m I found that the names of my training videos are invalid, like v_ApplyEyeMakeup_g01_c01, where 'c01' can't be converted to a number. I realized I may have downloaded the wrong dataset. Could you tell me where you downloaded the dataset? Thanks in advance!
Hi @XiongChengxin , I just downloaded the dataset from the official website: http://crcv.ucf.edu/THUMOS14/download.html
@HYPJUDY You mean Test Data (1574 untrimmed videos) -- ground truth made available after the competition?
@XiongChengxin I downloaded all the THUMOS14 data from this website. I think the video 'v_ApplyEyeMakeup_g01_c01' is from the THUMOS14 training dataset (UCF101). You can find it from here.
@HYPJUDY Okay, thank you very much!
@HYPJUDY I am sorry to bother you again... I found that unzipping the test data needs a password. Do you know the password?
Please follow the instructions on the website.
@HYPJUDY Thanks!
Hi @HYPJUDY , when I tried to train my own models, I found that I need to provide two files for train.prototxt: one is a mean file (I use the one that the author provided) and the other is a list file. So I generated a list file using gen_test_bin_and_list.py, and its content looks like /home/xiongcx/C/dataset/video_bin/window/background/video_background_0000001/000001.bin. You may notice that I didn't provide labels, but the strange thing is that the model can still compute the loss. The content of log.train-val is as follows: I0405 11:39:05.419764 26132 solver.cpp:110] Iteration 0, Testing net. I guess the label may have a default value. I wonder what the right format of the .lst file is; in other words, should I use the oneshot_label_test_action_video.lst generated by the code you provided to replace mine? If so, how could the model read the videos without the bin list file?
@HYPJUDY , I have a new finding... The labels are stored in the bin files, so the one-hot labels generated by the code you provided should be used to replace the 'vlabel' in gen_test_bin_and_list.py. I am going to try this!
@XiongChengxin Hi, I want to know whether you have trained the CDC model on THUMOS14. I tried to retrain the model, but the training loss curve seemed weird, and the test results of my trained model are not right.
Hello, …
Hi @kstys , maybe you can check gen_test_bin_and_list.py. The function os.listdir in Python returns the directory entries in arbitrary order. This breaks the sequence of frames during testing and produces wrong ground truth during evaluation. Use sorted(os.listdir()) instead; this works for me. I don't know if this counts as a bug, since it depends on the system, but it would be nice if the author could update the code to handle it. @zhengshou
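To make the fix concrete (the path below is a placeholder):

```python
import os

frame_dir = 'THUMOS14_frame/test/video_test_0000004'  # placeholder path
# os.listdir() order depends on the filesystem, so sort explicitly;
# zero-padded names like 000001.jpg sort correctly as strings
frames = sorted(os.listdir(frame_dir))
# if the frame names were not zero-padded, sort numerically instead:
# frames = sorted(os.listdir(frame_dir), key=lambda f: int(os.path.splitext(f)[0]))
```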
Thank you @shanshuo for pointing this out. I did not encounter such an issue of arbitrary order before, but it is definitely nice to make the code suitable for different systems. I will update the code according to your suggestion. Thanks.
@shanshuo Thanks a lot. I have changed os.listdir to sorted(os.listdir), but the result is the same as before. Any other suggestions? Could you send me a link to the code you ran successfully?
I have solved the problem: I found that the prefix.lst file has 31682 items, which doesn't match the dataset I generated. Thanks a lot. @shanshuo
@shanshuo Hi, I wonder if you have solved the training problem. I haven't trained the model on THUMOS14, but when I train it on my own dataset the loss doesn't seem to converge either. Does the batch_size matter? Now I am trying some other learning rates to see if that works. Can you give me some suggestions? Thank you!
@uxtl No, the problem still exists. I also used the provided model to finetune on my dataset and the loss didn't converge. I don't know whether the code, the parameters, or the algorithm is unsuitable. That's why I want to train the CDC model on THUMOS14 first: if I get a correctly trained model, I can rule out a code issue, and then maybe change the parameters. I suggest you train the CDC model first; it's worth doing. It's also a good idea to compare your dataset's loss curve with THUMOS14's. We can discuss it further through email: s.chen3@uva.nl
@shanshuo I am sorry for the late reply... I haven't trained my own model; I just finetuned the provided model on my dataset and used it to make predictions.
@shanshuo I have the same problem as you. I also reduced the batch size from 8 to 4 due to memory limitations. Have you solved the problem yet?
@Stingsl No, I haven't.
@shanshuo By the way, did you change the labels into one-hot format? I think it might be some experimental detail that I did not notice.
@HYPJUDY Hi, I found that my extracted frames are always fewer than the author's, and my total number of test frames is not equal to the author's, so I can't run the MATLAB script compute_framelevel_mAP.m to get the per-frame labeling mAP: the dimension of multi-label-test.mat is not the same as mine. How did you make the right multi-label-test.mat file? By the way, my mAP@0.5 for temporal localization is only 0.168890. Thank you!
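Not an answer from this thread, but one possible workaround sketch: rebuild the per-frame ground truth at your own frame counts instead of using the shipped multi-label-test.mat. Since the mismatch is a single trailing frame per video, truncating (or padding with background) should align the dimensions; align_labels is a hypothetical helper:

```python
import numpy as np

def align_labels(gt_labels, my_frame_counts):
    # gt_labels: {video: label array at the author's frame count}
    # my_frame_counts: {video: locally extracted frame count}
    aligned = {}
    for v, lab in gt_labels.items():
        n = my_frame_counts[v]
        if len(lab) >= n:
            aligned[v] = lab[:n]  # drop the extra trailing frame(s)
        else:
            pad = np.zeros(n - len(lab), dtype=lab.dtype)  # pad as background
            aligned[v] = np.concatenate([lab, pad])
    return aligned
```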
Hi, when I run ./xfeat.sh, I meet the following problems relating to feat. As you have said, the output results will be stored in feat, but demo/feat … THUMOS/test ran successfully, but there's no output relating to feat. Thanks!