I tried to repeat your experiment on THUMOS14. I downloaded the THUMOS14 test set and used part of the code from the C3D project to extract frames from the 213 videos in the test set. I got 1351825 frames in total, which differs from the number of frames you extracted (around 1157824 according to your postprocess code). I then used your Python script to generate 42347 bin files, while yours produced 36182, so I changed the number of mini-batches to 10567 and output 42347 features.
I generated my own per-frame ground truth labels and ran your postprocess code, and finally got 0.1426 mAP. The probabilities look poor: most frames have a high probability for background, and the remaining frames do not have a high enough probability for any action even when the background probability is low. Could you see what might be the problem? The code I used to extract frames is attached below.
import os
import sys

import cv2


def get_action_video_id(input_dir):
    ''' Collect the unique video ids listed in the annotation files '''
    filenames = []
    for root, dirs, files in os.walk(input_dir):
        for i in files:
            with open(os.path.join(input_dir, i)) as f:
                for line in f:
                    # the video id is the first 18 characters, e.g. "video_test_0000004"
                    if line[:18] not in filenames:
                        filenames.append(line[:18])
    return filenames


def get_frame_count(video):
    ''' Get frame count and FPS for a video '''
    cap = cv2.VideoCapture(video)
    if not cap.isOpened():
        print("[Error] video={} can not be opened.".format(video))
        sys.exit(-6)
    # get frame count
    num_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    # in case fps is not available (0 or NaN), use a default of 29.97
    if not fps or fps != fps:
        fps = 29.97
    return num_frames, fps


def extract_frames(video, start_frame, frame_dir, num_frames_to_extract=16):
    ''' Extract frames from a video using OpenCV '''
    # check output directory
    if os.path.isdir(frame_dir):
        print("[Warning] frame_dir={} already exists. Will overwrite.".format(frame_dir))
    else:
        os.makedirs(frame_dir)
    cap = cv2.VideoCapture(video)
    if not cap.isOpened():
        print("[Error] video={} can not be opened.".format(video))
        sys.exit(-6)
    # move to start_frame
    cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)
    # grab each frame and save it as a jpg numbered by its frame index
    for frame_count in range(num_frames_to_extract):
        frame_num = frame_count + start_frame
        print("[Info] Extracting frame num={}".format(frame_num))
        ret, frame = cap.read()
        if not ret:
            print("[Error] Frame extraction was not successful")
            sys.exit(-7)
        frame_file = os.path.join(frame_dir, '{0:06d}.jpg'.format(frame_num))
        cv2.imwrite(frame_file, frame)


def main():
    input_annotations_dir = '/home/rusu5516/TH14_Temporal_Annotations_Test/annotations/annotation/'
    filenames = get_action_video_id(input_annotations_dir)
    input_videos_dir = '/home/rusu5516/TH14_test_set_mp4/'
    frames_root = os.path.join(input_videos_dir, 'all_frames_pervideo')
    for file in filenames:
        frame_dir = os.path.join(frames_root, file)
        # skip videos whose frames have already been extracted
        if os.path.isdir(frame_dir):
            continue
        video_path = os.path.join(input_videos_dir, file + '.mp4')
        num_frames, fps = get_frame_count(video_path)
        extract_frames(video_path, 0, frame_dir, num_frames)


if __name__ == '__main__':
    main()
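In case it helps with debugging the frame-count mismatch, here is a rough sketch (not part of the script above, and the helper name compare_frame_totals is mine) that totals the native frame count per video and what a fixed 25 fps resampling would give instead; it assumes all test videos sit directly under input_videos_dir as .mp4 files.

import os
import cv2

def compare_frame_totals(input_videos_dir):
    ''' Sum native frame counts vs. counts after resampling to a fixed 25 fps. '''
    native_total, resampled_total = 0, 0
    for fname in sorted(os.listdir(input_videos_dir)):
        if not fname.endswith('.mp4'):
            continue
        cap = cv2.VideoCapture(os.path.join(input_videos_dir, fname))
        if not cap.isOpened():
            continue
        num_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        fps = cap.get(cv2.CAP_PROP_FPS) or 29.97
        native_total += num_frames
        # frames you would get if the video were decoded at a fixed 25 fps
        resampled_total += int(round(num_frames / fps * 25.0))
        cap.release()
    print("native total: {}, 25 fps total: {}".format(native_total, resampled_total))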
I used FPS=25 for all test videos, although it should not affect performance much.
As for frame extraction, I used ffmpeg instead of cv2 to extract frames in PNG format. You can refer to my previous scnn demo code to learn more about this.
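For reference, a minimal sketch of extracting PNG frames at a fixed 25 fps by calling ffmpeg from Python; the output layout (frame_dir/%06d.png) and the function name are only assumptions here, and the exact command used in the scnn demo may differ.

import os
import subprocess

def extract_frames_ffmpeg(video_path, frame_dir, fps=25):
    ''' Decode a video with ffmpeg, resampled to a fixed fps, into PNG frames. '''
    os.makedirs(frame_dir, exist_ok=True)
    cmd = [
        'ffmpeg', '-i', video_path,
        '-vf', 'fps={}'.format(fps),          # resample to a fixed frame rate
        os.path.join(frame_dir, '%06d.png'),  # 000001.png, 000002.png, ...
    ]
    subprocess.check_call(cmd)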