This repository has been archived by the owner on Jun 15, 2023. It is now read-only.

question about use pre-train model on my own video #32

Closed
pandababyer opened this issue Aug 4, 2020 · 11 comments


@pandababyer

Hello, thanks for sharing this great work; it is very helpful! I am using your pre-trained model to generate descriptions for my own video. I used the code you provide to extract the features (_segment.npy, anet_detection_vg_fc6_feat_100rois.h5, bn.npy, and resnet.npy). However, when I use them to generate the caption, I get 'TypeError: only size-1 arrays can be converted to Python scalars'. I found that the cause is a difference between the anet_detection_vg_fc6_feat_100rois.h5 you provide and the anet_detection_vg_fc6_feat_100rois.h5 file I generated with the code in detectron-vlp: the dimensions of dets_num, dets_labels, and other datasets produced by detectron-vlp differ from those in the .h5 file you provide. https://github.com/LuoweiZhou/detectron-vlp/blob/b9140d298538703205fd2c0421b06c4b40e00018/tools/extract_features_gvd_anet.py#L221
Looking forward to your reply. Thanks!

@LuoweiZhou
Contributor

@pandababyer Thanks for your interest in our work. Could you check if you have dic_anet.json set up appropriately?
https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L136
https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L207
Also, the error is related to NumPy format so you may want to check if the code raises any exception when feeding in one video (in your case).

@pandababyer
Author

@LuoweiZhou Thanks for your reply. Maybe my question was not clear. When I use the code in detectron-vlp to extract features for one video, I get dets_num = np.zeros((1, 10)). But line 184 of dataloader_anet.py in GVD, num_proposal = int(self.num_proposals[ix]), then raises "TypeError: only size-1 arrays can be converted to Python scalars".
This confused me a lot. Thanks for your time!
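The failure described above can be reproduced in isolation. The following is a minimal sketch, assuming only the shape reported in the comment (one video, 10 frames); `read_num_proposal` is a hypothetical stand-in for the data loader line, not the repository's code. The exact wording of the exception message can vary between NumPy versions.

```python
import numpy as np

# Shape reported in the thread: one video, 10 sampled frames.
dets_num = np.zeros((1, 10))

def read_num_proposal(num_proposals, ix):
    """Hypothetical stand-in for `int(self.num_proposals[ix])` in the data loader."""
    return int(num_proposals[ix])

try:
    # dets_num[0] is a length-10 array, so int() cannot convert it to a scalar.
    read_num_proposal(dets_num, 0)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The data loader expects one number per video, so indexing by `ix` must yield a scalar rather than a per-frame row.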

@LuoweiZhou
Contributor

@pandababyer It turned out there was a minor bug in the feature extraction file:
f.create_dataset("dets_num", data=dets_num) -> f.create_dataset("dets_num", data=dets_num.sum(axis=-1))
f.create_dataset("nms_num", data=nms_num) -> f.create_dataset("nms_num", data=nms_num.sum(axis=-1))
I have made the fix. Thank you for your feedback!
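To see why summing over the last axis resolves the TypeError, here is a small NumPy-only sketch. The sizes (1 video, 10 frames, 100 detections per frame) are assumptions for illustration, and the HDF5 write itself is left out.

```python
import numpy as np

N, fpv = 1, 10  # assumed: 1 video, 10 frames per video

# Hypothetical per-frame detection counts: 100 detections in every frame.
dets_num = np.full((N, fpv), 100.0)

# Summing over the frame axis leaves one total per video, shape (N,).
per_video = dets_num.sum(axis=-1)

# Indexing by video now yields a scalar, so int() succeeds.
num_proposal = int(per_video[0])
print(per_video.shape, num_proposal)
```

With these assumed sizes the per-video total equals fpv * 100, which is the kind of single value the data loader reads with `int(self.num_proposals[ix])`.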

@pandababyer
Author

@LuoweiZhou Thanks for your quick reply; the problem is solved! My last question is about the environment configuration for the anet2016-cuhk-feature project. I tried Ubuntu 16 and Ubuntu 14 but always get the error: Cannot use GPU in CPU-only Caffe: check mode. The ResNet feature output has shape (n,2048), which is right, but the bn feature output has shape (0,1024). I think the problem occurs when building dense_flow. What are the configuration details, and would it be possible to provide a Dockerfile? Thanks a lot!

@ycxia

ycxia commented Aug 14, 2020

@LuoweiZhou Thanks for sharing the great work! I ran into the same problem.
It was solved by modifying
f.create_dataset("dets_num", data=dets_num) -> f.create_dataset("dets_num", data=dets_num.sum(axis=-1))
f.create_dataset("nms_num", data=nms_num) -> f.create_dataset("nms_num", data=nms_num.sum(axis=-1))
But another problem arises:
File "/data/yongcheng/grounded-video-description/misc/dataloader_anet.py", line 331, in __getitem__
pad_proposals[:num_pps] = proposals[:num_pps]
ValueError: could not broadcast input array from shape (10,100,6) into shape (10,7)
This confused me a lot. Thanks for your time!

@pandababyer
Author

@ycxia Hello, could you please tell me how you did the feature extraction with the anet2016-cuhk-feature repo? I have been stuck on this for a long time. Thank you.

@ycxia

ycxia commented Aug 14, 2020

@pandababyer I just followed build_all.sh and then ran: python2 examples/extract_feature_activitynet.py data/ --use_flow
Good luck!
Do you know how to debug "ValueError: could not broadcast input array from shape (10,100,6) into shape (10,7)"?
Thank you!

@pandababyer
Author

@ycxia Sorry, I haven't solved that problem. I used the Dockerfile cuda8.0-cudnn5-devel-ubuntu14.04 and built the project with build_all.sh, but it always shows: Cannot use GPU in CPU-only Caffe: check mode.

@LuoweiZhou
Contributor

@ycxia It turned out the frame index was missing; the feature extraction code has been updated to include it.

Besides, it seems that in your case self.max_proposal is somehow 10 rather than 1000.

@ycxia

ycxia commented Aug 17, 2020

@LuoweiZhou Thanks for your reply. I printed self.max_proposal and its value is 1000.
After updating the feature extraction code, there is another error:
dets_labels[i, j, :num_proposal] = proposals
ValueError: could not broadcast input array from shape (100,7) into shape (100,6)

So I changed the shape from dets_labels = np.zeros((N, fpv, 100, 6)) to dets_labels = np.zeros((N, fpv, 100, 7)).
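The column mismatch can be illustrated with a short NumPy sketch. The sizes and the `allocate_dets_labels` helper are hypothetical; the only facts taken from the thread are that each frame's proposals arrive with 7 columns (presumably including the frame index that was added) while the buffer was allocated with 6.

```python
import numpy as np

N, fpv, max_rois = 1, 10, 100  # assumed sizes: videos, frames per video, ROIs per frame

def allocate_dets_labels(num_columns):
    """Hypothetical helper: allocate the per-video label buffer."""
    return np.zeros((N, fpv, max_rois, num_columns))

# Hypothetical proposals for one frame: 100 rows, 7 columns.
proposals = np.ones((100, 7))
num_proposal = proposals.shape[0]

# A 6-column buffer cannot accept 7-column rows.
old = allocate_dets_labels(6)
try:
    old[0, 0, :num_proposal] = proposals
    old_failed = False
except ValueError:
    old_failed = True

# Sizing the last axis from the data avoids hard-coding the column count.
new = allocate_dets_labels(proposals.shape[-1])
new[0, 0, :num_proposal] = proposals
print(old_failed, new.shape)
```

Deriving the last dimension from `proposals.shape[-1]` is one possible design choice; hard-coding 7, as in the comment above, works equally well when the column layout is fixed.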

But then there was another error when executing main.py of grounded-video-description:
File "/data/yongcheng/grounded-video-description/misc/dataloader_anet.py", line 334, in __getitem__
pad_proposals[:num_pps] = proposals[:num_pps]
ValueError: could not broadcast input array from shape (10,100,7) into shape (10,7)

So I changed extract_features_gvd_anet.py https://github.com/LuoweiZhou/detectron-vlp/blob/master/tools/extract_features_gvd_anet.py#L271
f.create_dataset("dets_labels", data=dets_labels) -> f.create_dataset("dets_labels", data=dets_labels.reshape(1,fpv*100,7))

Now there are no other errors!

@LuoweiZhou
Contributor

@ycxia Thank you for your feedback. We will need to reshape dets_labels to N*(fpv*100)*7. Please see this commit for the fix: LuoweiZhou/detectron-vlp@9ecc981
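The effect of that reshape on the padding step can be sketched with NumPy alone. This is an illustration under assumed sizes, not the code from the commit: `pad` is a hypothetical stand-in for the data loader's padding line, and `num_pps = 10` is chosen only to mirror the shapes in the traceback.

```python
import numpy as np

N, fpv, rois, cols = 1, 10, 100, 7  # assumed sizes
max_proposal = 1000                 # value reported in the thread
num_pps = 10                        # hypothetical count mirroring the traceback shapes

def pad(proposals, num_pps):
    """Hypothetical stand-in for `pad_proposals[:num_pps] = proposals[:num_pps]`."""
    pad_proposals = np.zeros((max_proposal, cols))
    pad_proposals[:num_pps] = proposals[:num_pps]
    return pad_proposals

dets_labels = np.zeros((N, fpv, rois, cols))

# Unflattened: proposals[:num_pps] has shape (10, 100, 7), which cannot
# be broadcast into the (10, 7) slice of the padding buffer.
try:
    pad(dets_labels[0], num_pps)
    unflattened_failed = False
except ValueError:
    unflattened_failed = True

# Flattened to N x (fpv*100) x 7: each video is a 2-D table of proposals.
flat = dets_labels.reshape(N, fpv * rois, cols)
padded = pad(flat[0], num_pps)
print(unflattened_failed, flat.shape, padded.shape)
```

Merging the frame and ROI axes gives the data loader one row per proposal, which is the layout the 2-D padding buffer assumes.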
