
A question about person_box and action_predictor #38

Closed
pxssw opened this issue Nov 19, 2020 · 6 comments

Labels: enhancement (New feature or request), question (Further information is requested)

pxssw commented Nov 19, 2020

Wonderful job! As a researcher in the same field, I would like to express my appreciation to the author.
I have one question about the compute_prediction part of action_predictor.py. The inputs to the computation are the most recent frames (kept via self.frame_stack = self.frame_stack[-self.frame_buffer_numbers:]) together with only the box of the center frame, so is it assumed that the pedestrian has little displacement across the input frames? I would like to know whether adding the exact box of each frame (which could be extracted from the tracking results) would make the final result better, especially for motions with a large range of movement, such as hitting or fighting. Or have I simply misunderstood the process?
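To make sure I am describing the right thing, here is a hypothetical minimal sketch of the pattern I mean. The names follow the line quoted above, not necessarily the real action_predictor.py, and the model call is just a placeholder:

```python
# Hypothetical sketch of the pattern described above, not the project's code.
class FramePredictorSketch:
    def __init__(self, frame_buffer_numbers=32):
        self.frame_buffer_numbers = frame_buffer_numbers
        self.frame_stack = []

    def push_frame(self, frame):
        self.frame_stack.append(frame)
        # keep only the most recent frames (the line quoted above)
        self.frame_stack = self.frame_stack[-self.frame_buffer_numbers:]

    def compute_prediction(self, center_frame_boxes, model):
        # only boxes from the center frame are passed in; the same boxes are
        # implicitly reused for every frame of the buffered clip
        return model(self.frame_stack, center_frame_boxes)
```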

yelantf (Collaborator) commented Nov 20, 2020

Thank you for your attention! As shown in the paper, our model takes the bounding box on the center frame to do RoIAlign on all frames of the input video clip. This mainly follows previous works, but is also because the AVA dataset only provides boxes annotated on the center frame. Of course, we could use a tracking algorithm to generate more accurate bounding boxes on every single frame, and then use them to get more robust results; there are actually some previous works [link] that tried this. However, we did not find a sufficiently robust tracker (especially for scenes with fast motion), so we chose the current design for our method.
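For illustration only, here is a minimal PyTorch sketch of what "using the center-frame box for RoIAlign on all frames" means. The feature shapes and the spatial_scale and sampling_ratio values below are made-up assumptions, not our actual configuration:

```python
import torch
from torchvision.ops import roi_align

# Per-frame 2D features for one clip: (T, C, H, W); sizes are illustrative.
T, C, H, W = 8, 256, 14, 14
clip_features = torch.randn(T, C, H, W)

# One person box detected on the center frame, in feature-map coordinates.
center_box = torch.tensor([[2.0, 3.0, 10.0, 12.0]])  # (x1, y1, x2, y2)

# Reuse the same center-frame box for every frame of the clip.
boxes_per_frame = [center_box] * T

person_features = roi_align(
    clip_features, boxes_per_frame,
    output_size=(7, 7), spatial_scale=1.0, sampling_ratio=2, aligned=True,
)
print(person_features.shape)  # torch.Size([8, 256, 7, 7]): one crop per frame
```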

pxssw (Author) commented Nov 20, 2020

Understood. I wish you even greater success as the field progresses!
In the meantime, there are a couple of small problems I ran into with the project. (They may just be my own misunderstanding or not real bugs; if so, please ignore them.)

  1. The update_action_dictionary part of visualizer.py: the final result self.action_dictionary keeps the results for all IDs (starting from the very first person). If the project runs for a long time, or on crowded scenes, could this lead to a large demand for computing and memory resources? Perhaps the results of IDs that have not been seen for a long time should be cleaned up.
  2. The cur_millis = stream.get(cv2.CAP_PROP_POS_MSEC) line in video_detection_loader.py: I find that, in my webcam mode, the initial value of cur_millis is very large (around 4×10^8 or more). I really don't know why it is not 0 ms, and the value keeps increasing across new runs of my project (4.x×10^8, 5.x×10^8, ...). Is this a common problem? I really don't know.

yelantf (Collaborator) commented Nov 23, 2020

Thanks for pointing out these problems! First, I have to admit that our current demo program is not well designed; it may have some small bugs and is also hard to read. As for the two problems you mentioned:

  1. Yes, you are right. This is indeed a problem for long-running sessions. We will try to improve it along the lines of your suggestion when we have time (a rough sketch follows below). Of course, pull requests are also welcome.

  2. We did not notice this issue before, and in fact we have not fully tested the demo script in webcam mode, because that requires a server with a graphical interface and a camera, which is not always available to us. According to the OpenCV documentation, this flag gives the current position of the video file in milliseconds, or the video capture timestamp. I am inclined to think that in webcam mode it returns the raw capture timestamp, which depends on the specific camera and driver (a possible workaround is sketched below).
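For the first point, here is a rough sketch of the kind of clean-up we have in mind. The function and variable names below are hypothetical and do not match the current visualizer.py exactly; it only illustrates dropping the results of IDs that have not been seen for a while:

```python
import time

# Hypothetical sketch: bound memory use by pruning IDs that have not been
# updated recently. Not the current visualizer.py code.
action_dictionary = {}   # person ID -> list of (timestamp, action label)
last_seen = {}           # person ID -> last update time

def update_action_dictionary(person_id, action_label, now=None):
    now = time.time() if now is None else now
    action_dictionary.setdefault(person_id, []).append((now, action_label))
    last_seen[person_id] = now

def prune_stale_ids(max_age_seconds=60.0, now=None):
    """Drop IDs whose last update is older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = [pid for pid, t in last_seen.items() if now - t > max_age_seconds]
    for pid in stale:
        action_dictionary.pop(pid, None)
        last_seen.pop(pid, None)
```

Calling prune_stale_ids() periodically (for example once per processed clip) would keep the dictionary size roughly proportional to the number of currently visible people.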

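For the second point, a possible workaround sketch (again hypothetical, not part of the current demo): rebase the reported timestamps so they start from zero, and fall back to a wall-clock timer if the driver reports nothing useful.

```python
import time
import cv2

stream = cv2.VideoCapture(0)        # webcam mode
first_millis = None                 # first timestamp reported by the driver
start_wall = time.time()            # wall-clock fallback

while True:
    ok, frame = stream.read()
    if not ok:
        break
    raw_millis = stream.get(cv2.CAP_PROP_POS_MSEC)  # may be a huge driver timestamp
    if raw_millis > 0:
        if first_millis is None:
            first_millis = raw_millis
        cur_millis = raw_millis - first_millis      # relative to the first frame
    else:
        cur_millis = (time.time() - start_wall) * 1000.0
    # ... pass (frame, cur_millis) on to the rest of the pipeline ...

stream.release()
```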
pxssw (Author) commented Nov 24, 2020

Good job! The flaws do not obscure the merits.

yelantf added the enhancement (New feature or request) and question (Further information is requested) labels on Dec 2, 2020
yelantf closed this as completed on Dec 2, 2020
jun0wanan commented

Hi,
Sorry to disturb you. I want to ask how the person bbox in the 1st clip is linked to the person bbox in the 2nd clip (for the same person)?

Best,
jun

