
How to get person detection? #41

FrancescoPiemontese opened this issue Apr 15, 2019 · 15 comments

FrancescoPiemontese commented Apr 15, 2019

First of all, thank you for your excellent work. I have a question regarding person detection. Your paper mentions that a person detector is run first and its output is fed to HRNet. Am I supposed to download this detector separately and then feed its output to HRNet? If so, what do the dataloaders in train.py and test.py do? Could you tell me which person detector was used?

@njustczr

I think the author used the detection information provided with the dataset (the 'center' and 'scale' fields in the MPII dataset JSON file).


lxy5513 commented Apr 18, 2019

@FrancescoPiemontese maybe you can refer to my repo, myhrnet, where I integrated YOLO human detection.
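
For reference, the pose model's test pipeline consumes person boxes as (center, scale) pairs rather than raw (x, y, w, h) boxes, so the detector output has to be converted first. Below is a minimal sketch of that conversion, modeled on the box-to-center/scale logic in this repo's dataset code; the function name is illustrative, and the constants (pixel_std = 200, the 1.25 padding factor) are assumptions that should be checked against your config.

```python
import numpy as np

# Illustrative helper: convert a detector box (x, y, w, h) into the
# (center, scale) pair the HRNet test pipeline expects.
# aspect_ratio = model input width / height, e.g. 192 / 256 for w32_256x192.
def box_to_center_scale(box, aspect_ratio=192.0 / 256.0, pixel_std=200.0):
    x, y, w, h = box
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)

    # Pad the box to the model's aspect ratio so resizing does not distort it.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    elif w < aspect_ratio * h:
        w = h * aspect_ratio

    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32)
    scale *= 1.25  # enlarge slightly so the whole person stays in the crop
    return center, scale
```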

@FrancescoPiemontese (Author)

Thank you! I will try

@leoxiaobin (Owner)

@lxy5513, would you consider making a PR to this repo?


lxy5513 commented Apr 19, 2019

@leoxiaobin yes, soon. I will add several human detectors, such as R-FCN and RetinaNet, then open a PR with a speed comparison.


lxy5513 commented Apr 19, 2019

@leoxiaobin I plan to implement this tracking following the description in your Simple Baselines paper:

> For the processing frame in videos, the boxes from a human detector and boxes generated by propagating joints from previous frames using optical flow are unified using a bounding box Non-Maximum Suppression (NMS) operation

I have two groups of boxes, but I don't know how to do NMS on them, because the boxes generated by FlowNet2S have no confidence scores. Can I simply assume each propagated box keeps the score of its box in the previous frame? Could you advise? Thank you in advance.

@leoxiaobin (Owner)

We actually use the OKS score for NMS.
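
For anyone following along, below is a minimal sketch of what OKS-based NMS means: compute the Object Keypoint Similarity between pose candidates and greedily suppress near-duplicates. It mirrors the idea of the oks_nms utility in this repo but is not the exact implementation; the function names are illustrative and the COCO 17-keypoint sigmas are assumed.

```python
import numpy as np

# COCO per-keypoint sigmas (17 keypoints), used to weight per-joint distances.
SIGMAS = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72,
                   .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

def compute_oks(g, d, area):
    """OKS between a kept pose g and a candidate d.
    g, d: arrays of shape (17, 3) holding (x, y, score/visibility) per joint."""
    var = (SIGMAS * 2) ** 2
    dx, dy = d[:, 0] - g[:, 0], d[:, 1] - g[:, 1]
    e = (dx ** 2 + dy ** 2) / var / (area + np.spacing(1)) / 2.0
    vis = g[:, 2] > 0
    return float(np.mean(np.exp(-e[vis]))) if vis.any() else 0.0

def oks_nms(poses, scores, areas, thresh=0.9):
    """Greedy NMS: keep the best-scoring pose, drop others too similar to it."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        oks = np.array([compute_oks(poses[i], poses[j], areas[i]) for j in rest])
        order = rest[oks <= thresh] if rest.size else rest
    return keep
```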


lxy5513 commented Apr 23, 2019

Thanks


lxy5513 commented Apr 25, 2019

@leoxiaobin
Hi, I made a PR for YOLOv3 + HRNet, but something is weird. I tested it in two ways.


ONE: I get dt_boxes from YOLO, then run python tools/test.py TEST.USE_GT_BBOX False TEST.FLIP_TEST False with OKS NMS removed, and get the following result:

| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| pose_hrnet | 0.702 | 0.859 | 0.770 | 0.653 | 0.779 | 0.736 | 0.878 | 0.794 | 0.683 | 0.813 |

TWO: I run the two models end to end (the same models as in ONE), get the keypoints, save them to a JSON file, and finally evaluate with the official cocoEval.evaluate(), getting the following result:

| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| hrnet | 0.594 | 0.811 | 0.656 | 0.564 | 0.651 | 0.647 | 0.834 | 0.704 | 0.601 | 0.713 |

Could you please tell me why the two results are so different? Thank you in advance.
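
For reference, the scoring step in method TWO with the official COCO API looks roughly like this (a minimal sketch; the annotation and result file names are placeholders, not the actual paths used here):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Ground-truth keypoint annotations and the predicted-keypoints JSON
# (a list of {image_id, category_id, keypoints, score} entries).
coco_gt = COCO('annotations/person_keypoints_val2017.json')
coco_dt = coco_gt.loadRes('keypoint_results.json')

coco_eval = COCOeval(coco_gt, coco_dt, 'keypoints')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints the AP / AR numbers like the tables above
```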


lxy5513 commented Apr 25, 2019

This is my script, https://github.com/lxy5513/hrnet/blob/master/tools/eval.py, which generates the keypoints JSON file. By the way, my YOLOv3 detection threshold is 0.1.

@leoxiaobin (Owner)

I had a very quick look through your code, and I have two questions.

  1. It seems that you do not convert the image channels to RGB. OpenCV reads images in BGR order, while our models are trained on RGB, so you first need to convert your image data to RGB, as done at line 131 of https://github.com/leoxiaobin/deep-high-resolution-net.pytorch/blob/master/lib/dataset/JointsDataset.py#L131 (see the sketch after this list).

  2. Are the thresholds for both methods the same?
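
As a concrete illustration of point 1, a minimal sketch (the image path is a placeholder):

```python
import cv2

# OpenCV loads images in BGR order, while the released HRNet weights
# expect RGB input, so convert before any further preprocessing.
image_bgr = cv2.imread('example.jpg')
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
# ... then resize/normalize image_rgb and feed it to the pose model
```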


lxy5513 commented Apr 26, 2019

I greatly appreciate your attention.
This is my channel-conversion code: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py#L142.

This is my threshold code; it is the same for both methods:
https://github.com/lxy5513/hrnet/blob/master/tools/eval.py#L159


lxy5513 commented Apr 26, 2019

By the way, when I use YOLOv3 + the Simple Baselines pose model to test the PR, the result seems normal, as follows:

| Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
|---|---|---|---|---|---|---|---|---|---|---|
| Simple-baseline | 0.648 | 0.856 | 0.708 | 0.617 | 0.706 | 0.697 | 0.880 | 0.750 | 0.652 | 0.763 |

alex9311 mentioned this issue Jan 13, 2020

alex9311 commented Feb 3, 2020

I would say this issue can be closed now that #161 has been merged.

@zhanghao5201
> @leoxiaobin
> Hi, I made a PR for YOLOv3 + HRNet, but something is weird. I tested it in two ways.
>
> ONE: I get dt_boxes from YOLO, then run python tools/test.py TEST.USE_GT_BBOX False TEST.FLIP_TEST False with OKS NMS removed, and get the following result:
>
> | Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
> |---|---|---|---|---|---|---|---|---|---|---|
> | pose_hrnet | 0.702 | 0.859 | 0.770 | 0.653 | 0.779 | 0.736 | 0.878 | 0.794 | 0.683 | 0.813 |
>
> TWO: I run the two models end to end (the same models as in ONE), get the keypoints, save them to a JSON file, and finally evaluate with the official cocoEval.evaluate(), getting the following result:
>
> | Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
> |---|---|---|---|---|---|---|---|---|---|---|
> | hrnet | 0.594 | 0.811 | 0.656 | 0.564 | 0.651 | 0.647 | 0.834 | 0.704 | 0.601 | 0.713 |
>
> Could you please tell me why the two results are so different? Thank you in advance.

I also get 0.702, but the reported result for w32_256x192 is 0.744. Why? I just ran the released code with the trained model pose_hrnet_w32_256x192.pth. Can you help me?
