What if the training pairs are without bbox? #11

Closed
Frank-Dz opened this issue Mar 5, 2021 · 12 comments

Comments

@Frank-Dz

Frank-Dz commented Mar 5, 2021

            area[i, 0] = obj['bbox'][2]*obj['bbox'][3]

This method requires a bbox during training. What if the training pairs come without bboxes?

How can we calculate the area and the offset loss?

@Gengzigang
Member

Gengzigang commented Mar 5, 2021

Hi, we use the area of a person's bounding box to normalize that person's offsets when calculating the loss of the offset map.
Alternatively, you can use the maximum and minimum coordinates of the keypoints to estimate the size of the person. The choice of normalization factor does not influence the performance.
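
The keypoint-based alternative can be sketched as follows. This is only an illustration, not code from the repository: the function name `keypoint_extent_area` is hypothetical, and it assumes COCO-style `(x, y, v)` triples where `v > 0` marks a labelled keypoint.

```python
def keypoint_extent_area(keypoints):
    """Estimate a person's size from the extent of the labelled keypoints.

    keypoints: list of (x, y, v) triples; v > 0 means the keypoint is
    labelled (assumed COCO-style visibility flag).
    Returns width * height of the tightest box around the labelled
    keypoints, or 0.0 when no keypoint is labelled.
    """
    xs = [x for x, _, v in keypoints if v > 0]
    ys = [y for _, y, v in keypoints if v > 0]
    if not xs:
        return 0.0
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Two labelled keypoints and one unlabelled placeholder.
person = [(10.0, 20.0, 2), (30.0, 80.0, 2), (0.0, 0.0, 0)]
area = keypoint_extent_area(person)  # (30 - 10) * (80 - 20)
```

Note that a person with a single labelled keypoint gets an area of 0 under this estimate, so it cannot be used as a divisor without a fallback.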

By the way, we use the average of all keypoints belonging to the same person as the center of that person. The code also lets you choose the center of the person's bounding box instead, which may cause confusion. The APs of the two choices are the same.
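
A minimal sketch of the two center choices, for illustration only. Both function names are hypothetical, keypoints are assumed to be COCO-style `(x, y, v)` triples, and the bbox is assumed to be `[x, y, w, h]`, consistent with `obj['bbox'][2]*obj['bbox'][3]` being used as the area above.

```python
def center_from_keypoints(keypoints):
    """Mean position of the labelled keypoints, or None if there are none."""
    pts = [(x, y) for x, y, v in keypoints if v > 0]
    if not pts:
        return None
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)


def center_from_bbox(bbox):
    """Center of an [x, y, w, h] box (top-left corner plus half the size)."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)


person = [(10.0, 20.0, 2), (30.0, 80.0, 2), (0.0, 0.0, 0)]
kp_center = center_from_keypoints(person)
box_center = center_from_bbox([10.0, 20.0, 20.0, 60.0])
```

In this toy example the two centers coincide; on real annotations they generally differ somewhat.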

@Frank-Dz
Author

Frank-Dz commented Mar 5, 2021 via email

@Frank-Dz
Author

Frank-Dz commented Mar 5, 2021

For example, I just want to do head detection:
image

My concern is that this will change the way we calculate the loss.

@Gengzigang
Member

The normalization factor should be chosen to suit your project. You can use any length that reflects the size of the person, or you can regress the offsets directly without this normalization.
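
To make the two options concrete, here is a small sketch of an offset loss with an optional scale. It is not the loss used in the repository: the plain L1 error, the function name `offset_l1_loss`, and the `scale` parameter are all assumptions made for illustration.

```python
import math


def offset_l1_loss(pred, target, scale=None):
    """Mean absolute error over (dx, dy) offsets, optionally normalized.

    pred, target: equal-length lists of (dx, dy) pairs for one person.
    scale: any positive length reflecting the person's size; None means
    the offsets are regressed without normalization.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        return 0.0
    total = sum(abs(px - tx) + abs(py - ty)
                for (px, py), (tx, ty) in zip(pred, target))
    mean_err = total / (2 * len(pred))
    return mean_err if scale is None else mean_err / scale


pred = [(1.0, 2.0), (3.0, 4.0)]
target = [(0.0, 0.0), (3.0, 8.0)]
raw = offset_l1_loss(pred, target)                          # no normalization
scaled = offset_l1_loss(pred, target, scale=math.sqrt(400.0))  # sqrt(area) as the length
```

Here the square root of an area is used as the length, but a head width or any other size measure would fit the same slot.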

@Frank-Dz
Author

Frank-Dz commented Mar 5, 2021

OK. I am trying BBox generation based on the projection, but the BBox is for the head rather than the whole body.

Thank you very much!

@Frank-Dz Frank-Dz closed this as completed Mar 5, 2021
@Gengzigang
Member

For your case, I think a heatmap is more suitable: you only need to detect the head keypoints, and a heatmap can detect all keypoints of the same type precisely and efficiently.
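
As a rough illustration of heatmap-only detection, the sketch below reads head locations off a single-channel heatmap by picking local maxima. It is not the repository's post-processing: the function name `heatmap_peaks`, the 3x3 neighbourhood, and the 0.5 threshold are assumptions.

```python
def heatmap_peaks(heatmap, threshold=0.5):
    """Return (row, col, score) for each strict local maximum.

    heatmap: 2D list of floats (rows of equal length).
    A cell is a peak when its score is at least `threshold` and strictly
    greater than every neighbour in its 3x3 neighbourhood.
    """
    h = len(heatmap)
    w = len(heatmap[0]) if h else 0
    peaks = []
    for r in range(h):
        for c in range(w):
            score = heatmap[r][c]
            if score < threshold:
                continue
            neighbours = [heatmap[rr][cc]
                          for rr in range(max(0, r - 1), min(h, r + 2))
                          for cc in range(max(0, c - 1), min(w, c + 2))
                          if (rr, cc) != (r, c)]
            if all(score > n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0],
]
heads = heatmap_peaks(heatmap)
```

Each peak is one detected head, so no grouping step is involved. Because the comparison is strict, a flat plateau of equal scores yields no peak in this sketch.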

@Frank-Dz
Author

Frank-Dz commented Mar 5, 2021

Sorry I just got confused about this:

image

Do you mean we should remove the offset loss and use only the heatmap loss?

Or, when you say the heatmap is more suitable, do you mean we still need both the heatmap loss and the offset loss during detection?

@Frank-Dz Frank-Dz reopened this Mar 5, 2021
@Gengzigang
Member

You can remove the offset regression part entirely, in both training and post-processing.

In this method, the offset is the displacement between a keypoint location and the person's center point. In your case, however, I do not know how you would choose the center point. I am curious how you would use the offset regression.
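
For reference, the offset described here can be sketched as below. The function name `offset_targets` is hypothetical, and the sign convention (keypoint minus center) is an assumption, since the thread does not state it.

```python
def offset_targets(keypoints, center):
    """Displacement of each (x, y) keypoint from the person's center.

    Sign convention assumed here: keypoint minus center.
    """
    cx, cy = center
    return [(x - cx, y - cy) for x, y in keypoints]


# One keypoint away from the center, one lying exactly on it.
targets = offset_targets([(12.0, 18.0), (20.0, 50.0)], (20.0, 50.0))
```

When the only keypoint lies on the center itself, every target is zero and the offset branch has nothing to learn.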

@Frank-Dz
Author

Frank-Dz commented Mar 5, 2021

Yes! That's the point! In my case I only want to detect the head, and many images do not show the full body. So the center position would be the center of the head BBox, yet that center is very close to the head keypoint itself.
So maybe removing the offset part is a good choice.
Thank you again!

@Frank-Dz Frank-Dz closed this as completed Mar 5, 2021
@Frank-Dz
Author

Hi~ I am still a little confused. Really sorry to bother you.
It seems the method still needs non-maximum suppression to filter redundant BBoxes.
I have trained the network to predict only one point in the heatmap. How can we avoid the non-maximum suppression step, which involves the BBox and the offsets? (In other words, we do not need to group keypoints into people, since there is only one point per person.) Any suggestions?
image

@Frank-Dz
Author

Frank-Dz commented Mar 13, 2021

Actually, there is a paper quite close to my situation, where they do object detection without BBoxes:
https://openaccess.thecvf.com/content_CVPR_2019/html/Ribera_Locating_Objects_Without_Bounding_Boxes_CVPR_2019_paper.html
But I would like to use a heatmap instead, since I think it will give more stable results.
So I would like to ask for your suggestions.
Thank you very much!
