What if the training pairs are without bbox? #11

Frank-Dz · 2021-03-05T10:51:11Z

            area[i, 0] = obj['bbox'][2]*obj['bbox'][3]

This method requires bounding boxes (bbox) during training. What if the training pairs have no bbox?

In that case, how can we calculate the area and the offset loss?

Gengzigang · 2021-03-05T13:32:57Z

Hi, we use the area of a person's bounding box to normalize that person's offsets when calculating the loss of the offset map.
Alternatively, you can use the maximum and minimum coordinates of the keypoints to estimate the size of the person. The choice of normalization term doesn't influence the performance.

By the way, we use the average of all keypoints belonging to the same person as the center of that person. The code also allows choosing the center of the person's bounding box as the center instead, which may cause confusion. The APs of the two choices are the same.
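As a rough illustration of the keypoint-based alternative described above, here is a minimal sketch that estimates a person's area from the extent of the labeled keypoints and takes their mean as the center. The function name `person_size_and_center` and the COCO-style `(x, y, visibility)` layout are assumptions made for this example; they are not taken from the repository.

```python
import numpy as np

def person_size_and_center(keypoints):
    """Estimate area and center from keypoints alone (no bbox needed).

    keypoints: (K, 3) array-like of x, y, visibility; visibility > 0
    is assumed to mean the keypoint is labeled (COCO-style assumption).
    """
    kpts = np.asarray(keypoints, dtype=np.float32)
    visible = kpts[kpts[:, 2] > 0, :2]
    if len(visible) == 0:
        return 0.0, None
    # Extent of the labeled keypoints stands in for the bbox width/height.
    width, height = visible.max(axis=0) - visible.min(axis=0)
    area = float(width * height)
    # Center as the average of the labeled keypoints.
    center = visible.mean(axis=0)
    return area, center
```

Note that with a single labeled keypoint the extent collapses to zero, so this estimate only makes sense when at least two keypoints are annotated.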

Frank-Dz · 2021-03-05T13:36:12Z

Yes. But what if there is only one joint rather than 17 joints as in COCO? How should we set up the code? Thanks!

Frank-Dz · 2021-03-05T13:39:15Z

For example, I just want to do head detection:

My concern is that this will change the way we calculate the loss.

Gengzigang · 2021-03-05T13:45:59Z

The normalization tool should be designed for your project. You can use any length that reflects the size of the person, or you can directly regress the offsets without this normalization.
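The two options mentioned here (dividing by some size measure, or regressing raw offsets) can be sketched as a single loss function with an optional scale. This is only an illustration: the L1 form, the function name `offset_l1_loss`, and the `size` parameter are assumptions for this example, and the thread does not state the exact loss used in the repository.

```python
import numpy as np

def offset_l1_loss(pred, target, size=None, eps=1e-6):
    """Mean absolute offset error, optionally divided by a per-person size.

    pred, target: (N, 2) offsets for the N keypoints of one person.
    size: any positive length that reflects the person's scale
          (hypothetical parameter), or None to regress raw offsets
          without normalization.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    err = np.abs(pred - target)
    if size is not None:
        # Larger people tolerate larger absolute errors.
        err = err / (size + eps)
    return float(err.mean())
```

For instance, `size` could be the square root of a bounding-box area or the extent of the keypoints; passing `size=None` corresponds to the unnormalized variant.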

Frank-Dz · 2021-03-05T13:47:45Z

OK. I am trying to generate the bounding box based on the projection. But the bounding box is for the head rather than the whole body.

Thank you very much!

Gengzigang · 2021-03-05T14:05:10Z

For your case, I think the heatmap is more suitable, because you only need to detect the head keypoints, and the heatmap can detect all keypoints of the same kind precisely and efficiently.

Frank-Dz · 2021-03-05T14:16:59Z

Sorry, I am a bit confused about this:

Do you mean we should just remove the offset loss and use only the heatmap loss?

Or, by "I think the heatmap is more suitable", do you mean we still need both the heatmap loss and the offset loss during detection?

Gengzigang · 2021-03-05T14:42:40Z

You can remove the offset regression part entirely, both in training and in post-processing.

In this method, the offset is the displacement between a keypoint location and the center point location. In your case, however, I do not know how you would choose the center point. I am curious how you would use the offset regression.
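To make the offset definition concrete, here is a small sketch that computes keypoint-minus-center displacements. The helper name `keypoint_offsets` and the sign convention (keypoint minus center) are assumptions for illustration; the thread does not specify the direction used in the code.

```python
import numpy as np

def keypoint_offsets(keypoints, center):
    """Return (K, 2) displacements from the center to each keypoint.

    Sign convention (keypoint - center) is an assumption of this sketch.
    """
    kpts = np.asarray(keypoints, dtype=np.float32)
    return kpts - np.asarray(center, dtype=np.float32)

# With a single head keypoint, the mean of the keypoints is the keypoint
# itself, so the offset target carries no information.
head = np.array([[120.0, 45.0]])
center = head.mean(axis=0)
single_offset = keypoint_offsets(head, center)
```

When the center is defined as the average of the keypoints and only one keypoint exists, every offset target is zero, which is one way to see why the offset branch adds nothing in a single-keypoint setting.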

Frank-Dz · 2021-03-05T14:45:41Z

Yes! That's the point! In my case, I only want to detect the head, and many images do not show the full body. So the center position would have to be the center of the head's bounding box, yet this center is quite close to the head keypoint itself.
So maybe removing the offset part is a good choice.
Thank you again!

Frank-Dz · 2021-03-13T01:23:27Z

Hi, I am still a little confused. Really sorry to bother you.
It seems the method still needs non-maximum suppression to filter redundant bounding boxes.
I have trained the network to predict only one point per person in the heatmap. How can we avoid the non-maximum suppression step, which involves the bounding box and the offsets? (In other words, we do not need to group keypoints into people, since there is only one point for each person.) Any suggestions?
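One common way to read detections directly from a single-keypoint heatmap, without boxes, offsets, or grouping, is to keep only local maxima above a score threshold. The sketch below illustrates that general technique with plain NumPy; it is not the repository's post-processing, and the function name `heatmap_peaks`, the 3x3 window, and the 0.3 threshold are assumptions chosen for the example.

```python
import numpy as np

def heatmap_peaks(heatmap, threshold=0.3, kernel=3):
    """Return (row, col, score) for local maxima of a 2D heatmap.

    A pixel is kept if it equals the maximum of its kernel x kernel
    neighborhood and its score is at least `threshold`.
    """
    hm = np.asarray(heatmap, dtype=np.float32)
    pad = kernel // 2
    padded = np.pad(hm, pad, mode="constant", constant_values=-np.inf)
    h, w = hm.shape
    # Collect every shifted view of the window, then take the
    # elementwise maximum (equivalent to stride-1 max pooling).
    windows = [padded[dy:dy + h, dx:dx + w]
               for dy in range(kernel) for dx in range(kernel)]
    local_max = np.max(windows, axis=0)
    keep = (hm == local_max) & (hm >= threshold)
    rows, cols = np.nonzero(keep)
    return [(int(r), int(c), float(hm[r, c])) for r, c in zip(rows, cols)]
```

Each returned peak can be treated as one detected head. Pixels that tie exactly on a flat plateau are all kept by this sketch, so a real pipeline may want an extra tie-breaking step.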

Frank-Dz · 2021-03-13T01:37:38Z

Actually, there is a paper quite close to my situation, where they do object detection without bounding boxes:
https://openaccess.thecvf.com/content_CVPR_2019/html/Ribera_Locating_Objects_Without_Bounding_Boxes_CVPR_2019_paper.html
But I would prefer to use the heatmap, since I think it will give more stable results.
So I would like to ask for your suggestions.
Thank you very much!

Frank-Dz mentioned this issue Mar 5, 2021

Train with no bounding box dataset #2

Closed

Frank-Dz closed this as completed Mar 5, 2021

Gengzigang mentioned this issue Mar 5, 2021

How to train for multi-person detection? #10

Closed

Frank-Dz reopened this Mar 5, 2021

Frank-Dz closed this as completed Mar 5, 2021
