
boxes format in calculation of huber_loss #58

Closed
zlyin opened this issue Jul 21, 2020 · 5 comments

zlyin commented Jul 21, 2020

Hi @rwightman, I'm trying to implement a custom IoU loss function, but first I'd like to confirm the box format consumed by the huber_loss function. Could you help me verify the format of the input and target args?

from typing import Optional

import torch


def huber_loss(input, target, delta: float = 1., weights: Optional[torch.Tensor] = None, size_average: bool = True):
    """Huber loss: quadratic for |input - target| <= delta, linear beyond."""
    err = input - target
    abs_err = err.abs()
    # |err| clamped at delta is the quadratic portion; any remainder is linear
    quadratic = torch.clamp(abs_err, max=delta)
    linear = abs_err - quadratic
    loss = 0.5 * quadratic.pow(2) + delta * linear
    if weights is not None:
        loss *= weights
    return loss.mean() if size_average else loss.sum()
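
For a quick numeric check of what this computes with delta = 1 (my own toy example, not from the repo):

import torch

x = torch.tensor([0.5, 2.0])  # one error inside delta, one beyond
t = torch.zeros(2)
# per element: 0.5 * 0.5**2 = 0.125 and 0.5 * 1**2 + 1 * (2 - 1) = 1.5
print(huber_loss(x, t, size_average=False))  # tensor(1.6250)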

I printed both of them out and found they have shape [batch, height_l, width_l, 9*4], where I believe the last dim holds the bounding box coordinates. In other threads you mentioned that your implementation consumes targets in YXYX format and outputs predicted boxes in XYWH format. Does that hold here as well?
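
To make my assumption concrete, here's a minimal sketch of how I'm interpreting that last dim (the 9 anchors x 4 coords split and the (ty, tx, th, tw) ordering are my guesses, not confirmed):

import torch

batch, height_l, width_l = 2, 64, 64
box_out = torch.randn(batch, height_l, width_l, 9 * 4)  # dummy head output

# split the last dim into 9 anchors x 4 regression values per location
box_out = box_out.view(batch, height_l, width_l, 9, 4)
ty, tx, th, tw = box_out.unbind(dim=-1)  # assumed relative yxhw ordering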

Thank you for your confirmation!

@rwightman (Owner)

Yes, they should be relative (to anchors) regression coordinates in yxyx; they are not converted into absolute xyxy until decode_box_outputs(boxes, anchors, output_xyxy=True) is called:

def decode_box_outputs(rel_codes, anchors, output_xyxy: bool = False):
    """Transforms relative regression coordinates to absolute positions.

    Network predictions are normalized and relative to a given anchor; this
    reverses the transformation and outputs absolute coordinates for the input image.

    Args:
        rel_codes: box regression targets.
        anchors: anchors on all feature levels.

    Returns:
        outputs: bounding boxes.
    """
    ycenter_a = (anchors[:, 0] + anchors[:, 2]) / 2
    xcenter_a = (anchors[:, 1] + anchors[:, 3]) / 2
    ha = anchors[:, 2] - anchors[:, 0]
    wa = anchors[:, 3] - anchors[:, 1]
    ty, tx, th, tw = rel_codes.unbind(dim=1)
    w = torch.exp(tw) * wa
    h = torch.exp(th) * ha
    ycenter = ty * ha + ycenter_a
    xcenter = tx * wa + xcenter_a
    ymin = ycenter - h / 2.
    xmin = xcenter - w / 2.
    ymax = ycenter + h / 2.
    xmax = xcenter + w / 2.
    if output_xyxy:
        out = torch.stack([xmin, ymin, xmax, ymax], dim=1)
    else:
        out = torch.stack([ymin, xmin, ymax, xmax], dim=1)
    return out
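
A minimal usage sketch (shapes and anchors fabricated for illustration; in practice the anchors come from the model's anchor generation):

import torch

num_anchors = 9 * 64 * 64  # hypothetical: one 64x64 level, 9 anchors per cell
rel_codes = torch.randn(num_anchors, 4)  # (ty, tx, th, tw) per anchor

# fabricate plausible yxyx anchors: ymin/xmin plus a positive height/width
ymin, xmin = torch.rand(2, num_anchors) * 448
ah, aw = torch.rand(2, num_anchors) * 48 + 16
anchors = torch.stack([ymin, xmin, ymin + ah, xmin + aw], dim=1)

boxes_xyxy = decode_box_outputs(rel_codes, anchors, output_xyxy=True)
print(boxes_xyxy.shape)  # torch.Size([36864, 4])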

... which is usually only done for predictions via the generate_detections() call

Note there is a PR for IOU loss at #52 ... I haven't dug in yet but am hoping to take a closer look this week, maybe possible to optimize a bit

zlyin commented Jul 22, 2020

Hi @rwightman, many thanks for your quick reply. I implemented a version yesterday but messed up the box format, which led to NaN loss values. Now I can try to fix it. Thanks!

zlyin commented Jul 22, 2020

@rwightman While I still have you here, let me ask another question. I'm also trying to modify the default anchors but am not sure of the right way to set them up. Maybe I've misunderstood the anchor-related parameters in the model_config.py file.

    # feature + anchor config
    h.min_level = 3
    h.max_level = 7
    h.num_levels = h.max_level - h.min_level + 1
    h.num_scales = 3
    h.aspect_ratios = [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]
    h.anchor_scale = 4.0

As indicated in your code, your implementation uses 3 aspect ratios (h.aspect_ratios) x 3 octave scales ([2**0, 2**(1/3), 2**(2/3)]), i.e. 9 anchors per location on each feature map.
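
As a sanity check on my understanding, here's how I believe those 9 shapes expand from the config at one level (following the TF retinanet-style scheme; the exact formula in your code may differ):

min_level, num_scales = 3, 3
aspect_ratios = [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]
anchor_scale = 4.0

stride = 2 ** min_level           # feature stride at level 3
base_size = anchor_scale * stride

for octave in range(num_scales):
    octave_scale = 2 ** (octave / num_scales)
    for aspect_x, aspect_y in aspect_ratios:
        w = base_size * octave_scale * aspect_x
        h = base_size * octave_scale * aspect_y
        print(f"level 3 anchor: w={w:.1f}, h={h:.1f}")
# 3 octave scales x 3 aspects = 9 (w, h) pairs per location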

I used KMeans to cluster a set of anchors, getting their w & h and aspect ratios. For example, one of my tryouts was to set h.aspect_ratios = [(1.0, 1.0), (1.0, 1.5), (1.5, 1.0), (1.0, 2.0), (2.0, 1.0)]. My questions are as follows:

  • My clustering algorithm gives me both the w & h and the aspect ratios of the anchors, say 6 different anchors. Since the objects in my dataset are relatively smaller than those in COCO, placing anchors on the larger feature maps should be helpful. I'm wondering how to do that in the config file.
  • Another question is about the value format of aspect_ratios. For example, can I change the two ratios (1.4, 0.7), (0.7, 1.4) into (2, 1), (1, 2)? Since the ratio values affect the actual sizes of the anchors, I'm not sure which convention to follow.
  • How about h.anchor_scale = 4.0? Should I change it as well?

Thank you very much for your help! I appreciate your advice & guidance.

@rwightman (Owner)

To change the anchor sizes, anchor_scale is the main factor. It can be defined as a list so it can vary per feature level; it's multiplied by the feature stride at each level, but also by the aspect values (so (2, 1) vs (1.4, 0.7) times the scale would result in different sized anchors). For more detail than that you'll have to dig in; the code is adapted from the Google TF models retinanet and other anchor-based models, and I haven't explored the space of options.
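
For example, a per-level list would look something like this (values hypothetical, just illustrating the shape of the config):

# one anchor_scale entry per level from min_level (3) to max_level (7);
# smaller values at the finer levels shrink the anchors there
h.anchor_scale = [3.0, 3.5, 4.0, 4.0, 4.0]

# effective base anchor size per level is roughly anchor_scale * stride
for scale, level in zip(h.anchor_scale, range(3, 8)):
    print(f"level {level}: base size ~ {scale * 2 ** level:.0f} px")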

zlyin commented Jul 26, 2020

Hi Ross, thanks for the explanation; I'll dig into it further. I appreciate your help!
