
About the balance of loss_bbox and loss_rank_sort #11

Closed
jcdubron opened this issue Sep 24, 2021 · 2 comments

Comments

@jcdubron

I notice that loss_bbox is scaled so that its value equals the sum of loss_rank and loss_sort. How does performance change without this weighting? What is the intuition behind it? And is there any other reference that does the same?

# Normalized box regression loss
losses_bbox = torch.sum(bbox_weights * loss_bbox) / bbox_avg_factor
# Self-balancing weight: ratio of the (detached) ranking+sorting losses to the box loss
self.SB_weight = (ranking_loss + sorting_loss).detach() / float(losses_bbox.item())
losses_bbox *= self.SB_weight  # after scaling, losses_bbox equals ranking_loss + sorting_loss in value

@kemaloksuz
Owner

"If not, how is the performance?": Table 9 in our paper shows on ATSS that the performance is similar (39.8 w/o self-balancing vs 39.9 with self-balancing) and using this strategy reduces the number of hyperparameters to be tuned. Note that to obtain 39.8, we tuned task-balancing scalar to 2 (c.f. Table A.11 below), but with self-balancing, there is no need for tuning.

[Image: Table A.11 from the paper]

"What's the intuition of this action?": This is a simple heuristic to discard tuning task-balancing coefficients. We analysed equalizing values and gradients (see Table A.11 below) and observed that when we use losses with similar ranges (RS Loss for classification, GIoU Loss for box regression and Dice Loss for mask prediction - see also Fig. 3 in our paper), value-based approach performs as well as tuning.

[Image: Table A.11 from the paper]
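
For concreteness, a minimal sketch of this value-equalizing step (the names cls_loss, box_loss and self_balance are illustrative, not the repository's actual identifiers):

import torch

def self_balance(cls_loss: torch.Tensor, box_loss: torch.Tensor) -> torch.Tensor:
    # Scale the box loss so that its value matches the classification loss;
    # detach() keeps the balancing ratio out of the gradient computation.
    sb_weight = cls_loss.detach() / box_loss.detach()
    return cls_loss + sb_weight * box_loss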

" And is there any other reference to do so?" Previously in our NeurIPS 20 paper (aLRP Loss - https://arxiv.org/abs/2009.13592), we also used a self-balancing strategy. It was a bit different: For example, in aLRP Loss self-balancing was epoch-based, but in RS Loss it is iteration-based. Overall, the strategy is simpler in RS Loss. I don't remember any other detection/segmentation paper to use this kind of balancing strategy or design/analyse losses with bounded & similar ranges in all sub-tasks (i.e. classification, box regression, mask prediction).

@jcdubron
Author

Thanks for your detailed explanation. The comprehensive experiments validate the effectiveness of this design.
