Train an mAP 0.71 model by modifying 'mask' & 'scale' #23

Open
cory8249 opened this issue May 16, 2017 · 5 comments


@cory8249
Contributor

I have been tracing the YOLOv2 C code for the last few days, and I think there is a misunderstanding about 'mask' and 'scale'.

In this PyTorch repo, the mask is used in the loss function. It helps the network focus on the correct anchor boxes instead of punishing other, irrelevant boxes.
self.iou_loss = nn.MSELoss(size_average=False)(iou_pred * iou_mask, _ious * iou_mask) / num_boxes

So how do we calculate the right scale_mask?

YOLO's mask is based on the predicted objectness (0~1) of the box.
So, if a box's predicted objectness is high (e.g. 0.9) but there is no ground truth at that position, it should be punished. The punishment = noobject_scale * (0 - predicted objectness)
l.delta[obj_index] = l.noobject_scale * (0 - l.output[obj_index]);
Hence, this term helps the network learn to give a reasonable confidence to each box.
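To make the quoted darknet line concrete, here is a minimal plain-Python sketch of the no-object delta. The function name `noobject_delta`, the sample objectness values, and `noobject_scale=1.0` are illustrative assumptions, not code from darknet or from this repo.

```python
def noobject_delta(objectness, noobject_scale=1.0):
    """Per-box delta for positions without ground truth.

    Mirrors the quoted C line:
        l.delta[obj_index] = l.noobject_scale * (0 - l.output[obj_index]);
    `noobject_scale=1.0` is an assumed value for illustration only.
    """
    return [noobject_scale * (0.0 - o) for o in objectness]


# Hypothetical predicted objectness for three boxes with no ground truth.
preds = [0.1, 0.5, 0.9]
deltas = noobject_delta(preds, noobject_scale=1.0)

for p, d in zip(preds, deltas):
    # The magnitude of the delta grows with the predicted objectness.
    print(f"objectness={p:.1f} -> delta={d:+.2f}")
```

A box that wrongly predicts objectness 0.9 gets a delta nine times larger in magnitude than one that predicts 0.1.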

However, in this repo
_iou_mask[best_ious <= cfg.iou_thresh] = cfg.noobject_scale
does not consider objectness. It punishes every unqualified box with the same value, so the detector learns objectness very poorly.

This is the most obvious one; the other 'mask' and 'scale' values are also implemented the wrong way. Actually, YOLO has a more complicated policy for these scale masks (some if-else conditions). I also found that YOLO's loss is calculated before exp() and log(), not after.

After fixing the scale_mask bug, VOC07 test mAP (trained on VOC07+12 trainval) increases from 0.67 to 0.71, which is much closer to yolo-voc-weights.h5 (0.7221).

You can refer to my code in darknet_v2.py. I am still debugging it and it is not complete yet; I just want to point out what I found.

@longcw
Owner

longcw commented May 16, 2017

Thank you!

@JesseYang

@cory8249
In my understanding, l.delta in the darknet source code is the negative derivative of the loss with respect to the input value.

If the mask for positions without ground-truth boxes is just l.noobject_scale, then the loss is defined as l.noobject_scale / 2 * (pred_iou - gt_iou) ^ 2, where gt_iou is 0. In this case, the negative derivative with respect to pred_iou is l.noobject_scale * (0 - pred_iou), which is consistent with the darknet source code: l.delta[obj_index] = l.noobject_scale * (0 - l.output[obj_index]).

From the equation loss = l.noobject_scale / 2 * (pred_iou - gt_iou) ^ 2, the punishment for positions without gt boxes depends on both noobject_scale and pred_iou. The negative derivative l.noobject_scale * (0 - pred_iou) shows the same thing. Thus, as pred_iou grows (from 0 to 1), the punishment already grows with it, and it is not necessary to incorporate pred_iou into the mask to increase the punishment.
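The derivative claim above can be checked numerically. The sketch below is plain Python with hypothetical helper names (`noobj_loss`, `negative_gradient`, `darknet_delta`) and an assumed `noobject_scale` of 1.0. It compares a central finite difference of the quadratic loss against the closed-form darknet delta.

```python
def noobj_loss(pred_iou, noobject_scale, gt_iou=0.0):
    """Quadratic loss from the comment: scale / 2 * (pred_iou - gt_iou) ^ 2."""
    return noobject_scale / 2.0 * (pred_iou - gt_iou) ** 2


def negative_gradient(pred_iou, noobject_scale, eps=1e-6):
    """Negative derivative of noobj_loss w.r.t. pred_iou (central difference)."""
    up = noobj_loss(pred_iou + eps, noobject_scale)
    down = noobj_loss(pred_iou - eps, noobject_scale)
    return -(up - down) / (2.0 * eps)


def darknet_delta(pred_iou, noobject_scale):
    """Closed form quoted from darknet: noobject_scale * (0 - output)."""
    return noobject_scale * (0.0 - pred_iou)


scale = 1.0  # assumed value, for illustration only
for pred in (0.1, 0.5, 0.9):
    numeric = negative_gradient(pred, scale)
    closed = darknet_delta(pred, scale)
    print(f"pred_iou={pred:.1f} numeric={numeric:+.6f} closed_form={closed:+.6f}")
```

Under these assumptions the two columns agree up to floating-point error, and both grow in magnitude with pred_iou even though the scale itself is constant.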

So I think the previous implementation _iou_mask[best_ious < cfg['iou_thresh']] = cfg['noobject_scale'] * 1 is reasonable and consistent with the darknet source code.

@cory8249 cory8249 changed the title Train an mAP 0.71 model by fixing wrong implementation of 'mask' & 'scale' Train an mAP 0.71 model by modifying 'mask' & 'scale' May 17, 2017
@yangyu12

yangyu12 commented Dec 3, 2017

Hi @cory8249,
I found your yolo2-pytorch code in your repository, but I find it hard to compare it with longcw's original version.
Could you please list all the modifications you made to improve the mAP to 0.71?
By the way, is darknet_training_v3.py the script that obtains 0.71 mAP?

@Erotemic

@JesseYang Your argument makes sense to me, and I tend to agree with it, but when I look in the current source code I see that @cory8249's version is being used. Why is this? It seems like iou_mask should simply be cfg.noobject_scale wherever there is no object. Is this wrong?

@xuzijian

xuzijian commented Apr 9, 2018

I agree with @JesseYang's points, and in order to match the original code, I guess it should be
_iou_mask[best_ious < cfg['iou_thresh']] = math.sqrt(0.5*cfg['noobject_scale']) (and likewise for high-IoU anchors).

I'm running an experiment to test these settings.
The results are quite similar to what I got from the 'master' version, which is currently 72.3% (and a little better with 416*416 input).
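The square root in that suggestion follows from how the mask enters the loss: when both the prediction and the target are multiplied by the mask before a summed squared error, the mask is effectively squared. A minimal plain-Python sketch, where `masked_sse` is a hypothetical stand-in for nn.MSELoss(size_average=False)(pred * mask, target * mask) and all values are made up:

```python
import math


def masked_sse(pred, target, mask):
    """Sum of squared errors after multiplying both sides by the mask."""
    return sum((p * m - t * m) ** 2 for p, t, m in zip(pred, target, mask))


noobject_scale = 1.0   # assumed value, for illustration only
pred = [0.2, 0.7]      # hypothetical predicted IoU / objectness
target = [0.0, 0.0]    # no ground truth at these positions

# Proposed mask value: sqrt(0.5 * noobject_scale) for every no-object box.
mask = [math.sqrt(0.5 * noobject_scale)] * len(pred)

loss_masked = masked_sse(pred, target, mask)

# Reference loss from the derivation: scale / 2 * (pred - gt) ^ 2, summed.
loss_reference = sum(
    noobject_scale / 2.0 * (p - t) ** 2 for p, t in zip(pred, target)
)

print(loss_masked, loss_reference)
```

With a mask of plain noobject_scale instead, the same function would weight each term by noobject_scale squared, which only matches the reference when that happens to equal noobject_scale / 2.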
