
Anchors possibly are not in harmony with the annotations #8

Open

MyVanitar opened this issue Aug 13, 2017 · 31 comments

@MyVanitar

MyVanitar commented Aug 13, 2017

Hello,

I have a dataset with one class, apple. Apples are objects with almost-square annotations, at whatever scale the object appears.

But when I visualize the anchors, some of them are long rectangles, i.e. their widths and heights are very different, except for the yellow one. Is this correct?

[image: anchors5]

@MyVanitar
Author

MyVanitar commented Aug 13, 2017

Also, I trained the model with the new anchors and, as one might predict, the accuracy dropped (both IoU and recall): a 1% reduction in IoU and a 10% reduction in recall.

@Jumabek
Owner

Jumabek commented Aug 15, 2017

Hi @VanitarNordic,
I think that if all of your annotations are of the same class, then the logical thing to do is to calculate one anchor and then generate multiple scales of it.

For example:
calculate the single anchor, let's say it comes out as (4, 4);
next, generate multiple anchors by scaling it -> (3, 3), (5, 5), (4.3, 4.3), etc.

Let me know if it doesn't improve your performance.
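
A minimal sketch of that suggestion (the base anchor and the scale factors are illustrative, not from gen_anchors.py):

import numpy as np

# take the single centroid from a k=1 clustering run and derive
# several anchors by scaling it
base_anchor = np.array([4.0, 4.0])      # assumed (width, height) of the one anchor
scales = [0.75, 1.0, 1.25, 1.5]         # assumed scale factors; tune for your data
anchors = np.stack([base_anchor * s for s in scales])
# -> rows (3, 3), (4, 4), (5, 5), (6, 6)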

@MyVanitar
Author

Hello Jumabek,

Actually, I don't think that would solve the case. Do you know why? Because, for example, consider that we have two classes, oranges and apples, and both have square-shaped annotations. Then what should we do? It is not clear how these anchors are made.

Another unknown about YOLO is that it is not clear how to fine-tune it. The provided darknet19_448.conv.23 weights are ImageNet classification weights, used as the initial weights for training the model from scratch.

@christopher5106

Once your centroids are computed, have you tried to compute the number of elements in each cluster? There might be a collapse where all boxes fall into the same cluster except a few of them. To compute the assignments, use the following code:
https://github.com/Jumabek/darknet_scripts/blob/master/gen_anchors.py#L80-L90

@MyVanitar
Author

@christopher5106
Is your comment addressed to me?

@christopher5106

yes

@christopher5106

It might be that your dataset contains a few boxes that are 'very different', and these atypical boxes can create their own cluster of one element...

@MyVanitar
Author

@christopher5106

Not really; my dataset is small and I can say I remember all of its images. All the annotations are square-like. Also, as I mentioned, training with these new anchors made the results worse, not better. That's the evidence.

@christopher5106

I suggest you run the assignment step on the anchors you got and tell me the number of elements per cluster. That is the only way for me to analyze the result of the clustering.

@MyVanitar
Author

MyVanitar commented Aug 15, 2017

@christopher5106

Is the assignments variable what you want from the fragment of code you mentioned?

        D = []
        iter += 1
        for i in range(N):
            d = 1 - IOU(X[i], centroids)
            D.append(d)
        D = np.array(D)  # D.shape = (N, k)

        print "iter {}: dists = {}".format(iter, np.sum(np.abs(old_D - D)))

        # assign samples to centroids
        assignments = np.argmin(D, axis=1)

@christopher5106

Right, and then compute the number of elements per cluster.

@christopher5106

# one-hot encode each sample's cluster assignment, then sum per column
one_hot = np.zeros((N, k))
one_hot[np.arange(N), assignments] = 1
nb_per_cluster = np.sum(one_hot, axis=0)
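
For what it's worth, assuming assignments holds integer cluster indices in [0, k), the same counts can be obtained in one call:

nb_per_cluster = np.bincount(assignments, minlength=k)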

@MyVanitar
Author

MyVanitar commented Aug 15, 2017

This is the whole output:

iter 1: dists = 930.425956709
nb_per_cluster=[  38.  182.   55.   20.   33.]
iter 2: dists = 155.131292546
nb_per_cluster=[  46.  146.   82.   29.   25.]
iter 3: dists = 59.0919690496
nb_per_cluster=[  54.  109.  113.   31.   21.]
iter 4: dists = 61.8350888818
nb_per_cluster=[  60.   91.  127.   29.   21.]
iter 5: dists = 43.8474440388
nb_per_cluster=[  65.   78.  142.   23.   20.]
iter 6: dists = 41.6920456029
nb_per_cluster=[  68.   66.  151.   25.   18.]
iter 7: dists = 29.8349217517
nb_per_cluster=[  75.   56.  156.   23.   18.]
iter 8: dists = 24.8531294969
nb_per_cluster=[  81.   52.  156.   21.   18.]
iter 9: dists = 14.2018745636
nb_per_cluster=[  89.   51.  152.   18.   18.]
iter 10: dists = 9.50047598229
nb_per_cluster=[  92.   49.  153.   16.   18.]
iter 11: dists = 9.18284330426
nb_per_cluster=[  93.   49.  153.   15.   18.]
iter 12: dists = 2.92465011815
nb_per_cluster=[  93.   49.  153.   15.   18.]
Centroids =  [[ 0.22046375  0.34864097]
 [ 0.06277104  0.10153069]
 [ 0.15715894  0.21793756]
 [ 0.3867708   0.50731473]
 [ 0.70217017  0.86072544]]
(5L, 2L)
Anchors =  [[  0.81602353   1.31989902]
 [  2.04306624   2.83318831]
 [  2.86602878   4.53233258]
 [  5.0280204    6.59509153]
 [  9.12821217  11.18943078]]

centroids.shape (5L, 2L)
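
As a side note: with the default gen_anchors.py settings (416x416 network input, stride 32), the printed Anchors appear to be just these Centroids multiplied by the 13x13 output-grid size and then sorted by width, e.g.:

grid = 416 / 32             # = 13 output cells per side
anchors = centroids * grid
# e.g. 0.06277104 * 13 ≈ 0.81602 and 0.10153069 * 13 ≈ 1.31990, the first anchor row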

@christopher5106

christopher5106 commented Aug 15, 2017

Sounds like the clusters are well balanced.

Now a new question arises: in annotation_dims, how have your w, h values been normalized? If you normalize by dividing box_width by image_width and box_height by image_height, that won't work, because your images are not all square. You need to normalize box_width and box_height by the same factor, for example max(image_height, image_width).
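
A rough sketch of the two normalizations (variable names are illustrative, not from gen_anchors.py; sizes are in pixels):

# per-dimension normalization (what the VOC conversion script does):
# a square box becomes a rectangle whenever image_width != image_height
w_rel = box_width / float(image_width)
h_rel = box_height / float(image_height)

# ratio-preserving normalization suggested above: divide both sides by the
# same factor, so square boxes stay square
scale = float(max(image_width, image_height))
w_rel_sq = box_width / scale
h_rel_sq = box_height / scale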

@MyVanitar
Author

MyVanitar commented Aug 15, 2017

If you normalize by dividing box_width by image_width and box_height by image_height, that won't work

Yes, of course my images are not square; they have different sizes, like VOC. If that's the key, have the YOLO authors converted the VOC annotations with a script that takes your consideration into account?

@christopher5106

So now you have your answer: since images are resized to 416x416 in some YOLO implementations without preserving the aspect ratio, it is normal that you get some rectangles, because squares become rectangles after the resize.
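
As a quick worked example (hypothetical numbers): a 100x100-pixel box in a 500x375 image becomes roughly 83x111 pixels after a non-ratio-preserving resize to 416x416, since 100 * 416/500 ≈ 83 and 100 * 416/375 ≈ 111.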

@MyVanitar
Copy link
Author

MyVanitar commented Aug 16, 2017

@christopher5106

Are you sure about this? Keeping the ratio relates to the width and the height of the image, not to one fixed number. If all images were square, width and height would be equal, and we would be dividing X and Y by the same number; but because width and height are not equal, each coordinate must be divided by its own corresponding dimension, otherwise we have just made the bounding box smaller.

Actually, we are calculating the size of the bounding box relative to the whole image.

@christopher5106

When you compute the new centroids in https://github.com/Jumabek/darknet_scripts/blob/master/gen_anchors.py#L98-L102
you get values between 0 and 1,
[[ 0.22046375 0.34864097]
[ 0.06277104 0.10153069]
[ 0.15715894 0.21793756]
[ 0.3867708 0.50731473]
[ 0.70217017 0.86072544]]
which means they are relative to the image width and height, no?

@MyVanitar
Author

Yes, I think so. Then what's your point? You mean we should instead divide all the absolute coordinates by a single number such as max(image_height, image_width)?

@christopher5106

Yes, you multiply them back by image_width or image_height, depending on whether it is a width or a height, and then divide by max(image_height, image_width).
Then you'll get squares for sure. But be careful: these anchors won't work in a YOLO network implementation that takes square images only.

@MyVanitar
Author

MyVanitar commented Aug 16, 2017

these anchors won't work in a YOLO network implementation that takes square images only.

The images used to calculate the anchors here are actually the same images that will be used to train and validate the network. So I think it should work by default, shouldn't it?
If not, do you think we should make all the training/validation images square with a padding script or something like that?

@christopher5106

It works by default.
Here I was just telling you how to check by computing square anchors: your boxes, which look square, are not square if you consider them relative to the image (which is resized to a square). But those square anchors won't work in the YOLO training.

@MyVanitar
Author

@christopher5106

Alright, thank you very much. I have nothing to add except that I must test this, and I'll tell you the results.

In the meantime, do you have any information about transfer learning with YOLO? Some people say that I should take a partial copy of the VOC-trained weights, keeping everything except the last convolutional layer. I don't think training with darknet19_448.conv.23 is transfer learning; it is more like training the network from scratch from an initial classification weight.

@christopher5106

If you just re-initialize the last layer's weights, it is transfer learning.

@MyVanitar
Author

@christopher5106

Excuse me, Christopher. I still have a point of confusion with my friends. Forget about the anchors.

Just this question: how should the relative coordinates be calculated?

@christopher5106

christopher5106 commented Aug 16, 2017

If you want the ratio to be preserved, you divide the dimensions by max(image_height, image_width); or you don't normalize them at all and keep them in pixels, but then you need to implement YOLO the way it is written in my article...

@MyVanitar
Author

The thing is, I don't know which approach gives higher accuracy, or what the original authors used: your method, or the ordinary division by width and height. This is the core calculation the authors suggest for converting VOC XML annotations to YOLO format:

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
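
For context, a hypothetical example call (in the VOC conversion script, size is (image_width, image_height) and box is (xmin, xmax, ymin, ymax) in pixels):

# a square 100x100-pixel box in a 500x375 image
x, y, w, h = convert((500, 375), (100, 200, 100, 200))
# w = 100/500 = 0.2, h = 100/375 ≈ 0.267 -> the square box is no longer square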

@christopher5106

From my point of view, your clusters are well balanced, so that's OK.

@christopher5106

I was just answering why you did not get square anchor boxes: this is due to the resizing of the image to a square.

@Jumabek
Owner

Jumabek commented Aug 21, 2017

@christopher5106, thanks for helping out

@christopher5106

Your convert function is right, relative to what the YOLO authors have done.
But I would not do it that way. I would not divide by each dimension, because that is why you end up with anchors that are not square. If you still want to use their implementation, what you can do is crop the image to a square, randomly positioned. This way, dw and dh are the same, and square anchors stay square.
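
A minimal sketch of that cropping idea (a hypothetical helper, not part of this repo or of the YOLO scripts):

import random

def random_square_crop(image_width, image_height):
    # pick a random square window of side min(w, h) inside the image;
    # boxes are then shifted by (x0, y0), clipped to the window, and both
    # their width and height divided by the same side, so squares stay square
    side = min(image_width, image_height)
    x0 = random.randint(0, image_width - side)
    y0 = random.randint(0, image_height - side)
    return x0, y0, side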
