min_size/max_size #20
Comments
@natlachaman Hi! |
@rykov8 Thanks for your quick reply! A bit off topic, but also a very quick question: did you use PASCAL VOC2007 for the results you uploaded? I'm trying to reproduce your results but I haven't succeeded so far. I checked the names of the image files I have and they don't seem to match, so I was wondering if I just got that part wrong. I'm getting the data from the official PASCAL site http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2007/ Thanks a ton! natlachaman |
@natlachaman relative coordinates are good because you can resize the input image from, say, 640x480 to 300x300 without rescaling the bounding boxes. The input image to the net, on the other hand, is always 300x300 (you may change it, but the architecture is designed for this input; the authors also have one for ~500x500 pictures and it is a bit different), so that is probably why it is OK to specify the sizes of the priors in pixels. You are right, though, that it would probably have been better to leave the prior sizes as scales; I followed the original implementation and didn't consider improvements of my own. As for your last question: what results do you mean? If you mean the training example, I used my own small dataset, which is very different from PASCAL. If you mean the weights, they are ported from the original |
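To illustrate the point about relative coordinates, here is a minimal sketch. The helper names `normalize_box` and `denormalize_box` are hypothetical and are not taken from this repository; the box values are made up for the example.

```python
def normalize_box(box, width, height):
    """Convert a pixel box (xmin, ymin, xmax, ymax) to relative [0, 1] coordinates."""
    xmin, ymin, xmax, ymax = box
    return (xmin / width, ymin / height, xmax / width, ymax / height)


def denormalize_box(rel_box, width, height):
    """Map a relative box back to pixel coordinates of an image of the given size."""
    xmin, ymin, xmax, ymax = rel_box
    return (xmin * width, ymin * height, xmax * width, ymax * height)


# A box annotated on a 640x480 image (illustrative values).
rel = normalize_box((64, 48, 320, 240), 640, 480)

# The same relative box is still valid after resizing the image to 300x300;
# only the pixel view of it changes.
box_300 = denormalize_box(rel, 300, 300)
print(rel, box_300)
```

The relative box is computed once from the original annotation and stays valid for any resized copy of the image, which is why the ground truth does not have to be rescaled together with the input.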
@rykov8 Oh! I didn't know the image size had that effect on the network. In the paper they mention that they got better performance with larger images, but I didn't know they developed different architectures for different image sizes. Good to know! As for the PASCAL question: I missed that (#7)! I thought you trained it on PASCAL VOC2007. In any case, the model you implemented follows the original implementation, so in theory it should work relatively well. I used the same data format as you did for your own dataset and resized the images to 300x300, but I still get really strange behaviour: the error shoots up like crazy halfway through the first epoch and I can't figure out why. Thanks for your time, always very helpful :) |
@natlachaman I'm not sure the architectures differ much for different input sizes (the idea is certainly the same), but if I am right, the net for 500x500 images is a little deeper. Anyway, you can check their prototxt files to understand the architectures. Moreover, as I have mentioned, in the third revision of the paper they slightly changed the architecture for 300x300 pictures. As for the error, do you use I also have a small question for you. Do you perhaps have an implementation of the MAP metric as it is computed in PASCAL? I am implementing it (because I failed to find an implementation, which is quite strange), but I'm too lazy to finish. If you have one, feel free to make a pull request or post a link to someone's implementation. |
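For reference, a rough sketch of the per-class average precision in the PASCAL VOC2007 style (11-point interpolation). It is not code from this repository. It assumes the detections of one class have already been matched to ground truth (for example IoU of at least 0.5, each ground-truth box matched at most once), so every detection carries a true/false positive flag; the function names are hypothetical.

```python
def precision_recall(scores, is_true_positive, num_ground_truth):
    """Cumulative precision and recall over detections sorted by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    recall, precision = [], []
    tp = fp = 0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recall.append(tp / num_ground_truth)
        precision.append(tp / (tp + fp))
    return recall, precision


def voc07_average_precision(recall, precision):
    """Mean of the max precision at recall >= t, for t = 0.0, 0.1, ..., 1.0."""
    ap = 0.0
    for k in range(11):
        t = k / 10
        candidates = [p for r, p in zip(recall, precision) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11
    return ap


# Toy example: three detections of one class, two ground-truth boxes.
rec, prec = precision_recall([0.9, 0.8, 0.3], [True, False, True], 2)
print(voc07_average_precision(rec, prec))
```

mAP is then the mean of this value over all classes. The matching step and the handling of boxes flagged as "difficult" are left out of the sketch.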
@rykov8 I usually use Adam or RMSprop, for the same reason. No magic powers so far, hehe. As for your question: no, I don't. Implementing MAP is definitely on my list. I started working with SSD last week, on and off, so I was mainly focused on getting it to work on my dataset first. But I'll for sure make a pull request whenever I have MAP implemented (if I get further with SSD), or refer you to other work if I stumble upon something interesting. Thanks again for your help! |
@natlachaman you are welcome :) |
Hi again!
I'm trying to use your implementation on a different problem than the one the PASCAL VOC dataset suggests. In my case, I need to detect much smaller objects (ground truth boxes are 50x50 px in 768x1024 images).
From what I've seen so far, min_size and max_size determine the dimensions of the default boxes. Are these parameters in pixels, or something else? The paper talks about scales, with values ranging from 0 to 1, and I'm not sure whether you implemented a different version that is conceptually equivalent or whether I'm mixing up concepts. Thanks in advance!
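A small sketch of how the two views relate, assuming the linear scale rule from the SSD paper, s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1) with s_min = 0.2 and s_max = 0.9, and a 300x300 network input. The function names are hypothetical, and the min_size/max_size values actually configured in this repository may differ from what the formula gives. The image orientation (1024 wide, 768 high) is also an assumption.

```python
def ssd_scales(num_layers=6, s_min=0.2, s_max=0.9):
    """Relative default-box scales, linearly spaced between s_min and s_max."""
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + step * k for k in range(num_layers)]


def scales_to_pixels(scales, input_size=300):
    """Convert relative scales to box sizes in pixels of the network input."""
    return [s * input_size for s in scales]


def resized_object_size(obj_w, obj_h, img_w, img_h, input_size=300):
    """Size of an object in pixels once its image is resized to the network input."""
    return (obj_w * input_size / img_w, obj_h * input_size / img_h)


scales = ssd_scales()
print(scales_to_pixels(scales))

# 50x50 px ground-truth box in an image assumed to be 1024 wide and 768 high.
print(resized_object_size(50, 50, 1024, 768))
```

Under these assumptions a 50x50 px object shrinks to roughly 15x20 px after resizing, well below the smallest default box the formula produces (0.2 * 300 = 60 px), so smaller prior sizes or a larger network input would likely be needed for such objects.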