Training images terms and conditions, [net] layer parameters #30
Unfortunately changing the ...
@VanitarNordic Detection works quite well at different resolutions such as 544x544 or 832x832; there is a guarantee of this: https://arxiv.org/pdf/1612.08242.pdf
It is strange that you got training problems, because there are some guarantees: https://arxiv.org/pdf/1612.08242.pdf
And because even if we train the model only at 416x416, the network is still resized to a different resolution every 10 iterations (batches) if we set the corresponding flag (see parser.c and detector.c).
Try to use 416x416, but with the change at Line 244 in 76dbdae:
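The multi-scale behavior described above (resizing the network every 10 batches) can be sketched roughly as follows. The bounds 320..608 and the step of 32 match the YOLOv2 paper; treat the function name and exact logic as an illustrative assumption, not darknet's actual code:

```python
import random

def pick_network_size(iteration, step=32, lo=320, hi=608):
    """Every 10 iterations (batches), pick a new square network
    resolution that is a multiple of 32, YOLOv2-style."""
    if iteration % 10 != 0:
        return None  # keep the current resolution
    return random.choice(range(lo, hi + 1, step))  # 320, 352, ..., 608
```

Training at 416x416 with this enabled already exposes the network to 544x544 inputs, which is why a 416-trained model can still be run at 544 for detection.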
"You can train YOLO from scratch if you want to play with different training regimes, hyper-parameters, or datasets. Here's how to get it working on the Pascal VOC dataset." Then it says about the conv.23 weights and so on ...
@VanitarNordic
No, I mean 416x416 with automatic dynamic resolution changing. If you want to use 544x544, add ... between lines 40 and 41 in detector.c (Line 40 in 76dbdae).
Maybe it can solve your problem with training the 544x544 model.
So you mean if I put ...? I have made some updates to the above reply as well. Please have a look.
@VanitarNordic Yes.
The website does not have to be extremely accurate in its terminology. But the article is, and in both cases, classification & detection, we do fine-tuning: https://arxiv.org/pdf/1612.08242.pdf
In addition, there are well-established scientific terms and their definitions: #6 (comment)
Thanks :-) Well, I tried your suggestion and I get an out-of-memory error on a 6 GB GPU. Let me change the batch or subdivision and see what happens. Let me also ask you this: if we select 416x416 or 544x544, does it mean that our training images should be bigger than this size? And what about the case when the trained network does not detect objects at higher thresholds, although it has been trained well enough?
About solving the out-of-memory error on a 6 GB GPU: #12 (comment)
Yes, low-resolution, lossy-compressed (jpg) images can greatly degrade learning.
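A common way to resolve the out-of-memory error (possibly what #12 suggests; treat the exact values below as an illustration, not a recommendation) is to raise `subdivisions` in the cfg's `[net]` section, so each GPU step processes only batch/subdivisions images while the effective batch size stays the same:

```ini
[net]
batch=64
subdivisions=16   # increase (32, 64, ...) until training fits in GPU memory
width=416
height=416
```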
Yesss, so my assumption was right: a good training image should not be smaller than the network's defined size. That might be the reason for the above problem with 544x544, who knows. I will check, come back, and share the experience. I have one more question about good annotation; please hold on until I post it. The answer is very important for me at the moment.
In many training images, the target object is partially hidden behind other objects. What is the best annotation technique there: cover the hidden object areas inside the rectangle, or cover only the visible areas? Please have a look at the pictures below: Shape A, Shape B. For apple detection, which annotation is better to include the apple that is partially hidden?
Case 2 is closer to the truth. In short, it is better to use something in between these two examples: as many parts of the object as possible should get into the bbox, and as few parts of other objects as possible. In general there is no perfect solution; the trade-off depends on the priority.
Wow, thanks. What about when we want to detect the object in tricky situations, i.e. when it is covered by other objects (such as an apple inside a hand)? Is it good for the training images to contain only free and clear examples of the target object, or should they contain some tricky training images as well? In other words, if we train the model with samples where the target object is fully visible and free, will the model be able to detect that object in tricky scenes afterwards?
Yes, if we train the model with samples where the target object is fully visible and clear, the model will be able to detect overlapped objects. About ...
Also you can try this fix instead of ...
You mentioned that training images should not contain objects with overlapping ...
In detail:
This specifies when the object is considered found: when ...
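The "considered found" criterion left truncated above is normally an IoU (intersection over union) threshold between a detection and the ground-truth box; a minimal sketch, assuming boxes given as (x_min, y_min, x_max, y_max):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (zero area if the boxes do not intersect)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A detection is commonly counted as correct when IoU >= 0.5
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.1429
```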
Thank you very much for the detailed explanation. You mean I should copy some images from the training dataset to the validation dataset? I.e. the images inside the validation dataset should also exist inside the training dataset?
Let me describe my situation with an example. I was getting several false positives in camera detection, so I included those scenes with the false-positive objects as images in the training set again and trained from the beginning, but there was no progress in reducing false detections; the model was still detecting the same false positives. You know, say we want to detect an apple. If some images show the apple in the hand of a person, the model will think the hand is also a part of the apple and will detect the hand, or even the face, as an apple too. If I put an apple on a board, it will think the board is also part of it. This can be reduced significantly by increasing the threshold, BUT that also affects the detection of the apple itself: it makes the apple less detectable from far away, or makes the detection blink. I trained the model to the best error for both training and validation, but this phenomenon was not solved. Do you know the trick?
I mean, if you have 10000 images in the dataset, then: ...
If the training dataset contains apples on different backgrounds, then Yolo learns: ...
Your training dataset should contain as many backgrounds as possible. How many images and classes do you have?
You explained very important facts which I didn't know about.
Okay, I didn't know about these anchor parameters. Can you tell me about the anchors and how I should calculate them from the input video resolution? Should I also change them in the training phase?
My dataset is very small, only 150 images on the training side ;-), but I use it for experiments and checking results; besides, they all have different backgrounds. So I assume my experience is maybe due to the low number of images.
I didn't know about this. I would be grateful if you could explain this in more detail so I can adjust it for further training, if necessary.
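On the anchors question: they are normally recomputed by clustering the widths and heights of the ground-truth boxes in your own dataset. YOLOv2 uses k-means with a 1 − IoU distance on box sizes relative to the grid; the plain Euclidean k-means below is a simplified sketch, and the box data is made up:

```python
import random

def kmeans_anchors(box_whs, k=5, iters=50, seed=0):
    """Cluster (w, h) pairs (e.g. box_w/32, box_h/32 in grid units) into k anchors.
    Simplified: Euclidean distance instead of YOLOv2's 1 - IoU distance."""
    rng = random.Random(seed)
    centers = rng.sample(box_whs, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in box_whs:
            i = min(range(k),
                    key=lambda j: (w - centers[j][0]) ** 2 + (h - centers[j][1]) ** 2)
            clusters[i].append((w, h))
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)

# Hypothetical box sizes in grid units; real ones come from your label files
boxes = [(1.1, 1.2), (0.9, 1.0), (3.0, 3.2), (2.8, 3.0), (6.0, 6.5), (5.8, 6.1)]
print(kmeans_anchors(boxes, k=3))
```

The resulting (w, h) pairs would replace the `anchors=` line in the cfg; they depend on your dataset's box shapes, not directly on the input video resolution.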
I.e. ...
But these ranges are good for the VOC dataset with 20 classes trained for 45 000 iterations. If you use 1 class with 2000 iterations, you should divide them by about 20: Line 17 in b3a3e92
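The scaling being described is proportional: the learning-rate decay steps in the cfg are simply rescaled with the total number of iterations. A sketch of that arithmetic (the VOC numbers 25000, 35000 and 45000 come from this thread; treat the helper itself as illustrative):

```python
def scale_steps(steps, old_max_batches, new_max_batches):
    """Rescale learning-rate decay steps proportionally to a new training length."""
    factor = new_max_batches / old_max_batches
    return [round(s * factor) for s in steps]

# VOC cfg: max_batches=45000, steps=25000,35000 -> for a 2000-iteration run:
print(scale_steps([25000, 35000], 45000, 2000))  # [1111, 1556]
```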
Thank you again.
So does the annotation software make a blank file for a typical background-only image? I mean, does it create ...
Would you please explain where the division by 40 comes from? What about when we have 2 or 3 classes?
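If the annotation tool does not produce blank label files for background-only images, a small helper script can create them. The side-by-side layout of images and `.txt` labels follows darknet's convention, but the function itself is just a sketch:

```python
import os

def make_blank_labels(image_dir):
    """Create an empty .txt label file for every image that lacks one,
    marking it as a background-only (no objects) training sample."""
    created = []
    for name in os.listdir(image_dir):
        base, ext = os.path.splitext(name)
        if ext.lower() in (".jpg", ".jpeg", ".png"):
            label = os.path.join(image_dir, base + ".txt")
            if not os.path.exists(label):
                open(label, "w").close()  # zero-byte file: no objects
                created.append(label)
    return created
```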
Thank you again.
Actually, I asked this because 25000/40 = 625 and 35000/40 = 875; is there any other calculation behind it?
What about if we had two classes, banana and apple?
Doesn't this affect the accuracy? Because if it resizes all images, many will be stretched, since the aspect ratio is not kept when resizing to a fixed size such as 416x416.
The accuracy will be the same in both cases: ...
So as a result, it makes no difference in accuracy even if we resize to the network size ourselves. So make life easier and don't touch it.
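The reason the accuracy comes out the same is that darknet stretches every image to the network size at load time, and the labels are stored in relative coordinates, so they stretch along with the image. A sketch of that coordinate math (the function is illustrative, not darknet code):

```python
def stretch_box(box_rel, src_wh, dst_wh):
    """A bbox stored in relative coords (cx, cy, w, h in [0,1]) is unchanged
    by stretching the image: only the absolute pixel coords change."""
    cx, cy, w, h = box_rel
    sw, sh = src_wh
    dw, dh = dst_wh
    abs_src = (cx * sw, cy * sh, w * sw, h * sh)  # box in the original frame
    abs_dst = (cx * dw, cy * dh, w * dw, h * dh)  # same box after stretching
    return abs_src, abs_dst

# A centered box on a 1280x720 frame, after stretching to 416x416:
src, dst = stretch_box((0.5, 0.5, 0.25, 0.25), (1280, 720), (416, 416))
print(src)  # (640.0, 360.0, 320.0, 180.0)
print(dst)  # (208.0, 208.0, 104.0, 104.0)
```

Since the relative (cx, cy, w, h) tuple never changes, pre-resizing the images yourself neither helps nor hurts.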
Yes.
Alright, I tried to add the tips you mentioned about the background images, but yolo-mark does not create blank ...
I also tried the effect of background images. The result was immediate: it reduced the number of false positives significantly, even with only 150 images. I just added around 15 backgrounds and checked the results.
I added a feature to Yolo-mark to process background images without objects. Also, I sent an email to you from my address.
@VanitarNordic I added Darknet as a DLL: #27 (comment)
Hello,
I changed the `width` and `height` of the `[net]` layer to 544, but this led the model to strange behavior: as it iterated more, the non-detection behavior became stronger, and finally it detected nothing, even on training images! Do you know why? Strange.
Besides, when I look at YOLOv2 544x544 on the Darknet website, its cfg file has the `height` and `width` written as 416. Why?!
Also, I realized that the input images' sizes do not affect the consumed GPU memory. This is unique to YOLO, because big training images (in size or resolution) would easily lead to out of memory on other models.