Can anybody explain cfg file parameters from region layer and net layer and which of them required to changed for training own custom dataset? #933

AvaniPitre · 2018-05-30T10:44:28Z

No description provided.

AlexeyAB · 2018-05-30T11:47:39Z

What you should change to train on your own custom dataset: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
About parameters in sections [net] and [region]: explanation on cfg file parameters #279 (comment)

IlyaOvodov · 2018-05-30T14:24:16Z

By the way, be careful with the "random=1" parameter of the last layer if you use a net with fewer than 6 "stride by 2" layers and/or an input size that is smaller than 416x416 or not square. This parameter randomly changes the input size of the net in a hardcoded manner: +-160 px with a 32 px step. Not long ago it also contained an error that converted any input format to square. So "you should be aware of what you are doing".

AlexeyAB · 2018-05-30T15:03:14Z

@IlyaOvodov

Currently this fork doesn't change the input size of the net in a hardcoded manner:

darknet/src/detector.c

Lines 132 to 137 in ec766fc

    
    float random_val = rand_scale(1.4);	// *x or /x
    int dim_w = roundl(random_val*init_w / 32) * 32;
    int dim_h = roundl(random_val*init_h / 32) * 32;
    if (dim_w < 32) dim_w = 32;
    if (dim_h < 32) dim_h = 32;

Now random=1 can be used successfully with a non-square network. For example, a network with width=640 height=352 was successfully trained with random=1 and reached mAP = 89.82 %: Tiny YOLO: Looking for suggestions to improve training on a custom dataset #406 (comment)

Also, random=1 requires no more than 5 layers with stride=2 (pow(2,5)=32). Having fewer than 5 layers with stride=2 isn't a problem. For example, if you have only 3 layers with stride=2 (pow(2,3)=8), any value that is a multiple of 32 is also a multiple of 8, i.e. the network resolution 640x352 can be divided by both 32 and 8 without remainder.
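To make the resizing rule and the divisibility argument concrete, here is a minimal Python sketch of the quoted C lines. It is not the darknet code itself: `rand_scale` below is an assumed stand-in based only on the "*x or /x" comment, and `random_dims` and `total_stride` are hypothetical helper names.

```python
import math
import random
from typing import Tuple


def rand_scale(s: float) -> float:
    """Return a factor drawn from [1, s] or its reciprocal, each with probability 1/2.

    Assumption: this is what the "*x or /x" comment in the quoted snippet
    implies; it is not copied from the darknet sources.
    """
    scale = random.uniform(1.0, s)
    return scale if random.random() < 0.5 else 1.0 / scale


def random_dims(init_w: int, init_h: int, max_scale: float = 1.4) -> Tuple[int, int]:
    """Scale both sides by one shared factor and snap each to a multiple of 32."""
    factor = rand_scale(max_scale)
    # floor(x + 0.5) rounds half up, as C's roundl() does for positive values.
    dim_w = max(32, math.floor(factor * init_w / 32 + 0.5) * 32)
    dim_h = max(32, math.floor(factor * init_h / 32 + 0.5) * 32)
    return dim_w, dim_h


def total_stride(num_stride2_layers: int) -> int:
    """Overall downsampling factor of a net with this many stride=2 layers."""
    return 2 ** num_stride2_layers


if __name__ == "__main__":
    for _ in range(3):
        print(random_dims(640, 352))
    for n in (3, 5, 6):
        ok = 32 % total_stride(n) == 0
        print(n, "stride=2 layers -> total stride", total_stride(n),
              "| divides every multiple of 32:", ok)
```

Because the width and height share one factor, the aspect ratio is roughly preserved, and every generated size stays a multiple of 32, which is what makes 5 or fewer stride=2 layers safe.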

AvaniPitre · 2018-05-31T06:54:59Z

Thanks a lot @AlexeyAB for the quick response and links. I have already started training on my custom data with this darknet fork and the given guidelines. I trained with a copy of yolo-voc.2.0.cfg, with 1500 images for 2000 iterations. I am getting results, but the predicted bounding boxes are not as accurate as the ones marked for training. I need to improve the accuracy of the predicted bounding boxes.

From the above link I understood that I need to recalculate the anchors. Which other parameters do I have to change to improve the accuracy of the bounding boxes?
Below are the [net] and [region] sections from my cfg.

[net]

batch=64
subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.0001
max_batches = 45000
policy=steps
steps=100,25000,35000
scales=10,.1,.1

[region]

anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=0

Thanks in advance
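
For readers unsure how learning_rate, policy=steps, steps and scales in the [net] section above combine, here is a small Python sketch. It assumes that each scale is multiplied into the rate once training reaches the matching step and that no burn_in is configured (the posted cfg sets none); `steps_learning_rate` is a hypothetical helper, not a darknet function.

```python
def steps_learning_rate(iteration, base_lr, steps, scales):
    """Learning rate at a given iteration under policy=steps.

    Assumption: scale i takes effect from iteration steps[i] onwards,
    and the effects accumulate multiplicatively.
    """
    rate = base_lr
    for step, scale in zip(steps, scales):
        if iteration < step:
            break
        rate *= scale
    return rate


if __name__ == "__main__":
    # Values taken from the [net] section posted above.
    base_lr = 0.0001
    steps = [100, 25000, 35000]
    scales = [10, 0.1, 0.1]
    for it in (0, 100, 25000, 35000):
        print(it, format(steps_learning_rate(it, base_lr, steps, scales), "g"))
```

Under these assumptions the rate is raised tenfold after the first 100 iterations and then lowered by a factor of 10 at each of the two later steps.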

AlexeyAB · 2018-05-31T12:19:29Z

Set width=608 height=608
Set random=1
Read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
And train from the beginning for ~10 000 iterations.

dfsaw · 2018-06-06T06:42:52Z

Which parameter sets the total number of iterations?

AlexeyAB · 2018-06-07T00:42:04Z

@dfsaw max_batches= in the cfg-file.

dfsaw · 2018-06-07T12:17:49Z

Is there a minimum number of images I should train with? My objects are not being detected correctly.

AlexeyAB · 2018-06-07T12:41:39Z

@dfsaw

https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

It is desirable that your training dataset includes images with objects at different scales, rotations, lightings, from different sides, and on different backgrounds. You should preferably have 2000 images or more for each class.

dfsaw · 2018-06-14T09:01:20Z

Can anyone please explain the parameters? And why is recall equal to 0?
Also, I have 90 images in total, with batch=8 and subdivisions=4. Why is it then showing 128 images? I am new to deep learning, so I am not able to understand this.

Thanks

AlexeyAB · 2018-06-14T11:12:22Z

@dfsaw This is the total number of images for 16 iterations, i.e. batch x iterations = 8 x 16 = 128.
If you have fewer than 128 images, then some of the images will be used several times.
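
The arithmetic above can be written out as a short Python sketch; `images_seen` and `passes_over_dataset` are hypothetical helper names used only for illustration.

```python
def images_seen(batch, iterations):
    """Total images processed: one batch of images per iteration."""
    return batch * iterations


def passes_over_dataset(batch, iterations, dataset_size):
    """How many times, on average, each image has been used so far."""
    return images_seen(batch, iterations) / dataset_size


if __name__ == "__main__":
    # Numbers from the question: batch=8, 16 iterations, 90 images.
    print(images_seen(8, 16))
    print(round(passes_over_dataset(8, 16, 90), 2))
```

With 90 images the counter passes the dataset size after 12 iterations (8 x 12 = 96), so by iteration 16 some images have necessarily been used more than once.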

AlexeyAB · 2018-06-18T12:52:32Z

@dfsaw

What GPU do you use?
What mAP can you get?
darknet.exe detector map data/obj.data yolo-obj.cfg backup\yolo-obj_1500.weights
What params do you use in the Makefile?

dfsaw · 2018-06-19T04:59:12Z

GPU Titan
Parameters in the Makefile:
GPU=1
CUDNN=1
CUDNN_HALF=1

AlexeyAB · 2018-06-19T12:40:53Z

@dfsaw Use CUDNN_HALF=0 and train from the beginning for about 2000-4000 iterations.

dfsaw · 2018-06-20T10:15:49Z

@AlexeyAB what should batch, subdivisions and steps be? For batch=64 subdivisions=8, I am getting an out of memory error.

AlexeyAB · 2018-06-20T10:44:35Z

@dfsaw

steps should be at 80% and 90% of max_batches
batch is always 64
subdivisions should be as small as possible, but if OOM occurs, then set it to 64

https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

if an Out of memory error occurs, then in the .cfg-file you should increase subdivisions=16, 32 or 64.
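
As a rough illustration of these rules, here is a Python sketch. The helper names are hypothetical, and the second function assumes that subdivisions splits each batch into equal mini-batches of batch / subdivisions images that are held in memory at one time.

```python
def suggested_steps(max_batches):
    """Steps at 80% and 90% of max_batches (integer arithmetic avoids rounding issues)."""
    return max_batches * 8 // 10, max_batches * 9 // 10


def images_per_forward_pass(batch, subdivisions):
    """Images processed at once; a larger subdivisions value lowers memory use."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


if __name__ == "__main__":
    print(suggested_steps(10000))
    for subdivisions in (8, 16, 32, 64):
        print(subdivisions, images_per_forward_pass(64, subdivisions))
```

Under that assumption, raising subdivisions from 8 to 64 with batch=64 reduces the number of images processed at once from 8 to 1, which is why it helps against out-of-memory errors while the effective batch stays at 64.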

AvaniPitre · 2018-06-22T11:42:40Z

@AlexeyAB
I applied the settings you suggested ("Set width=608 height=608 Set random=1") and started training from scratch, but I was getting nan values for all parameters after 40 iterations. Training also gets killed and the exe crashes after 50 iterations, while resizing.
Is this because of the sizes of the training images used?

My training dataset includes images ranging from 182x53 (smallest) to 704x576 (largest).
I have 800 images in total and classes=1, and calculated the anchors as
anchors = 11.7425,9.5023, 5.8290,2.0653, 9.7451,5.8569, 3.7540,1.4357, 10.6144,7.2646
Should the input resolution be changed? Please guide.

Thanks

AlexeyAB · 2018-06-22T12:00:01Z

@AvaniPitre

also training is getting killed and exe crashes after 50 iteration while resizing

With the error CUDA out of memory?

my training dataset includes images with smallest size is 182X 53 and biggest size is 704X576

In your case it makes no sense to use 608x608. You can use 416x416 and random=1.

anchors = 11.7425,9.5023, 5.8290,2.0653, 9.7451,5.8569, 3.7540,1.4357, 10.6144,7.2646
is input resolution should be changed ? Please guide.

What command did you use for calculating anchors?

AvaniPitre · 2018-06-22T12:28:04Z

With the error CUDA out of memory?
No such message, but an Unhandled Exception with "Access violation reading location".

What command did you use for calculating anchors?
./darknet detector calc_anchors x64/Release/data/obj.data yolo-obj.cfg -num_of_clusters 5 -width 13 -height 13
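
One way to sanity-check the -width 13 -height 13 arguments is the Python sketch below. It rests on an assumption not stated in this thread: that [region] anchors are expressed in cells of the final feature map and that this map is the network input divided by a total stride of 32. The helper names are hypothetical.

```python
def grid_size(net_w, net_h, stride=32):
    """Final feature-map size for a given network input (assumed total stride 32)."""
    return net_w // stride, net_h // stride


def anchors_to_pixels(anchors, stride=32):
    """Convert anchors given in grid cells to pixels of the network input."""
    return [(w * stride, h * stride) for w, h in anchors]


if __name__ == "__main__":
    print(grid_size(416, 416))
    print(grid_size(608, 608))
    # First two anchors from the list posted above.
    print(anchors_to_pixels([(11.7425, 9.5023), (5.8290, 2.0653)]))
```

If the assumption holds, a 13x13 grid corresponds to a 416x416 input, whereas a 608x608 input would correspond to 19x19, so anchors computed for one resolution may not match training at the other.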

AlexeyAB · 2018-06-22T12:41:57Z

@AvaniPitre

What params did you use in the Makefile, GPU=1?
Do you use Yolo v2 with [region] layer instead of Yolo v3 [yolo]-layer?
Do you use the latest code of my repository?

AvaniPitre · 2018-06-23T06:42:39Z

What params did you use in the Makefile, GPU=1?
For this particular training I was using the Windows version of darknet with GPU=0, as my machine has no hardware that supports the CUDA drivers, so I built darknet_no_gpu.exe with the following Makefile parameters:
GPU=0
CUDNN=0
CUDNN_HALF=0
OPENCV=0
AVX=0
OPENMP=0
LIBSO=0
I also built the Linux version with the following parameters:
GPU=0
CUDNN=0
OPENCV=1
DEBUG=0
OPENMP=1
LIBSO=1
But both versions crash during resizing when I set random=1.
random=0 works fine with no errors.

Do you use Yolo v2 with [region] layer instead of Yolo v3 [yolo]-layer?
Yes, I use Yolo v2 with the [region] layer.

Do you use the latest code of my repository?
Yes, I downloaded it around 20 days ago.

AlexeyAB · 2018-06-23T12:52:01Z

@AvaniPitre

Try to update your code from this repo and recompile. There are some fixes.
Did you use Yolo_mark to create your dataset?

AvaniPitre · 2018-06-24T06:14:41Z

Thanks, sure, I will update my code and start training again.
Yes, I used Yolo_mark to create my dataset.
Thanks a lot for the quick response. I hope updating the code will resolve all the issues.

AvaniPitre · 2018-06-25T09:57:09Z

I updated to the latest darknet code, recompiled, and started training from scratch, but I still get the same issue: it crashes at the 40th iteration while resizing. I have attached an image of the debugger.

