When training on my own dataset, the CIoU loss decreases more slowly than the GIoU loss. How can this be solved? #30

Open
1343464520 opened this issue Jun 8, 2020 · 4 comments

@1343464520

My data has only one class, pedestrians, and I am using AlexeyAB's Darknet. I only tuned the iou_normalizer parameter; changing it from 0.07 to 0.15 made essentially no difference. I am training from scratch. Any advice would be much appreciated! Below are the parameter settings in my cfg file:
[net]

# Testing
#batch=1
#subdivisions=1

# Training

batch=64
subdivisions=16
width=416
height=416
channels=3
momentum=0.9
#decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

#learning_rate=0.001
learning_rate=0.01
burn_in=1000
max_batches = 50200
policy=steps
#steps=40000,45000
#scales=.1,.1
#steps=4000,8000,12000,16000,20000
steps=900,2000,3000,5000,20000
scales=.5,.5,.5,.5,.5

#cutmix=1
#mosaic=1

#:104x104 54:52x52 85:26x26 104:13x13 for 416

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -1,-7

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -1,-10

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
#activation=leaky
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

[route]
layers = -1,-16

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
#activation=leaky
activation=leaky

##########################

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

### SPP ###

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 85

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 54

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=128
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=128
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

##########################

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=128
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 0,1,2
anchors = 89,100, 103,136, 118,181, 132,106, 141,147, 162,194, 182,134, 211,179, 233,243 # 9488 + 2825 = 12313
classes=1
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.2
iou_thresh=0.213
cls_normalizer=1.0
#iou_normalizer=0.07
iou_normalizer=0.15
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=128
activation=leaky

[route]
layers = -1, -16

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 3,4,5
anchors = 89,100, 103,136, 118,181, 132,106, 141,147, 162,194, 182,134, 211,179, 233,243 # 9488 + 2825 = 12313
classes=1
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.1
iou_thresh=0.213
cls_normalizer=1.0
#iou_normalizer=0.07
iou_normalizer=0.15
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=256
activation=leaky

[route]
layers = -1, -37

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 6,7,8
anchors = 89,100, 103,136, 118,181, 132,106, 141,147, 162,194, 182,134, 211,179, 233,243 # 9488 + 2825 = 12313
classes=1
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
#iou_normalizer=0.07
iou_normalizer=0.15
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5
(attached loss-chart images: chart_nano_tr12313, chart_nano_ciou)

@Zzh-tju
Owner

Zzh-tju commented Jun 9, 2020

The specific engineering work you will have to debug and finish yourself; I can only offer some suggestions.

  1. Try more coefficients, iou_normalizer ∈ [0.07, 0.5].
  2. Check whether the objects in your dataset are mostly medium/large or small; for small objects DIoU is recommended.
  3. With other variables controlled, if both DIoU and CIoU perform worse than GIoU, then use GIoU without hesitation; it simply means DIoU/CIoU are not effective on this dataset. You can also check how plain IoU loss behaves. We cannot guarantee that the proposed trick is optimal in every situation; in fact, classic tricks such as focal loss and Soft-NMS also have cases where they fail. (A small sketch comparing the variants follows this list.)
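
For the controlled comparison in point 3, here is a minimal standalone sketch in Python (not the Darknet C implementation; the function name and the example boxes are purely illustrative) showing how the IoU, GIoU, DIoU, and CIoU losses relate, so the penalty terms can be sanity-checked on a few pedestrian-like boxes:

```python
# Minimal sketch for comparing IoU / GIoU / DIoU / CIoU losses on the same box pair.
# Boxes are (x1, y1, x2, y2) in pixels; not the Darknet C code, just the formulas.
import math

def iou_losses(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection and union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest enclosing box (used by GIoU, DIoU, CIoU)
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    c_area = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (c_area - union) / c_area

    # Normalized center-point distance (DIoU penalty term)
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    diou = iou - rho2 / c2

    # Aspect-ratio consistency term (CIoU)
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v)
    ciou = diou - alpha * v

    return {"iou": 1 - iou, "giou": 1 - giou, "diou": 1 - diou, "ciou": 1 - ciou}

# Example: a prediction vs. a tall, narrow (pedestrian-like) ground-truth box
print(iou_losses((50, 40, 120, 220), (60, 30, 130, 230)))
```

Note that CIoU differs from DIoU only by the aspect-ratio term αv, which is small whenever the predicted and ground-truth boxes already have a similar aspect ratio; this may partly explain why the variants can behave almost identically on a single-class pedestrian dataset.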

@1343464520
Author

Many thanks to the author for the pertinent advice! After trying a controlled-variable comparison, it may simply be that my data is better suited to GIoU. I also tried focal loss before and it brought almost no improvement; I will try again later on a dataset for a different task. Thanks again!

@djwilv

djwilv commented Jun 17, 2020

Could you share the code you used to plot the loss curves?

@1343464520
Author

1343464520 commented Jun 17, 2020 via email
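
The reply itself went out via email and is not visible in the thread. For anyone else who needs it, a minimal sketch of the common approach is shown below: parse the avg-loss values out of a saved AlexeyAB-Darknet training log and plot them with matplotlib. The log file names and the exact log-line format are assumptions here, so adjust the regex to your own logs.

```python
# Minimal sketch: plot the "avg loss" curve from a Darknet training log.
# Assumes training output was redirected to a file, e.g.
#   ./darknet detector train data/ped.data cfg/nano_ciou.cfg | tee train_ciou.log
# and that log lines look roughly like:
#   1000: 2.3456, 2.4567 avg loss, 0.001000 rate, 3.2 seconds, 64000 images
import re
import matplotlib.pyplot as plt

def parse_log(path):
    iters, avg_losses = [], []
    pattern = re.compile(r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+)\s+avg", re.MULTILINE)
    with open(path) as f:
        for m in pattern.finditer(f.read()):
            iters.append(int(m.group(1)))
            avg_losses.append(float(m.group(3)))
    return iters, avg_losses

# Overlay two runs (e.g. CIoU vs. GIoU) to compare how fast the loss drops.
for log, label in [("train_ciou.log", "ciou"), ("train_giou.log", "giou")]:
    x, y = parse_log(log)
    plt.plot(x, y, label=label)

plt.xlabel("iteration")
plt.ylabel("avg loss")
plt.legend()
plt.savefig("loss_curves.png")
```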
