
Differences between YOLOv5 models #7152

Closed
Averen19 opened this issue Mar 26, 2022 · 10 comments
Labels
question (Further information is requested), Stale

Comments

@Averen19

Search before asking

Question

What is the difference between YOLOv5s, YOLOv5m, and YOLOv5l? I know that the mAP, the number of layers, and the depth_multiple and width_multiple in the yolov5*.yaml files differ between them, but is there any documentation that states what the differences in layers are?
Do the width and depth multiples affect the train.py file?
I'm writing a research paper on the YOLOv5 models and would appreciate any kind of help.

Additional

No response

Averen19 added the question label Mar 26, 2022
@yonghi

yonghi commented Mar 26, 2022

You can see the effect of depth_multiple and width_multiple in this function:

def parse_model(d, ch): # model_dict, input_channels(3)
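Roughly speaking, depth_multiple scales how many times each repeated block (e.g. C3) is stacked, and width_multiple scales every layer's output channel count (rounded to a multiple of 8). A minimal self-contained sketch of that logic, paraphrasing models/yolo.py (the exact code may differ between versions):

```python
import math

def make_divisible(x, divisor=8):
    # Round a channel count up to the nearest multiple of `divisor`,
    # mirroring the helper YOLOv5 uses when applying width_multiple.
    return math.ceil(x / divisor) * divisor

def scale_layer(n_repeats, out_channels, depth_multiple, width_multiple):
    # Depth gain: scale the number of stacked blocks (never below 1).
    n = max(round(n_repeats * depth_multiple), 1) if n_repeats > 1 else n_repeats
    # Width gain: scale the output channels, keeping them divisible by 8.
    c2 = make_divisible(out_channels * width_multiple, 8)
    return n, c2

# Example: YOLOv5s uses depth_multiple=0.33 and width_multiple=0.50,
# so a "9x C3, 512-channel" stage in the base config becomes 3x C3 with 256 channels.
print(scale_layer(9, 512, 0.33, 0.50))  # -> (3, 256)
```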

@glenn-jocher
Member

@Averen19 yes the YOLOv5 models are all compound-scaled variants of the same architecture. I did this following the EfficientDet compound scaling model, minus the image scaling.
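For reference, the multipliers in the released model yamls (as of the current v6.x configs; check your local files to confirm) are:

| Model | depth_multiple | width_multiple |
|---|---|---|
| YOLOv5n | 0.33 | 0.25 |
| YOLOv5s | 0.33 | 0.50 |
| YOLOv5m | 0.67 | 0.75 |
| YOLOv5l | 1.00 | 1.00 |
| YOLOv5x | 1.33 | 1.25 |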

@bryanbocao

bryanbocao commented Mar 28, 2022

Dear @glenn-jocher,

I am running similar experiments that also need to vary the model size. I see that the yolov5* models (e.g. yolov5n.yaml, yolov5s.yaml, etc.) differ only in their depth_multiple and width_multiple scaling factors, but follow the same architecture with 3 detection heads. So my questions are: (1) is the 3-head architecture the smallest, or "atomic", block that we can use? (2) If not, can we use an even smaller model, such as one with only a single head? Thanks!

@glenn-jocher
Member

@bryanbo-cao yes of course. You can modify each model infinitely by removing/adding heads, layers, modules etc. That's the main idea behind the yaml files, to make them easy to modify and view.

@bryanbocao

bryanbocao commented Mar 29, 2022

> @bryanbo-cao yes of course. You can modify each model infinitely by removing/adding heads, layers, modules etc. That's the main idea behind the yaml files, to make them easy to modify and view.

Yup. Suppose I am designing my own architecture in a model/custom_yolov5.yaml file. I guess my previous question came from the fact that directly deleting some layers caused problems. For example, when I tried to simplify the architecture by deleting some of the P3 layers directly in the backbone, I got the following errors and was trying to get some help:

Traceback (most recent call last):
  File "train.py", line 643, in <module>
    main(opt)
  File "train.py", line 539, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 124, in train
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create
  File "/home/<user>/yolov5/models/yolo.py", line 103, in __init__
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
  File "/home/<user>/yolov5/models/yolo.py", line 291, in parse_model
    args.append([ch[x] for x in f])
  File "/home/<user>/yolov5/models/yolo.py", line 291, in <listcomp>
    args.append([ch[x] for x in f])
IndexError: list index out of range

or

File "/home/<user>/yolov5/models/common.py", line 275, in forward
    return torch.cat(x, self.d)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 32 and 64 in dimension 2 (The offending index is 1)

But later, after some investigation of the code and architecture, I realized that the from entry (the first field in each [from, number, module, args] row of custom_yolov5.yaml, shown in the second column of the printed summary) refers to the layer number displayed in the leftmost column of the model architecture printed in the command line, and it has to be checked carefully. The main reasons are: (1) adding/deleting layers changes the indices of all the layers that follow; (2) there are skip connections between backbone and head layers at the same scale, specifically in the Concat layers. You just need to make sure each index refers to the correct layer and that the dimensions match. PS: it might be clearer to call the field **layer_number**?

                 from  n    params  module                                  arguments
  0                -1  1      1760  models.common.Conv                      [3, 16, 6, 2, 2]
  1                -1  1      4672  models.common.Conv                      [16, 32, 3, 2]
  2                -1  1      4800  models.common.C3                        [32, 32, 1]
  3                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  4                -1  2     29184  models.common.C3                        [64, 64, 2]
  5                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  6                -1  3    156928  models.common.C3                        [128, 128, 3]
  7                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  8                -1  1    296448  models.common.C3                        [256, 256, 1]
  9                -1  1    164608  models.common.SPPF                      [256, 256, 5]
 10                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 14                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     22912  models.common.C3                        [128, 64, 1, False]
 18                -1  1     36992  models.common.Conv                      [64, 64, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1     74496  models.common.C3                        [128, 128, 1, False]
 21                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 24      [17, 20, 23]  1      9471  models.yolo.Detect                      [2, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256]]
Model Summary: 270 layers, 1766623 parameters, 1766623 gradients

Anyway, it's fixed now. I like the yaml style; it is very flexible for changing the model architecture!
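A rough sanity check along these lines can catch stale `from` indices before launching train.py. This is a hypothetical helper, not part of the YOLOv5 repo, and it only flags absolute indices that point at or past their own layer:

```python
import yaml  # PyYAML

def check_from_indices(cfg_path):
    """Flag 'from' indices in a model yaml that reference a layer at or after themselves."""
    with open(cfg_path) as f:
        cfg = yaml.safe_load(f)
    layers = cfg['backbone'] + cfg['head']  # parse_model concatenates them in this order
    for i, (frm, n, module, args) in enumerate(layers):
        sources = frm if isinstance(frm, list) else [frm]
        for s in sources:
            # Negative indices are relative (-1 = previous layer); non-negative ones are
            # absolute layer numbers, which shift whenever layers are added or deleted.
            if s >= 0 and s >= i:
                print(f"layer {i} ({module}): 'from' index {s} points at or past itself")

check_from_indices('model/custom_yolov5.yaml')  # adjust to your config path
```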

@glenn-jocher
Member

@bryanbo-cao yes that's right! You can delete some layers, but be careful: any later layer that uses a skip connection from earlier in the model must then also be updated to the new index of the layer it is coming from.

@github-actions
Contributor

github-actions bot commented Apr 29, 2022

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@Robotatron

@glenn-jocher
The "X" model uses depth_multiple of 1.33.
Is going higher not recommended? Say depth_multiple: 2?
There is probably a reason YOLO5 configs end with "X" and 1.33 and dont go higher, maybe it does not improve performance that much?

@bryanbocao

@glenn-jocher The "X" model uses depth_multiple of 1.33. Is going higher not recommended? Say depth_multiple: 2? There is probably a reason YOLO5 configs end with "X" and 1.33 and dont go higher, maybe it does not improve performance that much?

@Robotatron To me it's just a scaling factor for network depth (# of layers). You can try whatever you want; YOLOv5 scales from the base network, depending on your customized settings, e.g. use case, hardware constraints of cloud/edge GPUs, GPU memory, inference time, etc. In general, the gain in detection performance (mAP) diminishes as the network goes deeper.
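As a rough illustration (using the baseline C3 repeat counts from the l-size backbone and the same rounding rule parse_model applies), a depth_multiple of 2 simply doubles how many times each C3 block is stacked:

```python
base_repeats = [3, 6, 9, 3]  # C3 repeats in the backbone at depth_multiple = 1.0 (YOLOv5l)
for gd in (0.33, 1.0, 1.33, 2.0):
    scaled = [max(round(n * gd), 1) for n in base_repeats]
    print(f'depth_multiple={gd}: {scaled}')
# depth_multiple=0.33: [1, 2, 3, 1]    (n/s)
# depth_multiple=1.0:  [3, 6, 9, 3]    (l)
# depth_multiple=1.33: [4, 8, 12, 4]   (x)
# depth_multiple=2.0:  [6, 12, 18, 6]  (hypothetical, larger than any released config)
```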

@Robotatron

@bryanbocao
Thanks. Yes, that was my understanding as well. I guess I just have to try different values for depth and width and see whether the training time vs. mAP trade-off brings any benefit when going bigger than the eXtra-large config.
