What is the architecture of YOLOV8 segmentation, what is difference from UNET ? #1289

frabob2017 · 2023-03-06T14:54:23Z

Search before asking

I have searched the YOLOv8 issues and discussions and found no similar questions.

Question

Hello, may I ask where I can find the principle behind YOLOv8 segmentation? I know that in object detection the YOLO detection head regresses X, Y, H, W for each grid cell. What is the architecture of YOLOv8 segmentation, and how does it differ from UNet? I tried Google and ChatGPT but did not find a good answer.

Additional

No response

glenn-jocher · 2023-03-06T21:44:14Z

👋 Hello! Thanks for asking about YOLOv5 🚀 architecture visualization. We've made visualizing YOLO 🚀 architectures super easy. There are 3 main ways:

`model.yaml`

Each model has a corresponding yaml file that displays the model architecture. Here is YOLOv5s, defined by yolov5s.yaml:
https://github.com/ultralytics/yolov5/blob/1a3ecb8b386115fd22129eaf0760157b161efac7/models/yolov5s.yaml#L12-L48

TensorBoard Graph

Simply start training a model, and then view the TensorBoard Graph for an interactive view of the model architecture. This example shows YOLOv5s viewed in our Notebook:

# Tensorboard
%load_ext tensorboard
%tensorboard --logdir runs/train

# Train YOLOv5s on COCO128 for 3 epochs
python train.py --weights yolov5s.pt --epochs 3

Netron viewer

Use https://netron.app to view exported ONNX models:

python export.py --weights yolov5s.pt --include onnx --simplify

Good luck 🍀 and let us know if you have any other questions!

frabob2017 · 2023-03-08T18:09:39Z

Thank you Jocher. I am trying to understand your code because I want to replace nn.Conv2d with Conv3d, so I need to know where to modify it. Here is my first question. The Focus class is defined in common.py, but it is only referenced at line 318 of yolo.py, in the parse_model(d, ch) function:
if m in {
Conv, GhostConv, ****, Focus, ***,
BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x}:

Based on my understanding, you create the YOLO network architecture through this loop at line 310 of yolo.py, in the parse_model(d, ch) function:

for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']): # from, number, module, args
You get the architecture information from the yolov5.yaml file. But the yolov5.yaml file contains no Focus module. Where is this module actually used in your code? I am confused.
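To make the question concrete, here is a minimal sketch of that kind of config-driven loop. It assumes a hypothetical miniature config and a hypothetical `KNOWN_MODULES` set; neither is the real yolov5s.yaml or the real parse_model. It only illustrates that a module name the parser recognizes is not built unless a config row names it.

```python
# Hypothetical miniature of a YOLOv5-style config (not the real yolov5s.yaml).
# Each row is [from, number, module, args], as in the loop quoted above.
cfg = {
    "backbone": [
        [-1, 1, "Conv", [64, 6, 2, 2]],
        [-1, 1, "Conv", [128, 3, 2]],
        [-1, 3, "C3", [128]],
    ],
    "head": [
        [-1, 1, "Conv", [256, 1, 1]],
    ],
}

# Hypothetical stand-in for the set of module names the parser can build.
KNOWN_MODULES = {"Conv", "C3", "Focus"}


def list_layers(d):
    """Walk backbone + head rows and return (index, from, number, module, args)."""
    layers = []
    for i, (f, n, m, args) in enumerate(d["backbone"] + d["head"]):
        if m not in KNOWN_MODULES:
            raise ValueError(f"unknown module {m!r} in layer {i}")
        layers.append((i, f, n, m, args))
    return layers


layers = list_layers(cfg)
used = {m for _, _, _, m, _ in layers}  # module names the config actually requests

for layer in layers:
    print(layer)
print("Focus requested by config:", "Focus" in used)
```

In this sketch "Focus" is recognized by the parser but never built, because no row of the config names it.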

For the Focus module, the tensor x has dimensions [b, c, w, h]. Is b the batch size, c the number of channels, w the width of the bounding box, and h its height? And x is a 4-dimensional tensor, am I right? I do not understand why you split w and h in half. At first I thought this was a Cross Stage Partial Network (CSPNet), but that splits the channels rather than w and h.

class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act=act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat((x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]), 1))
        # return self.conv(self.contract(x))

Finally, could you help me take a look at this question? #1220

I am making an effort to understand your code. Some questions may be poorly formulated (I do not understand it very well yet), so please forgive that.

github-actions · 2023-04-08T00:13:39Z

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Docs: https://docs.ultralytics.com
HUB: https://hub.ultralytics.com
Community: https://community.ultralytics.com

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

pderrenger · 2023-11-16T05:43:30Z

@frabob2017 hello there! 🌟 Thanks for your interest in YOLOv5's architecture and your comprehensive questions. Let's dive into your queries:

Focus Module Implementation: The Focus class is defined in common.py, and parse_model in yolo.py lists it among the modules it knows how to build. parse_model only instantiates the modules that the YAML file names, though. Earlier YOLOv5 releases used Focus as the first backbone layer; since the v6.0 release that layer is a Conv with a 6x6 kernel and stride 2, so the current YAML files no longer mention Focus. The class remains in common.py, presumably so that older configs and checkpoints still load.
Tensor Dimensions: x is a 4-dimensional tensor, as you say, but w and h are not bounding-box sizes. In the forward method of Focus, x has dimensions [b, c, w, h], where b is the batch size, c is the number of channels, and w, h are the spatial width and height of the input image or feature map. Focus takes every second pixel at four different offsets, which gives four sub-images at half resolution, and concatenates them along the channel dimension, so (b, c, w, h) becomes (b, 4c, w/2, h/2) before the convolution. No pixel values are discarded: the spatial resolution is halved and the information moves into the channels. This is a space-to-depth rearrangement and is unrelated to CSPNet, which splits the channels.
Issue #1220 (try to understand YOLO source code for further modification): I will take a look at the issue you've linked and help where I can. No worries if some questions feel poorly formulated; understanding complex code can be tricky, and your effort is greatly appreciated!
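The slicing in Focus.forward can be tried on a tiny example. The following is a minimal pure-Python sketch, not the library code: `focus_slices` is a hypothetical helper that works on a single 2D channel stored as a list of rows, and it mirrors the order of the four slices in the torch.cat call quoted earlier in the thread.

```python
def focus_slices(img):
    """Split one 2D channel (list of rows, even height and width) into four
    half-resolution channels, in the same order as the slices in Focus.forward:
    (start 0, start 0), (start 1, start 0), (start 0, start 1), (start 1, start 1)
    along the second-to-last and last axes."""
    def pick(row_start, col_start):
        # Keep every second row and every second column from the given offsets.
        return [row[col_start::2] for row in img[row_start::2]]

    return [pick(0, 0), pick(1, 0), pick(0, 1), pick(1, 1)]


# A 4x4 single-channel "image" whose values are just their flat index.
img = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

channels = focus_slices(img)
for ch in channels:
    print(ch)

# Every input value appears exactly once in the output: nothing is discarded.
flat = sorted(v for ch in channels for row in ch for v in row)
print(flat == list(range(16)))
```

One 4x4 channel becomes four 2x2 channels, which is the (c, w, h) to (4c, w/2, h/2) change from the code comment for c = 1.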

Great questions, and if you have further inquiries or need clarification, don't hesitate to ask! Let's continue exploring YOLOv5 together! 🚀

frabob2017 added the question label Mar 6, 2023

github-actions bot added the Stale label Apr 8, 2023

github-actions bot closed this as not planned Apr 18, 2023
