What is the architecture of YOLOV8 segmentation, what is difference from UNET ? #1289

Closed
1 task done
frabob2017 opened this issue Mar 6, 2023 · 4 comments
Labels
question Further information is requested Stale

Comments

@frabob2017

Search before asking

Question

Hello, may I ask where I can find the principle behind YOLOv8 segmentation? I know that in object detection the YOLO detection head regresses X, Y, H, W for each grid cell. What is the architecture of YOLOv8 segmentation, and how does it differ from UNet? I tried Google and ChatGPT but did not find a good answer.

Additional

No response

@frabob2017 frabob2017 added the question Further information is requested label Mar 6, 2023
@glenn-jocher
Member

👋 Hello! Thanks for asking about YOLOv5 🚀 architecture visualization. We've made visualizing YOLO 🚀 architectures super easy. There are 3 main ways:

model.yaml

Each model has a corresponding yaml file that displays the model architecture. Here is YOLOv5s, defined by yolov5s.yaml:
https://github.com/ultralytics/yolov5/blob/1a3ecb8b386115fd22129eaf0760157b161efac7/models/yolov5s.yaml#L12-L48
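As a rough illustration of how such a YAML file is read, the sketch below walks a small config in which every row has the form [from, number, module, args]. The dictionary `toy_cfg`, its rows, and the helper `describe_layers` are hypothetical and exist only for this example; the real layer list is in the file linked above.

```python
# Illustrative rows only, not a copy of yolov5s.yaml.
# Each row is [from, number, module, args].
toy_cfg = {
    "backbone": [
        [-1, 1, "Conv", [64, 6, 2, 2]],
        [-1, 1, "Conv", [128, 3, 2]],
        [-1, 3, "C3", [128]],
    ],
    "head": [
        [-1, 1, "Conv", [64, 1, 1]],  # made-up head row for the example
    ],
}


def describe_layers(cfg):
    """Return one readable line per layer row, backbone first, then head."""
    lines = []
    for i, (f, n, m, args) in enumerate(cfg["backbone"] + cfg["head"]):
        lines.append(f"{i}: from={f} repeats={n} module={m} args={args}")
    return lines


for line in describe_layers(toy_cfg):
    print(line)
```

Here "from" is the index of the layer that feeds this one (-1 means the previous layer), and "module" is a name that the model builder has to map to a class in the code.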

TensorBoard Graph

Simply start training a model, and then view the TensorBoard Graph for an interactive view of the model architecture. This example shows YOLOv5s viewed in our Notebook:

# Tensorboard
%load_ext tensorboard
%tensorboard --logdir runs/train

# Train YOLOv5s on COCO128 for 3 epochs
python train.py --weights yolov5s.pt --epochs 3


Netron viewer

Use https://netron.app to view exported ONNX models:

python export.py --weights yolov5s.pt --include onnx --simplify


Good luck 🍀 and let us know if you have any other questions!

@frabob2017
Author

frabob2017 commented Mar 8, 2023

Thank you Jocher. I am trying to understand your code because I want to replace nn.Conv2d with Conv3d, so I need to know where to modify it. Here is my first question. The Focus class is defined in common.py, but it is only referenced at line 318 of yolo.py, inside the parse_model(d, ch) function:
if m in {
Conv, GhostConv, ****, Focus, ***,
BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x}:

Based on my understanding, you build the YOLO network architecture through this loop at line 310 of yolo.py, inside the parse_model(d, ch) function:

for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']): # from, number, module, args
You get the architecture information from the yolov5.yaml file. But there is no Focus module in the yolov5.yaml file. Where is this module used in your code? I am confused.

For the Focus module, the tensor x has dimensions [b, c, w, h]. Here b is the batch size, c is the number of channels, w is the width of the bounding box, and h is the height of the bounding box. Am I right? And x is a 4-dimensional tensor, correct? I do not understand why you split w and h in half. At first I thought this was Cross Stage Partial Network (CSPNet), but that splits the channels in half rather than w and h.

class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super().__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act=act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat((x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]), 1))
        # return self.conv(self.contract(x))

Finally, could you take a look at this question? #1220

I am making an effort to understand your code. Some of my questions may be poorly formulated because I do not understand it very well yet. Please forgive that.

@github-actions

github-actions bot commented Apr 8, 2023

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label Apr 8, 2023
@github-actions github-actions bot closed this as not planned Apr 18, 2023
@pderrenger
Member

@frabob2017 hello there! 🌟 Thanks for your interest in YOLOv5's architecture and your comprehensive questions. Let's dive into your queries:

  1. Focus Module Implementation: The Focus module is defined in the common.py file, and it is still listed in the parse_model function in yolo.py so that older model YAML files that reference it continue to load. The current yolov5 YAML files no longer use it: since the v6.0 release, the first backbone layer is a Conv with a 6x6 kernel and stride 2, which replaced Focus. That is why you find the class in the code but not in the YAML configuration.

  2. Tensor Dimensions: You've got it mostly right! The tensor x in the forward method of the Focus module is 4-dimensional, written [b, c, w, h] in the code comment, where b is the batch size, c is the number of channels, and w, h are the width and height of the input image or feature map, not of a bounding box. The method takes four interleaved sub-samplings of the input (every second pixel, at four different offsets) and concatenates them along the channel dimension, so the output is [b, 4c, w/2, h/2]. Spatial resolution is halved, but no pixel values are discarded because they are moved into the channels. This differs from CSPNet, which splits the channels rather than the spatial dimensions.

  3. Issue #1220 (try to understand YOLO source code for further modification): I will definitely take a look at the issue you've linked and provide the best assistance I can. No worries if some questions feel poorly formulated. Understanding complex code can be tricky, and your effort is greatly appreciated!
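The slicing in point 2 can be tried on a small grid without PyTorch. The sketch below is a plain-Python stand-in, assuming a single 4x4 channel stored as a list of rows; the helper name `focus_slices` is hypothetical and is not part of the YOLOv5 code. It reproduces the four strided slices from Focus.forward, in the same order.

```python
def focus_slices(img):
    """Split a 2D grid (list of rows) into the four strided sub-grids
    that Focus concatenates along the channel axis."""
    return [
        [row[::2] for row in img[::2]],    # like x[..., ::2, ::2]
        [row[::2] for row in img[1::2]],   # like x[..., 1::2, ::2]
        [row[1::2] for row in img[::2]],   # like x[..., ::2, 1::2]
        [row[1::2] for row in img[1::2]],  # like x[..., 1::2, 1::2]
    ]


# One 4x4 channel holding the values 0..15, row by row.
img = [[r * 4 + c for c in range(4)] for r in range(4)]

channels = focus_slices(img)
for ch in channels:
    print(ch)
# [[0, 2], [8, 10]]
# [[4, 6], [12, 14]]
# [[1, 3], [9, 11]]
# [[5, 7], [13, 15]]
```

One 4x4 channel becomes four 2x2 channels, and each of the 16 input values appears exactly once in the output, so the grid is rearranged rather than cropped.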

Great questions, and if you have further inquiries or need clarification, don't hesitate to ask! Let's continue exploring YOLOv5 together! 🚀
