Training v9 with transformer from v5 #13090
@gchinta1 hello, thank you for reaching out and for your interest in experimenting with YOLOv5 and transformers! To assist you effectively, we need a bit more information about your setup. Those details will help us diagnose the issue more accurately. Looking forward to your response so we can help you resolve this!
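For example, the basics of the environment can be captured with a short snippet like the one below (a generic sketch, not an official Ultralytics helper; the `torch` import is guarded in case it is missing):

```python
# Gather basic environment details for a bug report (generic sketch)
import platform
import sys

print("python:", sys.version.split()[0])
print("platform:", platform.platform())
try:
    import torch
    print("torch:", torch.__version__, "| cuda available:", torch.cuda.is_available())
except ImportError:
    print("torch: not installed")
```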
I am trying to use the transformer layers and block in another YOLO algorithm, just to find the difference in that YOLO; that's why I am trying to understand the architecture and how I can make it without the C3 module. So I am trying to make the transformer take the place of the C3 modules (C3, C3TR, all of them) so it will be good at the calculations. Thank you
Hello @gchinta1, Thank you for providing more context on your experiment with integrating transformer layers into YOLOv5. It sounds like an exciting project! To help you further, let's address a few key points:
Here is a basic example of how you might integrate a transformer block into the YOLOv5 architecture:

```python
import torch
import torch.nn as nn

from models.common import TransformerBlock  # YOLOv5's built-in transformer block


class CustomYOLOv5(nn.Module):
    def __init__(self):
        super().__init__()
        # TransformerBlock takes (c1, c2, num_heads, num_layers) -- see models/common.py
        self.transformer = TransformerBlock(256, 256, 8, 2)
        # Other layers...

    def forward(self, x):
        x = self.transformer(x)
        # Forward pass through other layers...
        return x


# Example usage
model = CustomYOLOv5()
```

Please provide the specific transformer model you are integrating and any modifications you have made to the YOLOv5 codebase. This will help us give more targeted advice. Looking forward to your response so we can assist you further!
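For readers without the YOLOv5 repo on hand, the same flatten-attend-reshape pattern can be sketched with a minimal self-contained stand-in (this is an illustration, not the actual `models.common.TransformerBlock`; the class name and sizes here are made up for the demo):

```python
import torch
import torch.nn as nn


class MiniTransformerBlock(nn.Module):
    """Minimal stand-in for a YOLOv5-style transformer block: flatten the
    spatial grid into a token sequence, self-attend, restore the grid."""

    def __init__(self, c, num_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=c, num_heads=num_heads)
        self.c = c

    def forward(self, x):
        b, c, h, w = x.shape
        seq = x.flatten(2).permute(2, 0, 1)  # (h*w, b, c): one token per pixel
        out, _ = self.attn(seq, seq, seq)    # self-attention over spatial positions
        seq = seq + out                      # residual connection
        return seq.permute(1, 2, 0).reshape(b, self.c, h, w)


block = MiniTransformerBlock(c=256, num_heads=8)
y = block(torch.randn(2, 256, 20, 20))
print(y.shape)  # torch.Size([2, 256, 20, 20]) -- spatial shape is preserved
```

Because the block preserves the (B, C, H, W) shape, it can drop into a CNN backbone in place of a C3 module.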
hi again, this is my work. Instead of C3 I am using:

```python
class TransformerBlock(nn.Module):
    ...
```

and my YAML file:

```yaml
# parameters
nc: 80  # number of classes

# anchors
anchors: 3

# gelan backbone
backbone:
  [
   [-1, 1, Conv, [64, 3, 2]],  # 0-P1/2 conv down
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4 conv down
   [-1, 1, RepNCSPELAN4, [256, 128, 64, 1]],  # 2 elan-1 block
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8 avg-conv down
   [-1, 1, RepNCSPELAN4, [512, 256, 128, 1]],  # 4 elan-2 block
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16 avg-conv down
   [-1, 1, RepNCSPELAN4, [512, 512, 256, 1]],  # 6 elan-2 block
   [-1, 1, Conv, [512, 3, 2]],  # 7-P5/32 avg-conv down
   [-1, 1, RepNCSPELAN4, [512, 512, 256, 1]],  # 8 elan-2 block
  ]

# gelan head
head:
  [
   [-1, 1, SPPELAN, [512, 256]],  # 9 elan-spp block
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # up-concat merge
   [[-1, 6], 1, Concat, [1]],
   [-1, 1, RepNCSPELAN4, [512, 512, 256, 1]],  # 12 elan-2 block
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # up-concat merge
   [[-1, 4], 1, Concat, [1]],
   [-1, 1, RepNCSPELAN4, [256, 256, 128, 1]],  # 15 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],  # avg-conv-down merge
   [[-1, 12], 1, Concat, [1]],
   [-1, 1, RepNCSPELAN4, [512, 512, 256, 1]],  # 18 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],  # avg-conv-down merge
   [[-1, 9], 1, Concat, [1]],
   [-1, 1, RepNCSPELAN4, [512, 512, 256, 1]],  # 21 (P5/32-large)
   [[15, 18, 21], 1, DDetect, [nc]],  # Detect(P3, P4, P5)
  ]
```

When I start training, the epoch and loss numbers start normally, and then when it is finishing it makes them NaN and there are no val values.
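A config pasted like the one above can be sanity-checked before training with a few lines of PyYAML. The snippet below is a sketch using a shortened stand-in config (not the full GELAN file), checking only that every from-index references a layer that exists:

```python
import yaml

# Shortened stand-in config -- not the full GELAN file -- just to show the check
cfg_text = """
nc: 80
backbone:
  [[-1, 1, Conv, [64, 3, 2]],
   [-1, 1, Conv, [128, 3, 2]]]
head:
  [[-1, 1, SPPELAN, [512, 256]],
   [[0, 1, 2], 1, DDetect, [nc]]]
"""
cfg = yaml.safe_load(cfg_text)
layers = cfg["backbone"] + cfg["head"]

# Every 'from' field must reference a layer that exists
for i, (f, n, m, args) in enumerate(layers):
    sources = f if isinstance(f, list) else [f]
    for src in sources:
        assert -len(layers) <= src < len(layers), f"layer {i}: bad from-index {src}"

print(f"{len(layers)} layers, from-indices OK")  # → 4 layers, from-indices OK
```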
Hello @gchinta1, thank you for sharing your detailed implementation and YAML configuration. It looks like you've put a lot of effort into integrating transformer layers into the YOLOv5 architecture. Let's try to diagnose the issue with the NaN values during training.

Example Code with Debugging Statements

Here's an example of how you might integrate NaN checks into your TransformerLayer and TransformerBlock:

```python
import torch
import torch.nn as nn

from models.common import Conv  # YOLOv5 conv block used in TransformerBlock


class TransformerLayer(nn.Module):
    def __init__(self, c, num_heads):
        super().__init__()
        self.q = nn.Linear(c, c, bias=False)
        self.k = nn.Linear(c, c, bias=False)
        self.v = nn.Linear(c, c, bias=False)
        # Sequence-first input (wh, b, c), matching the permute in TransformerBlock
        self.ma = nn.MultiheadAttention(embed_dim=c, num_heads=num_heads)
        self.fc1 = nn.Linear(c, c, bias=False)
        self.fc2 = nn.Linear(c, c, bias=False)

    def forward(self, x):
        q, k, v = self.q(x), self.k(x), self.v(x)
        attn_output, _ = self.ma(q, k, v)
        x = x + attn_output  # residual around attention
        x = x + self.fc2(self.fc1(x))  # residual around the feed-forward pair
        if torch.isnan(x).any():
            print("NaN detected in TransformerLayer")
        return x


class TransformerBlock(nn.Module):
    def __init__(self, c1, c2, num_heads, num_layers):
        super().__init__()
        self.conv = Conv(c1, c2) if c1 != c2 else nn.Identity()
        self.linear = nn.Linear(c2, c2)  # learnable position embedding
        self.tr = nn.Sequential(*(TransformerLayer(c2, num_heads) for _ in range(num_layers)))
        self.c2 = c2

    def forward(self, x):
        x = self.conv(x)
        if torch.isnan(x).any():
            print("NaN detected after conv")
        b, c, w, h = x.shape
        x = x.flatten(2).permute(2, 0, 1)  # shape (wh, b, c)
        x = self.tr(x + self.linear(x))
        if torch.isnan(x).any():
            print("NaN detected after transformer")
        x = x.permute(1, 2, 0).reshape(b, self.c2, w, h)
        return x
```

Next Steps

If the issue persists, please provide any additional error messages or observations from the debugging statements. This will help us further diagnose and resolve the issue. Thank you for your patience and collaboration. Let's work together to get your model training successfully! 🚀
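Two generic PyTorch switches often help when a loss goes NaN mid-training: anomaly detection to locate the offending op, and gradient clipping to stop attention layers from blowing up. Here is a minimal sketch on a toy model (the model, data, and learning rate are placeholders for illustration, not YOLOv5's actual training script):

```python
import torch
import torch.nn as nn

# Toy model and data purely for illustration
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
x, y = torch.randn(64, 8), torch.randn(64, 1)

torch.autograd.set_detect_anomaly(True)  # reports the op that produced NaN/Inf

for step in range(20):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    if not torch.isfinite(loss):
        print(f"non-finite loss at step {step}")
        break
    loss.backward()
    # Cap the gradient norm so a single bad batch cannot explode the weights
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    opt.step()

print("final loss is finite:", torch.isfinite(loss).item())
```

Lowering the learning rate for the first epochs (warmup) and temporarily disabling mixed precision are other common checks when attention layers are added to a detector.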
Thank you for the help, Glenn. The line in the training script fixed the issue 😃.. talk to you next time I need something 😅
Hello @gchinta1, I'm thrilled to hear that the solution worked for you! 😃 Your persistence and detailed information made it easier for us to diagnose and resolve the issue. If you have any more questions or need further assistance in the future, don't hesitate to reach out. The YOLO community and the Ultralytics team are always here to help. Happy training and best of luck with your project! 🚀 Talk to you next time! 😊
Search before asking
Question
Hi Glenn, I hope you are well. I am trying to train YOLO with a transformer just to see the difference, but I am getting NaN values in the epochs.. it starts calculating the loss in the first one, but I get 0 for the final val values, and in the other epochs all the numbers are NaN. What is the cause of this? Thank you
Additional