-
Can somebody please explain to me what the outputs are? I understand that the 39 is the number of classes + 5, even though I have no idea why + 5. I found that the first output contains the information, and the other 3 contain the same information split up (which I guess makes sense with a concat as the last layer) that has to go through a sigmoid function first. Can somebody with a little more insight please state what information (x, y, w, h, c, probabilities) they contain and, most importantly, at what position? How is the number 42588 built?
Replies: 3 comments 6 replies
-
@Angele the Netron view of the outputs is very clear; you might want to look there instead of at the inputs.
-
In case you meant looking at the .pt model:
-
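The shape is predictions x (stats + class predictions).
YOLOv5s makes 25200 predictions here while YOLOv5l makes 42588. The number follows from the grid sizes: assuming the default 3 anchors per grid cell, 3 * (104*104 + 52*52 + 26*26) = 42588.
Hence each of the 42588 rows is [x, y, full_width, full_height, confidence, ... class predictions]. The + 5 is the five stats (x, y, full_width, full_height, confidence) that sit in front of the class predictions.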
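As a quick check of where 42588 comes from, here is a minimal sketch. It assumes 3 anchors per grid cell and square grids; the 104/52/26 grid sizes are the ones mentioned in this thread, while 80/40/20 is an assumption about what produces the 25200 case. `prediction_count` is a hypothetical helper name, not part of YOLOv5.

```python
# Assumption: 3 anchors per grid cell (the usual YOLOv5 default).
ANCHORS_PER_CELL = 3


def prediction_count(grid_sizes, anchors_per_cell=ANCHORS_PER_CELL):
    """Total number of predictions summed over all square detection grids."""
    return anchors_per_cell * sum(g * g for g in grid_sizes)


# Grid sizes quoted in this thread for the 42588 case.
print(prediction_count([104, 52, 26]))  # 42588

# Assumed grid sizes for the 25200 case.
print(prediction_count([80, 40, 20]))  # 25200
```

So the count is driven by the grid sizes of the three detection heads rather than being an arbitrary constant.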
x and y are the box center, so the top-left corner of your first rect is (output[0, 0] - output[0, 2] / 2, output[0, 1] - output[0, 3] / 2).
To see the prediction for class 0, read output[0, 5].
To see whether an object is there, read output[0, 4].
The other outputs, like 104x104, 52x52 or 26x26 for YOLOv5l, can be ignored if you don't want to build your own YOLO.
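The indexing described in this reply can be sketched in plain Python. This is only an illustration under the layout stated above ([x, y, full_width, full_height, confidence, class predictions...], with x and y taken as the box center); `decode_row` and the example row are made up for the sketch and are not part of any YOLOv5 API.

```python
def decode_row(row):
    """Split one prediction row into box corners, objectness and best class.

    Assumed layout: [x_center, y_center, width, height, confidence, class scores...]
    """
    x_center, y_center, width, height, objectness = row[:5]
    class_scores = row[5:]
    # Index of the highest class score.
    best_class = max(range(len(class_scores)), key=class_scores.__getitem__)
    return {
        "left": x_center - width / 2,
        "top": y_center - height / 2,
        "right": x_center + width / 2,
        "bottom": y_center + height / 2,
        "objectness": objectness,
        "best_class": best_class,
        "best_class_score": class_scores[best_class],
    }


# Made-up row with only 3 classes, purely for illustration.
example_row = [100.0, 60.0, 40.0, 20.0, 0.9, 0.1, 0.7, 0.2]
box = decode_row(example_row)
print(box["left"], box["top"], box["right"], box["bottom"])  # 80.0 50.0 120.0 70.0
print(box["objectness"], box["best_class"])  # 0.9 1
```

With the real output you would pass one row at a time, for example `decode_row(list(output[0]))`, and skip rows whose objectness is too low for your use case.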