Target classes exceed model classes #678
Comments
@joehoeller hey bud. This means that you've supplied class numbers in your labels that exceed the class count you specified in your *.data and *.cfg files. Classes are zero indexed, so for example if you specify classes=3 in *.data and *.cfg, your labels may only have classes 0, 1 and 2 present. |
So from what I can tell, my labels say 1,2,3 for classes. I can’t seem to
find any other numbers.
Should I just modify my classes to 3 then?
|
@joehoeller your *.cfg and *.data simply need to match up in classes. It's ok to use the default yolov3-spp.cfg for example with 80 classes and supply labels that only go up to 15 classes (the remaining 65 spots on the output vectors will go unused, but it won't really hurt your training). If you modify your classes to 3, then your labels must either be 0, 1, or 2 (zero indexed). |
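A quick way to confirm which class ids the labels actually contain is to scan the label files. The following is only a sketch: it assumes darknet-style `*.txt` labels with one `class x_center y_center width height` row per object, and the helper name `find_out_of_range_classes` is made up for illustration (it is not part of the repo).

```python
from pathlib import Path
import tempfile


def find_out_of_range_classes(label_dir, num_classes):
    """Return {filename: sorted class ids} for ids outside 0..num_classes-1."""
    bad = {}
    for path in sorted(Path(label_dir).glob("*.txt")):
        ids = set()
        for line in path.read_text().splitlines():
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            class_id = int(parts[0])  # first column is the class index
            if not 0 <= class_id < num_classes:
                ids.add(class_id)
        if ids:
            bad[path.name] = sorted(ids)
    return bad


# Demo on throwaway files: one zero-indexed file, one 1-indexed file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ok.txt").write_text("0 0.5 0.5 0.2 0.2\n2 0.1 0.1 0.05 0.05\n")
    Path(tmp, "one_indexed.txt").write_text("1 0.5 0.5 0.2 0.2\n3 0.4 0.4 0.1 0.1\n")
    report = find_out_of_range_classes(tmp, num_classes=3)
    print(report)  # {'one_indexed.txt': [3]}
```

With classes=3, labels of 1, 2, 3 trip the check on class 3, which matches the situation described above; subtracting 1 from every class id would bring them into the 0-2 range.
|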
Were you able to solve this? I am starting to think that you might have referenced it incorrectly in the .data file, assuming your .cfg file is properly formatted and being called properly. |
I updated Dark Chocolate, the COCOJSON -> Darknet converter, so now classes are 0 indexed: https://github.com/joehoeller/Dark-Chocolate, and outputs: My
The
I tried the [convolutional] [yolo] The error still persists:
|
@joehoeller that all looks right. Maybe I should update the error message to output more detail, hold on. Ok, I've updated the assert statement in 0dd0fa7. Now it should give you more specific information. Can you |
Ok, nice!!! Now I know what the problem is:
Let me go fix and I'll report back ;) |
@joehoeller if your terminal window is too short, output from the tqdm progress bar will wrap. From the images, it looks like your labels need to offset the box centers by box width / 2 and box height / 2. |
No prob, I am on it -> "...box width / 2 and box height / 2", thnx for the tip!
width=640
|
I adjusted the math, does this look right to you? Darknet/Yolo is all 0.0 to 1.0 values.
Previous vals:
|
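The offset by box width / 2 and box height / 2 can be sketched as below. This assumes COCO boxes are `[x_min, y_min, width, height]` in pixels and that the darknet label wants `x_center y_center width height` normalized to 0.0-1.0; the function name and the 640x512 image size are placeholders for illustration only.

```python
def coco_to_darknet(bbox, img_w, img_h):
    """Convert a COCO [x_min, y_min, w, h] pixel box to normalized center format."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w  # shift corner to center, then normalize
    y_center = (y_min + h / 2) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


# Hypothetical box covering the middle of a 640x512 image.
box = coco_to_darknet([160, 128, 320, 256], img_w=640, img_h=512)
print(box)  # (0.5, 0.5, 0.5, 0.5)
```

All four outputs should land in the 0.0-1.0 range; a value above 1.0 usually means the wrong image dimensions were used for normalization.
|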
@joehoeller the image size information in the cfg is not used. All of the files (train.py, detect.py, test.py) use the --img-size argument, which is 416 by default and must be a multiple of 32. It's hard to tell by eye if the new vals are correct; you just need to look at your test_batch0.jpg etc. to see that they overlay your objects correctly. |
Got it. thnx!!
|
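Since --img-size has to be a multiple of 32, a small helper can snap an arbitrary size to the nearest valid one before passing it on the command line. This is a hypothetical convenience function, not something the repo provides.

```python
def nearest_valid_img_size(size, stride=32):
    """Round size to the nearest multiple of stride (at least one stride)."""
    return max(stride, int(round(size / stride)) * stride)


print(nearest_valid_img_size(416))  # 416, already a multiple of 32
print(nearest_valid_img_size(600))  # 608
```
|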
@joehoeller yeah that looks a lot better. This happens because COCO box origins are in a corner (top left I think), while darknet format has origins in the center. Note that the overlays you see are the labels, not the predicted boxes. Now you want to let it train for a few hours or a day or so and check your results.png. So the best way to do it is to note the epoch your validation losses (on the bottom row) start increasing, and then restart your training again from zero to that --epochs number (to lock in the LR drops that are programmed at 80% and 90% of --epochs). |
Cool. Thanks!! 👌🏽
Yeah, I calculated it's another 18h @4min per epoch (277 epochs I think it was).
|
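The "note the epoch where validation loss starts increasing, then retrain to that --epochs" advice can be sketched as follows. The loss values are invented for illustration, and the 80%/90% drop points come from the comment above; how the repo's scheduler rounds them is an assumption here.

```python
def pick_epochs(val_losses):
    """Return the 1-based epoch at which validation loss was lowest."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


def lr_drop_epochs(epochs):
    """Epochs where the LR would drop, assuming drops at 80% and 90% of epochs."""
    return [round(epochs * 0.8), round(epochs * 0.9)]


# Hypothetical validation losses, one per epoch; the minimum is at epoch 10.
val_losses = [3.0, 2.6, 2.3, 2.1, 1.9, 1.8, 1.7, 1.65, 1.6, 1.58,
              1.6, 1.66, 1.75, 1.9]
epochs = pick_epochs(val_losses)
drops = lr_drop_epochs(epochs)
print(epochs, drops)  # 10 [8, 9]
```

Retraining from scratch with --epochs set to this value means the learning-rate drops happen before the point where the earlier run began to overfit.
|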
@glenn-jocher 4 questions:
Q1. Is there a way to stop and then resume training?
Q2. You said, "note the epoch your validation losses (on the bottom row) start increasing", but this is all I get back:
Q3. When it is done, how do I access metrics like False Positive, False Negative, mAP etc.? (Run the cmd for the utils or...?)
Q4. In the image above, is it really producing a mAP score of 0.841 by epoch 109, or is it overfitting? |
UPDATE: I fixed the PR and updated the math, the COCO JSON -> Darknet conversion tool (Dark Chocolate) works now: daddydrac/Dark-Chocolate#2 |
@glenn-jocher plz see 4 ques above |
I've been super busy with finals and final projects at Duke, so I haven't been able to work on this much. But let me know later if you need a code review on Dark Chocolate. |
Yes plz. Go ahead and review if you can. Good luck w finals!!
|
@glenn-jocher Please provide feedback - > Done training and got great results, mAP 0.96, which (I think) is competition grade. So now I am wondering about the following:
|
Yeah, if you can request a code review from me on GitHub, I will be happy to do that. |
I tried, it won't let me. I am not sure why, I'll keep trying. |
@joehoeller @FranciscoReveriano sorry I've been pretty busy lately. To answer your questions:
1. The final results should be in results.txt and results.png.
2. For the class by class results run python3 test.py --cfg ... --weights ... --data ... etc.
3. See 2
4. The column headers are in utils.utils.plot_results()
|
Thanks got it!
Great scores by the way:
Class      Images   Targets  Precision  Recall    mAP     F1
all      8.86e+03  6.76e+04      0.283   0.961  0.922  0.435
person   8.86e+03  2.24e+04      0.276   0.945    0.9  0.427
bicycle  8.86e+03  3.99e+03      0.239   0.971  0.932  0.384
car      8.86e+03  4.13e+04      0.333   0.966  0.936  0.495
|
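As a sanity check on the scores posted above, the F1 column can be recomputed from the reported precision and recall. This is a sketch using the standard F1 formula; judging only from these numbers, the "all" row looks like the mean of the per-class F1 values rather than the F1 of the mean precision and recall, but that is an inference, not something the thread confirms.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the per-class rows in the thread.
per_class = {
    "person": (0.276, 0.945),
    "bicycle": (0.239, 0.971),
    "car": (0.333, 0.966),
}
f1 = {name: f1_score(p, r) for name, (p, r) in per_class.items()}
mean_f1 = sum(f1.values()) / len(f1)

for name, value in f1.items():
    print(f"{name:8s} {value:.3f}")
print(f"{'mean':8s} {mean_f1:.3f}")
```

The low precision next to high recall is what drags F1 down here, even though mAP is above 0.9.
|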
okay sure |
There’s only 3 blocks you need to edit.
…On Tue, Feb 25, 2020 at 1:08 AM Joel Prabhod ***@***.***> wrote:
@joehoeller <https://github.com/joehoeller> Still the same error.
|
@joehoeller yes, I've updated the filters and classes in the three blocks. Do I have to include COCO images in the training set, or will just the '.tiff' images do? |
Did you add .tiff to the extensions? Torch doesn't support .tiff out of the box.
|
@joehoeller I've added the .tiff extension in utils/dataset.py. I'm just thinking I should try training with .jpeg first tomorrow and see if I get the error again. One question for you: did you train on both the 8-bit JPEG and COCO data for the Thermal object detection? |
I believe I had jpg images.
|
You just need the right math. Try using stock cfg file and make sure ur
classes & file paths match.
|
@joehoeller okay sure. Will try to fix it tomorrow. |
@joel5638 18 classes means |
@glenn-jocher yeah. I changed the filters and updated n=18. I've updated the same in the cfg and tried. It doesn't work; some minor change is missing. When you asked me to run 'python3 train.py --img640', I got an error stating 'cannot load ../coco/train2014/...'. But I'm not using any COCO dataset. I'm only training the model on 8000 thermal images, no RGB. |
@joel5638 the example command I gave you is an example of how to use the --img-size argument. Obviously you apply the argument to your own command. The image format is irrelevant as long as opencv can open them. |
@Glenn: I thought torch didn't do .tiff out of the box?
|
@joehoeller cv2 loads images and videos. From https://docs.opencv.org/4.2.0/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56
Currently, the following file formats are supported:
Windows bitmaps - *.bmp, *.dib (always supported)
JPEG files - *.jpeg, *.jpg, *.jpe (see the Note section)
JPEG 2000 files - *.jp2 (see the Note section)
Portable Network Graphics - *.png (see the Note section)
WebP - *.webp (see the Note section)
Portable image format - *.pbm, *.pgm, *.ppm *.pxm, *.pnm (always supported)
PFM files - *.pfm (see the Note section)
Sun rasters - *.sr, *.ras (always supported)
TIFF files - *.tiff, *.tif (see the Note section)
OpenEXR Image files - *.exr (see the Note section)
Radiance HDR - *.hdr, *.pic (always supported)
Raster and Vector geospatial data supported by GDAL (see the Note section)
|
Good to know ;)
|
@joehoeller @glenn-jocher I'm using @joehoeller's thermal object detection repository. He mentioned 18 classes and I've set n=18, so filters would be (18+1+4)x3 = 69, which is constant. But it doesn't seem to work. I've also tried 66 as filters. I'm lost and confused about what is going wrong :( |
Post the error from the command line
|
It is the same error.
RuntimeError: shape '[16, 3, 23, 13, 13]' is invalid for input of size 178464
1. I have resized the .tiff images to 160x120 and am using them.
2. I've updated the classes to 18 and filters to 69 in yolov3-spp-r.cfg
3. I've also tried classes as 17 and filters as 66 in yolov3-spp-r.cfg
Nothing seems to work. |
What do your names and data files look like?
|
My custom.data looks like this:
My custom.names looks like this: person
My training_img_paths.txt looks like this: ./coco/images/train/FLIR_00211.tiff |
@joehoeller I just cloned your repository again and just added paths to the FLIR dataset with jpeg thermal images and the cfg you used for training. I ran this command:
python3 train.py --data data/custom.data --cfg cfg/yolov3-spp-r.cfg --weights weights/yolov3-spp.weights
When I change the filters to 69 and classes to 18 in the cfg file, this is the error:
AssertionError: Model accepts 18 classes labeled from 0-17, however you labelled a class 18
But when I change the filters to 66 and classes to 18 in the cfg file, this is the error:
RuntimeError: shape '[16, 3, 23, 13, 13]' is invalid for input of size 178464
Looks like nothing seems to work out with the cfg file. |
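The RuntimeError above is consistent with a filters/classes mismatch, and the arithmetic can be checked by hand. The sketch below uses the (classes + 1 + 4) x 3 rule quoted in this thread; the function names are illustrative and the batch size of 16 and 13x13 grid are read off the error message.

```python
def yolo_filters(num_classes, anchors_per_layer=3):
    """Filters for the conv layer before each [yolo] block: (classes + 5) * anchors."""
    return (num_classes + 5) * anchors_per_layer


def reshape_sizes(batch, filters, num_classes, grid, anchors=3):
    """Return (elements the conv layer produces, elements the reshape expects)."""
    actual = batch * filters * grid * grid
    expected = batch * anchors * (num_classes + 5) * grid * grid
    return actual, expected


print(yolo_filters(18))  # 69
print(yolo_filters(17))  # 66

# filters=66 with classes=18, batch 16, 13x13 grid, as in the error message.
actual, expected = reshape_sizes(batch=16, filters=66, num_classes=18, grid=13)
print(actual, expected)  # 178464 186576
```

The 178464 elements in the error message are exactly 16 x 66 x 13 x 13, while the shape [16, 3, 23, 13, 13] needs 186576, so filters=66 only fits classes=17. The separate AssertionError with filters=69 and classes=18 points at the labels (a class id of 18 is present), not at the cfg.
|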
Just for fun, try the stock spp cfg file, & see if it works: |
@joehoeller @joel5638 yes you can always use the default yolov3-spp.cfg (with no changes) to train custom datasets with up to 80 classes, like an 18 class dataset. It's not an optimal solution, but it works. |
@joehoeller @glenn-jocher I just tried with the stock cfg and I see the same result. AssertionError: Model accepts 18 classes labeled from 0-17, however you labelled a class 18. See https://docs.ultralytics.com/yolov5/tutorials/train_custom_data |
@joel5638 if your data is only labeled for classes between 0 and 17 it’s not possible to see that message. I thought you’d already fixed your labels? |
@joehoeller @glenn-jocher Finally the training has started. I deleted all the 8000 images and their labels and only took 10 images with their labels and tried and it worked out. So, I'm assuming that the images in the train folder need their respective labels to train. |
Oh good! The images used for training should not all need label files. We routinely take custom datasets and add COCO images sans labels for backgrounds. All that’s required is that the background images are listed in the *.txt file along with the rest of the (labeled) images. |
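To see which listed images would be treated as unlabeled backgrounds, one can split the image list by whether a label file exists. This sketch assumes the label path is derived by swapping an `images` directory for `labels` and the extension for `.txt`; that convention and both function names are assumptions for illustration, so check them against the loader you are actually using.

```python
from pathlib import Path
import tempfile


def label_path_for(image_path):
    """Assumed convention: .../images/.../x.jpg -> .../labels/.../x.txt"""
    p = Path(image_path)
    parts = ["labels" if part == "images" else part for part in p.parts]
    return Path(*parts).with_suffix(".txt")


def split_labeled_and_background(image_paths):
    """Separate images that have a label file from those that do not."""
    labeled, background = [], []
    for img in image_paths:
        target = labeled if label_path_for(img).is_file() else background
        target.append(str(img))
    return labeled, background


# Demo on a throwaway directory: a.jpg has labels, b.jpg is a background image.
with tempfile.TemporaryDirectory() as tmp:
    img_dir = Path(tmp, "images", "train")
    lbl_dir = Path(tmp, "labels", "train")
    img_dir.mkdir(parents=True)
    lbl_dir.mkdir(parents=True)
    for name in ("a.jpg", "b.jpg"):
        (img_dir / name).touch()
    (lbl_dir / "a.txt").write_text("0 0.5 0.5 0.2 0.2\n")
    labeled, background = split_labeled_and_background(sorted(img_dir.iterdir()))
    print(len(labeled), len(background))  # 1 1
```

Both groups would go into the training *.txt list; only the first group needs label files on disk.
|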
Oh wow. Now I get the picture. |
@joel5638 you're very welcome! It's a pleasure to assist. Our community is very helpful and supportive, and I'm sure they'll also be pleased to help you out. Keep up the good work! |
Created custom files for training:
(4 + 1 + 13) * 3 = 54
13 classes, 54 filters
*.names has 13 names in it
*.cfg was converted properly w 13 for classes, 54 for filters in all 3 yolo blocks
yolov3/utils/utils.py", line 451, in build_targets
assert c.max() <= model.nc, 'Target classes exceed model classes'
AssertionError: Target classes exceed model classes