Is it difficult to train/finetune ConvNeXtv2 compared with ConvNeXtv1? #15
Comments
Thanks for noting this issue.
Thanks for your explanation. I will try these models again.
I have been trying to train ConvNeXt-V2-Tiny again following your new optimization setup. However, my results still do not improve in overall accuracy, require much more GPU memory than V1, and remain much lower than those obtained with ConvNeXt-Tiny. Can you double-check the optimization recipe on CIFAR, MNIST, etc., for instance?
Can confirm it's difficult to fine-tune. ConvNextV1-base gets me 86%-88% on my dataset within 5 epochs, while ConvNextV2-Base can't seem to get over 81% no matter how I tweak the hyperparameters.
any updates on this issue? I'm having the same problem |
@Metal079 Any updates ? |
No |
Dear authors,
I have played around with both ConvNeXt v1 and yours using the TIMM codebase on my own datasets.
Using V1 I don't struggle with training/finetuning on my datasets and am pleased with the overall performance of TIMM's variants.
However, I cannot achieve comparable performance (overall accuracy as well as compute cost, of course) using your V2 variants, regardless of which pretrained weights I use.
Can you give me any tip, trick, or treat for a set of your hyperparameters?
Thanks in advance.
Linh