Finetuning with small dataset #24

Closed
LeDuySon opened this issue May 12, 2024 · 10 comments

@LeDuySon

First of all, thank you for your great project. I would like some recommendations on fine-tuning with a small dataset (around 400 images). My task is main-car segmentation: segment only one car even if the image contains multiple cars, where the main car is the biggest one and sits in the middle of the image.

  • Which layers should I freeze?
  • What learning rate should I start with?
  • Since car segmentation is fairly simple, should I turn off any loss components?
  • Do you have any idea how many images are enough for training?

Because my resources are limited, I can't try all of these things myself, so your suggestions would be really helpful.
@ZhengPeng7
Owner

  1. Model: Since you have only a little data, I suggest using a smaller model, e.g., choosing swin_v1_tiny as the backbone in config.py (remember to put the backbone weights in the right place).
  2. Freezing layers: I'm not sure whether freezing some layers helps training. If you want to try it, you can turn on the freeze_bb option in config.py to freeze the backbone layers.
  3. Loss: You can turn off the ssim loss in config.py, since it mainly benefits segmentation in fine regions, which is unnecessary in your case. In my experience, IoU loss converges much faster but gives lower accuracy. If you want to see results sooner, you can leave only the IoU loss on.
  4. If you do not have extra data, you can split off 40 images for validation.
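
For the split in item 4, here is a minimal sketch of a reproducible hold-out split. It is not code from the project: the helper name `split_train_val`, the seed, and the file names are assumptions made up for illustration.

```python
import random

def split_train_val(paths, val_count=40, seed=0):
    """Shuffle a copy of `paths` and hold out `val_count` items for validation."""
    if not 0 < val_count < len(paths):
        raise ValueError("val_count must be between 1 and len(paths) - 1")
    shuffled = sorted(paths)               # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split reproducible
    return shuffled[val_count:], shuffled[:val_count]

# Hypothetical file names standing in for the ~400 images.
images = [f"car_{i:04d}.jpg" for i in range(400)]
train, val = split_train_val(images)
print(len(train), len(val))  # 360 40
```

Fixing the seed keeps the same 40 validation images across runs, so results from different settings stay comparable.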

If you have further questions, feel free to leave messages :)

@LeDuySon
Author

Thank you so much for your detailed answer.

About swin_v1_tiny: could there be a version of it trained with the massive dataset? I found that the massive-training model is much better in general cases than the one trained on only a single dataset.

About the loss function: I thought about it again, because my input images can sometimes look like the one below, which is fairly complex, so I will need to try it myself. Do you have any recommendations on where to rent GPUs?

image

@ZhengPeng7
Owner

Thanks for your feedback! However, the massive training takes a lot of time, even with swin-tiny, and I haven't started it yet. I can only say that I may set aside my own time and GPU time for it in the future.

If your cases are similar to the image above, I recommend using the default settings of losses in my project.
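
To illustrate what turning a loss component on or off means in this thread, here is a rough pure-Python sketch of a weighted sum of loss terms. It is not the project's implementation: the function names, the weight keys, and the soft IoU formula are assumptions for illustration, and the ssim term is left out entirely.

```python
import math

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)

def soft_iou_loss(pred, target, eps=1e-7):
    """1 - soft IoU, with intersection and union computed on probabilities."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)

def total_loss(pred, target, weights):
    """Weighted sum of the loss terms; a weight of 0 disables that term."""
    terms = {"bce": bce_loss, "iou": soft_iou_loss}
    return sum(w * terms[name](pred, target) for name, w in weights.items() if w)

# Toy 4-pixel example: two foreground pixels, two background pixels.
pred = [0.9, 0.8, 0.2, 0.1]
target = [1, 1, 0, 0]
both = total_loss(pred, target, {"bce": 1.0, "iou": 1.0})
iou_only = total_loss(pred, target, {"bce": 0.0, "iou": 1.0})
```

Setting a weight to zero removes that term from the sum, which is the idea behind leaving only the IoU loss on.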

As for renting GPUs, I personally recommend autodl, which is the cheapest platform I have used. But if you are not in China (as you know, firewalls block access to things like Google), I recommend looking for GPUs on vast.ai. BTW, if you want to use the default training setting (bs=2, bb=Swin-L), you need GPUs with more than 37G of memory. If you want to train with swin-tiny, you can use GPUs with 24G of memory. A batch size larger than 1 is preferable (I've tested full training with bs=1).
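
The memory figures above can be turned into a small lookup when comparing rental offers. This is only a sketch: `suggest_setup` is a hypothetical helper, the thresholds are the numbers quoted in this comment, and actual usage will vary with image size and other settings.

```python
def suggest_setup(gpu_mem_g):
    """Map GPU memory (in G, as quoted in this thread) to a setup mentioned here."""
    if gpu_mem_g > 37:
        return "Swin-L, bs=2 (default training setting)"
    if gpu_mem_g >= 24:
        return "swin-tiny"
    return None  # the thread gives no guidance below 24G

for mem in (48, 24, 16):
    print(mem, "->", suggest_setup(mem))
```

With PyTorch installed, the total memory of the first GPU can be read in bytes from `torch.cuda.get_device_properties(0).total_memory`.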

@LeDuySon
Author

Thank you!
Have you tried exporting this model to ONNX? I plan to deploy it to Triton Inference Server after training, so if you haven't, I may try it myself and get back to you.

@ZhengPeng7
Owner

Sorry, I haven't done this kind of thing. But if you encounter some problems while doing the deployment, which you think I may know about, feel free to leave messages here. Good luck!

@LeDuySon
Author

Thank you!

@LeDuySon
Author

Hi @ZhengPeng7, I know this is not related to this discussion, but I can't load BiRefNet_DIS_ep500-swin_v1_tiny anymore. Do you know why? I changed the backbone in the config to swin_v1_t, but loading the checkpoint reports mismatches in many layers.

@ZhengPeng7
Owner

There were some differences between the previous code and the descriptions in the paper in terms of model architecture. I made modifications so that the two are now fully consistent with each other.
I'll try to train a swin-tiny version in the massive training setting and will reply to you once it's done.
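
For anyone hitting a similar mismatch when loading an older checkpoint, a small diagnostic like the one below can show which parameters differ. It is a sketch only: `diff_shapes` is a hypothetical helper, and the parameter names and shapes are invented for illustration, not taken from the project.

```python
def diff_shapes(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings and report the differences."""
    missing = sorted(set(model_shapes) - set(ckpt_shapes))     # in the model, not in the checkpoint
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))  # in the checkpoint, not in the model
    mismatched = {
        name: (tuple(model_shapes[name]), tuple(ckpt_shapes[name]))
        for name in sorted(set(model_shapes) & set(ckpt_shapes))
        if tuple(model_shapes[name]) != tuple(ckpt_shapes[name])
    }
    return missing, unexpected, mismatched

# Hypothetical parameter names and shapes, for illustration only.
model = {
    "bb.patch_embed.weight": (96, 3, 4, 4),
    "decoder.conv.weight": (64, 96, 3, 3),
}
ckpt = {
    "bb.patch_embed.weight": (96, 3, 4, 4),
    "decoder.conv.weight": (64, 192, 3, 3),
    "decoder.old_block.bias": (64,),
}
missing, unexpected, mismatched = diff_shapes(model, ckpt)
print(missing, unexpected, mismatched)
```

With PyTorch, each mapping can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, once from `model.state_dict()` and once from the loaded checkpoint.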

@LeDuySon
Author

Thanks man!

@ZhengPeng7
Owner

Feel free to reopen this issue if you have any more questions.
