Can we fine-tune the Text-to-Code Retrieval task? #146

Open
pdhung3012 opened this issue Aug 31, 2023 · 2 comments

Comments

@pdhung3012

Hello
I wonder whether we can fine-tune for the Text-to-Code Retrieval task, as UniXcoder does here.
I have run zero-shot code retrieval for JavaScript. The best accuracy I can get is 70.2%, which is lower than the 71.3% of the fine-tuned CodeT5+ (reported in the CodeT5+ paper here). So I want to check whether I can improve on the zero-shot result by fine-tuning.

Thank you

@yuewang-cuhk
Contributor

Yes, you can definitely finetune on labeled datasets using contrastive loss (or combined with the matching loss) to further boost the retrieval performance. We plan to release the finetuning scripts in the future if there are many asks for this.
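
For readers unfamiliar with the contrastive loss mentioned above, here is a minimal, framework-free sketch of one common form: a symmetric in-batch InfoNCE loss over paired text and code embeddings. It is an illustration only, not the CodeT5+ training code. The function names, the temperature of 0.07, and the toy 2-dimensional embeddings are all assumptions made for this example. In practice the same computation would be done on encoder outputs with a tensor library so that gradients can flow back into the model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def in_batch_contrastive_loss(text_embs, code_embs, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of (text_i, code_i) pairs.

    Pair i is the positive; every other item in the batch acts as a
    negative. The temperature value is an assumed, commonly used default.
    """
    texts = [l2_normalize(v) for v in text_embs]
    codes = [l2_normalize(v) for v in code_embs]
    n = len(texts)

    # Cosine similarity matrix, scaled by the temperature.
    sims = [
        [sum(a * b for a, b in zip(t, c)) / temperature for c in codes]
        for t in texts
    ]

    # Text-to-code direction: each row should peak on its diagonal entry.
    t2c = -sum(log_softmax(sims[i])[i] for i in range(n)) / n

    # Code-to-text direction: same idea over the columns.
    cols = [[sims[i][j] for i in range(n)] for j in range(n)]
    c2t = -sum(log_softmax(cols[j])[j] for j in range(n)) / n

    return (t2c + c2t) / 2


# Toy check: correctly paired embeddings should score a lower loss
# than the same embeddings paired in the wrong order.
pairs = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = in_batch_contrastive_loss(pairs, pairs)
loss_swapped = in_batch_contrastive_loss(pairs, pairs[::-1])
print(loss_aligned < loss_swapped)
```

Fine-tuning would minimize this loss over labeled query/code pairs. The matching loss mentioned in the reply is a separate objective and is not covered by this sketch.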

@gzt4se

gzt4se commented Jan 4, 2024

> Yes, you can definitely finetune on labeled datasets using contrastive loss (or combined with the matching loss) to further boost the retrieval performance. We plan to release the finetuning scripts in the future if there are many asks for this.

I would like to ask whether open-source fine-tuning scripts for Text-to-Code Retrieval with CodeT5+ are now available to share. Thanks a lot!
