how to decide ideal mixture rates ? #299

Closed
StephennFernandes opened this issue Aug 12, 2022 · 1 comment

Comments

@StephennFernandes

What is the best way to decide which mixture ratio is optimal?

In the mT5 paper, an alpha value of 0.3 gave the best balance between performance on high-resource and low-resource languages.

However, I am pretraining mT5 on Indian languages, and my multilingual corpus is highly imbalanced: Hindi has 60M+ samples, while Kashmiri has around 100k samples.

So I wanted to know whether I could run a hyperparameter search over alpha somehow in t5x, or whether just using alpha=0.3 would work fine in my use case.
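For context, the mT5 paper samples each language L with probability proportional to |L|^alpha, where |L| is the number of examples in that language. A minimal sketch of what that implies for the two counts mentioned above is shown here. The language codes `hi` and `ks`, the rounded counts, and the helper name `mixing_rates` are illustrative assumptions, not part of t5x.

```python
def mixing_rates(example_counts, alpha=0.3):
    """Return sampling probabilities p(L) proportional to count(L) ** alpha.

    alpha=1.0 samples in proportion to corpus size; alpha=0.0 samples
    every language uniformly.
    """
    weights = {lang: n ** alpha for lang, n in example_counts.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}


# Approximate counts from the question (hypothetical language codes).
counts = {"hi": 60_000_000, "ks": 100_000}

rates = mixing_rates(counts, alpha=0.3)
for lang, p in rates.items():
    print(f"{lang}: {p:.3f}")
```

With any alpha below 1, Kashmiri is sampled more often than its raw share of the corpus, which is the intended boost for low-resource languages.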

@gauravmishra
Collaborator

Hi @StephennFernandes, deciding mixture rates is a research problem, so this is not a straightforward question to answer. I'd recommend doing an hparam search to arrive at a good set of mixing rates if possible (or surveying other papers to find acceptable rates).
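One inexpensive way to narrow such a search before training anything is to check how many times each language's data would be repeated under each candidate alpha. The sketch below does that, assuming the |L|^alpha sampling rule from the mT5 paper. The candidate alphas, the training budget of 100M examples, and the language codes are hypothetical placeholders, and nothing here calls t5x itself.

```python
def epochs_per_language(example_counts, alpha, total_examples_seen):
    """Estimate how many passes over each language's data a run would make.

    Assumes sampling probability proportional to count ** alpha.
    """
    weights = {lang: n ** alpha for lang, n in example_counts.items()}
    total = sum(weights.values())
    return {
        lang: total_examples_seen * (w / total) / example_counts[lang]
        for lang, w in weights.items()
    }


# Approximate counts from the question (hypothetical language codes).
counts = {"hi": 60_000_000, "ks": 100_000}
budget = 100_000_000  # hypothetical number of examples seen during training

# Candidate alphas to compare before committing to full training runs.
sweep = {
    alpha: epochs_per_language(counts, alpha, budget)
    for alpha in (0.2, 0.3, 0.5, 0.7, 1.0)
}

for alpha, epochs in sweep.items():
    summary = ", ".join(f"{lang}={e:.1f}" for lang, e in epochs.items())
    print(f"alpha={alpha}: epochs {summary}")
```

A very high repeat count for the smallest language suggests a risk of overfitting to it, while a count near that of the largest language suggests it is barely upsampled. The candidates that survive this check still need to be compared by training and evaluating on held-out data for each language.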
