What is the best way to decide which mixture ratio is optimal?
In the mT5 paper, an alpha value of 0.3 gave the best balance between performance on high- and low-resource languages.
However, I am pretraining mT5 on Indian languages, and I have a diverse multilingual Indian corpus in which Hindi has 60M+ samples and Kashmiri has around 100k.
So I wanted to know whether I could tune this hyperparameter somehow in t5x, or whether just using alpha=0.3 would work fine for my use case.
Hi @StephennFernandes, deciding mixture rates is a research problem, so this is not a straightforward question to answer. I'd recommend doing an hparam search to arrive at a good set of mixing rates if possible (or surveying other papers to find acceptable rates).
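To get a feel for what different alpha values would do before launching any runs, here is a minimal sketch of the sampling rule described in the mT5 paper, where each language is sampled with probability proportional to its corpus size raised to the power alpha. The `mixing_rates` helper and the two-language `sizes` dict are hypothetical and only use the approximate counts from the question; a real mixture would include every language in the corpus, and the resulting rates would still need to be validated on downstream tasks.

```python
def mixing_rates(sizes, alpha):
    """Return sampling probabilities p(L) proportional to |L| ** alpha.

    sizes: dict mapping language name to number of examples.
    alpha: smoothing exponent; 1.0 samples proportionally to corpus size,
           smaller values shift probability toward low-resource languages.
    """
    weights = {lang: n ** alpha for lang, n in sizes.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}


# Approximate corpus sizes taken from the question (assumption: only these
# two languages are shown; the real corpus has more).
sizes = {"hindi": 60_000_000, "kashmiri": 100_000}

# Candidate alphas for a small hparam sweep (hypothetical grid).
for alpha in (1.0, 0.7, 0.5, 0.3, 0.2):
    rates = mixing_rates(sizes, alpha)
    print(alpha, {lang: round(p, 4) for lang, p in rates.items()})
```

With only these two languages, Kashmiri's share goes from well under 1% at alpha=1.0 to roughly 13% at alpha=0.3. Each set of rates could then be tried as the per-task rates of a mixture and compared on held-out data for both the high- and low-resource languages.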