Can we train (ideally, continually pretrain) the Gemma 3 series of models on t5x? #1629
StephennFernandes started this conversation in Ideas
Replies: 0 comments
Is it possible to train (ideally, continually pretrain) the Gemma 3 series of models on t5x?
Why choose t5x over MaxText?
Because I plan to continually pretrain Gemma 3 with a Mixture of Denoisers objective and bidirectional attention, similar to the UL2 and PaLM 2 models.
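To make the attention part concrete, here is a minimal sketch of the kind of mask I have in mind. It assumes that "bidirectional attention" here means a prefix-LM style mask: positions inside the input prefix attend to each other in both directions, while the remaining positions stay causal. This is plain Python for illustration only, not t5x, MaxText, or Gemma code, and the names `prefix_lm_mask`, `seq_len`, and `prefix_len` are hypothetical.

```python
def prefix_lm_mask(seq_len, prefix_len):
    """Build a prefix-LM attention mask (illustrative sketch, not library code).

    mask[i][j] is True when query position i may attend to key position j.
    Assumption: the first `prefix_len` positions form the bidirectional prefix.
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            in_prefix = j < prefix_len  # every position may see the whole prefix
            causal = j <= i             # otherwise only itself and earlier positions
            row.append(in_prefix or causal)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Print a small mask: 1 = may attend, 0 = masked out.
    for row in prefix_lm_mask(seq_len=5, prefix_len=2):
        print("".join("1" if allowed else "0" for allowed in row))
```

With `prefix_len=0` this reduces to the ordinary causal mask, so the open question is mainly whether the framework lets the mask vary per example alongside the denoising objective.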
I am also open to suggestions on whether it would be better and less cumbersome to explore this option on MaxText.