Pipeline to train encoder-decoder LMs #3

@akutuzov

We plan to train a set of T5-like language models on the HPLT v3 datasets:

  1. Monolingual models, covering roughly the same list of languages as the HPLT v2 BERT models
  2. A large multilingual model, intended as a modern alternative to mT5

Labels: enhancement (New feature or request)
