This repository was archived by the owner on Apr 12, 2026. It is now read-only.
v0.4.0 — experimental ELECTRA
Added
- Replaced Token Detection (ELECTRA-like) pretraining
  - Some of the API is still provisional: the priority was to get it out, and a nicer interface will
    hopefully come later.
- `--val-check-period` and `--step-save-period` make it possible to evaluate and save a model on a schedule decoupled
  from epochs. This should be useful for training with very long epochs.
- Dataset paths in `zeldarose-transformer` can now be 🤗 hub handles. See `--help`.
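
For readers unfamiliar with replaced token detection, the toy sketch below shows how ELECTRA-style discriminator labels are derived from a corrupted sequence. It is not zeldarose's implementation: `make_rtd_example` and its `mask_ratio` parameter are hypothetical names, and uniform random sampling stands in for the small generator model that a real setup would use.

```python
import random


def make_rtd_example(tokens, vocab, mask_ratio=0.15, rng=None):
    """Corrupt `tokens` and return (corrupted, labels) for a discriminator.

    labels[i] is 1 when corrupted[i] differs from tokens[i], else 0.
    """
    if not tokens:
        return [], []
    if rng is None:
        rng = random.Random()
    corrupted = list(tokens)
    # Corrupt roughly `mask_ratio` of the positions, and at least one.
    n_corrupted = min(len(tokens), max(1, round(len(tokens) * mask_ratio)))
    for i in rng.sample(range(len(tokens)), n_corrupted):
        # Stand-in for the generator (assumption): sample uniformly from vocab.
        corrupted[i] = rng.choice(vocab)
    # A sampled token that happens to equal the original counts as "original".
    labels = [int(c != t) for c, t in zip(corrupted, tokens)]
    return corrupted, labels


tokens = ["the", "cat", "sat", "on", "the", "mat"]
vocab = ["the", "cat", "dog", "sat", "ran", "on", "mat"]
corrupted, labels = make_rtd_example(tokens, vocab, rng=random.Random(0))
print(corrupted, labels)
```

The discriminator is then trained to predict `labels` for every position, rather than only the masked ones as in MLM.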
Changed
- The command line options have been changed to reflect changes in Lightning:
  - `--accelerator` is now used for devices; tested values are `"cpu"` and `"gpu"`
  - `--strategy` now specifies how to train; tested values are `None` (missing), `"ddp"`,
    `"ddp_sharded"`, `"ddp_spawn"` and `"ddp_sharded_spawn"`.
  - There is no longer an option to select sharded training; use the strategy alias for that
  - `--n-gpus` has been renamed to `--num-devices`.
  - `--n-workers` and `--n-nodes` have been renamed to `--num-workers` and `--num-nodes`,
    respectively.
- Training task configs now have a `type` config key to specify the task type
- Lightning progress bars are now provided by Rich
- Now supports PyTorch 1.11 and Python 3.10
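
The option renames listed above are mechanical, so existing launch scripts could be updated with a small helper along these lines. This is only a sketch: `migrate_args` and `RENAMED_OPTIONS` are hypothetical names, not part of zeldarose, and the removed sharded-training option is not handled because these notes do not name it.

```python
# Renames taken from the "Changed" entries above.
RENAMED_OPTIONS = {
    "--n-gpus": "--num-devices",
    "--n-workers": "--num-workers",
    "--n-nodes": "--num-nodes",
}


def migrate_args(argv):
    """Return a copy of `argv` with pre-0.4.0 option names replaced."""
    migrated = []
    for arg in argv:
        # Handle both "--opt value" and "--opt=value" spellings.
        name, sep, value = arg.partition("=")
        migrated.append(RENAMED_OPTIONS.get(name, name) + sep + value)
    return migrated


old = ["--n-gpus", "2", "--n-workers=4", "--n-nodes", "1"]
print(migrate_args(old))
# ['--num-devices', '2', '--num-workers=4', '--num-nodes', '1']
```

Arguments that are not in the table, including positional ones, pass through unchanged.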
Internal
- Tests now run in Pytest using the console-scripts plugin for smoke tests.
- Smoke tests now include `ddp_spawn` tests and tests on gpu devices if available.
- Some refactoring for better factorization of the common utilities for MLM and RTD.