This repository was archived by the owner on Apr 12, 2026. It is now read-only.

v0.4.0 — experimental ELECTRA

LoicGrobol released this 18 Mar 11:32


9deef11

Added

Replaced Token Detection (ELECTRA-like) pretraining
- Some of the API is still provisional: the priority was to get it out, and a nicer interface
  will hopefully come later.
--val-check-period and --step-save-period, which allow evaluating and saving a model on a
schedule decoupled from epochs. This should be useful for training with very long epochs.
The dataset paths in zeldarose-transformer can now be 🤗 hub handles. See --help.
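
To illustrate what decoupling evaluation and saving from epochs means, here is a minimal sketch. It assumes both periods are counted in training steps, which these notes do not state, and `scheduled_steps` is a hypothetical helper, not part of zeldarose.

```python
def scheduled_steps(total_steps: int, period: int) -> list[int]:
    """Return the 1-based global steps at which a periodic action fires."""
    if period <= 0:
        raise ValueError("period must be positive")
    return list(range(period, total_steps + 1, period))


# Hypothetical run: a single very long epoch of 1000 steps,
# validating every 250 steps and saving every 400 steps.
val_steps = scheduled_steps(1000, 250)
save_steps = scheduled_steps(1000, 400)
```

With an epoch-based schedule, this run would be evaluated and saved only once, at the end of the epoch. With step-based periods, both happen several times inside it.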

Changed

The command line options have been changed to reflect changes in Lightning:
- --accelerator is now used to select the device type; tested values are "cpu" and "gpu".
- --strategy now specifies how to train; tested values are None (missing), "ddp",
  "ddp_sharded", "ddp_spawn" and "ddp_sharded_spawn".
- There is no longer a separate option to select sharded training; use the corresponding
  strategy alias instead.
- --n-gpus has been renamed to --num-devices.
- --n-workers and --n-nodes have been respectively renamed to --num-workers and
  --num-nodes.
Training task configs now have a type config key to specify the task type
Lightning progress bars are now provided by Rich
Now supports PyTorch 1.11 and Python 3.10
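
For scripts that still use the old option names, the renames listed above can be applied mechanically. The following is a small sketch of such a migration; `migrate_args` and `RENAMED_OPTIONS` are hypothetical names introduced here, not something zeldarose provides.

```python
# Old option name -> new option name, as listed in the notes above.
RENAMED_OPTIONS = {
    "--n-gpus": "--num-devices",
    "--n-workers": "--num-workers",
    "--n-nodes": "--num-nodes",
}


def migrate_args(args: list[str]) -> list[str]:
    """Rewrite renamed options, leaving values and other arguments untouched."""
    migrated = []
    for arg in args:
        # Handle both "--opt value" and "--opt=value" spellings.
        name, sep, value = arg.partition("=")
        migrated.append(RENAMED_OPTIONS.get(name, name) + sep + value)
    return migrated


old_args = ["--n-gpus", "2", "--n-workers=4", "--strategy", "ddp"]
new_args = migrate_args(old_args)
```

This only covers the plain renames. The removed sharded-training option and the new meaning of --accelerator need a manual change, since they are not one-to-one substitutions.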

Internal

Tests now run in Pytest, using the console-scripts plugin for smoke tests.
Smoke tests now include ddp_spawn tests and tests on gpu devices if available.
Some refactoring for better factorization of the common utilities for MLM and RTD.
