This repository was archived by the owner on Apr 12, 2026. It is now read-only.

v0.4.0 — experimental ELECTRA

@LoicGrobol LoicGrobol released this 18 Mar 11:32
· 355 commits to main since this release
9deef11

Added

  • Replaced Token Detection (ELECTRA-like) pretraining
    • Some of the API is still provisional: the priority was to get it out, and a nicer interface
      will hopefully come later.
  • --val-check-period and --step-save-period, which allow evaluating and saving a model
    independently of epochs. This should be useful for training with very long epochs.
  • The dataset paths in zeldarose-transformer can now be 🤗 hub handles. See --help.
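
As a rough illustration of the Replaced Token Detection objective mentioned above: in ELECTRA-style pretraining, a discriminator predicts, for each position, whether the token was replaced by a generator. The sketch below only shows how such per-token labels can be derived. The function name `rtd_labels` and the token ids are hypothetical and are not taken from zeldarose's code.

```python
from typing import List, Sequence


def rtd_labels(original: Sequence[int], corrupted: Sequence[int]) -> List[int]:
    """Return 1 where the corrupted token differs from the original, else 0.

    Hypothetical helper for illustration; not part of zeldarose's API.
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Made-up token ids: the generator replaced the token at position 2.
original_ids = [101, 7, 42, 9, 102]
corrupted_ids = [101, 7, 13, 9, 102]
labels = rtd_labels(original_ids, corrupted_ids)
print(labels)  # [0, 0, 1, 0, 0]
```

Note that a position where the generator happens to sample the original token is labeled as not replaced, which is what the comparison above does.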

Changed

  • The command line options have been changed to reflect changes in Lightning:
    • --accelerator is now used for devices, tested values are "cpu" and "gpu"
    • --strategy now specifies how to train, tested values are None (missing), "ddp",
      "ddp_sharded", "ddp_spawn" and "ddp_sharded_spawn".
    • There is no longer a separate option to select sharded training; use the corresponding
      strategy alias instead.
    • --n-gpus has been renamed to --num-devices.
    • --n-workers and --n-nodes have been respectively renamed to --num-workers and
      --num-nodes.
  • Training task configs now have a type config key to specify the task type
  • Lightning progress bars are now provided by Rich
  • Now supports PyTorch 1.11 and Python 3.10
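
For scripts that still use the old flag names, the renames listed above could be applied mechanically. The following is a minimal sketch, assuming the arguments are available as a list of strings. `migrate_args` is a hypothetical helper, not something shipped with zeldarose, and it only covers the three pure renames: it does not add --accelerator or --strategy, which have to be chosen by hand.

```python
from typing import Dict, List

# Flag renames taken from the list above.
RENAMED_FLAGS: Dict[str, str] = {
    "--n-gpus": "--num-devices",
    "--n-workers": "--num-workers",
    "--n-nodes": "--num-nodes",
}


def migrate_args(argv: List[str]) -> List[str]:
    """Rewrite old flag names, for both `--flag value` and `--flag=value` forms.

    Hypothetical helper for illustration; arguments that are not renamed
    flags are passed through unchanged.
    """
    migrated = []
    for arg in argv:
        name, sep, value = arg.partition("=")
        migrated.append(RENAMED_FLAGS.get(name, name) + sep + value)
    return migrated


old_args = ["--n-gpus", "2", "--n-workers=4", "--strategy", "ddp"]
new_args = migrate_args(old_args)
print(new_args)  # ['--num-devices', '2', '--num-workers=4', '--strategy', 'ddp']
```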

Internal

  • Tests now run in Pytest, using the console-scripts plugin for smoke tests.
  • Smoke tests now include ddp_spawn tests and tests on GPU devices if available.
  • Some refactoring for better factorization of the common utilities for MLM and RTD.