# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 0.9.0 - 2024-04-17

### Fixed

- Training an m2m100 model on a language (code) not originally included in its tokenizer now works.

### Changed

- PyTorch compatibility changed to >= 2.0, < 2.3
- 🤗 datasets compatibility changed to >= 2.18, < 2.19

## 0.8.0 - 2023-10-06

### Fixed

- Fixed multiple saves when using step-save-period in conjunction with batch accumulation (closes issue #30).

### Changed

- Maximum PyTorch compatibility bumped to 2.1.
- max_steps and max_epochs can now be set in the tuning config (a sketch follows this list). Setting them via command line options is deprecated and will be removed in a future version.
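
A minimal sketch of where these keys might live, assuming a `[tuning]` table in the task config; the table name and the flag names in the comments are assumptions, only the max_steps and max_epochs keys come from this entry:

```toml
# Hypothetical placement of the new keys
[tuning]
max_steps = 10000  # previously a command line option
max_epochs = 2     # previously a command line option
```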

## 0.7.3 - 2023-02-27

### Fixed

- Behaviour when asking for denoising in mBART with a model that has no mask token.

## 0.7.2 - 2023-02-26

### Fixed

- In mBART training, loss scaling now works as it was supposed to.
- We have documentation now! Check it out at https://zeldarose.readthedocs.io; it will get better over time (hopefully!).

## 0.7.1 - 2023-02-25

### Fixed

- The translation loss is no longer always logged as zero.

## 0.7.0 - 2023-02-25

### Added

- The --tf32-mode option allows selecting the level of NVIDIA Ampere matmul optimisations.
- The --seed option allows fixing the random seed.
- The mbart task allows training general seq2seq and translation models.
- A zeldarose command that serves as an entry point for both tokenizer and transformer training; a sketch follows this list.
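
A hedged sketch of how the new entry point and options might combine; the subcommand layout, flag values, and file names below are illustrative assumptions, only zeldarose, --tf32-mode, --seed, and the mbart task are named in this entry:

```sh
# Hypothetical invocation: all paths and values are placeholders
zeldarose transformer \
  --tf32-mode medium \
  --seed 42 \
  --config mbart-task.toml \
  --out-dir out/ \
  train.tsv
```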

### Changed

- BREAKING: --use-fp16 has been replaced by --precision, which also allows fp64 and bfloat16. The previous behaviour can be emulated with --precision 16 (see the sketch after this list).
- Removed the GPU stats logging from the profile mode, since Lightning stopped supporting it.
- Switched TOML library from toml to tomli.
- BREAKING: Bumped the minimum version of several dependencies:
  - pytorch-lightning >= 1.8.0
  - torch >= 1.12
- Bumped the maximum version of several dependencies:
  - datasets < 2.10
  - pytorch-lightning < 1.9
  - tokenizers < 0.14
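
A minimal migration sketch for the precision flag, assuming an otherwise unchanged invocation (the config and data arguments are placeholders):

```sh
# Before (0.6.x and earlier)
zeldarose-transformer --use-fp16 --config task.toml --out-dir out/ train.txt

# After (0.7.0): the same behaviour with the new flag
zeldarose-transformer --precision 16 --config task.toml --out-dir out/ train.txt
```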

## 0.6.0 - 2022-07-28

### Changed

- Bumped torchmetrics minimum version to 0.9
- Bumped datasets minimum version to 2.4
- Bumped torch maximum version to 1.12

### Fixed

- Dataset fingerprinting/caching issues (#31)

## 0.5.0 - 2022-03-31

### Changed

- Moved the packaging config to pyproject.toml and now require setuptools >= 61.
- click_pathlib is no longer a dependency and click now has a minimum version of 8.0.3.

## 0.4.0 - 2022-03-18

### Added

- Replaced Token Detection (ELECTRA-like) pretraining.
  - Some of the API is still provisional; the priority was to get it out, and a nicer interface will hopefully come later.
- --val-check-period and --step-save-period allow evaluating and saving a model on a schedule decoupled from epochs. This should be useful for training with very long epochs.
- The dataset paths in zeldarose-transformer can now be 🤗 hub handles (see --help and the sketch after this list).
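
A hedged sketch combining the new periodic evaluation/saving options with a hub handle as the dataset argument; the handle, the period values, the other flags, and the argument placement are all placeholders:

```sh
# Hypothetical invocation: "some-user/some-dataset" stands in for any 🤗 hub handle
zeldarose-transformer \
  --config mlm-task.toml \
  --val-check-period 1000 \
  --step-save-period 5000 \
  some-user/some-dataset
```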

### Changed

- The command line options have been changed to reflect changes in Lightning:
  - --accelerator is now used for devices; tested values are "cpu" and "gpu".
  - --strategy now specifies how to train; tested values are None (missing), "ddp", "ddp_sharded", "ddp_spawn" and "ddp_sharded_spawn".
  - There is no longer a dedicated option for sharded training; use the corresponding strategy alias instead.
  - --n-gpus has been renamed to --num-devices.
  - --n-workers and --n-nodes have been renamed to --num-workers and --num-nodes respectively.
- Training task configs now have a type config key to specify the task type (sketched after this list).
- Lightning progress bars are now provided by Rich.
- Now supports PyTorch 1.11 and Python 3.10.
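
A minimal sketch of a task config with the new key, assuming a top-level placement; everything below except the type key itself is an assumption:

```toml
# Hypothetical task config: only the `type` key is documented by this entry
type = "mlm"

[task]
change_ratio = 0.15  # illustrative task parameter, name assumed
```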

### Internal

- Tests now run in Pytest, using the console-scripts plugin for smoke tests.
- Smoke tests now include ddp_spawn tests and tests on GPU devices if available.
- Some refactoring for better factorization of the common utilities for MLM and RTD.

## 0.3.4 - 2021-12-21

### Changed

- Bumped lightning to 1.5.x

## 0.3.3 - 2021-11-01

### Changed

- max_steps is now automatically inferred from the tuning config if a number of lr decay steps is given (see the sketch after this list).
- max_epochs is now optional (if both max_steps and max_epochs are unset and no lr schedule is provided, Lightning's default will be used).
- find_unused_parameters is now disabled in DDP mode, unless in profile mode.
- Bumped lightning to 1.4.x.
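
A hedged sketch of a tuning config relying on this inference; the key names below (learning_rate, warmup_steps, lr_decay_steps) are assumptions about the config schema, not confirmed by this entry:

```toml
# Hypothetical tuning section: key names assumed
[tuning]
learning_rate = 1e-4
warmup_steps = 1000
lr_decay_steps = 9000  # with a decay horizon given, max_steps can be inferred from it
```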

### Fixed

- Linear decay now properly takes the warmup period into account.

## 0.3.2 - 2021-05-31

### Fixed

- Accuracy should stop NaN-ing.
- Empty lines in datasets are now ignored.

## 0.3.1 - 2021-05-19

### Fixed

- Stopped saving tokenizers in the legacy format when training transformers as well.
- The RoBERTa tokenizers now correctly use ByteLevel processing, making them consistent with 🤗 transformers.
- Added back automatic truncation of inputs when training transformers.

### Removed

- The --overwrite-cache option, which has been a no-op since 0.2.0, has been removed. Resetting the cache should be done manually if needed (but usually shouldn't be needed).

## 0.3.0 - 2021-04-23

### Changed

- Stopped saving tokenizers in the legacy format.
- Data directories are now created if they don't exist.

## 0.2.0 - 2021-04-23

### Added

- --checkpoint option to load an existing Lightning checkpoint.
- DDP sharding is now also possible with ddp_spawn.

### Changed

- Text datasets are now loaded line-by-line by default and the block mode has been removed.
- We now use 🤗 datasets as backend, so the datasets are implemented as memory-mapped files with dynamic loaders instead of being held in RAM. This significantly decreases RAM consumption at a very reasonable speed cost and allows us to train on much larger datasets.
- GPU usage is now logged in --profile mode when relevant.
- LR is now logged.

### Removed

- The --line-by-line flag has been removed, since this is now the default behaviour.
- The zeldarose-create-cache command has been removed, since dataset processing now works correctly in DDP.
- The data module has been completely rewritten and the Dataset classes are no more.
- mlm.masked_accuracy, since it was not used anywhere.

### Fixed

- Logging has been improved for internal PyTorch warnings, as well as for pytorch-lightning and 🤗 transformers.

## 0.1.1 - 2021-04-06

### Fixed

- Updated some obsolete documentation.

## 0.1.0 - 2021-04-06

Initial release.