Releases · ludwig-ai/ludwig
Ludwig v0.10.4
What's Changed
- Small typo in dequantization script by @arnavgarg1 in #3993
- Docs: update visualize.py by @eltociear in #4001
- [MAINTENANCE] Most Recent Version of matplotlib breaks ptitprince and seaborn method calls. by @alexsherstinsky in #4007
- [MAINTENANCE] Make the implementation for the fix of the ViTEncoder to ensure that the transformers.ViTModel returns the output_attentions more elegant (and cuts on the amount of code) by @alexsherstinsky in #4008
- Fix mnist source by @mhabedank in #4011
- Support for freezing pretrained vision model layers with regex by @ethanreidel in #3981
- Add Phi-3 Support by @arnavgarg1 in #4014
New Contributors
- @eltociear made their first contribution in #4001
- @mhabedank made their first contribution in #4011
Full Changelog: v0.10.3...v0.10.4
Ludwig v0.10.3
What's Changed
- Replace Slack links with Discord links. by @alexsherstinsky in #3988
- Allow image bytes type during preprocessing by @vijayi1 in #3971
- Fix for 'upload_to_hf_hub()' path mismatch with 'save()' by @sanjaydasgupta in #3977
- Minor change to fix the incorrect response truncation by @amankhandelia in #3986
- Pin minimum transformers to 4.39 to reduce Llama/Gemma memory pressure by @arnavgarg1 in #3976
- Actually add support for RSLoRA and DoRA by @arnavgarg1 in #3984
New Contributors
- @amankhandelia made their first contribution in #3986
Full Changelog: v0.10.2...v0.10.3
Release v0.10.2
What's New
- Add support for RSLoRA and DoRA by @arnavgarg1 in #3948
To enable, set the corresponding flag to `true` in the adapter section of the config. Both flags default to `false` and can be used together (a combined usage sketch follows this list):
```yaml
adapter:
  type: lora
  use_rslora: false
  use_dora: false
```
- Add support for eval batch size tuning for LLMs on local backend by @arnavgarg1 in #3957
To enable, set "eval_batch_size" to "auto" in the trainer section:
trainer:
eval_batch_size: auto
- Enable loading model weights from training checkpoint by @geoffreyangus in #3969
To enable, pass `from_checkpoint=True` to `LudwigModel.load()`:
```python
LudwigModel.load(model_dir, from_checkpoint=True)
```
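The following is a minimal, illustrative sketch of how the three additions above might be combined through the Python API. The base model, dataset columns, and output paths are placeholders rather than anything specific to this release, and the checkpoint path assumes Ludwig's default `results/api_experiment_run` output layout:
```python
# Illustrative sketch only: base model, data, and paths are placeholders.
import pandas as pd

from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-7b-hf",  # placeholder base model
    "input_features": [{"name": "instruction", "type": "text"}],
    "output_features": [{"name": "output", "type": "text"}],
    "adapter": {
        "type": "lora",
        "use_rslora": True,  # rank-stabilized LoRA scaling
        "use_dora": True,    # weight-decomposed LoRA; the two flags can be combined
    },
    "trainer": {
        "type": "finetune",
        "eval_batch_size": "auto",  # tune the eval batch size automatically
    },
}

# Tiny placeholder dataset; substitute real instruction-tuning data.
df = pd.DataFrame({"instruction": ["Say hello."], "output": ["Hello!"]})

model = LudwigModel(config)
model.train(dataset=df, output_directory="results")

# Reload weights from the training checkpoint rather than the exported model.
model = LudwigModel.load("results/api_experiment_run/model", from_checkpoint=True)
```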
Full Changelog
- Save ludwig-config with model-weights in output directory by @sanjaydasgupta in #3965
- Add unit tests for image utils unet functions by @vijayi1 in #3921
- fix: Update imdb_genre_prediction dataset yaml to match dataset by @jeffreyftang in #3944
- Fix kube apt source by @noyoshi in #3952
- Temporarily disable expensive text metrics by @arnavgarg1 in #3954
- [MAINTENANCE] Comment Out PyTorch Nightly Test by @alexsherstinsky in #3955
- [BUGFIX] Fixing integration test failures. by @alexsherstinsky in #3959
- [MAINTENANCE] Use latest version of psutil library. by @alexsherstinsky in #3956
New Contributors
- @sanjaydasgupta made their first contribution in #3965
Full Changelog: v0.10.1...v0.10.2
Ludwig v0.10.1
What's Changed
- Fixed a critical bug in Gemma model fine-tuning that prevented the model from learning when to stop generation. The fix uses the EOS token in the target tensor during instruction tuning by @geoffreyangus in #3945 (illustrated schematically below)
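To make the mechanism concrete, here is a schematic sketch, not Ludwig's actual implementation: appending the tokenizer's EOS token to the target sequence means the fine-tuning loss rewards emitting EOS, so the model learns where responses end. The token id and target ids below are placeholders.
```python
# Schematic sketch only (not Ludwig's code): appending the EOS token to the
# instruction-tuning targets teaches the model when to stop generating.
import torch

eos_token_id = 2  # placeholder; in practice taken from the model's tokenizer
target_ids = torch.tensor([3363, 257, 1332])  # placeholder tokenized response

# With EOS appended, the cross-entropy loss now includes a step whose correct
# prediction is "stop", so generation terminates after fine-tuning.
target_ids_with_eos = torch.cat([target_ids, torch.tensor([eos_token_id])])
print(target_ids_with_eos)
```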
Full Changelog: v0.10.0...v0.10.1
v0.10.0
What's Changed
- Add Phi-2 to model presets by @arnavgarg1 in #3912
- Add default LoRA target modules for Phi-2 by @arnavgarg1 in #3911
- Add support for prompt lookup decoding during generation by @arnavgarg1 in #3917
- Pin pyarrow to < 15.0.0 by @arnavgarg1 in #3918
- Add unet encoder-decoder and image output feature by @vijayi1 in #3913
- fix: Add Nested quantization check by @jeffkinnison in #3916
- fix typo in save_dequantized_base_model log statement by @arnavgarg1 in #3923
- Add example for base model dequantization/upscaling by @arnavgarg1 in #3924
- fix: Always return a list of quantization bits values from `get_quantization` by @jeffkinnison in #3926
- fix: set `use_reentrant` to `True` to fix `Mixtral-7b` bug by @geoffreyangus in #3928
- Disabling AdaptionPrompt till PEFT is fixed. by @alexsherstinsky in #3935
- Add default LoRA target modules for Gemma by @arnavgarg1 in #3936
- Pinning transformers to 4.38.1 or above in order to ensure support for Gemma by @alexsherstinsky in #3940
- Ludwig release version change by @alexsherstinsky in #3941
New Contributors
Full Changelog: v0.9.3...v0.10.0
v0.9.3
What's Changed
- [MAINTENANCE] Use Trusted Publishers credentials instead of User/Password for uploading releases to PyPi by @alexsherstinsky in #3892
- Add support for official `microsoft/phi-2` by @arnavgarg1 in #3880
- Ensure correct padding token for Phi and Pythia models by @arnavgarg1 in #3899
- Enable AdaLoRA tests for LLM adapter by @jeffkinnison in #3896
- Cast `LLMEncoder` output to `torch.float32`, freeze final layer at init. by @jeffkinnison in #3900
- Enable IA3 adapters in `LLMEncoder` by @jeffkinnison in #3902
- [Maintenance] Remove torch nightly pin by @arnavgarg1 in #3903
- Pin deepspeed to < 0.13 and pandas to < 2.2.0 by @arnavgarg1 in #3906
- Add batch size tuning for LLMs by @Infernaught in #3871
Full Changelog: v0.9.2...v0.9.3
v0.9.2: Fixes for OOM and other errors in Ludwig 0.9.1
What's Changed
- fix: Handle missing and unexpected keys during LLMEncoder state dict load by @jeffkinnison in #3841
- fix: Add `name` and `description` classmethods to `IA3Config` by @jeffkinnison in #3844
- Improve IA3 long description by @arnavgarg1 in #3845
- fix: Handle missing and unexpected keys during LLMEncoder state dict load, part 2 by @jeffkinnison in #3843
- Update description for max_new_tokens to explain the dynamic setting behavior in our docs by @arnavgarg1 in #3847
- Add default LoRA target modules for Mixtral and Mixtral instruct by @arnavgarg1 in #3852
- QOL: Fail config validation if a user tries to use ECD with a text output feature and an LLM encoder. by @justinxzhao in #3792
- Pin minimum transformers to 4.36 for Mixtral and Phi support by @arnavgarg1 in #3854
- Revert hack that leads to OOM during fine-tuning by @arnavgarg1 in #3858
- Add support for exporting models to Carton by @VivekPanyam in #3797
- [Maintenance] Bump minimum tokenizers to 0.15 by @arnavgarg1 in #3856
- fix: correct typo in FeatureCollection by @dennisrall in #3863
- Convert test main script in algorithm_utils to unit test by @dennisrall in #3864
- Allow hyperopt config to be loaded from a file by @arnavgarg1 in #3865
- fix: unify ludwig training set metadata and hf pad token by @geoffreyangus in #3860
- Add a utility to detect LLM usage in a config by @jeffkinnison in #3869
- Early stop training if model weights have nan or inf tensors by @arnavgarg1 in #3740
- Scrub credentials from model_hyperparameters.json and description.json by @Infernaught in #3866
- [Maintenance] Bump minimum torch version to 2.0.0 by @arnavgarg1 in #3873
- [Maintenance] Fix docker images by pinning ray==2.3.1, daft==0.1.20, unpinning proto, and using torch 2.1.1. by @justinxzhao in #3872
- [BUGFIX] Guard against UnicodeEncodeError when saving validation results in Google Colab environment by @alexsherstinsky in #3875
- Docker image fixes part 2: pin to torch==2.1.0, add dependency for urllib<2 by @arnavgarg1 in #3877
- Add custom `prepare_for_training` logic to ECD model for LLM encoder adapter initialization by @jeffkinnison in #3874
- qol: Fix some lints. by @justinxzhao in #3868
- [Maintenance] Docker Image Fix part 3: fix torchaudio 2.1.0 dependencies by installing `libsox-dev` and update API by @arnavgarg1 in #3879
- Add streaming support for zero shot inference by @arnavgarg1 in #3878
- [Maintenance] Remove torchdata pin for nightly install by @arnavgarg1 in #3855
- Add per-step token utilization to tensorboard and progress tracker. by @justinxzhao in #3867
- Set use_reentrant to False for gradient checkpointing by @arnavgarg1 in #3882
- [BUGFIX] Pinning torch nightly to January 13, 2024 to avoid AttributeError by @alexsherstinsky in #3885
New Contributors
- @VivekPanyam made their first contribution in #3797
Full Changelog: v0.9.1...v0.9.2
v0.9.1
What's Changed
- fix: Handle missing and unexpected keys during LLMEncoder state dict load by @jeffkinnison in #3841
- fix: Add `name` and `description` classmethods to `IA3Config` by @jeffkinnison in #3844
- Improve IA3 long description by @arnavgarg1 in #3845
- bump ludwig version by @geoffreyangus in #3846
Full Changelog: v0.9...v0.9.1
v0.9: Mixtral, Phi, Zephyr, and text classification for LLMs
What's Changed
- int: Rename original `combiner_registry` to `combiner_config_registry`, update decorator name by @ksbrar in #3516
- Add mechanic to override default values for generation during model.predict() by @justinxzhao in #3520
- [feat] Support for numeric date feature inputs by @jeffkinnison in #3517
- Add new synthesized `response` column for text output features during postprocessing by @arnavgarg1 in #3521
- Disable flaky twitter bots dataset loading test. by @justinxzhao in #3439
- Add test that verifies that the generation config passed in at model.predict() is used correctly. by @justinxzhao in #3523
- Move loss metric to same device as inputs by @Infernaught in #3522
- Add comment about batch size tuning by @arnavgarg1 in #3526
- Ensure user sets backend to local w/ quantization by @Infernaught in #3524
- README: Update LLM fine-tuning config by @arnavgarg1 in #3530
- Revert "Ensure user sets backend to local w/ quantization (#3524)" by @tgaddair in #3531
- Improve observability during LLM inference by @arnavgarg1 in #3536
- [bug] Pin pydantic to < 2.0 by @jeffkinnison in #3537
- [bug] Support preprocessing `datetime.date` date features by @jeffkinnison in #3534
- Remove obsolete prompt tuning example. by @justinxzhao in #3540
- Add Ludwig 0.8 notebook to the README by @arnavgarg1 in #3542
- Add `effective_batch_size` to auto-adjust gradient accumulation by @tgaddair in #3533
- Refactor evaluation metrics to support decoded generated text metrics like BLEU and ROUGE. by @justinxzhao in #3539
- Fix sequence generator test. by @justinxzhao in #3546
- Revert "Add Cosine Annealing LR scheduler as a decay method (#3507)" by @justinxzhao in #3545
- Set default max_sequence_length to None for LLM text input/output features by @arnavgarg1 in #3547
- Add skip_all_evaluation as a mechanic to skip all evaluation. by @justinxzhao in #3543
- Roll-forward with fixes: Fix interaction between scheduler.step() and gradient accumulation steps, refactor schedulers to use `LambdaLR`, and add cosine annealing LR scheduler as a decay method. by @justinxzhao in #3555
- fix: Move model to the correct device for eval by @jeffkinnison in #3554
- Report loss in tqdm to avoid log spam by @tgaddair in #3559
- Wrap each metric update in try/except. by @justinxzhao in #3562
- Move DDP model to device if it hasn't been wrapped yet by @tgaddair in #3566
- ensure that there are enough colors to match the score index in visua… by @thelinuxkid in #3560
- Pin Transformers to 4.31.0 by @arnavgarg1 in #3569
- Add test to show global_max_sequence_length can never exceed an LLMs context length by @arnavgarg1 in #3548
- WandB: Add metric logging support on eval end and epoch end by @arnavgarg1 in #3586
- schema: Add `prompt` validation check by @ksbrar in #3564
- Unpin Transformers for CodeLlama support by @arnavgarg1 in #3592
- Add support for Paged Optimizers (Adam, Adamw), 8-bit optimizers, and new optimizers: LARS, LAMB and LION by @arnavgarg1 in #3588
- FIX: Failure in TabTransformer Combiner Unit test by @jimthompson5802 in #3596
- fix: Move target tensor to model output device in `check_module_parameters_updated` by @jeffkinnison in #3567
- Allow user to specify huggingface link or local path to pretrained lora weights by @Infernaught in #3572
- Add codellama to tokenizer list for set_pad_token by @Infernaught in #3598
- Set default eval batch size to 2 for LLM fine-tuning by @arnavgarg1 in #3599
- [CI] Explicitly set eval batch size in determinism tests, introduce a new integration test group, and exclude slow tests. by @justinxzhao in #3590
- [CI] Run sudo apt-get update in GHAs. by @justinxzhao in #3608
- Store steps_per_epoch in Trainer by @hungcs in #3601
- Updated characters, underscore and comma preprocessors to be TorchScriptable. by @martindavis in #3602
- [CI] Deflake: Explicitly set eval batch size for mlflow test. by @justinxzhao in #3612
- Fix registration for char error rate. by @justinxzhao in #3604
- fix: Load 8-bit quantized models for eval after fine-tuning by @jeffkinnison in #3606
- Add Code Alpaca and Consumer Complaints Datasets by @connor-mccorm in #3611
- Add support for gradient checkpointing for LLM fine-tuning by @arnavgarg1 in #3613
- Bump min support transformers to 4.33.0 by @tgaddair in #3616
- [CI] Fix failing tests on master by @arnavgarg1 in #3617
- Eliminate short-circuiting for loading from local by @Infernaught in #3600
- Refactor integration tests into matrix by @tgaddair in #3618
- fix: Check underlying model device type when moving 8-bit quantized models to GPU at eval by @jeffkinnison in #3622
- Fixed range validation for text generation penalty parameters by @tgaddair in #3623
- Update comment for predict to update Ludwig docs by @Infernaught in #3535
- Avoid deprecation warnings on pandas Series.fillna by @carlogrisetti in #3631
- QoL: Default to using fast tokenizer for Llama models by @arnavgarg1 in #3625
- fixed typo in EfficientNet's model variant from v2_ to v2_s by @saad-palapa in #3628
- Add pytorch profiler and additional tensorboard logs for GPU memory usage. by @justinxzhao in #3607
- Pin minimum transformers version to `4.33.2` by @arnavgarg1 in #3637
- Add function to free GPU memory by @Infernaught in #3643
- ❗ Enable LLM fine-tuning tests when no quantization is specified by @arnavgarg1 in #3626
- Add check to ensure selected backend works with quantization for LLMs by @arnavgarg1 in #3646
- [CI] Use a torch-nightly-compatible version of torchaudio by @justinxzhao in #3644
- Set do_sample default to True by @Infernaught in #3641
- FIX: Failure in audio feature related test by @jimthompson5802 in #3651
- Remove unnecessary peft config updating by @Infernaught in #3642
- FIX: docker build error for ludwig-gpu by @jimthompson5802 in #3658
- Exclude getdaft on Windows by @carlogrisetti in #3629
- Add daft back for windows since the wheels are now officially published by @arnavgarg1 in #3663
- fix: The final batch of an epoch is skipped when batch size is 1 by @jeffkinnison in #3653
- Place metric functions for BLEU and ROUGE on correct devices when using multiple GPUs by @arnavgarg1 in #3671
- Remove duplicate metrics by @Infernaught in #3670
- Increment epochs based on last_batch() instead of at the end of the train loop. by @justinxzhao in #3668
- [FEATURE] Support Merging LoRA Weights Into Base Model (Issue-3603) by @alexsherstinsky in #3649
- [FEATURE] Include Mistral-7B model in list of supported base models by @alexsherstinsky in #3674
- [MAINTENANCE] Partially reconcile type hints, fix some warnings, and fix comments in parts of the codebase. by @alexsherstinsky in #3673
- Improve error message for when an LLM base model can't be loaded. by @justinxzhao in #3675
- Fix eos_token and pad_token issue by @Infernaught in https://g...
v0.8.6
What's Changed
- Add consumer complaints generation dataset by @connor-mccorm in #3685
- Set the metadata only during first training run by @Infernaught in #3684
- Add ability to upload Ludwig models to Predibase. by @martindavis in #3687
- Log additional per-GPU information in model metadata files and GPU utilization on tensorboard. by @justinxzhao in #3712
- QoL: Only log generation config being used once at inference time by @arnavgarg1 in #3715
- [MAINTENANCE] Adding typehint annotations in backend and data components and fixing mypy errors. by @alexsherstinsky in #3709
- QoL: Limit top-level trainer logging messages such as saving model or resuming model training to main coordinator process by @arnavgarg1 in #3718
- Add sample_size as a global preprocessing parameter by @Infernaught in #3650
- QOL: Update recommended vscode settings. by @justinxzhao in #3717
- Add new fine-tuning notebooks to README by @arnavgarg1 in #3722
- Dynamically set `max_new_tokens` based on output feature length, GMSL and model window size by @arnavgarg1 in #3713
- Fix issue while logging cuda device utilization to tensorboard by @arnavgarg1 in #3727
Full Changelog: v0.8.5...v0.8.6