Release v0.9.0 · axolotl-ai-cloud/axolotl

What's Changed

[llama4] fix the mm yaml, add scout single gpu yaml by @winglian in #2510
upgrade transformers to 4.51.1 by @winglian in #2508
fix: liger swiglu for llama4 by @NanoCode012 in #2504
Add Llama4 maverick examples and wandb links by @winglian in #2512
fix: allow merge lora on pre-quantized model by @NanoCode012 in #2511
feat: add CNAME by @NanoCode012 in #2513
Update rlhf.qmd by @bursteratom in #2519
add mocks for loading datasets in cli train tests by @winglian in #2497
Feat(examples): add deepcogito by @NanoCode012 in #2516
feat(doc): explain deepspeed configs by @NanoCode012 in #2514
chore: update doc links by @NanoCode012 in #2509
[ci] make e2e tests a bit faster by reducing test split size by @winglian in #2522
remove strict=false from example yamls by @winglian in #2523
feat: add examples for deepcoder by @NanoCode012 in #2517
make sure all of the model is on the same device by @winglian in #2524
feat: update cce to latest by @NanoCode012 in #2521
Fix: add delinearization and make qlora work with fsdp2 by @NanoCode012 in #2515
batch api HF adapter for ring-flash-attn; cleanup and improvements by @djsaunde in #2520
re-enable DS zero3 ci with updated transformers by @winglian in #2533
adding codecov reporting by @djsaunde in #2372
fix: preprocess yielding whole dataset to each worker by @NanoCode012 in #2503
fix(doc): cut cross entropy installation instructions broken in qmd by @NanoCode012 in #2532
zero val fix for beta by @winglian in #2538
fix: upgrade liger to 0.5.8 and use native Gemma3 patches by @chiwanpark in #2527
don't run multigpu tests twice, run SP in separate test by @winglian in #2542
Fixed Rex Scheduler Warm Up by @Catgat in #2535
prevent rate limiting to hf when using dispatch batches by @winglian in #2536
make sure to download fixtures for kd test by @winglian in #2541
fix missing host/port for vllm by @winglian in #2543
feat: add glm and glm4 multipack and cce by @NanoCode012 in #2546
Codecov fixes / improvements by @djsaunde in #2549
add base docker image with pytorch 2.7.0 and variant for cuda 12.8 by @winglian in #2551
builds for torch==2.7.0 by @winglian in #2552
Fix(doc): add delinearize instruction by @NanoCode012 in #2545
disable codecov pr annotations by @djsaunde in #2556
make sure to validate the config before normalizing so defaults get set by @winglian in #2554
Sequence parallel training context manager by @djsaunde in #2553
don't fail on codecov upload for external contributor PRs by @winglian in #2564
fix: gradient checkpointing functools.partial object has no attribute self by @ekojsalim in #2563
make cce default to true when using the plugin by @winglian in #2562
chore(doc): minor update docker tags on doc by @NanoCode012 in #2559
fix: crash when pretraining_dataset with dispatch_batches is false by @chiwanpark in #2558
fix support for wandb run_name for rl trainers by @winglian in #2566
add e2e smoke test for using activation/gradient checkpointing with offload by @winglian in #2565
don't use is_main_process during config validation by @winglian in #2569
update trl to 0.17.0 by @winglian in #2560
Fix bug in grpo reward module import by @dhruvmullick in #2571
fix(doc): clarify vllm usage with grpo by @NanoCode012 in #2573
replace references to random 68m model w 135m smollm2 by @winglian in #2570
Add runpod sls handler by @KAJdev in #2530
Add post_model_load, post_lora_load, post_train, post_train_unload function calls by @divyanshuaggarwal in #2539
grab sys prompt too from dataset by @winglian in #2397
feat: add eos_tokens and train_on_eot for chat_template EOT parsing by @NanoCode012 in #2364
add preview-docs workflow by @djsaunde in #2432
support val_set_size for splitting test split from train with DPO by @winglian in #2572
Feat: Add qwen3 and CCE for qwen family by @NanoCode012 in #2518
chat template and example for qwen3 by @winglian in #2577
automatically split out reasoning trace from dataset by @winglian in #2579
v0.9.0 release by @winglian in #2578

New Contributors

@Catgat made their first contribution in #2535
@ekojsalim made their first contribution in #2563
@dhruvmullick made their first contribution in #2571
@KAJdev made their first contribution in #2530
@divyanshuaggarwal made their first contribution in #2539

Full Changelog: v0.8.0...v0.9.0
