axolotl-ai-cloud / axolotl Public

Issues: axolotl-ai-cloud/axolotl

Improve Adapter/LoRA handling

#1095 opened Jan 11, 2024 by winglian

Open 4

Labels 20 Milestones 0

203 Open 545 Closed

Issues list

"CUDA error: invalid argument" with FSDP + QLora finetuning bug

Something isn't working

#2409 opened Mar 12, 2025 by peterwilli

6 of 8 tasks

Process hanged when using cpu offloading bug

Something isn't working

#2405 opened Mar 11, 2025 by mohit-217

5 of 8 tasks

FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. enhancement

New feature or request

#2402 opened Mar 11, 2025 by NanoCode012

5 tasks done

EXTREMELY SLOW (unusable) towards end of tokenization of dataset with long multi turn conversations bug

Something isn't working

#2396 opened Mar 7, 2025 by Nero10578

6 of 8 tasks

LoRA example from quickstart guide not working with Docker container bug

Something isn't working

#2395 opened Mar 7, 2025 by daniel-dona

6 of 8 tasks

ImportError: cannot import name 'shard_checkpoint' from 'transformers.modeling_utils' (transformers 4.49.0) bug

Something isn't working

#2387 opened Mar 6, 2025 by shing100

6 of 8 tasks

AxolotlGRPOTrainer still shuffles combined datasets even with curriculum_sampling flag enabled bug

Something isn't working

#2376 opened Mar 2, 2025 by sidmadala

6 of 8 tasks

Unable to preprocess GRPO dataset bug

Something isn't working

#2368 opened Feb 28, 2025 by junethai-mendel

6 of 8 tasks

Model is not getting saved after fine-tuning with weights and biases config: wandb_log_model bug

Something isn't working

waiting for reporter, waiting on upstream

#2337 opened Feb 17, 2025 by HeenaRajan

6 of 8 tasks

no pad_token or eos_token in wandb eval table "Eval - Predictions vs Ground Truth" bug

Something isn't working

#2330 opened Feb 13, 2025 by BaiMoHan

6 of 8 tasks

Mistral-Small-3 support enhancement

New feature or request

#2308 opened Feb 3, 2025 by win4r

5 tasks done

axolotl CLI autocomplete enhancement

New feature or request

#2297 opened Jan 29, 2025 by winglian

5 tasks done

Refactor training_args_cls logic in trainer_builder.py into a utility function. enhancement

New feature or request

#2288 opened Jan 27, 2025 by SalmanMohammadi

>=4-nodes（4*4gpu） training hangs at zero_first bug

Something isn't working

#2275 opened Jan 22, 2025 by sankexin

6 of 8 tasks

Refactor Dataset Configuration for Modular Typing, Discriminated Unions, and Backward Compatibility enhancement

New feature or request

#2271 opened Jan 21, 2025 by NJordan72

5 tasks done

Unable to run Multi-GPU ORPO training on Gemma model bug

Something isn't working

#2267 opened Jan 17, 2025 by chimezie

6 of 8 tasks

FSDP+LORA on multiple gpu(A100 80gb*4) ValueError: Cannot flatten integer dtype tensors bug

Something isn't working

#2250 opened Jan 10, 2025 by Paxwell-Paxwell

6 of 8 tasks

[Bug] Resuming training on a pretraining loop does not continue data loading from where it left off bug

Something isn't working

#2229 opened Jan 2, 2025 by NanoCode012

6 of 8 tasks

max_grad_norm doesn't appear to be clipping gradients bug

Something isn't working

waiting for reporter, waiting on upstream

#2214 opened Dec 22, 2024 by DevonPeroutky

6 of 8 tasks

"RuntimeError: Invalid device argument : did you call init? "When setting CUDA_VISIBLE_DEVICES bug

Something isn't working

waiting for reporter

#2199 opened Dec 18, 2024 by zhanghanxing2022

6 of 8 tasks

load_from_disk for rl type training enhancement

New feature or request

#2192 opened Dec 15, 2024 by leeparkuky

5 tasks done

APOLLO optimizer enhancement

New feature or request

#2175 opened Dec 11, 2024 by fblgit

5 tasks done

When starting with DPO datasets, failed error with TypeError. bug

Something isn't working

waiting for reporter

#2174 opened Dec 11, 2024 by Yuto-24

6 of 8 tasks

Show sample batch content enhancement

New feature or request

#2145 opened Dec 7, 2024 by fzyzcjy

5 tasks done

Support ORPO/DPO Liger losses (and LigerORPOTrainer) enhancement

New feature or request

wip

#2141 opened Dec 6, 2024 by ccdv-ai

5 tasks done
