Pytorch based Finetuning LLMs (PEFT) #78

Merged
iswaryaalex merged 25 commits into main from iswarya/pytorch-finetuning
Mar 11, 2026

Conversation

iswaryaalex (Collaborator) commented Feb 16, 2026

This PR brings in PyTorch fine-tuning playbooks covering SFT full fine-tuning and PEFT techniques like LoRA and QLoRA, using PyTorch with the ROCm backend.

  • Added 3 scripts in assets: train_full_finetuning, train_lora.py, and train_qlora.py (see the LoRA sketch after this list)
  • Added a README that walks through the different techniques: overview, comparison, details, how to fine-tune, how to customize, and finally how to use the fine-tuned models/adapters
  • Note: for QLoRA/LoRA with bitsandbytes, the bnb package is not supported on Windows; it only works on Linux
  • Tested on STX Halo (Windows and Linux) with ROCm 7.11.0 and PyTorch
  • Added CI/CD tests for the playbook README. Note: the entire fine-tuning is not run during testing; the tests cover module imports, initializations, etc.
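
For context, here is a minimal sketch of the kind of LoRA setup the scripts implement (the model name, target modules, and hyperparameters below are illustrative assumptions, not the playbook's actual values):

```python
# Minimal LoRA sketch (illustrative; not the exact contents of train_lora.py).
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-3-4b-it"  # assumed; gated, license must be accepted on HF
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Freeze the base weights and inject small trainable low-rank adapters.
lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of params are trainable
```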

@iswaryaalex iswaryaalex marked this pull request as draft February 16, 2026 22:10
@iswaryaalex iswaryaalex marked this pull request as ready for review February 17, 2026 05:50
@iswaryaalex iswaryaalex self-assigned this Feb 17, 2026
adamlam2-amd (Collaborator) commented Mar 5, 2026

Nice, overall this looks decent. A couple of thoughts:

  1. Why did we choose Gemma3-4B? It's also a gated model, so users will need to go to Hugging Face to accept Google's license; this is not yet made clear.
  2. We should make it clearer that bitsandbytes isn't supported on Windows.
  3. The LoRA and QLoRA explanations are a bit weak; I don't think typical users will be able to follow them. If we want to explain them, we should be more thorough.
  4. To compensate for the extra length above, we can save space in the hyperparameter tuning section. Instead of showing the effect of each value set higher or lower, just tell readers to adjust the values themselves. Essentially, be more concise here.
  5. Is the command rocm-smi? I think amd-smi might be the right one.
  6. Wandb can be optional, since it's a fairly significant addition (see the sketch after this list).
  7. Lastly, it would be good to add screenshots wherever you think is best, maybe of the result of the fine-tuning?
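
For point 6, one pattern that could work: only enable the wandb reporter when the package is actually installed. A sketch, assuming the scripts use transformers' TrainingArguments:

```python
# Sketch: make wandb opt-in based on availability (assumes TrainingArguments).
import importlib.util

from transformers import TrainingArguments

report_to = "wandb" if importlib.util.find_spec("wandb") else "none"
args = TrainingArguments(output_dir="outputs", report_to=report_to)
```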

danielholanda (Collaborator) left a comment

Reviewed and tested on Windows! Worked great.

Please wait for Adam's review before merging.

Comment threads: playbooks/supplemental/pytorch-finetuning/README.md (5 threads), playbooks/supplemental/pytorch-finetuning/playbook.json (1 thread, outdated)

adamlam2-amd (Collaborator) left a comment

Please address Daniel's comments as well as the comment I left above.

Otherwise, most of it looks good. Will retest when QA sees it.

iswaryaalex (Collaborator, Author) commented

@danielholanda Adding a hidden test that does a single run of LoRA (model loading, dataset loading, and one iteration) with a timeout; a sketch of the idea is below.
[screenshot: output of the local test run]
Ran the test locally; looks good.
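
Roughly, the test looks like this (a sketch; build_lora_trainer is a hypothetical helper standing in for the real setup, and the timeout marker needs the pytest-timeout plugin):

```python
# Smoke-test sketch: load model + dataset, run exactly one training iteration.
import pytest

@pytest.mark.timeout(600)  # fail instead of hanging CI (needs pytest-timeout)
def test_lora_single_iteration():
    trainer = build_lora_trainer(max_steps=1)  # hypothetical helper
    result = trainer.train()
    assert result.training_loss is not None
```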

iswaryaalex (Collaborator, Author) commented

@adamlam2-amd Thanks for the review. My key changes are:

  • Added a test for a single run of LoRA
  • Added specific instructions for loading gated models like Gemma (we based everything on this model for stability; others were iffy). Hugging Face install and authentication steps are also added (see the sketch after this list)
  • Added a "Dataset example" section covering what the format means, how to use it, and what we accomplish with fine-tuning
  • Added more details on LoRA/QLoRA
  • Made the hyperparameter tuning section more concise
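
The authentication steps boil down to something like this (a sketch; the token is a placeholder, and Google's license still has to be accepted on the Gemma model page first):

```python
# Sketch: authenticate with Hugging Face to download a gated model like Gemma.
from huggingface_hub import login

login(token="hf_...")  # placeholder token; or run `huggingface-cli login` instead
```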

danielholanda (Collaborator) commented

@adamlam2-amd Anything else you would like to see here before approving?

adamlam2-amd (Collaborator) left a comment

lgtm

iswaryaalex (Collaborator, Author) commented

Alright! Merging

@iswaryaalex iswaryaalex merged commit e53108c into main Mar 11, 2026
6 checks passed