A collection of scripts for fine-tuning on consumer-grade hardware or the Google Colab free tier.

This repository provides minimal, resource-friendly code to fine-tune large language models (LLMs) with LoRA on custom instruction-style datasets (such as Alpaca), on either:
- ✅ CPU-only or Apple M1/M2 (via MPS)
- ✅ Free Colab GPUs (no paid subscription required)
Our approach allows modular plug-and-play fine-tuning across different models and datasets.
| Parameter | Why We Use It |
|---|---|
| LoRA via `peft` | Trains only a small set of parameters → saves memory |
| `batch_size=1` | Prevents OOM errors on CPU / MPS / small GPUs |
| `gradient_accumulation_steps=4` | Simulates a larger batch size with less memory |
| `learning_rate=2e-4` | Empirically effective for small models + LoRA |
| `num_train_epochs=1-3` | Quick convergence for small datasets; can be increased for better results |
| `fp16=False` on CPU/MPS | Mixed precision is not stable without CUDA |
| `save_total_limit=2` | Prevents disk overflow from multiple checkpoints |
| Alpaca-style datasets | Easy to adapt to any task format (instruction + input → response) |
```
pip install -r requirements.txt
```

This section outlines the standard pattern followed by all training scripts in this repo. It ensures modularity, clarity, and compatibility with consumer-grade hardware setups.
1. **Load the base model and tokenizer**
   Use `AutoModelForCausalLM` and `AutoTokenizer` from Hugging Face. Set `torch_dtype=torch.float32` and `low_cpu_mem_usage=True` to reduce memory usage.

2. **Configure LoRA with `LoraConfig`**
   Leverage `peft` to define low-rank adaptation: `r=8` (rank), `lora_alpha=16` (scaling factor), `lora_dropout=0.1`, `bias="none"`, `task_type=TaskType.CAUSAL_LM`.

3. **Format your dataset**
   Each record must have: `instruction` (what the model should do), `input` (optional context), and `output` (the expected response). Format the prompt in Alpaca style:

   ```
   Instruction: {instruction}
   Input: {input}

   Response: {output}
   ```

4. **Fine-tune using `SFTTrainer`**
   This lightweight wrapper from `trl` simplifies the process. Pass in: `model` (with LoRA applied), `train_dataset` (after tokenization), `TrainingArguments`, and a `data_collator` (with MLM disabled).

5. **Save or merge the LoRA adapter**
   Use `save_steps` and `save_total_limit` to control checkpointing. Optionally merge the LoRA adapter into the base model after training using PEFT utilities.
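The steps above can be sketched end-to-end. The model name, dataset slice, and output paths below follow the phi-2 experiment described later, but exact `SFTTrainer` keyword arguments vary between `trl` versions, so treat this as an outline rather than a drop-in script:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, TaskType, get_peft_model
from trl import SFTTrainer
from datasets import load_dataset

# 1. Load the base model and tokenizer (float32 for CPU/MPS stability)
model_name = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float32, low_cpu_mem_usage=True
)

# 2. Configure and apply LoRA
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.1,
    bias="none", task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, lora_config)

# 3. Load an Alpaca-style dataset (1% sample, as in the experiment below)
dataset = load_dataset("tatsu-lab/alpaca", split="train[:1%]")

# 4. Fine-tune with SFTTrainer
args = TrainingArguments(
    output_dir="./phi2-alpaca-lora",   # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=1,
    fp16=False,                        # mixed precision is unstable off CUDA
    save_total_limit=2,
)
trainer = SFTTrainer(model=model, train_dataset=dataset, args=args)
trainer.train()

# 5. Save the adapter (or merge it into the base model with PEFT utilities)
model.save_pretrained("./phi2-alpaca-lora/adapter")
```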
📁 Folder: ./experiments/phi2-alpaca-lora
| Component | Setting |
|---|---|
| Base Model | microsoft/phi-2 |
| Dataset | tatsu-lab/alpaca (1% sample) |
| LoRA Rank (r) | 8 |
| LoRA Alpha | 16 |
| Dropout | 0.1 |
| Max Length | 512 tokens |
| Device | CPU / MPS (Apple Silicon) |
| Batch Size | 1 |
| Accumulation | 4 |
| Epochs | 1 |
| Learning Rate | 2e-4 |
- `r=8`, `alpha=16` → good balance between performance and efficiency for LoRA on small devices
- `gradient_accumulation_steps=4` → effective batch size = 4
- `max_length=512` → shorter sequences enable faster training
- `output_dir=./mistral-alpaca-lora` → easy to organize multiple experiments
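The effective batch size follows directly from two of these settings, since gradients from each micro-batch are accumulated before a single optimizer step:

```python
# Gradients from 4 micro-batches of size 1 are summed before each optimizer
# step, so memory stays at batch-size-1 levels while updates behave like
# batch size 4.
per_device_batch_size = 1
gradient_accumulation_steps = 4

effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # → 4
```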
We use Alpaca-style formatting for supervised fine-tuning:
```python
def format_text(example):
    # Build an Alpaca-style prompt; "input" is optional context
    if example["input"]:
        full_prompt = f"Instruction: {example['instruction']}\nInput: {example['input']}\n\nResponse:"
    else:
        full_prompt = f"Instruction: {example['instruction']}\n\nResponse:"
    # Tokenize prompt + expected output together (max_length matches the
    # 512-token setting above; exact kwargs may vary by setup)
    tokenized = tokenizer(
        full_prompt + " " + example["output"],
        truncation=True,
        max_length=512,
    )
    return tokenized
```
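For a quick sanity check of the template without loading a tokenizer, the prompt-building branch can be exercised on its own (`build_prompt` is a hypothetical helper mirroring the formatting above):

```python
def build_prompt(instruction, input_text=""):
    # Same Alpaca-style template as format_text, minus tokenization
    if input_text:
        return f"Instruction: {instruction}\nInput: {input_text}\n\nResponse:"
    return f"Instruction: {instruction}\n\nResponse:"

print(build_prompt("Summarize the text.", "LoRA trains small adapter matrices."))
# Instruction: Summarize the text.
# Input: LoRA trains small adapter matrices.
#
# Response:
```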