### Fine-tune unsloth/phi-4-unsloth-bnb-4bit

In [1]:
%%capture
import os
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
    !pip install --no-deps unsloth

In [None]:
import pandas as pd
from datasets import Dataset


# 2. Load Your Dataset

file_path = "/content/merged_constitution_data.csv"  # upload via Colab sidebar
df = pd.read_csv(file_path)

print(df.head())  # must have "prompt", "response" columns

# Convert to Hugging Face Dataset
dataset = Dataset.from_pandas(df[["prompt", "response"]])

                                              prompt  \
0  What is the structure of the U.S. Congress as ...   
1  How often are members of the House of Represen...   
2  What are the age and citizenship requirements ...   
3  How was the population to be counted for repre...   
4  Who has the power to issue writs of election t...   

                                            response  
0  Article I, Section 1 states: 'All legislative ...  
1  Article I, Section 2 states that members of th...  
2  A Representative must be at least twenty-five ...  
3  Population was determined by adding the whole ...  
4  Article I, Section 2 states that 'the Executiv...  


In [None]:
# 3. Format for Instruction Fine-Tuning

def formatting_func(example):
    text = f"Question: {example['prompt']}\nAnswer: {example['response']}"
    return {"text": text}

# Apply formatting to dataset
dataset = dataset.map(formatting_func, remove_columns=["prompt", "response"])
print(dataset[0])

{'text': "Question: What is the structure of the U.S. Congress as defined by the Constitution?\nAnswer: Article I, Section 1 states: 'All legislative Powers herein granted shall be vested in a Congress of the United States, which shall consist of a Senate and House of Representatives.'"}
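The template above does not add an end-of-sequence marker, so the model gets no explicit signal for where an answer stops. A minimal sketch of one way to add it, assuming you pass in the tokenizer's EOS token (`tokenizer.eos_token`, which only exists after the model is loaded in step 5). The helper name `format_with_eos`, the `"<eos>"` stand-in, and the toy row are hypothetical and only for illustration.

```python
def format_with_eos(example, eos_token):
    """Build the same Question/Answer text as formatting_func, then append EOS."""
    text = f"Question: {example['prompt']}\nAnswer: {example['response']}"
    return {"text": text + eos_token}


# Toy row for illustration only; "<eos>" stands in for tokenizer.eos_token.
sample = {"prompt": "Example question?", "response": "Example answer."}
formatted = format_with_eos(sample, "<eos>")
print(formatted["text"])
```

In the notebook this would mean loading the tokenizer before mapping, for example `dataset.map(lambda ex: format_with_eos(ex, tokenizer.eos_token), ...)`.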


In [None]:
# 4. Split into Train / Eval

# 90% train, 10% eval (you can adjust)
dataset = dataset.train_test_split(test_size=0.1, seed=42)

train_dataset = dataset["train"]
eval_dataset  = dataset["test"]

print(f"Train size: {len(train_dataset)} | Eval size: {len(eval_dataset)}")
print(train_dataset[0])

Train size: 271 | Eval size: 31
{'text': 'Question: What requirement is set forth in AMENDMENT XVII?\nAnswer: Passed by Congress May 13, 1912. Ratified April 8, 1913. Article I, section 3, of the Constitution was modified by the 17th amendment. The Senate of the United States shall be composed of two Senators from each State, elected by the people thereof, for six years; and each Senator shall have one vote. The electors in each State shall have the qualifications requisite for electors of the most numerous branch of the State legislatures. When vacancies happen in the representation of any State in the Senate, the executive authority of such State shall issue writs of election to fill such vacancies: Provided, That the legislature of any State may empower the executive thereof to make temporary appointments until the people fill the vacancies by election as the legislature may direct. This amendment shall not be so construed as to affect the election or term of any Senator chosen befo
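The printed sizes (271 train, 31 eval) can be reproduced by hand from the 302 rows (271 + 31). The sketch below assumes the split rounds the test share up and the train share down, which is consistent with the sizes printed above; `split_sizes` is a hypothetical helper, not part of the `datasets` API.

```python
import math


def split_sizes(n_rows, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test share is rounded up and the train share rounded down.
    """
    n_test = math.ceil(test_size * n_rows)
    n_train = math.floor((1 - test_size) * n_rows)
    return n_train, n_test


train_n, test_n = split_sizes(302, 0.1)
print(f"Train size: {train_n} | Eval size: {test_n}")
```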

In [2]:
# 5. Load Model
from unsloth import FastLanguageModel

model_name = "unsloth/phi-4-unsloth-bnb-4bit"
max_seq_length = 1024

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    load_in_4bit = True,

)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.


    PyTorch 2.6.0+cu124 with CUDA 1204 (you have 2.8.0+cu126)
    Python  3.12.9 (you have 3.12.11)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details


🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.8.10: Fast Llama patching. Transformers: 4.55.4.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0
\        /    Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


In [None]:
# 6. Apply LoRA (PEFT)

model = FastLanguageModel.get_peft_model(
    model,
    r = 8,   # small LoRA rank; Phi-4 has about 14.7B parameters
    target_modules = ["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"],
    lora_alpha = 16,
    lora_dropout = 0.05,  # nonzero dropout turns off Unsloth's fast LoRA patching (see the warning below); 0 keeps it
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 42,
)

Unsloth: Dropout = 0 is supported for fast patching. You are using dropout = 0.05.
Unsloth will patch all other layers, except LoRA matrices, causing a performance hit.
Unsloth 2025.8.10 patched 40 layers with 0 QKV layers, 0 O layers and 0 MLP layers.
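The trainer later reports `Trainable parameters = 32,768,000`. A rough check of that figure: each adapted linear layer gains two LoRA matrices, costing `r * (in_features + out_features)` parameters. The layer dimensions below are assumptions taken from the published Phi-4 configuration (hidden size 5120, 10 key/value heads of dimension 128, MLP width 17920); only the rank, the target modules, and the 40 layers come from this notebook.

```python
def lora_params(in_features, out_features, rank):
    """Parameters added by one LoRA adapter: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# Assumed Phi-4 dimensions (not stated in this notebook).
HIDDEN = 5120
KV_DIM = 10 * 128   # key/value heads x head dimension
MLP = 17920
LAYERS = 40         # matches "patched 40 layers" above
RANK = 8            # r = 8 from get_peft_model

# (in_features, out_features) for each target module in one decoder layer.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

per_layer = sum(lora_params(i, o, RANK) for i, o in shapes.values())
total = per_layer * LAYERS
print(f"{total:,}")
```

Under these assumptions the total scales linearly with `r`, so doubling the rank to 16 would double the trainable parameter count.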


In [None]:

# 7. Training Setup

from transformers import TrainingArguments, DataCollatorForLanguageModeling
# Training arguments
training_args = TrainingArguments(
    output_dir="./phi-4_finetuned",
    per_device_train_batch_size=1,   # small for Colab
    gradient_accumulation_steps=8,   # simulate larger batch
    num_train_epochs=3,
    learning_rate=2e-4,
    fp16=True,
    logging_dir="./logs",
    logging_steps=10,
    save_strategy="epoch",
    eval_steps=None,               # eval_strategy stays at its default ("no"), so no evaluation runs during training
    save_total_limit=2,            # keep last 2 checkpoints
    report_to="none",              # no wandb in this run

)


data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

In [None]:
# 8. Trainer

from trl import SFTTrainer

# Trainer
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)

Unsloth: Tokenizing ["text"] (num_proc=6):   0%|          | 0/271 [00:00<?, ? examples/s]

Unsloth: Tokenizing ["text"] (num_proc=6):   0%|          | 0/31 [00:00<?, ? examples/s]

In [None]:
# 9. Train

trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 271 | Num Epochs = 3 | Total steps = 102
O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 8
\        /    Data Parallel GPUs = 1 | Total batch size (1 x 8 x 1) = 8
 "-____-"     Trainable parameters = 32,768,000 of 14,692,275,200 (0.22% trained)


Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
10,0.6439
20,0.3763
30,0.5376
40,0.3412
50,0.2325
60,0.3154
70,0.2328
80,0.1478
90,0.1856
100,0.1872


TrainOutput(global_step=102, training_loss=0.32256876019870534, metrics={'train_runtime': 1484.2442, 'train_samples_per_second': 0.548, 'train_steps_per_second': 0.069, 'total_flos': 7189846924124160.0, 'train_loss': 0.32256876019870534})
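The `Total steps = 102` and `Total batch size ... = 8` lines in the banner follow from the training arguments. A small sketch of that arithmetic, assuming the final partial accumulation group of each epoch still counts as one optimizer step (which is consistent with the reported 102); `total_optimizer_steps` is a hypothetical helper.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch, grad_accum, epochs, num_gpus=1):
    """Return (effective_batch_size, total_steps) for a simple training run."""
    effective_batch = per_device_batch * grad_accum * num_gpus
    # Assumption: a trailing partial batch group is rounded up to a full step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


effective_batch, steps = total_optimizer_steps(
    num_examples=271, per_device_batch=1, grad_accum=8, epochs=3
)
print(effective_batch, steps)
```

With `logging_steps=10`, 102 steps yield the ten logged loss values shown above (steps 10 through 100).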

In [None]:
# 10. Save Model and Push to Hub
from huggingface_hub import HfApi, ModelCard

# Replace with your actual Hugging Face token (it needs write access)
api = HfApi(token="enter your hf token here")

# 1. Create a repository (if it doesn't exist)
api.create_repo(
    repo_id="abdulsamad99/CONSTITUTION_USA-fine-tuning",
    repo_type="model",
    exist_ok=True
)

# 2. Save your model and tokenizer locally first
model.save_pretrained("constitution_USA_model")
tokenizer.save_pretrained("constitution_USA_model")

# 3. Upload to Hub
api.upload_folder(
    folder_path="constitution_USA_model",
    repo_id="abdulsamad99/CONSTITUTION_USA-fine-tuning",
    repo_type="model"
)

# 4. Create a model card (optional but recommended)
card = ModelCard.load("constitution_USA_model/README.md")  # Create this file first
card.push_to_hub("abdulsamad99/CONSTITUTION_USA-fine-tuning")

Processing Files (0 / 0)                : |          |  0.00B /  0.00B            

New Data Upload                         : |          |  0.00B /  0.00B            

  ...USA_model/adapter_model.safetensors:   0%|          | 64.7kB /  131MB            

RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-68b809f1-74b972bc41ade15e24b584cd;eec6cc54-5202-4f5b-b3fa-5d256007eff3)

Repository Not Found for url: https://huggingface.co/api/models/abdulsamad99/CONSTITUTION_USA-fine-tuning/preupload/main.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
Invalid username or password.
Note: Creating a commit assumes that the repo already exists on the Huggingface Hub. Please use `create_repo` if it's not the case.
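The `Invalid username or password` line suggests the Hub rejected the token, which is what happens if the placeholder string is left in place. A minimal sketch of one way to avoid hard-coding it, assuming the token is stored in an environment variable; the name `HF_TOKEN` and the helper `read_hf_token` are assumptions for illustration.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Read a Hub token from the environment and fail early if it looks unusable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; create a token with write access first.")
    if any(ch.isspace() for ch in token):
        # A real token has no spaces; this catches leftover placeholder text.
        raise RuntimeError(f"{env_var} looks like placeholder text, not a token.")
    return token
```

In the notebook the result would replace the literal string, as in `HfApi(token=read_hf_token())`. The token also needs write permission for the target namespace, otherwise creating the repository or uploading to it can still fail.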