<a href="https://colab.research.google.com/github/DarshK01/Portfolio/blob/main/AI_Research_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
# ==============================================================================
# 1. INSTALLATION
# ==============================================================================
# Step 0: Set the CUDA memory allocator configuration before torch is imported
# (expandable segments help reduce memory fragmentation)
# ==============================================================================
import os
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'

print("✅ PYTORCH_CUDA_ALLOC_CONF set to 'expandable_segments:True'")

# ==============================================================================
# Step 1: Install all necessary and compatible libraries
# ==============================================================================
print("🚀 Installing libraries...")
# Quote the version specifier so the shell does not treat '>' as an output redirect
!pip install -q transformers==4.41.2 peft==0.11.1 accelerate==0.30.1 "bitsandbytes>=0.43.2" trl==0.8.6 triton

print("✅ All libraries installed successfully!")




✅ PYTORCH_CUDA_ALLOC_CONF set to 'expandable_segments:True'
🚀 Installing libraries...
✅ All libraries installed successfully!
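
Because `pip install -q` hides most of its output and the success message above prints unconditionally, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, not part of any of the installed packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names taken from the pip command above
pins = ["transformers", "peft", "accelerate", "bitsandbytes", "trl"]
for name, ver in installed_versions(pins).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any entry reported as `NOT INSTALLED`, or a version that differs from the pin, suggests the install step should be rerun before continuing.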


In [4]:
# ==============================================================================
# 2. LOGIN TO HUGGING FACE
# ==============================================================================

from huggingface_hub import notebook_login

print("Please enter your Hugging Face access token with 'write' permissions.")
notebook_login()

Please enter your Hugging Face access token with 'write' permissions.



In [9]:
# Step 3: Loading Model and Tokenizer

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

print("🚀 Loading Model and Tokenizer...")
model_name = "microsoft/Phi-3-mini-4k-instruct"

# QLoRA configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)
model.config.use_cache = False

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right" # Fixes a warning message during training

print(f"✅ Model '{model_name}' loaded successfully!")

🚀 Loading Model and Tokenizer...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


✅ Model 'microsoft/Phi-3-mini-4k-instruct' loaded successfully!
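
To see why 4-bit loading matters on a small GPU, the weight storage can be estimated with simple arithmetic. This is a rough sketch only: the 3.8 billion parameter count is an assumption about Phi-3-mini, and the estimate ignores quantization constants, layers kept in higher precision, activations, and optimizer state.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate weight storage in GB (10^9 bytes) at a given precision."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 3.8e9  # assumed parameter count for Phi-3-mini

print(f"fp16 weights: ~{weight_memory_gb(N_PARAMS, 16):.1f} GB")
print(f"nf4 weights:  ~{weight_memory_gb(N_PARAMS, 4):.1f} GB")
```

Under these assumptions, 4-bit storage needs about a quarter of the memory that fp16 weights would.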


In [18]:
# Step 4: Preparing the dataset

from datasets import load_dataset

print("\n🚀 Step 4: Preparing the dataset...")
dataset_name = "iamtarun/python_code_instructions_18k_alpaca"
dataset = load_dataset(dataset_name, split="train")

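# Rename the 'output' column to 'completion'. Note: the trainer in Step 6 reads
# only the 'prompt' column (dataset_text_field="prompt"), so this rename is not
# expected to affect training.
dataset = dataset.rename_column("output", "completion")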

print("✅ Dataset prepared successfully!")


🚀 Step 4: Preparing the dataset...
✅ Dataset prepared successfully!
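
The dataset name suggests Alpaca-style formatting, where the instruction, optional input, and expected output are joined into one training string. The sketch below illustrates that idea; the template wording and section headers are assumptions and may not match the dataset's own `prompt` column exactly, and `build_alpaca_prompt` is a hypothetical helper.

```python
def build_alpaca_prompt(instruction, input_text="", output=""):
    """Assemble an Alpaca-style training string (template wording is assumed)."""
    parts = [
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.",
        f"### Instruction:\n{instruction}",
    ]
    # The input section is included only when there is extra context to supply
    if input_text:
        parts.append(f"### Input:\n{input_text}")
    parts.append(f"### Output:\n{output}")
    return "\n\n".join(parts)


example = build_alpaca_prompt(
    "Write a function that adds two numbers.",
    output="def add(a, b):\n    return a + b",
)
print(example)
```

Printing one row of the real dataset (for example `dataset[0]["prompt"]`) is the reliable way to check the actual template.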


In [19]:
print("\n🚀 Step 5: Configuring training...")

from peft import LoraConfig
from transformers import TrainingArguments

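# LoRA config: rank-8 adapters on the attention and MLP projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    # NOTE: PEFT attaches adapters only to module names that exist in the model.
    # Phi-3 is understood to fuse these projections into `qkv_proj` and
    # `gate_up_proj`, in which case only `o_proj` and `down_proj` match here.
    # Verify with model.named_modules() before relying on this list.
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],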
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Training arguments: batch size 1 with 16 accumulation steps (effective batch of 16)
training_args = TrainingArguments(
    output_dir="./phi3-python-assistant",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    optim="adamw_8bit",  # Use the 8-bit AdamW optimizer to save memory
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=10,
    fp16=True,
    save_strategy="epoch",
    push_to_hub=True,
    hub_model_id="DarshaniKare/Phi3-mini-Python-Assistant",
)

print("✅ Training configuration is set with 8-bit optimizer for maximum memory saving.")


🚀 Step 5: Configuring training...
✅ Training configuration is set with 8-bit optimizer for maximum memory saving.
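
The arithmetic behind these settings can be sketched as follows. The dataset size (18,000) and layer width (3072) are placeholder assumptions for illustration, and the Trainer's own step count may differ slightly depending on how it rounds the last partial batch.

```python
import math


def effective_batch_size(per_device, accumulation_steps, n_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * n_devices


def optimizer_steps_per_epoch(n_examples, batch_size):
    """Approximate optimizer updates per epoch (rounding up the last batch)."""
    return math.ceil(n_examples / batch_size)


def lora_params(d_in, d_out, r):
    """Trainable parameters LoRA adds to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


batch = effective_batch_size(per_device=1, accumulation_steps=16)
print("effective batch size:", batch)
print("steps per epoch:", optimizer_steps_per_epoch(18_000, batch))  # placeholder size
print("LoRA params per 3072x3072 layer:", lora_params(3072, 3072, r=8))  # assumed width
```

With `logging_steps=10`, each logged loss value therefore covers ten optimizer updates, or 160 training examples.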


In [25]:
# Step 6: Initializing Trainer

from trl import SFTTrainer

print("🚀 Step 6: Initializing Trainer...")

# Create the trainer; max_seq_length=512 caps sequence length to limit memory use
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=lora_config,
    dataset_text_field="prompt",
    max_seq_length=512,
)

print("🔥 Starting the training process...")
trainer.train()
print("✅ Training complete!")

🚀 Step 6: Initializing Trainer...


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
  self.scaler = torch.cuda.amp.GradScaler(**kwargs)


🔥 Starting the training process...


[34m[1mwandb[0m: Currently logged in as: [33mdarshanikare[0m ([33mdarshanikare-usha-mittal-institute-of-technology[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin




Step,Training Loss
10,0.7512
20,0.5507
30,0.5225
40,0.4673
50,0.4543
60,0.4704
70,0.4908
80,0.474
90,0.4507
100,0.4783






✅ Training complete!
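
The logged losses drop quickly over the first 40 steps and then fluctuate between roughly 0.45 and 0.49. A small moving average makes that plateau easier to see. This is a plain-Python sketch over the values logged above; `moving_average` is a hypothetical helper and the window of 3 is an arbitrary choice.

```python
# Training loss logged every 10 steps (values copied from the output above)
losses = [0.7512, 0.5507, 0.5225, 0.4673, 0.4543,
          0.4704, 0.4908, 0.4740, 0.4507, 0.4783]


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


smoothed = moving_average(losses, window=3)
print([round(v, 4) for v in smoothed])

best = min(losses)
best_step = (losses.index(best) + 1) * 10
print(f"lowest logged loss: {best} at step {best_step}")
```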


In [26]:
# ==============================================================================
# 7. INFERENCE (TESTING THE MODEL)
# ==============================================================================
print("\n🚀 Step 7: Testing the fine-tuned model...")
from transformers import pipeline

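prompt = "Write a Python function that finds the longest word in a sentence."
# max_length counts the prompt tokens plus the generated tokens
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
# NOTE: [INST] ... [/INST] is the Llama/Mistral instruction format, not the format
# of the dataset's 'prompt' column used for fine-tuning; matching the training
# format may give more reliable completions.
result = pipe(f"[INST] {prompt} [/INST]")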

print("\n--- MODEL RESPONSE ---")
print(result[0]['generated_text'])
print("--------------------")

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.



🚀 Step 7: Testing the fine-tuned model...

--- MODEL RESPONSE ---
[INST] Write a Python function that finds the longest word in a sentence. [/INST] def longest_word(sentence):
    words = sentence.split()
    longest_word = ""
    for word in words:
        if len(word) > len(longest_word):
            longest_word = word
    return longest_word

sentence = "This is a sample sentence"
longest_word = longest_word(sentence)
print(longest_word) # Output: "sentence"

--------------------
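
The response above echoes the prompt before the generated code, because the text-generation pipeline returns the prompt and the completion together by default. One way to keep only the completion is sketched below; `completion_only` is a hypothetical helper, and passing `return_full_text=False` to the pipeline call should have a similar effect.

```python
def completion_only(generated_text, prompt_text):
    """Return the text after the prompt, or the full text if the prompt is not a prefix."""
    if generated_text.startswith(prompt_text):
        return generated_text[len(prompt_text):].lstrip()
    return generated_text


full_text = "[INST] Say hi. [/INST] print('hi')"
print(completion_only(full_text, "[INST] Say hi. [/INST]"))
```

The fallback branch returns the text unchanged, so the helper is safe to use even when the tokenizer round trip alters the prompt slightly.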
