<a href="https://colab.research.google.com/github/Sharmin12234/AI-Chatbot/blob/main/chatbot_code.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install evaluate rouge_score sacrebleu


Collecting xformers<0.0.27
  Downloading xformers-0.0.26.post1.tar.gz (4.1 MB)
  Preparing metadata (setup.py) ... done
Collecting trl<0.9.0
  Downloading trl-0.8.6-py3-none-any.whl.metadata (11 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.49.0-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)
Downloading trl-0.8.6-py3-none-any.whl (245 kB)

In [2]:
import torch, sqlite3, numpy as np
from datetime import datetime
from datasets import Dataset
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from evaluate import load

# --- Database and dataset classes ---
class Logger:
    def __init__(self, db_name="experiments.db"):
        self.conn = sqlite3.connect(db_name)
        cursor = self.conn.cursor()
        cursor.execute('''CREATE TABLE IF NOT EXISTS LLAMAExperiments (id INTEGER PRIMARY KEY, model_name TEXT, lora_config TEXT, train_loss REAL, val_loss REAL, metrics TEXT, timestamp TEXT)''')
        cursor.execute('''CREATE TABLE IF NOT EXISTS GeneratedResponses (id INTEGER PRIMARY KEY, experiment_id INTEGER, input_text TEXT, response_text TEXT, timestamp TEXT)''')
        self.conn.commit()

class DatasetProcessor:
    def get_dataset(self):
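        # Toy Bengali dataset. Instruction: "Answer with empathy."
        # Inputs: "I feel sad." / "I feel lonely." / "I am scared."
        # Outputs: "Be calm." / "I am here." / "There is nothing to fear."
        data = {"instruction": ["সহমর্মিতার সাথে উত্তর দিন।"]*3, "input": ["মন খারাপ।", "একা লাগে।", "ভয় লাগছে।"], "output": ["শান্ত হোন।", "আমি আছি।", "ভয় নেই।"]}
        # Prompt labels: নির্দেশনা = instruction, প্রশ্ন = question, উত্তর = answer.
        # Note: "<|endoftext|>" is appended as literal text and may not match this model's EOS token (see tokenizer.eos_token).
        return Dataset.from_dict(data).map(lambda x: {"text": f"নির্দেশনা: {x['instruction']}\nপ্রশ্ন: {x['input']}\nউত্তর: {x['output']} <|endoftext|>"}, batched=False)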

# --- Design pattern (Strategy Pattern) ---
class UnslothLoRAStrategy:
    def apply(self, model):
        return FastLanguageModel.get_peft_model(model, r=16, target_modules=["q_proj", "k_proj", "v_proj", "o_proj"], lora_alpha=16, lora_dropout=0, bias="none", use_gradient_checkpointing="unsloth")

class Evaluator:
    def __init__(self, model, tokenizer, logger):
        self.model, self.tokenizer, self.logger = model, tokenizer, logger
    def run_eval(self, exp_id, test_in, ref):
        FastLanguageModel.for_inference(self.model)
        inputs = self.tokenizer([f"প্রশ্ন: {test_in}\nউত্তর: "], return_tensors="pt").to("cuda")
        out = self.model.generate(**inputs, max_new_tokens=32)
        gen_text = self.tokenizer.batch_decode(out, skip_special_tokens=True)[0]
        # Placeholder: hard-coded sample metrics, not computed from gen_text or ref.
        return {"bleu": 0.52, "rougeL": 0.61}

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
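The `Logger` class above creates the two tables but defines no way to write to them. A minimal sketch of what row insertion could look like is given below. The helper names `log_experiment` and `log_response` are hypothetical (they are not in the notebook), an in-memory database stands in for `experiments.db`, and storing the LoRA config and metrics as JSON strings is an assumption about how the `TEXT` columns are meant to be used.

```python
import json
import sqlite3
from datetime import datetime, timezone

# In-memory database for illustration; the notebook uses "experiments.db".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS LLAMAExperiments (id INTEGER PRIMARY KEY, "
    "model_name TEXT, lora_config TEXT, train_loss REAL, val_loss REAL, "
    "metrics TEXT, timestamp TEXT)"
)
conn.execute(
    "CREATE TABLE IF NOT EXISTS GeneratedResponses (id INTEGER PRIMARY KEY, "
    "experiment_id INTEGER, input_text TEXT, response_text TEXT, timestamp TEXT)"
)


def log_experiment(conn, model_name, lora_config, train_loss, val_loss, metrics):
    """Hypothetical helper: insert one experiment row and return its id."""
    cur = conn.execute(
        "INSERT INTO LLAMAExperiments "
        "(model_name, lora_config, train_loss, val_loss, metrics, timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            model_name,
            json.dumps(lora_config),  # assumption: config stored as JSON text
            train_loss,
            val_loss,
            json.dumps(metrics),      # assumption: metrics stored as JSON text
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()
    return cur.lastrowid


def log_response(conn, experiment_id, input_text, response_text):
    """Hypothetical helper: insert one generated response and return its id."""
    cur = conn.execute(
        "INSERT INTO GeneratedResponses "
        "(experiment_id, input_text, response_text, timestamp) "
        "VALUES (?, ?, ?, ?)",
        (experiment_id, input_text, response_text,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid


exp_id = log_experiment(
    conn,
    model_name="unsloth/llama-3.1-8b-instruct-bnb-4bit",
    lora_config={"r": 16, "lora_alpha": 16, "lora_dropout": 0},
    train_loss=0.80026,
    val_loss=None,  # the notebook runs no validation
    metrics={},
)
resp_id = log_response(conn, exp_id, "একা লাগে।", "আমি আছি।")
```

Parameterized queries (`?` placeholders) keep the Bengali text and any quotes in model output from breaking the SQL statement.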


In [3]:
# --- মেইন এক্সিকিউশন ---
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3.1-8b-instruct-bnb-4bit",
    max_seq_length = 2048, load_in_4bit = True
)
strategy = UnslothLoRAStrategy()
model = strategy.apply(model)

trainer = SFTTrainer(
    model=model, tokenizer=tokenizer, train_dataset=DatasetProcessor().get_dataset(),
    dataset_text_field="text", max_seq_length=2048,
    args=TrainingArguments(per_device_train_batch_size=2, gradient_accumulation_steps=4, max_steps=10, learning_rate=2e-4, fp16=True, logging_steps=1, output_dir="outputs")
)

stats = trainer.train()
print("\n--- Success! Training Completed. ---")

==((====))==  Unsloth 2025.12.9: Fast Llama patching. Transformers: 4.57.3.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.5.0
\        /    Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Not an error, but Unsloth cannot patch MLP layers with our manual autograd engine since either LoRA adapters
are not enabled or a bias term (like in Qwen) is used.
Unsloth 2025.12.9 patched 32 layers with 32 QKV layers, 32 O layers and 0 MLP layers.


num_proc must be <= 3. Reducing num_proc to 3 for dataset of size 3.


The model is already on multiple devices. Skipping the move to device specified in `args`.
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 3 | Num Epochs = 10 | Total steps = 10
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 13,631,488 of 8,043,892,736 (0.17% trained)
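The trainable-parameter count in the banner above can be reproduced from the LoRA settings (`r=16` on `q_proj`, `k_proj`, `v_proj`, `o_proj`). The sketch below assumes the usual Llama 3.1 8B shapes, which the notebook does not state: hidden size 4096, 8 key/value heads of dimension 128, and 32 decoder layers (the layer count does match the "patched 32 layers" line in the log).

```python
# Assumed Llama 3.1 8B shapes (not stated in the notebook).
HIDDEN = 4096        # model hidden size; q_proj and o_proj are HIDDEN x HIDDEN
KV_DIM = 8 * 128     # 8 key/value heads of dimension 128 (grouped-query attention)
NUM_LAYERS = 32      # decoder layers, as reported in the Unsloth log
RANK = 16            # LoRA rank r from UnslothLoRAStrategy


def lora_params(in_features, out_features, rank):
    """Parameters added by one LoRA adapter: A (rank x in) plus B (out x rank)."""
    return rank * (in_features + out_features)


per_layer = (
    lora_params(HIDDEN, HIDDEN, RANK)    # q_proj
    + lora_params(HIDDEN, KV_DIM, RANK)  # k_proj
    + lora_params(HIDDEN, KV_DIM, RANK)  # v_proj
    + lora_params(HIDDEN, HIDDEN, RANK)  # o_proj
)
trainable = per_layer * NUM_LAYERS
total = 8_043_892_736  # total parameter count taken from the banner

print(trainable)                          # 13631488
print(f"{100 * trainable / total:.2f}%")  # 0.17%
```

Under these assumptions the result agrees with the reported 13,631,488 trainable parameters (0.17%).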

Step    Training Loss
1       1.3925
2       1.2019
3       1.0450
4       0.9144
5       0.7956
6       0.6849
7       0.5921
8       0.5140
9       0.4509
10      0.4114




Run history:
train/epoch          ▁▂▃▃▄▅▆▆▇██
train/global_step    ▁▂▃▃▄▅▆▆▇██
train/grad_norm      █▂▁▂▂▂▂▃▃▂
train/learning_rate  █▇▆▆▅▄▃▃▂▁
train/loss           █▇▆▅▄▃▂▂▁▁

Run summary:
total_flos                117740589465600.0
train/epoch               10.0
train/global_step         10.0
train/grad_norm           1.08532
train/learning_rate       2e-05
train/loss                0.4114
train_loss                0.80026
train_runtime             2471.8271
train_samples_per_second  0.032
train_steps_per_second    0.004
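The summary values can be cross-checked against the per-step losses and the training settings. The sketch below uses only numbers from this output. It assumes that `train_loss` is the mean of the logged step losses and that the throughput figures are derived from `max_steps` and the total batch size of 8, which appears consistent with the reported values but is not confirmed by the notebook.

```python
# Per-step training losses copied from the log above.
step_losses = [1.3925, 1.2019, 1.045, 0.9144, 0.7956,
               0.6849, 0.5921, 0.514, 0.4509, 0.4114]

max_steps = 10
total_batch_size = 2 * 4 * 1  # per-device batch x grad accumulation x GPUs
train_runtime = 2471.8271     # seconds, from the run summary

# Assumption: train_loss is the average of the logged step losses.
mean_loss = sum(step_losses) / len(step_losses)

# Assumption: throughput is computed from optimizer steps and total batch size.
steps_per_second = max_steps / train_runtime
samples_per_second = max_steps * total_batch_size / train_runtime

print(f"mean loss      = {mean_loss:.4f}")
print(f"steps/second   = {steps_per_second:.3f}")
print(f"samples/second = {samples_per_second:.3f}")
```

The mean comes out within rounding of the reported `train_loss` of 0.80026; the small gap is expected because the logged step losses are already rounded to four decimals. Note that the run took roughly 41 minutes for 10 steps on 3 examples.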



--- Success! Training Completed. ---
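The `Evaluator` defined earlier returns fixed sample metrics and is never called in this run. As one way of scoring a generated reply against a reference, here is a minimal ROUGE-L F1 sketch based on the longest common subsequence of whitespace-separated tokens. It is a simplified stand-in, not the `rouge_score` or `evaluate` implementation, and the function names are hypothetical. Whitespace splitting is an assumption chosen because it works on Bengali text without extra dependencies.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """ROUGE-L F1 over whitespace tokens (simplified sketch)."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


exact = rouge_l_f1("আমি আছি।", "আমি আছি।")          # identical strings
partial = rouge_l_f1("আমি এখানে আছি।", "আমি আছি।")  # one extra token in the prediction
print(exact, round(partial, 2))
```

When applying this to `gen_text` from `run_eval`, keep in mind that `generate` returns the prompt tokens followed by the continuation, so the prompt would likely need to be stripped before scoring.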
