# Qwen3-4B Offensive/Not Classification (Arabic Tweets)
This notebook is structured to avoid the issues we faced:
- NumPy/SciPy binary mismatch
- `TrainingArguments` parameter name differences (`evaluation_strategy` vs `eval_strategy`)
- Class imbalance handled **ONLY on Train** using `WeightedRandomSampler`
- Stratified split implemented **without scikit-learn** (works even if sklearn breaks)


## 0) Environment Fix (Run once) ثم Restart Runtime
Run this cell **only once** if you see SciPy/NumPy binary errors. After it finishes, **Restart runtime**.


In [1]:
# =========================
# Step 0: Fix binary mismatch (NumPy/SciPy)
# Run once, then RESTART runtime.
# =========================

!pip -q uninstall -y numpy scipy scikit-learn
!pip -q install --no-cache-dir -U "numpy==1.26.4" "scipy==1.11.4" "scikit-learn==1.4.2"

!pip -q install --no-cache-dir -U transformers accelerate datasets peft trl bitsandbytes evaluate

import numpy as np, scipy, sklearn, transformers
print("NumPy:", np.__version__)
print("SciPy:", scipy.__version__)
print("sklearn:", sklearn.__version__)
print("transformers:", transformers.__version__)

print("✅ If this cell was run to fix errors, now do: Runtime -> Restart runtime")


[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m61.0/61.0 kB[0m [31m4.8 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m60.4/60.4 kB[0m [31m271.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m18.0/18.0 MB[0m [31m313.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m35.8/35.8 MB[0m [31m314.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.2/12.2 MB[0m [31m297.4 MB/s[0m eta [36m0:00:00[0m
[0m[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 4.0.0 requires pyarrow>=15.0.0, which is not installed.
pandas-gbq 0.30.0 requires pyarrow>=4.0.0, which is not installed.
cudf-cu12 25.10.0 requires pyarrow>=15.0.0; platform_machine == "x86_64", which is 

## 1) Imports + Global Seed


In [1]:
# =========================
# Step 1: Imports + Seed
# =========================
import inspect
import numpy as np
import pandas as pd
import torch

SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)

print("CUDA available:", torch.cuda.is_available())


CUDA available: True


## 2) Load Data + Clean + Map Labels
- Dataset B: labeled (Arabic.csv) -> columns: tweet, label
- Dataset A: unlabeled (merged_twitterdata.csv) -> column: text


In [2]:
# =========================
# Step 2: Load datasets
# =========================

# Dataset B (labeled)
train_df = pd.read_csv("/content/Arabic.csv").dropna(subset=["tweet", "label"])

# Dataset A (unlabeled)
real_df = pd.read_csv("/content/merged_twitterdata.csv").dropna(subset=["text"])

# Label mapping
label_map = {"not": 0, "offensive": 1}

# Normalize and map labels
train_df["label"] = train_df["label"].astype(str).str.strip().str.lower()
train_df["label_id"] = train_df["label"].map(label_map)

# Drop unknown labels (safety)
train_df = train_df.dropna(subset=["label_id"]).copy()
train_df["label_id"] = train_df["label_id"].astype(int)

print(train_df[["label", "label_id"]].head())
print("NaN in label_id:", train_df["label_id"].isna().sum())
print("Label counts:\n", train_df["label"].value_counts())


       label  label_id
0  offensive         1
1  offensive         1
2  offensive         1
3  offensive         1
4  offensive         1
NaN in label_id: 0
Label counts:
 label
not          7364
offensive    3867
Name: count, dtype: int64


## 3) Stratified Train/Val/Test Split (No scikit-learn)
This avoids sklearn import issues.


In [3]:
# =========================
# Step 3: Stratified split WITHOUT scikit-learn
# 80% train, 10% val, 10% test
# =========================

def stratified_split(df, label_col="label_id", train_size=0.8, val_size=0.1, test_size=0.1, seed=42):
    assert abs(train_size + val_size + test_size - 1.0) < 1e-9, "Splits must sum to 1.0"

    train_parts, val_parts, test_parts = [], [], []
    for _, g in df.groupby(label_col):
        g = g.sample(frac=1.0, random_state=seed).reset_index(drop=True)  # shuffle per class
        n = len(g)
        n_train = int(round(n * train_size))
        n_val   = int(round(n * val_size))
        n_test  = n - n_train - n_val  # remainder

        train_parts.append(g.iloc[:n_train])
        val_parts.append(g.iloc[n_train:n_train+n_val])
        test_parts.append(g.iloc[n_train+n_val:n_train+n_val+n_test])

    train_out = pd.concat(train_parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)
    val_out   = pd.concat(val_parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)
    test_out  = pd.concat(test_parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)

    return train_out, val_out, test_out

train_part, val_part, test_part = stratified_split(train_df, label_col="label_id", seed=SEED)

print("Train:", train_part.shape, "Val:", val_part.shape, "Test:", test_part.shape)
print("\nTrain dist:\n", train_part["label"].value_counts(normalize=True))
print("\nVal dist:\n", val_part["label"].value_counts(normalize=True))
print("\nTest dist:\n", test_part["label"].value_counts(normalize=True))


Train: (8985, 4) Val: (1123, 4) Test: (1123, 4)

Train dist:
 label
not          0.655648
offensive    0.344352
Name: proportion, dtype: float64

Val dist:
 label
not          0.655387
offensive    0.344613
Name: proportion, dtype: float64

Test dist:
 label
not          0.656278
offensive    0.343722
Name: proportion, dtype: float64


## 4) Prompt Template + HF Datasets
We train Qwen (causal LM) to **generate** the label token: `not` or `offensive`.


In [4]:
# =========================
# Step 4: Prompt + Datasets
# =========================
from datasets import Dataset

def build_prompt(tweet: str) -> str:
    # English comments required
    return (
        "Classify the following Arabic tweet into exactly one label: not or offensive.\n"
        "Answer with one word only (not/offensive). No explanation.\n\n"
        f"Tweet: {tweet}\n"
        "Answer:"
    )

def to_sft_example(row):
    prompt = build_prompt(row["tweet"])
    completion = "not" if int(row["label_id"]) == 0 else "offensive"
    return {"prompt": prompt, "completion": completion, "label_id": int(row["label_id"])}

train_ds = Dataset.from_pandas(train_part.reset_index(drop=True)).map(to_sft_example)
val_ds   = Dataset.from_pandas(val_part.reset_index(drop=True)).map(to_sft_example)
test_ds  = Dataset.from_pandas(test_part.reset_index(drop=True)).map(to_sft_example)

print("Train columns:", train_ds.column_names)
print("Example:", train_ds[0])


Map:   0%|          | 0/8985 [00:00<?, ? examples/s]

Map:   0%|          | 0/1123 [00:00<?, ? examples/s]

Map:   0%|          | 0/1123 [00:00<?, ? examples/s]

Train columns: ['created_at', 'tweet', 'label', 'label_id', 'prompt', 'completion']
Example: {'created_at': '2023-02-01 21:32:54+00:00', 'tweet': 'هذا مسلي ال معمر من انت يابكيري', 'label': 'not', 'label_id': 0, 'prompt': 'Classify the following Arabic tweet into exactly one label: not or offensive.\nAnswer with one word only (not/offensive). No explanation.\n\nTweet: هذا مسلي ال معمر من انت يابكيري\nAnswer:', 'completion': 'not'}


## 5) Handle Class Imbalance (Train ONLY)
Your counts are roughly `not=7364`, `offensive=3867`.
We use `WeightedRandomSampler` so the minority class is sampled more often **during training only**.


In [5]:
# =========================
# Step 5: WeightedRandomSampler (train only)
# =========================
from torch.utils.data import WeightedRandomSampler

train_labels = np.array(train_ds["label_id"], dtype=int)
class_counts = np.bincount(train_labels)
class_counts = np.maximum(class_counts, 1)  # safety

# Inverse-frequency class weights
class_weights = 1.0 / class_counts

# Weight per sample
sample_weights = class_weights[train_labels]

sampler = WeightedRandomSampler(
    weights=torch.tensor(sample_weights, dtype=torch.double),
    num_samples=len(sample_weights),
    replacement=True
)

print("Class counts:", class_counts)
print("Class weights:", class_weights)
print("Sampler ready ✅")


Class counts: [5891 3094]
Class weights: [0.00016975 0.00032321]
Sampler ready ✅


## 6) Load Qwen/Qwen3-4B (4-bit) + Tokenizer


In [6]:
# =========================
# Step 6: Load Qwen3-4B in 4-bit
# =========================
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "Qwen/Qwen3-4B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float16,
)

# Ensure pad token exists
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print("Loaded model ✅", MODEL_ID)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json: 0.00B [00:00, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json:   0%|          | 0.00/11.4M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/726 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors.index.json: 0.00B [00:00, ?B/s]

Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/99.6M [00:00<?, ?B/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/3.96G [00:00<?, ?B/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/3.99G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

Loaded model ✅ Qwen/Qwen3-4B


## 7) LoRA + TrainingArguments (Version-safe)
This cell avoids the `evaluation_strategy` error by checking which argument name exists.


In [9]:
from peft import LoraConfig
from transformers import TrainingArguments
import inspect
import torch

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.10,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"],
)

def make_training_args(output_dir: str, seed: int):
    sig = inspect.signature(TrainingArguments.__init__)
    params = sig.parameters

    eval_arg_name = "evaluation_strategy" if "evaluation_strategy" in params else (
        "eval_strategy" if "eval_strategy" in params else None
    )

    kwargs = dict(
        output_dir=output_dir,
        per_device_train_batch_size=2,
        per_device_eval_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=2,
        warmup_ratio=0.05,
        lr_scheduler_type="cosine",
        weight_decay=0.01,
        eval_steps=100,
        save_steps=100,
        logging_steps=20,
        save_total_limit=2,
        max_grad_norm=1.0,
        optim="paged_adamw_8bit",
        seed=seed,
        report_to="none",
    )

    # Enable evaluation
    if eval_arg_name is not None:
        kwargs[eval_arg_name] = "steps"
    else:
        kwargs["do_eval"] = True

    # Mixed precision: pick ONLY ONE
    # A100/L4/H100 => bf16 True, fp16 False
    if "bf16" in params:
        kwargs["bf16"] = bool(torch.cuda.is_available())
    if "fp16" in params:
        kwargs["fp16"] = False

    # Remove unsupported keys
    for k in list(kwargs.keys()):
        if k not in params:
            kwargs.pop(k)

    return TrainingArguments(**kwargs)

training_args = make_training_args("/content/qwen3_offensive_cls", seed=SEED)

print("TrainingArguments ready ✅")
print("bf16:", getattr(training_args, "bf16", None), "| fp16:", getattr(training_args, "fp16", None))
print("Eval steps:", getattr(training_args, "eval_steps", None))
print("Save steps:", getattr(training_args, "save_steps", None))


TrainingArguments ready ✅
bf16: True | fp16: False
Eval steps: 100
Save steps: 100


## 8) Trainer with Weighted Sampler + Train
We override `get_train_dataloader()` to apply the sampler.


In [13]:
import torch

MAX_LEN = 512  # you can set 256 if you want faster training

def tokenize_for_sft(example):
    # Build full training text: prompt + completion
    text = example["prompt"] + " " + example["completion"]
    tokens = tokenizer(
        text,
        truncation=True,
        max_length=MAX_LEN,
        padding=False,
    )
    # Labels must match input_ids for causal LM
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

train_tok = train_ds.map(tokenize_for_sft, remove_columns=train_ds.column_names)
val_tok   = val_ds.map(tokenize_for_sft, remove_columns=val_ds.column_names)


Map:   0%|          | 0/8985 [00:00<?, ? examples/s]

Map:   0%|          | 0/1123 [00:00<?, ? examples/s]

In [14]:
from transformers import DataCollatorForLanguageModeling

# Causal LM => mlm=False
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)


In [15]:
import inspect
from trl import SFTTrainer

# Collect supported parameters from your installed TRL version
trainer_params = set(inspect.signature(SFTTrainer.__init__).parameters.keys())

base_kwargs = {
    "model": model,
    "train_dataset": train_tok,
    "eval_dataset": val_tok,
    "args": training_args,
    "peft_config": lora_config,
    "data_collator": data_collator,
}

# tokenizer argument name differs by version (tokenizer vs processing_class)
if "tokenizer" in trainer_params:
    base_kwargs["tokenizer"] = tokenizer
elif "processing_class" in trainer_params:
    base_kwargs["processing_class"] = tokenizer

# Filter out anything not supported (extra safety)
trainer_kwargs = {k: v for k, v in base_kwargs.items() if k in trainer_params}

print("✅ Passing these args to SFTTrainer:", sorted(trainer_kwargs.keys()))


✅ Passing these args to SFTTrainer: ['args', 'data_collator', 'eval_dataset', 'model', 'peft_config', 'processing_class', 'train_dataset']


In [16]:
print("sampler exists:", "sampler" in globals())


sampler exists: True


In [20]:
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)


In [23]:
# Make sure pad token is set correctly
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model.config.pad_token_id = tokenizer.pad_token_id


In [24]:
MAX_LEN = 512  # 256 to speed up training

def tokenize_for_sft(example):
    text = example["prompt"] + " " + example["completion"]
    tokens = tokenizer(
        text,
        truncation=True,
        max_length=MAX_LEN,
        padding=False,   # IMPORTANT: no padding here, collator will handle it per-batch
    )
    return tokens

train_tok = train_ds.map(tokenize_for_sft, remove_columns=train_ds.column_names)
val_tok   = val_ds.map(tokenize_for_sft, remove_columns=val_ds.column_names)

print("✅ Tokenization done")
print(train_tok[0].keys())


Map:   0%|          | 0/8985 [00:00<?, ? examples/s]

Map:   0%|          | 0/1123 [00:00<?, ? examples/s]

✅ Tokenization done
dict_keys(['input_ids', 'attention_mask'])


In [25]:
import torch

def causal_lm_collator(features):
    # Pad inputs dynamically to the longest sequence in the batch
    batch = tokenizer.pad(
        features,
        padding=True,
        return_tensors="pt"
    )

    # Create labels from input_ids
    labels = batch["input_ids"].clone()

    # Ignore loss on padding tokens
    labels[batch["attention_mask"] == 0] = -100

    batch["labels"] = labels
    return batch

data_collator = causal_lm_collator
print("✅ Custom collator ready")


✅ Custom collator ready


In [26]:
import inspect
from trl import SFTTrainer
from torch.utils.data import DataLoader

class SFTTrainerWithSampler(SFTTrainer):
    def get_train_dataloader(self):
        return DataLoader(
            self.train_dataset,
            batch_size=self.args.train_batch_size,
            sampler=sampler,
            collate_fn=self.data_collator,  # <-- uses our custom collator now
            drop_last=self.args.dataloader_drop_last,
            num_workers=self.args.dataloader_num_workers,
            pin_memory=self.args.dataloader_pin_memory,
        )

trainer_params = set(inspect.signature(SFTTrainer.__init__).parameters.keys())

base_kwargs = {
    "model": model,
    "train_dataset": train_tok,
    "eval_dataset": val_tok,
    "args": training_args,
    "peft_config": lora_config,
    "data_collator": data_collator,  # <-- IMPORTANT
}

if "tokenizer" in trainer_params:
    base_kwargs["tokenizer"] = tokenizer
elif "processing_class" in trainer_params:
    base_kwargs["processing_class"] = tokenizer

trainer_kwargs = {k: v for k, v in base_kwargs.items() if k in trainer_params}

print("Passing to SFTTrainer:", sorted(trainer_kwargs.keys()))

trainer = SFTTrainerWithSampler(**trainer_kwargs)

trainer.train()

print("✅ Training done. global_step:", trainer.state.global_step)


Passing to SFTTrainer: ['args', 'data_collator', 'eval_dataset', 'model', 'peft_config', 'processing_class', 'train_dataset']




Truncating train dataset:   0%|          | 0/8985 [00:00<?, ? examples/s]

Truncating eval dataset:   0%|          | 0/1123 [00:00<?, ? examples/s]

You're using a Qwen2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


Step,Training Loss,Validation Loss,Entropy,Num Tokens,Mean Token Accuracy
100,2.0234,2.010597,2.028882,112688.0,0.622948
200,1.9611,1.931551,1.949472,225305.0,0.634086
300,1.8264,1.886982,1.827533,338577.0,0.640947
400,1.6855,1.858807,1.812735,450630.0,0.64604
500,1.6484,1.830752,1.678992,562394.0,0.650887
600,1.6534,1.819237,1.659531,675210.0,0.653965
700,1.5606,1.801772,1.623268,788284.0,0.657351
800,1.5645,1.793296,1.622013,902143.0,0.660121
900,1.5137,1.78771,1.618425,1016086.0,0.661101
1000,1.499,1.785185,1.602935,1128391.0,0.661384


✅ Training done. global_step: 1124


## 9) Evaluate (Validation + Test)
We generate a short output and parse it as `not` or `offensive`.


In [30]:
from sklearn.metrics import accuracy_score, f1_score, classification_report
import torch

@torch.no_grad()
def predict_label_generate(tweet: str, max_new_tokens=3) -> str:
    prompt = build_prompt(tweet)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    out = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        temperature=0.0,
        pad_token_id=tokenizer.eos_token_id,
    )

    decoded = tokenizer.decode(out[0], skip_special_tokens=True)
    ans = decoded.split("Answer:")[-1].strip().lower()
    first = ans.split()[0] if ans else ""

    if "off" in first:
        return "offensive"
    return "not"

def eval_split_generate(df_part, name="Split"):
    y_true = df_part["label"].tolist()
    y_pred = [predict_label_generate(t) for t in df_part["tweet"].astype(str).tolist()]

    acc = accuracy_score(y_true, y_pred)
    f1  = f1_score(y_true, y_pred, pos_label="offensive")

    print(f"\n==== {name} (generate) ====")
    print("Accuracy:", acc)
    print("F1 (offensive positive):", f1)
    print(classification_report(y_true, y_pred, digits=4))

eval_split_generate(val_part, "Validation")
eval_split_generate(test_part, "Test")



==== Validation (generate) ====
Accuracy: 0.9519145146927872
F1 (offensive positive): 0.9291338582677166
              precision    recall  f1-score   support

         not     0.9559    0.9715    0.9636       736
   offensive     0.9440    0.9147    0.9291       387

    accuracy                         0.9519      1123
   macro avg     0.9499    0.9431    0.9464      1123
weighted avg     0.9518    0.9519    0.9517      1123


==== Test (generate) ====
Accuracy: 0.9536954585930543
F1 (offensive positive): 0.9329896907216495
              precision    recall  f1-score   support

         not     0.9673    0.9620    0.9646       737
   offensive     0.9282    0.9378    0.9330       386

    accuracy                         0.9537      1123
   macro avg     0.9477    0.9499    0.9488      1123
weighted avg     0.9538    0.9537    0.9538      1123



In [33]:
from sklearn.metrics import accuracy_score, f1_score, classification_report

def eval_split_generate_return(df_part, name="Split"):
    y_true = df_part["label"].tolist()
    y_pred = [predict_label_generate(t) for t in df_part["tweet"].astype(str).tolist()]

    acc = accuracy_score(y_true, y_pred)
    f1  = f1_score(y_true, y_pred, pos_label="offensive")
    report = classification_report(y_true, y_pred, digits=4, output_dict=True)

    return {
        "split": name,
        "accuracy": acc,
        "f1_offensive": f1,
        "precision_not": report["not"]["precision"],
        "recall_not": report["not"]["recall"],
        "f1_not": report["not"]["f1-score"],
        "precision_offensive": report["offensive"]["precision"],
        "recall_offensive": report["offensive"]["recall"],
        "f1_offensive_class": report["offensive"]["f1-score"],
    }


In [34]:
import pandas as pd

results = []
results.append(eval_split_generate_return(val_part, "Validation"))
results.append(eval_split_generate_return(test_part, "Test"))

results_df = pd.DataFrame(results)
results_df


Unnamed: 0,split,accuracy,f1_offensive,precision_not,recall_not,f1_not,precision_offensive,recall_offensive,f1_offensive_class
0,Validation,0.951024,0.927727,0.954606,0.971467,0.962963,0.94385,0.912145,0.927727
1,Test,0.955476,0.935401,0.967347,0.964722,0.966033,0.93299,0.937824,0.935401


In [35]:
out_path = "/content/generate_evaluation_results.csv"
results_df.to_csv(out_path, index=False)
print("✅ Saved generate results to:", out_path)


✅ Saved generate results to: /content/generate_evaluation_results.csv


In [36]:
import json

def save_generate_report(df_part, name):
    y_true = df_part["label"].tolist()
    y_pred = [predict_label_generate(t) for t in df_part["tweet"].astype(str).tolist()]
    report = classification_report(y_true, y_pred, digits=4, output_dict=True)

    path = f"/content/{name.lower()}_generate_report.json"
    with open(path, "w", encoding="utf-8") as f:
        json.dump(report, f, indent=2)

    print("✅ Saved:", path)

save_generate_report(val_part, "Validation")
save_generate_report(test_part, "Test")


✅ Saved: /content/validation_generate_report.json
✅ Saved: /content/test_generate_report.json


## 10) Save Adapter + Predict Unlabeled Dataset A

---




In [37]:
save_dir = "/content/qwen3_offensive_lora_adapter"
trainer.model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
print("✅ Adapter saved to:", save_dir)


✅ Adapter saved to: /content/qwen3_offensive_lora_adapter


In [38]:
real_df_out = real_df.copy()
real_df_out["pred_label"] = real_df_out["text"].astype(str).apply(predict_label_generate)
real_df_out["pred_label_id"] = real_df_out["pred_label"].map({"not": 0, "offensive": 1})

out_path = "/content/merged_twitterdata_with_preds.csv"
real_df_out.to_csv(out_path, index=False, encoding="utf-8-sig")

print("✅ Predictions saved to:", out_path)
real_df_out.head()


✅ Predictions saved to: /content/merged_twitterdata_with_preds.csv


Unnamed: 0,url,twitterUrl,id,text,retweetCount,replyCount,likeCount,quoteCount,createdAt,bookmarkCount,isRetweet,isQuote,pred_label,pred_label_id
0,https://x.com/Gxxzi/status/1987582511455572351,https://twitter.com/Gxxzi/status/1987582511455...,1987582511455572351,كلب سئم من نباح كلب صغير وقام رمية في حمام الس...,19,2,92,0,Sun Nov 09 18:06:12 +0000 2025,29,False,True,not,0
1,https://x.com/Asem_a/status/1987570973743128746,https://twitter.com/Asem_a/status/198757097374...,1987570973743128746,يا ابن الحمير انا ما غلطت عليك ولا سبيتك لكن ا...,0,0,0,0,Sun Nov 09 17:20:22 +0000 2025,0,False,True,not,0
2,https://x.com/FaisalIdri61604/status/198756789...,https://twitter.com/FaisalIdri61604/status/198...,1987567891080749301,بعد فضيحة سرقة لوحاتها .. الفنانة الدنماركية ل...,0,0,0,0,Sun Nov 09 17:08:07 +0000 2025,0,False,False,not,0
3,https://x.com/Fallzhrani/status/19875611389265...,https://twitter.com/Fallzhrani/status/19875611...,1987561138926579734,طيب والله فضيحة لو ماحسبها بلنتي,0,0,0,0,Sun Nov 09 16:41:17 +0000 2025,0,False,False,not,0
4,https://x.com/ksa702aaa/status/198753138641334...,https://twitter.com/ksa702aaa/status/198753138...,1987531386413347295,كلب ومات,0,0,0,0,Sun Nov 09 14:43:03 +0000 2025,0,False,True,not,0


In [39]:
real_df_out["pred_label"].value_counts(normalize=True) * 100


Unnamed: 0_level_0,proportion
pred_label,Unnamed: 1_level_1
not,82.568807
offensive,17.431193


In [40]:
real_df_out["pred_label"].value_counts()


Unnamed: 0_level_0,count
pred_label,Unnamed: 1_level_1
not,180
offensive,38


In [41]:
real_df_out.sample(10)[["text", "pred_label"]]


Unnamed: 0,text,pred_label
100,المنشد سعد الغامدي .\nعن ( النشيد الاهلاوي ) \...,not
215,عن أبي هريرةرضي الله عنه:قال رسول اللهﷺ\n[لأن ...,not
139,احذروا نصابين الإستثمار https://t.co/ye2XznTvZG,not
178,عايز اشكر نفسي علي ان هي سكتت كتير علي ناس تست...,not
15,لمن تتذكرين تصرف غبي سويتيه تتفشلين اكثر من ال...,not
154,#الوطن | المؤبد لـ 24 من كوادر الإخوان بتهمة ح...,offensive
170,والهجرُ أقتلُ لي مما أراقبُهُ\n أنا الغريقُ ...,not
73,كانت ليلة من اجمل ليالي عمري\nشكراً قناة الواق...,not
207,من عادات العرب في الجاهلية ..\n\n .. إذا تكاثر...,not
140,مثل الحسابات هذي مدري كيف يصدقونها العالم!!؟\n...,not


## After enhancement prompt

In [42]:
def build_prompt(tweet):
    return f"""
You are an Arabic content moderation system.
Classify the following tweet as "offensive" if it contains insults, profanity, or abusive language,
even if it appears within a discussion or argument.
Otherwise, classify it as "not".

Tweet:
{tweet}

Answer:
"""


In [45]:
real_df_out["pred_label"].value_counts(normalize=True) * 100


Unnamed: 0_level_0,proportion
pred_label,Unnamed: 1_level_1
not,79.816514
offensive,20.183486


In [48]:
!pip install -q streamlit


[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m9.0/9.0 MB[0m [31m67.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m6.9/6.9 MB[0m [31m101.9 MB/s[0m eta [36m0:00:00[0m
[?25h

In [51]:
!zip -r qwen3_offensive_lora_adapter.zip qwen3_offensive_lora_adapter


  adding: qwen3_offensive_lora_adapter/ (stored 0%)
  adding: qwen3_offensive_lora_adapter/adapter_config.json (deflated 59%)
  adding: qwen3_offensive_lora_adapter/merges.txt (deflated 57%)
  adding: qwen3_offensive_lora_adapter/vocab.json (deflated 61%)
  adding: qwen3_offensive_lora_adapter/README.md (deflated 65%)
  adding: qwen3_offensive_lora_adapter/chat_template.jinja (deflated 76%)
  adding: qwen3_offensive_lora_adapter/tokenizer.json (deflated 81%)
  adding: qwen3_offensive_lora_adapter/tokenizer_config.json (deflated 90%)
  adding: qwen3_offensive_lora_adapter/added_tokens.json (deflated 68%)
  adding: qwen3_offensive_lora_adapter/adapter_model.safetensors (deflated 21%)
  adding: qwen3_offensive_lora_adapter/special_tokens_map.json (deflated 69%)


In [52]:
from google.colab import files
files.download("qwen3_offensive_lora_adapter.zip")


<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

## Live Stream


In [56]:
@torch.no_grad()
def predict_label_generate(text: str) -> str:
    prompt = build_prompt(text)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    out = model.generate(
        **inputs,
        max_new_tokens=3,
        do_sample=False,
        temperature=0.0,
        pad_token_id=tokenizer.pad_token_id,
    )

    decoded = tokenizer.decode(out[0], skip_special_tokens=True)
    ans = decoded.split("Answer:")[-1].strip().lower()
    first = ans.split()[0] if ans else ""

    return "offensive" if "off" in first else "not"


In [58]:
print("\n🛡️ Arabic Offensive Detector (LIVE)")
print("اكتب نص عربي واضغط Enter")
print("اكتب exit للخروج\n")

while True:
    text = input("📝 النص: ").strip()
    if text.lower() == "exit":
        print("👋 تم الخروج")
        break

    label = predict_label_generate(text)

    if label == "offensive":
        print("🚫 النتيجة: OFFENSIVE\n")
    else:
        print("✅ النتيجة: NOT\n")



🛡️ Arabic Offensive Detector (LIVE)
اكتب نص عربي واضغط Enter
اكتب exit للخروج

📝 النص: ياحمار ايش هذا الكلام
✅ النتيجة: NOT

📝 النص: ياكلب ايش هذا الكلام
🚫 النتيجة: OFFENSIVE

📝 النص: exit
👋 تم الخروج


## Enhancement - The final system adopts a hybrid approach, combining a fine-tuned LLM with a lightweight rule-based layer to capture explicit profanity that may be underrepresented in the training data.

In [59]:
BAD_WORDS = [
    "حمار", "حمير", "يا حمار",
    "كلب", "يا كلب",
    "حيوان", "يا حيوان",
    "غبي", "اهبل", "قذر", "وسخ",
    "لعنة", "لعن", "يلعن", "تباً", "تفو"
]

def contains_bad_words(text: str) -> bool:
    t = text.replace("ـ", "").strip().lower()
    return any(w in t for w in BAD_WORDS)

@torch.no_grad()
def predict_label_generate_strict(text: str) -> str:
    # 1) Hard rule first (fast + catches obvious insults)
    if contains_bad_words(text):
        return "offensive"
    # 2) Otherwise fall back to model
    return predict_label_generate(text)


In [60]:
label = predict_label_generate_strict(text)


In [61]:
def build_prompt(tweet: str) -> str:
    return f"""
You are an Arabic content moderation system.
Classify as "offensive" if the text contains ANY insult or profanity (e.g., حمار، كلب، غبي، اهبل) or abusive language,
even if used casually or in an argument.
Otherwise classify as "not".

Tweet:
{tweet}

Answer:
""".strip()


In [62]:
import time
import torch

# -----------------------------
# Prompt (STRICT)
# -----------------------------
def build_prompt(tweet: str) -> str:
    return f"""
You are an Arabic content moderation system.
Classify the following text as "offensive" if it contains insults, profanity, or abusive language,
even if it appears within a discussion or argument. Otherwise, classify it as "not".

Text:
{tweet}

Answer:
""".strip()

# -----------------------------
# Inference (generate-based) - matches your training/eval
# -----------------------------
@torch.no_grad()
def predict_label_generate(text: str, max_new_tokens: int = 3) -> str:
    prompt = build_prompt(text)

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    out = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        temperature=0.0,
        pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
    )

    decoded = tokenizer.decode(out[0], skip_special_tokens=True)
    ans = decoded.split("Answer:")[-1].strip().lower()
    first = ans.split()[0] if ans else ""

    return "offensive" if "off" in first else "not"


# -----------------------------
# LIVE CLI (no saving)
# -----------------------------
print("\n🛡️ Arabic Offensive Detector (LIVE)")
print("اكتب نص عربي واضغط Enter")
print("اكتب exit للخروج\n")

while True:
    try:
        text = input("📝 النص: ").strip()
        if not text:
            print("⚠️ اكتب نص غير فاضي.\n")
            continue

        if text.lower() in ["exit", "quit", "q"]:
            print("👋 تم الخروج")
            break

        t0 = time.time()
        label = predict_label_generate(text)
        dt = time.time() - t0

        if label == "offensive":
            print(f"🚫 النتيجة: OFFENSIVE   |  ⏱️ {dt:.2f}s\n")
        else:
            print(f"✅ النتيجة: NOT         |  ⏱️ {dt:.2f}s\n")

    except KeyboardInterrupt:
        print("\n👋 تم الخروج (Ctrl+C)")
        break
    except Exception as e:
        print(f"❌ Error: {e}\n")



🛡️ Arabic Offensive Detector (LIVE)
اكتب نص عربي واضغط Enter
اكتب exit للخروج

📝 النص: ياحمار وش ذا الكلام
🚫 النتيجة: OFFENSIVE   |  ⏱️ 0.32s

📝 النص: ياكلب وش هذا الكلام
🚫 النتيجة: OFFENSIVE   |  ⏱️ 0.32s

📝 النص: الاحد القادم دوام ساعة ٨ صباحا
✅ النتيجة: NOT         |  ⏱️ 0.34s

📝 النص: نعمل الان على مشروع التخرج 
✅ النتيجة: NOT         |  ⏱️ 0.33s

📝 النص: exit
👋 تم الخروج


In [90]:
import torch
import gradio as gr
import torch.nn.functional as F

# -----------------------------
# Prompt (STRICT)
# -----------------------------
def build_prompt(tweet: str) -> str:
    return f"""
You are an Arabic content moderation system.
Classify the following text as "offensive" if it contains insults, profanity, or abusive language,
even if it appears within a discussion or argument. Otherwise, classify it as "not".

Text:
{tweet}

Answer:
""".strip()

# -----------------------------
# Inference (generate-based)
# -----------------------------
@torch.no_grad()
def predict_label_generate(text: str, max_new_tokens: int = 3) -> str:
    prompt = build_prompt(text)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    out = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        temperature=0.0,
        pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
    )

    decoded = tokenizer.decode(out[0], skip_special_tokens=True)
    ans = decoded.split("Answer:")[-1].strip().lower()
    first = ans.split()[0] if ans else ""
    return "offensive" if "off" in first else "not"

# -----------------------------
# Confidence (logits of next token: "not" vs "offensive")
# -----------------------------
def _get_single_token_id(tokenizer, variants):
    for v in variants:
        ids = tokenizer.encode(v, add_special_tokens=False)
        if len(ids) == 1:
            return ids[0]
    return tokenizer.encode(variants[0], add_special_tokens=False)[0]

NOT_ID = _get_single_token_id(tokenizer, ["not", " not"])
OFF_ID = _get_single_token_id(tokenizer, ["offensive", " offensive"])

@torch.no_grad()
def predict_with_confidence(text: str):
    prompt = build_prompt(text)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model(**inputs)
    logits = outputs.logits[0, -1, :]  # next-token logits

    two = torch.stack([logits[NOT_ID], logits[OFF_ID]], dim=0)
    probs = F.softmax(two, dim=0)
    p_not = probs[0].item()
    p_off = probs[1].item()

    pred = "offensive" if p_off > p_not else "not"
    conf = max(p_off, p_not)
    return pred, conf

# -----------------------------
# UI Card (Colors) + Confidence Rate
# -----------------------------
def render_card(user_text: str):
    user_text = (user_text or "").strip()
    if not user_text:
        return """
        <div class="card neutral">
            <div class="title">—</div>
            <div class="sub">Confidence: —</div>
            <div class="text">اكتب نص ثم اضغط Submit</div>
        </div>
        """

    pred, conf = predict_with_confidence(user_text)
    conf_pct = f"{conf*100:.2f}%"

    if pred == "offensive":
        # RED
        return f"""
        <div class="card bad">
            <div class="title"><span class="dot"></span> OFFENSIVE</div>
            <div class="sub">Confidence: {conf_pct}</div>
            <div class="text">{user_text}</div>
        </div>
        """
    else:
        # GREEN
        return f"""
        <div class="card good">
            <div class="title"><span class="dot"></span> NOT</div>
            <div class="sub">Confidence: {conf_pct}</div>
            <div class="text">{user_text}</div>
        </div>
        """

# -----------------------------
# CSS
# -----------------------------
CSS = """
.wrap {max-width: 1000px; margin: 0 auto;}
.card{
  border-radius: 12px;
  padding: 14px 16px;
  border: 1px solid rgba(0,0,0,0.10);
  box-shadow: 0 6px 14px rgba(0,0,0,0.08);
  direction: rtl;
  font-family: Arial, sans-serif;
  min-height: 150px;
}
.title{ display:flex; align-items:center; gap:10px; font-size: 26px; font-weight: 800; }
.sub{ margin-top: 6px; font-size: 16px; opacity: 0.9; }
.text{
  margin-top: 12px; font-size: 18px; line-height: 1.6;
  background: rgba(255,255,255,0.40);
  padding: 10px 12px; border-radius: 10px;
}
.dot{ width: 14px; height: 14px; border-radius: 50%; display:inline-block; }

.good{ background:#dff3df; color:#0f3d0f; }
.good .dot{ background:#1f8f1f; }

.bad{ background:#ffd7d7; color:#5a0b0b; }
.bad .dot{ background:#d11a1a; }

.neutral{ background:#f2f2f2; color:#222; }
.neutral .dot{ background:#999; }

/* Buttons */
button.primary { background: #d77219 !important; border: none !important; }
"""

# -----------------------------
# Gradio UI: 2 Textboxes + 2 Submit (separate outputs) + Exit button
# -----------------------------
with gr.Blocks(css=CSS) as demo:
    gr.Markdown("## 🛡️ Arabic Offensive Detector", elem_classes="wrap")

    with gr.Row():
        # -------- Textbox 1 --------
        with gr.Column(scale=3):
            user_text_1 = gr.Textbox(label="النص (1)", placeholder="اكتب النص الأول هنا...", lines=6)
            with gr.Row():
                submit_btn_1 = gr.Button("Submit 1", variant="primary")
        with gr.Column(scale=2):
            out_html_1 = gr.HTML(render_card(""))

    gr.Markdown("---")

    with gr.Row():
        # -------- Textbox 2 --------
        with gr.Column(scale=3):
            user_text_2 = gr.Textbox(label="النص (2)", placeholder="اكتب النص الثاني هنا...", lines=6)
            with gr.Row():
                submit_btn_2 = gr.Button("Submit 2", variant="primary")
        with gr.Column(scale=2):
            out_html_2 = gr.HTML(render_card(""))

    with gr.Row():
        clear_btn = gr.Button("Clear")
        exit_btn = gr.Button("Exit")

    # Submit actions (separate)
    submit_btn_1.click(fn=render_card, inputs=user_text_1, outputs=out_html_1)
    user_text_1.submit(fn=render_card, inputs=user_text_1, outputs=out_html_1)

    submit_btn_2.click(fn=render_card, inputs=user_text_2, outputs=out_html_2)
    user_text_2.submit(fn=render_card, inputs=user_text_2, outputs=out_html_2)

    # Clear both
    clear_btn.click(
        fn=lambda: ("", render_card(""), "", render_card("")),
        inputs=None,
        outputs=[user_text_1, out_html_1, user_text_2, out_html_2],
    )

    # Exit: close server process
    def _exit():
        raise SystemExit

    exit_btn.click(fn=_exit, inputs=None, outputs=None)

demo.launch(share=True, debug=True)


  with gr.Blocks(css=CSS) as demo:


Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://817adb4eb6849cd02b.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


ERROR:    Traceback (most recent call last):
  File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 678, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.12/asyncio/base_events.py", line 645, in run_forever
    self._run_once()
  File "/usr/lib/python3.12/asyncio/base_events.py", line 1999, in _run_once
    handle._run()
  File "/usr/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.12/dist-packages/gradio/queueing.py", line 759, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/gradio/r

Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://817adb4eb6849cd02b.gradio.live




## Save Rusult after Enhancement

In [63]:
import os, json, textwrap, shutil
from datetime import datetime

RUN_TAG = datetime.now().strftime("%Y%m%d_%H%M%S")
EXPORT_DIR = f"/content/offensive_export_{RUN_TAG}"
os.makedirs(EXPORT_DIR, exist_ok=True)

print("✅ Export folder:", EXPORT_DIR)


✅ Export folder: /content/offensive_export_20260109_161538


In [64]:
ADAPTER_DIR = os.path.join(EXPORT_DIR, "qwen3_offensive_lora_adapter")

trainer.model.save_pretrained(ADAPTER_DIR)
tokenizer.save_pretrained(ADAPTER_DIR)

print("✅ Adapter+Tokenizer saved:", ADAPTER_DIR)
print("Files:", os.listdir(ADAPTER_DIR))


✅ Adapter+Tokenizer saved: /content/offensive_export_20260109_161538/qwen3_offensive_lora_adapter
Files: ['adapter_config.json', 'merges.txt', 'vocab.json', 'README.md', 'chat_template.jinja', 'tokenizer.json', 'tokenizer_config.json', 'added_tokens.json', 'adapter_model.safetensors', 'special_tokens_map.json']


In [65]:
# 1) Save BAD_WORDS list
bad_words_path = os.path.join(EXPORT_DIR, "bad_words.json")
with open(bad_words_path, "w", encoding="utf-8") as f:
    json.dump(BAD_WORDS, f, ensure_ascii=False, indent=2)

# 2) Save prompt template (string)
prompt_path = os.path.join(EXPORT_DIR, "prompt_template.txt")
with open(prompt_path, "w", encoding="utf-8") as f:
    f.write(build_prompt("{TEXT_HERE}"))

print("✅ Saved:", bad_words_path)
print("✅ Saved:", prompt_path)


✅ Saved: /content/offensive_export_20260109_161538/bad_words.json
✅ Saved: /content/offensive_export_20260109_161538/prompt_template.txt


In [66]:
import pandas as pd

results = []
results.append(eval_split_generate_return(val_part, "Validation (strict)"))
results.append(eval_split_generate_return(test_part, "Test (strict)"))

results_df = pd.DataFrame(results)
csv_path = os.path.join(EXPORT_DIR, "generate_eval_results_strict.csv")
results_df.to_csv(csv_path, index=False)

print("✅ Saved:", csv_path)
results_df


✅ Saved: /content/offensive_export_20260109_161538/generate_eval_results_strict.csv


Unnamed: 0,split,accuracy,f1_offensive,precision_not,recall_not,f1_not,precision_offensive,recall_offensive,f1_offensive_class
0,Validation (strict),0.869991,0.824096,0.933824,0.862772,0.896893,0.772009,0.883721,0.824096
1,Test (strict),0.861977,0.811206,0.922965,0.861601,0.891228,0.765517,0.862694,0.811206


In [67]:
from sklearn.metrics import classification_report

def save_generate_report(df_part, name):
    y_true = df_part["label"].tolist()
    y_pred = [predict_label_generate(t) for t in df_part["tweet"].astype(str).tolist()]
    report = classification_report(y_true, y_pred, digits=4, output_dict=True)

    path = os.path.join(EXPORT_DIR, f"{name.lower().replace(' ', '_')}_generate_report_strict.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(report, f, ensure_ascii=False, indent=2)

    print("✅ Saved:", path)

save_generate_report(val_part, "Validation")
save_generate_report(test_part, "Test")


✅ Saved: /content/offensive_export_20260109_161538/validation_generate_report_strict.json
✅ Saved: /content/offensive_export_20260109_161538/test_generate_report_strict.json


In [92]:
import shutil
import os

# المسار الأصلي
source_dir = "/content/offensive_export_20260109_161538"

# اسم ملف zip الناتج
zip_path = "/content/offensive_export_20260109_161538.zip"

# إنشاء ملف zip
shutil.make_archive(
    base_name=zip_path.replace(".zip", ""),
    format="zip",
    root_dir=source_dir
)

print("✅ تم إنشاء الملف المضغوط:", zip_path)


✅ تم إنشاء الملف المضغوط: /content/offensive_export_20260109_161538.zip


In [93]:
from google.colab import files
files.download("/content/offensive_export_20260109_161538.zip")


<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

In [None]:
import pandas as pd

path = "/mnt/data/merged_twitterdata/human classification333.csv"
df = pd.read_csv(path)


# Model vs Human Comparison

## Model Evaluation on Full Human-Labeled Dataset (Unbalanced Test)

In [2]:
import zipfile

zip_path = "/content/offensive_export_20260109_161538.zip"      # مسار الملف المضغوط
extract_to = "/content/qwen3_offensive_lora_adapter"   # مجلد الفك

with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_to)

print("✅ Unzip done to:", extract_to)


✅ Unzip done to: /content/qwen3_offensive_lora_adapter


In [5]:
import os

print("Exists /content ?", os.path.exists("/content"))
print("List /content:")
print(os.listdir("/content"))

print("\nExists adapter dir ?", os.path.exists("/content/qwen3_offensive_lora_adapter"))
if os.path.exists("/content/qwen3_offensive_lora_adapter"):
    print("List adapter dir:")
    print(os.listdir("/content/qwen3_offensive_lora_adapter"))


Exists /content ? True
List /content:
['.config', 'offensive_export_20260109_161538.zip', 'qwen3_offensive_lora_adapter', 'merged_twitterdata with human classification333.csv', 'sample_data']

Exists adapter dir ? True
List adapter dir:
['test_generate_report_strict.json', 'prompt_template.txt', 'qwen3_offensive_lora_adapter', 'bad_words.json', 'generate_eval_results_strict.csv', 'validation_generate_report_strict.json']


In [6]:
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

BASE_MODEL_ID = "Qwen/Qwen3-4B"
LORA_ADAPTER_PATH = "/content/qwen3_offensive_lora_adapter/qwen3_offensive_lora_adapter"


tokenizer = AutoTokenizer.from_pretrained(
    BASE_MODEL_ID,
    trust_remote_code=True
)

base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)

model = PeftModel.from_pretrained(
    base_model,
    LORA_ADAPTER_PATH
)

model.eval()


Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): Qwen3ForCausalLM(
      (model): Qwen3Model(
        (embed_tokens): Embedding(151936, 2560)
        (layers): ModuleList(
          (0-35): 36 x Qwen3DecoderLayer(
            (self_attn): Qwen3Attention(
              (q_proj): lora.Linear(
                (base_layer): Linear(in_features=2560, out_features=4096, bias=False)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=2560, out_features=16, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=16, out_features=4096, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
                (lora_magnitude_vector): ModuleDict()
              )
              (k_proj): lora.Linear(

In [7]:
import re
import numpy as np
import pandas as pd
import torch
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

DATA_PATH = "/content/merged_twitterdata with human classification333.csv"

TEXT_COL = "text"
LABEL_COL = "classification"

RANDOM_SEED = 42
IMBALANCE_RATIO_THRESHOLD = 1.5

MAX_INPUT_TOKENS = 256
MAX_NEW_TOKENS = 6
BATCH_SIZE = 16

# -------------------------
# 1) Load dataset
# -------------------------
df = pd.read_csv(DATA_PATH, sep=";", encoding="utf-8-sig")
df = df.dropna(subset=[TEXT_COL, LABEL_COL]).copy()
df[TEXT_COL] = df[TEXT_COL].astype(str)
df[LABEL_COL] = df[LABEL_COL].astype(str).str.strip().str.lower()

# لو عندك تسميات مختلفة، وحّدها هنا
label_map = {
    "offensive": "offensive",
    "not": "not",
    "not_offensive": "not",
    "non-offensive": "not",
    "non_offensive": "not",
    "0": "not",
    "1": "offensive",
}
df[LABEL_COL] = df[LABEL_COL].map(lambda x: label_map.get(x, x))

print("Dataset shape:", df.shape)
print("\nClass distribution (unbalanced):")
print(df[LABEL_COL].value_counts(dropna=False))
print("\nClass distribution (%):")
print((df[LABEL_COL].value_counts(normalize=True) * 100).round(2))

# -------------------------
# 2) Prompt + parsing
# -------------------------
_off_pat = re.compile(r"\boffensive\b", re.IGNORECASE)
_not_pat = re.compile(r"\bnot\b", re.IGNORECASE)

def build_prompt(text: str) -> str:
    return (
        "You are a strict classifier.\n"
        "Classify the text as Offensive or Not.\n"
        "Return exactly one word: Offensive or Not.\n\n"
        f"Text: {text}\n"
        "Answer:"
    )

def parse_label(generated: str) -> str:
    t = generated.strip().lower()
    if _off_pat.search(t):
        return "offensive"
    if _not_pat.search(t):
        return "not"
    return "not"  # fallback

@torch.no_grad()
def predict_batch(texts):
    prompts = [build_prompt(t) for t in texts]
    inputs = tokenizer(
        prompts,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    ).to(model.device)

    out = model.generate(
        **inputs,
        max_new_tokens=MAX_NEW_TOKENS,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
    )

    decoded = tokenizer.batch_decode(out, skip_special_tokens=True)
    preds = []
    for full in decoded:
        # خذ الجزء بعد Answer:
        tail = full.split("Answer:")[-1] if "Answer:" in full else full
        preds.append(parse_label(tail))
    return preds

def evaluate(df_eval: pd.DataFrame, tag: str):
    texts = df_eval[TEXT_COL].tolist()
    y_true = df_eval[LABEL_COL].tolist()

    y_pred = []
    for i in range(0, len(texts), BATCH_SIZE):
        y_pred.extend(predict_batch(texts[i:i+BATCH_SIZE]))

    labels_sorted = sorted(list(set(y_true) | set(y_pred)))

    print(f"\n==================== {tag} ====================")
    print("N =", len(df_eval))
    print("Accuracy:", accuracy_score(y_true, y_pred))
    print("\nClassification report:")
    print(classification_report(y_true, y_pred, labels=labels_sorted, digits=4))
    print("\nConfusion matrix (rows=true, cols=pred) order:", labels_sorted)
    print(confusion_matrix(y_true, y_pred, labels=labels_sorted))

    out = df_eval[[TEXT_COL, LABEL_COL]].copy()
    out["pred"] = y_pred
    out.to_csv(f"eval_{tag.lower()}_results.csv", index=False, encoding="utf-8-sig")
    print(f"\nSaved: eval_{tag.lower()}_results.csv")

# -------------------------
# 3) Unbalanced eval
# -------------------------
evaluate(df, "UNBALANCED")

# -------------------------
# 4) Balanced eval (undersampling if imbalanced)
# -------------------------
counts = df[LABEL_COL].value_counts()
min_c = counts.min()
max_c = counts.max()
ratio = (max_c / min_c) if min_c > 0 else np.inf

is_imbalanced = ratio > IMBALANCE_RATIO_THRESHOLD
print(f"\nImbalance ratio (max/min) = {ratio:.3f} | threshold={IMBALANCE_RATIO_THRESHOLD} => imbalanced? {is_imbalanced}")

if is_imbalanced:
    parts = []
    for cls, grp in df.groupby(LABEL_COL):
        parts.append(grp.sample(n=min_c, random_state=RANDOM_SEED))
    df_bal = pd.concat(parts).sample(frac=1, random_state=RANDOM_SEED).reset_index(drop=True)
else:
    df_bal = df.copy()

print("\nBalanced class distribution:")
print(df_bal[LABEL_COL].value_counts())

evaluate(df_bal, "BALANCED")


The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


Dataset shape: (218, 13)

Class distribution (unbalanced):
classification
not          162
offensive     56
Name: count, dtype: int64

Class distribution (%):
classification
not          74.31
offensive    25.69
Name: proportion, dtype: float64


A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='le


N = 218
Accuracy: 0.7522935779816514

Classification report:
              precision    recall  f1-score   support

         not     0.7842    0.9198    0.8466       162
   offensive     0.5357    0.2679    0.3571        56

    accuracy                         0.7523       218
   macro avg     0.6600    0.5938    0.6019       218
weighted avg     0.7204    0.7523    0.7209       218


Confusion matrix (rows=true, cols=pred) order: ['not', 'offensive']
[[149  13]
 [ 41  15]]

Saved: eval_unbalanced_results.csv

Imbalance ratio (max/min) = 2.893 | threshold=1.5 => imbalanced? True

Balanced class distribution:
classification
not          56
offensive    56
Name: count, dtype: int64


A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='le


N = 112
Accuracy: 0.5714285714285714

Classification report:
              precision    recall  f1-score   support

         not     0.5444    0.8750    0.6712        56
   offensive     0.6818    0.2679    0.3846        56

    accuracy                         0.5714       112
   macro avg     0.6131    0.5714    0.5279       112
weighted avg     0.6131    0.5714    0.5279       112


Confusion matrix (rows=true, cols=pred) order: ['not', 'offensive']
[[49  7]
 [41 15]]

Saved: eval_balanced_results.csv
