## Install Libraries

In [1]:
%%capture
!pip install pip3-autoremove
!pip-autoremove torch torchvision torchaudio -y
!pip install torch torchvision torchaudio xformers --index-url https://download.pytorch.org/whl/cu121
!pip install unsloth

## Mistral-7B Model Loading using Unsloth

- Loads the Mistral-7B model with Unsloth using 4-bit quantization to reduce GPU memory usage.
- Automatically enables RoPE scaling and selects optimal precision (e.g., FP16 or BF16) based on the hardware.


In [2]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-bnb-4bit", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.


2025-04-11 16:39:59.239939: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1744389599.411310      31 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1744389599.461275      31 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


Unsloth: Failed to patch Gemma3ForConditionalGeneration.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.3.19: Fast Mistral patching. Transformers: 4.51.1.
   \\   /|    Tesla T4. Num GPUs = 2. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu121. CUDA: 7.5. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post1. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


We now add LoRA adapters so that only a small fraction of the parameters is updated (0.60% in this run, according to the training banner further down).

## PEFT Configuration with Unsloth's Mistral-7B Model

- Applies Parameter-Efficient Fine-Tuning (PEFT) using LoRA on selected transformer modules (e.g., q_proj, k_proj, etc.) with rank `r=16` and `lora_alpha=16`.
- Enables gradient checkpointing for memory efficiency and disables bias and dropout for optimized training performance.


In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",    
    use_gradient_checkpointing = True,
    random_state = 3407,
    use_rslora = False,  
    loftq_config = None,
)

Unsloth: Already have LoRA adapters! We shall skip this step.
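
As a rough cross-check of how `r` and `target_modules` determine the adapter size, the sketch below counts the LoRA weights by hand. The layer dimensions (hidden size 4096, MLP size 14336, 32 decoder layers, 8 key/value heads of width 128) are assumptions taken from the public Mistral-7B configuration, not from this notebook. Each adapted linear layer with `in` inputs and `out` outputs contributes `r * (in + out)` weights.

```python
# Assumed Mistral-7B dimensions (public model config, not stated in this notebook).
HIDDEN = 4096
INTERMEDIATE = 14336
NUM_LAYERS = 32
KV_DIM = 8 * 128  # 8 key/value heads x head width 128 (grouped-query attention)

R = 16  # LoRA rank passed to get_peft_model

# (in_features, out_features) of every targeted projection in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    # LoRA adds two matrices per layer: A (rank x in) and B (out x rank).
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} in total")
```

Under these assumptions the total is 41,943,040, which is the trainable-parameter count Unsloth prints in the training banner.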


## Preparing Context-Aware Reply Dataset for Fine-Tuning

- Loads a labeled dataset containing `reviewContent`, `sentiment`, `categoryLabel`, and `generatedReply` columns from a CSV file, and formats each entry into an instruction-based prompt using the Alpaca format.
- Converts the cleaned DataFrame into a Hugging Face `Dataset` and applies a formatting function that wraps each data point with an instruction, input, and expected response, followed by an EOS token.


In [15]:
import pandas as pd

# Load the CSV file with encoding
df = pd.read_csv("/kaggle/input/labeling-for-response-generation/Contextual_Replies_Generated.csv", encoding="ISO-8859-1")

# Keep only the relevant columns
df = df[["reviewContent", "sentiment", "categoryLabel", "generatedReply"]]

# Show the cleaned DataFrame
df.head()


Unnamed: 0,reviewContent,sentiment,categoryLabel,generatedReply
0,I do not understand the enthusiastic reviews o...,negative,health,Apologies that the bag did not live up to your...
1,"We like the sofa cover, so soft and beautiful",positive,home,So happy to hear that you're loving the softne...
2,At first it seem correct but after 48 hours it...,negative,automotive,Sorry it turned out that way it is frustrating...
3,Sat down perfectly on the machine. Hard and du...,positive,health,Thanks for recognizing the quality! We're thri...
4,"It works quite well, reading the manual and pa...",positive,health,We really appreciate your detailed feedback! I...


In [16]:
EOS_TOKEN = tokenizer.eos_token
# Define the prompt format
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Formatting function adapted for your dataset
def formatting_prompts_func(examples):
    reviews = examples["reviewContent"]
    sentiments = examples["sentiment"]
    categories = examples["categoryLabel"]
    replies = examples["generatedReply"]
    
    texts = []
    for review, sentiment, category, reply in zip(reviews, sentiments, categories, replies):
        instruction = "Generate a helpful and context-aware reply based on the review, sentiment, and category."
        input_text = f"Review: {review}\nSentiment: {sentiment}\nCategory: {category}"
        text = alpaca_prompt.format(instruction, input_text, reply) + EOS_TOKEN
        texts.append(text)
    
    return {"text": texts}


from datasets import Dataset  # load_dataset is not needed here; pandas is already imported above

# Convert to Huggingface Dataset
dataset = Dataset.from_pandas(df)
dataset = dataset.map(formatting_prompts_func, batched=True)


Map:   0%|          | 0/100 [00:00<?, ? examples/s]
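
To make the resulting training text concrete, here is a minimal, dependency-free sketch of the same formatting step applied to a one-row batch. The EOS string `"</s>"` is a stand-in for `tokenizer.eos_token`, and the toy review and reply are hypothetical; only the prompt template and the instruction string are copied from the cell above.

```python
# Stand-in for tokenizer.eos_token; the literal "</s>" is an assumption for illustration.
EOS_TOKEN = "</s>"

ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

INSTRUCTION = "Generate a helpful and context-aware reply based on the review, sentiment, and category."


def format_batch(examples):
    """Build one training string per row from a column-oriented batch."""
    texts = []
    for review, sentiment, category, reply in zip(
        examples["reviewContent"],
        examples["sentiment"],
        examples["categoryLabel"],
        examples["generatedReply"],
    ):
        input_text = f"Review: {review}\nSentiment: {sentiment}\nCategory: {category}"
        # The EOS token marks where the reply ends, so generation can stop there.
        texts.append(ALPACA_PROMPT.format(INSTRUCTION, input_text, reply) + EOS_TOKEN)
    return {"text": texts}


# Hypothetical batch in the dict-of-lists shape that Dataset.map(batched=True) passes in.
toy_batch = {
    "reviewContent": ["Soft cover"],
    "sentiment": ["positive"],
    "categoryLabel": ["home"],
    "generatedReply": ["Glad you like it!"],
}
formatted = format_batch(toy_batch)["text"][0]
print(formatted)
```

Printing one formatted row like this before training is a cheap way to confirm that the three input fields and the reply land in the intended sections of the prompt.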

## Supervised Fine-Tuning Configuration with TRL's SFTTrainer

- Initializes the `SFTTrainer` with the preprocessed Alpaca-style dataset for fine-tuning a language model on response generation tasks.
- Uses a per-device batch size of 2 with 4 gradient-accumulation steps, a learning rate of 2e-4, and mixed-precision training (FP16 or BF16, depending on the hardware).
- Optimizer is `adamw_8bit`, training logs every 10 steps, and the output is saved to the `outputs` directory.


In [17]:
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Setting this to True can make training up to 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 10,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none", # Use this for WandB etc
    ),
)

Unsloth: Tokenizing ["text"] (num_proc=2):   0%|          | 0/100 [00:00<?, ? examples/s]

In [18]:
#@title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

GPU = Tesla T4. Max memory = 14.741 GB.
6.883 GB of memory reserved.


In [19]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 100 | Num Epochs = 3 | Total steps = 18
O^O/ \_/ \    Batch size per device = 4 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (4 x 4 x 1) = 16
 "-____-"     Trainable parameters = 41,943,040/7,000,000,000 (0.60% trained)


Step,Training Loss
10,0.9358
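
The figures in the training banner can be reconciled with a few lines of arithmetic. This is a sketch that uses the values the banner reports (note that it shows a per-device batch size of 4, whereas `TrainingArguments` sets 2) and assumes that an incomplete accumulation window at the end of an epoch does not count as an optimizer step. That assumption is not confirmed by the notebook; it simply reproduces the logged total.

```python
import math

# Values copied from the training banner.
num_examples = 100
per_device_batch = 4
grad_accum_steps = 4
data_parallel_gpus = 1
num_epochs = 3

# Examples consumed per optimizer step.
total_batch = per_device_batch * grad_accum_steps * data_parallel_gpus

# Micro-batches the dataloader yields in one epoch.
batches_per_epoch = math.ceil(num_examples / (per_device_batch * data_parallel_gpus))

# Assumption: a trailing partial accumulation window is dropped (floor division).
updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
total_steps = updates_per_epoch * num_epochs

print(total_batch, batches_per_epoch, updates_per_epoch, total_steps)
```

With 18 steps in total and `logging_steps = 10`, the loss is logged only once, at step 10, which is why the loss table has a single row.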


In [20]:
#@title Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory         /max_memory*100, 3)
lora_percentage = round(used_memory_for_lora/max_memory*100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

187.6692 seconds used for training.
3.13 minutes used for training.
Peak reserved memory = 6.883 GB.
Peak reserved memory for training = 0.0 GB.
Peak reserved memory % of max memory = 46.693 %.
Peak reserved memory for training % of max memory = 0.0 %.


## Save the Fine-Tuned Model to Hugging Face

In [25]:
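import os

# Read the access token from the environment rather than hard-coding it in the notebook.
# HF_TOKEN is an assumed variable name; set it (for example as a Kaggle secret) before running.
model.push_to_hub("AbuSalehMd/ResponseLabeling_Mistral_7B_FineTuned", token = os.environ["HF_TOKEN"]) # Online saving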

Saved model to https://huggingface.co/AbuSalehMd/ResponseLabeling_Mistral_7B_FineTuned


## Inference for Context-Aware Reply Generation

- Prepares a sample input using the Alpaca-style prompt with review content, sentiment, and category, then tokenizes it for the model.
- Enables optimized inference with `FastLanguageModel.for_inference()` and generates a reply using the `generate()` method.
- Decodes the output and extracts only the reply section following `"### Response:"` for clean post-processing.


In [22]:
sample = df.iloc[0]
# alpaca_prompt = Copied from above
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        "Generate a helpful and context-aware reply based on the review, sentiment, and category.",  # instruction
        f"Review: {sample['reviewContent']}\nSentiment: {sample['sentiment']}\nCategory: {sample['categoryLabel']}",  # input
        "" 
    )
], return_tensors = "pt").to("cuda")
# Generate output
outputs = model.generate(**inputs, max_new_tokens=128, use_cache=True)

# Decode the response
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Extract only the generated reply (after "### Response:")
if "### Response:" in decoded_output:
    reply = decoded_output.split("### Response:")[-1].strip()
else:
    reply = decoded_output.strip()

print(reply)



We appreciate your honesty and your comparison to dermatine skin. While the bag may not have fully met your expectations, we understand that not every product is perfect for everyone. We encourage you to check out our other styles and materials for better fit and durability.
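
Since the same extraction logic is reused for batch labeling later on, it can be factored into a small helper. The sketch below is one possible way to do that; the function name `extract_reply` is hypothetical and does not appear in the original notebook.

```python
RESPONSE_MARKER = "### Response:"


def extract_reply(decoded_output, marker=RESPONSE_MARKER):
    """Return the text after the last marker, or the whole text if the marker is absent."""
    _, found, tail = decoded_output.rpartition(marker)
    # rpartition returns an empty separator when the marker is missing;
    # in that case fall back to the full decoded string.
    return tail.strip() if found else decoded_output.strip()


example = "### Instruction:\nReply politely.\n\n### Response:\n Thanks for the feedback! "
print(extract_reply(example))
```

Splitting on the last occurrence keeps the helper robust when a review itself contains the marker text, but it would also truncate a reply in which the model repeats the marker.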


## Reply Generation Labeling Workflow for AliExpress Reviews

- **Data Split:** The full dataset is manually divided into 4 parts for efficient processing.
- **Reply Generation:** Each part is processed separately using Unsloth’s model with `tqdm` progress tracking, and replies are generated and saved to individual CSVs.
- **Final Merge:** All parts are combined into a single labeled CSV for the complete output.


In [29]:
# Load the dataset
df = pd.read_csv("/kaggle/input/labeling-for-response-generation/Final Mapped Product Review Of AliExpress Product Category Processed.csv")
df = df[["reviewContent", "sentiment", "categoryLabel"]]
print(len(df))
# Manually split into 4 parts
df_part1 = df.iloc[0:3230]
df_part2 = df.iloc[3230:6460]
df_part3 = df.iloc[6460:9690]
df_part4 = df.iloc[9690:12916]

12916
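
The hard-coded slice boundaries can also be derived from the row count and a chunk size, which avoids off-by-one mistakes if the dataset changes. This is a minimal sketch; `chunk_bounds` is a hypothetical helper, and the chunk size of 3230 is taken from the manual split above.

```python
def chunk_bounds(n_rows, chunk_size):
    """Return (start, stop) pairs that cover range(n_rows) in consecutive chunks."""
    return [
        (start, min(start + chunk_size, n_rows))
        for start in range(0, n_rows, chunk_size)
    ]


bounds = chunk_bounds(12916, 3230)
print(bounds)
```

Each pair can be passed straight to `df.iloc[start:stop]` to obtain the same four parts as the manual split.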


In [None]:
import pandas as pd
from tqdm import tqdm
import torch

def generate_replies_for_df(df, save_path, model, tokenizer, alpaca_prompt, max_new_tokens=128):
    """
    Generates context-aware replies for each review in the DataFrame using the given model and tokenizer.
    
    Parameters:
    - df (DataFrame): The input DataFrame containing 'reviewContent', 'sentiment', and 'categoryLabel'.
    - save_path (str): File path to save the output CSV with generated replies.
    - model: Loaded language model.
    - tokenizer: Corresponding tokenizer.
    - alpaca_prompt (str): Prompt format string.
    - max_new_tokens (int): Maximum number of tokens to generate per reply.
    """

    # Enable fast inference mode
    from unsloth import FastLanguageModel
    FastLanguageModel.for_inference(model)

    # Prepare storage
    generated_replies = []

    # Loop with progress bar
    for idx in tqdm(range(len(df)), desc="Generating replies"):
        sample = df.iloc[idx]

        # Format prompt
        prompt = alpaca_prompt.format(
            "Generate a helpful and context-aware reply based on the review, sentiment, and category.",
            f"Review: {sample['reviewContent']}\nSentiment: {sample['sentiment']}\nCategory: {sample['categoryLabel']}",
            ""
        )

        # Tokenize and move to CUDA
        inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

        # Generate output
        with torch.no_grad():
            outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True)

        # Decode
        decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)

        # Extract only the reply
        if "### Response:" in decoded:
            reply = decoded.split("### Response:")[-1].strip()
        else:
            reply = decoded.strip()

        # Append the reply
        generated_replies.append(reply)

    # Add to DataFrame
    df["generatedReply"] = generated_replies

    # Save to CSV
    df.to_csv(save_path, index=False)

    return df


In [32]:
df_with_replies = generate_replies_for_df(
    df=df_part1.copy(),
    save_path="1st_AliExpress_reviews_with_generated_replies.csv",
    model=model,
    tokenizer=tokenizer,
    alpaca_prompt=alpaca_prompt
)


Generating replies: 100%|██████████| 3230/3230 [1:53:50<00:00,  2.11s/it]  


Unnamed: 0,reviewContent,sentiment,categoryLabel,generatedReply
0,Works rather well does its job and has amazing...,positive,automotive,That's great to hear! Thanks for the detailed ...
1,I really liked I can take out my quiet dog,positive,fashion,That's great to hear! Thanks for the positive ...
2,Order received successfully and faster than ex...,positive,electronics,Thats always great newsthanks for the detail...
3,Very good the league arrived fast thank you th...,positive,automotive,That's a lot of thanks! We are happy the leagu...
4,"Really high-quality bags, buying worth, it als...",positive,fashion,That sounds great! Thanks for the detailed not...


In [33]:
df_with_replies = generate_replies_for_df(
    df=df_part2.copy(),
    save_path="2nd_AliExpress_reviews_with_generated_replies.csv",
    model=model,
    tokenizer=tokenizer,
    alpaca_prompt=alpaca_prompt
)


Generating replies: 100%|██████████| 3230/3230 [1:51:47<00:00,  2.08s/it]  


Unnamed: 0,reviewContent,sentiment,categoryLabel,generatedReply
3230,Good for the price. The flicker phenomenon of ...,neutral,home,Thank you for your feedback! We appreciate you...
3231,"Perfect, are the second I buy",positive,automotive,Thats great news! Perfect fit and performance...
3232,Why seams to make an L,negative,electronics,Apologies for the inconvenience caused by the ...
3233,Goods for two Tig. Capture of pratsyu Puzha go...,positive,electronics,Thanks for the detailed feedback! We appreciat...
3234,I'm not impressed with how it looks it doesn't...,negative,automotive,We apologize for the misleading appearance. Yo...


In [34]:
df_with_replies = generate_replies_for_df(
    df=df_part3.copy(),
    save_path="3rd_AliExpress_reviews_with_generated_replies.csv",
    model=model,
    tokenizer=tokenizer,
    alpaca_prompt=alpaca_prompt
)


Generating replies: 100%|██████████| 3230/3230 [1:51:18<00:00,  2.07s/it]  


Unnamed: 0,reviewContent,sentiment,categoryLabel,generatedReply
6460,My and it looks good is quite elastic and soft,positive,automotive,Thanks for the great feedback! We love hearing...
6461,The box came a little crushed but the contents...,neutral,electronics,Thank you for your feedback! We appreciate you...
6462,Peace be upon you a set of flash disk capacity...,positive,electronics,Assalamu alaikum! Thanks for the detailed and ...
6463,A very good product that is worth buying and t...,positive,automotive,That's great to hear! Fast delivery and a perf...
6464,Wallets are very nice. Arrived in excellent co...,positive,fashion,That sounds great! Thanks for the detailed not...


In [35]:
df_with_replies = generate_replies_for_df(
    df=df_part4.copy(),
    save_path="4th_AliExpress_reviews_with_generated_replies.csv",
    model=model,
    tokenizer=tokenizer,
    alpaca_prompt=alpaca_prompt
)


Generating replies: 100%|██████████| 3226/3226 [1:54:39<00:00,  2.13s/it]  


Unnamed: 0,reviewContent,sentiment,categoryLabel,generatedReply
9690,"This is USB 2.0, please note it when buying! T...",negative,electronics,We appreciate your honesty and clear feedback....
9691,It is a very well-made product. I like it very...,positive,automotive,That's great to hear! Thanks for the positive ...
9692,Cool do not advise people who have hair on the...,neutral,automotive,Thank you for your feedback! We appreciate you...
9693,You charged me twice...,neutral,electronics,Thank you for your feedback! We appreciate you...
9694,A little bit slippery but good quality for the...,positive,automotive,Thanks for the honest feedback! We appreciate ...
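
For planning similar labeling runs, the four progress bars can be summed into a total generation time and an average per-review latency. The sketch below only does arithmetic on the durations and row counts printed above; the dictionary keys are hypothetical labels.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string from a tqdm progress bar to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (elapsed time, rows processed) as reported by the four progress bars.
runs = {
    "part1": ("1:53:50", 3230),
    "part2": ("1:51:47", 3230),
    "part3": ("1:51:18", 3230),
    "part4": ("1:54:39", 3226),
}

total_seconds = sum(to_seconds(elapsed) for elapsed, _ in runs.values())
total_rows = sum(rows for _, rows in runs.values())
avg_seconds_per_row = total_seconds / total_rows

print(total_seconds, total_rows, round(avg_seconds_per_row, 2))
```

That works out to roughly 7.5 hours of sequential generation for the 12,916 reviews, at about 2.1 seconds per review on this setup.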


In [36]:
import pandas as pd

# Replace these with your actual file paths
file1 = "/kaggle/working/1st_AliExpress_reviews_with_generated_replies.csv"
file2 = "/kaggle/working/2nd_AliExpress_reviews_with_generated_replies.csv"
file3 = "/kaggle/working/3rd_AliExpress_reviews_with_generated_replies.csv"
file4 = "/kaggle/working/4th_AliExpress_reviews_with_generated_replies.csv"

# Read all CSVs
df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2)
df3 = pd.read_csv(file3)
df4 = pd.read_csv(file4)

# Concatenate them into a single DataFrame
combined_df = pd.concat([df1, df2, df3, df4], ignore_index=True)

# Save to a new CSV
combined_df.to_csv("Final_Labled_AliExpress_reviews_with_generated_replies.csv", index=False)



In [37]:
print(len(combined_df))

12916
