<a href="https://www.kaggle.com/code/aabdollahii/fine-tuning-a-gpt-2-model-for-a-custom-chatbot?scriptVersionId=265506448" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

<div style="background-color: #013220; color: #EAEAEA; border-radius: 15px; padding: 30px; font-family: 'Helvetica Neue', sans-serif; border: 1px solid #388E3C;">

<div style="text-align: center; font-size: 32px; font-weight: bold; color: #A5D6A7; padding-bottom: 15px; border-bottom: 2px solid #388E3C; margin-bottom: 25px;">
        Fine-Tuning a GPT-2 Model for a Custom Chatbot
    </div>

<div style="font-size: 20px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Project Overview
    </div>
    <div style="font-size: 16px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        Welcome to this hands-on guide to fine-tuning a Large Language Model (LLM). In this project, we turn a general-purpose language model, GPT-2 (in its distilled form, DistilGPT-2), into a specialized chatbot. We use a dataset of roughly 3,700 question-answer pairs to teach the model a specific conversational style. This process, known as fine-tuning, is a cornerstone of modern NLP: it lets us adapt large pre-trained models to specific tasks without the prohibitive cost of training them from scratch. By the end of this notebook, you will have a working chatbot customized with our data and a clear picture of the end-to-end fine-tuning workflow.
    </div>

<div style="font-size: 20px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        The Core Objective
    </div>
    <div style="font-size: 16px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        Our primary objective is to demonstrate the power of transfer learning in NLP. We will take the <code>distilgpt2</code> model (a smaller, faster variant of GPT-2 that is well suited to experimentation in environments like Kaggle) and train it further on a custom question-and-answer dataset. We will then generate responses with the fine-tuned model to see how its conversational style has shifted away from that of the general-purpose base model.
    </div>

 <div style="font-size: 20px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Methodology at a Glance
    </div>
    <div style="font-size: 16px; line-height: 1.7; text-align: justify;">
        Our approach is structured into a clear, step-by-step process:
        <div style="margin-top: 15px; padding-left: 20px;">
            <div style="margin-bottom: 10px;"><b>1. Environment Setup:</b> We will import essential libraries like PyTorch and Hugging Face's <code>transformers</code> and <code>datasets</code>.</div>
            <div style="margin-bottom: 10px;"><b>2. Data Loading & Preparation:</b> We will load the 3k conversations dataset and format it into a single text sequence per conversation, making it suitable for training a causal language model like GPT-2.</div>
            <div style="margin-bottom: 10px;"><b>3. Model & Tokenizer Initialization:</b> We will load the pre-trained <code>distilgpt2</code> model and its corresponding tokenizer, setting up the necessary configurations for training.</div>
            <div style="margin-bottom: 10px;"><b>4. Fine-Tuning:</b> Using the Hugging Face <code>Trainer</code> API, we will fine-tune the model on our prepared dataset. This is where the model learns the new conversational patterns.</div>
            <div style="margin-bottom: 10px;"><b>5. Inference & Evaluation:</b> Finally, we'll test our new chatbot! We will give it a set of prompts and inspect the generated answers to see what the model has picked up from the dataset.</div>
        </div>
    </div>

</div>


<div style="background-color: #013220; color: #EAEAEA; border-radius: 15px; padding: 30px; font-family: 'Helvetica Neue', sans-serif; border: 1px solid #388E3C;">

 <div style="text-align: center; font-size: 28px; font-weight: bold; color: #A5D6A7; padding-bottom: 15px; border-bottom: 2px solid #388E3C; margin-bottom: 25px;">
        Step 2: Environment Setup & Loading the Base Model/Tokenizer
    </div>

 <div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Objective of This Step
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        In this step, we prepare our Kaggle environment for working with Large Language Models using the Hugging Face <code>transformers</code> library alongside <code>PyTorch</code>. This includes installing any required dependencies, importing essential packages, and initializing the building blocks of our project — the pre-trained base model and its tokenizer.
        The tokenizer is responsible for converting human-readable text into a sequence of tokens (numerical IDs) that the model can understand, while the model itself is the neural network that processes those tokens to generate text predictions.
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Why We Use DistilGPT-2
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        We will use <code>distilgpt2</code>, which is a distilled (smaller and faster) version of GPT-2. This choice allows us to train and experiment quickly in Kaggle's GPU environment without exhausting memory or exceeding time limits, while still benefiting from the rich language understanding learned during GPT-2's original training.
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Actions We Will Perform
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify;">
        <div style="padding-left: 20px;">
            <div style="margin-bottom: 10px;"><b>1.</b> Import the necessary Python packages: <code>transformers</code>, <code>torch</code>, and optionally <code>datasets</code>.</div>
            <div style="margin-bottom: 10px;"><b>2.</b> Initialize the model <code>distilgpt2</code> using the Hugging Face model hub.</div>
            <div style="margin-bottom: 10px;"><b>3.</b> Load the corresponding tokenizer so we can transform raw text into token IDs for the model to process.</div>
            <div style="margin-bottom: 10px;"><b>4.</b> Verify that the model and tokenizer are ready for fine-tuning on our custom dataset in the next steps.</div>
        </div>
    </div>

</div>


In [1]:


# Install the Hugging Face Transformers library (if not already available in Kaggle)
#!pip install transformers datasets --quiet

# Import required libraries
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Load the tokenizer for distilgpt2
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

# Load the base model (DistilGPT-2) for Causal Language Modeling
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)

# Verify model and tokenizer are ready
sample_text = "Hello, how are you?"
encoded_input = tokenizer.encode(sample_text, return_tensors="pt").to(device)
print("Sample token IDs:", encoded_input)


Using device: cuda


Sample token IDs: tensor([[15496,    11,   703,   389,   345,    30]], device='cuda:0')


In [2]:
import pandas as pd

# Dataset path in Kaggle
dataset_path = "/kaggle/input/3k-conversations-dataset-for-chatbot/Conversation.csv"

# Load CSV file into a Pandas DataFrame
df = pd.read_csv(dataset_path)

# Show basic info about the dataset
print("Dataset Shape:", df.shape)
print("\nColumn Names:", df.columns.tolist())

# Display first few rows
print("\nSample Rows:")
print(df.head())

# Check for missing values
print("\nMissing Values Per Column:")
print(df.isnull().sum())

# Show basic statistics (if numerical columns exist)
print("\nDataset Statistics:")
print(df.describe(include='all'))


Dataset Shape: (3725, 3)

Column Names: ['Unnamed: 0', 'question', 'answer']

Sample Rows:
   Unnamed: 0                             question  \
0           0               hi, how are you doing?   
1           1        i'm fine. how about yourself?   
2           2  i'm pretty good. thanks for asking.   
3           3    no problem. so how have you been?   
4           4     i've been great. what about you?   

                                     answer  
0             i'm fine. how about yourself?  
1       i'm pretty good. thanks for asking.  
2         no problem. so how have you been?  
3          i've been great. what about you?  
4  i've been good. i'm in school right now.  

Missing Values Per Column:
Unnamed: 0    0
question      0
answer        0
dtype: int64

Dataset Statistics:
         Unnamed: 0           question             answer
count   3725.000000               3725               3725
unique          NaN               3510               3512
top             NaN  wha



<div style="background-color: #013220; color: #EAEAEA; border-radius: 15px; padding: 30px; font-family: 'Helvetica Neue', sans-serif; border: 1px solid #388E3C;">

<div style="text-align: center; font-size: 28px; font-weight: bold; color: #A5D6A7; padding-bottom: 15px; border-bottom: 2px solid #388E3C; margin-bottom: 25px;">
        Step 3: Data Preparation for Fine-Tuning
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Objective of This Step
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        The core objective here is to transform our structured dataset (with 'question' and 'answer' columns) into a format suitable for training a causal language model. GPT-2 learns by predicting the next token in a continuous sequence of text. Our current two-column format isn't a continuous sequence, so we must combine them into one. We will create a single text string for each row that clearly represents the conversational flow.
    </div>
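
To make "predicting the next token" concrete, here is a minimal sketch in plain Python. It reuses the token IDs printed for "Hello, how are you?" in the setup step. The helper name `next_token_pairs` is hypothetical and only for illustration; the real model scores all of these positions in a single forward pass rather than in a loop.

```python
# Token IDs for "Hello, how are you?" as printed by the tokenizer check earlier.
token_ids = [15496, 11, 703, 389, 345, 30]


def next_token_pairs(ids):
    """Return (context, target) pairs: each prefix is used to predict the token after it."""
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]


pairs = next_token_pairs(token_ids)
for context, target in pairs:
    print(f"context={context} -> target={target}")
```

A sequence of six tokens yields five training targets, which is why a question and its answer must sit in the same sequence: only then can the answer tokens be predicted from the question tokens.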

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Our Formatting Strategy
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        To help the model learn the structure of a conversation, we will format each row into a single string like this:
        <br><br>
        <code>"&lt;s&gt;[Q]: How are you? [A]: I'm doing great, thanks!&lt;/s&gt;"</code>
        <br><br>
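        Here, <code>&lt;s&gt;</code> and <code>&lt;/s&gt;</code> mark the <b>start</b> and <b>end</b> of a single conversational exchange. Note that the GPT-2 tokenizer does not treat these strings as special tokens: it splits them into ordinary sub-word pieces, so they act as plain-text delimiters. GPT-2's actual End-of-Sequence (EOS) token is <code>&lt;|endoftext|&gt;</code> (ID 50256). A consistent end marker still matters, because it is what teaches the model where a response is complete, which is vital for generating coherent answers during inference.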
    </div>
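
The GPT-2 tokenizer's built-in end-of-sequence string is `<|endoftext|>`. As a hedged alternative to the rule above, the sketch below ends each example with that string instead of a hand-written marker. The function name `format_pair` and the constant `GPT2_EOS` are assumptions made for this illustration; in the notebook you would pass `tokenizer.eos_token` rather than hard-coding the value.

```python
GPT2_EOS = "<|endoftext|>"  # value of tokenizer.eos_token for GPT-2 family tokenizers


def format_pair(question, answer, eos_token=GPT2_EOS):
    """Join one question/answer pair into a single training string that ends with EOS."""
    return f"[Q]: {question.strip()} [A]: {answer.strip()}{eos_token}"


example = format_pair("hi, how are you doing?", "i'm fine. how about yourself?")
print(example)
```

Whichever end marker is used, the inference prompt should follow the same layout as the training strings, so that the model sees a familiar prefix.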

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Actions We Will Perform
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify;">
        <div style="padding-left: 20px;">
            <div style="margin-bottom: 10px;"><b>1. Combine Columns:</b> We will merge the 'question' and 'answer' columns into a new column, applying our formatting rule.</div>
            <div style="margin-bottom: 10px;"><b>2. Configure Tokenizer:</b> Since <code>distilgpt2</code> does not have a default padding token, we will set its padding token to be the same as its end-of-sequence (EOS) token. This is a standard practice for this model.</div>
            <div style="margin-bottom: 10px;"><b>3. Create a Dataset Object:</b> We will convert our Pandas DataFrame into a Hugging Face <code>Dataset</code> object for efficient processing and tokenization.</div>
        </div>
    </div>

</div>
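
The effect of padding to a fixed length can be sketched without the tokenizer. The snippet below is an illustration only: `pad_to_max_length` is a hypothetical helper that imitates right-padding with `padding='max_length'`, and the 26 token IDs are copied from the first tokenized example shown in the output of this step.

```python
PAD_ID = 50256      # GPT-2 EOS ID, reused as the padding ID in this notebook
MAX_LENGTH = 128    # same max_length as the tokenization cell


def pad_to_max_length(ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Truncate or right-pad a list of token IDs and build the matching attention mask."""
    ids = ids[:max_length]
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": [1] * len(ids) + [0] * n_pad,
    }


# The 26 real token IDs of the first formatted example.
first_example = [27, 82, 36937, 48, 5974, 23105, 11, 703, 389, 345, 1804, 30, 685,
                 32, 5974, 1312, 1101, 3734, 13, 703, 546, 3511, 30, 3556, 82, 29]

encoded = pad_to_max_length(first_example)
print(len(encoded["input_ids"]), sum(encoded["attention_mask"]))  # prints: 128 26
```

Only 26 of the 128 positions carry real text here. Fixed-length padding is simple, but for short dialogue turns most of each sequence is padding, so padding each batch to its longest example is a common way to save compute.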


In [3]:
# Step 3: Data Preparation for Fine-Tuning



from datasets import Dataset

# --- 1. Format the data into a single string per row ---
# We format each row as "<s>[Q]: {question} [A]: {answer}</s>".
# Note: the GPT-2 tokenizer has no <s> or </s> special tokens, so these markers are
# tokenized as ordinary text; they serve as plain-text delimiters for each exchange.

def format_conversation(row):
    return f"<s>[Q]: {row['question']} [A]: {row['answer']}</s>"

# Apply the formatting to the DataFrame
df['text'] = df.apply(format_conversation, axis=1)

print("Sample of formatted data:")
print(df['text'].head())


# --- 2. Configure the tokenizer ---
# The distilgpt2 model doesn't have a default padding token.
# We'll use the end-of-sequence (EOS) token as our padding token.
# This is a common practice for GPT-style models.
tokenizer.pad_token = tokenizer.eos_token


# --- 3. Create a Hugging Face Dataset object ---
# The Hugging Face Trainer API works best with 'Dataset' objects.
# We will convert our formatted text column from the pandas DataFrame
# into this specialized format.

# First, create a new DataFrame with only the text we need
training_data = pd.DataFrame({'text': df['text']})

# Convert the pandas DataFrame to a Hugging Face Dataset
dataset = Dataset.from_pandas(training_data)

# --- 4. Tokenize the entire dataset ---
# We'll apply the tokenizer to all our text examples. The tokenizer will
# convert the text into numerical IDs that the model can understand.
def tokenize_function(examples):
    return tokenizer(examples['text'], truncation=True, padding='max_length', max_length=128)

tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=['text'])

print("\nDataset has been processed and tokenized.")
print("A single example from the tokenized dataset:")
print(tokenized_dataset[0])


Sample of formatted data:
0    <s>[Q]: hi, how are you doing? [A]: i'm fine. ...
1    <s>[Q]: i'm fine. how about yourself? [A]: i'm...
2    <s>[Q]: i'm pretty good. thanks for asking. [A...
3    <s>[Q]: no problem. so how have you been? [A]:...
4    <s>[Q]: i've been great. what about you? [A]: ...
Name: text, dtype: object




Dataset has been processed and tokenized.
A single example from the tokenized dataset:
{'input_ids': [27, 82, 36937, 48, 5974, 23105, 11, 703, 389, 345, 1804, 30, 685, 32, 5974, 1312, 1101, 3734, 13, 703, 546, 3511, 30, 3556, 82, 29, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256, 50256], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1

<div style="background-color: #013220; color: #EAEAEA; border-radius: 15px; padding: 30px; font-family: 'Helvetica Neue', sans-serif; border: 1px solid #388E3C;">

 <div style="text-align: center; font-size: 28px; font-weight: bold; color: #A5D6A7; padding-bottom: 15px; border-bottom: 2px solid #388E3C; margin-bottom: 25px;">
        Step 4: Fine-Tuning the Language Model
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Objective of This Step
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        This is the core of our project. We will now take the pre-trained <code>distilgpt2</code> model and train it further on our specially prepared conversational dataset. This process, known as fine-tuning, adjusts the model's internal weights to specialize its knowledge and conversational style to match our data. To manage the complexities of a training loop (like batching data, calculating loss, and updating model weights), we will use the high-level <code>Trainer</code> API from the Hugging Face library, which simplifies the entire process immensely.
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Key Components: <code>TrainingArguments</code> and <code>Trainer</code>
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        The <code>Trainer</code> API requires two main components:
        <br><br>
        1. <b><code>TrainingArguments</code></b>: This is a configuration class where we define all the hyperparameters for our training run. This includes the number of epochs (how many times to go through the data), the batch size, the learning rate, and where to save the trained model.
        <br><br>
        2. <b><code>Trainer</code></b>: This is the main class that orchestrates the training. We simply provide it with our model, the training arguments, and our tokenized dataset. It handles the entire training loop automatically.
    </div>
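
It helps to estimate how long a run will be before launching it. The arithmetic below is a rough sketch using the values from the training cell in this step (3,725 examples, batch size 2, one epoch, logging every 500 steps). It assumes a single GPU and no gradient accumulation; with more devices, the effective batch size grows and the step count shrinks accordingly.

```python
import math

num_examples = 3725          # rows in the dataset
per_device_batch_size = 2    # per_device_train_batch_size
num_epochs = 1               # num_train_epochs
logging_steps = 500          # logging_steps
num_devices = 1              # assumption: one GPU, no gradient accumulation

# The last, partially filled batch still counts as one step.
steps_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
total_steps = steps_per_epoch * num_epochs
logged_at = list(range(logging_steps, total_steps + 1, logging_steps))

print(total_steps, logged_at)  # prints: 1863 [500, 1000, 1500]
```

Under these assumptions the 100 warmup steps cover about 5% of the run, and a lower `logging_steps` value would give a finer view of the loss curve.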

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Actions We Will Perform
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify;">
        <div style="padding-left: 20px;">
            <div style="margin-bottom: 10px;"><b>1. Define Training Arguments:</b> We'll set up the training configuration, choosing parameters that are well-suited for a Kaggle environment.</div>
            <div style="margin-bottom: 10px;"><b>2. Instantiate the Trainer:</b> We will create an instance of the <code>Trainer</code>, passing it our model, dataset, and arguments.</div>
            <div style="margin-bottom: 10px;"><b>3. Launch Training:</b> We will call the <code>.train()</code> method to start the fine-tuning process. You will see the training loss decrease over time, indicating that the model is learning.</div>
            <div style="margin-bottom: 10px;"><b>4. Save the Final Model:</b> Once training is complete, we will save our newly fine-tuned model to a directory so we can use it for inference in the next step.</div>
        </div>
    </div>

</div>
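
The training cell for this step relies on `DataCollatorForLanguageModeling` with `mlm=False` to build the `labels`. To our understanding, the collator copies `input_ids` into `labels` and replaces every position holding the padding ID with `-100`, the value the loss function ignores. The sketch below imitates that rule in plain Python; `make_causal_labels` is a hypothetical helper, not a library function.

```python
IGNORE_INDEX = -100   # label value that PyTorch's cross-entropy loss skips by default
PAD_ID = 50256        # pad ID == EOS ID in this notebook


def make_causal_labels(input_ids, pad_id=PAD_ID):
    """Copy input_ids into labels, masking every position that holds pad_id."""
    return [IGNORE_INDEX if tok == pad_id else tok for tok in input_ids]


# Three real tokens, one genuine EOS, then two padding positions.
input_ids = [23105, 11, 703, 50256, 50256, 50256]
labels = make_causal_labels(input_ids)
print(labels)  # prints: [23105, 11, 703, -100, -100, -100]
```

Because the padding ID and the EOS ID are the same number, a genuine EOS at the end of an example is masked together with the padding. If that is indeed how the collator behaves in your installed version, the model gets no training signal for emitting EOS, which may contribute to responses that run on past the first answer.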


In [4]:
# ===============================================
# Step 4: Fine-Tuning the Model
# ===============================================

from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

# --- 1. Define Training Arguments ---
# One epoch with a small batch size keeps the run short on a Kaggle GPU.
training_args = TrainingArguments(
    output_dir="./chatbot_model",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    warmup_steps=100,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=500,
    save_strategy="epoch",
    report_to="none"
)

# --- 2. Create the Data Collator ---
# This collator will automatically create the 'labels' needed for the loss calculation.
# For GPT-2 style models (Causal LM), we set mlm=False.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, 
    mlm=False
)

# --- 3. Instantiate the Trainer ---
# The data collator is passed in so that each batch includes 'labels'.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
)

# --- 4. Launch Training ---
print("Starting the fine-tuning process...")
trainer.train()
print("Fine-tuning complete.")

# --- 5. Save the Final Model and Tokenizer ---
trainer.save_model("./chatbot_model")
tokenizer.save_pretrained("./chatbot_model")

print("\nModel and tokenizer have been saved to './chatbot_model'")


Starting the fine-tuning process...


`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.


Step,Training Loss
500,1.1209


Fine-tuning complete.

Model and tokenizer have been saved to './chatbot_model'


<div style="background-color: #013220; color: #EAEAEA; border-radius: 15px; padding: 30px; font-family: 'Helvetica Neue', sans-serif; border: 1px solid #388E3C;">

<div style="text-align: center; font-size: 28px; font-weight: bold; color: #A5D6A7; padding-bottom: 15px; border-bottom: 2px solid #388E3C; margin-bottom: 25px;">
        Step 5: Inference with the Fine-Tuned Chatbot Model
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Objective of This Step
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify; margin-bottom: 20px;">
        In this step, we load our fine-tuned GPT-2 variant (<code>distilgpt2</code>) from the <code>./chatbot_model</code> directory and use it to generate chatbot responses to custom prompts. We use the Hugging Face <code>generate()</code> method with sampling to produce text that reflects the conversational style learned during fine-tuning.
    </div>

<div style="font-size: 18px; font-weight: bold; color: #81C784; margin-top: 25px; margin-bottom: 10px;">
        Key Actions to Perform
    </div>
    <div style="font-size: 15px; line-height: 1.7; text-align: justify;">
        <div style="padding-left: 20px;">
            <div style="margin-bottom: 10px;"><b>1.</b> Load the saved model and tokenizer from <code>./chatbot_model</code>.</div>
            <div style="margin-bottom: 10px;"><b>2.</b> Prepare a sample prompt that simulates a chatbot conversation (e.g., <code>[Q]: How are you today?</code>).</div>
            <div style="margin-bottom: 10px;"><b>3.</b> Tokenize the prompt and pass it to the model's <code>generate()</code> function to produce a continuation.</div>
            <div style="margin-bottom: 10px;"><b>4.</b> Decode the generated tokens back into text for human-readable output.</div>
        </div>
    </div>

</div>


In [5]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import re

# 1. Load fine-tuned model & tokenizer
model_path = "./chatbot_model" 
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token

# Helper function: Clean model output from special sequences
def clean_response(text, prompt):
    """
    This function removes the original prompt and special tokens 
    from the generated text.
    """
    # Remove the original prompt part
    reply = text[len(prompt):]
    # Strip the <s> / </s> delimiter strings and surrounding whitespace
    reply = re.sub(r"<s>|</s>", "", reply)
    return reply.strip()

# 2. Define a list of questions to ask the bot
questions = [
    "How are you today?",
    "what is your name?",
    "Do you feel Good?"
]

print("--- Starting Batch Inference ---")

# 3. Loop through the predefined questions and get answers
for user_q in questions:
    print(f"\n[Q]: {user_q}")
    
    # Format the prompt
    prompt = f"[Q]: {user_q} [A]:"
    
    # Tokenize the input
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    
    # Generate a response from the model
    
    output_ids = model.generate(
        **inputs,
        max_length=100,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )
    
    # Decode the generated token IDs back to a string
    generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=False) # Keep special tokens for cleaning
    
    # Clean the response to show only the bot's answer
    reply = clean_response(generated_text, prompt)
    
    print(f"[Bot]: {reply}")

print("\n--- Batch Inference Finished ---")


--- Starting Batch Inference ---

[Q]: How are you today?
[Bot]: i'm in the middle of a busy day.)





[A]: i'm so tired.







<|endoftext|>

[Q]: what is your name?
[Bot]: i'm from the Philippines, so i'm from the Philippines.)

[Q]: Do you feel Good?
[Bot]: yes, i feel good.</Q]: so, what do you mean?</Q]: i feel bad.</Q]: i have to thank you.




<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>


</script>


<|endoftext|>

--- Batch Inference Finished ---
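
The outputs above run on past the first answer, so a stricter post-processing step may be useful. The sketch below keeps only the text before the first delimiter or line break. `first_answer` and `STOP_PATTERN` are hypothetical names introduced for this illustration, and the sample string is an abridged copy of the first output above.

```python
import re

# Strings that signal the end of the first answer in this notebook's prompt format.
STOP_PATTERN = re.compile(r"</s>|<s>|<\|endoftext\|>|\[Q\]:|\[A\]:")


def first_answer(generated_text, prompt):
    """Drop the prompt, then keep only the text before the first stop marker or newline."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    reply = STOP_PATTERN.split(generated_text, maxsplit=1)[0].strip()
    return reply.split("\n", 1)[0].strip()


prompt = "[Q]: How are you today? [A]:"
raw = prompt + " i'm in the middle of a busy day.)\n\n\n[A]: i'm so tired.\n<|endoftext|>"
print(first_answer(raw, prompt))  # prints: i'm in the middle of a busy day.)
```

This only trims what is displayed. Malformed markers such as `</Q]:` are not covered by the pattern, and the underlying tendency to keep generating is a training-side issue.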


<div style="background-color:#121212; color:#f0f0f0; padding:20px; border-radius:10px; font-family:Arial, sans-serif; font-size:15px;">
    <h2 style="color:#77dd77;">📌 Chatbot Batch Inference (Non‑Interactive)</h2>
    <p>
        This notebook runs a <b>fine‑tuned Transformer chatbot model</b> in a 
        <span style="color:#add8e6;">non-interactive batch mode</span> so it can execute on Kaggle without requiring 
        <code>input()</code> from the user.  
        We define a <span style="color:#add8e6;">list of pre‑written questions</span> and the model generates responses for each in sequence.
    </p>
    <p style="color:#ffcccb;">
        ⚠ Note: Model outputs may <b>sometimes be inaccurate or incomplete</b>. The training dataset is small (3,725 rows) and the model 
        was fine-tuned for only one epoch, both of which limit how well it generalizes.
    </p>
    <h3 style="color:#f0e68c;">Key Features:</h3>
    <ul>
        <li>✔ Pre‑defined questions list for automatic execution</li>
        <li>✔ Custom prompt formatting (<code>[Q]: ... [A]:</code>)</li>
        <li>✔ Response cleaning to remove special tokens</li>
        <li>✔ Tunable generation parameters (<code>temperature</code>, <code>top_k</code>, <code>top_p</code>)</li>
    </ul>
</div>
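
To give a feel for what `temperature`, `top_k`, and `top_p` do, here is a simplified sketch in plain Python over a toy set of token scores. It is not the library's implementation: `filter_distribution` and the toy logits are made up for this illustration, and the order of operations (temperature, then top-k, then top-p) is our assumption about the usual convention.

```python
import math


def filter_distribution(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return renormalized probabilities of the tokens that survive top-k and top-p."""
    # Temperature: divide logits before softmax; values below 1 sharpen the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    # Top-k: keep only the k highest-scoring tokens.
    ranked = sorted(scaled.items(), key=lambda item: item[1], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = ranked[0][1]
    weights = [(tok, math.exp(value - peak)) for tok, value in ranked]
    total = sum(w for _, w in weights)
    probs = [(tok, w / total) for tok, w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


toy_logits = {"fine": 2.0, "good": 1.5, "tired": 0.5, "banana": -3.0}
print(filter_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.8))
```

With these toy scores, top-k drops "banana" and top-p then drops "tired", so the next token would be sampled from "fine" and "good" only. Lowering `temperature` or `top_p` makes the output more predictable, while raising them makes it more varied.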
