# GPT-2 Math Test

## 1. Introduction
This notebook provides an interactive environment for exploring how well a fine-tuned GPT-2 model solves math problems. You can load the fine-tuned model alongside a base GPT-2 model, adjust generation parameters on the fly, run predefined test cases against both models, and pose your own math questions to the fine-tuned model. This makes it easy to experiment and observe how each parameter affects the model's output.

## 2. Run It!
This notebook has two parts: a comparison mode, which runs a set of predefined test cases through both the base GPT-2 model and the fine-tuned model, and an interactive mode, which lets you enter your own math problems and get answers from the fine-tuned model.

### *Part A: Comparison Between Base GPT-2 Model and the Fine-tuned One* 

In [15]:
# --- Import Libraries ---
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from datetime import datetime
import ipywidgets as widgets
from IPython.display import display, HTML, clear_output

In [13]:
# --- Global Variables for Models and Tokenizers ---
# These are populated when the "Load Models" button (defined in a later cell) is clicked
fine_tuned_model = None
fine_tuned_tokenizer = None
base_model = None
base_tokenizer = None
current_device = None # To store the determined device

The following section sets up **interactive widgets** that allow you to easily **adjust various parameters** for the GPT-2 models, such as model paths, generation length, temperature, and device settings, directly within the notebook interface.

In [16]:
# --- Interactive Parameter Widgets ---
# Define widgets for user input
model_path_widget = widgets.Text(
    value="../models/fine_tuned_gpt2_math_2",
    description="Fine-tuned Model Path:",
    placeholder="e.g., ../models/fine_tuned_gpt2_math_2",
    layout=widgets.Layout(width='80%')
)

base_model_name_widget = widgets.Text(
    value="gpt2",
    description="Base Model Name:",
    placeholder="e.g., gpt2, gpt2-medium",
    layout=widgets.Layout(width='80%')
)

cache_dir_widget = widgets.Text(
    value="../models",
    description="Cache Directory:",
    placeholder="e.g., ../models",
    layout=widgets.Layout(width='80%')
)

max_length_widget = widgets.IntSlider(
    value=256,
    min=50,
    max=512,
    step=10,
    description="Max Generation Length:",
    continuous_update=False, # Only update on release
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

temperature_widget = widgets.FloatSlider(
    value=0.7,
    min=0.1,
    max=2.0,
    step=0.1,
    description="Temperature:",
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='.1f'
)

top_p_widget = widgets.FloatSlider(
    value=0.9,
    min=0.1,
    max=1.0,
    step=0.05,
    description="Top-p:",
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='.2f'
)

device_widget = widgets.Dropdown(
    options=["auto", "cuda", "cpu"],
    value="auto",
    description="Device:",
)

output_dir_widget = widgets.Text(
    value="../res",
    description="Output Directory:",
    placeholder="e.g., ../res",
    layout=widgets.Layout(width='80%')
)

# Display all widgets
print("Adjust parameters below and then run the 'Load Models' cell:")
display(
    model_path_widget,
    base_model_name_widget,
    cache_dir_widget,
    max_length_widget,
    temperature_widget,
    top_p_widget,
    device_widget,
    output_dir_widget
)

Adjust parameters below and then run the 'Load Models' cell:


Text(value='../models/fine_tuned_gpt2_math_2', description='Fine-tuned Model Path:', layout=Layout(width='80%'…

Text(value='gpt2', description='Base Model Name:', layout=Layout(width='80%'), placeholder='e.g., gpt2, gpt2-m…

Text(value='../models', description='Cache Directory:', layout=Layout(width='80%'), placeholder='e.g., ../mode…

IntSlider(value=256, continuous_update=False, description='Max Generation Length:', max=512, min=50, step=10)

FloatSlider(value=0.7, continuous_update=False, description='Temperature:', max=2.0, min=0.1, readout_format='…

FloatSlider(value=0.9, continuous_update=False, description='Top-p:', max=1.0, min=0.1, step=0.05)

Dropdown(description='Device:', options=('auto', 'cuda', 'cpu'), value='auto')

Text(value='../res', description='Output Directory:', layout=Layout(width='80%'), placeholder='e.g., ../res')
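
To build intuition for the Temperature and Top-p sliders, here is a minimal pure-Python sketch of what the two settings do to a next-token distribution. It is an illustration only, not the `transformers` implementation; the function names `apply_temperature` and `top_p_filter` and the toy logits are hypothetical.

```python
import math

def apply_temperature(logits, temperature):
    """Divide logits by the temperature, then softmax them into probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p,
    then renormalise the kept probabilities so they sum to 1."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four candidate tokens
for t in (0.3, 0.7, 1.5):
    probs = apply_temperature(toy_logits, t)
    nucleus = top_p_filter(probs, top_p=0.9)
    print(f"temperature={t}: top prob={max(probs):.2f}, tokens kept={sorted(nucleus)}")
```

Lower temperatures concentrate probability on the highest-scoring token, and lower top-p values shrink the set of tokens that can be sampled. For arithmetic, where there is usually a single correct answer, smaller values of both are a reasonable starting point.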

**Run the cell below to define the model-loading and answer-generation functions, then click the "Load Models" button it displays.**

In [17]:
# --- Load Models and Tokenizers (Interactive Function) ---
# This cell will now use the values from the widgets
def load_models_and_tokenizers(
    model_path,
    base_model_name,
    cache_dir,
    selected_device,
    output_dir
):
    global fine_tuned_model, fine_tuned_tokenizer, base_model, base_tokenizer, current_device

    clear_output(wait=True) # Clear previous output to show fresh loading status
    print("Loading models with parameters:")
    print(f"  Fine-tuned Model Path: {model_path}")
    print(f"  Base Model Name: {base_model_name}")
    print(f"  Cache Directory: {cache_dir}")
    print(f"  Selected Device: {selected_device}")
    print(f"  Output Directory: {output_dir}")
    print("-" * 50)

    # Determine the device
    if selected_device == "auto":
        current_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    else:
        current_device = torch.device(selected_device)

    print(f"Using device: {current_device}")

    # Create output directory if it doesn't exist
    os.makedirs(output_dir, exist_ok=True)

    # Check if the fine-tuned model exists
    if not os.path.exists(model_path):
        print(f"Error: Fine-tuned model directory not found at '{model_path}'.")
        print("Please ensure the training program has been run and the model saved correctly.")
        fine_tuned_model = None
        fine_tuned_tokenizer = None
    else:
        print("Loading fine-tuned model and tokenizer...")
        try:
            fine_tuned_model = AutoModelForCausalLM.from_pretrained(model_path, cache_dir=cache_dir)
            fine_tuned_tokenizer = AutoTokenizer.from_pretrained(model_path, cache_dir=cache_dir)
            # Ensure pad_token exists for consistent generation behavior
            if fine_tuned_tokenizer.pad_token is None:
                fine_tuned_tokenizer.pad_token = fine_tuned_tokenizer.eos_token
                if fine_tuned_model.config.pad_token_id is None:
                    fine_tuned_model.config.pad_token_id = fine_tuned_model.config.eos_token_id
            fine_tuned_model.to(current_device)
            fine_tuned_model.eval() # Set to evaluation mode
            print("Fine-tuned model loaded successfully.")
        except Exception as e:
            print(f"Error loading fine-tuned model or tokenizer from '{model_path}': {e}")
            print("Please verify the path and ensure the directory contains valid model files.")
            fine_tuned_model = None
            fine_tuned_tokenizer = None

    print(f"Loading base model '{base_model_name}' and tokenizer...")
    try:
        base_model = AutoModelForCausalLM.from_pretrained(base_model_name, cache_dir=cache_dir)
        base_tokenizer = AutoTokenizer.from_pretrained(base_model_name, cache_dir=cache_dir)
        # Ensure pad_token exists for consistent generation behavior
        if base_tokenizer.pad_token is None:
             base_tokenizer.pad_token = base_tokenizer.eos_token
             if base_model.config.pad_token_id is None:
                 base_model.config.pad_token_id = base_model.config.eos_token_id
        base_model.to(current_device)
        base_model.eval() # Set to evaluation mode
        print("Base model loaded successfully.")
    except Exception as e:
        print(f"Error loading base model or tokenizer '{base_model_name}': {e}")
        print("Please check the model name and your internet connection.")
        base_model = None
        base_tokenizer = None

# Loading models can be slow, so instead of reloading on every widget change
# (as widgets.interactive would), an explicit button triggers
# load_models_and_tokenizers with the current widget values.
load_button = widgets.Button(description="Load Models")
output_area = widgets.Output()

def on_load_button_clicked(b):
    with output_area:
        load_models_and_tokenizers(
            model_path_widget.value,
            base_model_name_widget.value,
            cache_dir_widget.value,
            device_widget.value,
            output_dir_widget.value
        )

load_button.on_click(on_load_button_clicked)

display(load_button, output_area)

# --- Answer Generation Function ---
def generate_answer(model, tokenizer, question):
    """Generates an answer to a math problem using the specified model and tokenizer."""
    if model is None or tokenizer is None:
        return "Error: Model or tokenizer not loaded. Please run the 'Load Models' cell first."

    # Use the values from the widgets for generation parameters
    max_length = max_length_widget.value
    temperature = temperature_widget.value
    top_p = top_p_widget.value

    input_text = f"Question: {question}\nAnswer: "
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True, return_attention_mask=True)
    input_ids = inputs.input_ids.to(current_device)
    attention_mask = inputs.attention_mask.to(current_device)

    with torch.no_grad():
        output_sequences = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            max_length=max_length,  # Total length: prompt tokens plus newly generated tokens
            num_return_sequences=1,
            pad_token_id=tokenizer.pad_token_id,
            do_sample=True,
            temperature=temperature,
            top_p=top_p,
            early_stopping=False
        )

    num_prompt_tokens = input_ids.shape[1]
    generated_ids = output_sequences[0][num_prompt_tokens:]
    answer = tokenizer.decode(generated_ids, skip_special_tokens=True).strip()

    return answer

Button(description='Load Models', style=ButtonStyle())

Output()
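
`generate_answer` returns everything the model produces after the `Question: ...\nAnswer: ` prompt. A model that has not learned to stop may keep emitting further `Answer:` or `Question:` lines until `max_length` is reached. One possible post-processing step, sketched below under that assumption, cuts the decoded text at the first such marker. The helper name `truncate_answer`, its default markers, and the sample string are hypothetical and not part of the notebook.

```python
def truncate_answer(generated_text, stop_markers=("\nQuestion:", "\nAnswer:")):
    """Cut the decoded text at the earliest stop marker, if any marker appears."""
    cut = len(generated_text)
    for marker in stop_markers:
        idx = generated_text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return generated_text[:cut].strip()

# Made-up text imitating a model that repeats itself after the first answer
sample = (
    "The area is 40 square meters.\n"
    "Answer:  The area is 40 square meters.\n"
    "Answer:  The area"
)
print(truncate_answer(sample))
```

If adopted, the helper could wrap the string returned by `generate_answer` before it is printed or written to the results file.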

In [18]:
# --- Run Test Cases ---
print("\n===== Running Tests and Saving Results =====")

test_problems = [
    "If the length of a rectangle is 8 meters and the width is 5 meters, what is its area?",
    "Ming has 12 apples. He gives 3 to Hong and 2 to Gang. How many apples does Ming have left?",
    "The radius of a circle is 6 centimeters. What is its circumference? (Take π as 3.14)",
    "A train travels at a speed of 72 kilometers per hour. How many kilometers can it travel in 2.5 hours?",
    "A shop has a batch of goods. On the first day, 1/3 of the total was sold. On the second day, 40% of the remainder was sold. If 120 items are left, how many items were there originally in this batch?"
]

# Define output file path with a timestamp
output_dir = output_dir_widget.value # Get the output directory from the widget
os.makedirs(output_dir, exist_ok=True) # The widget value may have changed since the models were loaded
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_filename = f"math_test_results_{timestamp}.txt"
output_filepath = os.path.join(output_dir, output_filename)

with open(output_filepath, "w", encoding="utf-8") as f:
    f.write("Math Problem Test Results\n")
    f.write(f"Timestamp: {timestamp}\n")
    f.write(f"Fine-tuned Model: {model_path_widget.value}\n")
    f.write(f"Base Model: {base_model_name_widget.value}\n")
    f.write(f"Device: {current_device}\n")
    f.write(f"Max Length: {max_length_widget.value}\n")
    f.write(f"Temperature: {temperature_widget.value}\n")
    f.write(f"Top-p: {top_p_widget.value}\n")
    f.write("-" * 50 + "\n\n")

    for i, problem in enumerate(test_problems, 1):
        print(f"\nProcessing Problem {i}...")
        f.write(f"===== Problem {i} =====\n")
        f.write(f"Question: {problem}\n\n")

        # Generate answer from base model
        if base_model and base_tokenizer:
            print("Generating answer with Base Model...")
            base_answer = generate_answer(base_model, base_tokenizer, problem)
            f.write(f"Base Model Answer: {base_answer}\n\n")
            print(f"Base Model Answer: {base_answer}")
        else:
            f.write("Base Model not loaded, skipping generation.\n\n")
            print("Base Model not loaded, skipping generation.")

        # Generate answer from fine-tuned model
        if fine_tuned_model and fine_tuned_tokenizer:
            print("Generating answer with Fine-tuned Model...")
            fine_tuned_answer = generate_answer(fine_tuned_model, fine_tuned_tokenizer, problem)
            f.write(f"Fine-tuned Model Answer: {fine_tuned_answer}\n")
            print(f"Fine-tuned Model Answer: {fine_tuned_answer}")
        else:
            f.write("Fine-tuned Model not loaded, skipping generation.\n")
            print("Fine-tuned Model not loaded, skipping generation.")

        f.write("-" * 50 + "\n\n")

print(f"\n===== Testing Complete. Results saved to {output_filepath} =====")


===== Running Tests and Saving Results =====

Processing Problem 1...
Generating answer with Base Model...
Base Model Answer: It is the area of the rectangle, the width and height of the rectangle, the area of the rectangle, the width and height of the rectangle.
Answer:  It is the area of the rectangle, the width and height of the rectangle, the area of the rectangle, the width and height of the rectangle.
Answer:  It is the area of the rectangle, the width and height of the rectangle, the area of the rectangle, the width and height of the rectangle.
Answer:  It is the area of the rectangle, the width and height of the rectangle, the area of the rectangle, the width and height of the rectangle.
Answer:  It is the area of the rectangle, the width and height of the rectangle, the area of the rectangle, the width and height of the rectangle.
Answer:  It is the area of the rectangle, the width and height of the rectangle, the area of the rectangle, the width and height of the rectangle.
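
When reading the generated answers, it helps to have the expected values at hand. The sketch below works out the five test problems directly from their wording. The dictionary name `reference_answers` is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

reference_answers = {
    1: 8 * 5,                        # rectangle area in square meters
    2: 12 - 3 - 2,                   # apples Ming has left
    3: 2 * Fraction("3.14") * 6,     # circumference in cm, taking pi as 3.14
    4: 72 * Fraction("2.5"),         # distance in kilometers
    # Day 1 leaves 2/3 of the goods; day 2 leaves 60% of that remainder.
    5: 120 / ((1 - Fraction(1, 3)) * (1 - Fraction(40, 100))),
}

for number, value in reference_answers.items():
    print(f"Problem {number}: {float(value):g}")
```

For Problem 5, the fraction remaining after both days is 2/3 × 3/5 = 2/5 of the original batch, so the batch size is 120 ÷ 2/5.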


### *Part B: Interactive Math Test for the Fine-tuned GPT-2 Model*

In [19]:
# --- Interactive Mode ---
print("\n===== Interactive Mode (using Fine-tuned Model) =====")
print("Enter a math problem below and click 'Generate Answer'.")

# Create a text area for user input in interactive mode
interactive_input = widgets.Textarea(
    value="",
    placeholder="Enter a math problem here...",
    description="Problem:",
    layout=widgets.Layout(width='80%', height='100px')
)

# Create a button to generate answer
generate_button = widgets.Button(description="Generate Answer")

# Create an output area for the model's answer
interactive_output_area = widgets.Output()

def on_generate_button_clicked(b):
    with interactive_output_area:
        clear_output(wait=True)
        user_input = interactive_input.value.strip()
        if not user_input:
            print("Input cannot be empty. Please enter a problem.")
            return

        if fine_tuned_model and fine_tuned_tokenizer:
            print("Generating answer...")
            answer = generate_answer(fine_tuned_model, fine_tuned_tokenizer, user_input)
            print(f"Model Answer: {answer}")
        else:
            print("Fine-tuned model not available for interactive mode. Please load models first.")

generate_button.on_click(on_generate_button_clicked)

# Display interactive widgets
display(interactive_input, generate_button, interactive_output_area)

# Optional: Clear CUDA cache if using GPU, to free up memory
if current_device and current_device.type == 'cuda':
    torch.cuda.empty_cache()
    print("CUDA cache cleared.")


===== Interactive Mode (using Fine-tuned Model) =====
Enter a math problem below and click 'Generate Answer'.


Textarea(value='', description='Problem:', layout=Layout(height='100px', width='80%'), placeholder='Enter a ma…

Button(description='Generate Answer', style=ButtonStyle())

Output()

CUDA cache cleared.
