# Assignment: Code-Focused Inference

**Objective:** Build a system using a pre-trained GPT-2 model that answers **only** Python coding questions and rejects everything else.

In [None]:
# 1. Install and Import Dependencies
!pip install transformers torch

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

In [None]:
# 2. Load Model and Tokenizer
# We use the small 'gpt2' model for efficiency.
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Ensure pad token is set (GPT-2 does not have one by default)
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.eos_token_id

# 3. Implement Filtering & Generation Mechanism

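We will use **Few-Shot Prompting** to guide the model. The prompt contains worked examples of coding and non-coding questions, and the model is expected to continue the pattern. No weights are updated: the examples only condition the next-token predictions, so a small model like GPT-2 may not follow the pattern reliably.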

**Pattern:**
- Non-coding input -> "I can only answer Python coding questions."
- Coding input -> [Code Solution]
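
The pattern above can be sketched as a small prompt builder. This is only an illustration of how the example pairs turn into a few-shot prompt; `REFUSAL`, `FEW_SHOT_EXAMPLES`, and `build_prompt` are hypothetical names that the rest of this notebook does not use.

```python
# Hypothetical helper names; the example pairs mirror the pattern described above.
REFUSAL = "I can only answer Python coding questions."

FEW_SHOT_EXAMPLES = [
    ("What is the capital of France?", REFUSAL),
    ("How do I print hello in Python?", 'print("Hello World")'),
    ("Who won the world cup?", REFUSAL),
    ("How do I create a list?", "my_list = [1, 2, 3]"),
]


def build_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Format the example pairs as User/Assistant turns, then append the new question."""
    turns = [f"User: {q}\nAssistant: {a}" for q, a in examples]
    # End with an open "Assistant:" turn so the model continues with an answer.
    turns.append(f"User: {question}\nAssistant:")
    return "\n\n".join(turns)


print(build_prompt("How do I reverse a string?"))
```

Keeping the examples in a list makes it easy to add or reorder demonstrations without editing a long string literal.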

In [None]:
def ask_python_bot(question, max_new_tokens=50):
    """
    Generates a response if the question is about Python, otherwise refuses.
    """
    
    # Few-shot prompt to teach the behavior
    prompt = f"""
User: What is the capital of France?
Assistant: I can only answer Python coding questions.

User: How do I print hello in Python?
Assistant: print("Hello World")

User: Who won the world cup?
Assistant: I can only answer Python coding questions.

User: How do I create a list?
Assistant: my_list = [1, 2, 3]

User: {question}
Assistant:"""

    # Tokenize input
    inputs = tokenizer(prompt, return_tensors="pt")
    
    # Generate response
    # temperature=0.1 keeps sampling close to greedy decoding. Because
    # do_sample=True, the output is still not strictly deterministic;
    # set do_sample=False (and drop temperature) for reproducible output.
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
            temperature=0.1,  # Low temperature: less random, more focused
            do_sample=True,
            stop_strings=["\nUser:"],  # Stop if the model starts a new user turn
            tokenizer=tokenizer  # Needed by generate() when stop_strings is set
        )
    
    # Decode only the newly generated tokens.
    # generate() returns the prompt tokens followed by the continuation,
    # so we drop the prompt by token count rather than by character count.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = outputs[0][prompt_length:]
    response_text = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
    
    # Cleanup: the stop string may remain at the end of the output,
    # so cut the response at the first "User:" marker.
    if "User:" in response_text:
        response_text = response_text.split("User:")[0].strip()
        
    return response_text
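
The function above samples with `temperature=0.1`. As a rough sketch of why a low temperature makes the output more focused, the snippet below applies a temperature-scaled softmax to made-up logits for three candidate tokens. The logit values and the helper name `softmax_with_temperature` are illustrative assumptions, not something taken from GPT-2.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing each one by the temperature."""
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.5]

for t in (1.0, 0.1):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])
```

At the lower temperature almost all of the probability mass moves to the highest-scoring token, which is why sampling behaves nearly like greedy decoding. The other tokens keep a small nonzero probability, so repeated runs can still differ.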

# 4. Test the Model
We will test with a mix of Python and non-Python questions.

In [None]:
test_prompts = [
    "How do I write a for loop in Python?",
    "What is the tallest mountain?",
    "How to define a function?",
    "Tell me a joke about cats.",
    "import numpy as np"
]

print(f"{'Question':<40} | {'Response'}")
print("-"*80)

for q in test_prompts:
    response = ask_python_bot(q)
    print(f"{q:<40} | {response}")
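
The printed table has to be checked by eye. One possible way to score the behavior is sketched below, assuming each test question is labeled with whether it should be refused. The helper names and the sample responses are hypothetical placeholders, not real model output.

```python
REFUSAL = "I can only answer Python coding questions."


def is_refusal(response):
    """Return True when the response starts with the refusal sentence."""
    return response.strip().startswith(REFUSAL)


def refusal_accuracy(results):
    """results: list of (response, should_refuse) pairs.

    Returns the fraction of responses that refused exactly when they should have.
    """
    correct = sum(is_refusal(resp) == should_refuse for resp, should_refuse in results)
    return correct / len(results)


# Placeholder responses for illustration; these are not real model outputs.
sample_results = [
    ("I can only answer Python coding questions.", True),
    ("for i in range(10): print(i)", False),
    ("The tallest mountain is Everest.", True),  # a miss: this should have been refused
]
print(refusal_accuracy(sample_results))
```

In practice the responses would come from `ask_python_bot`, and the labels would be written by hand for each test question.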