# Assignment: Code-Focused Inference
- Your task is to load a pre-trained GPT-2 model and configure it to answer only questions related to Python coding.

- Load Model and Tokenizer: Load a suitable pre-trained GPT-2 model and its corresponding tokenizer. You can use transformers.AutoModelForCausalLM and transformers.AutoTokenizer. A smaller model like gpt2 or gpt2-medium might be sufficient.
- Implement a Filtering Mechanism: Before generating a response, check if the input prompt is related to Python coding. You can use simple keyword matching (e.g., "Python", "code", "function", "class", "import") or a more sophisticated approach using a text classification model (optional).
- Generate Response: If the prompt is deemed a Python coding question, generate a response using the loaded GPT-2 model.
- Handle Non-Coding Questions: If the prompt is not related to Python coding, return a predefined message indicating that the model can only answer coding questions.
- Test: Test your implementation with various prompts, including both Python coding questions and non-coding questions, to ensure the filtering mechanism works correctly.
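
The filter-then-route steps above can be sketched without loading any model. This is a minimal sketch, not the required solution: the names `matches_python_keyword`, `route_prompt`, `fake_generate`, and the short keyword tuple are hypothetical and chosen only for illustration. It assumes whole-word matching is wanted, so that "classical" or "decode" do not trigger the filter, and it uses a stand-in generator so the routing logic can be tested on its own.

```python
import re
from typing import Callable

# Hypothetical keyword list for illustration; extend it for your own prompts.
PYTHON_KEYWORDS = ("python", "code", "function", "class", "import")

# \b pins each keyword to word boundaries; the optional "s"/"es" suffix
# lets plurals such as "functions" or "classes" match as well.
KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in PYTHON_KEYWORDS) + r")(?:s|es)?\b",
    re.IGNORECASE,
)

REFUSAL = "I can only answer questions about Python coding."


def matches_python_keyword(prompt: str) -> bool:
    """Return True if the prompt contains one of the keywords as a whole word."""
    return KEYWORD_PATTERN.search(prompt) is not None


def route_prompt(prompt: str, generate: Callable[[str], str]) -> str:
    """Call the generator only for prompts that pass the filter."""
    if not matches_python_keyword(prompt):
        return REFUSAL
    return generate(prompt)


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call, used here only to test the routing."""
    return f"[model output for: {prompt}]"


for example in (
    "How do I define a function in Python?",
    "What is the capital of France?",
    "I love classical music.",
):
    print(repr(example), "->", route_prompt(example, fake_generate))
```

In a real solution the stand-in generator would be replaced by a function that calls the loaded GPT-2 model. Whole-word matching trades some recall for fewer false positives, so the keyword list may need more entries than a substring-based filter would.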

# Python Code Assistant: A Simple Introduction



In [None]:
# 1. Importing necessary libraries
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

print("--- Loading Model and Tokenizer ---")
# 2. Load a pre-trained GPT-2 model and tokenizer
# We use the open-source "gpt2" checkpoint, the smallest GPT-2 variant
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# GPT-2 has no pad token by default, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token
print("--- Model and Tokenizer Loaded Successfully ---\n")


# 3. Implement the filtering mechanism
def is_python_question(prompt: str) -> bool:
    """
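    Checks if a prompt is related to Python coding using keyword matching.
    The check is case-insensitive and matches substrings, so a keyword such
    as "class" also matches inside a longer word such as "classic".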
    """
    python_keywords = [
        "python", "code", "function", "class", "import", "def",
        "list", "dict", "tuple", "pandas", "numpy", "torch",
        "tensorflow", "fastapi", "django", "flask", "error",
        "syntax", "algorithm", "variable", "loop"
    ]
    prompt_lower = prompt.lower()
    return any(keyword in prompt_lower for keyword in python_keywords)

# 4. Generate a response, or return a refusal for non-coding prompts
def get_code_answer(prompt: str, max_length: int = 100) -> str:
    """
    Generates a response if the prompt is a Python coding question.
    Otherwise, returns a predefined refusal message.
    """
    print(f"User Prompt: \"{prompt}\"")

    # First, check if the prompt is a valid coding question
    if not is_python_question(prompt):
        return "I'm sorry, I can only answer questions about Python coding. Please try again with a coding-related question."

    # If it is, tokenize the prompt; this returns input_ids and an attention_mask
    inputs = tokenizer(prompt, return_tensors="pt")

    # Generate text with greedy decoding (num_beams defaults to 1)
    outputs = model.generate(
        **inputs,
        max_length=max_length,  # total length: prompt tokens plus generated tokens
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2,  # forbids repeated bigrams, which limits repetitive loops
    )

    # Decode the generated token IDs back to text (the result starts with the prompt)
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text

# 5. Test the implementation with various prompts
print("#######-----Testing with Python-related questions-----########")

# Test Case 1: Python question
python_prompt_1 = "How do I define a function in Python?"
print(f"Response:\n{get_code_answer(python_prompt_1)}\n")

# Test Case 2: Another Python question
python_prompt_2 = "What is a list comprehension in Python code?"
print(f"Response:\n{get_code_answer(python_prompt_2, max_length=120)}\n")

print("--- Testing with non-coding questions ---")

# Test Case 3: Non-coding question
non_coding_prompt_1 = "What is the capital of France?"
print(f"Response:\n{get_code_answer(non_coding_prompt_1)}\n")

# Test Case 4: Another non-coding question
non_coding_prompt_2 = "Can you tell me a story?"
print(f"Response:\n{get_code_answer(non_coding_prompt_2)}\n")

# Test Case 5: Another coding question (same prompt as Test Case 1)
prompt_code = "How do I define a function in Python?"
print(f"Response:\n{get_code_answer(prompt_code)}\n")

--- Loading Model and Tokenizer ---
--- Model and Tokenizer Loaded Successfully ---

#######-----Testing with Python-related questions-----########
User Prompt: "How do I define a function in Python?"
Response:
How do I define a function in Python?

The Python documentation has a very simple definition of a variable called a_function. It's a list of functions that can be called from any Python program.
.py is a Python function that takes a string and returns a tuple of strings. The function is called by the Python interpreter. You can use the function to call any function you want. For example, you can call a() from a program like this:
, a = a

User Prompt: "What is a list comprehension in Python code?"
Response:
What is a list comprehension in Python code?

A list is an object that contains a set of values. It is used to represent a collection of items.
.list is the most common type of list. The list can be used as a way to store information about a given item. For example, a diction