## Assignment: Your task is to load a pre-trained GPT-2 model and configure it to answer only questions related to Python coding.

1: Install and Import

In [18]:
!pip install transformers torch -q

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, pipeline
import re

2: Load GPT-2 Model


In [19]:
print("Loading GPT-2 model...")

# Load GPT-2
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Create generator
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

print("GPT-2 model loaded!")


Loading GPT-2 model...


Device set to use cuda:0


GPT-2 model loaded!


3: Python Question Detector

In [20]:
def is_python_question(text):
    """Check if question is about Python coding"""

    text_lower = text.lower()

    # High confidence Python keywords
    high_confidence_keywords = [
        'python', 'def ', 'class ', 'import ', 'from ', '__init__',
        'pip install', 'conda install', 'jupyter', 'colab',
        'pandas', 'numpy', 'matplotlib', 'seaborn', 'sklearn',
        'django', 'flask', 'fastapi', 'requests', 'beautifulsoup'
    ]

    # Python syntax patterns
    python_syntax = [
        'elif', 'except', 'finally', 'lambda', 'yield',
        'with open', 'try:', 'except:', 'if __name__',
        'self.', '.append(', '.extend(', '.join('
    ]

    # Programming concepts in Python context
    coding_concepts = [
        'list comprehension', 'dictionary comprehension',
        'generator expression', 'decorator', 'context manager',
        'virtual environment', 'package management',
        'python script', 'python function', 'python class',
        'python module', 'python package', 'python library'
    ]

    # Code-like patterns (regex)
    code_patterns = [
        r'def\s+\w+\s*\(',           # function definitions
        r'class\s+\w+\s*[\(:]',      # class definitions
        r'import\s+[\w\.]+',         # import statements
        r'from\s+\w+\s+import',      # from-import
        r'\w+\s*=\s*\[.*\]',        # list assignments
        r'\w+\s*=\s*\{.*\}',        # dict assignments
        r'print\s*\(',               # print statements
        r'len\s*\(',                 # len() calls
        r'range\s*\(',               # range() calls
        r'for\s+\w+\s+in\s+',       # for loops
        r'if\s+.*:',                 # if statements
        r'while\s+.*:',              # while loops
    ]

    # Substring checks: any keyword, syntax fragment, or concept phrase is enough
    substrings = high_confidence_keywords + python_syntax + coding_concepts
    if any(s in text_lower for s in substrings):
        return True

    # Regex checks for code-like patterns
    return any(re.search(pattern, text_lower) for pattern in code_patterns)

print("Python detector ready!")


Python detector ready!
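
The detector above relies on plain substring checks, so broad entries such as `'from '`, `'class '`, or `'requests'` also fire on ordinary English. A minimal sketch of a stricter variant follows, assuming whole-word matching on a shortened term list is acceptable. `STRONG_TERMS` and `is_python_question_strict` are hypothetical names that do not appear in the notebook.

```python
import re

# Hypothetical, shortened term list; the notebook's full lists could be substituted.
STRONG_TERMS = ["python", "pandas", "numpy", "matplotlib", "django", "flask"]

# \b anchors each term to word boundaries, so only whole words match.
_STRONG_RE = re.compile(r"\b(?:" + "|".join(map(re.escape, STRONG_TERMS)) + r")\b")

# Code-shaped fragments that rarely occur in ordinary English sentences.
_CODE_RE = re.compile(r"\bdef\s+\w+\s*\(|\bfrom\s+\w+\s+import\b")


def is_python_question_strict(text: str) -> bool:
    """Return True only for whole-word library names or code-shaped fragments."""
    lowered = text.lower()
    return bool(_STRONG_RE.search(lowered) or _CODE_RE.search(lowered))


# The plain substring check used in the notebook fires on everyday English:
print("from " in "i got a letter from my bank")                   # True
print(is_python_question_strict("I got a letter from my bank"))   # False
print(is_python_question_strict("How to use pandas dataframe?"))  # True
```

The trade-off is recall: this shortened variant would reject "Explain list comprehension" unless the concept phrases are added back as whole-phrase patterns.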


4: Simple Assistant Function

In [21]:
def ask_python_question(question):
    """Main function - answers only Python questions"""

    print(f"Question: {question}")
    print("-" * 60)

    # Check if Python question
    if not is_python_question(question):
        print("NOT PYTHON RELATED")
        print("I only answer Python coding questions.")
        print()
        return

    # Generate answer for Python questions
    print("PYTHON QUESTION DETECTED - Generating answer...")

    prompt = f"Python question: {question}\nAnswer:"

    try:
        response = generator(
            prompt,
            max_new_tokens=50,  # Limit to 50 new tokens only
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            truncation=True
        )

        answer = response[0]['generated_text'].replace(prompt, "").strip()

        # Make answer shorter - take only first sentence or first 100 characters
        if answer:
            # Take first sentence or first 100 chars, whichever is shorter
            first_sentence = answer.split('.')[0] + '.'
            if len(first_sentence) > 100:
                answer = answer[:100] + "..."
            else:
                answer = first_sentence
            print(f"Answer: {answer}")
        else:
            print("Answer: I can help you with that Python question!")

    except Exception as e:
        print(f"Error: {e}")

    print()

print("Assistant ready!")

Assistant ready!
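
The answer-shortening step inside `ask_python_question` can be pulled out into a pure function, which makes it testable without loading GPT-2. The sketch below mirrors the notebook's logic with one assumed change: a trailing period is kept only when the text actually contained one. `shorten_answer` and `limit` are hypothetical names.

```python
def shorten_answer(answer: str, limit: int = 100) -> str:
    """Keep the first sentence, or the first `limit` characters if that sentence is longer."""
    answer = answer.strip()
    if not answer:
        return ""
    # partition returns sep == "." only when the text contains a period
    head, sep, _ = answer.partition(".")
    first_sentence = head + sep
    if len(first_sentence) > limit:
        return answer[:limit] + "..."
    return first_sentence


print(shorten_answer("Use square brackets. Then append items."))
print(shorten_answer("x" * 150))
```

As in the original, splitting on `.` also cuts at periods inside tokens such as `os.path` or `3.14`, so this remains a rough heuristic rather than real sentence segmentation.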


5: Automatic Testing

In [22]:
print("🧪 TESTING THE ASSISTANT")
print("=" * 70)

# Test questions
test_questions = [
    # Python questions (should answer)
    "How do I create a list in Python?",
    "What is def in Python?",
    "How to use pandas dataframe?",
    "Explain list comprehension",
    "How does try except work?",

    # Non-Python questions (should reject)
    "What is the capital of France?",
    "How do I bake a cake?",
    "What's the weather today?",
]

# Run tests
for i, question in enumerate(test_questions, 1):
    print(f"TEST {i}:")
    ask_python_question(question)

print("=" * 70)
print("TESTING COMPLETE!")

🧪 TESTING THE ASSISTANT
TEST 1:
Question: How do I create a list in Python?
------------------------------------------------------------
PYTHON QUESTION DETECTED - Generating answer...
Answer: If you use a standard search engine like Google or Bing, you can create a list of all the words in a...

TEST 2:
Question: What is def in Python?
------------------------------------------------------------
PYTHON QUESTION DETECTED - Generating answer...
Answer: Def is an interface to Python for data formats such as JSON and XML.

TEST 3:
Question: How to use pandas dataframe?
------------------------------------------------------------
PYTHON QUESTION DETECTED - Generating answer...
Answer: The idea is that you want to use the dataframe type of your project as a starting point.

TEST 4:
Question: Explain list comprehension
------------------------------------------------------------
PYTHON QUESTION DETECTED - Generating answer...
Answer: You can read it in more detail at the end of this post.

[output truncated]
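
Reading the printed output by eye does not scale beyond a handful of questions. One possible way to check the filter automatically is to pair each question with its expected label and count mismatches. The sketch below assumes a detector with the same signature as `is_python_question`. `evaluate_detector` and `toy_detector` are hypothetical names, and the toy detector is only a stand-in so the block runs on its own.

```python
from typing import Callable, Iterable, List, Tuple


def evaluate_detector(
    detector: Callable[[str], bool],
    cases: Iterable[Tuple[str, bool]],
) -> Tuple[float, List[str]]:
    """Return (accuracy, misclassified questions) for labelled (question, expected) pairs."""
    cases = list(cases)
    misses = [q for q, expected in cases if detector(q) != expected]
    accuracy = (len(cases) - len(misses)) / len(cases) if cases else 0.0
    return accuracy, misses


# Stand-in detector for illustration only; pass the notebook's is_python_question instead.
def toy_detector(text: str) -> bool:
    return "python" in text.lower()


labelled = [
    ("How do I create a list in Python?", True),
    ("Explain list comprehension", True),
    ("What is the capital of France?", False),
    ("How do I bake a cake?", False),
]

accuracy, misses = evaluate_detector(toy_detector, labelled)
print(f"accuracy={accuracy:.2f}", misses)
```

Returning the misclassified questions alongside the score makes it easier to see which keyword or pattern needs adjusting.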

6: Usage Examples

In [23]:
print("\nHOW TO USE:")
print("ask_python_question('your question here')")

print("\nTRY THESE:")
print("ask_python_question('How do I create a dictionary in Python?')")
print("ask_python_question('What is a for loop in Python?')")
print("ask_python_question('How to import numpy?')")

print("\nTHESE WILL BE REJECTED:")
print("ask_python_question('What is machine learning?')")
print("ask_python_question('How to cook pasta?')")

print("\nREADY TO USE!")


HOW TO USE:
ask_python_question('your question here')

TRY THESE:
ask_python_question('How do I create a dictionary in Python?')
ask_python_question('What is a for loop in Python?')
ask_python_question('How to import numpy?')

THESE WILL BE REJECTED:
ask_python_question('What is machine learning?')
ask_python_question('How to cook pasta?')

READY TO USE!


7: Manual Testing

In [26]:
print("LIVE EXAMPLES")


# Example 1 - Python question
ask_python_question("How do I create a variable in Python?")

# Example 2 - Non-Python question
ask_python_question("What is the weather like today?")

# Example 3 - Python syntax
ask_python_question("What does def mean in Python?")

print("ASSISTANT IS WORKING!")

LIVE EXAMPLES
Question: How do I create a variable in Python?
------------------------------------------------------------
PYTHON QUESTION DETECTED - Generating answer...
Answer: I'll ask you to create a Python variable named "my_variable" in the Python interpreter.

Question: What is the weather like today?
------------------------------------------------------------
NOT PYTHON RELATED
I only answer Python coding questions.

Question: What does def mean in Python?
------------------------------------------------------------
PYTHON QUESTION DETECTED - Generating answer...
Answer: Def means, "to create a rule or to create a rule (or to act as a rule).

ASSISTANT IS WORKING!


## Conclusion

The project successfully demonstrated how placing a filter in front of a pre-trained LLM can create a specialized chatbot.
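
The filter-in-front-of-model structure can be sketched as a small function that takes the detector and the generator as arguments and returns a value instead of printing. This is an illustrative restructuring under stated assumptions, not the notebook's implementation: `gated_answer`, `fake_detector`, and `fake_generate` are hypothetical names, and the stand-ins exist only so the block runs without GPT-2.

```python
from typing import Callable, List, Optional

REFUSAL = "I only answer Python coding questions."


def gated_answer(
    question: str,
    detector: Callable[[str], bool],
    generate: Callable[[str], str],
) -> Optional[str]:
    """Return the model's answer for in-scope questions, or None when the filter rejects."""
    if not detector(question):
        return None  # the model is never called for off-topic input
    prompt = f"Python question: {question}\nAnswer:"
    return generate(prompt).strip()


# Stand-ins so the sketch runs without loading a model.
calls: List[str] = []


def fake_generate(prompt: str) -> str:
    calls.append(prompt)  # record every prompt that reaches the "model"
    return " (model output would appear here) "


def fake_detector(text: str) -> bool:
    return "python" in text.lower()


print(gated_answer("What is def in Python?", fake_detector, fake_generate))
print(gated_answer("How to cook pasta?", fake_detector, fake_generate) or REFUSAL)
print(len(calls))  # only the accepted question reached fake_generate
```

Passing the detector and generator in as parameters keeps the gating logic independent of any particular model, so either part can be replaced or tested separately.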

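The keyword and pattern detector behaved as intended on the test questions: it accepted the Python-related queries and rejected off-topic ones such as questions about the weather or cooking. The test set is small, however, and because the detector relies on substring matching, broad entries such as 'from ' or 'class ' can also match ordinary English. The results therefore show that basic input control is workable, not that it is accurate in general.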

However, the base GPT-2 model revealed significant limitations. Its answers were often inaccurate or incomplete. For instance, it described `def` as "an interface to Python for data formats such as JSON and XML", and it answered the list comprehension question with "You can read it in more detail at the end of this post." The generated text sounded technical but lacked the factual correctness and depth expected of a coding assistant.

The experiment highlights that while general-purpose models can generate plausible text, they are insufficient for specialized expert tasks requiring accurate domain knowledge.

Key findings:

- Input filtering was effective for scope control: rejected questions never reached the model.
- Base GPT-2 is not reliable for coding help; a domain-specific or properly fine-tuned model is needed for specialized applications.

Overall, the project underscored both the usefulness of input filtering and the need for more sophisticated approaches in production systems.