## Set up

# 🌀 Q&A Generator

This Colab project uses an open-source **Hugging Face instruct model** to build a Q&A generator for LLM Engineering topics.

---

## ⚙️ Context
1. **Topics:** a list of topics from Weeks 1-3 of the LLM Engineering course by Ed Donner.
2. **Difficulty:** three levels: Beginner, Intermediate, and Advanced.
3. **Interface:** built with Gradio.

---

## 🧠 Highlights
- Generates multiple-choice Q&A to assist learning with **Meta-Llama-3.1-8B-Instruct**.
- Supports **4-bit quantized loading** for efficiency, so it can run on a free T4 GPU in Google Colab.
- Provides a simple **Gradio user interface** with JSON and CSV downloads.

---

## 📘 Notebook
👉 <a href="https://colab.research.google.com/drive/1A8mtfT_JyJQISWa96ZEduHsqNnpb5yGN?usp=share_link" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt=" Open In Google Colab "/></a>

---

## 🧩 Tech Stack
`Google Colab · Hugging Face · Transformers · BitsAndBytes · Pandas · JSON · Torch · Gradio`

---

## 💡 Summary
**Q&A Generator** shows how open-source LLMs can serve as a **powerful learning tool**, delivering **on-demand support** for students learning new topics.

In [None]:
# Built to run on Google Colab
!pip install -q transformers accelerate bitsandbytes torch gradio

In [None]:
# Import libraries
import torch
import json
import pandas as pd
import gradio as gr
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from huggingface_hub import login
from google.colab import userdata

In [None]:
# Authenticate with Hugging Face
hf_token = userdata.get('HF_TOKEN')
login(hf_token, add_to_git_credential=True)
print("Successfully authenticated with Hugging Face")

## Load model

In [None]:
# Model configuration
MODEL_NAME = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# 4-bit quantization for efficiency on T4 GPU
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4"
)

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map="auto",
    quantization_config=quant_config
)

print("Model loaded successfully!")
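
The reason 4-bit loading matters on a T4 (16 GB of GPU memory) can be seen with some rough arithmetic. The sketch below is an approximation only: the parameter count is rounded to 8 billion (an assumption), `weight_memory_gb` is a hypothetical helper, and the estimate ignores activations, the KV cache, and quantization metadata, so real usage will be somewhat higher.

```python
# Rough weight-memory estimate for an 8B-parameter model.
# Assumption: parameter count rounded to 8e9; overheads are ignored.
PARAMS = 8.0e9


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("fp16/bf16", 16), ("int8", 8), ("nf4", 4)]:
    print(f"{label:>10}: ~{weight_memory_gb(PARAMS, bits):.1f} GB")
```

At 16 bits per parameter the weights alone would roughly fill the card, while 4-bit weights leave room for the prompt and generated tokens.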

## Define learning curriculum topics and difficulty settings

In [None]:
# Topic definitions, revise and add as needed
TOPICS = {
    "Week 1: LLM APIs & Prompting": {
        "concepts": [
            "Cursor IDE setup",
            "Introduction to Git and GitHub",
            "Different types of LLM: base, instruct, reasoning",
            "OpenAI API usage and parameters",
            "Ollama and OpenRouter usage",
            "Prompt engineering techniques",
            "System vs user messages",
            "JSON mode and structured outputs",
            "Token counting and API pricing",
            "Encoding and decoding",
            "Chat completions vs completions",
            "Multi-shot prompting",
            "Prompt caching",
        ]
    },
    "Week 2: Function Calling & Agents": {
        "concepts": [
            "Function calling syntax and format",
            "Tool definitions and schemas",
            "Common use cases for Tools",
            "Parallel function calling",
            "Function calling best practices",
            "Agent patterns and workflows",
            "Structured outputs with Pydantic",
            "Error handling in function calls",
            "Use of Gradio UI",
        ]
    },
    "Week 3: Transformers & Models": {
        "concepts": [
            "Difference between Hugging Face platform and libraries",
            "Popular Hugging Face libraries",
            "Tokenizers and tokenization strategies",
            "Hugging Face pipelines",
            "AutoModel and AutoTokenizer",
            "Model quantization (4-bit, 8-bit)",
            "Speech-to-text with Whisper",
            "Local vs cloud model inference",
            "Model architectures (encoder, decoder, encoder-decoder)",
            "Introduction to Google Colab",
            "Deep dive into LLM layers such as Llama 3.2",
            "LLM temperature and do_sample settings",
        ]
    }
}

In [None]:
# Difficulty definitions
DIFFICULTY_LEVELS = {
    "Beginner": "Basic understanding of concepts and definitions",
    "Intermediate": "Application of concepts with some technical depth",
    "Advanced": "Edge cases, optimization, and deep technical understanding",
}
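
One way to use these descriptions is to fold them into the prompt so the model sees what each level means. The following is a minimal sketch, not part of the notebook: `difficulty_line` is a hypothetical helper, and the dictionary is copied here only so the example runs on its own.

```python
# Copy of the notebook's difficulty table so the sketch is self-contained.
DIFFICULTY_LEVELS = {
    "Beginner": "Basic understanding of concepts and definitions",
    "Intermediate": "Application of concepts with some technical depth",
    "Advanced": "Edge cases, optimization, and deep technical understanding",
}


def difficulty_line(level: str) -> str:
    """Hypothetical helper: build a prompt line that spells out the level."""
    if level not in DIFFICULTY_LEVELS:
        valid = ", ".join(DIFFICULTY_LEVELS)
        raise ValueError(f"Unknown difficulty {level!r}; expected one of: {valid}")
    return f"Difficulty Level: {level} ({DIFFICULTY_LEVELS[level]})"


print(difficulty_line("Advanced"))
```

Validating the level up front turns a typo into a clear error instead of a `KeyError` deep inside prompt construction.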

## Main function to generate multiple-choice questions

In [None]:
def generate_questions(topic, difficulty, num_questions):
    """
    Generate Q&A questions using the LLM.

    Args:
        topic: Topic category to generate questions for based on the curriculum
        difficulty: Difficulty level (Beginner/Intermediate/Advanced)
        num_questions: Number of questions to generate

    Returns:
        List of dictionaries containing questions and answers
    """

    # Get topic details
    topic_info = TOPICS[topic]
    concepts = ", ".join(topic_info["concepts"])

    # Build the system prompt
    system_message = """
    You are an expert teacher creating high-quality multiple-choice questions for an LLM Engineering course.

    Format each question EXACTLY as shown below:

    QUESTION: [question text]
    A) [option A]
    B) [option B]
    C) [option C]
    D) [option D]
    ANSWER: [correct letter]
    EXPLANATION: [brief explanation]
    ---
    """

    # Build the user prompt
    user_prompt = f"""
    Create {num_questions} multiple-choice questions about: {topic}

    Difficulty Level: {difficulty} ({DIFFICULTY_LEVELS[difficulty]})

    Cover these concepts: {concepts}

    Requirements:
    - Questions should be practical and relevant to real LLM engineering
    - All 4 options should be plausible
    - Explanations should be clear and educational
    - Vary the correct answer position

    Generate {num_questions} questions now:
    """

    # Prepare messages for LLM
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt}
    ]

    # Tokenize using HF's apply_chat_template utility
    input_ids = tokenizer.apply_chat_template(
        messages,
        return_tensors="pt",
        add_generation_prompt=True
    ).to(model.device)

    attention_mask = torch.ones_like(input_ids).to(model.device)

    # Generate and set max tokens
    print(f"Generating {num_questions} questions...")
    max_tokens = min(2500, num_questions * 200)

    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            attention_mask=attention_mask,
            max_new_tokens=max_tokens,
            temperature=0.7,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id
        )

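    # Decode only the newly generated tokens. Slicing off the prompt is more
    # reliable than splitting the full text on "assistant", which breaks
    # whenever the model itself writes that word.
    generated_ids = outputs[0][input_ids.shape[-1]:]
    response = tokenizer.decode(generated_ids, skip_special_tokens=True).strip()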

    # Debug: print what we got
    print("Generated text preview:")
    print(response[:500] + "..." if len(response) > 500 else response)
    print()

    # Parse the questions
    questions = parse_questions(response, topic, difficulty)

    print(f"Successfully generated {len(questions)} questions")
    return questions
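
The token budget in `generate_questions` is `min(2500, num_questions * 200)`, so the per-question allowance shrinks once the cap is reached. The sketch below works through the arithmetic for the slider's four values; `token_budget` is a hypothetical helper that only mirrors that one expression.

```python
def token_budget(num_questions: int, per_question: int = 200, cap: int = 2500) -> int:
    """Mirror the notebook's max_new_tokens formula."""
    return min(cap, num_questions * per_question)


for n in (5, 10, 15, 20):
    budget = token_budget(n)
    print(f"{n:>2} questions -> {budget} tokens total, {budget / n:.0f} per question")
```

With 20 questions the cap leaves about 125 tokens per question, so the last questions may be cut off and skipped by the parser. Raising the cap or generating in smaller batches are two possible remedies.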

In [None]:
def parse_questions(text, topic, difficulty):
    """
    Parse the generated text into structured question objects.
    More robust parsing that handles various formats.
    """
    questions = []

    # Split by "QUESTION:" to get individual question blocks
    blocks = text.split("QUESTION:")

    for i, block in enumerate(blocks):
        if not block.strip() or (i == 0 and len(block) < 20):
            continue

        try:
            # Extract components
            question_text = ""
            options = {}
            answer = ""
            explanation = ""

            lines = block.strip().split("\n")

            for line in lines:
                line = line.strip()
                if not line or line == "---":
                    continue

                # Handle question text (first non-empty line before options)
                if not question_text and not any(line.startswith(x) for x in ["A)", "B)", "C)", "D)", "ANSWER:", "EXPLANATION:", "Answer:", "Explanation:"]):
                    question_text = line

                # Handle options - be flexible with formatting
                elif line.startswith("A)") or line.startswith("A."):
                    options["A"] = line[2:].strip()
                elif line.startswith("B)") or line.startswith("B."):
                    options["B"] = line[2:].strip()
                elif line.startswith("C)") or line.startswith("C."):
                    options["C"] = line[2:].strip()
                elif line.startswith("D)") or line.startswith("D."):
                    options["D"] = line[2:].strip()

                # Handle answer
                elif line.upper().startswith("ANSWER:"):
                    answer = line.split(":", 1)[1].strip()

                # Handle explanation
                elif line.upper().startswith("EXPLANATION:"):
                    explanation = line.split(":", 1)[1].strip()
                elif explanation and len(explanation) < 200:
                    # Continue multi-line explanation (up to reasonable length)
                    explanation += " " + line

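            # Extract just the letter from the answer. Whole tokens are checked
            # first so "The answer is B" yields "B", not the "A" in "ANSWER".
            if answer:
                answer_letter = ""
                for token in answer.upper().split():
                    token = token.strip("[]().:*")
                    if token in ["A", "B", "C", "D"]:
                        answer_letter = token
                        break
                # Fall back to the first character, e.g. for "B)nf4"
                if not answer_letter and answer[0].upper() in "ABCD":
                    answer_letter = answer[0].upper()
                answer = answer_letter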

            # Only add if we have minimum required components
            if question_text and len(options) >= 3 and answer:
                # Fill missing option if needed
                if len(options) == 3:
                    for letter in ["A", "B", "C", "D"]:
                        if letter not in options:
                            options[letter] = "Not applicable"
                            break

                # Use placeholder explanation if none provided
                if not explanation:
                    explanation = f"The correct answer is {answer}."

                questions.append({
                    "id": len(questions) + 1,
                    "topic": topic,
                    "difficulty": difficulty,
                    "question": question_text,
                    "options": options,
                    "correct_answer": answer,
                    "explanation": explanation.strip()
                })
                print(f"Parsed question {len(questions)}")
            else:
                print(f"Skipped incomplete block: Q={bool(question_text)}, Opts={len(options)}, Ans={bool(answer)}")

        except Exception as e:
            print(f"Error parsing block {i+1}: {str(e)}")
            continue

    return questions
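
To make the expected text format concrete, here is a small, self-contained sketch that parses two hand-written blocks. It is a simplified regex-based variant for illustration, not the notebook's `parse_questions`; `parse_block`, `OPTION_RE`, and the sample text are all hypothetical.

```python
import re

# Hand-written sample in the format requested by the system prompt.
SAMPLE = """QUESTION: Which quantization type does the notebook configure?
A) int8
B) nf4
C) fp32
D) bf16
ANSWER: B
EXPLANATION: The quantization config sets the 4-bit type to nf4.
---
QUESTION: What does do_sample=True enable?
A) Greedy decoding
B) Beam search
C) Sampling from the token distribution
D) Quantized loading
ANSWER: C) Sampling
EXPLANATION: Tokens are drawn at random according to their probabilities.
---
"""

# Accept both "A) text" and "A. text" option styles.
OPTION_RE = re.compile(r"^([A-D])[).]\s*(.+)$")


def parse_block(block: str) -> dict:
    """Simplified parser for one block of text following a QUESTION: marker."""
    record = {"question": "", "options": {}, "answer": "", "explanation": ""}
    for raw in block.strip().splitlines():
        line = raw.strip()
        if not line or line == "---":
            continue
        upper = line.upper()
        option = OPTION_RE.match(line)
        if upper.startswith("ANSWER:"):
            # Take the first standalone letter A-D after the colon.
            letter = re.search(r"\b[A-D]\b", upper.split(":", 1)[1])
            record["answer"] = letter.group(0) if letter else ""
        elif upper.startswith("EXPLANATION:"):
            record["explanation"] = line.split(":", 1)[1].strip()
        elif option:
            record["options"][option.group(1)] = option.group(2).strip()
        elif not record["question"]:
            record["question"] = line
    return record


blocks = [b for b in SAMPLE.split("QUESTION:") if b.strip()]
parsed = [parse_block(b) for b in blocks]
for item in parsed:
    print(item["answer"], "-", item["question"])
```

The second block shows why the answer line needs normalizing: the model may append option text after the letter even when asked for the letter alone.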

In [None]:
def format_questions_display(questions):
    """Format questions for display in Gradio."""
    if not questions:
        return "No questions generated."

    output = f"# Generated Questions\n\n"
    output += f"**Total Questions:** {len(questions)}\n\n"
    output += "---\n\n"

    for q in questions:
        output += f"## Question {q['id']}\n\n"
        output += f"**Topic:** {q['topic']}  \n"
        output += f"**Difficulty:** {q['difficulty']}  \n\n"
        output += f"**Q:** {q['question']}\n\n"

        for letter in ['A', 'B', 'C', 'D']:
            # prefix currently NOT in use so answer is not obviously shown in UI
            prefix = "✅ " if letter == q['correct_answer'] else ""
            output += f"{letter}) {q['options'][letter]}\n\n"

        output += f"**Answer:** {q['correct_answer']}\n\n"
        output += f"**Explanation:** {q['explanation']}\n\n"
        output += "---\n\n"

    return output

In [None]:
def export_to_json(questions):
    """Export questions to JSON file."""
    if not questions:
        return None

    filename = "qa_dataset.json"
    with open(filename, 'w') as f:
        json.dump(questions, f, indent=2)

    return filename
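
Once exported, the JSON file can be loaded back and filtered like any list of dictionaries. The sketch below round-trips two made-up records through a temporary file; the sample questions and the temporary path are assumptions for illustration, and only the record layout follows the notebook.

```python
import json
import os
import tempfile

# Two made-up records using the same keys as the notebook's question dicts.
questions = [
    {
        "id": 1,
        "topic": "Week 3: Transformers & Models",
        "difficulty": "Beginner",
        "question": "Which library provides AutoTokenizer?",
        "options": {"A": "pandas", "B": "transformers", "C": "gradio", "D": "json"},
        "correct_answer": "B",
        "explanation": "AutoTokenizer is imported from transformers.",
    },
    {
        "id": 2,
        "topic": "Week 3: Transformers & Models",
        "difficulty": "Advanced",
        "question": "Which setting enables random sampling during generation?",
        "options": {"A": "do_sample", "B": "top_k", "C": "pad_token_id", "D": "device_map"},
        "correct_answer": "A",
        "explanation": "Sampling is switched on with do_sample=True.",
    },
]

# Write to a temporary directory so the example leaves no files behind in the
# working directory.
path = os.path.join(tempfile.mkdtemp(), "qa_dataset.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(questions, f, indent=2)

# Load the file back and keep only the beginner questions.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

beginner = [q for q in loaded if q["difficulty"] == "Beginner"]
print(f"Loaded {len(loaded)} questions, {len(beginner)} at Beginner level")
```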

In [None]:
def export_to_csv(questions):
    """Export questions to CSV file."""
    if not questions:
        return None

    # Flatten the data for CSV
    flattened = []
    for q in questions:
        flattened.append({
            'id': q['id'],
            'topic': q['topic'],
            'difficulty': q['difficulty'],
            'question': q['question'],
            'option_A': q['options']['A'],
            'option_B': q['options']['B'],
            'option_C': q['options']['C'],
            'option_D': q['options']['D'],
            'correct_answer': q['correct_answer'],
            'explanation': q['explanation']
        })

    filename = "qa_dataset.csv"
    df = pd.DataFrame(flattened)
    df.to_csv(filename, index=False)

    return filename
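
The flattening step exists because CSV has no nested structure, so the `options` dictionary becomes four separate columns. The sketch below shows the round trip with the standard library `csv` module in place of pandas, so it runs without extra dependencies; `flatten`, `unflatten`, and the sample record are hypothetical.

```python
import csv
import io

# Column order mirrors the flattened rows built in export_to_csv.
FIELDS = ["id", "topic", "difficulty", "question",
          "option_A", "option_B", "option_C", "option_D",
          "correct_answer", "explanation"]


def flatten(question: dict) -> dict:
    """Turn the nested 'options' dict into option_A..option_D columns."""
    row = {key: value for key, value in question.items() if key != "options"}
    for letter in "ABCD":
        row[f"option_{letter}"] = question["options"][letter]
    return row


def unflatten(row: dict) -> dict:
    """Rebuild the nested structure from one CSV row (CSV values are strings)."""
    question = {key: value for key, value in row.items()
                if not key.startswith("option_")}
    question["id"] = int(question["id"])
    question["options"] = {letter: row[f"option_{letter}"] for letter in "ABCD"}
    return question


sample = {
    "id": 1,
    "topic": "Week 3: Transformers & Models",
    "difficulty": "Beginner",
    "question": "Which library provides AutoTokenizer?",
    "options": {"A": "pandas", "B": "transformers", "C": "gradio", "D": "json"},
    "correct_answer": "B",
    "explanation": "AutoTokenizer is imported from transformers.",
}

# Write one row to an in-memory buffer, then read it back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(flatten(sample))

buffer.seek(0)
restored = [unflatten(row) for row in csv.DictReader(buffer)]
print(restored[0] == sample)
```

Note that every value comes back from CSV as a string, which is why `unflatten` converts `id` to an integer again.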

## Build Gradio UI

In [None]:
def gradio_generate(topic, difficulty, num_questions):
    """
    Wrapper function for Gradio interface.
    Generates questions and returns formatted output plus download files.
    """
    try:
        # Generate questions (cast the slider value in case it arrives as a float)
        questions = generate_questions(topic, difficulty, int(num_questions))

        if not questions:
            return "Failed to generate questions. Please try again.", None, None

        # Format for display
        display_text = format_questions_display(questions)

        # Export files
        json_file = export_to_json(questions)
        csv_file = export_to_csv(questions)

        return display_text, json_file, csv_file

    except Exception as e:
        return f"Error: {str(e)}", None, None

In [None]:
# Build the Gradio UI
with gr.Blocks(title="Q&A Generator", theme=gr.themes.Soft()) as demo:

    gr.Markdown("""
    # 📚 Educational Q&A Dataset Generator
    Generate high-quality multiple-choice questions for LLM Engineering topics
    """)

    with gr.Row():
        with gr.Column(scale=1):
            gr.Markdown("### ⚙️ Configuration")

            topic_dropdown = gr.Dropdown(
                choices=list(TOPICS.keys()),
                value="Week 1: LLM APIs & Prompting",
                label="Select Topic",
                info="Choose which week's content to generate questions for"
            )

            difficulty_dropdown = gr.Dropdown(
                choices=["Beginner", "Intermediate", "Advanced"],
                value="Intermediate",
                label="Difficulty Level",
                info="Select the difficulty of the questions"
            )

            num_questions_slider = gr.Slider(
                minimum=5,
                maximum=20,
                value=10,
                step=5,
                label="Number of Questions",
                info="How many questions to generate (5-20)"
            )

            generate_btn = gr.Button("🚀 Generate Questions", variant="primary", size="lg")

            gr.Markdown("""
            ---
            ### 📥 Download Files
            After generation, download your dataset in JSON or CSV format
            """)

            with gr.Row():
                json_download = gr.File(label="JSON File", interactive=False)
                csv_download = gr.File(label="CSV File", interactive=False)

        with gr.Column(scale=2):
            gr.Markdown("### 📝 Generated Questions")

            output_display = gr.Markdown(
                value="Click 'Generate Questions' to start...",
                label="Questions"
            )

    # Connect the generate button
    generate_btn.click(
        fn=gradio_generate,
        inputs=[topic_dropdown, difficulty_dropdown, num_questions_slider],
        outputs=[output_display, json_download, csv_download]
    )

    gr.Markdown("""
    ---
    ### 💡 Tips:
    - Start with 5 questions to test the system
    - Beginner questions cover definitions and basic concepts
    - Intermediate questions test application and understanding
    - Advanced questions explore edge cases and optimization
    - Generation takes ~30-60 seconds depending on number of questions

    ### 📊 Output Formats:
    - **JSON**: Structured data for programmatic use
    - **CSV**: Easy to view in spreadsheets or import into other tools
    """)

print("✅ Gradio interface configured!")

In [None]:
# Launch the Gradio app, create a public URL
demo.launch(share=True, debug=True)