# 📘 Kahoot Quiz Agent – Code Overview

This notebook automates the process of **joining a Kahoot quiz**, **analyzing quiz screenshots**, and **submitting answers** using a combination of **OCR**, **image processing**, and a **local LLM (e.g., Ollama)**.


## 1️⃣ Setup & Configuration

This section loads essential libraries such as `Playwright` for browser automation, `PIL` and `OpenCV` for image processing, and `pytesseract` for OCR.

- Creates a `debug/` folder to store intermediate screenshots and cropped images.
- Defines bounding boxes for extracting the question and answer choices from the screen.
- Prepares coordinates for simulating clicks on answer buttons.
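
The click coordinates in the cell below sit close to the centers of the answer bounding boxes. As a minimal sketch, assuming a click at the geometric center of each box is acceptable, the click points could be derived from the boxes instead of being maintained by hand. The helper name `box_center` and the `answer_boxes` / `derived_clicks` dictionaries are hypothetical; the box values are copied from this notebook's `box_dict`.

```python
# Hypothetical helper: derive a click point from a (left, top, right, bottom) box.
def box_center(box):
    left, top, right, bottom = box
    return ((left + right) // 2, (top + bottom) // 2)

# Answer boxes as defined in this notebook's box_dict.
answer_boxes = {
    "choice1": (40, 510, 635, 580),
    "choice2": (660, 510, 1270, 580),
    "choice3": (40, 570, 635, 650),
    "choice4": (660, 570, 1270, 650),
}

# One click target per answer button, computed from its box.
derived_clicks = {name: box_center(box) for name, box in answer_boxes.items()}
print(derived_clicks)
```

The derived points differ from the hand-tuned values by a few pixels horizontally, which should not matter as long as the click lands inside the button.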

In [None]:
from playwright.async_api import async_playwright
from datetime import datetime
from IPython.display import Image as IPyImage, display 
from PIL import Image
import os
import requests
import cv2
import pytesseract
import numpy as np

DEBUG_DIR = "debug"
os.makedirs(DEBUG_DIR, exist_ok=True)

MODEL = "llama3.1:8b"

box_dict = {
    "choice1": (40, 510, 635, 580),
    "choice2": (660, 510, 1270, 580),
    "choice3": (40, 570, 635, 650),
    "choice4": (660, 570, 1270, 650),
    "question": (20, 350, 1270, 510)
}

choice_dict={
    "choice1": (325, 545),
    "choice2": (950, 545),
    "choice3": (325, 610),
    "choice4": (950, 610)
}

async def click_browser_by_choice(page, answer):
    x=choice_dict[f'choice{answer}'][0]
    y=choice_dict[f'choice{answer}'][1]
    print(f"click {x},{y} for answer {answer}")
    await page.mouse.move(x, y)
    await page.mouse.click(x, y)


## 2️⃣ Image Processing & OCR Pipeline

The OCR pipeline consists of two functions:
- `preprocess_and_ocr()`: Preprocesses an individual image region and runs Tesseract OCR to extract its text.
- `crop_and_extract_with_llm()`: Crops a specific area (the question or one of the choices) from a screenshot, saves the crop to `debug/`, and returns the OCR result for that region. Despite its name, it does not call the LLM.
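
To make the two preprocessing branches in `preprocess_and_ocr()` concrete, here is a pure-Python sketch of the per-pixel arithmetic behind `cv2.threshold` with `THRESH_BINARY` (used for the question) and `cv2.convertScaleAbs` (used for the choices), with the same parameters as the code cell. The function names `binary_threshold` and `scale_contrast` are illustrative only and are not OpenCV APIs.

```python
# Illustrative per-pixel versions of the two OpenCV operations (8-bit grayscale).
def binary_threshold(pixel, thresh=180, maxval=255):
    # THRESH_BINARY: pixels strictly above the threshold become maxval, the rest 0.
    return maxval if pixel > thresh else 0

def scale_contrast(pixel, alpha=2.0, beta=0):
    # convertScaleAbs: |pixel * alpha + beta|, saturated to the 0..255 range.
    return min(255, round(abs(pixel * alpha + beta)))

sample = [60, 128, 180, 181, 240]
thresholded = [binary_threshold(p) for p in sample]
stretched = [scale_contrast(p) for p in sample]
print(thresholded)
print(stretched)
```

Thresholding produces a strict black-and-white image, while contrast scaling brightens the region and clips anything that would exceed 255.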

In [None]:
OCR_CONFIGS = {
    "question": '--oem 3 --psm 6',
    "option": '--oem 3 --psm 7'
}

def preprocess_and_ocr(key, region_pil):
    img = np.array(region_pil.convert('RGB'))

    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    if key == 'question':
        _, processed = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)
        config = OCR_CONFIGS["question"]
    else:
        processed = cv2.convertScaleAbs(gray, alpha=2.0, beta=0)
        config = OCR_CONFIGS["option"]
    processed = cv2.resize(processed, (0, 0), fx=2.0, fy=2.0, interpolation=cv2.INTER_LINEAR)

    text = pytesseract.image_to_string(Image.fromarray(processed), config=config)
    return key, text.strip()

def crop_and_extract_with_llm(inputImageFn,tag):
    box=box_dict[tag]
    basename = os.path.splitext(inputImageFn)[0]
    im = Image.open(inputImageFn)
    im1 = im.crop(box)
    #im1.show()
    #outputFn=f"{basenam}_{tag}.png"
    outputFn = os.path.join(DEBUG_DIR, f"{os.path.basename(basename)}_{tag}.png")
    im1.save(outputFn, 'png')
    # preprocess_and_ocr returns (key, text); keep only the text for the prompt
    _, text = preprocess_and_ocr(tag, im1)
    return text


## 3️⃣ Answer Generation via LLM

The function `get_answer_with_llm()` sends the extracted question and choices to a locally hosted LLM (e.g., via Ollama at `localhost:11434`).

- Constructs a prompt and system instruction to make the model behave like a Kahoot quiz solver.
- Expects the model to respond with a single digit (1-4) only, with no explanation.

If your Ollama server hosts a different model, change the `MODEL` constant defined in the setup cell (it is assigned to `model_id` here).
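
A model may not always follow the "digit only" instruction and can return something like `Answer: 3.` or add surrounding whitespace. As a hedged sketch, the reply could be normalized before it is used to look up a click position. The helper name `parse_choice` is hypothetical and is not part of this notebook.

```python
import re

# Hypothetical helper: pull the first digit 1-4 out of the model's reply.
def parse_choice(reply):
    match = re.search(r"[1-4]", reply)
    # Return None when no valid choice is found so the caller can skip the click.
    return int(match.group()) if match else None

print(parse_choice("3"))
print(parse_choice("Answer: 2."))
print(parse_choice("I am not sure"))
```

This takes the first matching digit, so it assumes the reply mentions the chosen answer before any other number; a stricter parser may be preferable if the model tends to restate the question.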

In [None]:
def get_answer_with_llm(filename,endpoint='http://localhost:11434'):
    question=crop_and_extract_with_llm(filename,'question')
    choice1=crop_and_extract_with_llm(filename,'choice1')
    choice2=crop_and_extract_with_llm(filename,'choice2')
    choice3=crop_and_extract_with_llm(filename,'choice3')
    choice4=crop_and_extract_with_llm(filename,'choice4')

    model_id=MODEL
    
    prompt=f"{question}. Choice 1={choice1}, Choice 2={choice2}, Choice 3={choice3}, Choice 4={choice4}"
    print(prompt)
    payload = {
        "model": model_id,  # Replace with your actual model name
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are a Kahoot quiz competitor. Your goal is to select the most accurate answer "
                    "as fast as possible, based only on the question and the available choices. "
                    "Always respond with the correct choice number only (e.g., '3') without explanation. "
                    "Avoid any extra words or reasoning — be concise and immediate."
                )
            },
            {
                "role": "user",
                "content": (
                    prompt
                )
            }
        ],
        "temperature": 0,
        "max_tokens": 512,
        "stop": ["\n"]
    }
    chat_url=f"{endpoint}/v1/chat/completions"
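    response = requests.post(chat_url, json=payload)
    response.raise_for_status()  # fail early on HTTP errors from the LLM server
    # strip whitespace so the reply can be used to build a choice_dict key
    answer = response.json()['choices'][0]['message']['content'].strip()
    return answer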


## 4️⃣ Browser Automation with Playwright

The `join_kahoot_game()` function automates the entire Kahoot interaction:
- Joins the game using the provided PIN and nickname.
- Waits for user input (`1` to answer, `q` to quit).
- Takes a screenshot, extracts the question and options, calls the LLM, and clicks the answer the model selected.
- Times the answering process for performance benchmarking.
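
The notebook measures elapsed time with `datetime.now()`. As an alternative sketch, `time.perf_counter()` is a monotonic clock intended for measuring short durations. The wrapper name `timed` is hypothetical, and `sum` stands in for the real answer pipeline.

```python
import time

# Hypothetical wrapper: run a function and report how long it took.
def timed(fn, *args, **kwargs):
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in the notebook this would wrap the OCR + LLM call.
result, elapsed = timed(sum, [1, 2, 3])
print(f"Answer: {result} ({elapsed:.4f} seconds)")
```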

In [None]:
async def join_kahoot_game(pin: int, nickname: str):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        print("Navigating to Kahoot...")
        await page.goto("https://kahoot.it/")
        print("Entering Game PIN...")
        await page.fill('input[name="gameId"]', str(pin))
        await page.press('input[name="gameId"]', 'Enter')
        await page.wait_for_selector('input[type="text"]', timeout=10000)
        print("Entering Nickname...")
        await page.fill('input[type="text"]', nickname)
        await page.press('input[type="text"]', 'Enter')
        try:
            await page.wait_for_selector("text=You're in!", timeout=10000)
            print("✅ Joined successfully")
        except Exception:  # e.g. Playwright's timeout error; avoids swallowing KeyboardInterrupt
            print("⚠️ Join failed or timed out.")
            
        while True:
            cmd = input("Type '1' to answer question or 'q' to quit: ").strip()
            if cmd == "1":
                print("Option 1: Answer question")
                #start timer
                start_time = datetime.now()
                timestamp = start_time.strftime("%Y%m%d_%H%M%S")
                #filename = f"screenshot_{timestamp}.png"
                filename = os.path.join(DEBUG_DIR, f"screenshot_{timestamp}.png")
                await page.screenshot(path=filename)
                print(f"Screenshot saved to {filename}")
                display(IPyImage(filename))
                answer=get_answer_with_llm(filename,'http://localhost:11434')
                await click_browser_by_choice(page,answer)
                #end timer
                end_time = datetime.now()
                elapsed = (end_time - start_time).total_seconds()
                #output
                print(f"Answer: {answer} ({elapsed:.2f} seconds)")               
            elif cmd.lower() == "q":
                print("Exiting session.")
                break
            else:
                print("Unknown input. Please type '1' or 'q'.")
        await browser.close()

## 🚀 How to Play

Run the code cell below. When prompted:
1. **Enter the Kahoot Game PIN** (displayed on the host's screen).
2. **Enter your nickname** (any name you'd like to use in the quiz).
3. When a **question and all answer choices appear** on the Kahoot screen:
   - **Type `1`** and press Enter. This will:
     - Take a screenshot
     - Extract text using OCR
     - Send the question to the LLM
     - Automatically select the answer in the quiz
4. **Type `q` only when the quiz is completely over** to safely exit.
   - ⚠️ *Do NOT type `q` in the middle of a quiz round — you won't be able to rejoin the same game session!*

ℹ️ **Note:** This notebook assumes you have already set up Tesseract OCR, Playwright, and a local LLM server (like Ollama) to handle question answering.
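
Before starting a game, it can help to confirm the prerequisites are present. The following is a rough sketch, assuming Tesseract is expected on `PATH` and the Python packages are installed in the current environment. The function name `check_prerequisites` is hypothetical, and the check does not verify that the Ollama server is running or that Playwright's browsers are installed.

```python
import importlib.util
import shutil

# Hypothetical preflight check: report which prerequisites can be found.
def check_prerequisites():
    return {
        "tesseract binary": shutil.which("tesseract") is not None,
        "playwright package": importlib.util.find_spec("playwright") is not None,
        "pytesseract package": importlib.util.find_spec("pytesseract") is not None,
    }

status = check_prerequisites()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

A "missing" result for the Tesseract binary may be a false alarm if `pytesseract` has been pointed at an explicit executable path.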

In [None]:
pin = int(input("Enter Kahoot Game PIN: "))
nickname = input("Enter your nickname: ")
await join_kahoot_game(pin, nickname)


## 🔄 Want to Play Again?

To restart the quiz from the beginning:

1. Click the **"Restart Kernel and Run All Cells"** button (▶▶ icon) in the top menu of Jupyter Notebook.
2. This will:
   - Reset the bot state
   - Allow you to join a new quiz session

You're now ready to go another round!