# LLM 20 Questions Starter with Rigging

This starter notebook shows how the Python package rigging can be used to create a baseline submission for the competition. The setup serves a quantized `llama3` model with vLLM.

## Update **June 10, 2024**
- Updated code to work with rigging 2.0
- Added a non-LLM question-asking agent that leverages the known keywords. **Note: this won't work well on the private leaderboard.** The answer agent uses an LLM via rigging.

## What is Rigging?

Rigging is a lightweight LLM interaction framework built on Pydantic XML. The goal is to make leveraging LLMs in production pipelines as simple and effective as possible. Rigging is a good fit for the 20 questions task because it:
1. Makes it easy to swap out different backend LLM models.
2. Lets you design LLM querying pipelines that check for expected outputs and retry until successful.
3. Is written in modern Python with type hints, async support, pydantic validation, serialization, etc.

Star the repo here: https://github.com/dreadnode/rigging
Read the documentation here: https://rigging.dreadnode.io/

Rigging is built and maintained by [dreadnode](https://www.dreadnode.io/) where we use it daily for our work.

An example rigging pipeline might look like this:
```{python}
chat = await (
    rg.get_generator('gpt-4o')
    .chat(f"Provide me the names of all the countries in South America that start with the letter A within {Answer.xml_tags()} tags.")
    .until_parsed_as(Answer)
    .run()
)
```

Generators can be created seamlessly with most major LLM APIs, so long as you have API keys saved as environment variables.
```
export OPENAI_API_KEY=...
export TOGETHER_API_KEY=...
export TOGETHERAI_API_KEY=...
export MISTRAL_API_KEY=...
export ANTHROPIC_API_KEY=...
```
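
Before building a generator for a hosted provider, it can help to confirm which of these keys are actually set. The following is a minimal sketch using only the standard library; `configured_providers` is a hypothetical helper and not part of rigging.

```python
import os

# Environment variable names taken from the list above.
PROVIDER_KEYS = [
    "OPENAI_API_KEY",
    "TOGETHER_API_KEY",
    "TOGETHERAI_API_KEY",
    "MISTRAL_API_KEY",
    "ANTHROPIC_API_KEY",
]


def configured_providers(env=None):
    """Return the key names that are present and non-empty."""
    env = os.environ if env is None else env
    return [name for name in PROVIDER_KEYS if env.get(name)]


print(configured_providers())
```

Passing a plain dict instead of `os.environ` makes the helper easy to try without touching the real environment.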

For this competition we must run our model locally. Luckily, rigging has support for running models with transformers on the back end.

# Setup

Below is some of the setup for this notebook, where we will:
- Load secret tokens for huggingface and kaggle (optional)
- Install required packages
- Create a helper utility script for testing our vLLM server

This notebook uses some hidden tokens stored with Kaggle's secrets. This is optional and not required to run the code.
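
Outside a Kaggle session the `kaggle_secrets` module is not importable, so the same tokens have to come from somewhere else. As a rough sketch, and assuming you keep them in environment variables of the same name, a fallback could look like this; `load_secret` is a hypothetical helper, not part of Kaggle's or rigging's API.

```python
import os


def load_secret(name):
    """Return a secret from Kaggle if available, else from the environment, else None."""
    try:
        # Only importable inside a Kaggle notebook session.
        from kaggle_secrets import UserSecretsClient

        return UserSecretsClient().get_secret(name)
    except Exception:
        # Not on Kaggle, or the secret is not attached to this notebook.
        return os.environ.get(name)


HF_TOKEN = load_secret("HF_TOKEN")
```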

In [None]:
from kaggle_secrets import UserSecretsClient
secrets = UserSecretsClient()

HF_TOKEN: str | None  = None
KAGGLE_KEY: str | None = None
KAGGLE_USERNAME: str | None = None
    
try:
    HF_TOKEN = secrets.get_secret("HF_TOKEN")
    KAGGLE_KEY = secrets.get_secret("KAGGLE_KEY")
    KAGGLE_USERNAME = secrets.get_secret("KAGGLE_USERNAME")
except Exception:
    # Secrets are optional; continue without them if they are not attached.
    pass

## Pip install
We will install:
- [rigging](https://github.com/dreadnode/rigging) Used to create our LLM pipelines for the competition.
- [vLLM](https://github.com/vllm-project/vllm) For hosting our model locally as an independent service.

We also use [uv](https://github.com/astral-sh/uv) which allows us to install these packages much faster.

**Note:** We are installing these packages to the `/kaggle/tmp/lib` directory. We only do this for the purposes of the competition setup, where we will later need to include the files from this path in our submission zip. We also install the vllm dependencies to `/kaggle/tmp/srvlib`.

In [None]:
# Dependencies (uv for speed)
!pip install uv==0.1.45

!uv pip install -U \
    --python $(which python) \
    --target /kaggle/tmp/lib \
    rigging==2.0.0 \
    kaggle

!uv pip install -U \
    --python $(which python) \
    --target /kaggle/tmp/srvlib \
    vllm==0.4.2 \
    numpy==1.26.4

# Download the LLM Locally

Because this competition requires us to submit our code with model weights, we will first download the model weights using `snapshot_download` from huggingface.

We are going to download `solidrust/Meta-Llama-3-8B-Instruct-hf-AWQ`. This is an Activation-aware Weight Quantization (AWQ) version of the model that is small enough to run within the competition requirements.

**Note**: When using rigging in a normal situation this step would not be necessary, but we are downloading the weights separately so that we can include them in our submission zip for the competition.

In [None]:
# Download the model

from huggingface_hub import snapshot_download
from pathlib import Path
import shutil

g_model_path = Path("/kaggle/tmp/model")
if g_model_path.exists():
    shutil.rmtree(g_model_path)
g_model_path.mkdir(parents=True)

snapshot_download(
    repo_id="solidrust/Meta-Llama-3-8B-Instruct-hf-AWQ",
    ignore_patterns="original*",
    local_dir=g_model_path,
    local_dir_use_symlinks=False,
    token=globals().get("HF_TOKEN", None)
)

We can see the model weights are stored in `/kaggle/tmp/model/`

In [None]:
!ls -l /kaggle/tmp/model

# Helper Utilities File

These are helper functions we will use for starting our vLLM server.
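
The core idea is to poll a TCP port until something accepts connections on it. The sketch below shows that check in isolation against a throwaway listener instead of a real vLLM process; `port_is_open` is a hypothetical stand-in for the `check_port` helper, and `127.0.0.1` is assumed as the host.

```python
import socket


def port_is_open(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)
        return sock.connect_ex((host, port)) == 0


# Bind a throwaway listener on a free port chosen by the OS (port 0).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
free_port = listener.getsockname()[1]

open_while_listening = port_is_open(free_port)
listener.close()
open_after_close = port_is_open(free_port)

print(open_while_listening, open_after_close)
```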

In [None]:
%%writefile util.py

# Helpers for starting the vLLM server

import subprocess
import os
import socket
import time

def check_port(port: int) -> bool:
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(1)
            result = sock.connect_ex(('localhost', port))
            if result == 0:
                return True
    except socket.error:
        pass
    
    return False

def run_and_wait_for_port(
    cmd: list[str], port: int, env: dict[str, str] | None, timeout: int = 60, debug: bool = False
) -> subprocess.Popen:
    
    if check_port(port):
        raise ValueError(f"Port {port} is already open")
        
    popen = subprocess.Popen(
        cmd,
        env={**os.environ, **(env or {})},
        stdout=subprocess.DEVNULL if not debug else None,
        stderr=subprocess.DEVNULL if not debug else None,
    )
    
    start_time = time.time()
    while time.time() - start_time < timeout:
        if check_port(port):
            return popen
        time.sleep(1)
    
    popen.terminate()
    raise Exception(f"Process did not open port {port} within {timeout} seconds.")

# Starting up our vLLM server for testing

Our model will be hosted using a vLLM server. Below we will start up the server so we can understand how it works in the Kaggle environment.
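
The server started here exposes an OpenAI-compatible HTTP API, which is what rigging talks to later. As an illustration only, the sketch below builds (but does not send) a chat completion request with the standard library. The port and model name are assumptions chosen to match the settings cell that follows.

```python
import json
import urllib.request

# Assumed values matching the settings used in this notebook; adjust to your setup.
port = 9999
model_name = "custom"

payload = {
    "model": model_name,
    "messages": [{"role": "user", "content": "Say Hello!"}],
    "max_tokens": 32,
}

request = urllib.request.Request(
    url=f"http://localhost:{port}/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending is left commented out because it needs the server to be running:
# with urllib.request.urlopen(request, timeout=30) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(request.full_url)
```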

In [None]:
# vLLM paths and settings.

import importlib
from pathlib import Path
import util

util = importlib.reload(util)

g_srvlib_path = Path("/kaggle/tmp/srvlib")
assert g_srvlib_path.exists()

g_model_path = Path("/kaggle/tmp/model")
assert g_model_path.exists()

g_vllm_port = 9999
g_vllm_model_name = "custom"

In [None]:
# Run the vLLM server using subprocess
vllm = util.run_and_wait_for_port([
    "python", "-m",
    "vllm.entrypoints.openai.api_server",
    "--enforce-eager",
    "--model", str(g_model_path),
    "--port", str(g_vllm_port),
    "--served-model-name", g_vllm_model_name
],
    g_vllm_port,
    {"PYTHONPATH": str(g_srvlib_path)},
    debug=False
)

print("vLLM Started")

We can see that the llama3 model is loaded onto the 1st Tesla T4 GPU.

In [None]:
!nvidia-smi

## Validating the Model

Let's create our first rigging generator. In rigging, generators are the foundation for creating powerful LLM pipelines.

In [None]:
# Connect with Rigging

import sys
import logging

sys.path.insert(0, "/kaggle/tmp/lib")

logging.getLogger("LiteLLM").setLevel(logging.WARNING)

import rigging as rg

generator = rg.get_generator(
    f"openai/{g_vllm_model_name}," \
    f"api_base=http://localhost:{g_vllm_port}/v1," \
    "api_key=sk-1234," \
    "stop=<|eot_id|>" # Llama requires some hand holding
)

answer = await generator.chat("Say Hello!").run()

print()
print('[Rigging Chat]')
print(type(answer), answer)

print()
print('[LLM Response Only]')
print(type(answer.last), answer.last)

print()
answer_string = answer.last.content
print('[LLM Response as a String]')
print(answer_string)

## Converting results to pandas dataframe

Using the `to_df()` method we can easily convert the chat history to a pandas dataframe.

In [None]:
answer.to_df()

## Changing Model Parameters

Much like database connection strings, Rigging generators can be represented as strings which define what provider, model, API key, generation params, etc. should be used. They are formatted as follows:

```
<provider>!<model>,<**kwargs>
```
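
To make the format concrete, here is a small illustrative parser for such a string. It is a sketch only: `split_generator_id` is a hypothetical function, values are kept as plain strings, and rigging's own parsing may differ (for example in how it converts value types).

```python
def split_generator_id(identifier):
    """Split '<provider>!<model>,<key>=<value>,...' into (provider, model, params)."""
    head, *pairs = identifier.split(",")
    # The provider prefix is optional in this sketch.
    provider, _, model = head.rpartition("!")
    params = {}
    for pair in pairs:
        key, _, value = pair.partition("=")  # split on the first '=' only
        params[key.strip()] = value.strip()
    return provider or None, model, params


provider, model, params = split_generator_id(
    "openai/custom,api_base=http://localhost:9999/v1,api_key=sk-1234,stop=<|eot_id|>"
)
print(provider, model, params)
```

The identifier strings used in this notebook have no `!` in them, so the sketch reports no explicit provider and treats `openai/custom` as the model part.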

As an example, here we load the model with additional parameters:
- temperature=0.9
- max_tokens=512

You can read more about these in the docs here: https://rigging.dreadnode.io/topics/generators/#overload-generation-params

In [None]:
generator = rg.get_generator(
    f"openai/{g_vllm_model_name}," \
    f"api_base=http://localhost:{g_vllm_port}/v1," \
    "api_key=sk-1234," \
    "temperature=0.9,max_tokens=512," \
    "stop=<|eot_id|>" # Llama requires some hand holding,
)

Alternatively we can set these parameters using the `rg.GenerateParams` class. This class allows you to set various model parameters:

```
rg.GenerateParams(
    *,
    temperature: float | None = None,
    max_tokens: int | None = None,
    top_k: int | None = None,
    top_p: float | None = None,
    stop: list[str] | None = None,
    presence_penalty: float | None = None,
    frequency_penalty: float | None = None,
    api_base: str | None = None,
    timeout: int | None = None,
    seed: int | None = None,
    extra: dict[str, typing.Any] = None,
)
```

https://rigging.dreadnode.io/api/generator/#rigging.generator.GenerateParams

In [None]:
rg_params = rg.GenerateParams(
    temperature = 0.9,
    max_tokens = 512,
)
base_chat = generator.chat(params=rg_params)
answer = await base_chat.fork('How is it going?').run()
print(answer.last.content)

Or parameters can be set within the chain using `with_()`.

In [None]:
base_chat = generator.chat() # No params set
answer = await base_chat.fork('How is it going?') \
    .with_(temperature = 0.9, max_tokens = 512) \
    .run()
print(answer.last.content)

# Parsed outputs example

Next we will create a pipeline where we:
1. Create a rigging Model called `Answer`. This explains the expected output that we will parse from the model results.
    - We will add some validators to this that will ensure the output is either `yes` or `no`
    - This is fully customizable.
    - Here `validate_content` is ensuring that our response conforms to the expected output (lowercase and starts with "yes" or "no")
2. We can use the `Answer.xml_example()` in our prompt to let the LLM know how we expect the output to look.
3. Later on we will use `.until_parsed_as(Answer)` to ensure the LLM output is extracted as defined here.

**Note** `until_parsed_as()` can take a `max_rounds` parameter, which by default is 5.
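
The parse-and-retry idea can be shown without any LLM. The sketch below is a dependency-free approximation, not rigging's implementation: canned strings stand in for model output, the `<answer>` tag name is an assumption, and `parse_answer` and `until_parsed` are hypothetical helpers.

```python
import re


def parse_answer(text):
    """Extract a yes/no answer from '<answer>...</answer>' or raise ValueError."""
    match = re.search(r"<answer>(.*?)</answer>", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No <answer> tag found")
    content = match.group(1).strip().lower()
    for valid in ("yes", "no"):
        if content.startswith(valid):
            return valid
    raise ValueError("Invalid answer, must be 'yes' or 'no'")


def until_parsed(responses, max_rounds=5):
    """Try successive responses until one parses, giving up after max_rounds."""
    for round_number, text in enumerate(responses, start=1):
        if round_number > max_rounds:
            break
        try:
            return parse_answer(text), round_number
        except ValueError:
            continue  # in a real pipeline the model would be asked again
    raise RuntimeError(f"No valid answer within {max_rounds} rounds")


# Canned strings stand in for LLM output.
fake_responses = ["I think so!", "<answer>maybe</answer>", "<answer>Yes, he is.</answer>"]
result = until_parsed(fake_responses)
print(result)
```

The first two canned responses fail (no tag, then an invalid value), so the answer is only accepted on the third round.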

In [None]:
import typing as t
from pydantic import field_validator

class Answer(rg.Model):
    content: t.Literal["yes", "no"]

    @field_validator("content", mode="before")
    def validate_content(cls, v: str) -> str:
        for valid in ["yes", "no"]:
            if v.lower().startswith(valid):
                return valid
        raise ValueError("Invalid answer, must be 'yes' or 'no'")

    @classmethod
    def xml_example(cls) -> str:
        return f"{Answer.xml_start_tag()}**yes/no**{Answer.xml_end_tag()}"


In [None]:
# Let's see what the XML example looks like; we can use this in our prompt.
Answer.xml_example()

In [None]:
generator = rg.get_generator(
    f"openai/{g_vllm_model_name}," \
    f"api_base=http://localhost:{g_vllm_port}/v1," \
    "api_key=sk-1234," \
    "stop=<|eot_id|>" # Llama requires some hand holding,
)

keyword='Tom Hanks'
category='Famous Person'
last_question='Is it a famous person?'

prompt = f"""\
            The secret word for this game is "{keyword}" [{category}]

            You are currently answering a question about the word above.

            The next question is "{last_question}".

            Answer the yes/no question above and place it in the following format:
            {Answer.xml_example()}

            - Your response should be accurate given the keyword above
            - Always answer with "yes" or "no"

            What is the answer?
"""

chat = await (
    generator
    .chat(prompt)
    .until_parsed_as(Answer, max_rounds=50)
    .run()
)

print('=== Full Chat ===')
print(chat)

print()
print('=== LLM Response Only ===')
print(chat.last)

print()
print('=== Parsed Answer ===')
print(chat.last.parse(Answer).content)

# Create an example Questioner Chat Pipeline with Rigging

Next let's create the questioner pipeline that will attempt to help determine what the keyword might be.

First let's create a `Question` object which we will use to parse our output.

In [None]:
from pydantic import StringConstraints  # noqa

str_strip = t.Annotated[str, StringConstraints(strip_whitespace=True)]

class Question(rg.Model):
    content: str_strip

    @classmethod
    def xml_example(cls) -> str:
        return Question(content="**question**").to_pretty_xml()

In [None]:
base =  generator.chat("""\
You are a talented player of the 20 questions game. You are accurate, focused, and
structured in your approach. You will create useful questions, make guesses, or answer
questions about a keyword.

""")


question_chat = await (base.fork(
    f"""\
    You are currently asking the next question.

    Ask the next most useful yes/no question and place it in the following format:
    {Question.xml_example()}

    - Your response should be a focused question which will gather the most information
    - Start general with your questions
    - Always try to bisect the remaining search space
    - Pay attention to previous questions and answers

    What is your next question?
    """
)
.until_parsed_as(Question, attempt_recovery=True)
.run()
)

In [None]:
# Dataframe representation of the conversation
question_chat.to_df()

We are now confident that the LLM response contains the question and can parse it like this:

In [None]:
question = question_chat.last.parse(Question).content
print(question)

# Create a keyword dataframe
**Note: this only works because we know the possible keywords in the public set. This will not work on the final leaderboard.**
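
The cells below download the public keyword list and flatten it into a table of categories and keywords. The flattening step can be sketched without pandas as follows; the data here is made up and only mirrors the shape the notebook code reads (`category`, `words`, `keyword`), and `flatten_keywords` is a hypothetical helper.

```python
import json

# Toy data in the shape the notebook code reads: a list of categories,
# each with a list of {"keyword": ...} entries. The values are made up.
toy_keywords_json = json.dumps([
    {"category": "place", "words": [{"keyword": "france"}, {"keyword": "tokyo"}]},
    {"category": "things", "words": [{"keyword": "anchor"}]},
])


def flatten_keywords(keywords_json):
    """Return one (category, keyword, first_letter) row per keyword."""
    rows = []
    for group in json.loads(keywords_json):
        for word in group["words"]:
            keyword = word["keyword"]
            rows.append((group["category"], keyword, keyword[0]))
    return rows


rows = flatten_keywords(toy_keywords_json)
print(rows)
```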

In [None]:
!wget -O keywords_local.py https://raw.githubusercontent.com/Kaggle/kaggle-environments/master/kaggle_environments/envs/llm_20_questions/keywords.py

In [None]:
!head keywords_local.py

In [None]:
import sys
import json
import pandas as pd
sys.path.append('./')
from keywords_local import KEYWORDS_JSON

def capitalize_first_word(text):
    if not text:
        return text
    return text[0].upper() + text[1:].lower()

def create_keyword_df(KEYWORDS_JSON):
    keywords_dict = json.loads(KEYWORDS_JSON)

    category_words_dict = {}
    all_words = []
    all_cat_words = []
    for d in keywords_dict:
        words = [w['keyword'] for w in d['words']]
        cat_word = [(d['category'], w['keyword']) for w in d['words']]
        category_words_dict[d['category']] = words
        all_words += words
        all_cat_words += cat_word

    keyword_df = pd.DataFrame(all_cat_words, columns=['category','keyword'])
    keyword_df['first_letter'] = keyword_df['keyword'].str[0]
    keyword_df['second_letter'] = keyword_df['keyword'].str[1]
    keyword_df.to_parquet('keywords.parquet')
    
create_keyword_df(KEYWORDS_JSON)

In [None]:
keywords_df = pd.read_parquet('keywords.parquet')
keywords_df.sample(10)

In [None]:
keywords_df['category'].value_counts()

# Create `main.py` Script for Final Submission.

Our final submission will be a zipped directory with a `main.py` file. This file is below.
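
One detail worth calling out: the answerer in this file queries the model five times and returns the most frequent reply. A minimal sketch of that vote is shown here, using `collections.Counter` as a swapped-in alternative to the pandas `value_counts()` call in the file; `majority_vote` is a hypothetical helper.

```python
from collections import Counter


def majority_vote(responses):
    """Return the most frequent item in a non-empty list of responses."""
    if not responses:
        raise ValueError("responses must not be empty")
    winner, _count = Counter(responses).most_common(1)[0]
    return winner


print(majority_vote(["yes", "no", "yes", "yes", "no"]))
```

An odd number of samples avoids ties when there are only two possible answers.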

In [None]:
%%writefile main.py

# Main agent file

import itertools
import os
import sys
import typing as t
from pathlib import Path
import logging

import string
import numpy as np
import pandas as pd

# Path fixups

g_working_path = Path('/kaggle/working')
g_input_path = Path('/kaggle/input')
g_temp_path = Path("/kaggle/tmp")
g_agent_path = Path("/kaggle_simulations/agent/")

g_model_path = g_temp_path / "model"
g_srvlib_path = g_temp_path / "srvlib"
g_lib_path = g_temp_path / "lib"

if g_agent_path.exists():
    g_lib_path = g_agent_path / "lib"
    g_model_path = g_agent_path / "model"
    g_srvlib_path = g_agent_path / "srvlib"
else:
    g_agent_path = Path('/kaggle/working')
    
sys.path.insert(0, str(g_lib_path))

# Logging noise

logging.getLogger("LiteLLM").setLevel(logging.WARNING)

# Fixed imports

import util # noqa
import rigging as rg  # noqa
from pydantic import BaseModel, field_validator, StringConstraints  # noqa

# Constants

g_vllm_port = 9999
g_vllm_model_name = "custom"

g_generator_id = (
    f"openai/{g_vllm_model_name}," \
    f"api_base=http://localhost:{g_vllm_port}/v1," \
    "api_key=sk-1234," \
    "stop=<|eot_id|>" # Llama requires some hand holding
)

# Types

str_strip = t.Annotated[str, StringConstraints(strip_whitespace=True)]

class Observation(BaseModel):
    step: int
    role: t.Literal["guesser", "answerer"]
    turnType: t.Literal["ask", "answer", "guess"]
    keyword: str
    category: str
    questions: list[str]
    answers: list[str]
    guesses: list[str]
    
    @property
    def empty(self) -> bool:
        return all(len(t) == 0 for t in [self.questions, self.answers, self.guesses])
    
    def get_history(self) -> t.Iterator[tuple[str, str, str]]:
        return itertools.zip_longest(self.questions, self.answers, self.guesses, fillvalue="[none]")

    def get_history_as_xml(self, *, skip_guesses: bool = False) -> str:
        if not self.empty:
            history = "\n".join(
            f"""\
            <turn-{i}>
            Question: {question}
            Answer: {answer}
            {'Guess: ' + guess if not skip_guesses else ''}
            </turn-{i}>
            """
            for i, (question, answer, guess) in enumerate(self.get_history())
            )
            return history
        return "none yet."


class Answer(rg.Model):
    content: t.Literal["yes", "no"]

    @field_validator("content", mode="before")
    def validate_content(cls, v: str) -> str:
        for valid in ["yes", "no"]:
            if v.lower().startswith(valid):
                return valid
        raise ValueError("Invalid answer, must be 'yes' or 'no'")

    @classmethod
    def xml_example(cls) -> str:
        return f"{Answer.xml_start_tag()}yes/no{Answer.xml_end_tag()}"


class Question(rg.Model):
    content: str_strip

    @classmethod
    def xml_example(cls) -> str:
        return Question(content="question").to_pretty_xml()


class Guess(rg.Model):
    content: str_strip

    @classmethod
    def xml_example(cls) -> str:
        return Guess(content="thing/place").to_pretty_xml()


# Functions

async def ask(base: rg.ChatPipeline, observation: Observation) -> str:
    if observation.step == 0:
        # override first question until keyword bug is fixed.
        return "Are we playing 20 questions?"
    
    try:
        chat = await (
             base.fork(
                f"""\
                You are currently asking the next question.

                <game-history>
                {observation.get_history_as_xml(skip_guesses=True)}
                </game-history>

                Based on the history above, ask the next most useful yes/no
                question and place it in the following format:
                {Question.xml_example()}

                - Your response should be a focused question which will gather the most information
                - Start general with your questions
                - Always try to bisect the remaining search space
                - Pay attention to previous questions and answers

                What is your next question?
                """
            )
            .until_parsed_as(Question, attempt_recovery=True, max_rounds=20)
            .run()
        )
        return chat.last.parse(Question).content.strip('*')
    except rg.error.MessagesExhaustedMaxRoundsError:
        return 'Is it a person?'

async def answer(base: rg.ChatPipeline, observation: Observation) -> t.Literal["yes", "no"]:
    if not observation.keyword:
        print("Keyword wasn't provided to answerer", file=sys.stderr)
        return "yes" # override until keyword bug is fixed.
            
    last_question = observation.questions[-1]
    
#     print('=' * 10)
#     print(f"""\
#                 Provide the best yes/no answer to the question about the keyword [{observation.keyword}] in the category [{observation.category}]

#                 [QUESTION] "{last_question}" [/QUESTION]
                
#                 Remember they keyword is [{observation.keyword}]
                
#                 Answer the yes/no question above and place it in the following format:
#                 {Answer.xml_example()}
#                 """
#          )
#     print('=' * 10)
    try:
        responses = []
        for i in range(5):
            # Loop 5 times and take the most frequent response
            chat = await (
                base.fork(
                    f"""\
                    Provide the best yes/no answer to the question about the keyword [{observation.keyword}] in the category [{observation.category}]

                    [QUESTION] "{last_question}" [/QUESTION]

                    Remember the keyword is [{observation.keyword}]

                    Answer the yes/no question above and place it in the following format:
                    {Answer.xml_example()}
                    """
                )
                .until_parsed_as(Answer, attempt_recovery=True, max_rounds=20)
                .run()
            )
            responses.append(chat.last.parse(Answer).content.strip('*'))
            
        print(f'Responses are {responses}')
        return pd.Series(responses).value_counts().index[0]
#         print('=' * 10)
#         print('Response.....')
#         print(chat.last)
#         print('=' * 10)
#         return chat.last.parse(Answer).content.strip('*')
    except rg.error.MessagesExhaustedMaxRoundsError:
        print('%%%%%%%%%%%% Error so answering yes %%%%%%%%%%%% ')
        return 'yes'

async def guess(base: rg.ChatPipeline, observation: Observation) -> str:
    try:

        chat = await (
            base.fork(
                f"""\
                You are currently making an informed guess of the keyword.

                <game-history>
                {observation.get_history_as_xml()}
                </game-history>

                Based on the history above, produce a single next best guess
                for the keyword and place it in the following format:
                {Guess.xml_example()}

                - Avoid repeat guesses based on the history above
                - The guess should be a specific person, place, or thing

                What is your guess?
                """
            )
            .until_parsed_as(Guess, attempt_recovery=True, max_rounds=20)
            .run()
        )

        return chat.last.parse(Guess).content.strip('*')
    except rg.error.MessagesExhaustedMaxRoundsError:
        return 'france'
    
# vLLM and Generator

try:
    vllm = util.run_and_wait_for_port([
        "python", "-m",
        "vllm.entrypoints.openai.api_server",
        "--enforce-eager",
        "--model", str(g_model_path),
        "--port", str(g_vllm_port),
        "--served-model-name", g_vllm_model_name
    ], g_vllm_port, {"PYTHONPATH": str(g_srvlib_path)})

    print("vLLM Started")
except ValueError:
    print('vLLM Already Running')
    
    
generator = rg.get_generator(g_generator_id)

base =  generator.chat("""\
You are a talented player of the 20 questions game. You are accurate, focused, and
structured in your approach. You will create useful questions, make guesses, or answer
questions about a keyword.

""")

# Entrypoint
def format_first_letter_question(letters):
    if not letters:
        return "Does the keyword start with any letter?"
    
    if len(letters) == 1:
        return f"Does the keyword start with the letter '{letters[0]}'?"
    
    formatted_letters = ", ".join(f"'{letter}'" for letter in letters[:-1])
    formatted_letters += f" or '{letters[-1]}'"
    
    return f"Does the keyword start with one of the letters {formatted_letters}?"

import re

def extract_letters_from_question(question):
    pattern = r"'([a-zA-Z])'"
    matches = re.findall(pattern, question)
    return matches

# Simple question asker
class SimpleQuestionerAgent():
    def __init__(self, keyword_df: pd.DataFrame):
        self.keyword_df = keyword_df
        self.keyword_df_init = keyword_df.copy()
        self.round = 0
        self.category_questions = [
            "Are we playing 20 questions?",
            "Is the keyword a thing that is not a place?",
            "Is the keyword a place?",
        ]
        self.found_category = False
        
    def filter_keywords(self, obs):
        print(self.keyword_df.shape)
        # Filter down keyword_df based on past answers
        for i, answer in enumerate(obs.answers):
            if obs.questions[i] in self.category_questions:
                if answer == 'yes':
                    if obs.questions[i] == "Is the keyword a thing that is not a place?":
                        self.found_category = 'things'
                    if obs.questions[i] == "Is the keyword a place?":
                        self.found_category = 'place'
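                    fc = self.found_category
                    # Only filter once a category is known; a 'yes' to the warm-up
                    # question would otherwise compare against False and empty the frame.
                    if fc:
                        self.keyword_df = self.keyword_df.query('category == @fc').reset_index(drop=True)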
    
            if obs.questions[i].startswith('Does the keyword start '):
                if self.keyword_df['first_letter'].nunique() <= 1:
                    break
                letter_question = obs.questions[i]
#                 letters = letter_question.replace('Precisely evaluate the very first letter in the keyword. If the keyword is multiple words only evaluate the first word. Answer Yes/No if ANY of these letters are the first letter in the keyword: ','')
#                 letters = letter_question.split(' (say yes if it does start with one of them, no if it doesnt) ')[-1]
#                 letters = letters.replace('?','').replace(' ','').replace(':','').replace('[','').replace(']','').strip().split(',')
                letters = extract_letters_from_question(letter_question)
                self.keyword_df = self.keyword_df.reset_index(drop=True).copy()
                if obs.answers[i] == 'yes':
#                     print(f'Filtering down to letters {letters}')
                    self.keyword_df = self.keyword_df.loc[
                        self.keyword_df['first_letter'].isin(letters)].reset_index(drop=True).copy()
                elif obs.answers[i] == 'no':
#                     print(f'Excluding letters {letters}')
                    self.keyword_df = self.keyword_df.loc[
                        ~self.keyword_df['first_letter'].isin(letters)].reset_index(drop=True).copy()
        if len(self.keyword_df) == 0:
            # Reset
            self.keyword_df = self.keyword_df_init.copy()
            
    def get_letters(self, obs, max_letters=20):
        n_letters = self.keyword_df['first_letter'].nunique()
        sample_letters = self.keyword_df['first_letter'].drop_duplicates().sample(n_letters // 2).values.tolist()
        sample_letters = sample_letters[:max_letters]
        print('sample letters', n_letters, sample_letters)
        return sample_letters # ', '.join(sample_letters)
    
    def __call__(self, obs, *args):
        if len(self.keyword_df) == 0:
            # Reset
            self.keyword_df = self.keyword_df_init.copy()
        self.filter_keywords(obs)
        if obs.turnType == 'ask':
            self.round += 1
            if (self.round <= 3 and not self.found_category):
                response = self.category_questions[self.round - 1]
            else:
                sample_letters = self.get_letters(obs)
                if len(sample_letters) == 0:
                    n_sample = min(len(self.keyword_df), 10)
                    possible_keywords = ", ".join(self.keyword_df['keyword'].sample(n_sample).values.tolist())
                    response = f"Is the keyword one of the following? {possible_keywords}"
                else:
                    sample_letters_str = str(sample_letters).replace('[','').replace(']','')
#                     response = f'Does the keyword start with one of the following letters : {sample_letters_str} ?'
                    response = format_first_letter_question(sample_letters)
        elif obs.turnType == 'guess':
            response = self.keyword_df['keyword'].sample(1).values[0]
            # Remove the guessed word
            updated_df = self.keyword_df.loc[self.keyword_df['keyword'] != response].reset_index(drop=True).copy()
            if len(updated_df) >= 1:
                self.keyword_df = updated_df.copy()
            else:
                self.keyword_df = self.keyword_df_init.copy() # Reset the df
#         print(f'Round {self.round}')
#         print(f"{response=}")
#         print(f'keyword_df size {self.keyword_df.shape}')
        return response


keyword_df = pd.read_parquet(f'{g_agent_path}/keywords.parquet')
question_agent = None

async def observe(obs: t.Any) -> str:
    observation = Observation(**obs.__dict__)
    global question_agent
    if question_agent is None:
        question_agent = SimpleQuestionerAgent(keyword_df)

    try:
        match observation.turnType:
            case "ask":
#                 return await ask(base, observation)
                return question_agent(obs)
            case "answer":
                return await answer(base, observation)
            case "guess":
#                 return await guess(base, observation)
                return question_agent(obs)

            case _:
                raise ValueError("Unknown turn type")
    except Exception as e:
        print(str(e), file=sys.stderr)
        raise

def agent_fn(obs: t.Any, _: t.Any) -> str:
    # Async gate when executing in their framework
    import asyncio
    return asyncio.run(observe(obs))


# Test the Agent Against Itself
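
The questioner in `main.py` narrows down the keyword's first letter by asking about half of the remaining letters each round. As a rough sketch of why this converges quickly, the code below counts the rounds needed under two simplifying assumptions: the half is a deterministic prefix of the sorted letters (the agent samples a random half instead), and every answer is correct. `rounds_to_isolate` is a hypothetical helper.

```python
import math
import string


def rounds_to_isolate(letters, target):
    """Count yes/no questions needed to narrow `letters` down to `target`."""
    candidates = sorted(set(letters))
    rounds = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]  # letters named in the question
        if target in half:
            candidates = half  # answer was "yes"
        else:
            candidates = [c for c in candidates if c not in half]  # answer was "no"
        rounds += 1
    return rounds


alphabet = list(string.ascii_lowercase)
worst_case = max(rounds_to_isolate(alphabet, letter) for letter in alphabet)
print(worst_case, math.ceil(math.log2(len(alphabet))))
```

In practice the LLM answerer can get a letter question wrong, which is why the self-play loop below compares each answer against a ground-truth answer.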

In [None]:
def format_first_letter_question(letters):
    if not letters:
        return "Does the keyword start with any letter?"
    
    if len(letters) == 1:
        return f"Does the keyword start with the letter '{letters[0]}'?"
    
    formatted_letters = ", ".join(f"'{letter}'" for letter in letters[:-1])
    formatted_letters += f" or '{letters[-1]}'"
    
    return f"Does the keyword start with one of the letters {formatted_letters}?"

format_first_letter_question(['a','b','c'])

import re

def extract_letters_from_question(question):
    pattern = r"'([a-zA-Z])'"
    matches = re.findall(pattern, question)
    return matches

In [None]:
%load_ext autoreload
%autoreload 2
from main import Observation, agent_fn, observe

In [None]:
# Check if vllm is running
!ps -aef | grep vllm

In [None]:
import pandas as pd

keyword_df = pd.read_parquet('keywords.parquet')
sample = keyword_df.sample(1)

obs = Observation(step = 0,
    role = 'guesser',
    turnType= "ask",
    keyword= sample['keyword'].values[0],
    category= sample['category'].values[0],
    questions = [],
    answers= [],
    guesses= [],
)


for i in range(20):
    obs.role = 'guesser'
    obs.turnType = 'ask'
    question = await observe(obs)
    print(f'[{i} Question]: {question}')
    obs.questions.append(question)
    obs.role = 'answerer'
    obs.turnType = 'answer'
    answer = await observe(obs)
    obs.answers.append(answer)
    
    if obs.questions[-1].startswith('Are we playing 20 questions?'):
        gt_answer = answer # whatever
    elif obs.questions[-1].startswith('Is the keyword a thing that is not a place?'):
        if sample['category'].values[0] == 'things':
            gt_answer = 'yes'
        else:
            gt_answer = 'no'
    elif obs.questions[-1].startswith('Is the keyword a place?'):
        if sample['category'].values[0] == 'place':
            gt_answer = 'yes'
        else:
            gt_answer = 'no'
    elif obs.questions[-1].startswith('Does the keyword start'):
        letters_guess = extract_letters_from_question(obs.questions[-1])
        gt_answer = obs.keyword[0] in letters_guess
        gt_answer = 'yes' if gt_answer else 'no'
    elif obs.questions[-1].startswith('Is the keyword one of the following?'):
        possible_kw = obs.questions[-1].replace('Is the keyword one of the following? ','').split(',')
        possible_kw = [c.strip(' ') for c in possible_kw]
        print(possible_kw)
        gt_answer = obs.keyword in possible_kw
        gt_answer = 'yes' if gt_answer else 'no'

    print(f'[{i} Answer]: {answer} [True Answer]: {gt_answer}')
    if answer != gt_answer:
        break

    obs.role = 'guesser'
    obs.turnType = 'guess'
    guess = await observe(obs)
    print(f'[{i} Guess]: {guess} - [Keyword]: {obs.keyword}')
    obs.guesses.append(guess)
    if guess == obs.keyword:
        print('GOT IT!')
        break
        
    obs.step += 1

# Zipping Model and Code for Submission

In [None]:
!apt install pigz pv

In [None]:
!tar --use-compress-program='pigz --fast' \
    -cf submission.tar.gz \
    --dereference \
    -C /kaggle/tmp model lib srvlib \
    -C /kaggle/working main.py util.py \
    -C /kaggle/working keywords.parquet

In [None]:
!ls -GFlash --color

# Submitting using Kaggle CLI

Optionally, you can submit using the Kaggle CLI without needing to re-run and commit the notebook.

In [None]:
# !KAGGLE_USERNAME={KAGGLE_USERNAME} \
#  KAGGLE_KEY={KAGGLE_KEY} \
#  kaggle competitions submit -c llm-20-questions -f submission.tar.gz -m "submit from notebook"