# Thudbot Scaffold - Next part

This notebook is a prototype Thudbot built in Jupyter.

In notebook 00, I worked through the steps below to evaluate and choose an appropriate retriever for Thudbot.

Steps:
1. ✅ General setup
2. ✅ Data collection and Preparation
3. ✅ SDG with RAGAS to create a golden data set
4. ✅ Setup the RAG chain (finally)
5. ✅ Evaluate results with RAGAS
6. ✅ Refine RAG performance (prompt tuning, retrieval methods)

Now I will:
1. Rebuild the final RAG here (without the SDG or RAGAS eval)
2. Add agentic tool calls
3. Add external APIs

Then:
- Convert to a standalone Python script
- Build or reuse a chatbot front end to run it locally


Naming this 01_ 

## Step 1 General setup

In [1]:
### API key management and environment variables

### Reminder: place the .env file in the root of the project folder so the call below can find it and load it into the notebook environment
### PLEASE ADD THIS `.env` FILE TO YOUR PROJECT'S `.gitignore` before committing and pushing to your remote repo, as it contains API keys and secrets

import os
from dotenv import load_dotenv

load_dotenv(dotenv_path="../.env", override=True)

# --- Verify API Keys ---
print("--- API Key Status ---")
print(f"OPENAI_API_KEY loaded: {'OPENAI_API_KEY' in os.environ}")
print(f"LANGCHAIN_API_KEY loaded: {'LANGCHAIN_API_KEY' in os.environ}")
print(f"TAVILY_API_KEY loaded: {'TAVILY_API_KEY' in os.environ}")
print(f"RAGAS_API_KEY loaded: {'RAGAS_API_KEY' in os.environ}")
print(f"ANTHROPIC_API_KEY loaded: {'ANTHROPIC_API_KEY' in os.environ}")
print(f"COHERE_API_KEY loaded: {'COHERE_API_KEY' in os.environ}")

# --- Verify General Settings ---
print("\n--- Project Settings Status ---")
print(f"DEBUG mode enabled: {os.environ.get('DEBUG') == 'True'}")
print(f"LangSmith Tracing V2 enabled: {os.environ.get('LANGCHAIN_TRACING_V2') == 'true'}")
print(f"LangChain Project Base: {os.environ.get('LANGCHAIN_PROJECT_BASE')}")
print(f"LangChain Project: {os.environ.get('LANGCHAIN_PROJECT')}")


--- API Key Status ---
OPENAI_API_KEY loaded: True
LANGCHAIN_API_KEY loaded: True
TAVILY_API_KEY loaded: True
RAGAS_API_KEY loaded: False
ANTHROPIC_API_KEY loaded: True
COHERE_API_KEY loaded: True

--- Project Settings Status ---
DEBUG mode enabled: True
LangSmith Tracing V2 enabled: True
LangChain Project Base: None
LangChain Project: THUDBOT-CC
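
The status printout above is easy to skim past, so here is a minimal sketch of a check that names the missing keys outright. `missing_keys` and the `REQUIRED_KEYS` list are hypothetical (not part of the original notebook), and which keys count as required is my assumption.

```python
import os

# Hypothetical helper (not in the original notebook): return the names of
# required keys that are absent or empty in the given environment mapping.
def missing_keys(required, env=None):
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

# Assumption: these are the keys the rebuilt RAG and agent actually need.
REQUIRED_KEYS = ["OPENAI_API_KEY", "LANGCHAIN_API_KEY", "TAVILY_API_KEY"]

missing = missing_keys(REQUIRED_KEYS)
if missing:
    print(f"Missing keys: {', '.join(missing)}")
else:
    print("All required keys are present")
```

Passing a plain dict as `env` makes the helper easy to test without touching the real environment.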


including nltk, because it worked before

In [2]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to /Users/family/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/family/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


True

In [3]:
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4.1-nano"))
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())
eval_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4.1-nano"))
eval_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())



## Step 2: Data Collection and Preparation

My data is CSV structured, so using code from HW9

In [4]:
from langchain_community.document_loaders.csv_loader import CSVLoader
from datetime import datetime, timedelta

loader = CSVLoader(
    file_path="./data/Thudbot_Hint_Data_1.csv",
    metadata_columns=[
        "question",
        "hint_level",
        "character",
        "speaker",
        "narrative_context",
        "planet",
        "location",
        "category",
        "tone",
        "follow_up_hint_id",
        "answer_keywords",
        "tags"
    ]
)

hint_data = loader.load()

# No need to overwrite page_content; not doing custom transformation
print(hint_data[0].page_content)     # Columns not listed in metadata_columns, as "key: value" lines (includes hint_text)
print(hint_data[0].metadata)         # All the metadata fields, plus source and row


question_id: TSB-001
hint_text: Press the escape key to exit the opening animations
puzzle_name: 
source: self
{'source': './data/Thudbot_Hint_Data_1.csv', 'row': 0, 'question': 'How do I stop the opening movie', 'hint_level': '1', 'character': 'Player', 'speaker': '', 'narrative_context': 'Meta', 'planet': '', 'location': '', 'category': 'Meta', 'tone': '', 'follow_up_hint_id': '', 'answer_keywords': '', 'tags': ''}
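
To make the split between `page_content` and `metadata` concrete, here is a standard-library sketch of what the loader output above suggests: columns named in `metadata_columns` go to the metadata dict, and every other column is joined into `key: value` lines. `split_row` is a hypothetical helper and the two-column-pair CSV is toy data; the real `CSVLoader` also adds `source` and `row` to the metadata.

```python
import csv
import io

# Hypothetical helper that mimics the content/metadata split seen in the output above.
def split_row(row, metadata_columns):
    # Everything that is not a metadata column becomes a "key: value" line.
    content = "\n".join(
        f"{key.strip()}: {value.strip()}"
        for key, value in row.items()
        if key not in metadata_columns
    )
    metadata = {key: row[key] for key in metadata_columns if key in row}
    return content, metadata

# Toy CSV with a subset of the real columns.
sample_csv = io.StringIO(
    "question_id,question,hint_text,hint_level\n"
    "TSB-001,How do I stop the opening movie,"
    "Press the escape key to exit the opening animations,1\n"
)
row = next(csv.DictReader(sample_csv))
content, metadata = split_row(row, ["question", "hint_level"])
print(content)
print(metadata)
```

One consequence worth keeping in mind: only the text in `page_content` gets embedded, so the `question` column does not influence retrieval with this setup.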


### Setting up Qdrant! (from HW9)

Now that we have our documents, let's create a Qdrant vector store with the collection name "Thudbot_Hints".

We'll leverage OpenAI's [`text-embedding-3-small`](https://openai.com/blog/new-embedding-models-and-api-updates) because it's a very powerful (and low-cost) embedding model.

 

In [5]:
from langchain_community.vectorstores import Qdrant
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Qdrant.from_documents(
    documents=hint_data,
    embedding=embeddings,
    location=":memory:",
    collection_name="Thudbot_Hints"
)

### ▶️ reload platinum data set

Might as well re-use it for any testing

In [19]:
import json

# Load questions for testing retrievers
with open("data/platinum_dataset.json", "r") as f:
    platinum_data = json.load(f)



In [20]:
sample_indices = [0, 2, 5, 7, 10]
sampled_platinum = [platinum_data[i] for i in sample_indices]


## Step 4: Setup the RAG chain


Starting with a "naive" dense vector retriever, which the multi-query retriever wraps

### R - Retrieval - using multi-query retriever, based on previous evaluation




In [11]:
naive_retriever = vectorstore.as_retriever(search_kwargs={"k" : 10})
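
For intuition, the `k` in `search_kwargs` is just "keep the `k` nearest vectors". Below is a toy sketch of that idea in plain Python, assuming cosine similarity as the distance measure. The two-dimensional vectors and the IDs other than `TSB-001` are made up, and `cosine` and `top_k` are hypothetical helpers, not how Qdrant is implemented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy embeddings (real ones have far more dimensions).
toy_vectors = {
    "TSB-001": [1.0, 0.0],
    "TSB-002": [0.7, 0.7],
    "TSB-003": [0.0, 1.0],
}
print(top_k([1.0, 0.1], toy_vectors, k=2))
```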

Moving next cell up in the flow, because multi-query retriever needs LLM

In [13]:
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(model="gpt-4.1-nano")
#chat_model = ChatAnthropic(model="claude-3-5-sonnet-20240620")

In [None]:
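from langchain.retrievers.multi_query import MultiQueryRetriever

# As far as I can tell, from_llm does not take a number_of_queries argument;
# its default prompt already asks the LLM for 3 rewrites of the question.
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=naive_retriever, llm=chat_model
)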
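
Conceptually, multi-query retrieval rewrites the question a few ways, runs the base retriever once per rewrite, and keeps the de-duplicated union of the results. Here is a small sketch of that flow with stand-ins for the LLM and the retriever; `unique_union`, `multi_query_retrieve`, and the toy index are hypothetical and only illustrate the idea.

```python
def unique_union(result_lists):
    """Merge several result lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

def multi_query_retrieve(question, rewrite, retrieve):
    """rewrite(question) -> list of query variants; retrieve(query) -> list of docs."""
    queries = rewrite(question)
    return unique_union(retrieve(query) for query in queries)

# Stand-ins: a fixed set of rewrites and a dict acting as the retriever.
toy_index = {
    "get token": ["hint-cup", "hint-mailbox"],
    "token from cup": ["hint-cup", "hint-jar"],
}
docs = multi_query_retrieve(
    "How do I get the token from the cup?",
    rewrite=lambda question: list(toy_index),
    retrieve=lambda query: toy_index.get(query, []),
)
print(docs)
```

Because each rewrite can pull up to `k` documents, the merged context can be noticeably larger than what the naive retriever returns on its own.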

### A - Augmented

My first pass at a Thud-like prompt, named `THUD_TEMPLATE`

This will need tuning!

In [15]:
from langchain_core.prompts import ChatPromptTemplate

THUD_TEMPLATE = """\
You are Thud, a friendly and somewhat simple-minded patron at The Thirsty Tentacle. 

You're trying your best to help the player navigate the game "The Space Bar."

Use the clues and context provided below to offer a gentle hint — not a full solution.

If you're not sure, say so, or suggest the player look around more.

Player's question:
{question}

Context:
{context}

Your hint:"""

rag_prompt = ChatPromptTemplate.from_template(THUD_TEMPLATE)
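
One thing to watch when tuning: whatever lands in `{context}` is turned into text before it reaches the model. A possible refinement is to join just the `page_content` of each retrieved document first. The sketch below shows that idea with plain string formatting; `Doc`, `format_docs`, and the shortened `TEMPLATE` are stand-ins I made up for illustration, not the notebook's actual objects.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Stand-in for a retrieved document; only page_content matters here."""
    page_content: str

def format_docs(docs):
    """Join the text of each document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

# Shortened stand-in for the tail of THUD_TEMPLATE.
TEMPLATE = "Player's question:\n{question}\n\nContext:\n{context}\n\nYour hint:"

docs = [Doc("hint_text: Press the escape key to exit the opening animations")]
prompt_text = TEMPLATE.format(
    question="How do I stop the opening movie",
    context=format_docs(docs),
)
print(prompt_text)
```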

### G - Generation

Still using `gpt-4.1-nano` as our LLM today

### LCEL RAG Chain

We're going to use LCEL to construct our chain. (from HW9)


Test the chain, and the LangSmith tracing, with a question.
Might as well take the question from the platinum data set (just remember to load it above ▶️)

Using mq, based on eval results

In [17]:
from langchain_core.runnables import RunnablePassthrough
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser

multi_query_retrieval_chain = (
    {"context": itemgetter("question") | multi_query_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
).with_config({"run_name": "multi_query_chain"})

Test with sample questions

In [24]:
# sample_q = platinum_data[0]["eval_sample"]["user_input"]
# sample_q = "What is the best way to get the token?"
sample_q = sampled_platinum[4]["eval_sample"]["user_input"]
multi_query_retrieval_chain.invoke({"question": sample_q})

{'response': AIMessage(content="Hey there! It sounds like you're tryin' to get to Quantelope Lodge and need to find Thud and that vestibule terminal. Hmm... Maybe you should look around and see if there's a door you can click on—that might be the way to find Thud. Also, sometimes at Glom Hole or the Front Stoop, doing something with objects like the mailbox or the cup helps a lot. Oh! And if Thud is in a jar or something, talk to him or tell him what to do. Keep lookin' around those areas—you'll find a way!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 118, 'prompt_tokens': 2669, 'total_tokens': 2787, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': 'fp_f12167b370', 'id': 'chatcmpl-C0RcdCXNUM9C6UFtlop0vhFayni1c', 'service

Use the `@tool` decorator to wrap `multi_query_retrieval_chain` as a tool

In [28]:
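from langchain.tools import tool
# tracing_v2_enabled is exposed by langchain_core; importing it from
# langsmith.tracing raises the ModuleNotFoundError recorded below
from langchain_core.tracers.context import tracing_v2_enabled

@tool
def hint_lookup(question: str) -> str:
    """Answer in-game player questions about puzzles, items, or objectives in The Space Bar."""
    # The positional argument is the LangSmith project name used for runs inside this block
    with tracing_v2_enabled("hint_lookup_tool"):
        result = multi_query_retrieval_chain.invoke({"question": question})
    # The chain returns an AIMessage under "response"; the tool returns just its text
    return result["response"].content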



ModuleNotFoundError: No module named 'langsmith.tracing'

In [26]:
from langchain.agents import initialize_agent, AgentType

tools = [hint_lookup]  # just your @tool-wrapped function

thud_agent = initialize_agent(
    tools=tools,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    llm=chat_model,
    verbose=True
)




In [27]:
thud_agent.run("How do I get the token from the cup?")






> Entering new AgentExecutor chain...
Action: hint_lookup
Action Input: How do I get the token from the cup?

NameError: name 'tracing_v2_enabled' is not defined

More cells straight out of HW9

In [None]:
# updated to use the sliced dataset
# naive_outputs = run_retriever_on_dataset("naive", naive_retrieval_chain, sampled_platinum)
# bm25_outputs = run_retriever_on_dataset("bm25", bm25_retrieval_chain, sampled_platinum)
multi_query_outputs = run_retriever_on_dataset("multi_query", multi_query_retrieval_chain, sampled_platinum)
# parent_doc_outputs = run_retriever_on_dataset("parent_doc", parent_document_retrieval_chain, sampled_platinum)
# ensemble_outputs = run_retriever_on_dataset("ensemble", ensemble_retrieval_chain, sampled_platinum)
# contextual_compression_outputs = run_retriever_on_dataset("contextual_compression", contextual_compression_retrieval_chain, sampled_platinum)

In [None]:
# just checking the outputs
multi_query_outputs[:3]
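
`run_retriever_on_dataset` is defined in the earlier notebook and is not reproduced here. As a rough guess at the shape of such a helper, and not the HW9 code itself, the sketch below loops over the sampled records, invokes a chain on each `user_input`, and collects plain-text results. The name `run_chain_on_samples`, the output field names, and `FakeChain` are all hypothetical.

```python
def run_chain_on_samples(name, chain, samples):
    """Invoke `chain` on each sample question and collect simple result records."""
    outputs = []
    for sample in samples:
        question = sample["eval_sample"]["user_input"]
        result = chain.invoke({"question": question})
        response = result["response"]
        outputs.append({
            "retriever": name,
            "question": question,
            # Chat models return a message object; fall back to the raw value otherwise.
            "response": getattr(response, "content", response),
            "contexts": [getattr(doc, "page_content", doc) for doc in result["context"]],
        })
    return outputs

class FakeChain:
    """Stand-in for a real chain so the sketch runs without API calls."""
    def invoke(self, inputs):
        return {"response": f"hint for: {inputs['question']}", "context": ["ctx"]}

demo = run_chain_on_samples(
    "fake", FakeChain(), [{"eval_sample": {"user_input": "Where is Thud?"}}]
)
print(demo)
```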