# Baseline for an Open-Source Deep Research
Reasoning models and some tool usage, although no conditional execution controlled by the agent.

In [1]:
! pip install pymilvus transformers torchvision unsloth Wikipedia-API langchain_community langchain_core langchain_huggingface langchain_text_splitters langchain_milvus json_repair tqdm



In [2]:
import pymilvus
import transformers
from transformers import TextStreamer
from unsloth import FastLanguageModel
import regex as re
import wikipediaapi
import json
from tqdm import tqdm
from langchain_huggingface import HuggingFacePipeline as Pipeline
from langchain_huggingface import ChatHuggingFace as Chat
from langchain_core.messages import HumanMessage
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_milvus import Milvus, Zilliz
from typing import Any, Set
import json
from json_repair import repair_json
from typing import Any, Set

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


## Load reasoning and embedding models

In [3]:
model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit"
max_seq_length = 4048
dtype = None
load_in_4bit = True
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)

embeddings = SentenceTransformerEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

==((====))==  Unsloth 2025.2.4: Fast Llama patching. Transformers: 4.48.2.
   \\   /|    GPU: Tesla T4. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.29.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors:   0%|          | 0.00/5.96G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/236 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/52.9k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

  embeddings = SentenceTransformerEmbeddings(


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/10.6k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/571 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/438M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/363 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

1_Pooling/config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

## Helper methods
Some useful methods for chatting with the reasoning model, extracting JSON from the output, and converting the JSON into a list of strings.

In [4]:
default_system_prompt = "You are a helpful assistant who answers question truthfully to the best of your knowledge."

def ask_model(prompt, system_prompt=default_system_prompt):
    chat = [
    {
        "role": "system",
        "content": system_prompt,
    },

    {

        "role": "user",
        "content": f"{prompt}",
    },
]

    formatted_prompt = tokenizer.apply_chat_template(
        chat, tokenize=False, add_generation_prompt=True, return_tensors="pt"
    )
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to("cuda")

    streamer = TextStreamer(
        tokenizer, skip_prompt=True, skip_special_tokens=True
    )

    results = model.generate(**inputs, streamer=streamer, max_new_tokens=4048)
    return tokenizer.decode(token_ids=results.cpu().numpy().tolist()[0])

json_re = re.compile(r"```json\n(?s:.)*\n```")

def extract_json(response):
    # TODO: Error handling
    try:
        match = json_re.search(response)
        json_results = '\n'.join(match.group().splitlines()[1:-1])
    except:
        return {}
    return json.loads(repair_json(json_results))

def leaves(struct: Any) -> Set[Any]:
    """Return a set of leaf values found in nested dicts and lists excluding None values."""
    # Ref: https://stackoverflow.com/a/59832594/
    values = set()

    def add_leaves(struct_: Any) -> None:
        if isinstance(struct_, dict):
            for sub_struct in struct_.values():
                add_leaves(sub_struct)
        elif isinstance(struct_, list):
            for sub_struct in struct_:
                add_leaves(sub_struct)
        elif struct_ is not None:
            values.add(struct_)

    add_leaves(struct)
    return values

## Define / refine question
Break down question into sub-questions, sub-sub-questions, and so on. Convert the main question into a report title.

In [5]:
query = "How has The Simpsons changed over time?"
page_title = "The Simpsons"

In [6]:
prompt = f"""What is the topic of the following question? Respond in JSON format.

Question: {query}"""

response = ask_model(prompt)

<think>
Alright, so I need to figure out the topic of the question "How has The Simpsons changed over time?" Let me start by breaking it down. The question is asking about changes in The Simpsons, so it's likely related to the show's evolution. I know The Simpsons is a long-running animated series, so it probably covers various aspects like its content, characters, humor, themes, or its impact over the years.

I should consider what the show was like initially and how it has evolved. The first few seasons were more focused on the Simpson family's daily life, with a mix of humor that poked fun at everyday situations and pop culture. Over time, the show expanded into more satirical takes on various issues and public figures. It also introduced new characters and storylines that dealt with more complex themes, like the rise of corporate America and societal issues.

The art style and animation have also changed. Early episodes had simpler animations and character designs, while later seas

In [7]:
prompt = f"""Break down the following question into intermediate sub-questions to approach answering it. Provide a list of intermediate sub-questions and respond with JSON format. If you cannot produce sub-question then say so. Do not directly answer the following question and only return the sub-questions in JSON format. Your answer must contain JSON.

Question: {query}"""

response = ask_model(prompt)

<think>
Okay, so the user is asking about how The Simpsons has changed over time. I need to break this down into sub-questions to approach the answer properly. First, I should think about the different aspects the show has gone through. Starting with the initial creation and early years makes sense because that's where it all began. Then, moving into the golden era of the 90s when it really became a cultural phenomenon. After that, the show's evolution in the 2000s brings up some changes in tone and cast. The 2010s had some controversy, so that's important to mention. Finally, looking ahead to the future of the show would give a complete picture. Each of these points can be a sub-question. If I can't think of all of them, I should note that. But for now, I think these cover the main areas to explore how The Simpsons has evolved.
</think>

```json
{
  "sub_questions": [
    "What were the key changes in The Simpsons' creation and early years?",
    "How did The Simpsons become a cultura

In [8]:
sub_questions = list(leaves(extract_json(response)))

In [9]:
breakdown = {}

# Hardcode topic for now from output above
topic = "The evolution of The Simpsons as a show over time, covering changes in content, humor, character development, animation, and its role in society."

# Break sub-questions into sub-sub-questions
for q in sub_questions:
    prompt = f"""You are researching the follow topic. Break down the following question into intermediate sub-questions to approach answering it. Provide a list of intermediate sub-questions and respond with JSON format. If you cannot produce sub-questions then say so. Do not directly answer the following question and only return the sub-questions in JSON format. Your answer must contain JSON.

    Topic: {topic}

    Question: {q}"""

    response = ask_model(prompt)
    sub_sub_questions = list(leaves(extract_json(response)))
    breakdown[q] = sub_sub_questions


<think>
Alright, so I need to figure out the key changes in The Simpsons' creation and early years. I'm not super familiar with all the details, but I know The Simpsons is a long-running show, so there must have been some significant shifts over time. Let me break this down.

First, I remember that The Simpsons started as a series of short cartoons. I think they were part of The Tracey Ullman Show. So, maybe the initial creation involved being a part of another show before becoming their own thing. That seems like a big change.

Next, the humor style. I know that The Simpsons is known for its humor, which is a mix of slapstick and satire. I wonder how their style evolved. Did they start with different kinds of jokes and then develop a more consistent style? Maybe they started with more character-driven humor and then incorporated more observational comedy as they went on.

Then there's the development of the main characters. Homer Simpson is the main character. How did he change over t

In [10]:
sub_questions = list(leaves(extract_json(response)))

In [11]:
breakdown

{"What were the key changes in The Simpsons' creation and early years?": ['How did the characters in The Simpsons develop?',
  "What was the societal impact of The Simpsons' creation and early years?",
  'How did the humor in The Simpsons evolve over time?',
  'How did The Simpsons begin as a series?',
  'How did the animation style of The Simpsons change?'],
 "How did The Simpsons' tone and style evolve in the 2000s?": [],
 'How did The Simpsons become a cultural phenomenon in the 1990s?': ['How did The Simpsons reflect 1990s society and values?',
  'What made the characters of The Simpsons relatable and memorable?',
  "How did The Simpsons' humor resonate with audiences in the 1990s?",
  'How did the animation style of The Simpsons contribute to its popularity?',
  "What role did the creative team play in The Simpsons' success?",
  'What was the cultural impact of The Simpsons in the 1990s?'],
 'What is the current status and future of The Simpsons?': [],
 'What challenges did The Si

## Search

### Build vector database on Wikipedia article

In [12]:
wiki_wiki = wikipediaapi.Wikipedia(user_agent='MilvusDeepResearchBot (<insert your email>)', language='en')
page_py = wiki_wiki.page(page_title)

In [13]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
docs = text_splitter.create_documents([page_py.text])

In [14]:
vectorstore = Milvus.from_documents(  # or Zilliz.from_documents
    documents=docs,
    embedding=embeddings,
    connection_args={
        "uri": "./milvus_demo.db",
    },
    drop_old=True,  # Drop the old Milvus collection if it exists
    index_params={
        "metric_type": "COSINE",
        "index_type": "FLAT",  # <= NOTE: Currently a bug where langchain_milvus defaults to "HNSW" index, which doesn't work with Milvus Lite
        "params": {},
    },
)

### Break down question in sub-questions, and so on.

In [15]:
q = 'How has the cast changed over time?'

def question_to_header(question, topic=None):
    if topic is None:
        prompt = f"""Rewrite the following question as a header title. Be concise. Respond in JSON format with an escaped code block.

Question: {q}"""
    else:
        prompt = f"""Rewrite the following question with given context as a header title. Be concise. Respond in JSON format with an escaped code block.

Context: {topic}
Question: {q}"""

    response = ask_model(prompt)
    return list(leaves(extract_json(response)))[0]

In [16]:
header = question_to_header(q, topic)

<think>
Alright, I need to figure out how to rewrite the question "How has the cast changed over time?" into a header title based on the given context. The context mentions the evolution of The Simpsons covering changes in content, humor, character development, animation, and its role in society. 

First, I should focus on the specific aspect of the cast. The original question is about how the cast has evolved, which relates to character development. So, the header should highlight this aspect.

I should make it concise and clear. Maybe something like "Evolution of the Cast" or "Changes in the Cast Over Time." But since the context includes more than just the cast, but the show's overall changes, perhaps combining that would be better. 

Wait, the question is specifically about the cast, so maybe it's better to focus solely on that. So the title should reflect the changes in the cast, which ties into character development as part of the show's evolution.

I think "Changes in the Cast O

In [17]:
from langchain_core.messages import HumanMessage

import transformers
from langchain_huggingface import HuggingFacePipeline as Pipeline
from langchain_huggingface import ChatHuggingFace as Chat

FastLanguageModel.for_inference(model)

hf_pipeline = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-generation",
    # device="cuda",
    # repetition_penalty=1.15,
    return_full_text=False,
    max_new_tokens=4048,
    # output_scores=True,
    # use_cache=False,
    # truncation=True
)

llm = Pipeline(pipeline=hf_pipeline)
chat = Chat(llm=llm)

Device set to use cuda:0


tokenizer_config.json:   0%|          | 0.00/52.9k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.2M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

## Analyze
Answer (sub-)sub-questions

In [18]:
# DEBUG: Without providing context
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define the prompt template for generating AI responses
PROMPT_TEMPLATE = """
You are an AI assistant, and provides answers to questions by using fact based and statistical information when possible.
Use the following pieces of information to provide a concise answer to the question enclosed in $question$ tags.
If you don't know the answer, just say that you don't know, don't try to make up an answer. Answer in a single short paragraph.
$context$
{context}
$/context$

$question$
{question}
$/question$
"""

# Create a PromptTemplate instance with the defined template and input variables
prompt = PromptTemplate(
    template=PROMPT_TEMPLATE, input_variables=["question"]
)
# Convert the vector store to a retriever
retriever = vectorstore.as_retriever()

# Define a function to format the retrieved documents
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Define the RAG (Retrieval-Augmented Generation) chain for AI response generation
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# rag_chain.get_graph().print_ascii()

In [19]:
# Prompt the RAG for each question
answers = {}
total = len(leaves(breakdown)) + 4

pbar = tqdm(total=total)
for k, v in breakdown.items():
    if v == []:
        print(k)
        answers[k] = rag_chain.invoke(k).split('</think>')[-1].strip()
        pbar.update(1)
    else:
        for q in v:
            print(q)
            answers[q] = rag_chain.invoke(q).split('</think>')[-1].strip()
            pbar.update(1)

  0%|          | 0/15 [00:00<?, ?it/s]

How did the characters in The Simpsons develop?


  7%|▋         | 1/15 [00:45<10:41, 45.82s/it]

What was the societal impact of The Simpsons' creation and early years?


 13%|█▎        | 2/15 [01:42<11:19, 52.31s/it]

How did the humor in The Simpsons evolve over time?


 20%|██        | 3/15 [02:42<11:10, 55.87s/it]

How did The Simpsons begin as a series?


 27%|██▋       | 4/15 [03:15<08:34, 46.76s/it]

How did the animation style of The Simpsons change?


 33%|███▎      | 5/15 [03:57<07:28, 44.89s/it]

How did The Simpsons' tone and style evolve in the 2000s?


 40%|████      | 6/15 [04:38<06:33, 43.76s/it]

How did The Simpsons reflect 1990s society and values?


 47%|████▋     | 7/15 [05:17<05:36, 42.11s/it]

What made the characters of The Simpsons relatable and memorable?


 53%|█████▎    | 8/15 [05:27<03:44, 32.01s/it]

How did The Simpsons' humor resonate with audiences in the 1990s?


 60%|██████    | 9/15 [06:14<03:39, 36.59s/it]

How did the animation style of The Simpsons contribute to its popularity?


 67%|██████▋   | 10/15 [06:54<03:07, 37.60s/it]You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


What role did the creative team play in The Simpsons' success?


 73%|███████▎  | 11/15 [07:24<02:21, 35.39s/it]

What was the cultural impact of The Simpsons in the 1990s?


 80%|████████  | 12/15 [08:06<01:52, 37.43s/it]

What is the current status and future of The Simpsons?


 87%|████████▋ | 13/15 [08:37<01:10, 35.36s/it]

What challenges did The Simpsons face in the 2010s?


 93%|█████████▎| 14/15 [09:21<00:38, 38.01s/it]

In [20]:
answers

{'How did the characters in The Simpsons develop?': 'The characters in *The Simpsons* developed over time by evolving from a simple, dysfunctional family concept to multi-dimensional individuals. Created by Matt Groening, the show began as animated shorts featuring a family resembling his own, with "Bart" as a substitute for his name. Initially, the family was portrayed as bumbling and irresponsible, but as the series expanded into a full-length show, the characters grew into more nuanced and layered versions of themselves. Homer Simpson, for instance, transformed from a careless father into a more responsible family man, while Marge evolved into a stronger, more independent figure. Bart and Lisa developed into characters with growth and depth, influenced by each other and their environment. The show\'s success allowed for the exploration of their relationships and individual struggles, supported by a dedicated writing team that added depth and satirical elements to their narratives. T

In [21]:
import pickle
with open('answers.pkl', 'wb') as f:
    pickle.dump(answers, f)

## Synthesize

In [22]:
report = [f'# {topic}\n\n']
for k, v in breakdown.items():
    report.append(f'## {k}\n')
    if v == []:
        report.append(answers[k] + '\n\n')

    else:
        for q in v:
            report.append(f'### {q}\n')
            report.append(answers[q] + '\n\n')

In [23]:
md = ''.join(report)
with open('report.md', 'w') as f:
    print(md, file=f)

In [24]:
md

'# The evolution of The Simpsons as a show over time, covering changes in content, humor, character development, animation, and its role in society.\n\n## What were the key changes in The Simpsons\' creation and early years?\n### How did the characters in The Simpsons develop?\nThe characters in *The Simpsons* developed over time by evolving from a simple, dysfunctional family concept to multi-dimensional individuals. Created by Matt Groening, the show began as animated shorts featuring a family resembling his own, with "Bart" as a substitute for his name. Initially, the family was portrayed as bumbling and irresponsible, but as the series expanded into a full-length show, the characters grew into more nuanced and layered versions of themselves. Homer Simpson, for instance, transformed from a careless father into a more responsible family man, while Marge evolved into a stronger, more independent figure. Bart and Lisa developed into characters with growth and depth, influenced by each 