In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import dspy

lm = dspy.LM("ollama_chat/gpt-oss:20b", api_base="http://localhost:11434", api_key="fake")
dspy.configure(lm=lm)

To invoke the LLM

In [3]:
lm(messages=[{"role": "user", "content": "Hi! How many 'r's are there in strawberry?"}])  

["There are **3** 'r's in the word *strawberry*."]

# 1. Inline signatures

Declare signatures inline using strings and arrows!

## Chain Of Thought
GPT-oss has a 128k context window! Let's make it summarize some documents!

In [5]:
import os

if not os.path.exists("../docs"):
    os.makedirs("../docs")

In [6]:
!wget https://arxiv.org/pdf/2505.20286 -O "../docs/alita_paper.pdf"

--2025-08-08 21:52:57--  https://arxiv.org/pdf/2505.20286
Resolving arxiv.org (arxiv.org)... 151.101.3.42, 151.101.195.42, 151.101.131.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.3.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1113373 (1.1M) [application/pdf]
Saving to: ‘../docs/alita_paper.pdf’


2025-08-08 21:52:57 (24.0 MB/s) - ‘../docs/alita_paper.pdf’ saved [1113373/1113373]



In [7]:
from llama_index.core import SimpleDirectoryReader

docs = SimpleDirectoryReader("../docs").load_data()

In [16]:
from IPython.display import display, Markdown
display(Markdown(str(docs[0])))

Doc ID: ae9fbb15-e106-4637-9386-3c49099db9b3
Text: arXiv:2505.20286v1  [cs.AI]  26 May 2025 ALITA : G ENERALIST
AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION
AND MAXIMAL SELF -EVOLUTION Jiahao Qiu∗1, Xuan Qi∗2, Tongcheng
Zhang∗3, Xinzhe Juan3,4, Jiacheng Guo1, Yifu Lu1, Yimin Wang3,4, Zixin
Yao1, Qihan Ren3, Xun Jiang5, Xing Zhou5, Dongrui Liu3, Ling Yang1,
Yue Wu1, Kaixua...

In [17]:
document = ""
for doc in docs:
    document += str(doc)

In [20]:
summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document = document)

In [21]:
display(Markdown(response.summary))

ALITA is a lightweight generalist agent that achieves scalable, autonomous reasoning by minimizing predefined tools and maximizing self‑evolution. It employs a concise toolkit—MCP Brainstorming, ScriptGeneratingTool, and CodeRunningTool—to detect functional gaps, generate scripts, and execute code in isolated environments. Evaluated on the GAIA benchmark, ALITA outperforms more complex agents, attaining 75.15 % pass@1 and 87.27 % pass@3, and its generated MCPs further boost performance across difficulty levels. The study demonstrates that a minimal, self‑evolving design can rival or surpass heavily engineered systems, offering a promising direction for future autonomous agents.

In [23]:
display(Markdown(response.reasoning))

The provided document consists of multiple excerpts from a research paper titled “ALITA: GENERALIST AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION AND MAXIMAL SELF‑EVOLUTION.” I parsed each excerpt to identify the main sections: introduction, methods, tool usage, experimental results, comparisons with other agents, and a brief case study. Key themes emerged: ALITA is a generalist agent that relies on a minimal set of powerful tools (e.g., MCP Brainstorming, ScriptGeneratingTool, CodeRunningTool) and emphasizes self‑evolution rather than extensive pre‑design. The agent was evaluated on the GAIA benchmark, achieving top performance (e.g., 75.15% pass@1, 87.27% pass@3) and outperforming more complex systems. Additional experiments showed that reusing ALITA‑generated MCPs improves performance across difficulty levels. The references and case study sections provide context and demonstrate practical applicability. Based on these observations, I crafted a concise summary that captures the motivation, design philosophy, methodological innovations, empirical results, and significance of the work.

## DSPy predict

Adding an instruction to the Signature helps us to couch the LLM's reply.

> A one shot one document read example

In [None]:
oneshot = dspy.Predict(
    dspy.Signature(
        'document: str, question: str -> answer: str',
        instructions='Only use the document to answer the question and nothing else.'
    )
)

question = 'How does ALITA generate its own MCP tools to achieve autonomous reasoning?'

response = oneshot(question=question, document=document)

In [25]:
display(Markdown(response.answer))

ALITA’s autonomous reasoning is enabled by a self‑evolving “MCP” (Minimal‑Complexity‑Plan) generation process that is driven by its manager agent.  
1. **MCP Brainstorming** – The manager first runs the MCP Brainstorming tool, which scans the current task and the agent’s internal state to detect *functional gaps* (missing capabilities that are needed to progress).  
2. **Tool generation** – Once a gap is identified, the manager uses the same MCP Brainstorming logic to *generate* a new MCP. This new MCP is a concise, purpose‑specific tool (e.g., a ScriptGeneratingTool or a CodeRunningTool) that can be invoked immediately.  
3. **Minimal predefinition** – ALITA starts with only a handful of core, powerful tools. All other tools are produced on‑the‑fly by the manager, so the system does not rely on a large, hand‑crafted toolkit.  
4. **Self‑evolution** – After a new MCP is created, it is added to the agent’s repertoire and can be reused or refined in future tasks, allowing the agent to evolve its own toolset autonomously.

Thus, ALITA generates its own MCP tools by using the MCP Brainstorming component to detect missing functionality and automatically produce new, minimal‑complexity tools that extend its reasoning capabilities.

YES!! No vector database!

In [34]:
dspy.inspect_history()





[34m[2025-08-08T22:08:07.203340][0m

[31mSystem message:[0m

Your input fields are:
1. `document` (str): 
2. `question` (str):
Your output fields are:
1. `answer` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## document ## ]]
{document}

[[ ## question ## ]]
{question}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Only use the document to answer the question and nothing else.


[31mUser message:[0m

[[ ## document ## ]]
Doc ID: ae9fbb15-e106-4637-9386-3c49099db9b3
Text: arXiv:2505.20286v1  [cs.AI]  26 May 2025 ALITA : G ENERALIST
AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION
AND MAXIMAL SELF -EVOLUTION Jiahao Qiu∗1, Xuan Qi∗2, Tongcheng
Zhang∗3, Xinzhe Juan3,4, Jiacheng Guo1, Yifu Lu1, Yimin Wang3,4, Zixin
Yao1, Qihan Ren3, Xun Jiang5, Xing Zhou5, Dongrui Liu3, Ling Yang1,
Yue Wu1, Kaixua...Doc ID: f48c48de-a907-4ed9-bf14-ee8b3

# 2. Programmatic Signatures and how they integrate with the broader LLM ecosystem
Let's create an agent with a signature declared using a programmed class!

14