In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import dspy

lm = dspy.LM("ollama_chat/gpt-oss:20b", api_base="http://localhost:11434", api_key="fake")
dspy.configure(lm=lm)

To invoke the LLM directly, call it with a list of chat messages:

In [3]:
lm(messages=[{"role": "user", "content": "Hi! How many 'r's are there in strawberry?"}])  

["There are **3** 'r's in the word *strawberry*."]

# 1. Inline signatures

Declare signatures inline using strings and arrows!
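
To make the string-and-arrow syntax concrete, here is a minimal plain-Python sketch of how such a string splits into input and output fields. It is an illustration only, not DSPy's actual parser: `parse_inline_signature` is a hypothetical helper, it handles flat type names only (no commas inside brackets), and it assumes unannotated fields default to `str`.

```python
def parse_inline_signature(signature: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split 'a: type, b -> c' into ({input: type}, {output: type})."""
    if signature.count("->") != 1:
        raise ValueError("expected exactly one '->' in the signature")
    inputs_part, outputs_part = signature.split("->")

    def parse_fields(part: str) -> dict[str, str]:
        fields = {}
        for item in part.split(","):
            name, _, type_name = item.partition(":")
            # Assumption: a field without an annotation is treated as a string.
            fields[name.strip()] = type_name.strip() or "str"
        return fields

    return parse_fields(inputs_part), parse_fields(outputs_part)


inputs, outputs = parse_inline_signature("document: str, question: str -> answer: str")
print(inputs)   # {'document': 'str', 'question': 'str'}
print(outputs)  # {'answer': 'str'}
```

Everything to the left of the arrow is what you pass in; everything to the right is what the module is asked to produce.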

## Chain Of Thought
GPT-oss has a 128k context window! Let's make it summarize some documents!

In [4]:
import os

# Create the download directory if it does not exist yet.
os.makedirs("../docs", exist_ok=True)

In [6]:
!wget https://arxiv.org/pdf/2505.20286 -O "../docs/alita_paper.pdf"

--2025-08-08 21:52:57--  https://arxiv.org/pdf/2505.20286
Resolving arxiv.org (arxiv.org)... 151.101.3.42, 151.101.195.42, 151.101.131.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.3.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1113373 (1.1M) [application/pdf]
Saving to: ‘../docs/alita_paper.pdf’


2025-08-08 21:52:57 (24.0 MB/s) - ‘../docs/alita_paper.pdf’ saved [1113373/1113373]



In [5]:
from llama_index.core import SimpleDirectoryReader

docs = SimpleDirectoryReader("../docs").load_data()

In [6]:
from IPython.display import display, Markdown
display(Markdown(str(docs[0])))

Doc ID: 1fa1e383-cfb8-4b11-b488-2e3960d1e606
Text: arXiv:2505.20286v1  [cs.AI]  26 May 2025 ALITA : G ENERALIST
AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION
AND MAXIMAL SELF -EVOLUTION Jiahao Qiu∗1, Xuan Qi∗2, Tongcheng
Zhang∗3, Xinzhe Juan3,4, Jiacheng Guo1, Yifu Lu1, Yimin Wang3,4, Zixin
Yao1, Qihan Ren3, Xun Jiang5, Xing Zhou5, Dongrui Liu3, Ling Yang1,
Yue Wu1, Kaixua...

In [7]:
document = ""
for doc in docs:
    document += str(doc)
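
One caveat: judging by the preview printed above, `str(doc)` appears to return a `Doc ID` line plus a truncated excerpt rather than the full page text. If the goal is to pass the whole paper to the model, a sketch along these lines may be closer to the intent. `FakeDoc` is a hypothetical stand-in so the snippet runs without LlamaIndex, and `join_full_text` is an illustrative helper; with real loaded documents you would call `get_content()` on them directly.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for one loaded page."""
    text: str

    def get_content(self) -> str:
        return self.text

    def __str__(self) -> str:
        # Mimics a truncated preview: an ID line plus the first 20 characters.
        return f"Doc ID: fake\nText: {self.text[:20]}..."


def join_full_text(docs) -> str:
    """Concatenate the complete text of every page, separated by blank lines."""
    return "\n\n".join(doc.get_content() for doc in docs)


pages = [FakeDoc("A" * 50), FakeDoc("B" * 50)]
preview = "".join(str(page) for page in pages)  # what the loop above builds
full = join_full_text(pages)                     # the untruncated text
print(len(preview), len(full))
```

The outputs shown in the rest of this notebook were produced with the `str(doc)` version.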

In [8]:
summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document=document)

In [9]:
display(Markdown(response.summary))

The paper “ALITA: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self‑Evolution” presents a lightweight, generalist agent framework that achieves strong performance on complex reasoning tasks without relying on extensive handcrafted tools or workflows. ALITA’s core idea is a manager agent that orchestrates a small set of powerful tools—MCP Brainstorming, ScriptGeneratingTool, and CodeRunningTool—while generating Meta‑Cognitive Plans (MCPs) on the fly to fill functional gaps. The system emphasizes isolated, reproducible execution environments and automatic recovery from failures. Evaluated on the GAIA benchmark, ALITA outperforms many existing agents, achieving 75.15 % pass@1 and 87.27 % pass@3 with Claude‑Sonnet‑4 and GPT‑4o. Performance tables show consistent gains across difficulty levels, and experiments demonstrate that reusing ALITA‑generated MCPs benefits other frameworks. The work highlights how minimal predefinition combined with self‑evolution can enable scalable, autonomous reasoning in large language model agents.

In [10]:
display(Markdown(response.reasoning))

The input consists of a series of document fragments identified by unique IDs. All fragments relate to a single research paper titled “ALITA: GENERALIST AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION AND MAXIMAL SELF‑EVOLUTION.” The fragments include the title, author list, abstract, introduction, method description, tool usage, experimental results, performance tables, and references. By scanning the text we can identify the key components of the paper:

1. **Purpose and Motivation** – The paper addresses the need for generalist agents that can perform complex tasks without heavy pre‑engineering of tools or workflows, citing the rapid evolution of LLMs into autonomous agents.

2. **Proposed Solution (ALITA)** – A minimal‑predefinition, maximal‑self‑evolution framework. It uses a manager agent that orchestrates a small set of powerful tools (MCP Brainstorming, ScriptGeneratingTool, CodeRunningTool) and relies on self‑generated “MCPs” (Meta‑Cognitive Plans) to fill functional gaps.

3. **Methodology** – The manager agent dynamically selects tools, generates code, runs it in isolated environments, and iteratively refines plans. The system emphasizes isolation, reproducibility, and automatic recovery from failures.

4. **Experimental Evaluation** – ALITA is evaluated on the GAIA benchmark. Results show superior performance compared to other agents, achieving 75.15 % pass@1 and 87.27 % pass@3 with Claude‑Sonnet‑4 and GPT‑4o. Tables compare performance across difficulty levels and against GPT‑4o‑mini.

5. **Additional Findings** – Reuse of ALITA‑generated MCPs improves performance for other frameworks (Open Deep Research‑smolagents). The paper also discusses tool usage, environment activation, and failure handling.

6. **Contextual Details** – The document includes author affiliations, references to related work, and a brief case study example.

From these observations, a concise summary can be constructed that captures the motivation, design, key innovations, and empirical results of the ALITA system.

In [11]:
dspy.inspect_history()





[2025-08-08T22:42:07.848670]

System message:

Your input fields are:
1. `document` (str):
Your output fields are:
1. `reasoning` (str): 
2. `summary` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## document ## ]]
{document}

[[ ## reasoning ## ]]
{reasoning}

[[ ## summary ## ]]
{summary}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Given the fields `document`, produce the fields `summary`.


User message:

[[ ## document ## ]]
Doc ID: 1fa1e383-cfb8-4b11-b488-2e3960d1e606
Text: arXiv:2505.20286v1  [cs.AI]  26 May 2025 ALITA : G ENERALIST
AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION
AND MAXIMAL SELF -EVOLUTION Jiahao Qiu∗1, Xuan Qi∗2, Tongcheng
Zhang∗3, Xinzhe Juan3,4, Jiacheng Guo1, Yifu Lu1, Yimin Wang3,4, Zixin
Yao1, Qihan Ren3, Xun Jiang5, Xing Zhou5, Dongrui Liu3, Ling Yang1,
Yue Wu1, Kaixua...Doc ID: 6d081601-b426-4a8d-bbfb-925
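
The prompt above asks the model to answer in `[[ ## field ## ]]` sections. As a rough illustration of how a reply in that layout can be turned back into named fields, here is a simplified sketch. It is not DSPy's own adapter code: `parse_sections` is a hypothetical helper and the `sample` reply is made up for the example.

```python
import re

# Matches a header line such as "[[ ## summary ## ]]" and captures the field name.
FIELD_HEADER = re.compile(r"^\[\[ ## (\w+) ## \]\][ \t]*$", re.MULTILINE)


def parse_sections(completion: str) -> dict[str, str]:
    """Map each field name to the text that follows its header."""
    # re.split with one capture group yields [preamble, name1, body1, name2, body2, ...].
    parts = FIELD_HEADER.split(completion)
    fields = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        if name != "completed":  # the end marker carries no content
            fields[name] = body.strip()
    return fields


# Hypothetical model reply, written by hand for this example.
sample = (
    "[[ ## reasoning ## ]]\n"
    "The fragments all belong to one paper.\n\n"
    "[[ ## summary ## ]]\n"
    "ALITA is a generalist agent framework.\n\n"
    "[[ ## completed ## ]]\n"
)
parsed = parse_sections(sample)
print(parsed["summary"])
```

This is why `response.summary` and `response.reasoning` are both available: each output field has its own section in the reply.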

## DSPy predict - A zero vector DB example

Adding an instruction to the Signature lets us constrain how the LLM replies.

> This approach is not recommended in practice, because passing the entire document consumes a large share of the LLM's context window on every call. The code only demonstrates what a long context window makes possible and how to use DSPy declarative signatures with instructions!
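
Before stuffing a whole document into a prompt, it can help to estimate whether it fits. The sketch below uses the 128k figure mentioned earlier. The 4-characters-per-token ratio and the output reserve are rough assumptions, not properties of the model's tokenizer, and both helper functions are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # the "128k" context window mentioned earlier
CHARS_PER_TOKEN = 4              # assumption: crude average, real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Rough token estimate using ceiling division on the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 4_000) -> bool:
    """Check that the text plus a reserve for the reply stays within the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


short_text = "x" * 1_000
long_text = "x" * 600_000
print(fits_in_context(short_text), fits_in_context(long_text))  # True False
```

For a real check, replace the character heuristic with the tokenizer that matches your model.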

In [15]:
zero_vector_db = dspy.Predict(
    dspy.Signature(
        'document: str, question: str -> answer: str',
        instructions='Only use the document to answer the question and nothing else.'
    )
)

question = 'How does ALITA help LLMs to achieve autonomous reasoning?'
response = zero_vector_db(question=question, document=document)

In [16]:
display(Markdown(response.answer))

ALITA assists large language models (LLMs) in achieving autonomous reasoning by providing a **generalist agent architecture** that emphasizes:

1. **Minimal Predefinition** – ALITA is designed to operate with only a few concise, powerful tools (e.g., MCP Brainstorming, ScriptGeneratingTool, CodeRunningTool) instead of extensive, task‑specific toolkits. This reduces the need for hand‑crafted workflows and allows the LLM to adapt to new tasks more readily.

2. **Maximal Self‑Evolution** – The agent can evolve its own strategies and tool usage during execution, enabling it to refine its reasoning process on the fly without external intervention.

3. **Scalable Agentic Reasoning** – By combining the above principles, ALITA enables LLMs to plan, execute, and iterate over complex tasks autonomously, achieving performance that surpasses many systems with more complex, handcrafted designs (as demonstrated on the GAIA benchmark).

In short, ALITA equips LLMs with a lightweight, self‑adapting framework that streamlines reasoning and execution, thereby fostering true autonomous agent behavior.

YES!! No vector database!

In [17]:
dspy.inspect_history()





[2025-08-08T22:45:02.097425]

System message:

Your input fields are:
1. `document` (str): 
2. `question` (str):
Your output fields are:
1. `answer` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## document ## ]]
{document}

[[ ## question ## ]]
{question}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Only use the document to answer the question and nothing else.


User message:

[[ ## document ## ]]
Doc ID: 1fa1e383-cfb8-4b11-b488-2e3960d1e606
Text: arXiv:2505.20286v1  [cs.AI]  26 May 2025 ALITA : G ENERALIST
AGENT ENABLING SCALABLE AGENTIC REASONING WITH MINIMAL PREDEFINITION
AND MAXIMAL SELF -EVOLUTION Jiahao Qiu∗1, Xuan Qi∗2, Tongcheng
Zhang∗3, Xinzhe Juan3,4, Jiacheng Guo1, Yifu Lu1, Yimin Wang3,4, Zixin
Yao1, Qihan Ren3, Xun Jiang5, Xing Zhou5, Dongrui Liu3, Ling Yang1,
Yue Wu1, Kaixua...Doc ID: 6d081601-b426-4a8d-bbfb-92521

# 2. Programmatic Signatures and how they integrate with the broader LLM ecosystem
Let's create an agent whose signature is declared programmatically, as a class!
