# Lab | BabyAGI with agent

**Change the planner objective below, along with the associated prompts and, where useful, the tools and agents. Put on your creativity and AI engineering hats.
You can't get this wrong!**

You need a Groq API key (`GROQ_API_KEY`, which replaces the OpenAI API key used in the original lab) and a [SerpAPI key](https://serpapi.com/manage-api-key) to run this lab.
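
The code cells in this notebook read both keys from the environment as `GROQ_API_KEY` and `SERPAPI_API_KEY`. As a minimal sketch (the helper name `missing_keys` is hypothetical, not part of any library), a quick check like this can catch a missing key before any chain is built:

```python
import os
from typing import Iterable, List, Mapping


def missing_keys(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Hypothetical usage: report which keys still need to be set.
missing = missing_keys(["GROQ_API_KEY", "SERPAPI_API_KEY"])
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.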


## BabyAGI with Tools

This notebook builds on [baby agi](baby_agi.html) and shows how to swap out the execution chain. The previous execution chain was just an LLM, which made things up. By replacing it with an agent that has access to tools, we can hopefully get real, reliable information.

## Install and Import Required Modules

In [1]:
#%pip install langchain langchain-community langchain-experimental langchain-groq langchain-classic langchain-huggingface sentence-transformers faiss-cpu google-search-results python-dotenv

In [2]:
from typing import Optional

# Legacy chains/prompts → langchain-classic
from langchain_classic.chains import LLMChain
from langchain_classic.prompts import PromptTemplate

# BabyAGI still in langchain-experimental
from langchain_experimental.autonomous_agents import BabyAGI

# Groq integration
from langchain_groq import ChatGroq

# HuggingFace embeddings (local, no API key needed)
from langchain_huggingface import HuggingFaceEmbeddings

## Connect to the Vector Store

Depending on which vector store you use, this step may look different. This notebook uses FAISS, which is built further down together with the BabyAGI instance.

In [3]:
# # %pip install faiss-cpu > /dev/null
# # %pip install google-search-results > /dev/null
# from langchain.docstore import InMemoryDocstore
# from langchain_community.vectorstores import FAISS

In [4]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

GROQ_API_KEY = os.getenv('GROQ_API_KEY')
SERPAPI_API_KEY = os.getenv('SERPAPI_API_KEY')

In [5]:
# Define your embedding model (local, runs on CPU, no API key needed)
from langchain_huggingface import HuggingFaceEmbeddings

embeddings_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

## Define the Chains

BabyAGI relies on three LLM chains:
- Task creation chain to select new tasks to add to the list
- Task prioritization chain to re-prioritize tasks
- Execution Chain to execute the tasks


NOTE: in this notebook, the Execution chain will now be an agent.
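
To make the division of labour concrete, here is a simplified sketch of how the three chains could interact in a loop. It is not the `langchain_experimental` implementation: `run_baby_agi_loop` and its callable parameters are hypothetical stand-ins for the chains, and details such as the vector store lookup of previous results are omitted.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Task = Dict[str, object]


def run_baby_agi_loop(
    objective: str,
    execute: Callable[[str, str], str],
    create_tasks: Callable[[str, str, str], List[str]],
    prioritize: Callable[[List[Task], str], List[Task]],
    first_task: str = "Make a todo list",
    max_iterations: int = 3,
) -> List[str]:
    """Run a toy execute -> create -> prioritize loop and return the results."""
    tasks: Deque[Task] = deque([{"task_id": 1, "task_name": first_task}])
    next_id = 1
    results: List[str] = []
    for _ in range(max_iterations):
        if not tasks:
            break  # nothing left to do
        task = tasks.popleft()
        task_name = str(task["task_name"])
        # Execution step: in this notebook this role is played by the agent.
        result = execute(objective, task_name)
        results.append(result)
        # Task creation step: propose follow-up tasks from the latest result.
        for name in create_tasks(objective, task_name, result):
            next_id += 1
            tasks.append({"task_id": next_id, "task_name": name})
        # Prioritization step: reorder whatever is still pending.
        tasks = deque(prioritize(list(tasks), objective))
    return results


# Stub callables stand in for the LLM-backed chains.
demo_results = run_baby_agi_loop(
    "demo objective",
    execute=lambda objective, task: f"done: {task}",
    create_tasks=lambda objective, task, result: [f"follow up on {task}"],
    prioritize=lambda tasks, objective: tasks,
    max_iterations=2,
)
print(demo_results)
```

In this picture, swapping the execution chain for an agent corresponds to passing a different `execute` callable; the rest of the loop is unchanged.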

In [6]:
from langchain_classic.agents import AgentExecutor, Tool, ZeroShotAgent  # Legacy agents
from langchain_classic.chains import LLMChain  # Legacy chains
from langchain_classic.prompts import PromptTemplate  # Legacy prompts
from langchain_community.utilities import SerpAPIWrapper
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0)

todo_prompt = PromptTemplate.from_template(
    "You are a planner who is an expert at coming up with a todo list for a given objective. Come up with a todo list for this objective: {objective}"
)
todo_chain = LLMChain(llm=llm, prompt=todo_prompt)
search = SerpAPIWrapper()

def safe_search(query: str) -> str:
    """Wrapper that catches empty SerpAPI results and returns a usable message
    instead of crashing the chain. The agent can then rephrase its query."""
    try:
        return search.run(query)
    except Exception:
        return "No results found for that query. Try rephrasing with simpler or broader terms."

tools = [
    Tool(
        name="Search",
        func=safe_search,
        description="useful for when you need to answer questions about current events",
    ),
    Tool(
        name="TODO",
        func=todo_chain.run,
        description="useful for when you need to come up with todo lists. Input: an objective to create a todo list for. Output: a todo list for that objective. Please be very clear what the objective is!",
    ),
]

prefix = """You are an AI who performs one task based on the following objective: {objective}.
Take into account these previously completed tasks: {context}."""

# ZeroShotAgent injects default format instructions that contain "Question:" as an
# example token. Chat models echo it back on every step, causing an infinite loop.
# We override format_instructions entirely to remove "Question:" from the prompt.
format_instructions = """Use EXACTLY this format:
Thought: think about what to do
Action: the action to take, must be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
(Repeat Thought/Action/Action Input/Observation only if you need more information)
Thought: I have enough information to answer
Final Answer: your complete answer

NEVER write "Action: None". ALWAYS end with "Final Answer:" when done."""

suffix = """Begin!

Task: {task}
Thought:{agent_scratchpad}"""

prompt = ZeroShotAgent.create_prompt(
    tools,
    prefix=prefix,
    suffix=suffix,
    format_instructions=format_instructions,
    input_variables=["objective", "task", "context", "agent_scratchpad"],
)


In [7]:
llm_chain = LLMChain(llm=llm, prompt=prompt)
tool_names = [tool.name for tool in tools]
agent = ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names)
agent_executor = AgentExecutor.from_agent_and_tools(
    agent=agent, tools=tools, verbose=True,
    handle_parsing_errors=True,
    max_iterations=5,  # prevent infinite loops
)


### Run the BabyAGI

Now it's time to create the BabyAGI controller and watch it try to accomplish your objective.

In [9]:
from langchain_experimental.autonomous_agents.baby_agi import TaskCreationChain
from langchain_classic.chains import LLMChain
from langchain_classic.prompts import PromptTemplate

# Custom prioritization prompt — prevents chat models from adding preamble text
# before the numbered list, which breaks BabyAGI's task_id parser
task_prioritization_prompt = PromptTemplate(
    input_variables=["task_names", "next_task_id", "objective"],
    template=(
        "You are a task prioritization AI.\n"
        "Reprioritize these tasks: {task_names}\n"
        "Ultimate objective: {objective}\n"
        "Rules:\n"
        "- Do NOT remove any tasks.\n"
        "- Do NOT write any introduction, explanation, or extra text.\n"
        "- Output ONLY a numbered list starting from {next_task_id}, one task per line.\n"
        "Output:\n"
    )
)
task_prioritization_chain = LLMChain(llm=llm, prompt=task_prioritization_prompt)
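
The comment above explains why the prompt forbids preamble text. A complementary defensive measure, sketched here as an assumption and not as BabyAGI's actual parser, is to accept only lines that begin with a number and ignore everything else. The function name `parse_numbered_tasks` is hypothetical.

```python
import re
from typing import Dict, List

# A task line looks like "2. Find events" or "3) Check weather".
_TASK_LINE = re.compile(r"^\s*(\d+)[.):]\s+(.+?)\s*$")


def parse_numbered_tasks(response: str) -> List[Dict[str, object]]:
    """Extract numbered tasks from an LLM response, skipping any other lines."""
    tasks: List[Dict[str, object]] = []
    for line in response.splitlines():
        match = _TASK_LINE.match(line)
        if match is None:
            continue  # preamble, blank line, or closing remark
        tasks.append({"task_id": int(match.group(1)), "task_name": match.group(2)})
    return tasks


sample = "Here are the tasks. Sorted:\n2. Find events\n3) Check weather\n\nHope this helps!"
print(parse_numbered_tasks(sample))
```

Filtering on the leading number means a chatty introduction is dropped instead of being mistaken for a task.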

In [10]:
import faiss
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_community.vectorstores import FAISS

# Recreate vectorstore AND baby_agi together on every run.
# They must be constructed together — rebuilding only the vectorstore
# leaves baby_agi pointing at the old (already-populated) one,
# causing "Tried to add ids that already exist" on re-runs.
embedding_size = 384  # all-MiniLM-L6-v2 output dimension
fresh_index = faiss.IndexFlatL2(embedding_size)
fresh_vectorstore = FAISS(
    embedding_function=embeddings_model,
    index=fresh_index,
    docstore=InMemoryDocstore({}),
    index_to_docstore_id={}
)

verbose = False
max_iterations: Optional[int] = 3
baby_agi = BabyAGI(
    task_creation_chain=TaskCreationChain.from_llm(llm, verbose=verbose),
    task_prioritization_chain=task_prioritization_chain,
    execution_chain=agent_executor,
    vectorstore=fresh_vectorstore,
    verbose=verbose,
    max_iterations=max_iterations,
)

### Improving the objective for a more grounded answer

In [11]:
OBJECTIVE = "What are the top 3 things to do in SF this weekend?"

baby_agi({"objective": OBJECTIVE})


[95m[1m
*****TASK LIST*****
[0m[0m
1: Make a todo list
[92m[1m
*****NEXT TASK*****
[0m[0m
1: Make a todo list


[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mThought: I need to come up with a list of things to do in San Francisco this weekend.
Action: TODO
Action Input: objective = "Make a todo list for things to do in San Francisco this weekend"[0m
Observation: [33;1m[1;3mWhat a fun task. Here's a comprehensive todo list for things to do in San Francisco this weekend:

**Pre-Weekend Planning (Wednesday-Thursday)**

1. **Research and Shortlist Activities**:
	* Look up top attractions in San Francisco (e.g., Golden Gate Bridge, Alcatraz Island, Fisherman's Wharf, Chinatown, etc.)
	* Check opening hours, ticket prices, and any necessary reservations
2. **Book Accommodations**:
	* Research and book a hotel or Airbnb in a convenient location
	* Check for any special deals or discounts
3. **Plan Transportation**:
	* Research public transportation options (e.g., BART,

{'objective': 'What are the top 3 things to do in SF this weekend?'}

## Conclusion

### What we achieved
- Migrated the notebook from OpenAI to an open-model stack: **Llama 3.1 8B via Groq** (LLM, `llama-3.1-8b-instant` in the code above) and **`all-MiniLM-L6-v2` via HuggingFace** (embeddings), eliminating any dependency on paid OpenAI APIs.
- Got BabyAGI running end-to-end: task creation, prioritization, and execution with real web search via SerpAPI.
- No GPU required — all LLM inference runs remotely on Groq, and the embedding model runs on CPU.

### Constraints encountered
| Constraint | Root cause |
|---|---|
| Infinite agent loops | `ZeroShotAgent` (MRKL format) was designed for completion models, not chat models — required overriding `format_instructions` and adding `max_iterations` |
| Task ID parser crash | BabyAGI's default prioritization prompt causes chat models to add preamble text before the numbered list — required a custom `task_prioritization_chain` |
| Duplicate vector store IDs | `BabyAGI` holds a reference to its vectorstore — recreating the vectorstore alone on re-runs doesn't help; the whole instance must be rebuilt |
| Token burn rate | Loop-heavy runs consumed Groq's 100K daily free token limit quickly |
| SerpAPI empty results | Overly specific or quoted queries return no results — wrapped `search.run` in a `safe_search` fallback |
| Slow runtime & large context | BabyAGI always runs to `max_iterations` regardless of whether the objective is already answered. The TODO tool compounds this by returning multi-paragraph responses that accumulate in the `agent_scratchpad` and are resent to the LLM on every step. For focused factual questions, `max_iterations=1` and removing the TODO tool would cut runtime and token usage significantly. |
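
The last row of the table points to verbose tool output as a major cost. One possible mitigation, sketched here under the assumption that a character cap is acceptable for the task, is to wrap a tool function so its output is truncated before it reaches the scratchpad. `truncate_tool_output` is a hypothetical helper, not a LangChain API.

```python
from typing import Callable


def truncate_tool_output(
    func: Callable[[str], str], max_chars: int = 500
) -> Callable[[str], str]:
    """Wrap `func` so that its string result is capped at `max_chars` characters."""

    def wrapped(query: str) -> str:
        output = func(query)
        if len(output) <= max_chars:
            return output
        # Mark the cut so the agent can tell the text is incomplete.
        return output[:max_chars].rstrip() + " [truncated]"

    return wrapped


# Stand-in for a verbose tool; a real tool function would be wrapped the same way.
verbose_tool = lambda query: "step " * 400
short_tool = truncate_tool_output(verbose_tool, max_chars=40)
print(short_tool("plan my weekend"))
```

In this notebook the wrapper could be applied to the TODO tool, for example by passing `truncate_tool_output(todo_chain.run)` as the tool's `func`. Whether 500 characters is a sensible cap is untested here.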

### What we learned
- **Completion models ≠ chat models**: Legacy LangChain agents like `ZeroShotAgent` were built around OpenAI's completion API. Chat models need explicit, strict format instructions to follow the ReAct pattern reliably.
- **Objective design matters for agentic systems**: An open-ended objective like *"Write a weather report"* reads as a project to BabyAGI and spawns unbounded sub-tasks. A well-scoped question like *"What are the top 3 things to do in SF this weekend?"* gives the system a natural stopping condition.
- **Stateful objects need full resets**: Any object that accumulates state (vectorstore, agent) must be fully reconstructed between runs — partial resets silently fail.
- **Context grows multiplicatively**: Each BabyAGI iteration × agent steps × tool response length compounds quickly. In agentic systems, verbose tools (like TODO) are a hidden cost — every character they return gets carried forward in the scratchpad for all subsequent LLM calls.
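
As a rough illustration of the last point, the following sketch estimates how many characters are sent to the LLM over a whole run. It rests on simplifying assumptions: every agent step resends a fixed base prompt plus all tool outputs from earlier steps of the same task, and the scratchpad starts empty for each BabyAGI iteration. The function name and all the numbers are illustrative, not measurements from this notebook.

```python
def estimate_prompt_chars(
    base_prompt_chars: int,
    tool_output_chars: int,
    agent_steps: int,
    iterations: int,
) -> int:
    """Estimate total characters sent to the LLM across a run."""
    per_iteration = 0
    for step in range(agent_steps):
        # At step k the scratchpad already holds k earlier tool outputs.
        per_iteration += base_prompt_chars + step * tool_output_chars
    return per_iteration * iterations


# Verbose tool: 2000-character responses.
print(estimate_prompt_chars(1000, 2000, agent_steps=5, iterations=3))  # 75000
# Terse tool: 200-character responses.
print(estimate_prompt_chars(1000, 200, agent_steps=5, iterations=3))  # 21000
```

Under these assumptions, cutting tool output by a factor of ten reduces the total by well over half, because the tool output is resent at every later step.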