# GPT-Swarm

[Language Agents as Optimizable Graphs](https://arxiv.org/pdf/2402.16823.pdf) by Zhuge et al., from Schmidhuber's lab, frames agents as graphs (DAGs in their case). Each node is a unit of work (such as an LLM call or a tool), and each edge is a channel through which information passes between nodes. These graphs can themselves be composed into larger graphs that behave as a swarm to accomplish a task.

They then show how such an agent can be _optimized_ for a task using external rewards.

They optimize the graph on two levels:

1. Edge-level: They apply the classic [REINFORCE](https://link.springer.com/article/10.1007/BF00992696) algorithm to prune edges between nodes.
2. Node-level: For agent nodes, they ask an LLM to rewrite the node's prompt based on feedback.
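
To make the edge-level idea concrete, here is a toy sketch of REINFORCE over independent keep-probabilities, one per edge. It is not the paper's implementation: the function names, the running-mean baseline, the learning rate, and the `toy_reward` function are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def reinforce_edge_probs(reward_fn, num_edges, steps=2000, lr=0.1, seed=0):
    """Learn a keep-probability per edge with the REINFORCE estimator."""
    rng = random.Random(seed)
    logits = [0.0] * num_edges  # sigmoid(0) = 0.5: every edge starts as a coin flip
    baseline = 0.0  # running mean reward, used to reduce variance (an assumption)
    for _ in range(steps):
        probs = [sigmoid(logit) for logit in logits]
        # Sample a subgraph: keep each edge independently with its probability.
        mask = [1 if rng.random() < p else 0 for p in probs]
        reward = reward_fn(mask)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # d/d(logit) of log Bernoulli(mask | sigmoid(logit)) is (mask - prob).
        for i in range(num_edges):
            logits[i] += lr * advantage * (mask[i] - probs[i])
    return [sigmoid(logit) for logit in logits]


# Hypothetical reward: edge 0 helps the task, edge 1 hurts it.
def toy_reward(mask):
    return float(mask[0]) - float(mask[1])


edge_probs = reinforce_edge_probs(toy_reward, num_edges=2)
pruned_graph = [p > 0.5 for p in edge_probs]  # keep only edges that are likely useful
```

In a real graph the reward would come from running the sampled subgraph on a task, which is far more expensive than this toy function.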



This notebook walks through implementing the node-level optimization in LangGraph.

The two main components here are:

1. Node-level memory/history
2. Optimizer
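
As a rough picture of the first component, the sketch below stores run records per node and lets an optimizer read back a recent window. The `NodeMemory` class, its `add` method, and the record fields are hypothetical; only the `query_by_id` name mirrors the call used by the optimizer code in this notebook.

```python
from collections import defaultdict
from typing import Any


class NodeMemory:
    """Hypothetical per-node history: node id -> list of run records."""

    def __init__(self) -> None:
        self._records: dict[str, list[dict[str, Any]]] = defaultdict(list)

    def add(self, node_id: str, record: dict[str, Any]) -> None:
        self._records[node_id].append(record)

    def query_by_id(self, node_id: str) -> list[dict[str, Any]]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._records[node_id])


memory = NodeMemory()
memory.add("summarize", {"input": "long text", "output": "short text", "feedback": "ok"})
memory.add("summarize", {"input": "other text", "output": "", "feedback": "empty output"})
recent = memory.query_by_id("summarize")[-4:]  # the optimizer reads a recent window
```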


<!-- Some things to bear in mind here: -->
<!-- Graphs each have state. You typically don't want to mix outside class state with graph state. It is best to put everything in graph state (otherwise `.batch()` operations may do some funky things). -->

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from typing_extensions import TypedDict


class LearnableState(TypedDict):
    prompt: ChatPromptTemplate
    """The agent's prompt."""
    examples: list
    """Few-shot examples."""


class AgentState(TypedDict):
    parameters: LearnableState
    instructions: str


async def get_new_prompt(negative_examples):
    pass

In [None]:
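import asyncio
from typing import Optional

# `Node`, `LLMRegistry`, and `Message` are assumed to come from the GPT-Swarm
# codebase; their imports are not shown in this notebook.


class OptimizableOperation(Node):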
    def __init__(
        self,
        domain: str,
        combine_inputs_as_one: bool,
        prompt: str,
        model_name: Optional[str] = None,
        operation_description: str = "",
        id=None,
        max_domenstrations: int = 4,
    ):
        self.domain = domain
        self.model_name = model_name
        self.llm = LLMRegistry.get(model_name)
        super().__init__(operation_description, id, combine_inputs_as_one)
        self.operation_description = operation_description
        self.prompt = prompt
        self.domenstrations = []
        self.max_domenstrations = max_domenstrations

    def get_complete_prompt(self, inputs):
        pass

    async def evaluate(self, candidate) -> float:
        raise NotImplementedError

    async def get_new_prompt(self, negative_examples):
        tasks = []
        for negative_example in negative_examples:
            meta_prompt = f"""Here is an example where {self.operation_description} gets it wrong.
Input:
{negative_example['input']}
------------------
The output was:
{negative_example['output']}
------------------
It received the following feedback:
{negative_example['feedback']}
"""
            tasks.append(
                self.llm.agen(
                    [
                        Message(role="user", content=meta_prompt),
                        Message(
                            role="user",
                            content=f"Identify a problem in {self.operation_description} from the given example and suggest how to prevent it without mentioning the specific example. Respond in only one sentence.",
                        ),
                    ],
                    max_tokens=100,
                )
            )

        # Collect one piece of advice per negative example, then number them.
        responses = await asyncio.gather(*tasks)
        advice = ""
        for i, response in enumerate(responses):
            advice += f"{i + 1}. {response}\n"

        meta_prompt = f"""I'm trying to define {self.operation_description} by prompting.
My current prompt is:
"{self.prompt}"

To generate an improved prompt, consider the following:
{advice}
Generate an improved prompt within five sentences. Do not mention a specific task in the prompt!
The prompt should be wrapped with <START> and <END>.
"""
        new_prompt = await self.llm.agen(
            [Message(role="user", content=meta_prompt)], max_tokens=200
        )
        new_prompt = new_prompt.split("<END>")[0].split("<START>")[-1].strip()
        return new_prompt
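
The final parsing step in `get_new_prompt` can be read in isolation. The sketch below wraps the same expression in a helper, `extract_prompt`, which is a hypothetical name introduced only for illustration, and runs it on two made-up responses to show what happens when the tags are present and when they are missing.

```python
def extract_prompt(response: str) -> str:
    """Return the text between <START> and <END>, or the stripped text if both tags are absent."""
    return response.split("<END>")[0].split("<START>")[-1].strip()


wrapped = extract_prompt("Sure! <START> Answer concisely. <END> Hope that helps.")
unwrapped = extract_prompt("  Answer concisely.  ")
```

Because the expression falls back to the whole response when the model omits the tags, a chatty reply without tags would be adopted verbatim as the new prompt, so a stricter check may be worth adding.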

In [None]:
import asyncio
import random


async def optimize(
    node: OptimizableOperation, learn_demonstration=False, learn_prompt=True
):
    examples = node.memory.query_by_id(node.id)[-4:]
    positive_examples = [
        example for example in examples if node.memory.query_by_id(example["task"])[0]
    ]
    negative_examples = [
        example
        for example in examples
        if not node.memory.query_by_id(example["task"])[0]
    ]

    prompts = [node.prompt]
    demonstrations = [node.domenstrations]
    if learn_demonstration:
        new_domenstrations = node.domenstrations + positive_examples
        if len(new_domenstrations) > node.max_domenstrations:
            new_domenstrations = random.sample(
                new_domenstrations, node.max_domenstrations
            )
        demonstrations.append(new_domenstrations)

    if learn_prompt and len(negative_examples) > 0:
        new_prompt = await node.get_new_prompt(negative_examples)
        prompts.append(new_prompt)

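    # Every combination of (old or new prompt) and (old or new demonstrations).
    candidates = [
        (prompt, domenstrations)
        for prompt in prompts
        for domenstrations in demonstrations
    ]
    # Only the current configuration exists, so there is nothing to compare.
    if len(candidates) == 1:
        return
    # Score all candidates concurrently and keep the best-scoring pair.
    tasks = [node.evaluate(candidate) for candidate in candidates]
    scores = await asyncio.gather(*tasks)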
    node.prompt, node.domenstrations = candidates[scores.index(max(scores))]
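
The selection step at the end of `optimize` can be tried without an LLM. The sketch below is a self-contained approximation under stated assumptions: `pick_best` and `toy_evaluate` are hypothetical names, and the scorer is a stand-in for whatever task-specific `evaluate` a real node would implement.

```python
import asyncio
from itertools import product


async def pick_best(prompts, demonstration_sets, evaluate):
    """Score every (prompt, demonstrations) pair concurrently and return the best one."""
    candidates = list(product(prompts, demonstration_sets))
    scores = await asyncio.gather(*(evaluate(c) for c in candidates))
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


# Hypothetical scorer: prefers longer prompts and more demonstrations.
async def toy_evaluate(candidate):
    prompt, demonstrations = candidate
    return len(prompt) + 10 * len(demonstrations)


best, best_score = asyncio.run(
    pick_best(
        prompts=["Be brief.", "Be brief and cite sources."],
        demonstration_sets=[[], [{"input": "q", "output": "a"}]],
        evaluate=toy_evaluate,
    )
)
```

Inside a running notebook event loop, `await pick_best(...)` would replace the `asyncio.run(...)` call.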