# Create a forge and execute an evolution cycle
In this notebook, we will go through the very few steps needed to run a forge cycle for a given budget. 

As of today, the easiest way to experiment with Ebiose is to use the OpenAI API. To do so, set your OpenAI API key in a `.env` file or replace `"your-openai-api-key"` in the following code block:

In [1]:
import os 
from dotenv import load_dotenv
load_dotenv()
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
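Because the cell above silently falls back to the placeholder string, a quick check before spending any budget can be useful. The following is a minimal sketch using only the standard library; the helper name `api_key_is_configured` is hypothetical and not part of Ebiose.

```python
import os

# Placeholder string used in the cell above.
PLACEHOLDER = "your-openai-api-key"

def api_key_is_configured(env=os.environ) -> bool:
    """Return True only if OPENAI_API_KEY is set to something other than the placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    return bool(key.strip()) and key != PLACEHOLDER

if not api_key_is_configured():
    print("OPENAI_API_KEY is missing or still set to the placeholder.")
```

The check only inspects the environment; it does not verify that the key is accepted by the API.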

> 💡 Recall that you may need to add the root of the repository to your `PYTHONPATH` environment variable. You may also use a `.env` file to do so or execute the following:


In [2]:
import sys
sys.path.append("./../")


## Creating a basic forge

In Ebiose, a **forge** is where custom agents are created to solve specific problems. The forge is the exclusive origin of new agents. Within each forge, architect agents orchestrate the creation and improvement of agents by reusing existing building blocks from the ecosystem.

To create a forge and run a cycle, you must provide the following:
- a description of the forge, which defines the problem that must be solved by generated agents;
- the expected format of the agent's input and output, defined as Pydantic models;
- an implementation of the `compute_fitness` abstract method that will be used by the forge to evaluate the generated agents.

Let's say we wish to generate agents specialized in solving math problems. The forge description could be:

In [3]:
forge_description = "Solving math word problems"

Next, we need to define the expected input and output formats of the generated agents. These formats are to be defined as Pydantic models. 

For instance, in our context of solving math problems, we want the agent input to be a string which will represent the math problem to be solved and the agent output to be composed of two fields:
- `solution` which will be the final solution to the math problem, given as an integer;
- `rationale` which will be the rationale behind the found solution.

The IO Pydantic models will thus be:

In [4]:
from pydantic import BaseModel

class AgentInput(BaseModel):
    math_problem: str

class AgentOutput(BaseModel):
    solution: int
    rationale: str
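To illustrate what these models enforce, here is a small sketch that redeclares them so it can run on its own and then builds one valid and one invalid output. The sample values are made up for illustration.

```python
from pydantic import BaseModel, ValidationError

class AgentInput(BaseModel):
    math_problem: str

class AgentOutput(BaseModel):
    solution: int
    rationale: str

# A well-formed output is accepted.
ok = AgentOutput(solution=42, rationale="6 * 7 = 42")

# A solution that cannot be read as an integer fails validation.
try:
    AgentOutput(solution="forty-two", rationale="spelled out")
    rejected = False
except ValidationError:
    rejected = True

print(ok.solution, rejected)
```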

Lastly, we must provide a way of evaluating the generated agents by implementing the `compute_fitness` abstract method of the `AgentForge` class. For the sake of demonstration, we return a random float between 0 and 1 here, so that no tokens are spent on evaluation.

In [5]:
import random
from typing import Any

random.seed(7)

from ebiose.core.agent import Agent
from ebiose.core.agent_forge import AgentForge

class BasicForge(AgentForge):
    # Placeholder fitness: ignores the agent and returns a random score in [0, 1).
    async def compute_fitness(self, agent: Agent, compute_token_id: str, **kwargs: Any) -> float:
        return random.random()
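For comparison, a non-random fitness could be the fraction of labelled problems answered correctly. The sketch below is independent of Ebiose: `solve` is a hypothetical stand-in for however an agent is actually invoked (not shown in this notebook), and `PROBLEMS` is a made-up dataset.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical labelled problems: (question, expected integer answer).
PROBLEMS = [("What is 2 + 3?", 5), ("What is 4 * 6?", 24), ("What is 10 - 7?", 3)]

async def accuracy_fitness(
    solve: Callable[[str], Awaitable[int]],
    problems: list[tuple[str, int]] = PROBLEMS,
) -> float:
    """Return the fraction of problems for which `solve` gives the expected answer."""
    answers = await asyncio.gather(*(solve(question) for question, _ in problems))
    correct = sum(answer == expected for answer, (_, expected) in zip(answers, problems))
    return correct / len(problems)

async def stub_solver(problem: str) -> int:
    # Stand-in for an agent call: always answers 5.
    return 5

score = asyncio.run(accuracy_fitness(stub_solver))
```

Here the stub is right on one problem out of three, so `score` is one third. Inside a notebook, `await accuracy_fitness(stub_solver)` would replace the `asyncio.run` call.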



We can now instantiate the forge with the provided elements:

In [6]:
forge = BasicForge(
    name="Basic forge",
    description=forge_description,
    agent_input_model=AgentInput,
    agent_output_model=AgentOutput,
    default_generated_agent_engine_type="langgraph_engine",
    default_model_endpoint_id="azure-gpt-4o-mini"
)

## Running a forge cycle

Once the forge is instantiated, we can start generating agents by running a **forge cycle**. 

We must first define a list of available LLMs. Here, we will use GPT-4o mini only.

In [7]:

# from ebiose.compute_intensive_batch_processor.compute_intensive_batch_processor import (
#     ComputeIntensiveBatchProcessor,
# )
# ComputeIntensiveBatchProcessor.initialize()

The forge cycle is based on an evolutionary algorithm that requires two types of specialized agents: 
- **architect agents** which are agents that generate other agents from scratch;
- **genetic operator agents** which generate agents from one or several other existing agents.
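The two roles above can be pictured with a toy evolutionary loop over lists of digits. This is only an illustration of the general technique under simplifying assumptions, not Ebiose's actual algorithm: `architect`, `crossover`, `fitness`, `run_cycle` and the flat cost per candidate are all hypothetical.

```python
import random

rng = random.Random(0)
TARGET = [3, 1, 4, 1, 5]  # hypothetical "problem": match this list

def architect() -> list[int]:
    """Create a candidate from scratch (stand-in for an architect agent)."""
    return [rng.randint(0, 9) for _ in TARGET]

def crossover(a: list[int], b: list[int]) -> list[int]:
    """Combine two parents at a random cut point (stand-in for a genetic operator agent)."""
    cut = rng.randint(1, len(TARGET) - 1)
    return a[:cut] + b[cut:]

def fitness(candidate: list[int]) -> float:
    """Fraction of positions that match the target."""
    return sum(x == y for x, y in zip(candidate, TARGET)) / len(TARGET)

def run_cycle(budget: float, cost_per_candidate: float = 0.001, population_size: int = 4):
    """Evolve a population until creating one more candidate would exceed the budget."""
    population = [architect() for _ in range(population_size)]
    spent = population_size * cost_per_candidate
    while spent + cost_per_candidate <= budget:
        population.sort(key=fitness, reverse=True)
        # Breed the two fittest candidates and replace the weakest one.
        population[-1] = crossover(population[0], population[1])
        spent += cost_per_candidate
    population.sort(key=fitness, reverse=True)
    return population, spent

population, spent = run_cycle(budget=0.01)
print(population[0], fitness(population[0]), spent)
```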

For now, hand-made architect and crossover agents are provided in the `GraphUtils` class. We can load them as follows:

In [8]:
# from ebiose.core.engines.graph_engine.utils import GraphUtils

# architect_agent = GraphUtils.get_architect_agent(model_endpoint_id="azure-gpt-4o-mini")
# crossover_agent = GraphUtils.get_crossover_agent(model_endpoint_id="azure-gpt-4o-mini")

These agents are required to instantiate an Ebiose ecosystem:

In [9]:
# from ebiose.core.ecosystem import Ecosystem

# ecosystem = Ecosystem(
#     initial_architect_agents=[architect_agent],
#     initial_genetic_operator_agents=[crossover_agent],
# )

The forge cycle can now be launched by providing the created ecosystem, a budget in dollars (the forge cycle ends once this budget is exhausted) and, optionally, a path in which created agents and their fitness will be saved across generations. Note that we need to use `asyncio.run` to launch the forge cycle.
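A note on `asyncio.run`: it raises a `RuntimeError` when called from a thread that already has a running event loop, which is the case in Jupyter. The sketch below shows that distinction with a stub coroutine; `stub_cycle` and `run_sync` are hypothetical names used only for illustration.

```python
import asyncio

async def stub_cycle(budget: float) -> str:
    """Stand-in for a forge cycle: yields control once, then reports the budget."""
    await asyncio.sleep(0)
    return f"cycle finished with budget ${budget:.2f}"

def run_sync(coro):
    """Run a coroutine with asyncio.run when no event loop is running in this thread."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No running loop (plain script): asyncio.run is safe.
        return asyncio.run(coro)
    # A loop is already running (e.g. a notebook): asyncio.run would fail here.
    coro.close()
    raise RuntimeError("Event loop already running: await the coroutine or patch with nest_asyncio.")

result = run_sync(stub_cycle(0.01))
print(result)
```

In a notebook, calling `nest_asyncio.apply()` first is what allows the nested `asyncio.run` call.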

> 🚨 Before executing the following cell, check the amount of budget you have allocated!

> 💡 If you are using VSCode, install the [*Markdown Preview Mermaid Support* extension](https://marketplace.visualstudio.com/items?itemName=bierner.markdown-mermaid) to allow the display of the generated agent's graphs.

In [10]:
import asyncio
import nest_asyncio

from ebiose.core.evo_forging_cycle import EvoForgingCylceConfig
nest_asyncio.apply()

from pathlib import Path
from datetime import UTC, datetime

# the path where results will be saved
current_time = datetime.now(UTC).strftime("%Y-%m-%d_%H-%M-%S")
SAVE_PATH = Path("./../data/") / current_time
SAVE_PATH.mkdir(parents=True, exist_ok=True)

# budget for the forge cycle, in dollars
BUDGET = 0.01

cycle_config = EvoForgingCylceConfig(
    budget=BUDGET,
    n_agents_in_population=2,
    n_selected_agents_from_ecosystem=0,
    replacement_ratio=0.5,
    save_path=SAVE_PATH
)

final_agents, final_fitness = asyncio.run(
    forge.run_new_cycle(config=cycle_config)
)


AttributeError: 'tags' is a ClassVar of `LangGraphEngine` and cannot be set on an instance. If you want to set a value on the class, use `LangGraphEngine.tags = value`.

We can now display the best agents that have been returned as follows. Note that:
- all agents can be found in the `SAVE_PATH` directory if you defined one;
- here, `compute_fitness` only returns a random float, so the agents displayed below have not been truly evaluated.

Go check [examples/math_forge/math_forge.py](./../examples/math_forge/math_forge.py) to see a fully implemented forge with a non-random fitness evaluation function.

In [None]:
forge.display_results(final_agents, final_fitness)

# Agent ID: agent-86411c26-42e7-402b-bad0-05a5baeda82d
## Fitness: 0.4016442563343041
```mermaid 
graph LR
	Start_Node[start_node] --> Llm_Node_1(DataExtractor)
	Llm_Node_1(DataExtractor) --> Llm_Node_2(EquationFormulator)
	Llm_Node_2(EquationFormulator) --> Llm_Node_3(CritiqueNode)
	Llm_Node_3(CritiqueNode) -->|If equation is valid.| End_Node[end_node]
	Llm_Node_3(CritiqueNode) -->|If equation needs refinement.| Llm_Node_2(EquationFormulator)
 
``` 
## Prompts:
##### Shared context prompt
You are part of a collaborative system designed to solve math word problems. Your role is to assist in understanding, processing, and resolving these problems through sequential interactions. The StartNode initiates the process, the DataExtractor node identifies key quantities and relationships, the EquationFormulator node creates the mathematical equation, and the CritiqueNode evaluates the equation for accuracy before reaching the EndNode. Your responses should reflect chain-of-thought reasoning, provide clear output for subsequent nodes, and incorporate self-reflection when necessary. Maintain clarity and coherence in your communication to ensure effective collaboration.
##### DataExtractor
Analyze the following math word problem and extract all the key quantities and relationships present.
##### EquationFormulator
Given the extracted data from the problem, formulate the corresponding mathematical equation.
##### CritiqueNode
Review the formulated equation and provide feedback on its validity. If any refinements are needed, specify what changes should be made.

# Agent ID: agent-27f5b771-bf53-4f33-ae74-a3febcb480c0
## Fitness: 0.16636628247192053
```mermaid 
graph LR
	Start_Node[start_node] --> Llm_Node_1(DataExtractor)
	Llm_Node_1(DataExtractor) --> Llm_Node_2(EquationFormulator)
	Llm_Node_2(EquationFormulator) --> End_Node[end_node]
 
``` 
## Prompts:
##### Shared context prompt
You are part of a collaborative system designed to solve math word problems. Your role is to assist in understanding, processing, and resolving these problems through sequential interactions. The StartNode initiates the process, the DataExtractor node identifies key quantities and relationships, the EquationFormulator node creates the mathematical equation, and the EndNode signifies completion. Your responses should reflect chain-of-thought reasoning, provide clear output for subsequent nodes, and incorporate self-reflection when necessary. Maintain clarity and coherence in your communication to ensure effective collaboration.
##### DataExtractor
Analyze the following math word problem and extract all the key quantities and relationships present.
##### EquationFormulator
Given the extracted data from the problem, formulate the corresponding mathematical equation.
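
The results displayed above can also be ranked in plain Python. The sketch below assumes fitness values are available as a mapping from agent ID to score, which is an assumption: the actual types of `final_agents` and `final_fitness` are not shown in this notebook. The IDs and values are copied from the output above.

```python
# Hypothetical mapping from agent ID to fitness, copied from the displayed output.
fitness_by_agent = {
    "agent-27f5b771-bf53-4f33-ae74-a3febcb480c0": 0.16636628247192053,
    "agent-86411c26-42e7-402b-bad0-05a5baeda82d": 0.4016442563343041,
}

# Sort agents from highest to lowest fitness.
ranked = sorted(fitness_by_agent.items(), key=lambda item: item[1], reverse=True)
best_id, best_fitness = ranked[0]

for agent_id, value in ranked:
    print(f"{agent_id}: {value:.3f}")
```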

