# Optimizing your pipelines

With HybridAGI you can optimize your pipelines with ease. During optimization, the system simulates the entire pipeline and selects the best examples for each sub-module of HybridAGI according to the metric you provide.

## Setting up the Knowledge Base locally

HybridAGI works with a low-latency hybrid vector/graph database called [FalkorDB](https://www.falkordb.com/). This knowledge base must be running whenever you work with HybridAGI.

Start the database using Docker with the following command in your terminal:

```bash
docker run -p 6379:6379 -p 3000:3000 -it --rm falkordb/falkordb:edge
```
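Before running the cells that follow, it can help to confirm that the database port is reachable. The snippet below is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical and not part of HybridAGI, and it only checks that something accepts TCP connections on port 6379 (the port published by the command above), not that the service is actually FalkorDB.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 6379, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Port 6379 is reachable.")
    else:
        print("Nothing is listening on port 6379; is the container running?")
```

If the check fails, make sure the Docker container is still running in its terminal.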

Before optimizing a pipeline, we first need to create one. Here we reuse the pipeline from the code interpreter agent.

```python
import dspy
from hybridagi import HybridAGI

# Set up the LM
lm = dspy.OllamaLocal(model='mistral', max_tokens=1024, stop=["\n\n\n"])

dspy.configure(lm=lm)

agent = HybridAGI(
    agent_name = "optimized_agent",
)

agent.add_programs_from_folders(["programs/code_interpreter"])
```



### Why we need optimization, and why it is important

Many people implementing agent frameworks assume that relying on the "best" LMs on the market is enough: give the model the right prompt and it will understand, right?

At the current state of the art, LMs need an exponential number of training examples for each concept to achieve a linear gain in understanding.

What does this mean in practice? It means that the current belief that scaling will solve everything is **wrong**. An exponential number of examples means not only that the training dataset needs to grow exponentially, but *also* that the size of the models needs to grow exponentially to recall all that data efficiently. As the cost of inference rises and start-ups start to fall, people will realize that bigger is not better. Relying on scaling alone with an architecture that doesn't scale is not the way to go.

By giving the LM examples of inputs and outputs in its context, we can improve the results. No matter how well you explain your task in the prompt, examples will always help the system.

The more out-of-distribution your task is, the more examples you need to provide.

This prompt engineering technique is called few-shot prompting, and it is one of the simplest and most powerful ways to improve the atomic steps of your pipelines.
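To make the idea concrete, here is a minimal sketch of how a prompt with a few input/output examples can be assembled. The function `build_few_shot_prompt` and the `Question:`/`Answer:` layout are illustrative assumptions, not the prompt format that HybridAGI or DSPy actually uses.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a prompt from an instruction, worked examples, and the new input."""
    parts = [instruction.strip(), ""]
    for question, answer in examples:
        parts.append(f"Question: {question}")
        parts.append(f"Answer: {answer}")
        parts.append("")
    # The final question is left unanswered so that the LM completes it.
    parts.append(f"Question: {query}")
    parts.append("Answer:")
    return "\n".join(parts)


examples = [
    ("A car travels 100 km in 2 hours. What is its average speed?", "50 km/h"),
    ("A 10 V battery is connected to a 5 Ω resistor. What is the current?", "2 A"),
]
prompt = build_few_shot_prompt(
    "Answer the physics question with a number and a unit.",
    examples,
    "An object falls freely from rest for 3 s. What is its speed? (g = 9.81 m/s²)",
)
print(prompt)
```

Each extra example makes the prompt longer, so the number of examples is a trade-off between guidance and context size.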

### Automatic generation and selection of examples

In DSPy, the process to automatically generate and select the examples at inference time is called bootstrapping.

This process is crucial because hand-writing these examples is time-consuming and does not adapt to changes in the prompt or pipeline: each time you modify your pipeline, you need to rewrite the examples.
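The core idea of bootstrapping can be sketched in a few lines: run the program on training inputs, score each output with the metric, and keep only the passing runs as examples. This is a deliberately simplified illustration with toy stand-ins (`toy_program`, `toy_metric`, and `bootstrap_demos` are all hypothetical names); DSPy's real optimizers work on traces of each sub-module and are more elaborate.

```python
def bootstrap_demos(program, trainset, metric, max_demos=2):
    """Keep up to max_demos (input, output) pairs whose output passes the metric."""
    demos = []
    for example in trainset:
        prediction = program(example)
        if metric(example, prediction):
            demos.append((example, prediction))
        if len(demos) >= max_demos:
            break
    return demos


# Toy stand-in for a pipeline: adds two numbers, but is deliberately wrong when a == 2.
def toy_program(question):
    a, b = question
    return a + b if a != 2 else -1


# Toy stand-in for a metric: checks the prediction against the true sum.
def toy_metric(question, prediction):
    return prediction == sum(question)


trainset = [(1, 1), (2, 2), (3, 3), (4, 4)]
demos = bootstrap_demos(toy_program, trainset, toy_metric, max_demos=2)
print(demos)  # → [((1, 1), 2), ((3, 3), 6)]
```

The failing run for `(2, 2)` is dropped, which is what keeps bad outputs from being reused as examples.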

HybridAGI stands apart from agent frameworks that don't provide a way to automatically generate, evaluate, and select these examples.

Now that our pipeline is set up and we know why it needs optimizing, we are going to select some good examples.

### Handling infinite loops

One could argue that HybridAGI is susceptible to infinite loops, since we use the LM to evaluate the conditions of loops.
Before diving into this aspect, let's understand why it can happen. Because HybridAGI guides the agent's reasoning process, an infinite loop means that an error occurred while the agent was evaluating a decision. To make good decisions, the LM's context must first be populated with the right data. Make sure that your pipeline is correct and uses enough intermediate steps if your task is difficult, for example by adding a critique stage before evaluating the correctness of an answer.

We can then improve the decision process by providing examples, as explained above, but during optimization we also need to discard any example that introduced an infinite loop (meaning it introduced errors). This is done in the metric, which tests whether the agent exceeded the maximum number of iterations and, if so, returns `False` or `0.0`. Every metric we provide handles this condition.
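The guard described above can be sketched as follows. This is not HybridAGI's implementation: the `ToyPrediction` class, its `iterations` field, the `max_iters` parameter, and the exact-match comparison are all assumptions made for illustration (the provided metrics have their own signatures and scoring logic).

```python
from dataclasses import dataclass


@dataclass
class ToyPrediction:
    final_answer: str
    iterations: int  # hypothetical: number of steps the agent took


def guarded_metric(expected: str, prediction: ToyPrediction, max_iters: int = 20) -> float:
    """Score 0.0 when the run hit the iteration limit, otherwise compare answers."""
    if prediction.iterations >= max_iters:
        # Hitting the limit is treated as an infinite loop, so the example is rejected.
        return 0.0
    return 1.0 if prediction.final_answer.strip() == expected.strip() else 0.0


print(guarded_metric("4.9 m/s²", ToyPrediction("4.9 m/s²", iterations=5)))   # → 1.0
print(guarded_metric("4.9 m/s²", ToyPrediction("4.9 m/s²", iterations=20)))  # → 0.0
```

Note that the second call is rejected even though the answer matches: a run that only terminated because of the iteration limit is not a trustworthy example.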

```python
import dspy
from hybridagi.metrics import factual_answer

trainset = [
    dspy.Example(objective="An object is thrown upward with an initial velocity of 20 m/s. How long does it take for the object to reach a height of 10 m?").with_inputs("objective"),
    dspy.Example(objective="A car accelerates from 0 to 60 mph in 8 seconds. What is the acceleration of the car in m/s²?").with_inputs("objective"),
    dspy.Example(objective="An object is moving in a straight line with a constant acceleration of 2 m/s². If the object's initial velocity is 5 m/s, what is its velocity after 10 seconds?").with_inputs("objective"),
    dspy.Example(objective="A tank contains 100 L of water at a temperature of 20°C. How much heat energy is required to raise the temperature of the water to 80°C?").with_inputs("objective"),
    dspy.Example(objective="A plane is flying at a constant speed of 600 mph and has enough fuel to fly for 5 hours. What is the maximum distance the plane can fly?").with_inputs("objective"),
    dspy.Example(objective="A pendulum has a length of 1 m and is released from an angle of 30 degrees. What is the pendulum's angular velocity when it reaches the bottom of its swing?").with_inputs("objective"),
    dspy.Example(objective="A block of mass 5 kg is placed on a frictionless inclined plane that makes an angle of 30 degrees with the horizontal. What is the acceleration of the block down the plane?").with_inputs("objective"),
    dspy.Example(objective="A proton and an electron are separated by a distance of 1 nm. What is the electrostatic potential energy of the system?").with_inputs("objective"),
    dspy.Example(objective="A cyclist is riding around a circular track with a radius of 50 m. If the cyclist's speed is constant at 10 m/s, what is the magnitude of their acceleration?").with_inputs("objective"),
    dspy.Example(objective="A satellite is orbiting the Earth in a circular orbit with a radius of 42,000 km. What is the satellite's orbital speed?").with_inputs("objective"),
]

valset = [
    dspy.Example(objective="An object is dropped from a height of 10 m. How long does it take for the object to reach the ground?").with_inputs("objective"),
    dspy.Example(objective="A gas is contained in a cylinder with a movable piston. The gas is heated, causing the piston to move outward and the gas to expand. If the initial pressure of the gas is 100 kPa and the final pressure is 50 kPa, what is the ratio of the final volume to the initial volume?").with_inputs("objective"),
    dspy.Example(objective="A 10 V battery is connected to a 2 Ω resistor. What is the current in the circuit and the power dissipated by the resistor?").with_inputs("objective"),
    dspy.Example(objective="A ball is thrown horizontally with an initial velocity of 15 m/s from a height of 20 m. How long does it take for the ball to hit the ground?").with_inputs("objective"),
    dspy.Example(objective="A block of mass 3 kg is sliding on a horizontal surface with a speed of 4 m/s. If the coefficient of kinetic friction between the block and the surface is 0.2, how far will the block slide before coming to a stop?").with_inputs("objective"),
]

agent.optimize(
    trainset = trainset,
    valset = valset,
    metric = factual_answer, # This metric checks that the final answer is factually correct according to the program trace
    teacher_lm = None, # Here we use the same LM to generate the examples, but you could use a bigger one
    epochs = 3, # The number of epochs for the optimization
    max_bootstrapped_demos = 2, # Maximum number of bootstrapped examples used to populate the prompt (between 1 and 5 is good; more examples = bigger prompt)
    save_checkpoints = True, # Automatically save the best examples into {agent_name}.json
    verbose = False, # Print the intermediate steps
)
```

```text
Evaluating the baseline...
  0%|          | 0/5 [00:00<?, ?it/s]The time it takes for the object to reach the ground is 1.4278431229270645 seconds.
Average Metric: 4 / 4  (100.0):  80%|████████  | 4/5 [05:03<01:23, 83.84s/it]The block will slide approximately 4.08 meters before coming to a stop.
Average Metric: 4 / 5  (80.0): 100%|██████████| 5/5 [05:29<00:00, 65.82s/it]
Baseline score: 80.0
Epoch 1/3
Optimizing the underlying prompts...
  0%|          | 0/5 [00:00<?, ?it/s]The time it takes for the object to reach the ground is 1.4278431229270645 seconds.
Average Metric: 3 / 4  (75.0):  80%|████████  | 4/5 [04:16<01:00, 60.79s/it] The block will slide approximately 4.08 meters before coming to a stop.
Average Metric: 3 / 5  (60.0): 100%|██████████| 5/5 [04:42<00:00, 56.42s/it]
  0%|          | 0/5 [00:00<?, ?it/s]The time it takes for the object to reach the ground is 1.4278431229270645 seconds.
Average Metric: 4 / 4  (100.0):  80%|████████  | 4/5 [02:33<00:40, 40.66s/it]The block wi

 10%|█         | 1/10 [00:30<04:30, 30.01s/it]

The acceleration of the car is: 3.3528 m/s²

 20%|██        | 2/10 [01:03<04:12, 31.53s/it]

  0%|          | 0/5 [00:00<?, ?it/s]The time it takes for the object to reach the ground is 1.4278431229270645 seconds.
Average Metric: 1 / 1  (100.0):  20%|██        | 1/5 [34:18<02:23, 35.84s/it]

2024-06-29T14:05:40.703157Z [error    ] Error for example in dev set: unsupported operand type(s) for +=: 'int' and 'NoneType' [dspy.evaluate.evaluate] filename=evaluate.py lineno=180

Average Metric: 2.0 / 3  (66.7):  60%|██████    | 3/5 [34:48<22:16, 668.45s/it]
```

Now you can inspect the examples in the `optimized_agent.json` file containing your compiled examples!
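One way to take a first look at the checkpoint is to load it with the standard `json` module and list its top-level keys. This is only a sketch: the internal layout of `optimized_agent.json` is not described here, so the helper below (the hypothetical `summarize_json`) makes no assumption about it beyond the file being valid JSON.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a {top-level key: type name} summary of a JSON file, or None if it is missing."""
    file = Path(path)
    if not file.exists():
        return None
    data = json.loads(file.read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (e.g. a list) are reported under a placeholder key.
    return {"<root>": type(data).__name__}


# The file name follows the agent_name used earlier in this guide.
print(summarize_json("optimized_agent.json"))
```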

```python
prediction = agent.execute("A block of mass 5 kg is placed on a frictionless inclined plane that makes an angle of 30 degrees with the horizontal. What is the acceleration of the block down the plane?", verbose = True)

print(prediction.final_answer)
```

--- Step 0 ---
Call Program: main
Program Purpose: A block of mass 5 kg is placed on a frictionless inclined plane that makes an angle of 30 degrees with the horizontal. What is the acceleration of the block down the plane?
--- Step 1 ---
Action Purpose: Plannify how to implement the code to answer the objective's question
Action: {
  "answer": "To solve this problem, we will use the equation for acceleration on an inclined plane, which is given by:\n\na = g * sin(\u03b8)\n\nwhere:\n- a is the acceleration of the block down the plane\n- g is the acceleration due to gravity (approximately 9.81 m/s\u00b2)\n- \u03b8 is the angle of the inclined plane with respect to the horizontal\n\nIn this case, the mass of the block is given as 5 kg, and the angle of the inclined plane is 30 degrees. So, we will code:\n\n```python\n# Define constants\ng = 9.81 # m/s\u00b2\ntheta = math.radians(30) # radians\n\n# Calculate acceleration\na = g * math.sin(theta) # m/s\u00b2\n\nprint(\"The 