# OptiGuide Example




Here we give a simple example, as designed and illustrated in the [OptiGuide paper](https://arxiv.org/abs/2307.03875).
While the original paper targets supply chain optimization specifically, the general framework can be adapted to other applications in which an LLM can answer questions by writing code.




## OptiGuide for Supply Chain Optimization: System Design Overview

The original system design for OptiGuide, tailored for supply chain optimization, is presented below.

The collaboration among three agents -- Coder, Safeguard, and Interpreter -- lies at the core of this system. They leverage a set of external tools and a large language model (LLM) to address users' questions related to supply chain applications. For a comprehensive understanding of the design and data flow, detailed information can be found in the original [paper](https://arxiv.org/abs/2307.03875).


![optiguide system](https://www.beibinli.com/docs/optiguide/optiguide_system.png)


## OptiGuide Integration with Autogen

The OptiGuide framework can be implemented using various large language model (LLM) packages or libraries. Autogen provides a multi-agent design that facilitates agent interactions and memory management, resulting in a more elegant implementation of OptiGuide.

Here, we have broken down and implemented the OptiGuide framework within Autogen, with details as follows.

The central role of the OptiGuide agent is to serve as a master agent, overseeing the creation and management of several distinct agents. Additionally, it takes responsibility for handling memory related to user interactions, enabling it to retain valuable information pertaining to users' questions and their corresponding answers. This memory is then shared with other agents within the system, allowing them to effectively consider a user's previous inquiries.

The OptiGuide agent supervises three user proxy agents and two assistant agents, each with a specific function:
- Coder: A user proxy agent dedicated to the task of writing code.
- Safeguard: Another user proxy agent designed to identify and detect adversarial code effectively.
- Interpreter: A user proxy agent specialized in interpreting execution results and presenting them in a human-readable format.
- Pencil: An assistant agent providing support to both the Coder and Interpreter. It possesses the capability to "write" code and provide interpretations.
- Shield: An assistant agent offering support to the Safeguard.

In this implementation, the "Coder" and "Interpreter" refer to the same entity utilizing the Pencil to generate content, which could either be code or English sentences. This design choice enables the Pencil to access and share the same memory while engaging in communication with both the "Coder" and the "Interpreter."

Additionally, the Safeguard does not share memory with the Coder or Interpreter. This deliberate choice lets the Safeguard act as an impartial adversarial checker: because it is not exposed to the LLM's earlier code and interpretations, potential biases are mitigated and its verdicts are more reliable.
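
The memory layout described above can be sketched in plain Python. This is a minimal illustration and not the Autogen implementation: the `ToyAssistant` class and its `ask` method are hypothetical stand-ins that only record which messages each assistant has seen.

```python
class ToyAssistant:
    """Hypothetical stand-in for an LLM-backed assistant; it only records what it sees."""

    def __init__(self, name):
        self.name = name
        self.memory = []  # every message this assistant has received

    def ask(self, message):
        self.memory.append(message)
        return f"{self.name} reply #{len(self.memory)}"


pencil = ToyAssistant("pencil")  # shared by the Coder and the Interpreter roles
shield = ToyAssistant("shield")  # used only by the Safeguard role

question = "What if we prohibit shipping from supplier 1 to roastery 2?"

# Coder role: Pencil sees the user's question and writes code.
code = pencil.ask(f"Question: {question}\nAnswer Code:")

# Safeguard role: Shield sees only the code, never the question or Pencil's history.
verdict = shield.ask(f"--- Code ---\n{code}")

# Interpreter role: Pencil again, so the earlier question is still in its memory.
answer = pencil.ask("Execution result: <solver output>\nInterpret the result:")

print(len(pencil.memory), len(shield.memory))
```

In this toy version, Pencil accumulates both the coding and the interpretation requests, while Shield's memory holds a single message that contains only the code.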


![optiguide system](https://www.beibinli.com/docs/optiguide/autogen_optiguide_sys.png)

Advantages of this multi-agent design with Autogen:
- Collaborative Problem Solving: The collaboration among the user proxy agents (Coder and Interpreter) and the assistant agents (Pencil and Shield) fosters a cooperative problem-solving environment. The agents can share information and knowledge, allowing them to complement each other's abilities and collectively arrive at better solutions. On the other hand, the Safeguard acts as a virtual adversarial checker, which can perform another safety check pass on the generated code.

- Modularity: The division of tasks into separate agents promotes modularity in the system. Each agent can be developed, tested, and maintained independently, simplifying the overall development process and facilitating code management.

- Memory Management: The OptiGuide agent's role in maintaining memory related to user interactions is crucial. The memory retention allows the agents to have context about a user's prior questions, making the decision-making process more informed and context-aware.



In [1]:
# %%capture # no printing logs here.
# # Install FLAML for agents
# %pip install "flaml[autogen,optiguide]~=2.0.0rc4"

In [2]:
# test Gurobi installation
import gurobipy as gp
from gurobipy import GRB
from eventlet.timeout import Timeout

# import auxiliary packages
import re
import requests  # for loading the example source code
import openai
import os


# import flaml and autogen
from flaml import autogen
from flaml.autogen.agentchat import Agent, ResponsiveAgent
from flaml.autogen.agentchat.contrib.opti_guide import OptiGuideAgent

In [3]:
autogen.oai.ChatCompletion.start_logging()


config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": {
#             "gpt-4",
#             "gpt4",
#             "gpt-4-32k",
#             "gpt-4-32k-0314",
            "gpt-3.5-turbo",
            "gpt-3.5-turbo-16k",
            "gpt-3.5-turbo-0301",
            "chatgpt-35-turbo-0301",
            "gpt-35-turbo-v0301",
        }
    }
)
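
To make the filtering step concrete, here is a rough sketch of the kind of selection that `filter_dict` performs, written without FLAML. The sample entries and the `filter_configs` helper are hypothetical, and the real `config_list_from_json` may differ in its details; the sketch only assumes that an entry is kept when its value for each filtered key is in the allowed set.

```python
import json

# Hypothetical contents of an OAI_CONFIG_LIST file (the keys are placeholders).
raw = """
[
    {"model": "gpt-4", "api_key": "<key-1>"},
    {"model": "gpt-3.5-turbo", "api_key": "<key-2>"},
    {"model": "gpt-3.5-turbo-16k", "api_key": "<key-3>"}
]
"""


def filter_configs(configs, filter_dict):
    """Keep configs whose value for every filtered key is in the allowed set."""
    return [
        cfg
        for cfg in configs
        if all(cfg.get(key) in allowed for key, allowed in filter_dict.items())
    ]


configs = json.loads(raw)
selected = filter_configs(configs, {"model": {"gpt-3.5-turbo", "gpt-3.5-turbo-16k"}})
print([cfg["model"] for cfg in selected])
```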


In [4]:
openai.api_type = "azure"
openai.api_version = "2023-03-15-preview"
openai.api_base = "https://your.api.address.here.com"
openai.api_key = "<Your API Key goes here>"


Now, let's load the source code from a URL and define some in-context training examples as a string below.

In [5]:
# Get the source code of our coffee example
code_url = "https://www.beibinli.com/docs/optiguide/coffee.py"

code = requests.get(code_url).text

# show the head and tail of the source code
print("\n".join(code.split("\n")[:10]))
print(".\n" * 3)
print("\n".join(code.split("\n")[-10:]))


import time

from gurobipy import GRB, Model

# Example data

capacity_in_supplier = {'supplier1': 150, 'supplier2': 50, 'supplier3': 100}

shipping_cost_from_supplier_to_roastery = {
.
.
.

m.update()
model.optimize()

print(time.ctime())
if m.status == GRB.OPTIMAL:
    print(f'Optimal cost: {m.objVal}')
else:
    print("Not solved to optimality. Optimization status:", m.status)




In [6]:
# In-context learning examples.
example_qa = """
----------
Question: Why is it not recommended to use just one supplier for roastery 2?
Answer Code:
```python
z = m.addVars(suppliers, vtype=GRB.BINARY, name="z")
m.addConstr(sum(z[s] for s in suppliers) <= 1, "_")
for s in suppliers:
    m.addConstr(x[s,'roastery2'] <= capacity_in_supplier[s] * z[s], "_")
```

----------
Question: What if there's a 13% jump in the demand for light coffee at cafe1?
Answer Code:
```python
light_coffee_needed_for_cafe["cafe1"] = light_coffee_needed_for_cafe["cafe1"] * (1 + 13/100)
```

"""
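
If there are many in-context examples, the string can also be assembled from question/code pairs. The helper below is a hypothetical convenience and not part of OptiGuide; it simply reproduces the layout used in `example_qa` above.

```python
FENCE = "`" * 3  # three backticks, built here to avoid nesting code fences


def format_examples(pairs):
    """Render (question, code) pairs in the same layout as example_qa."""
    blocks = []
    for question, code in pairs:
        blocks.append(
            "----------\n"
            f"Question: {question}\n"
            "Answer Code:\n"
            f"{FENCE}python\n{code}\n{FENCE}\n"
        )
    return "\n" + "\n".join(blocks) + "\n"


pairs = [
    (
        "What if there's a 13% jump in the demand for light coffee at cafe1?",
        'light_coffee_needed_for_cafe["cafe1"] = light_coffee_needed_for_cafe["cafe1"] * (1 + 13/100)',
    ),
]
print(format_examples(pairs))
```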

Now, let's create an OptiGuide agent and a user.

For the OptiGuide agent, we set "debug_times" to 1, which means it can debug its answer once if it encounters an error.

In [7]:
%%capture
agent = OptiGuideAgent(
    name="OptiGuide Coffee Example",
    source_code=code,
    debug_times=1,
    example_qa="",
    llm_config={
        "request_timeout": 600,
        "seed": 42,
        "config_list": config_list,
    },
)

user = ResponsiveAgent(
    "user",
    max_consecutive_auto_reply=0,
    human_input_mode="NEVER",
    code_execution_config=False,
)
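
The effect of `debug_times` can be pictured as a bounded retry loop. The sketch below is an assumption about the general pattern and not the OptiGuide source: `write_code` and `run_code` are hypothetical callables standing in for the LLM and the code executor, and the fallback message is invented for illustration.

```python
def answer_with_debugging(question, write_code, run_code, debug_times=1):
    """Try once, then allow up to `debug_times` repair attempts after failures."""
    prompt = question
    for _ in range(1 + debug_times):
        code = write_code(prompt)
        try:
            return run_code(code)
        except Exception as err:
            # Feed the error back so the next attempt can try to fix it.
            prompt = f"{question}\nPrevious code failed with: {err!r}\nPlease fix it."
    return "Sorry, I cannot answer this question."  # hypothetical fallback message


# Toy stand-ins: the first draft divides by zero, the repaired draft does not.
drafts = iter(["1 / 0", "1 / 2"])
result = answer_with_debugging(
    "What is one half?",
    write_code=lambda prompt: next(drafts),
    run_code=lambda code: eval(code),
    debug_times=1,
)
print(result)
```

With `debug_times=0` the same toy run would stop after the first failure and return the fallback message.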


In [8]:
user.send("What if we prohibit shipping from supplier 1 to roastery 2?", agent)


user (to OptiGuide Coffee Example):

What if we prohibit shipping from supplier 1 to roastery 2?

--------------------------------------------------------------------------------
OptiGuide Coffee Example (to writer):


Answer Code:


--------------------------------------------------------------------------------
writer (to OptiGuide Coffee Example):

model.addConstr(x['supplier1', 'roastery2'] == 0, "prohibit_supplier1_to_roastery2")

--------------------------------------------------------------------------------
OptiGuide Coffee Example (to safeguard):


--- Code ---
model.addConstr(x['supplier1', 'roastery2'] == 0, "prohibit_supplier1_to_roastery2")

--- One-Word Answer: SAFE or DANGER ---


--------------------------------------------------------------------------------
safeguard (to OptiGuide Coffee Example):

SAFE

--------------------------------------------------------------------------------
Gurobi Optimizer version 9.5.1 build v9.
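
The exchange above shows the safeguard being asked for a one-word verdict before the code is run. A minimal sketch of how such a verdict could gate execution follows. The `build_safeguard_prompt` and `is_safe` helpers, and the conservative treatment of unexpected replies, are assumptions and not the OptiGuide implementation.

```python
def build_safeguard_prompt(code):
    """Lay out the code the way the safeguard message appears in the log above."""
    return f"\n--- Code ---\n{code}\n\n--- One-Word Answer: SAFE or DANGER ---\n"


def is_safe(verdict):
    """Treat anything other than an explicit SAFE as unsafe (conservative default)."""
    return verdict.strip().upper() == "SAFE"


code = 'model.addConstr(x["supplier1", "roastery2"] == 0, "prohibit_supplier1_to_roastery2")'
prompt = build_safeguard_prompt(code)

for verdict in ["SAFE", "DANGER", "I think it is probably fine"]:
    action = "execute" if is_safe(verdict) else "reject"
    print(f"{verdict!r} -> {action}")
```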

In [9]:
user.send("What is the impact of supplier1 being able to supply only half the quantity at present?", agent)


user (to OptiGuide Coffee Example):

What is the impact of supplier1 being able to supply only half the quantity at present?

--------------------------------------------------------------------------------
OptiGuide Coffee Example (to writer):


Answer Code:


--------------------------------------------------------------------------------
writer (to OptiGuide Coffee Example):

# Prohibit the shipment from supplier 1 to roastery 2
model.addConstr(x['supplier1', 'roastery2'] == 0, "no_ship_supplier1_roastery2")

# Reduce the supply capacity of supplier 1 to half
new_capacity = capacity_in_supplier['supplier1'] / 2
model.addConstr(x.sum('supplier1', '*') <= new_capacity, "half_capacity_supplier1")

# Update and optimize the model
model.update()
model.optimize()

# Print the new objective value
if model.status == GRB.OPTIMAL:
    print(f'New optimal cost: {model.objVal}')
else:
    print("Not solved to optimality. Optimization status:", model.status)

---------

## A hard question
Now, let's try to identify a scenario where the language model cannot answer a question.

In the question that follows, the LLM will consistently attempt to write code with the ">" sign. However, this is problematic because Gurobi does not implement the strict ">" comparison between expressions. Instead, Gurobi constraints support only the ">=", "<=", and "==" operators.

Upon executing the code, Gurobi will raise a "NotImplementedError" without providing any further explanation. Thus, the LLM struggles to pinpoint the origin of the coding error. It may make several attempts to debug the code but will ultimately return to the user with a default give-up message.

NOTE: If we give the LLM enough chances, it might use a trial-and-error approach and eventually discover an alternative solution, as demonstrated in the following code block:
```python
# Add constraint that roastery 2 produces more light coffee than roastery 1
y_light_sum_roastery1 = sum(y_light['roastery1', c] for c in cafes)
y_light_sum_roastery2 = sum(y_light['roastery2', c] for c in cafes)

model.addConstr((y_light_sum_roastery2 - y_light_sum_roastery1) >= 1, "more_light_coffee_roastery2")
```
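
The rewrite works because, for integer-valued quantities, `a > b` holds exactly when `a - b >= 1`. The toy class below is not gurobipy; it is a hypothetical stand-in that mimics an expression type accepting `>=` but rejecting `>`, followed by a brute-force check of the equivalence.

```python
class ToyExpr:
    """Hypothetical stand-in for a solver expression: '>=' is accepted, '>' is rejected."""

    def __init__(self, name):
        self.name = name

    def __ge__(self, other):
        return f"{self.name} >= {other}"

    def __gt__(self, other):
        raise NotImplementedError  # no message, like the unhelpful error described above


diff = ToyExpr("y2 - y1")

try:
    diff > 0
except NotImplementedError as err:
    print("strict inequality rejected:", repr(err))

print("accepted:", diff >= 1)

# For integers, a > b is equivalent to a - b >= 1.
assert all((a > b) == (a - b >= 1) for a in range(-5, 6) for b in range(-5, 6))
```

For continuous variables there is no exact equivalent of a strict inequality, so any small positive offset would be a modelling choice rather than a faithful translation.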



In [10]:
# A question that cannot be answered,
# because of a limitation in Gurobi's constraint syntax.
user.send("What would happen if roastery 2 produced more light coffee than roastery 1?", agent)


user (to OptiGuide Coffee Example):

What would happen if roastery 2 produced more light coffee than roastery 1?

--------------------------------------------------------------------------------
OptiGuide Coffee Example (to writer):


Answer Code:


--------------------------------------------------------------------------------
writer (to OptiGuide Coffee Example):

# Add constraint: Roastery 2 should produce more light coffee than Roastery 1
model.addConstr(y_light['roastery2', 'cafe1'] + y_light['roastery2', 'cafe2'] + y_light['roastery2', 'cafe3'] >
                y_light['roastery1', 'cafe1'] + y_light['roastery1', 'cafe2'] + y_light['roastery1', 'cafe3'], "more_light_roastery2")

# Remember to remove this constraint or reset the model when you want to test other scenarios.

--------------------------------------------------------------------------------
OptiGuide Coffee Example (to safeguard):


--- Code ---
# Add constraint: Roastery 2 should

## Human Input Mode

The OptiGuide agent can communicate with several different users.

We can use a user agent here (a UserProxyAgent, or the ResponsiveAgent used below); to enable a continuous conversation, set human_input_mode="ALWAYS".

In this case, the human can provide feedback to help the LLM answer the question. So, let's (as the human) chat with the OptiGuide agent using the message:
```markdown
Gurobi does not support the '>' sign. Can you rewrite it with a '>=' sign and add an epsilon value of 1 (integer) to the right hand side?
```
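
As a rough mental model of `human_input_mode`, the toy function below decides whether an agent pauses for human input before replying. It is a simplification and not Autogen code: `asks_human` is a hypothetical helper, and the "TERMINATE" mode, which this notebook does not use, is reduced here to a single `is_termination` flag.

```python
def asks_human(mode, is_termination=False):
    """Simplified model: does the agent pause for human input before replying?"""
    if mode == "ALWAYS":
        return True
    if mode == "TERMINATE":
        return is_termination
    if mode == "NEVER":
        return False
    raise ValueError(f"unknown human_input_mode: {mode!r}")


for mode in ("ALWAYS", "TERMINATE", "NEVER"):
    print(mode, asks_human(mode), asks_human(mode, is_termination=True))
```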

In [11]:
user_2 = ResponsiveAgent("user", max_consecutive_auto_reply=0,
                        human_input_mode="ALWAYS", code_execution_config=False)

In [12]:
user_2.send("What would happen if roastery 2 produced more light coffee than roastery 1?", agent)


user (to OptiGuide Coffee Example):

What would happen if roastery 2 produced more light coffee than roastery 1?

--------------------------------------------------------------------------------
OptiGuide Coffee Example (to writer):


Answer Code:


--------------------------------------------------------------------------------
writer (to OptiGuide Coffee Example):

To find the optimal solution with the condition that roastery 2 produces more light coffee than roastery 1, you can add the following constraint:

```python
model.addConstr(y_light['roastery2', 'cafe1'] + y_light['roastery2', 'cafe2'] + y_light['roastery2', 'cafe3'] > y_light['roastery1', 'cafe1'] + y_light['roastery1', 'cafe2'] + y_light['roastery1', 'cafe3'], "more_light_coffee_roastery2")
```

This constraint ensures that the total amount of light coffee produced by roastery 2 is greater than the total amount produced by roastery 1.

------------------------------------------------------------

Presolve time: 0.00s
Presolved: 11 rows, 18 columns, 36 nonzeros
Variable types: 0 continuous, 18 integer (0 binary)
Found heuristic solution: objective 3276.0000000

Root relaxation: objective 2.470000e+03, 11 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0    2470.0000000 2470.00000  0.00%     -    0s

Explored 1 nodes (11 simplex iterations) in 0.01 seconds (0.00 work units)
Thread count was 32 (of 128 available processors)

Solution count 3: 2470 3276 3280 

Optimal solution found (tolerance 1.00e-04)
Best objective 2.470000000000e+03, best bound 2.470000000000e+03, gap 0.0000%
Gurobi Optimizer version 9.5.1 build v9.5.1rc2 (linux64)
Thread count: 64 physical cores, 128 logical processors, using up to 32 threads
Optimize a model with 12 rows, 18 columns and 42 nonzeros
Model fingerprint: 0x1dc3eaf8
Variable t