# LLM Optimization Modelling Experiment

In [3]:
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Image
from IPython.display import Markdown

## 1. Define the problem description

In [147]:
problem = '''Your goal is to invest in several of 10 possible investment strategies in the most optimal way. The historic returns of those strategies are stored in the file "investments_data.csv". Each column represents one strategy and the rows are the past investment outcomes. There is no index and the values are separated by a ;.

The costs for investing in a given investment is stored in a vector A, which has one value for each strategy in order.  
The values are: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]

You can only invest once into an investment. 

Unfortunately due to other costs and inflation, your available budget at this time is uncertain. There are four possible budget scenarios with different probabilities: scenario 1 with 1000 euros and probability of 0.55, scenario 2  with 1100 euros and probability of 0.4, scenario 3 with 900 euros and probability of 0.04, scenario 4 with 1200 euros and probability of 0.01. 
The tolerable probability of exceeding the budget is 0.4.

Please formulate a mean-variance mathematical model for this optimization problem, considering the past performance of investment strategies and the uncertain budget. You can take 2 as the risk parameter r.'''
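
The budget condition in the problem text can be read as a chance constraint: the probability that the total cost exceeds the realised budget must stay at or below 0.4. The sketch below is not part of the original experiment. It only works through the scenario arithmetic to show which spending level that tolerance allows; the helper name `exceed_probability` is an illustrative assumption.

```python
# Budget scenarios from the problem statement: (budget in euros, probability)
scenarios = [(1000, 0.55), (1100, 0.40), (900, 0.04), (1200, 0.01)]
alpha = 0.4  # tolerable probability of exceeding the budget


def exceed_probability(total_cost, scenarios):
    """Probability that total_cost is larger than the realised budget."""
    return sum(p for budget, p in scenarios if total_cost > budget)


# The exceedance probability only changes at the budget levels themselves,
# so it is enough to test each scenario budget as a candidate spending cap.
admissible = [b for b, _ in scenarios if exceed_probability(b, scenarios) <= alpha]
effective_budget = max(admissible)

for b, _ in sorted(scenarios):
    print(b, round(exceed_probability(b, scenarios), 2))
print("effective budget:", effective_budget)
```

Spending up to 1000 euros exceeds the budget only in the 900-euro scenario (probability 0.04), while spending anything above 1000 euros also fails in the 1000-euro scenario (probability 0.59 in total), so under this reading the tolerance of 0.4 amounts to a cap of 1000 euros.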

## 2. Generate the mathematical model

In [165]:
# Initialize the session. To replicate, make sure valid Google Cloud credentials are available in the environment (e.g. via the GOOGLE_APPLICATION_CREDENTIALS variable)
PROJECT_ID = "llm4optproblems"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

#Specifying the model
generative_multimodal_model = GenerativeModel("gemini-1.5-pro-preview-0409")

#The prompt applied to all problems
prompt = '''Please formulate a mathematical optimization model for this problem. Include parameters, decision variables, the objective function and the constraints in your answer.
'''

#Generate the response
response = generative_multimodal_model.generate_content([prompt+problem])


In [166]:
#Show the response in a formatted way
Markdown(response.text)

## Mathematical Optimization Model for Investment Strategy Selection

**Parameters:**

* **R<sub>ij</sub>:**  Return of investment strategy j in period i (from "investments_data.csv")
* **A<sub>j</sub>:** Cost of investing in strategy j (given vector A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225])
* **B<sub>s</sub>:** Available budget in scenario s ([1000, 1100, 900, 1200])
* **P<sub>s</sub>:** Probability of scenario s ([0.55, 0.4, 0.04, 0.01])
* **r:** Risk parameter (r = 2)
* **T:**  Number of periods in the historical data
* **J:**  Number of investment strategies (J = 10)
* **S:** Number of budget scenarios (S = 4)
* **α:** Tolerable probability of exceeding the budget (α = 0.4)

**Decision Variables:**

* **x<sub>j</sub>:** Binary variable, equal to 1 if investment strategy j is selected, 0 otherwise.

**Objective Function:**

Maximize the risk-adjusted expected return:

```
Maximize:  ∑<sub>s=1</sub><sup>S</sup> P<sub>s</sub> * [ ∑<sub>j=1</sub><sup>J</sup> ( 1/T * ∑<sub>i=1</sub><sup>T</sup> R<sub>ij</sub> * x<sub>j</sub>) ]  - r * [ ∑<sub>s=1</sub><sup>S</sup> P<sub>s</sub> * ( ∑<sub>j=1</sub><sup>J</sup> ( 1/T * ∑<sub>i=1</sub><sup>T</sup> (R<sub>ij</sub> -  (1/T * ∑<sub>i=1</sub><sup>T</sup> R<sub>ij</sub>))<sup>2</sup> * x<sub>j</sub>) ]
```

The first part of the objective function represents the expected return across all scenarios, weighted by their probabilities. The second part represents the variance of the returns, also weighted by probabilities and penalized by the risk parameter 'r'.

**Constraints:**

1. **Budget constraint:** The total cost of selected investments must not exceed the budget in any scenario with a probability greater than α:

```
∑<sub>j=1</sub><sup>J</sup> A<sub>j</sub> * x<sub>j</sub>  ≤ B<sub>s</sub> + M * y<sub>s</sub>   ∀ s ∈ S
∑<sub>s=1</sub><sup>S</sup> P<sub>s</sub> * y<sub>s</sub> ≤ α
```

where M is a large constant and y<sub>s</sub> is a binary variable that equals 1 if the budget is exceeded in scenario s, and 0 otherwise.

2. **Investment selection constraint:** An investment strategy can be selected only once:

```
x<sub>j</sub> ∈ {0,1}   ∀ j ∈ J
```

3. **Budget exceeding constraint:**

```
y<sub>s</sub> ∈ {0,1}   ∀ s ∈ S
```

This model aims to find the optimal combination of investment strategies that maximizes the risk-adjusted expected return while respecting the budget constraints with a given tolerance for exceeding the budget. 
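
One way to inspect the generated objective is to evaluate it numerically. In the formula as written, the bracketed terms do not depend on the scenario index s, and the scenario probabilities sum to 1, so the probability weighting should leave the value unchanged. The objective then reduces to a linear function of x with coefficient (mean return minus r times variance) for each strategy, without covariance terms between strategies. The sketch below checks this on hypothetical return data; the numbers and function names are assumptions for illustration, not results from the experiment.

```python
from itertools import product
from statistics import fmean, pvariance

# Hypothetical returns: rows are periods, columns are strategies
returns = [
    [0.02, -0.01, 0.04],
    [0.03, 0.02, 0.03],
    [0.01, 0.03, 0.02],
]
P = [0.55, 0.4, 0.04, 0.01]  # scenario probabilities from the problem
r = 2                        # risk parameter from the problem

columns = list(zip(*returns))
mean = [fmean(c) for c in columns]
var = [pvariance(c) for c in columns]  # population variance, i.e. the 1/T form


def objective_as_written(x):
    """Objective with the (redundant) scenario weighting kept in place."""
    exp_ret = sum(p * sum(m * xj for m, xj in zip(mean, x)) for p in P)
    variance = sum(p * sum(v * xj for v, xj in zip(var, x)) for p in P)
    return exp_ret - r * variance


def objective_simplified(x):
    """Same objective after dropping the scenario weighting."""
    return sum((m - r * v) * xj for m, v, xj in zip(mean, var, x))


# Compare both forms on every binary selection vector
max_gap = max(
    abs(objective_as_written(x) - objective_simplified(x))
    for x in product([0, 1], repeat=len(columns))
)
print(max_gap < 1e-12)
```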


## 3. Generate the pyomo code

In [167]:
#Second prompt gets the output of the previous step and generates the code
prompt2 = "Please write pyomo code for this mathematical problem. Use sample data where needed. Indicate where you use sample data. \n"
prompt2 += response.text 
response2 = generative_multimodal_model.generate_content([prompt2])

In [168]:
#Showing the code in a formatted way
Markdown(response2.text)

```python
import pyomo.environ as pyo
import pandas as pd

# Sample data - Replace with actual data from "investments_data.csv"
data = {
    'strategy_1': [0.02, 0.03, 0.01, 0.025], 
    'strategy_2': [-0.01, 0.02, 0.03, 0.01],
    'strategy_3': [0.04, 0.03, 0.02, 0.015],
    'strategy_4': [0.01, 0.015, 0.02, 0.025],
    'strategy_5': [0.03, 0.02, 0.01, 0.03],
    'strategy_6': [0.02, 0.04, 0.02, 0.01],
    'strategy_7': [0.01, 0.025, 0.03, 0.02],
    'strategy_8': [-0.02, 0.01, 0.02, 0.03],
    'strategy_9': [0.03, 0.02, 0.04, 0.01],
    'strategy_10': [0.015, 0.02, 0.03, 0.025],
}
investments_data = pd.DataFrame(data)

# Parameters
R = investments_data.values  # Return of investment j in period i
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Cost of investing in strategy j
B = [1000, 1100, 900, 1200]  # Available budget in scenario s
P = [0.55, 0.4, 0.04, 0.01]  # Probability of scenario s
r = 2  # Risk parameter
T = len(investments_data)  # Number of periods
J = len(investments_data.columns)  # Number of investment strategies
S = len(B)  # Number of budget scenarios
alpha = 0.4  # Tolerable probability of exceeding the budget
M = 1e6  # Large constant

# Model
model = pyo.ConcreteModel()

# Sets
model.J = pyo.RangeSet(1, J)
model.S = pyo.RangeSet(1, S)
model.T = pyo.RangeSet(1, T)

# Variables
model.x = pyo.Var(model.J, within=pyo.Binary)
model.y = pyo.Var(model.S, within=pyo.Binary)

# Objective function
def objective_rule(model):
    expected_return = sum(
        P[s-1] * sum((1/T) * sum(R[i-1, j-1] * model.x[j] for i in model.T) for j in model.J) 
        for s in model.S
    )
    variance = sum(
        P[s-1] * sum((1/T) * sum(((R[i-1, j-1] - (1/T) * sum(R[t-1, j-1] for t in model.T))**2) * model.x[j] for i in model.T) for j in model.J)
        for s in model.S
    )
    return expected_return - r * variance
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
def budget_constraint_rule(model, s):
    return sum(A[j-1] * model.x[j] for j in model.J) <= B[s-1] + M * model.y[s]
model.budget_constraint = pyo.Constraint(model.S, rule=budget_constraint_rule)

def budget_exceeding_constraint_rule(model):
    return sum(P[s-1] * model.y[s] for s in model.S) <= alpha
model.budget_exceeding_constraint = pyo.Constraint(rule=budget_exceeding_constraint_rule)

# Solve
solver = pyo.SolverFactory('cbc')  # You can choose other solvers like 'gurobi' or 'cplex' if available
results = solver.solve(model)

# Print results
print(results)
print("Optimal investment strategy:")
for j in model.J:
    if pyo.value(model.x[j]) == 1:
        print(f"Invest in strategy {j}")
```

This code defines the Pyomo model, including the objective function, variables, and constraints. It also includes sample data for `investments_data` that you should replace with your actual data.  Remember to choose a solver that you have installed and that is appropriate for your problem size and type. 
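
Replacing the sample data means reading a semicolon-separated file with no header and no index, where each row is a past period and each column is a strategy. A minimal sketch using only the standard library is shown below. It reads from an in-memory string with hypothetical values so that it runs without the real file; `load_returns` is an illustrative helper name, not part of the generated code.

```python
import csv
import io

# Hypothetical file contents: 3 periods (rows) x 4 strategies (columns)
sample = "0.02;-0.01;0.04;0.01\n0.03;0.02;0.03;0.015\n0.01;0.03;0.02;0.02\n"


def load_returns(handle):
    """Return a list of rows (periods), each a list of floats (strategies)."""
    reader = csv.reader(handle, delimiter=";")
    return [[float(v) for v in row] for row in reader if row]


R = load_returns(io.StringIO(sample))
T, J = len(R), len(R[0])

# Column-wise view, keyed like the sample data in the generated code
data = {f"strategy_{j + 1}": [row[j] for row in R] for j in range(J)}
print(T, J)
```

For the real file, the same helper could be called on `open("investments_data.csv", newline="")` instead of the `io.StringIO` object.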


## 4. Input problem data and try running the generated code

In [174]:
import pyomo.environ as pyo
import pandas as pd

# Sample data - Replace with actual data from "investments_data.csv"
# data = {
#     'strategy_1': [0.02, 0.03, 0.01, 0.025], 
#     'strategy_2': [-0.01, 0.02, 0.03, 0.01],
#     'strategy_3': [0.04, 0.03, 0.02, 0.015],
#     'strategy_4': [0.01, 0.015, 0.02, 0.025],
#     'strategy_5': [0.03, 0.02, 0.01, 0.03],
#     'strategy_6': [0.02, 0.04, 0.02, 0.01],
#     'strategy_7': [0.01, 0.025, 0.03, 0.02],
#     'strategy_8': [-0.02, 0.01, 0.02, 0.03],
#     'strategy_9': [0.03, 0.02, 0.04, 0.01],
#     'strategy_10': [0.015, 0.02, 0.03, 0.025],
# }
# Human-made data loading:
import csv

# Initialize an empty dictionary to store the data
data = {}

# Read the CSV file
with open('investments_data.csv', 'r') as file:
    reader = csv.reader(file, delimiter=';')
    for row in reader:
        # Each row is one past period; each column (index i) is one strategy
        for i, value in enumerate(row):
            # Create strategy keys if not exist
            if f'strategy_{i+1}' not in data:
                data[f'strategy_{i+1}'] = []
            # Append the value to the corresponding strategy
            data[f'strategy_{i+1}'].append(float(value))

investments_data = pd.DataFrame(data)

# Parameters
R = investments_data.values  # Return of investment j in period i
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Cost of investing in strategy j
B = [1000, 1100, 900, 1200]  # Available budget in scenario s
P = [0.55, 0.4, 0.04, 0.01]  # Probability of scenario s
r = 2  # Risk parameter
T = len(investments_data)  # Number of periods
J = len(investments_data.columns)  # Number of investment strategies
S = len(B)  # Number of budget scenarios
alpha = 0.4  # Tolerable probability of exceeding the budget
M = 1e6  # Large constant

# Model
model = pyo.ConcreteModel()

# Sets
model.J = pyo.RangeSet(1, J)
model.S = pyo.RangeSet(1, S)
model.T = pyo.RangeSet(1, T)

# Variables
model.x = pyo.Var(model.J, within=pyo.Binary)
model.y = pyo.Var(model.S, within=pyo.Binary)

# Objective function
def objective_rule(model):
    expected_return = sum(
        P[s-1] * sum((1/T) * sum(R[i-1, j-1] * model.x[j] for i in model.T) for j in model.J) 
        for s in model.S
    )
    variance = sum(
        P[s-1] * sum((1/T) * sum(((R[i-1, j-1] - (1/T) * sum(R[t-1, j-1] for t in model.T))**2) * model.x[j] for i in model.T) for j in model.J)
        for s in model.S
    )
    return expected_return - r * variance
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
def budget_constraint_rule(model, s):
    return sum(A[j-1] * model.x[j] for j in model.J) <= B[s-1] + M * model.y[s]
model.budget_constraint = pyo.Constraint(model.S, rule=budget_constraint_rule)

def budget_exceeding_constraint_rule(model):
    return sum(P[s-1] * model.y[s] for s in model.S) <= alpha
model.budget_exceeding_constraint = pyo.Constraint(rule=budget_exceeding_constraint_rule)

# Solve
solver = pyo.SolverFactory('glpk')  # You can choose other solvers like 'gurobi' or 'cplex' if available
results = solver.solve(model)

# Print results
print(results)
print("Optimal investment strategy:")
for j in model.J:
    if pyo.value(model.x[j]) == 1:
        print(f"Invest in strategy {j}")


Problem: 
- Name: unknown
  Lower bound: 226.084248252474
  Upper bound: 226.084248252474
  Number of objectives: 1
  Number of constraints: 5
  Number of variables: 14
  Number of nonzeros: 48
  Sense: maximize
Solver: 
- Status: ok
  Termination condition: optimal
  Statistics: 
    Branch and bound: 
      Number of bounded subproblems: 5
      Number of created subproblems: 5
  Error rc: 0
  Time: 0.038109779357910156
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Optimal investment strategy:
Invest in strategy 3
Invest in strategy 6
Invest in strategy 8
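
The reported selection can be checked against the budget constraint by hand. The sketch below uses only the cost vector, budgets and probabilities from the problem statement plus the three strategies printed above; it does not need the returns file and says nothing about the objective value.

```python
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # cost per strategy
B = [1000, 1100, 900, 1200]                           # budget per scenario
P = [0.55, 0.4, 0.04, 0.01]                           # scenario probabilities
alpha = 0.4                                           # tolerable exceedance probability
selected = [3, 6, 8]  # strategies reported by the solver (1-based)

total_cost = sum(A[j - 1] for j in selected)

# Scenarios (1-based) in which the total cost is larger than the budget
exceeded = [s + 1 for s, b in enumerate(B) if total_cost > b]
exceed_prob = sum(P[s - 1] for s in exceeded)

print(total_cost, exceeded, exceed_prob, exceed_prob <= alpha)
# prints: 979 [3] 0.04 True
```

The selection costs 979 euros, which exceeds the budget only in scenario 3 (900 euros, probability 0.04), so it stays within the tolerance of 0.4.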


## 5. Correct the code to verify model viability (optional)

## 6. Print the outputs as strings so they can be saved

In [175]:
print(response.text)


In [176]:
print(response2.text)

```python
import pyomo.environ as pyo
import pandas as pd

# Sample data - Replace with actual data from "investments_data.csv"
data = {
    'strategy_1': [0.02, 0.03, 0.01, 0.025], 
    'strategy_2': [-0.01, 0.02, 0.03, 0.01],
    'strategy_3': [0.04, 0.03, 0.02, 0.015],
    'strategy_4': [0.01, 0.015, 0.02, 0.025],
    'strategy_5': [0.03, 0.02, 0.01, 0.03],
    'strategy_6': [0.02, 0.04, 0.02, 0.01],
    'strategy_7': [0.01, 0.025, 0.03, 0.02],
    'strategy_8': [-0.02, 0.01, 0.02, 0.03],
    'strategy_9': [0.03, 0.02, 0.04, 0.01],
    'strategy_10': [0.015, 0.02, 0.03, 0.025],
}
investments_data = pd.DataFrame(data)

# Parameters
R = investments_data.values  # Return of investment j in period i
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Cost of investing in strategy j
B = [1000, 1100, 900, 1200]  # Available budget in scenario s
P = [0.55, 0.4, 0.04, 0.01]  # Probability of scenario s
r = 2  # Risk parameter
T = len(investments_data)  # Number of periods
J = len(investm