# LLM Optimization Modelling Experiment

In [156]:
import vertexai
from vertexai.preview.generative_models import GenerativeModel
from IPython.display import Markdown

## 1. Define the problem description

In [172]:
problem = '''Your goal is to invest in several of 10 possible investment strategies in the most optimal way. The historic returns of those strategies are stored in the file "investments_data.csv". Each column represents one strategy and the rows are the past investment outcomes. There is no index and the values are separated by a ;.

The costs for investing in a given investment is stored in a vector A, which has one value for each strategy in order.  
The values are: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]

You can only invest once into an investment. 

Unfortunately due to other costs and inflation, your available budget at this time is uncertain. There are four possible budget scenarios with different probabilities: scenario 1 with 1000 euros and probability of 0.55, scenario 2  with 1100 euros and probability of 0.4, scenario 3 with 900 euros and probability of 0.04, scenario 4 with 1200 euros and probability of 0.01. 
The tolerable probability of exceeding the budget is 0.4.

Please formulate a mean-variance mathematical model for this optimization problem, considering the past performance of investment strategies and the uncertain budget. You can take 2 as the risk parameter r.
'''
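The four budget scenarios and the tolerance of 0.4 imply a single effective spending cap, which is a useful check on any model the LLM produces. The sketch below assumes that "exceeding the budget" means total cost strictly greater than the realised budget. The helper name `effective_budget` is hypothetical and is not used elsewhere in this notebook.

```python
def effective_budget(budgets, probs, alpha):
    """Return the largest budget level whose exceedance probability is <= alpha.

    The exceedance probability of a spending level is the total probability
    of all scenarios whose budget is strictly below that level.
    """
    best = None
    for cost in sorted(budgets):
        p_exceed = sum(p for b, p in zip(budgets, probs) if b < cost)
        if p_exceed <= alpha:
            best = cost
    return best


budgets = [1000, 1100, 900, 1200]   # scenario budgets from the problem text
probs = [0.55, 0.4, 0.04, 0.01]     # scenario probabilities from the problem text
cap = effective_budget(budgets, probs, alpha=0.4)
print(cap)  # prints 1000
```

Spending 1000 exceeds the budget only in the 900 euro scenario (probability 0.04), while any amount above 1000 also fails in the 1000 euro scenario, which pushes the probability to at least 0.59.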

## 2. Generate the mathematical model

In [206]:
#Initializing the session. To replicate, make sure the right credentials are saved in an environment variable (e.g. GOOGLE_APPLICATION_CREDENTIALS)
PROJECT_ID = "llm4optproblems"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

#Specifying the model
generative_multimodal_model = GenerativeModel("gemini-1.5-pro-preview-0409")

#The prompt applied to all problems
prompt = '''Let's think step by step. Please write a mathematical optimization model for this problem. If there are parameter values, make sure to include them in the mathematical formulation.
'''

#Generate the response
response = generative_multimodal_model.generate_content([prompt+problem])


In [207]:
#Show the response in a formatted way
Markdown(response.text)

## Mean-Variance Optimization Model with Uncertain Budget

**Sets and Indices:**

* $S$: Set of investment strategies, $S = \{1, 2, ..., 10\}$
* $B$: Set of budget scenarios, $B = \{1, 2, 3, 4\}$ 

**Parameters:**

* $r_s$: Historical return of strategy $s$, obtained from "investments_data.csv" (calculate mean for each strategy)
* $c_s$: Cost of investing in strategy $s$, obtained from vector A
* $b_j$: Available budget in scenario $j$  ($b_1 = 1000$, $b_2 = 1100$, $b_3 = 900$, $b_4 = 1200$)
* $p_j$: Probability of budget scenario $j$ ($p_1 = 0.55$, $p_2 = 0.4$, $p_3 = 0.04$, $p_4 = 0.01$)
* $\alpha$: Tolerable probability of exceeding the budget (0.4)
* $R$: Risk parameter (2)

**Decision Variables:**

* $x_s$: Binary variable, 1 if we invest in strategy $s$, 0 otherwise

**Objective Function (Mean-Variance):**

Maximize expected return minus a penalty for exceeding the budget and the variance of the portfolio:

```
Maximize  ∑_{s∈S} r_s * x_s - R/2 * ∑_{s∈S} ∑_{t∈S} Cov(r_s, r_t) * x_s * x_t - M * ∑_{j∈B} p_j * y_j
```

where:

* $Cov(r_s, r_t)$ is the covariance of historical returns between strategy $s$ and $t$ (calculated from "investments_data.csv")
* $M$ is a large constant to penalize exceeding the budget
* $y_j$ is a binary variable, 1 if the budget is exceeded in scenario $j$, 0 otherwise

**Constraints:**

1. **Budget Constraint:**
    * For each budget scenario $j$: 
    ```
    ∑_{s∈S} c_s * x_s ≤ b_j + M * y_j
    ```
2. **Probability of Exceeding Budget:**
    * Limit the probability of exceeding the budget to the tolerable level:
    ```
    ∑_{j∈B} p_j * y_j ≤ α
    ```
3. **Investment Decision:**
    * Invest in each strategy at most once:
    ```
    x_s ∈ {0, 1}  ∀ s ∈ S
    ```
4. **Budget Exceeded Indicator:**
    * $y_j \in \{0, 1\} \ \forall j \in B$

**Model Interpretation:**

This model aims to find the optimal investment strategy combination that maximizes the expected return while considering risk (variance) and budget constraints. 

* The objective function balances maximizing expected returns with minimizing risk and the penalty for exceeding the budget.
* Constraint 1 ensures that the total investment cost does not exceed the available budget in each scenario.
* Constraint 2 limits the probability of exceeding the budget to the tolerable level.
* Constraint 3 ensures that we only invest in each strategy once. 
* Constraint 4 defines the binary variable for exceeding the budget in each scenario.

**Solving the Model:**

This model can be solved using mixed-integer quadratic programming solvers available in various optimization software packages. The solution will provide the optimal investment strategy considering the uncertain budget and risk preference.


## 3. Generate the Pyomo code

In [208]:
#Second prompt gets the output of the previous step and generates the code
prompt2 = "Please write pyomo code for this mathematical problem. Use sample data where needed. Indicate where you use sample data. \n"
prompt2 += response.text 
response2 = generative_multimodal_model.generate_content([prompt2])

In [209]:
#Showing the code in a formatted way
Markdown(response2.text)

```python
# Import Pyomo library
from pyomo.environ import *

# Define the model
model = ConcreteModel()

# Sets and Indices
model.S = Set(initialize=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # Sample data for strategies
model.B = Set(initialize=[1, 2, 3, 4])  # Budget scenarios

# Parameters
# Sample data for returns (replace with actual data from investments_data.csv)
model.r = Param(model.S, initialize={1: 0.10, 2: 0.12, 3: 0.08, 4: 0.15, 5: 0.09,
                                    6: 0.11, 7: 0.07, 8: 0.13, 9: 0.14, 10: 0.06})

# Sample data for costs (replace with actual data from vector A)
model.c = Param(model.S, initialize={1: 50, 2: 60, 3: 40, 4: 70, 5: 55,
                                    6: 65, 7: 35, 8: 75, 9: 80, 10: 45})

# Sample data for budget scenarios
model.b = Param(model.B, initialize={1: 1000, 2: 1100, 3: 900, 4: 1200})

# Sample data for probabilities
model.p = Param(model.B, initialize={1: 0.55, 2: 0.4, 3: 0.04, 4: 0.01})

model.alpha = Param(initialize=0.4)  # Tolerable probability of exceeding budget
model.R = Param(initialize=2)  # Risk parameter

# Decision Variables
model.x = Var(model.S, domain=Binary)
model.y = Var(model.B, domain=Binary)

# Objective Function (Mean-Variance)
def objective_rule(model):
    # Sample covariance matrix (replace with actual data calculated from investments_data.csv)
    cov_matrix = [[0.0025, 0.0015], [0.0015, 0.0035]]  # Example for two strategies
    
    # Calculate portfolio variance
    portfolio_variance = sum(cov_matrix[i][j] * model.x[i+1] * model.x[j+1] for i in range(len(model.S)) for j in range(len(model.S)))
    
    # Objective function with sample data for returns and covariances
    return sum(model.r[s] * model.x[s] for s in model.S) - model.R/2 * portfolio_variance - 10000 * sum(model.p[j] * model.y[j] for j in model.B)

model.objective = Objective(rule=objective_rule, sense=maximize)

# Constraints
# Budget Constraint
def budget_constraint_rule(model, j):
    return sum(model.c[s] * model.x[s] for s in model.S) <= model.b[j] + 10000 * model.y[j]

model.budget_constraint = Constraint(model.B, rule=budget_constraint_rule)

# Probability of Exceeding Budget
def probability_constraint_rule(model):
    return sum(model.p[j] * model.y[j] for j in model.B) <= model.alpha

model.probability_constraint = Constraint(rule=probability_constraint_rule)

# Solve the model (requires a solver like CPLEX or Gurobi)
# SolverFactory('cplex').solve(model)

# Print the results
# model.pprint()
```
**Explanation:**

1. **Model Definition:** We define a concrete Pyomo model.
2. **Sets and Indices:** We create sets for strategies (`model.S`) and budget scenarios (`model.B`).
3. **Parameters:** We define parameters for returns (`model.r`), costs (`model.c`), budget scenarios (`model.b`), probabilities (`model.p`), tolerable probability (`model.alpha`), and risk parameter (`model.R`).  **Sample data is used for returns, costs, budget scenarios, probabilities, and the covariance matrix. Replace this with your actual data.**
4. **Decision Variables:** We define binary variables for investment decisions (`model.x`) and budget exceeding indicators (`model.y`).
5. **Objective Function:** We implement the mean-variance objective function, including the penalty for exceeding the budget and the sample calculation for portfolio variance. **Replace the sample covariance matrix with your actual data.**
6. **Constraints:** We implement the budget constraint for each scenario and the constraint on the probability of exceeding the budget.
7. **Solving:** The commented-out lines show how to solve the model using a solver like CPLEX or Gurobi.
8. **Results:** The commented-out line demonstrates how to print the results.

**Important Notes:**

* You need to replace the sample data with your actual data from the "investments_data.csv" file and vector A.
* You need to install a solver (e.g., CPLEX, Gurobi) and set it up with Pyomo to solve the model. 
* The provided covariance matrix is just an example. You need to calculate the actual covariance matrix from your historical returns data. 
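
The notes above ask for the mean returns and the covariance matrix to be computed from the historical data. A minimal sketch of that calculation follows, using only the standard library and a made-up two-strategy sample in place of "investments_data.csv" (the sample values are hypothetical). It uses the sample covariance with an `n - 1` denominator, which is also the default of pandas `DataFrame.cov()`.

```python
import csv
import io

# Hypothetical stand-in for the real file: no header, ';' separated,
# one column per strategy and one row per past outcome.
sample = "1.0;2.0\n3.0;6.0\n5.0;4.0\n"

rows = [[float(v) for v in row]
        for row in csv.reader(io.StringIO(sample), delimiter=";")]
columns = list(zip(*rows))          # one tuple of outcomes per strategy
n = len(rows)

# Mean return per strategy (the r_s parameter of the model).
means = [sum(col) / n for col in columns]


def sample_cov(a, mean_a, b, mean_b):
    """Sample covariance of two equally long series (n - 1 denominator)."""
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


# Full covariance matrix Cov(r_s, r_t) as a nested list.
cov_matrix = [[sample_cov(columns[i], means[i], columns[j], means[j])
               for j in range(len(columns))]
              for i in range(len(columns))]

print(means)       # prints [3.0, 4.0]
print(cov_matrix)  # prints [[4.0, 2.0], [2.0, 4.0]]
```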


## 4. Input problem data and try running the generated code

In [227]:
# Import Pyomo library
from pyomo.environ import *

# Define the model
model = ConcreteModel()

# Sets and Indices
model.S = Set(initialize=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # Sample data for strategies
model.B = Set(initialize=[1, 2, 3, 4])  # Budget scenarios

# Parameters
# Sample data for returns (replace with actual data from investments_data.csv)
model.r = Param(model.S, initialize={1: 13.186842, 2: 85.966713, 3: 94.041105, 4: 6.073180, 5: 40.317452,
                                    6: 68.155012, 7: 1.912783, 8: 87.123105, 9: 42.928717, 10: 56.694919}) #Human altered

# Sample data for costs (replace with actual data from vector A)
model.c = Param(model.S, initialize={
    1: 80,
    2: 340,
    3: 410,
    4: 50,
    5: 180,
    6: 221,
    7: 15,
    8: 348,
    9: 191,
    10: 225
})

# Sample data for budget scenarios
model.b = Param(model.B, initialize={1: 1000, 2: 1100, 3: 900, 4: 1200})

# Sample data for probabilities
model.p = Param(model.B, initialize={1: 0.55, 2: 0.4, 3: 0.04, 4: 0.01})

model.alpha = Param(initialize=0.4)  # Tolerable probability of exceeding budget
model.R = Param(initialize=2)  # Risk parameter

# Decision Variables
model.x = Var(model.S, domain=Binary)
model.y = Var(model.B, domain=Binary)


import pandas as pd  #HUMAN MODIFIED, pandas is needed for read_csv below
data = pd.read_csv("investments_data.csv", header=None, sep=';')  # Replace with your data file  #HUMAN MODIFIED, helping the model load the data


# Objective Function (Mean-Variance)
def objective_rule(model):
    # Covariance matrix calculated from investments_data.csv
    cov_matrix = data.cov()  #Human altered
    
    # Calculate portfolio variance
    portfolio_variance = sum(cov_matrix[i][j] * model.x[i+1] * model.x[j+1] for i in range(len(model.S)) for j in range(len(model.S)))
    
    # Objective function with sample data for returns and covariances
    return sum(model.r[s] * model.x[s] for s in model.S) - model.R/2 * portfolio_variance - 10000 * sum(model.p[j] * model.y[j] for j in model.B)

model.objective = Objective(rule=objective_rule, sense=maximize)

# Constraints
# Budget Constraint
def budget_constraint_rule(model, j):
    return sum(model.c[s] * model.x[s] for s in model.S) <= model.b[j] + 10000 * model.y[j]

model.budget_constraint = Constraint(model.B, rule=budget_constraint_rule)

# Probability of Exceeding Budget
def probability_constraint_rule(model):
    return sum(model.p[j] * model.y[j] for j in model.B) <= model.alpha

model.probability_constraint = Constraint(rule=probability_constraint_rule)

#Solve the model (ipopt is used here; as a continuous NLP solver it does not enforce the binary domains)
SolverFactory('ipopt').solve(model)

#Print the results
model.pprint()

2 Set Declarations
    B : Size=1, Index=None, Ordered=Insertion
        Key  : Dimen : Domain : Size : Members
        None :     1 :    Any :    4 : {1, 2, 3, 4}
    S : Size=1, Index=None, Ordered=Insertion
        Key  : Dimen : Domain : Size : Members
        None :     1 :    Any :   10 : {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

6 Param Declarations
    R : Size=1, Index=None, Domain=Any, Default=None, Mutable=False
        Key  : Value
        None :     2
    alpha : Size=1, Index=None, Domain=Any, Default=None, Mutable=False
        Key  : Value
        None :   0.4
    b : Size=4, Index=B, Domain=Any, Default=None, Mutable=False
        Key : Value
          1 :  1000
          2 :  1100
          3 :   900
          4 :  1200
    c : Size=10, Index=S, Domain=Any, Default=None, Mutable=False
        Key : Value
          1 :    80
          2 :   340
          3 :   410
          4 :    50
          5 :   180
          6 :   221
          7 :    15
          8 :   348
          9 :  

## 5. Correct the code to verify model viability (optional)

In [221]:

dataa = pd.read_csv("investments_data.csv", header=None, sep=';')  # Replace with your data file  #HUMAN MODIFIED, helping the model load the data
cov_matrix = dataa.cov()

In [228]:
model.objective()

245.5856263020345
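
Calling `model.objective()` evaluates the objective expression at the current variable values. As a rough illustration of what that number is made of, the sketch below evaluates the same mean-variance expression for a fixed selection in plain Python. The means, covariances, and selection are made-up values, and `mean_variance_objective` is a hypothetical helper, not part of the generated code.

```python
def mean_variance_objective(x, mean, cov, risk, penalty=0.0):
    """Expected return minus risk/2 times portfolio variance minus a penalty.

    x    : list of 0/1 investment decisions
    mean : mean return per strategy
    cov  : covariance matrix as a nested list
    risk : risk parameter R of the model
    """
    n = len(x)
    expected = sum(m * xi for m, xi in zip(mean, x))
    variance = sum(cov[i][j] * x[i] * x[j]
                   for i in range(n) for j in range(n))
    return expected - risk / 2 * variance - penalty


# Hypothetical three-strategy example.
mean = [10.0, 20.0, 5.0]
cov = [[4.0, 1.0, 0.0],
       [1.0, 9.0, 0.5],
       [0.0, 0.5, 1.0]]
x = [1, 1, 0]                       # invest in strategies 1 and 2 only

value = mean_variance_objective(x, mean, cov, risk=2)
print(value)  # prints 15.0 (return 30, variance 15, 30 - 2/2 * 15)
```

With binary decisions the double sum simply adds up the covariance entries of the selected strategies, so a fractional `x` returned by a continuous solver would give a value that no actual portfolio attains.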

## 6. Print the outputs as strings so they can be saved
These can be rendered as Markdown for better readability.

In [204]:
print(response.text)

## Mean-Variance Optimization Model with Uncertain Budget

**Sets and Indices:**

* $S$: Set of investment strategies, $S = \{1, 2, ..., 10\}$
* $B$: Set of budget scenarios, $B = \{1, 2, 3, 4\}$

**Parameters:**

* $r_{s,t}$: Return of strategy $s$ in period $t$ (from "investments_data.csv") 
* $T$: Number of historical periods (number of rows in "investments_data.csv")
* $A_s$: Cost of investing in strategy $s$ (from vector A)
* $b_i$: Budget amount in scenario $i$ (1000, 1100, 900, 1200)
* $p_i$: Probability of budget scenario $i$ (0.55, 0.4, 0.04, 0.01) 
* $\alpha$: Tolerable probability of exceeding the budget (0.4)
* $r$: Risk aversion parameter (2)

**Decision Variables:**

* $x_s$: Binary variable, 1 if investing in strategy $s$, 0 otherwise
* $y_i$: Binary variable, 1 if budget scenario $i$ is exceeded, 0 otherwise

**Objective Function (Maximize):**

This is a mean-variance objective function, maximizing the expected return while minimizing the variance of the portfolio.

```

In [205]:
print(response2.text)

## Pyomo Implementation of Mean-Variance Optimization with Uncertain Budget

```python
import pyomo.environ as pyo
import pandas as pd

# Sample Data (replace with actual data)
data = pd.read_csv("investments_data.csv")  # Replace with your data file
T = len(data)
A = [100, 120, 80, 90, 110, 130, 75, 105, 95, 125]  # Sample costs
b = [1000, 1100, 900, 1200]
p = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Define the model
model = pyo.ConcreteModel()

# Sets and Indices
model.S = pyo.RangeSet(1, 10)  # Set of strategies
model.B = pyo.RangeSet(1, 4)   # Set of budget scenarios

# Parameters
model.r = pyo.Param(model.S, model.RangeSet(1, T), initialize=lambda model, s, t: data.iloc[t-1, s-1])
model.A = pyo.Param(model.S, initialize=lambda model, s: A[s-1])
model.b = pyo.Param(model.B, initialize=lambda model, i: b[i-1])
model.p = pyo.Param(model.B, initialize=lambda model, i: p[i-1])
model.alpha = pyo.Param(initialize=alpha)
model.risk_aversion = pyo.Param(initialize=r)

# Decision Variab