# LLM Optimization Modelling Experiment

In [156]:
import vertexai
from vertexai.preview.generative_models import GenerativeModel
from IPython.display import Markdown

## 1. Define the problem description

In [172]:
problem = '''Your goal is to invest in several of 10 possible investment strategies in the most optimal way. The historic returns of those strategies are stored in the file "investments_data.csv". Each column represents one strategy and the rows are the past investment outcomes. There is no index and the values are separated by a ;.

The costs for investing in a given investment is stored in a vector A, which has one value for each strategy in order.  
The values are: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]

You can only invest once into an investment. 

Unfortunately due to other costs and inflation, your available budget at this time is uncertain. There are four possible budget scenarios with different probabilities: scenario 1 with 1000 euros and probability of 0.55, scenario 2  with 1100 euros and probability of 0.4, scenario 3 with 900 euros and probability of 0.04, scenario 4 with 1200 euros and probability of 0.01. 
The tolerable probability of exceeding the budget is 0.4.

Please formulate a mean-variance mathematical model for this optimization problem, considering the past performance of investment strategies and the uncertain budget. You can take 2 as the risk parameter r.
'''
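
The budget condition in the problem statement can be checked by hand before any model is generated. The sketch below assumes that "exceeding the budget" means the total cost is strictly larger than the realised budget; the helper names `exceed_probability` and `effective_budget` are hypothetical and do not appear elsewhere in this notebook.

```python
# Budget scenarios and probabilities from the problem statement
budgets = [1000, 1100, 900, 1200]
probs = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4  # tolerable probability of exceeding the budget


def exceed_probability(total_cost):
    """Probability that total_cost is strictly larger than the realised budget."""
    return sum(p for b, p in zip(budgets, probs) if total_cost > b)


def effective_budget():
    """Largest scenario budget whose exceedance probability stays within alpha."""
    feasible = [b for b in budgets if exceed_probability(b) <= alpha]
    return max(feasible)


# Exceedance probability for a few candidate total costs
for cost in (900, 1000, 1001, 1100):
    print(cost, round(exceed_probability(cost), 2))

print("effective budget:", effective_budget())
```

Under this reading, any selection costing more than 1000 euros exceeds the budget with probability of at least 0.59, so the chance constraint behaves like a single deterministic budget of 1000 euros.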

## 2. Generate the mathematical model

In [173]:
# Initialize the session. To replicate, make sure the right credentials are available through an environment variable
PROJECT_ID = "llm4optproblems"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

#Specifying the model
generative_multimodal_model = GenerativeModel("gemini-1.5-pro-preview-0409")

# The prompt applied to all problems
prompt = '''Let's think step by step. Please write a mathematical optimization model for this problem. If there are parameter values, make sure to include them in the mathematical formulation.
'''

#Generate the response
response = generative_multimodal_model.generate_content([prompt+problem])


In [174]:
# Show the response in a formatted way
Markdown(response.text)

## Mean-Variance Optimization Model with Uncertain Budget

**Sets and Indices:**

* $S$: Set of investment strategies, $S = \{1, 2, ..., 10\}$
* $B$: Set of budget scenarios, $B = \{1, 2, 3, 4\}$

**Parameters:**

* $r_{s,t}$: Return of strategy $s$ in period $t$ (from "investments_data.csv") 
* $T$: Number of historical periods (number of rows in "investments_data.csv")
* $A_s$: Cost of investing in strategy $s$ (from vector A)
* $b_i$: Budget amount in scenario $i$ (1000, 1100, 900, 1200)
* $p_i$: Probability of budget scenario $i$ (0.55, 0.4, 0.04, 0.01) 
* $\alpha$: Tolerable probability of exceeding the budget (0.4)
* $r$: Risk aversion parameter (2)

**Decision Variables:**

* $x_s$: Binary variable, 1 if investing in strategy $s$, 0 otherwise
* $y_i$: Binary variable, 1 if budget scenario $i$ is exceeded, 0 otherwise

**Objective Function (Maximize):**

This is a mean-variance objective function, maximizing the expected return while minimizing the variance of the portfolio.

```
Maximize: (1/T) * Σ_t [ Σ_s (r_{s,t} * x_s) ] - r/2 * Var(Σ_s (r_{s,t} * x_s))
```

**Constraints:**

1. **Budget Constraint:** Ensure the total investment cost does not exceed the budget in each scenario with a probability of at least (1-α).

```
Σ_s (A_s * x_s) <= b_i + M * y_i,  ∀ i ∈ B
Σ_i (p_i * y_i) <= α
```

where M is a large constant ensuring the constraint is relaxed when $y_i = 1$.

2. **Investment Decision:** Invest in each strategy at most once.

```
x_s ∈ {0, 1}, ∀ s ∈ S
```

3. **Budget Scenario:** Only one budget scenario can occur.

```
Σ_i y_i = 1 
y_i ∈ {0, 1}, ∀ i ∈ B
```

**Model Interpretation:**

This model aims to find the optimal investment strategy that maximizes the expected return while considering the risk (variance) and the uncertainty in the available budget. The budget constraint allows for exceeding the budget in certain scenarios, but the probability of this happening should be within the tolerable limit (α). The model incorporates historical returns of the strategies and the cost of investing in each strategy to make informed decisions. 

**Implementation:**

This model can be implemented and solved using various optimization solvers available in software packages like Python, R, or commercial optimization software like CPLEX or Gurobi. 
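
For reference, the objective in the generated model can be written out in full. This is a sketch that assumes the population variance (divisor $T$), which the generated model does not specify; $R_t(x)$ and $\bar{R}(x)$ are shorthand introduced here for the per-period portfolio return and its mean.

$$
R_t(x) = \sum_{s \in S} r_{s,t}\, x_s, \qquad \bar{R}(x) = \frac{1}{T} \sum_{t=1}^{T} R_t(x)
$$

$$
\max_{x} \;\; \bar{R}(x) \;-\; \frac{r}{2} \left[ \frac{1}{T} \sum_{t=1}^{T} R_t(x)^2 - \bar{R}(x)^2 \right]
$$

The squared term $R_t(x)^2$ contains products $x_s x_{s'}$ of different strategies, so the objective is quadratic in the binary variables.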


## 3. Generate the Pyomo code

In [175]:
#Second prompt gets the output of the previous step and generates the code
prompt2 = "Please write pyomo code for this mathematical problem. Use sample data where needed. Indicate where you use sample data. \n"
prompt2 += response.text 
response2 = generative_multimodal_model.generate_content([prompt2])

In [176]:
#Showing the code in a formatted way
Markdown(response2.text)

## Pyomo Implementation of Mean-Variance Optimization with Uncertain Budget

```python
import pyomo.environ as pyo
import pandas as pd

# Sample Data (replace with actual data)
data = pd.read_csv("investments_data.csv")  # Replace with your data file
T = len(data)
A = [100, 120, 80, 90, 110, 130, 75, 105, 95, 125]  # Sample costs
b = [1000, 1100, 900, 1200]
p = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Define the model
model = pyo.ConcreteModel()

# Sets and Indices
model.S = pyo.RangeSet(1, 10)  # Set of strategies
model.B = pyo.RangeSet(1, 4)   # Set of budget scenarios

# Parameters
model.r = pyo.Param(model.S, model.RangeSet(1, T), initialize=lambda model, s, t: data.iloc[t-1, s-1])
model.A = pyo.Param(model.S, initialize=lambda model, s: A[s-1])
model.b = pyo.Param(model.B, initialize=lambda model, i: b[i-1])
model.p = pyo.Param(model.B, initialize=lambda model, i: p[i-1])
model.alpha = pyo.Param(initialize=alpha)
model.risk_aversion = pyo.Param(initialize=r)

# Decision Variables
model.x = pyo.Var(model.S, domain=pyo.Binary)
model.y = pyo.Var(model.B, domain=pyo.Binary)

# Objective Function
def objective_rule(model):
    # Calculate expected return
    expected_return = sum(model.r[s,t] * model.x[s] for s in model.S for t in model.RangeSet(1, T)) / T
    # Calculate portfolio variance
    portfolio_variance = sum(model.r[s,t]**2 * model.x[s] for s in model.S for t in model.RangeSet(1, T)) / T - expected_return**2
    return expected_return - model.risk_aversion/2 * portfolio_variance
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
model.budget_constraint = pyo.ConstraintList()
M = 10000  # A large constant
for i in model.B:
    model.budget_constraint.add(sum(model.A[s] * model.x[s] for s in model.S) <= model.b[i] + M * model.y[i])
model.probability_constraint = pyo.Constraint(expr=sum(model.p[i] * model.y[i] for i in model.B) <= model.alpha)
model.invest_constraint = pyo.Constraint(expr=sum(model.x[s] for s in model.S) <= 1)
model.scenario_constraint = pyo.Constraint(expr=sum(model.y[i] for i in model.B) == 1)

# Solve the model
solver = pyo.SolverFactory('glpk')  # You can use other solvers like cplex, gurobi
solver.solve(model)

# Print results
print("Optimal solution:")
for s in model.S:
    if pyo.value(model.x[s]) > 0.5:
        print(f"Invest in strategy {s}")
for i in model.B:
    if pyo.value(model.y[i]) > 0.5:
        print(f"Budget scenario {i} is exceeded")
print(f"Expected return: {pyo.value(model.objective)}")
``` 

**Explanation:**

1. **Import Libraries and Data:** Import necessary libraries and sample data. Remember to replace sample data with your actual data.
2. **Define Model:** Create a Pyomo ConcreteModel object.
3. **Sets and Indices:** Define sets for strategies and budget scenarios using RangeSet.
4. **Parameters:** Define parameters using Param. Initialize them with sample data or functions to retrieve data from your data source. 
5. **Decision Variables:** Define binary decision variables x and y using Var and specify their domains.
6. **Objective Function:** Define the objective function using Objective. Implement the mean-variance calculation. 
7. **Constraints:** 
    * **Budget Constraints:** Create a ConstraintList to add budget constraints for each scenario. Use a large constant M to relax the constraint when the budget is exceeded.
    * **Probability Constraint:** Add a constraint to ensure the probability of exceeding the budget is within the tolerable limit.
    * **Investment Constraint:**  Add a constraint to invest in at most one strategy. 
    * **Scenario Constraint:**  Add a constraint to ensure only one budget scenario occurs. 
8. **Solve the model:** Choose a solver (e.g., 'glpk', 'cplex', 'gurobi') and solve the model. 
9. **Print Results:** Print the optimal investment decisions and the expected return. 


## 4. Input problem data and try running the generated code

In [181]:
import pyomo.environ as pyo
import pandas as pd

# Sample Data (replace with actual data)
data = pd.read_csv("investments_data.csv", header=None, sep=';')  # Replace with your data file  #HUMAN MODIFIED, helping the model load the data
T = len(data)
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Costs from the problem statement  #HUMAN MODIFIED, replacing the sample costs
b = [1000, 1100, 900, 1200]
p = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Define the model
model = pyo.ConcreteModel()

# Sets and Indices
model.S = pyo.RangeSet(1, 10)  # Set of strategies
model.B = pyo.RangeSet(1, 4)   # Set of budget scenarios

# Parameters
model.r = pyo.Param(model.S, model.RangeSet(1, T), initialize=lambda model, s, t: data.iloc[t-1, s-1])
model.A = pyo.Param(model.S, initialize=lambda model, s: A[s-1])
model.b = pyo.Param(model.B, initialize=lambda model, i: b[i-1])
model.p = pyo.Param(model.B, initialize=lambda model, i: p[i-1])
model.alpha = pyo.Param(initialize=alpha)
model.risk_aversion = pyo.Param(initialize=r)

# Decision Variables
model.x = pyo.Var(model.S, domain=pyo.Binary)
model.y = pyo.Var(model.B, domain=pyo.Binary)

# Objective Function
def objective_rule(model):
    # Calculate expected return
    expected_return = sum(model.r[s,t] * model.x[s] for s in model.S for t in model.RangeSet(1, T)) / T
    # Calculate portfolio variance
    portfolio_variance = sum(model.r[s,t]**2 * model.x[s] for s in model.S for t in model.RangeSet(1, T)) / T - expected_return**2
    return expected_return - model.risk_aversion/2 * portfolio_variance
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
model.budget_constraint = pyo.ConstraintList()
M = 10000  # A large constant
for i in model.B:
    model.budget_constraint.add(sum(model.A[s] * model.x[s] for s in model.S) <= model.b[i] + M * model.y[i])
model.probability_constraint = pyo.Constraint(expr=sum(model.p[i] * model.y[i] for i in model.B) <= model.alpha)
model.invest_constraint = pyo.Constraint(expr=sum(model.x[s] for s in model.S) <= 1)
model.scenario_constraint = pyo.Constraint(expr=sum(model.y[i] for i in model.B) == 1)

# Solve the model
solver = pyo.SolverFactory('glpk')  # You can use other solvers like cplex, gurobi
solver.solve(model)

# Print results
print("Optimal solution:")
for s in model.S:
    if pyo.value(model.x[s]) > 0.5:
        print(f"Invest in strategy {s}")
for i in model.B:
    if pyo.value(model.y[i]) > 0.5:
        print(f"Budget scenario {i} is exceeded")
print(f"Expected return: {pyo.value(model.objective)}")

TypeError: Cannot create a Set from data that does not support __contains__.  Expected set-like object supporting collections.abc.Collection interface, but received '_generic_component_decorator'.

## 5. Correct the code to verify model viability (optional)

In [192]:
import pyomo.environ as pyo
import pandas as pd

# Sample Data (replace with actual data)
data = pd.read_csv("investments_data.csv", header=None, sep=';')  # Replace with your data file  #HUMAN MODIFIED, helping the model load the data
T = len(data)
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Costs from the problem statement  #HUMAN MODIFIED, replacing the sample costs
b = [1000, 1100, 900, 1200]
p = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Define the model
model = pyo.ConcreteModel()

# Sets and Indices
model.S = pyo.RangeSet(1, 10)  # Set of strategies
model.B = pyo.RangeSet(1, 4)   # Set of budget scenarios

# Parameters
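#HUMAN MODIFIED, model.RangeSet(1, T) raised the TypeError above; a plain range(1, T+1) is used as the period index instead
model.r = pyo.Param(model.S, range(1, T+1), initialize=lambda model, s, t: data.iloc[t-1, s-1])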
model.A = pyo.Param(model.S, initialize=lambda model, s: A[s-1])
model.b = pyo.Param(model.B, initialize=lambda model, i: b[i-1])
model.p = pyo.Param(model.B, initialize=lambda model, i: p[i-1])
model.alpha = pyo.Param(initialize=alpha)
model.risk_aversion = pyo.Param(initialize=r)

# Decision Variables
model.x = pyo.Var(model.S, domain=pyo.Binary)
model.y = pyo.Var(model.B, domain=pyo.Binary)

# Objective Function
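def objective_rule(model):
    # Calculate expected return
    #HUMAN MODIFIED, range(1, T + 1) replaces model.RangeSet(1, T) in both sums
    expected_return = sum(model.r[s, t] * model.x[s] for s in model.S for t in range(1, T + 1)) / T
    # Calculate portfolio variance
    portfolio_variance = sum(model.r[s, t] ** 2 * model.x[s] for s in model.S for t in range(1, T + 1)) / T - expected_return ** 2
    #HUMAN MODIFIED, the generated code used model.risk_aversion/2 here
    return expected_return - model.risk_aversion * portfolio_variance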
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
model.budget_constraint = pyo.ConstraintList()
M = 10000  # A large constant
for i in model.B:
    model.budget_constraint.add(sum(model.A[s] * model.x[s] for s in model.S) <= model.b[i] + M * model.y[i])
model.probability_constraint = pyo.Constraint(expr=sum(model.p[i] * model.y[i] for i in model.B) <= model.alpha)
model.invest_constraint = pyo.Constraint(expr=sum(model.x[s] for s in model.S) <= 1)
#HUMAN MODIFIED, scenario constraint from the generated code disabled
#model.scenario_constraint = pyo.Constraint(expr=sum(model.y[i] for i in model.B) == 1)

# Solve the model
#HUMAN MODIFIED, glpk replaced by ipopt (the objective is quadratic, which glpk does not support)
solver = pyo.SolverFactory('ipopt')  # You can use other solvers like cplex, gurobi
solver.solve(model)

# Print results
print("Optimal solution:")
for s in model.S:
    if pyo.value(model.x[s]) > 0.5:
        print(f"Invest in strategy {s}")
for i in model.B:
    if pyo.value(model.y[i]) > 0.5:
        print(f"Budget scenario {i} is exceeded")
print(f"Expected return: {pyo.value(model.objective)}")

Optimal solution:
Budget scenario 3 is exceeded
Budget scenario 4 is exceeded
Expected return: -2.7433143047348834e-07
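
One thing to check when judging this result is the variance term. The expression in the generated code squares each return separately, so it agrees with the variance of the portfolio return only when at most one strategy is selected. The sketch below compares the two on hypothetical toy returns (not taken from "investments_data.csv"), assuming the population variance with divisor T.

```python
# Toy returns (hypothetical): rows are periods, columns are strategies
returns = [
    [0.10, -0.10],
    [-0.10, 0.10],
]
x = [1, 1]  # invest in both strategies
T = len(returns)

# Per-period portfolio return R_t = sum_s r[t][s] * x[s]
portfolio = [sum(r * xs for r, xs in zip(row, x)) for row in returns]
mean = sum(portfolio) / T

# Variance of the portfolio return (population form, divisor T)
true_variance = sum(R ** 2 for R in portfolio) / T - mean ** 2

# Expression used in the generated code: each return is squared on its own,
# so the cross terms between different strategies are lost
code_variance = (
    sum(r ** 2 * xs for row in returns for r, xs in zip(row, x)) / T - mean ** 2
)

print(true_variance, code_variance)
```

The two strategies offset each other exactly in this toy data, so the portfolio variance is 0, while the generated expression evaluates to about 0.02 and would penalise the diversified selection.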


## 6. Print the outputs as strings so they can be saved
These can be rendered as Markdown for better readability.

In [204]:
print(response.text)

## Mean-Variance Optimization Model with Uncertain Budget

**Sets and Indices:**

* $S$: Set of investment strategies, $S = \{1, 2, ..., 10\}$
* $B$: Set of budget scenarios, $B = \{1, 2, 3, 4\}$

**Parameters:**

* $r_{s,t}$: Return of strategy $s$ in period $t$ (from "investments_data.csv") 
* $T$: Number of historical periods (number of rows in "investments_data.csv")
* $A_s$: Cost of investing in strategy $s$ (from vector A)
* $b_i$: Budget amount in scenario $i$ (1000, 1100, 900, 1200)
* $p_i$: Probability of budget scenario $i$ (0.55, 0.4, 0.04, 0.01) 
* $\alpha$: Tolerable probability of exceeding the budget (0.4)
* $r$: Risk aversion parameter (2)

**Decision Variables:**

* $x_s$: Binary variable, 1 if investing in strategy $s$, 0 otherwise
* $y_i$: Binary variable, 1 if budget scenario $i$ is exceeded, 0 otherwise

**Objective Function (Maximize):**

This is a mean-variance objective function, maximizing the expected return while minimizing the variance of the portfolio.

```

In [205]:
print(response2.text)

## Pyomo Implementation of Mean-Variance Optimization with Uncertain Budget

```python
import pyomo.environ as pyo
import pandas as pd

# Sample Data (replace with actual data)
data = pd.read_csv("investments_data.csv")  # Replace with your data file
T = len(data)
A = [100, 120, 80, 90, 110, 130, 75, 105, 95, 125]  # Sample costs
b = [1000, 1100, 900, 1200]
p = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Define the model
model = pyo.ConcreteModel()

# Sets and Indices
model.S = pyo.RangeSet(1, 10)  # Set of strategies
model.B = pyo.RangeSet(1, 4)   # Set of budget scenarios

# Parameters
model.r = pyo.Param(model.S, model.RangeSet(1, T), initialize=lambda model, s, t: data.iloc[t-1, s-1])
model.A = pyo.Param(model.S, initialize=lambda model, s: A[s-1])
model.b = pyo.Param(model.B, initialize=lambda model, i: b[i-1])
model.p = pyo.Param(model.B, initialize=lambda model, i: p[i-1])
model.alpha = pyo.Param(initialize=alpha)
model.risk_aversion = pyo.Param(initialize=r)

# Decision Variab