# LLM Optimization Modelling Experiment

In [3]:
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Image
from IPython.display import Markdown

## 1. Define the problem description

In [147]:
problem = '''Your goal is to invest in several of 10 possible investment strategies in the most optimal way. The historic returns of those strategies are stored in the file "investments_data.csv". Each column represents one strategy and the rows are the past investment outcomes. There is no index and the values are separated by a ;.

The costs for investing in a given investment is stored in a vector A, which has one value for each strategy in order.  
The values are: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]

You can only invest once into an investment. 

Unfortunately due to other costs and inflation, your available budget at this time is uncertain. There are four possible budget scenarios with different probabilities: scenario 1 with 1000 euros and probability of 0.55, scenario 2  with 1100 euros and probability of 0.4, scenario 3 with 900 euros and probability of 0.04, scenario 4 with 1200 euros and probability of 0.01. 
The tolerable probability of exceeding the budget is 0.4.

Please formulate a mean-variance mathematical model for this optimization problem, considering the past performance of investment strategies and the uncertain budget. You can take 2 as the risk parameter r.'''

## 2. Generate the mathematical model

In [148]:
#Initializing the session. To replicate, make sure the right Google Cloud credentials are configured in your environment
PROJECT_ID = "llm4optproblems"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

#Specifying the model
generative_multimodal_model = GenerativeModel("gemini-1.5-pro-preview-0409")

#The prompt applied to all problems
prompt = '''Please formulate a mathematical optimization model for this problem. Include parameters, decision variables, the objective function and the constraints in your answer.
'''

#Generate the response
response = generative_multimodal_model.generate_content([prompt+problem])


In [149]:
#Show the response in a formatted way
Markdown(response.text)

## Mathematical Optimization Model for Investment Strategy Selection

**Parameters:**

* **R:** Matrix of historical returns for each investment strategy (obtained from "investments_data.csv"). Each column represents a strategy, and each row represents a period.
* **A:** Vector of investment costs for each strategy: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225].
* **S:** Vector of possible budget scenarios: [1000, 1100, 900, 1200].
* **P:** Vector of probabilities for each budget scenario: [0.55, 0.4, 0.04, 0.01].
* **r:** Risk aversion parameter: 2.
* **α:** Tolerable probability of exceeding the budget: 0.4.

**Decision Variables:**

* **x<sub>j</sub>:** Binary variable indicating whether to invest in strategy j (1 if invest, 0 otherwise), where j = 1, 2, ..., 10.

**Objective Function:**

Maximize the expected return minus the risk penalty:

```
Maximize:  ∑<sub>j=1</sub><sup>10</sup> (μ<sub>j</sub> * x<sub>j</sub>) - r * σ<sup>2</sup> 
```

where:

* **μ<sub>j</sub>:** Average historical return of strategy j, calculated as the mean of column j in matrix R.
* **σ<sup>2</sup>:** Variance of the portfolio return, calculated as: 
  ```
  σ<sup>2</sup> = ∑<sub>j=1</sub><sup>10</sup> ∑<sub>k=1</sub><sup>10</sup> (x<sub>j</sub> * x<sub>k</sub> * Cov(R<sub>j</sub>, R<sub>k</sub>))
  ```
  where Cov(R<sub>j</sub>, R<sub>k</sub>) is the covariance between the returns of strategies j and k, calculated from historical data in R.

**Constraints:**

1. **Budget constraint:** The total investment cost should not exceed the budget with a probability greater than α:
   ```
   P(∑<sub>j=1</sub><sup>10</sup> (A<sub>j</sub> * x<sub>j</sub>) > S<sub>i</sub>) ≤ α for all i = 1, 2, 3, 4 
   ```
   This constraint ensures that the probability of exceeding each budget scenario is less than the tolerable probability.
2. **Binary constraint:**
   ```
   x<sub>j</sub> ∈ {0, 1} for all j = 1, 2, ..., 10
   ```
   This constraint ensures that the decision variables are binary, meaning either invest in a strategy or not.

**Model:**

The complete mathematical model is:

```
Maximize:  ∑<sub>j=1</sub><sup>10</sup> (μ<sub>j</sub> * x<sub>j</sub>) - r * ∑<sub>j=1</sub><sup>10</sup> ∑<sub>k=1</sub><sup>10</sup> (x<sub>j</sub> * x<sub>k</sub> * Cov(R<sub>j</sub>, R<sub>k</sub>))

Subject to:

P(∑<sub>j=1</sub><sup>10</sup> (A<sub>j</sub> * x<sub>j</sub>) > S<sub>i</sub>) ≤ α for all i = 1, 2, 3, 4 
x<sub>j</sub> ∈ {0, 1} for all j = 1, 2, ..., 10
```

This model aims to find the optimal investment strategy combination that maximizes the expected return while considering the risk associated with the investment and the uncertainty in the available budget. 
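
The budget constraint in the response is written "for all i", which leaves its meaning open. One possible reading, assumed here, is that the budget is a discrete random variable taking the values in S with probabilities P, and that the chance constraint limits the probability that the total cost exceeds the realised budget. Under that assumption the constraint reduces to a single deterministic cost cap, which the following sketch computes (the helper name `violation_probability` is illustrative, not part of the response):

```python
S = [1000, 1100, 900, 1200]   # budget scenarios from the problem description
P = [0.55, 0.4, 0.04, 0.01]   # scenario probabilities
alpha = 0.4                   # tolerable probability of exceeding the budget

def violation_probability(cost):
    """Probability that the realised budget is smaller than `cost`."""
    return sum(p for s, p in zip(S, P) if s < cost)

# The violation probability is a non-decreasing step function of the cost that
# only jumps just above a scenario budget, so the largest admissible total cost
# is one of the scenario budgets.
admissible = [s for s in S if violation_probability(s) <= alpha]
cost_cap = max(admissible)

print(cost_cap)  # -> 1000
```

With this reading, spending up to 1000 euros fails only in the 900-euro scenario (probability 0.04), while any cost above 1000 also fails in the 1000-euro scenario (probability at least 0.59).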


## 3. Generate the Pyomo code

In [150]:
#Second prompt gets the output of the previous step and generates the code
prompt2 = "Please write pyomo code for this mathematical problem. Use sample data where needed. Indicate where you use sample data. \n"
prompt2 += response.text 
response2 = generative_multimodal_model.generate_content([prompt2])

In [151]:
#Showing the code in a formatted way
Markdown(response2.text)

```python
import pyomo.environ as pyo
import pandas as pd
import numpy as np

# Sample data for historical returns (replace with actual data from "investments_data.csv")
data = {
    'Strategy 1': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 2': [0.01, 0.02, 0.01, 0.03, 0.02],
    'Strategy 3': [0.03, 0.02, 0.04, 0.03, 0.02],
    'Strategy 4': [0.01, 0.03, 0.02, 0.04, 0.03],
    'Strategy 5': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 6': [0.01, 0.02, 0.01, 0.03, 0.02],
    'Strategy 7': [0.03, 0.02, 0.04, 0.03, 0.02],
    'Strategy 8': [0.01, 0.03, 0.02, 0.04, 0.03],
    'Strategy 9': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 10': [0.01, 0.02, 0.01, 0.03, 0.02]
}
df = pd.DataFrame(data)

# Parameters
R = df.values  # Matrix of historical returns
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Investment costs
S = [1000, 1100, 900, 1200]  # Budget scenarios
P = [0.55, 0.4, 0.04, 0.01]  # Probabilities of budget scenarios
r = 2  # Risk aversion parameter
alpha = 0.4  # Tolerable probability of exceeding the budget

# Calculate average returns and covariance matrix
mu = np.mean(R, axis=0)
cov_matrix = np.cov(R, rowvar=False)

# Create model
model = pyo.ConcreteModel()

# Decision variables
model.x = pyo.Var(range(len(A)), domain=pyo.Binary)

# Objective function
def objective_rule(model):
    expected_return = sum(mu[j] * model.x[j] for j in range(len(A)))
    risk_penalty = r * sum(
        model.x[j] * model.x[k] * cov_matrix[j, k] 
        for j in range(len(A)) for k in range(len(A))
    )
    return expected_return - risk_penalty
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Budget constraints
model.budget_constraints = pyo.ConstraintList()
for i in range(len(S)):
    lhs = sum(A[j] * model.x[j] for j in range(len(A)))
    model.budget_constraints.add(expr=pyo.EXPR.inequality(body=lhs, upper=S[i], probability=P[i]))

# Solve model
solver = pyo.SolverFactory('cbc')  # Choose a suitable solver
results = solver.solve(model)

# Print results
print(results)
print("Decision variables:")
for j in range(len(A)):
    print(f"x[{j}] = {model.x[j].value}")
```

This code defines a Pyomo model that optimizes the investment strategy based on the given parameters and constraints. Note that the provided historical return data is sample data and should be replaced with the actual data from "investments_data.csv". The code calculates the average returns and covariance matrix from the provided data, which are then used in the optimization model. The budget constraints are implemented using the `pyo.EXPR.inequality` function with the `probability` argument to account for the probabilistic budget scenarios. Finally, the model is solved using the CBC solver, and the optimal decision variables are printed. 
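
Replacing the sample data with the real file could look like the sketch below. It assumes the file has no header row (the problem description only says there is no index) and uses a small in-memory stand-in instead of the actual "investments_data.csv", whose contents are not shown in this notebook:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for "investments_data.csv": two periods, three strategies.
sample_csv = io.StringIO("0.02;0.01;0.03\n0.01;0.02;0.02\n")

# sep=";" follows the problem description; header=None is an assumption.
df = pd.read_csv(sample_csv, sep=";", header=None)

R = df.values                         # rows = periods, columns = strategies
mu = R.mean(axis=0)                   # average return per strategy
cov_matrix = np.cov(R, rowvar=False)  # covariance between strategies

print(R.shape, cov_matrix.shape)
```

For the real data, the `io.StringIO` object would be replaced by the path to the CSV file.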


## 4. Input problem data and try running the generated code

In [152]:
import pyomo.environ as pyo
import pandas as pd
import numpy as np

# Sample data for historical returns (replace with actual data from "investments_data.csv")
data = {
    'Strategy 1': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 2': [0.01, 0.02, 0.01, 0.03, 0.02],
    'Strategy 3': [0.03, 0.02, 0.04, 0.03, 0.02],
    'Strategy 4': [0.01, 0.03, 0.02, 0.04, 0.03],
    'Strategy 5': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 6': [0.01, 0.02, 0.01, 0.03, 0.02],
    'Strategy 7': [0.03, 0.02, 0.04, 0.03, 0.02],
    'Strategy 8': [0.01, 0.03, 0.02, 0.04, 0.03],
    'Strategy 9': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 10': [0.01, 0.02, 0.01, 0.03, 0.02]
}
df = pd.DataFrame(data)

# Parameters
R = df.values  # Matrix of historical returns
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Investment costs
S = [1000, 1100, 900, 1200]  # Budget scenarios
P = [0.55, 0.4, 0.04, 0.01]  # Probabilities of budget scenarios
r = 2  # Risk aversion parameter
alpha = 0.4  # Tolerable probability of exceeding the budget

# Calculate average returns and covariance matrix
mu = np.mean(R, axis=0)
cov_matrix = np.cov(R, rowvar=False)

# Create model
model = pyo.ConcreteModel()

# Decision variables
model.x = pyo.Var(range(len(A)), domain=pyo.Binary)

# Objective function
def objective_rule(model):
    expected_return = sum(mu[j] * model.x[j] for j in range(len(A)))
    risk_penalty = r * sum(
        model.x[j] * model.x[k] * cov_matrix[j, k] 
        for j in range(len(A)) for k in range(len(A))
    )
    return expected_return - risk_penalty
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Budget constraints
model.budget_constraints = pyo.ConstraintList()
for i in range(len(S)):
    lhs = sum(A[j] * model.x[j] for j in range(len(A)))
    model.budget_constraints.add(expr=pyo.EXPR.inequality(body=lhs, upper=S[i], probability=P[i]))

# Solve model
solver = pyo.SolverFactory('cbc')  # Choose a suitable solver
results = solver.solve(model)

# Print results
print(results)
print("Decision variables:")
for j in range(len(A)):
    print(f"x[{j}] = {model.x[j].value}")

AttributeError: module 'pyomo.environ' has no attribute 'EXPR'

## 5. Correct the code to verify model viability (optional)

In [155]:
import pyomo.environ as pyo
import pandas as pd
import numpy as np

# Sample data for historical returns (replace with actual data from "investments_data.csv")
data = {
    'Strategy 1': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 2': [0.01, 0.02, 0.01, 0.03, 0.02],
    'Strategy 3': [0.03, 0.02, 0.04, 0.03, 0.02],
    'Strategy 4': [0.01, 0.03, 0.02, 0.04, 0.03],
    'Strategy 5': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 6': [0.01, 0.02, 0.01, 0.03, 0.02],
    'Strategy 7': [0.03, 0.02, 0.04, 0.03, 0.02],
    'Strategy 8': [0.01, 0.03, 0.02, 0.04, 0.03],
    'Strategy 9': [0.02, 0.01, 0.03, 0.02, 0.01],
    'Strategy 10': [0.01, 0.02, 0.01, 0.03, 0.02]
}
df = pd.DataFrame(data)

# Parameters
R = df.values  # Matrix of historical returns
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]  # Investment costs
S = [1000, 1100, 900, 1200]  # Budget scenarios
P = [0.55, 0.4, 0.04, 0.01]  # Probabilities of budget scenarios
r = 2  # Risk aversion parameter
alpha = 0.4  # Tolerable probability of exceeding the budget

# Calculate average returns and covariance matrix
mu = np.mean(R, axis=0)
cov_matrix = np.cov(R, rowvar=False)

# Create model
model = pyo.ConcreteModel()

# Decision variables
model.x = pyo.Var(range(len(A)), domain=pyo.Binary)

# Objective function
def objective_rule(model):
    expected_return = sum(mu[j] * model.x[j] for j in range(len(A)))
    risk_penalty = r * sum(
        model.x[j] * model.x[k] * cov_matrix[j, k] 
        for j in range(len(A)) for k in range(len(A))
    )
    return expected_return - risk_penalty
model.objective = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Budget constraints (deterministic: every scenario budget is enforced, so P and alpha are not used)
model.budget_constraints = pyo.ConstraintList()
for i in range(len(S)):
    lhs = sum(A[j] * model.x[j] for j in range(len(A)))
    model.budget_constraints.add(expr=lhs <= S[i])

# Solve model
solver = pyo.SolverFactory('ipopt')  # Ipopt is a continuous NLP solver: the binary variables are relaxed to [0, 1]
results = solver.solve(model)

# Print results
print(results)
print("Decision variables:")
for j in range(len(A)):
    print(f"x[{j}] = {model.x[j].value}")


Problem: 
- Lower bound: -inf
  Upper bound: inf
  Number of objectives: 1
  Number of constraints: 4
  Number of variables: 10
  Sense: unknown
Solver: 
- Status: ok
  Message: Ipopt 3.11.1\x3a Optimal Solution Found
  Termination condition: optimal
  Id: 0
  Error rc: 0
  Time: 0.05002951622009277
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Decision variables:
x[0] = 0.99999978127438
x[1] = 2.6798733076179074e-07
x[2] = 5.001770214279535e-07
x[3] = 0.999999893445507
x[4] = 0.9999992061161198
x[5] = 0.999992023529714
x[6] = 0.9999999138081366
x[7] = 1.3375313572472333e-06
x[8] = 0.9999988987346486
x[9] = 0.7244504690754556
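
The value of `x[9]` is fractional, which is consistent with Ipopt solving a continuous relaxation rather than the binary problem. Because there are only 2^10 = 1024 possible selections, the binary version of the corrected model can be checked by enumeration. The following is a minimal sketch that reuses the sample returns from the cell above (not the real CSV data) and the same deterministic budget constraints:

```python
import itertools

import numpy as np

# Sample returns from the notebook: the ten columns cycle through four patterns.
patterns = [
    [0.02, 0.01, 0.03, 0.02, 0.01],
    [0.01, 0.02, 0.01, 0.03, 0.02],
    [0.03, 0.02, 0.04, 0.03, 0.02],
    [0.01, 0.03, 0.02, 0.04, 0.03],
]
R = np.array([patterns[j % 4] for j in range(10)]).T  # rows = periods

A = np.array([80, 340, 410, 50, 180, 221, 15, 348, 191, 225])
S = [1000, 1100, 900, 1200]
r = 2

mu = R.mean(axis=0)
cov = np.cov(R, rowvar=False)

def objective(x):
    """Mean-variance objective: expected return minus r times the variance."""
    return mu @ x - r * (x @ cov @ x)

# The corrected cell enforces every scenario budget, so the smallest one binds.
budget = min(S)

best_x, best_val = None, -np.inf
for bits in itertools.product([0, 1], repeat=len(A)):
    x = np.array(bits)
    if A @ x <= budget:
        val = objective(x)
        if val > best_val:
            best_x, best_val = x, val

print("best binary portfolio:", best_x)
print("objective value:", best_val)
print("total cost:", A @ best_x)
```

Enumeration only works because the problem is tiny. For larger instances a mixed-integer quadratic solver would be needed instead.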


## 6. Print the outputs as strings so they can be saved

In [156]:
print(response.text)


In [157]:
print(response2.text)
