# LLM Optimization Modelling Experiment

In [1]:
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Image
from IPython.display import Markdown

## 1. Define the problem description

In [304]:
problem = '''Your goal is to invest in several of 10 possible investment strategies in the most optimal way. The historic returns of those strategies are stored in the file "investments_data.csv". Each column represents one strategy and the rows are the past investment outcomes. There is no index and the values are separated by a ;.

The costs for investing in a given investment is stored in a vector A, which has one value for each strategy in order.  
The values are: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]

You can only invest once into an investment. 

Unfortunately due to other costs and inflation, your available budget at this time is uncertain. There are four possible budget scenarios with different probabilities: scenario 1 with 1000 euros and probability of 0.55, scenario 2  with 1100 euros and probability of 0.4, scenario 3 with 900 euros and probability of 0.04, scenario 4 with 1200 euros and probability of 0.01. 
The tolerable probability of exceeding the budget is 0.4.

Please formulate a mean-variance mathematical model for this optimization problem, considering the past performance of investment strategies and the uncertain budget. You can take 2 as the risk parameter r.'''
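
The file layout described in the problem text (one column per strategy, one row per past outcome, `;` as separator, no index) can be read with the standard library alone. The sketch below is only illustrative: the three-column `sample_csv` string is a made-up stand-in for the real "investments_data.csv", and `load_returns` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical stand-in for the first lines of "investments_data.csv":
# rows are past outcomes (periods), columns are strategies, separator is ';'.
sample_csv = "0.10;0.05;0.08\n0.08;0.11;0.06\n"


def load_returns(handle):
    """Parse ';'-separated rows without a header into a list of float rows."""
    reader = csv.reader(handle, delimiter=";")
    return [[float(value) for value in row] for row in reader if row]


returns = load_returns(io.StringIO(sample_csv))
n_periods = len(returns)        # number of rows
n_strategies = len(returns[0])  # number of columns
print(n_periods, n_strategies)
```

With the real file, `open("investments_data.csv", newline="")` would take the place of the `io.StringIO` object. Keeping track of which axis holds the periods matters later, when the return parameter is indexed by period and strategy.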

## 2. Ask for parameters

In [305]:
#Initializing the session. To replicate, make sure valid Google Cloud credentials are available in the environment (e.g. via the GOOGLE_APPLICATION_CREDENTIALS environment variable)
PROJECT_ID = "llm4optproblems"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

#Specifying the model
generative_multimodal_model = GenerativeModel("gemini-1.5-pro-preview-0409")

#The prompt applied to all problems
prompt = '''Please formulate only the variables for this mathematical optimization problem. 
'''

#Generate the response
response = generative_multimodal_model.generate_content([prompt+problem])


In [306]:
#Show the response in a formatted way
Markdown(response.text)

## Sets:
*  **S:** Set of investment strategies (1, 2, ..., 10)
*  **Ω:** Set of budget scenarios (1, 2, 3, 4)

## Parameters:
*  **R<sub>is</sub>:** Return of strategy *s*∈S in historical period *i* (from "investments_data.csv") 
*  **C<sub>s</sub>:** Cost of investing in strategy *s*∈S (from vector A: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225])
*  **B<sub>ω</sub>:** Budget available in scenario *ω*∈Ω ([1000, 1100, 900, 1200])
*  **P<sub>ω</sub>:** Probability of scenario *ω*∈Ω ([0.55, 0.4, 0.04, 0.01])
*  **α:** Maximum tolerable probability of exceeding the budget (0.4)
*  **r:** Risk parameter (2)

## Decision Variables:
*  **x<sub>s</sub>:** Binary variable equal to 1 if investment strategy *s*∈S is selected, 0 otherwise 

## Auxiliary Variables:
*  **y<sub>ω</sub>:** Binary variable equal to 1 if the budget is exceeded in scenario *ω*∈Ω, 0 otherwise 


## 2. Ask for objective

In [307]:
#Second prompt gets the problem and the variables from the previous step and asks for the objective function
prompt2 = "Please formulate only the objective function for this mathematical optimization problem."
prompt2 += problem + response.text
response2 = generative_multimodal_model.generate_content([prompt2])

In [308]:
Markdown(response2.text)

$$\text{Minimize} \quad  \sum_{i}  \left( \frac{1}{|I|} \sum_{s \in S} (R_{is}  \cdot x_s) -  \frac{1}{|I|-1} \sum_{i} (\sum_{s \in S}(R_{is}  \cdot x_s)-\sum_{i}\sum_{s \in S}(R_{is}  \cdot x_s) / |I|)^2 \right) + r \cdot \sum_{\omega \in \Omega} P_{\omega} \cdot y_{\omega}$$ 
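
For comparison, one common textbook form of a sample mean-variance objective over the binary selections is sketched below. It is not model output. The symbols $\mu_s$ and $\sigma_{st}$ are introduced here for illustration only, $I$ denotes the set of historical periods, and the sign convention (maximize expected return minus $r$ times variance) is an assumption.

$$\mu_s = \frac{1}{|I|} \sum_{i \in I} R_{is}, \qquad \sigma_{st} = \frac{1}{|I|-1} \sum_{i \in I} (R_{is} - \mu_s)(R_{it} - \mu_t)$$

$$\text{Maximize} \quad \sum_{s \in S} \mu_s \cdot x_s - r \cdot \sum_{s \in S} \sum_{t \in S} \sigma_{st} \cdot x_s \cdot x_t$$

In this form the budget uncertainty does not enter the objective; it would be handled entirely by the constraints.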


## 3. Ask for constraints

In [309]:
#Third prompt gets the problem, the variables and the objective from the previous steps and asks for the constraints
prompt3 = "Please formulate only the constraints for this mathematical optimization problem."
prompt3 += problem + response.text + response2.text
response3 = generative_multimodal_model.generate_content([prompt3])

In [310]:
Markdown(response3.text)

## Constraints:

1. **Budget Constraint:** The total cost of selected investments should not exceed the budget in each scenario *ω*∈Ω, unless the corresponding binary variable *y<sub>ω</sub>* is equal to 1. 
   
   $$  \sum_{s \in S} C_s \cdot x_s  \le B_\omega + y_\omega \cdot M  \quad  \forall \omega \in \Omega $$

   where M is a large positive constant.

2. **Probability of Budget Exceedance:** The probability of exceeding the budget should be less than or equal to the tolerable probability *α*.

   $$ \sum_{\omega \in \Omega} P_\omega \cdot y_\omega \le \alpha $$

3. **Binary Investment Decision:** Each investment strategy can be selected only once.

   $$x_s \in \{0, 1\} \quad \forall s \in S $$

4. **Binary Budget Exceedance:**  The budget is either exceeded or not in each scenario.

   $$y_\omega \in \{0, 1\} \quad \forall \omega \in \Omega $$ 
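
The effect of constraints 1 and 2 can be checked by hand for any total cost: sum the probabilities of all scenarios whose budget is below that cost and compare the result with α. The following sketch does this with the values from the problem description; the function names `exceedance_probability` and `is_feasible` are illustrative, not part of the generated model.

```python
costs = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]
budgets = [1000, 1100, 900, 1200]
probabilities = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4


def exceedance_probability(total_cost):
    """Total probability of the scenarios whose budget is below total_cost."""
    return sum(p for b, p in zip(budgets, probabilities) if total_cost > b)


def is_feasible(selection):
    """selection is a 0/1 list, one entry per strategy, in the order of costs."""
    total_cost = sum(c for c, x in zip(costs, selection) if x)
    return exceedance_probability(total_cost) <= alpha


# Largest budget level that can be fully spent without violating alpha.
spending_cap = max(b for b in budgets if exceedance_probability(b) <= alpha)
print(spending_cap)
```

Under these numbers, any total cost above 1000 exceeds the budget in scenarios 1 and 3, whose combined probability of 0.59 is larger than α = 0.4, so the chance constraint behaves like a single spending cap of 1000.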


## 3. Generate the pyomo code

In [311]:
#Fourth prompt gets the variables, objective and constraints from the previous steps and generates the code
prompt4 = "Please write pyomo code for this mathematical problem. Use sample data where needed. Indicate where you use sample data. \n"
prompt4 += response.text + response2.text + response3.text 
response4 = generative_multimodal_model.generate_content([prompt4])
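
The four prompts above form a chain in which each step receives the earlier responses. The sketch below restates that chain with a pluggable text generator so it can be exercised without a Vertex AI session. `run_chain` and `fake_generate` are hypothetical names; in the notebook the generator role is played by `generative_multimodal_model.generate_content([...]).text`.

```python
from typing import Callable, List


def run_chain(generate: Callable[[str], str], problem: str) -> List[str]:
    """Mirror the notebook's four prompting steps with any text generator."""
    variables = generate(
        "Please formulate only the variables for this mathematical optimization problem. \n"
        + problem
    )
    objective = generate(
        "Please formulate only the objective function for this mathematical optimization problem."
        + problem + variables
    )
    constraints = generate(
        "Please formulate only the constraints for this mathematical optimization problem."
        + problem + variables + objective
    )
    # As in the notebook, the code prompt receives the formulation but not the problem text.
    code = generate(
        "Please write pyomo code for this mathematical problem. Use sample data where needed. "
        "Indicate where you use sample data. \n"
        + variables + objective + constraints
    )
    return [variables, objective, constraints, code]


# Stand-in generator that records the prompts it receives.
calls = []


def fake_generate(prompt: str) -> str:
    calls.append(prompt)
    return f"<response {len(calls)}>"


responses = run_chain(fake_generate, "PROBLEM")
print(responses)
```

Because the last prompt omits the problem text, the code generation step only sees what the earlier responses carried over, such as the cost vector and the budget scenarios.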

In [312]:
#Showing the code in a formatted way
Markdown(response4.text)

```python
import pyomo.environ as pyo
import pandas as pd

# Sample data (replace with your actual data)
investments_data = pd.DataFrame({
    'Period1': [0.1, 0.05, 0.08, 0.12, 0.09, 0.06, 0.11, 0.07, 0.13, 0.04],
    'Period2': [0.08, 0.11, 0.06, 0.09, 0.13, 0.07, 0.1, 0.05, 0.12, 0.04],
    # ... add data for other periods
})
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]
B = [1000, 1100, 900, 1200]
P = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Model
model = pyo.ConcreteModel()

# Sets
model.S = pyo.Set(initialize=range(1, len(A) + 1))  # Strategies
model.Omega = pyo.Set(initialize=range(1, len(B) + 1))  # Scenarios
model.I = pyo.Set(initialize=range(1, investments_data.shape[1] + 1))  # Historical periods

# Parameters
model.R = pyo.Param(model.I, model.S, initialize=lambda model, i, s: investments_data.iloc[i - 1, s - 1])
model.C = pyo.Param(model.S, initialize=lambda model, s: A[s - 1])
model.B = pyo.Param(model.Omega, initialize=lambda model, w: B[w - 1])
model.P = pyo.Param(model.Omega, initialize=lambda model, w: P[w - 1])
model.alpha = pyo.Param(initialize=alpha)
model.r = pyo.Param(initialize=r)

# Decision Variables
model.x = pyo.Var(model.S, domain=pyo.Binary)
model.y = pyo.Var(model.Omega, domain=pyo.Binary)

# Objective function
def objective_rule(model):
    return sum(
        (1 / len(model.I)) * sum(model.R[i, s] * model.x[s] for s in model.S)
        - (1 / (len(model.I) - 1))
        * sum(
            (sum(model.R[i, s] * model.x[s] for s in model.S) - sum(model.R[i, s] * model.x[s] for i in model.I for s in model.S) / len(model.I)) ** 2
            for i in model.I
        )
        for i in model.I
    ) + model.r * sum(model.P[w] * model.y[w] for w in model.Omega)
model.obj = pyo.Objective(rule=objective_rule, sense=pyo.minimize)

# Constraints
# Constraint 1: Budget Constraint
def budget_constraint(model, w):
    M = 1e6  # Large positive constant
    return sum(model.C[s] * model.x[s] for s in model.S) <= model.B[w] + model.y[w] * M
model.budget_constraint = pyo.Constraint(model.Omega, rule=budget_constraint)

# Constraint 2: Probability of Budget Exceedance
def prob_exceedance_constraint(model):
    return sum(model.P[w] * model.y[w] for w in model.Omega) <= model.alpha
model.prob_exceedance_constraint = pyo.Constraint(rule=prob_exceedance_constraint)

# Solve the model
solver = pyo.SolverFactory('glpk')
solver.solve(model)

# Print the solution
print("Optimal Solution:")
for s in model.S:
    if pyo.value(model.x[s]) > 0.5:
        print(f"  Invest in strategy {s}")
for w in model.Omega:
    if pyo.value(model.y[w]) > 0.5:
        print(f"  Budget exceeded in scenario {w}")
print(f"Objective value: {pyo.value(model.obj)}")

```

This code defines the sets, parameters, variables, objective function, and constraints of your mathematical model.  It then uses the 'glpk' solver to find the optimal solution and prints the selected investment strategies.

Remember to replace the sample data with your actual data from "investments_data.csv" and the vectors A, B, and P. 


## 4. Input problem data and try running the generated code

In [313]:
import pyomo.environ as pyo
import pandas as pd

# Sample data (replace with your actual data)
investments_data = pd.DataFrame({
    'Period1': [0.1, 0.05, 0.08, 0.12, 0.09, 0.06, 0.11, 0.07, 0.13, 0.04],
    'Period2': [0.08, 0.11, 0.06, 0.09, 0.13, 0.07, 0.1, 0.05, 0.12, 0.04],
    # ... add data for other periods
})
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]
B = [1000, 1100, 900, 1200]
P = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Model
model = pyo.ConcreteModel()

# Sets
model.S = pyo.Set(initialize=range(1, len(A) + 1))  # Strategies
model.Omega = pyo.Set(initialize=range(1, len(B) + 1))  # Scenarios
model.I = pyo.Set(initialize=range(1, investments_data.shape[1] + 1))  # Historical periods

# Parameters
model.R = pyo.Param(model.I, model.S, initialize=lambda model, i, s: investments_data.iloc[i - 1, s - 1])
model.C = pyo.Param(model.S, initialize=lambda model, s: A[s - 1])
model.B = pyo.Param(model.Omega, initialize=lambda model, w: B[w - 1])
model.P = pyo.Param(model.Omega, initialize=lambda model, w: P[w - 1])
model.alpha = pyo.Param(initialize=alpha)
model.r = pyo.Param(initialize=r)

# Decision Variables
model.x = pyo.Var(model.S, domain=pyo.Binary)
model.y = pyo.Var(model.Omega, domain=pyo.Binary)

# Objective function
def objective_rule(model):
    return sum(
        (1 / len(model.I)) * sum(model.R[i, s] * model.x[s] for s in model.S)
        - (1 / (len(model.I) - 1))
        * sum(
            (sum(model.R[i, s] * model.x[s] for s in model.S) - sum(model.R[i, s] * model.x[s] for i in model.I for s in model.S) / len(model.I)) ** 2
            for i in model.I
        )
        for i in model.I
    ) + model.r * sum(model.P[w] * model.y[w] for w in model.Omega)
model.obj = pyo.Objective(rule=objective_rule, sense=pyo.minimize)

# Constraints
# Constraint 1: Budget Constraint
def budget_constraint(model, w):
    M = 1e6  # Large positive constant
    return sum(model.C[s] * model.x[s] for s in model.S) <= model.B[w] + model.y[w] * M
model.budget_constraint = pyo.Constraint(model.Omega, rule=budget_constraint)

# Constraint 2: Probability of Budget Exceedance
def prob_exceedance_constraint(model):
    return sum(model.P[w] * model.y[w] for w in model.Omega) <= model.alpha
model.prob_exceedance_constraint = pyo.Constraint(rule=prob_exceedance_constraint)

# Solve the model
solver = pyo.SolverFactory('glpk')
solver.solve(model)

# Print the solution
print("Optimal Solution:")
for s in model.S:
    if pyo.value(model.x[s]) > 0.5:
        print(f"  Invest in strategy {s}")
for w in model.Omega:
    if pyo.value(model.y[w]) > 0.5:
        print(f"  Budget exceeded in scenario {w}")
print(f"Objective value: {pyo.value(model.obj)}")

ERROR: Rule failed for Param 'R' with index (1, 3): IndexError: index 2 is out
of bounds for axis 0 with size 2
ERROR: Constructing component 'R' from data=None failed:
        IndexError: index 2 is out of bounds for axis 0 with size 2


IndexError: index 2 is out of bounds for axis 0 with size 2
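
The error comes from an orientation mismatch in the generated code. The sample data is keyed by period, so as a DataFrame it has 10 rows (strategies) and 2 columns (periods), yet `R` is read with `iloc[i - 1, s - 1]`, which treats rows as periods and columns as strategies. The sketch below reproduces the mismatch without pandas by emulating positional row/column access on the same sample values; `generated_lookup` and `fixed_lookup` are illustrative names.

```python
# Sample data from the generated code: one entry per period, ten strategies each.
sample = {
    "Period1": [0.1, 0.05, 0.08, 0.12, 0.09, 0.06, 0.11, 0.07, 0.13, 0.04],
    "Period2": [0.08, 0.11, 0.06, 0.09, 0.13, 0.07, 0.1, 0.05, 0.12, 0.04],
}

# Row-major view of the same table: 10 rows (strategies) x 2 columns (periods).
rows = list(zip(*sample.values()))
shape = (len(rows), len(rows[0]))


def generated_lookup(i, s):
    """Emulates iloc[i - 1, s - 1]: row i - 1 (period), column s - 1 (strategy)."""
    return rows[i - 1][s - 1]


def fixed_lookup(i, s):
    """Reads period i of strategy s from a table with strategies as rows."""
    return rows[s - 1][i - 1]


try:
    generated_lookup(1, 3)
except IndexError as exc:
    print("generated lookup fails:", exc)

print(shape, fixed_lookup(1, 3))
```

For the real CSV, where the problem text states that columns are strategies and rows are past outcomes, the original `iloc[i - 1, s - 1]` access would match the layout, but the period set `I` would then presumably need to be sized from `shape[0]` rather than `shape[1]`.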

## 5. Correct the code to verify model viability (optional)

In [295]:
# THE MODEL WAS RUN IN COLAB WITH COUENNE AND THE RIGHT DATA TO VERIFY VIABILITY, BUT IT FAILED WITH AN INDEXERROR WHILE CONSTRUCTING PARAM 'R'.

#CODE OUTPUT:
# investments_data.csv(text/csv) - 36584 bytes, last modified: 03/05/2024 - 100% done
# ERROR:pyomo.core:Rule failed for Param 'R' with index (11, 1):
# IndexError: index 10 is out of bounds for axis 0 with size 10
# ERROR:pyomo.core:Constructing component 'R' from data=None failed:
#     IndexError: index 10 is out of bounds for axis 0 with size 10
# Saving investments_data.csv to investments_data (7).csv
# ---------------------------------------------------------------------------
# IndexError                                Traceback (most recent call last)
# <ipython-input-12-965855bb460e> in <cell line: 51>()
#      49 
#      50 # Parameters
# ---> 51 model.R = pyo.Param(model.I, model.S, initialize=lambda model, i, s: investments_data.iloc[i - 1, s - 1])
#      52 model.C = pyo.Param(model.S, initialize=lambda model, s: A[s - 1])
#      53 model.B = pyo.Param(model.Omega, initialize=lambda model, w: B[w - 1])

# 7 frames
# /usr/local/lib/python3.10/dist-packages/pandas/core/frame.py in _get_value(self, index, col, takeable)
#    3866         if takeable:
#    3867             series = self._ixs(col, axis=1)
# -> 3868             return series._values[index]
#    3869 
#    3870         series = self._get_item_cache(col)

# IndexError: index 10 is out of bounds for axis 0 with size 10
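
With only 10 binary decisions there are 2^10 = 1024 possible selections, so a brute-force enumeration can serve as a solver-independent check of any corrected model. The sketch below is one way to do that under stated assumptions: it reuses the two sample periods from the generated code rather than the real CSV, scores a selection as mean portfolio return minus `r` times the sample variance (an assumed mean-variance form, not the objective the model produced), and treats a selection as feasible when the probability of exceeding the budget is at most `alpha`.

```python
from itertools import product
from statistics import mean, variance

# Sample returns from the generated code (two periods, ten strategies), not the real data.
period_returns = [
    [0.1, 0.05, 0.08, 0.12, 0.09, 0.06, 0.11, 0.07, 0.13, 0.04],
    [0.08, 0.11, 0.06, 0.09, 0.13, 0.07, 0.1, 0.05, 0.12, 0.04],
]
costs = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]
budgets = [1000, 1100, 900, 1200]
probabilities = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2


def is_feasible(x):
    """Chance constraint: probability of exceeding the budget is at most alpha."""
    total_cost = sum(c * xi for c, xi in zip(costs, x))
    exceeded = sum(p for b, p in zip(budgets, probabilities) if total_cost > b)
    return exceeded <= alpha


def score(x):
    """Assumed objective: mean portfolio return minus r times its sample variance."""
    portfolio = [sum(ret * xi for ret, xi in zip(row, x)) for row in period_returns]
    return mean(portfolio) - r * variance(portfolio)


feasible = (x for x in product((0, 1), repeat=len(costs)) if is_feasible(x))
best = max(feasible, key=score)
best_value = score(best)
print(best, best_value)
```

The result depends entirely on the sample returns and the assumed objective, so it is only meaningful as a cross-check once both are replaced by the real data and the intended formulation.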

## 6. Print the responses

In [314]:
print(response.text)

## Sets:
*  **S:** Set of investment strategies (1, 2, ..., 10)
*  **Ω:** Set of budget scenarios (1, 2, 3, 4)

## Parameters:
*  **R<sub>is</sub>:** Return of strategy *s*∈S in historical period *i* (from "investments_data.csv") 
*  **C<sub>s</sub>:** Cost of investing in strategy *s*∈S (from vector A: [80, 340, 410, 50, 180, 221, 15, 348, 191, 225])
*  **B<sub>ω</sub>:** Budget available in scenario *ω*∈Ω ([1000, 1100, 900, 1200])
*  **P<sub>ω</sub>:** Probability of scenario *ω*∈Ω ([0.55, 0.4, 0.04, 0.01])
*  **α:** Maximum tolerable probability of exceeding the budget (0.4)
*  **r:** Risk parameter (2)

## Decision Variables:
*  **x<sub>s</sub>:** Binary variable equal to 1 if investment strategy *s*∈S is selected, 0 otherwise 

## Auxiliary Variables:
*  **y<sub>ω</sub>:** Binary variable equal to 1 if the budget is exceeded in scenario *ω*∈Ω, 0 otherwise 



In [315]:
print(response2.text)

$$\text{Minimize} \quad  \sum_{i}  \left( \frac{1}{|I|} \sum_{s \in S} (R_{is}  \cdot x_s) -  \frac{1}{|I|-1} \sum_{i} (\sum_{s \in S}(R_{is}  \cdot x_s)-\sum_{i}\sum_{s \in S}(R_{is}  \cdot x_s) / |I|)^2 \right) + r \cdot \sum_{\omega \in \Omega} P_{\omega} \cdot y_{\omega}$$ 



In [316]:
print(response3.text)

## Constraints:

1. **Budget Constraint:** The total cost of selected investments should not exceed the budget in each scenario *ω*∈Ω, unless the corresponding binary variable *y<sub>ω</sub>* is equal to 1. 
   
   $$  \sum_{s \in S} C_s \cdot x_s  \le B_\omega + y_\omega \cdot M  \quad  \forall \omega \in \Omega $$

   where M is a large positive constant.

2. **Probability of Budget Exceedance:** The probability of exceeding the budget should be less than or equal to the tolerable probability *α*.

   $$ \sum_{\omega \in \Omega} P_\omega \cdot y_\omega \le \alpha $$

3. **Binary Investment Decision:** Each investment strategy can be selected only once.

   $$x_s \in \{0, 1\} \quad \forall s \in S $$

4. **Binary Budget Exceedance:**  The budget is either exceeded or not in each scenario.

   $$y_\omega \in \{0, 1\} \quad \forall \omega \in \Omega $$ 



In [317]:
print(response4.text)

```python
import pyomo.environ as pyo
import pandas as pd

# Sample data (replace with your actual data)
investments_data = pd.DataFrame({
    'Period1': [0.1, 0.05, 0.08, 0.12, 0.09, 0.06, 0.11, 0.07, 0.13, 0.04],
    'Period2': [0.08, 0.11, 0.06, 0.09, 0.13, 0.07, 0.1, 0.05, 0.12, 0.04],
    # ... add data for other periods
})
A = [80, 340, 410, 50, 180, 221, 15, 348, 191, 225]
B = [1000, 1100, 900, 1200]
P = [0.55, 0.4, 0.04, 0.01]
alpha = 0.4
r = 2

# Model
model = pyo.ConcreteModel()

# Sets
model.S = pyo.Set(initialize=range(1, len(A) + 1))  # Strategies
model.Omega = pyo.Set(initialize=range(1, len(B) + 1))  # Scenarios
model.I = pyo.Set(initialize=range(1, investments_data.shape[1] + 1))  # Historical periods

# Parameters
model.R = pyo.Param(model.I, model.S, initialize=lambda model, i, s: investments_data.iloc[i - 1, s - 1])
model.C = pyo.Param(model.S, initialize=lambda model, s: A[s - 1])
model.B = pyo.Param(model.Omega, initialize=lambda model, w: B[w - 1])
model.P = pyo.P