# LLM Optimization Modelling Experiment

In [3]:
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Image
from IPython.display import Markdown

## 1. Define the problem description

In [4]:
problem = '''We are delighted to welcome you, our newest intern on the Analytics team of Massachusetts General Hospital! You have been placed in a challenging role where you will be tasked with solving a real-world problem in the field of medical physics. We are building a pilot program in Boston, and if successful, your work could be applied widely in hospitals with limited capacity in many countries.

You are responsible for determining the best treatment plan for 17 patients who require radiotherapy. Your goal is to optimize the use of two possible treatments: photon therapy and proton therapy. While proton therapy is known to target tumors more precisely, it is also more expensive and has limited capacity in many countries. Therefore, you will need to balance the benefits of proton therapy with its limitations and cost to create an effective treatment plan for each patient.

To determine the best course of action for each patient, you will use a scoring system called the Biological Equivalent Dose (BED). This system allows you to calculate the effectiveness of each patient’s treatment plan by considering the number of proton fractions that can be used while still achieving the highest possible BED.

We have n=17 patients who need radiotherapy. Each patient i needs 15 fractions, which can be photon fractions, proton fractions, or a mix of photon and proton fractions (e.g. 4 proton fractions and 11 photon fractions). We want to use the limited proton therapy capacity as best as possible. We can calculate the BED score for each patient when p proton fractions and 15-p photon fractions are used, as BEDi(p,15-p), i.e., the BED when p proton and 15-p photon fractions are delivered for patient i. The higher the score, the better. 

The data file "ProblemData.csv" contains a 2D matrix of BED scores. It does not have an index. It was made in Excel and saved as csv. The columns are the number of proton fractions and each row represents a patient. In particular, the number at the (i,j) position is the score for patient i receiving j proton fractions. 

Suppose that the total maximal capacity C is 100 proton fractions. To maximize the total BED scores for all the patients, which patients should get proton fractions, and how many should they get? Formulate an integer linear optimization model to solve this problem. Assume you know the value BEDi(j,15-j) for each patient i. '''

## 2. Generate the mathematical model

In [5]:
# Initialize the session. To replicate, make sure valid Google Cloud credentials are available in the environment (e.g. via the GOOGLE_APPLICATION_CREDENTIALS variable)
PROJECT_ID = "llm4optproblems"
REGION = "us-central1"
vertexai.init(project=PROJECT_ID, location=REGION)

#Specifying the model
generative_multimodal_model = GenerativeModel("gemini-1.5-pro-preview-0409")

# The prompt applied to all problems
prompt = '''Please formulate a mathematical optimization model for this problem. Include parameters, decision variables, the objective function and the constraints in your answer.
'''

#Generate the response
response = generative_multimodal_model.generate_content([prompt+problem])


In [6]:
# Show the response in a formatted way
Markdown(response.text)

## Mathematical Optimization Model for Radiotherapy Treatment Planning

**Parameters:**

*  `n`: Number of patients (n=17).
* `C`: Total proton fraction capacity (C=100).
* `BEDi(p, 15-p)`: BED score for patient `i` when receiving `p` proton fractions and `15-p` photon fractions (obtained from the "ProblemData.csv" file).

**Decision Variables:**

* `xij`: Binary variable, equal to 1 if patient `i` receives `j` proton fractions, and 0 otherwise.

**Objective Function:**

Maximize the total BED score for all patients:

```
Maximize Σ(i=1 to n) Σ(j=0 to 15) BEDi(j, 15-j) * xij
```

**Constraints:**

1. **Each patient receives exactly 15 fractions:**
   ```
   Σ(j=0 to 15) xij = 1 for all i = 1, ..., n 
   ```

2. **Total proton fraction capacity:**
   ```
   Σ(i=1 to n) Σ(j=0 to 15) j * xij <= C 
   ```

3. **Binary variable constraint:**
   ```
   xij ∈ {0, 1} for all i = 1, ..., n and j = 0, ..., 15
   ```

**Explanation:**

The objective function aims to maximize the total BED score across all patients. The first constraint ensures that each patient receives the required 15 fractions. The second constraint ensures that the total number of proton fractions allocated does not exceed the available capacity. Finally, the third constraint defines the decision variables as binary, indicating whether a specific number of proton fractions is assigned to a patient. 

This model allows you to determine the optimal allocation of proton fractions to patients while considering the limited capacity and aiming for the highest possible total BED score. You can use an integer linear programming solver to find the optimal solution for the decision variables `xij`, which will tell you how many proton fractions each patient should receive. 
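
For reference, the same formulation can be written in LaTeX. This is only a restatement of the model above: the shorthand $b_{ij}$ for $\mathrm{BED}_i(j, 15-j)$ is an assumption introduced here for brevity and does not appear in the response. Note that the first constraint works by selecting exactly one proton count $j$ per patient, which fixes the photon count at $15-j$ and therefore the total at 15 fractions.

```latex
\begin{align}
\max_{x} \quad & \sum_{i=1}^{n} \sum_{j=0}^{15} b_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{j=0}^{15} x_{ij} = 1, && i = 1, \dots, n \\
& \sum_{i=1}^{n} \sum_{j=0}^{15} j\, x_{ij} \le C \\
& x_{ij} \in \{0, 1\}, && i = 1, \dots, n,\ j = 0, \dots, 15
\end{align}
```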


## 3. Generate the pyomo code

In [7]:
#Second prompt gets the output of the previous step and generates the code
prompt2 = "Please write pyomo code for this mathematical problem. Use sample data where needed. Indicate where you use sample data. \n"
prompt2 += response.text 
response2 = generative_multimodal_model.generate_content([prompt2])

In [8]:
#Showing the code in a formatted way
Markdown(response2.text)

```python
import pyomo.environ as pyo
import pandas as pd

# Sample Data: Replace with actual data from "ProblemData.csv"
data = {
    'Patient': [f'Patient {i+1}' for i in range(17)],
    'BED_0': [100, 95, 105, 98, 102, 97, 103, 99, 101, 96, 104, 98, 102, 97, 103, 99, 101],
    'BED_1': [105, 100, 110, 103, 107, 102, 108, 104, 106, 101, 109, 103, 107, 102, 108, 104, 106],
    # ... Add BED data for 2 to 14 proton fractions ...
    'BED_15': [150, 145, 155, 148, 152, 147, 153, 149, 151, 146, 154, 148, 152, 147, 153, 149, 151]
}
df = pd.DataFrame(data)

# Model Parameters
n = 17  # Number of patients
C = 100  # Total proton fraction capacity

# Create a Pyomo model
model = pyo.ConcreteModel()

# Sets
model.I = pyo.Set(initialize=df['Patient'].tolist())  # Set of patients
model.J = pyo.Set(initialize=range(16))  # Set of proton fractions (0 to 15)

# Parameters
model.BED = pyo.Param(model.I, model.J, initialize=lambda model, i, j: df[f'BED_{j}'][df['Patient'] == i].values[0])

# Decision Variables
model.x = pyo.Var(model.I, model.J, within=pyo.Binary)

# Objective Function
def objective_rule(model):
    return sum(model.BED[i, j] * model.x[i, j] for i in model.I for j in model.J)
model.obj = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
def constraint_fractions_rule(model, i):
    return sum(model.x[i, j] for j in model.J) == 1
model.constraint_fractions = pyo.Constraint(model.I, rule=constraint_fractions_rule)

def constraint_capacity_rule(model):
    return sum(j * model.x[i, j] for i in model.I for j in model.J) <= C
model.constraint_capacity = pyo.Constraint(rule=constraint_capacity_rule)

# Solve the model
solver = pyo.SolverFactory('glpk')  # Use an appropriate solver
results = solver.solve(model)

# Display results
print(results)

# Extract optimal solution
for i in model.I:
    for j in model.J:
        if pyo.value(model.x[i, j]) == 1:
            print(f"{i} receives {j} proton fractions.")
```

**Explanation and Sample Data:**

*   **Sample Data:**  The code uses sample data for the BED scores in the `data` dictionary. This data should be replaced with the actual data from your "ProblemData.csv" file.
*   **Pyomo Model:** The code defines the sets, parameters, decision variables, objective function, and constraints using Pyomo's modeling framework.
*   **Solver:** The code uses the 'glpk' solver to find the optimal solution. You can replace this with other solvers as needed.
*   **Output:** The code prints the solver's results and then displays the optimal allocation of proton fractions for each patient.

Remember to update the sample BED data with your actual data and choose an appropriate solver for your problem.
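
The note about replacing the sample data can be illustrated with a minimal sketch that turns a headerless score matrix into a lookup keyed by `(patient, j)`. It uses only the standard library; the helper name `load_bed_lookup` and the inline CSV text are hypothetical stand-ins, not the contents of "ProblemData.csv".

```python
import csv
import io

# Hypothetical stand-in for "ProblemData.csv": 3 patients x 4 columns.
# The real file is described as one row per patient, no header, no index.
SAMPLE_CSV = "0.10,0.20,0.25,0.27\n0.30,0.31,0.32,0.33\n0.05,0.15,0.30,0.50\n"

def load_bed_lookup(handle):
    """Return {(patient_label, j): score} from a headerless BED matrix."""
    lookup = {}
    patient = 0
    for row in csv.reader(handle):
        if not row:  # skip blank lines, e.g. a trailing newline
            continue
        patient += 1
        for j, cell in enumerate(row):
            # Column position j is the number of proton fractions
            lookup[(f"Patient {patient}", j)] = float(cell)
    return lookup

bed = load_bed_lookup(io.StringIO(SAMPLE_CSV))
print(bed[("Patient 3", 2)])
```

A dictionary keyed by `(i, j)` tuples in this shape could be passed as `initialize=` to `pyo.Param(model.I, model.J, ...)` instead of a per-entry DataFrame lookup, assuming the patient labels match the elements of `model.I`.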


## 4. Input problem data and try running the generated code

In [15]:
import pyomo.environ as pyo
import pandas as pd

# Sample Data: Replace with actual data from "ProblemData.csv"
# data = {
#     'Patient': [f'Patient {i+1}' for i in range(17)],
#     'BED_0': [100, 95, 105, 98, 102, 97, 103, 99, 101, 96, 104, 98, 102, 97, 103, 99, 101],
#     'BED_1': [105, 100, 110, 103, 107, 102, 108, 104, 106, 101, 109, 103, 107, 102, 108, 104, 106],
#     # ... Add BED data for 2 to 14 proton fractions ...
#     'BED_15': [150, 145, 155, 148, 152, 147, 153, 149, 151, 146, 154, 148, 152, 147, 153, 149, 151]
# }

## Human edited data input

# Load the BED matrix; the file has no header row and no index column
df = pd.read_csv('ProblemData.csv', header=None)

# Name the columns BED_0, BED_1, ... so that column j holds the score for j proton fractions
columns = ['BED_' + str(i) for i in range(df.shape[1])]

# Assigning column names
df.columns = columns

# Convert DataFrame to dictionary
data = df.to_dict(orient='list')

# Adding 'Patient' key to dictionary
data['Patient'] = [f'Patient {i}' for i in range(1, df.shape[0] + 1)]

###
df = pd.DataFrame(data)

# Model Parameters
n = 17  # Number of patients
C = 100  # Total proton fraction capacity

# Create a Pyomo model
model = pyo.ConcreteModel()

# Sets
model.I = pyo.Set(initialize=df['Patient'].tolist())  # Set of patients
model.J = pyo.Set(initialize=range(16))  # Set of proton fractions (0 to 15)

# Parameters
model.BED = pyo.Param(model.I, model.J, initialize=lambda model, i, j: df[f'BED_{j}'][df['Patient'] == i].values[0])

# Decision Variables
model.x = pyo.Var(model.I, model.J, within=pyo.Binary)

# Objective Function
def objective_rule(model):
    return sum(model.BED[i, j] * model.x[i, j] for i in model.I for j in model.J)
model.obj = pyo.Objective(rule=objective_rule, sense=pyo.maximize)

# Constraints
def constraint_fractions_rule(model, i):
    return sum(model.x[i, j] for j in model.J) == 1
model.constraint_fractions = pyo.Constraint(model.I, rule=constraint_fractions_rule)

def constraint_capacity_rule(model):
    return sum(j * model.x[i, j] for i in model.I for j in model.J) <= C
model.constraint_capacity = pyo.Constraint(rule=constraint_capacity_rule)

# Solve the model
solver = pyo.SolverFactory('glpk')  # Use an appropriate solver
results = solver.solve(model)

# Display results
print(results)

# Extract optimal solution
for i in model.I:
    for j in model.J:
        if pyo.value(model.x[i, j]) == 1:
            print(f"{i} receives {j} proton fractions.")


Problem: 
- Name: unknown
  Lower bound: 8.24
  Upper bound: 8.24
  Number of objectives: 1
  Number of constraints: 18
  Number of variables: 272
  Number of nonzeros: 527
  Sense: maximize
Solver: 
- Status: ok
  Termination condition: optimal
  Statistics: 
    Branch and bound: 
      Number of bounded subproblems: 45
      Number of created subproblems: 45
  Error rc: 0
  Time: 0.04959511756896973
Solution: 
- number of solutions: 0
  number of solutions displayed: 0

Patient 1 receives 8 proton fractions.
Patient 2 receives 8 proton fractions.
Patient 3 receives 3 proton fractions.
Patient 4 receives 0 proton fractions.
Patient 5 receives 5 proton fractions.
Patient 6 receives 0 proton fractions.
Patient 7 receives 4 proton fractions.
Patient 8 receives 15 proton fractions.
Patient 9 receives 4 proton fractions.
Patient 10 receives 5 proton fractions.
Patient 11 receives 6 proton fractions.
Patient 12 receives 0 proton fractions.
Patient 13 receives 10 proton fractions.
Patient 

In [16]:
model.obj()

8.239999999999998
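
The reported objective and the printed allocation can be cross-checked without Pyomo by summing the chosen scores and the proton fractions used. The following is a minimal sketch of such a check; `check_allocation` is a hypothetical helper, and the data is a made-up toy instance rather than the real BED scores.

```python
def check_allocation(allocation, bed, capacity, max_fractions=15):
    """Validate an allocation and return (proton_fractions_used, total_bed).

    allocation: {patient: number of proton fractions}
    bed: {patient: [score for j = 0..max_fractions]}
    """
    used = 0
    total = 0.0
    for patient, j in allocation.items():
        if not 0 <= j <= max_fractions:
            raise ValueError(f"{patient}: {j} is outside 0..{max_fractions}")
        used += j
        total += bed[patient][j]
    if used > capacity:
        raise ValueError(f"capacity exceeded: {used} > {capacity}")
    return used, total

# Hypothetical toy data: 2 patients, at most 3 fractions each
toy_bed = {
    "Patient 1": [0.1, 0.2, 0.4, 0.5],
    "Patient 2": [0.3, 0.35, 0.36, 0.37],
}
used, total = check_allocation(
    {"Patient 1": 2, "Patient 2": 1}, toy_bed, capacity=3, max_fractions=3
)
print(used, round(total, 2))
```

With the real data, `bed` would hold one 16-element row per patient and `capacity` would be 100; the returned total should then agree with `model.obj()` up to floating-point rounding.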

## 5. Correct the code to verify model viability (optional)
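
One possible way to check the solver result independently is an exact dynamic program, since the formulation (one choice of `j` per patient under a single capacity limit) has the structure of a multiple-choice knapsack. The sketch below is an assumption about how such a check could look, not part of the generated code; `best_allocation` and the toy scores are hypothetical.

```python
def best_allocation(bed, capacity):
    """Exact DP: pick one j per patient with sum(j) <= capacity, maximizing total score.

    bed: list of rows; bed[i][j] is the score of patient i with j proton fractions.
    Returns (best_total, [j chosen for each patient]).
    """
    NEG = float("-inf")
    # best[c] = best total over the patients processed so far using exactly c fractions
    best = [0.0] + [NEG] * capacity
    choices = []  # choices[i][c] = j picked for patient i in the best state (i, c)
    for row in bed:
        new = [NEG] * (capacity + 1)
        pick = [None] * (capacity + 1)
        for c in range(capacity + 1):
            if best[c] == NEG:
                continue
            for j, score in enumerate(row):
                if c + j > capacity:
                    break  # j only grows, so later options cannot fit either
                if best[c] + score > new[c + j]:
                    new[c + j] = best[c] + score
                    pick[c + j] = j
        best = new
        choices.append(pick)
    # Take the best end state, then walk back through the recorded choices
    c = max(range(capacity + 1), key=lambda k: best[k])
    total = best[c]
    plan = []
    for pick in reversed(choices):
        j = pick[c]
        plan.append(j)
        c -= j
    plan.reverse()
    return total, plan

# Hypothetical toy instance: 3 patients, at most 3 fractions each, capacity 3
toy_bed = [
    [0.1, 0.2, 0.4, 0.5],
    [0.3, 0.35, 0.36, 0.37],
    [0.0, 0.3, 0.31, 0.32],
]
total, plan = best_allocation(toy_bed, capacity=3)
print(round(total, 2), plan)
```

For the problem size here (17 patients, 16 options each, capacity 100) the table is small, so the objective value returned by this routine on the real score matrix could be compared directly with the one reported by the solver.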

## 6. Printing the outputs as strings, so they can be saved.

In [17]:
print(response.text)

## Mathematical Optimization Model for Radiotherapy Treatment Planning

**Parameters:**

*  `n`: Number of patients (n=17).
* `C`: Total proton fraction capacity (C=100).
* `BEDi(p, 15-p)`: BED score for patient `i` when receiving `p` proton fractions and `15-p` photon fractions (obtained from the "ProblemData.csv" file).

**Decision Variables:**

* `xij`: Binary variable, equal to 1 if patient `i` receives `j` proton fractions, and 0 otherwise.

**Objective Function:**

Maximize the total BED score for all patients:

```
Maximize Σ(i=1 to n) Σ(j=0 to 15) BEDi(j, 15-j) * xij
```

**Constraints:**

1. **Each patient receives exactly 15 fractions:**
   ```
   Σ(j=0 to 15) xij = 1 for all i = 1, ..., n 
   ```

2. **Total proton fraction capacity:**
   ```
   Σ(i=1 to n) Σ(j=0 to 15) j * xij <= C 
   ```

3. **Binary variable constraint:**
   ```
   xij ∈ {0, 1} for all i = 1, ..., n and j = 0, ..., 15
   ```

**Explanation:**

The objective function aims to maximize the total BED score a

In [18]:
print(response2.text)

```python
import pyomo.environ as pyo
import pandas as pd

# Sample Data: Replace with actual data from "ProblemData.csv"
data = {
    'Patient': [f'Patient {i+1}' for i in range(17)],
    'BED_0': [100, 95, 105, 98, 102, 97, 103, 99, 101, 96, 104, 98, 102, 97, 103, 99, 101],
    'BED_1': [105, 100, 110, 103, 107, 102, 108, 104, 106, 101, 109, 103, 107, 102, 108, 104, 106],
    # ... Add BED data for 2 to 14 proton fractions ...
    'BED_15': [150, 145, 155, 148, 152, 147, 153, 149, 151, 146, 154, 148, 152, 147, 153, 149, 151]
}
df = pd.DataFrame(data)

# Model Parameters
n = 17  # Number of patients
C = 100  # Total proton fraction capacity

# Create a Pyomo model
model = pyo.ConcreteModel()

# Sets
model.I = pyo.Set(initialize=df['Patient'].tolist())  # Set of patients
model.J = pyo.Set(initialize=range(16))  # Set of proton fractions (0 to 15)

# Parameters
model.BED = pyo.Param(model.I, model.J, initialize=lambda model, i, j: df[f'BED_{j}'][df['Patient'] == i].values[0])

# Decisio