## Kursawe Function Optimization via Apptainer Microservice

In this notebook we demonstrate how to use the LINQX infrastructure to run closed-loop optimization jobs on containerized microservices run via Apptainer. The notebook has three main learning objectives:

1. How to containerize microservices using Apptainer in a manner that is compatible with LINQX `ApptainerDriverCommand` objects
2. How to build `ApptainerDriverCommand` objects which can create and interact with Apptainer instances
3. How to build `ApptainerDriverCommand` objects into a workflow which can be run for closed-loop optimization using the EvoTorch optimization package

### Other Notebooks
- See `ideal_gas_law_opt.ipynb` for a tutorial on the basic LINQX infrastructure components
- See `interactive_gas_law_opt.ipynb` for a tutorial on how to build and run workflows/optimization with interactive commands
- See `optimize_mol.ipynb` for a tutorial on how to build and run workflows/optimization with containerized microservices using Docker

### Requirements
Please make sure that you have the following dependencies installed:
- `pydantic version 2.3.0`
- `torch version 2.0.1`
- `evotorch version latest`
- `matplotlib version latest`
- `pandas version latest`
- `spython version latest`

### Background

The Kursawe function is a benchmark used to test an optimization algorithm's ability to perform multi-objective optimization.

The Kursawe function is defined as follows:

```python
min (
    sum(i,1,2) [-10 * exp(-0.2 * sqrt((x_i)^2 + (x_(i+1))^2))], # function 1
    sum(i,1,3) [|x_i|^0.8 + 5 * sin((x_i)^3)] # function 2
)
```
With bounds of
```python
-5 <= x_i <= 5
1 <= i <= 3
```
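
For reference, the definition above can be written as a small pure-Python function. This is a minimal sketch based only on the formula and bounds given here; it is not the contents of `microservices/kursawe.py`, and the function name `kursawe` is chosen for illustration.

```python
import math


def kursawe(x):
    """Return (f1, f2) of the Kursawe function for a vector x of length 3."""
    if len(x) != 3:
        raise ValueError("expected a vector of length 3")
    if any(not -5 <= xi <= 5 for xi in x):
        raise ValueError("each x_i must lie in [-5, 5]")
    # Function 1: sum over the two adjacent pairs (x_1, x_2) and (x_2, x_3)
    f1 = sum(
        -10 * math.exp(-0.2 * math.sqrt(x[i] ** 2 + x[i + 1] ** 2))
        for i in range(2)
    )
    # Function 2: sum over all three components
    f2 = sum(abs(xi) ** 0.8 + 5 * math.sin(xi ** 3) for xi in x)
    return f1, f2


f1_value, f2_value = kursawe([0.0, 0.0, 0.0])
print(f1_value, f2_value)  # -20.0 0.0
```

At the origin each of the two terms of function 1 equals -10, which is why the first objective evaluates to -20 there.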

### Kursawe Python Script

We provide a Python script (`microservices/kursawe.py`) which has two endpoints accessed via command line arguments:
1. `--f1` corresponding to `function 1`
2. `--f2` corresponding to `function 2`

The `kursawe.py` script also takes one additional required argument, `--vector`, which takes a list corresponding to `x_1` through `x_3`.

It is important that all endpoints of your Python script provide output in JSON format. For example, the endpoints of the Kursawe Python script provide output in the following format:

```json
{
    "f1": 0.000
}
```
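
To make the JSON convention concrete, here is a minimal sketch of how a script with `--f1`, `--f2`, and `--vector` flags could be laid out using `argparse`. It is an illustration only and is not the actual `kursawe.py`: the helper names `evaluate` and `main`, and the choice to parse the vector with `json.loads`, are assumptions.

```python
import argparse
import json
import math


def evaluate(vector, want_f1, want_f2):
    """Compute the requested Kursawe objectives for a length-3 vector."""
    result = {}
    if want_f1:
        result["f1"] = sum(
            -10 * math.exp(-0.2 * math.sqrt(vector[i] ** 2 + vector[i + 1] ** 2))
            for i in range(2)
        )
    if want_f2:
        result["f2"] = sum(abs(x) ** 0.8 + 5 * math.sin(x ** 3) for x in vector)
    return result


def main(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical Kursawe endpoint script")
    parser.add_argument("--f1", action="store_true", help="evaluate function 1")
    parser.add_argument("--f2", action="store_true", help="evaluate function 2")
    # Assumption: the vector arrives as a JSON list such as "[0,1,2]"
    parser.add_argument("--vector", required=True, type=json.loads)
    args = parser.parse_args(argv)
    # Emit exactly one JSON object so the caller can parse stdout directly
    payload = json.dumps(evaluate(args.vector, args.f1, args.f2))
    print(payload)
    return payload


# Example invocation, equivalent to: python script.py --f1 --vector "[0, 0, 0]"
example = main(["--f1", "--vector", "[0, 0, 0]"])
```

Printing a single JSON object and nothing else on standard output keeps the result easy to parse for whatever process launched the script.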

### Kursawe Definition File

We also provide a definition file (`microservices/kursawe.def`) which Apptainer can use to build an image that runs our Kursawe microservice functions as separate applications. A few notes on how this definition file should be structured to be compatible with the LINQX infrastructure:

1. All input parameters that a microservice needs to run should be passed in as environment variables if possible. For example, in the Kursawe microservice we supply the vector from the environment variable `$VECTOR`.

    ```bash
    python kursawe.py --f1 --f2 --vector="$VECTOR"
    ```

    Default values can be set in the `%environment` section:

    ```bash
    %environment
        export VECTOR="[0,1,2]"
    ```

2. We recommend defining an app for each endpoint of the microservice. For example, the Kursawe microservice has two apps, one for running each function.

    ```bash
    %apprun f1 # Runs function 1
        conda run -n torch_env python /kursawe.py --f1 --vector="$VECTOR"
    
    %apprun f2 # runs function 2
        conda run -n torch_env python /kursawe.py --f2 --vector="$VECTOR"
    ```

3. If you want to give the user the ability to run all applications at once, we recommend doing this in the runscript.

    ```bash
    %runscript # Runs function 1 and function 2
        conda run -n torch_env python /kursawe.py --f1 --f2 --vector="$VECTOR"
    ```
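
The environment-variable pattern from the notes above can be tried without a container. The sketch below uses a plain Python child process as a stand-in for a containerized endpoint: the parent places `VECTOR` in the child's environment, and the child reads it back and prints JSON. It only illustrates the pattern and makes no claim about how LINQX or Apptainer pass the variable internally.

```python
import json
import os
import subprocess
import sys

# Child process standing in for an endpoint: it reads VECTOR from its
# environment, much like the %apprun sections expand "$VECTOR".
child_code = (
    "import json, os; "
    "print(json.dumps({'vector': json.loads(os.environ['VECTOR'])}))"
)

# Copy the current environment and add the parameter as a JSON string
env = dict(os.environ, VECTOR=json.dumps([0, 1, 2]))

completed = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
output = json.loads(completed.stdout)
print(output)
```

Passing parameters through the environment keeps the command line inside the definition file fixed, so the same app can be rerun with new inputs without rebuilding the image.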


### Building the Image

To build the Apptainer image used in this notebook, navigate to the `microservices` folder and run the following command:

```bash
apptainer build kursawe.sif kursawe.def
```

Please be patient while the container builds, as it may take a while to download and install the required Anaconda packages.

Once the image has finished building, you should see the `kursawe.sif` file in the `microservices` directory. You are now ready to proceed to the next steps.

### Imports


In [None]:
%matplotlib inline
import sys
sys.path.append("../JSON/")

from models.parameter.base import ParameterModel
from models.command.container import ApptainerDriverCommand
from models.workflow.base import BaseDriverWorkflow
from models.optimizer.base import BaseObjectiveFunction

import torch
from evotorch import Problem
from evotorch.algorithms import SteadyStateGA
from evotorch.operators import GaussianMutation
from evotorch.logging import PandasLogger

import matplotlib.pyplot as plt
import pandas as pd

### Defining the `ParameterModel` Object and `Parameter` Class

The Kursawe function will take in a vector of length 3 so we need to define a `ParameterModel` object which can provide a template for the vector.

After that we convert the `ParameterModel` object to a `Parameter` class using the `.to_param()` method which gives us a way to build vectors which are confined to the bounds we set in the model.

In [None]:
# Create parameter model for kursawe optimization
vector_model = ParameterModel(
    name='Vector',
    data_type='float',  # The vector is a list of floats
    upper_limit=5,      # The upper bound of the optimization problem is 5
    lower_limit=-5,     # The lower bound of the optimization problem is -5
    default=[0,0,0],    # Give all vectors a default of [0,0,0]
    is_list=True        # The vector is a list
)

# Create the parameter class
Vector = vector_model.to_param()
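
To illustrate what it means for a vector to be confined to bounds, here is a plain-Python sketch of a bounded list parameter. The `BoundedVector` class is hypothetical and is not the class produced by `.to_param()`; it only shows the kind of check such a parameter could perform.

```python
from dataclasses import dataclass, field


@dataclass
class BoundedVector:
    """Hypothetical stand-in for a bounded list parameter (not the LINQX class)."""

    lower_limit: float = -5.0
    upper_limit: float = 5.0
    value: list = field(default_factory=lambda: [0.0, 0.0, 0.0])

    def __post_init__(self):
        # Reject any element that falls outside the configured bounds
        for v in self.value:
            if not self.lower_limit <= v <= self.upper_limit:
                raise ValueError(
                    f"{v} is outside [{self.lower_limit}, {self.upper_limit}]"
                )


inside = BoundedVector(value=[-1, 3, 4])  # accepted: all elements within [-5, 5]
try:
    BoundedVector(value=[6, 0, 0])        # rejected: 6 exceeds the upper limit
    rejected = False
except ValueError:
    rejected = True
print(inside.value, rejected)
```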

### Defining the `ApptainerDriverCommand` Objects

After building the `Parameter` class for the vector, we must define an `ApptainerDriverCommand` for each instance of a microservice which we want to run. This is designed to work with parallelization as each command is linked to its own instance of the microservice. When you call the command on parameters (`__call__` override), it will run the command as it is specified in its definition.

Here we set up two `ApptainerDriverCommand` objects, one for running the `f1` app and another for running the `f2` app. `ApptainerDriverCommand` objects can also be set up to run the runscript or run specific apps at execution time.

In [None]:
IMAGE_NAME = "../microservices/kursawe.sif"
# Build a command for running f1 of the kursawe microservice (app f1)
f1 = ApptainerDriverCommand(
    name='run_function_1',
    parameters={
        "VECTOR": Vector()                  # The only parameter is the vector
    },
    uuid="cluster",                         
    image_name=IMAGE_NAME, # The path to the .sif file
    fn=None,
    run_app=True,                           # We are running an app of the instance
    app='f1',                               # The app we are running is f1 (function 1)
    start_delay=0,                          # There is no need to have a start delay
    has_return=True,                        # We are expecting a result to be returned
)

# Build a command for running f2 of the kursawe microservice (app f2)
f2 = ApptainerDriverCommand(
    name='run_function_2',
    parameters={
        "VECTOR": Vector()   
    },
    uuid="cluster",
    fn=None,
    image_name=IMAGE_NAME,
    run_app=True,
    app='f2',                               # We now want to run the f2 app instead of f1
    start_delay=0,
    has_return=True,
)

We can also run the runscript of the Apptainer image by setting the `run_script` attribute of the `ApptainerDriverCommand` object to True.

In [None]:
# Build a command for running f1 and f2 of the kursawe microservice (runscript)
f1_and_f2 = ApptainerDriverCommand(
    name="run_function_1_and_2",
    parameters={
        "VECTOR": Vector()
    },
    uuid="cluster",
    fn=None,
    image_name=IMAGE_NAME,
    run_script=True,
    start_delay=0,
    has_return=True
)

### Starting Apptainer Instances Linked to the Commands

You can start an instance associated with a specific command by calling the `.start()` method. This will start a new Apptainer instance of the image defined in the command and assign that instance to the `ApptainerDriverCommand` (`._instance` attribute).

Note that it is not required to call the `.start()` method prior to running the command, as the `__call__` override will start a new instance if no instance is currently assigned to the command.

In [None]:
# Start the microservices (instances) linked to each command
f1.start()
f2.start()
f1_and_f2.start()
# List the names of the running instances
for instance in f1._client.instances(quiet=True):
    print(instance.name)

### Stopping Apptainer Instances Linked to the Commands

If you need to stop an Apptainer instance that is associated with a specific `ApptainerDriverCommand`, you can call the `.stop()` method. This will stop the current instance that is associated with the `ApptainerDriverCommand`, but another instance can be started for the command by calling the `.start()` method or by just calling the command (`__call__` override).

In [None]:
# Stop the microservices (instances) linked to each command
f1.stop()
f2.stop()
f1_and_f2.stop()

### Running Apptainer Instances Linked to the Commands

To run microservices that are linked to `ApptainerDriverCommand` objects all that needs to be done is to call the object as a function (`__call__` override) with the appropriate parameters passed in as arguments. 

Since we set the parameter `VECTOR` to take in a vector of length 3, we call `f1`, `f2`, and `f1_and_f2` with a vector of length 3.

In [None]:
output_1 = f1(VECTOR=[-1,3,4])
output_2 = f2(VECTOR=[-1,3,4])
output_1_2 = f1_and_f2(VECTOR=[-1,3,4])
print(f"Output of running app f1: {output_1}")
print(f"Output of running app f2: {output_2}")
print(f"Output of running runscript: {output_1_2}")

### Building a Workflow using Apptainer Based Microservices

We can build our commands into a `DriverWorkflow` object which can be used for closed-loop optimization. This workflow takes in a list of `DriverCommand` objects which are executed in linear order.

After initialization of the `DriverWorkflow`, we can run the workflow using the `.exec()` method and providing appropriate values for the `list_kwargs` and `list_save_vars` arguments:

- `list_kwargs`: This argument takes a list of argument mappings that will be mapped into each command in the workflow at runtime. If we do not want to provide any arguments for a specific command at runtime, we set the position in the list corresponding to that command to `None`.

- `list_save_vars`: This argument takes a list of variables to save off from the JSON output of each command (provided as a dictionary of string keys and string values) to the workflow's set of global variables. If no variables are supposed to be saved off for a specific command, we set the position in the list corresponding to that command to `None`.

For example, in the Kursawe workflow we want to provide both `f1` and `f2` with the same vector (`{"VECTOR": [-1,3,4]}`), and we want to save off the output of `f1` in the workflow globals with the key `"f1"` and the output of `f2` with the key `"f2"`.

In [None]:
# Build Workflow corresponding to Kursawe function
kursawe_workflow = BaseDriverWorkflow(
    name="kursawe_workflow",
    commands=[
        f1,
        f2,
    ],
)
workflow_output = kursawe_workflow.exec(
    list_kwargs=[
        {"VECTOR": [-1,3,4]},
        {"VECTOR": [-1,3,4]},
    ],
    list_save_vars=[
        {"f1":"f1"},
        {"f2":"f2"}
    ]
)
print(workflow_output)
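
The roles of `list_kwargs` and `list_save_vars` can be sketched in plain Python. The function below is a simplified illustration, not the LINQX implementation of `.exec()`. It assumes that each `list_save_vars` dictionary maps a key in the command's JSON output to the name under which the value is stored in the workflow globals, and it uses two placeholder functions (`fake_f1`, `fake_f2`) in place of real commands.

```python
def run_linear_workflow(commands, list_kwargs, list_save_vars):
    """Run commands in order and collect selected outputs in a globals dict."""
    workflow_globals = {}
    for command, kwargs, save_vars in zip(commands, list_kwargs, list_save_vars):
        # None means "no runtime arguments" for this command
        output = command(**(kwargs or {}))
        # None means "save nothing" for this command
        for output_key, global_key in (save_vars or {}).items():
            workflow_globals[global_key] = output[output_key]
    return workflow_globals


# Placeholder commands that return JSON-like dictionaries
def fake_f1(VECTOR):
    return {"f1": float(sum(VECTOR))}


def fake_f2(VECTOR):
    return {"f2": float(max(VECTOR))}


sketch_globals = run_linear_workflow(
    commands=[fake_f1, fake_f2],
    list_kwargs=[{"VECTOR": [-1, 3, 4]}, {"VECTOR": [-1, 3, 4]}],
    list_save_vars=[{"f1": "f1"}, {"f2": "f2"}],
)
print(sketch_globals)  # {'f1': 6.0, 'f2': 4.0}
```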

### Building an Objective (Fitness) Function using a Workflow

In order to run closed-loop optimization on your Apptainer microservices, you need to build your microservice workflow into an objective (fitness) function that is compatible with the optimization package of your choice.

In this notebook, we are using the EvoTorch optimization package to optimize our workflow. Thus we use the `BaseObjectiveFunction` class which is compatible with EvoTorch optimizers. This class has the following attributes:

- `workflow`: This argument should be the workflow object that will be called (via `.exec()` method) in this objective function

- `order_kwargs`: This argument should be a list with one entry per command in the workflow. Each entry maps a parameter name to the positions (a tuple) of the objective function's input tensor that will be passed to that command

- `list_save_vars`: Same as previous `list_save_vars` argument

- `fitness_criteria`: This argument should be a list of variables to access from the workflow globals and assess as fitness function criteria during optimization

For this example, we assign `workflow` to the `kursawe_workflow` we built, `order_kwargs` to set the `VECTOR` parameter to all positions of the input tensor, `list_save_vars` to save off the output of both functions, and `fitness_criteria` to be the output of both functions.

In [None]:
kursawe_obj_fn = BaseObjectiveFunction(
    name="kursawe_obj_fn",
    workflow=kursawe_workflow,
    order_kwargs=[
        {"VECTOR": (0,3)},
        {"VECTOR": (0,3)},
    ],
    list_save_vars=[
        {"f1":"f1"},
        {"f2":"f2"},
    ],
    fitness_criteria=["f1","f2"]
)

print(f"Objective function output tensor: {kursawe_obj_fn(torch.Tensor([-1,3,4]))}")
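
As a rough illustration of how positions of the input tensor could be turned into per-command arguments, the sketch below slices a plain list instead of a tensor. It assumes that a tuple such as `(0,3)` denotes a half-open range of positions; that interpretation, and the helper name `build_list_kwargs`, are assumptions and not taken from the LINQX source.

```python
def build_list_kwargs(solution, order_kwargs):
    """Map slices of a candidate solution onto per-command keyword arguments."""
    list_kwargs = []
    for mapping in order_kwargs:
        if mapping is None:
            # No runtime arguments for this command
            list_kwargs.append(None)
            continue
        list_kwargs.append(
            {name: list(solution[start:end]) for name, (start, end) in mapping.items()}
        )
    return list_kwargs


candidate = [-1.0, 3.0, 4.0]
sketch_kwargs = build_list_kwargs(
    candidate,
    [{"VECTOR": (0, 3)}, {"VECTOR": (0, 3)}],
)
print(sketch_kwargs)
```

Under this assumption both commands receive the full three-element vector, which matches the description of setting `VECTOR` to all positions of the input tensor.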

### Optimization of the Objective Function using EvoTorch

After wrapping the workflow in a fitness function that is compatible with the chosen optimization package, we need to define how the optimization algorithm traverses the search space defined by our function. Using EvoTorch, this is accomplished in the following manner:

1. Defining a Problem: The `Problem` class defines a search space and its bounds. We need to create one which searches the space defined by our objective function.

2. Defining a Search Algorithm: We need some form of search algorithm which will traverse the space defined by our problem in a directed manner. We choose the `SteadyStateGA` class and tell it to use the `GaussianMutation` operator, providing us with a basic evolutionary algorithm. We set the size of the population to 10, meaning that it will initialize and run 10 separate traversals through the search space.

3. Run the Search Algorithm: We will run each population member of the search algorithm for a specific number of steps, allowing it to traverse the space in a directed manner.

In [None]:
# Define the problem which contains our search space
problem = Problem(
    objective_sense=["min","min"],      # In the Kursawe function we want to minimize both f1 and f2
    objective_func=kursawe_obj_fn,      # We want the objective function to be the objective function we created
    bounds=(
        Vector().lower_limit,           # Our lower limit is based on our Vector parameter (-5)
        Vector().upper_limit,           # Our upper limit is based on our Vector parameter (5)
    ),
    solution_length=3,                  # The input vector to the Kursawe function has 3 positions
)

# Define the search algorithm 
searcher = SteadyStateGA(
    problem=problem, 
    popsize=10                          # Our search algorithm will keep track of 10 simultaneous runs
)  

# Have the search algorithm use the Gaussian mutation operator
searcher.use(
    GaussianMutation(problem,stdev=0.1) # Our Gaussian mutation operator will have a standard deviation of 0.1
)
logger = PandasLogger(searcher)
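
For intuition about the mutation step, here is a small standard-library sketch of Gaussian mutation: each component of a candidate is perturbed with zero-mean normal noise and then clipped back into the bounds. Clipping is one common way to respect bounds and is an assumption here; the sketch does not claim to mirror the internals of EvoTorch's `GaussianMutation`.

```python
import random


def gaussian_mutation(solution, stdev, lower, upper, rng):
    """Add N(0, stdev^2) noise to each component, then clip to [lower, upper]."""
    return [
        min(upper, max(lower, x + rng.gauss(0.0, stdev)))
        for x in solution
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
parent = [-1.0, 3.0, 4.95]
child = gaussian_mutation(parent, stdev=0.1, lower=-5.0, upper=5.0, rng=rng)
print(parent, child)
```

With a standard deviation of 0.1 and a search range of width 10, each mutation is a small local step, so most of the exploration comes from repeating the step over many iterations.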

We are going to run each seed of the search algorithm for 10 iterations. This may take a couple of minutes depending on your system specifications.

In [None]:
searcher.run(10)

### Visualization of Optimization Results

Now let's see how well we were able to optimize the two objectives over the 10 iterations.

In [None]:
eval_df = logger.to_dataframe()
print(eval_df)
eval_df["obj0_pop_best_eval"].plot(alpha=0.4, marker=".", label="Function 1")
eval_df["obj1_pop_best_eval"].plot(alpha=0.4, marker=".", label="Function 2")
plt.legend(title="Function")
plt.ylabel("F(x)")
plt.xlabel("Iteration")
plt.title("Kursawe Function Optimization")
plt.show()

We can also see the results of all of the seeds after 10 iterations

In [None]:
# Collect the objective values of every member of the final population
results = [list(elem.access_evals()) for elem in searcher.population]
population_df = pd.DataFrame(results).astype(float)
population_df = population_df.set_axis(["objective_0", "objective_1"], axis=1)
print(population_df)
plt.clf()
plt.scatter(population_df["objective_0"], population_df["objective_1"], alpha=0.4, color="g")
plt.xlabel("F1(x)")
plt.ylabel("F2(x)")
plt.title("Kursawe Optimization Population Seed Values after 10 Iterations")
plt.show()

### Microservices that run on GPU

There are cases where microservices need GPU access to function correctly. While the Kursawe optimization example shown previously does not, we have created a simple image from the NVIDIA base CUDA image which needs GPU access to run correctly.

Go ahead and build `nvidia_smi.sif` from `nvidia_smi.def` located in the `microservices` folder:

```bash
apptainer build nvidia_smi.sif nvidia_smi.def
```

The runscript of this image runs the `nvidia-smi` command, which shows system GPU information if the container has access to the system GPU and fails otherwise.


We start by creating a command without GPU access and running it

In [None]:
NVIDIA_IMAGE_NAME = '../microservices/nvidia_smi.sif'
nvidia_smi_no_gpu = ApptainerDriverCommand(
    name='nvidia-smi',
    uuid="cluster",                         
    image_name=NVIDIA_IMAGE_NAME, # The path to the .sif file
    fn=None,
    run_script=True,                            # We are running the runscript of the instance                           
    start_delay=0,                              # There is no need to have a start delay                         
    verbose=True
)

nvidia_smi_no_gpu()

Now let's set the command's `gpu` attribute to `True` and see whether the output changes to show the GPU listing.

In [None]:
nvidia_smi_gpu = ApptainerDriverCommand(
    name='nvidia-smi',
    uuid="cluster",                         
    image_name=NVIDIA_IMAGE_NAME, # The path to the .sif file
    fn=None,
    run_script=True,                            # We are running the runscript of the instance                           
    start_delay=0,                              # There is no need to have a start delay               
    gpu=True,
    verbose=True
)

nvidia_smi_gpu()

Note that GPU drivers are mounted upon instance start, so if you did not initially start or run the command with GPU access, you will need to restart the command's instance (`.start()`) to get the GPU drivers mounted.
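
On the command line, Apptainer exposes NVIDIA GPU support through the `--nv` flag, which is supplied when an instance is started. The sketch below builds the argument list for `apptainer instance start` with and without that flag. It is an assumption that the `gpu` attribute corresponds to `--nv`; the helper `instance_start_args` and the instance name `nvidia_demo` are hypothetical and the command is only constructed, not executed.

```python
def instance_start_args(image, name, gpu=False):
    """Build the argv for starting an Apptainer instance, optionally with --nv."""
    args = ["apptainer", "instance", "start"]
    if gpu:
        # --nv asks Apptainer to make the host NVIDIA driver available
        args.append("--nv")
    args += [image, name]
    return args


cpu_args = instance_start_args("../microservices/nvidia_smi.sif", "nvidia_demo")
gpu_args = instance_start_args("../microservices/nvidia_smi.sif", "nvidia_demo", gpu=True)
print(" ".join(cpu_args))
print(" ".join(gpu_args))
```

Because the flag belongs to the start command, an instance that was started without it has to be started again before GPU access becomes available.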

### Stopping Instances

It is important to note that Apptainer instances linked to commands are not automatically stopped upon exiting a Python script or Jupyter notebook. They need to be stopped via the `ApptainerDriverCommand` object's `.stop()` method.

You can also stop all instances with the following command

```bash
apptainer instance stop --all
```

In [None]:
f1.stop()
f2.stop()
f1_and_f2.stop()
nvidia_smi_no_gpu.stop()
nvidia_smi_gpu.stop()
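
Because instances are not stopped automatically, it can help to wrap start and stop in a context manager so that `.stop()` runs even if an error occurs in between. The sketch below relies only on the objects having `.start()` and `.stop()` methods; the `running` helper and the `FakeCommand` class are hypothetical stand-ins used to keep the example self-contained.

```python
from contextlib import contextmanager


@contextmanager
def running(*commands):
    """Start every command on entry and stop the started ones on exit."""
    started = []
    try:
        for command in commands:
            command.start()
            started.append(command)
        yield commands
    finally:
        # Stop in reverse order, even if the body raised an exception
        for command in reversed(started):
            command.stop()


class FakeCommand:
    """Stand-in with the same start/stop interface, recording calls."""

    def __init__(self, name, events):
        self.name = name
        self.events = events

    def start(self):
        self.events.append(f"start {self.name}")

    def stop(self):
        self.events.append(f"stop {self.name}")


events = []
with running(FakeCommand("f1", events), FakeCommand("f2", events)):
    events.append("work")
print(events)  # ['start f1', 'start f2', 'work', 'stop f2', 'stop f1']
```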

### Tested
```
Python 3.10.13
apptainer version 1.1.7-1.el7
```