[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jrkasprzyk/CVEN5393/blob/main/epsilon_nondominance_from_file.ipynb)

*This notebook is part of course notes for CVEN 5393: Water Resource Systems and Management, by Prof. Joseph Kasprzyk at CU Boulder.*

In this notebook, we will perform epsilon non-dominated sorting of solutions, using the Platypus Python library. Generic solution data is included.

# Install platypus-opt and load packages

In [33]:
!pip install platypus-opt



In [34]:
from platypus import Problem, Solution, EpsilonBoxArchive
import numpy as np
import pandas as pd

# Functions to perform the sorting

In [35]:
def df_to_pt(df, objective_directions, nobjs, nvars=0, nconstrs=0, objective_names=None):
  # assumption: unless column names are given, the first nobjs
  # columns of df hold the objectives
  if objective_names is None:
    objective_names = list(df.columns[:nobjs])

  problem = Problem(nvars=nvars, nobjs=nobjs, nconstrs=nconstrs)
  pt = []
  for index, row in df.iterrows():
    # create solution object
    solution = Solution(problem)

    # save an id for which row of the original
    # dataframe this solution came from. really important
    # for cross-referencing things later!
    solution.id = index

    # populate the objective values into platypus, correcting
    # the maximized objectives by multiplying by -1
    for j in range(nobjs):
      if objective_directions[j] == 'minimize':
        solution.objectives[j] = row[objective_names[j]]
      elif objective_directions[j] == 'maximize':
        solution.objectives[j] = -1.0*row[objective_names[j]]
      else:
        raise ValueError(f"unknown objective direction: {objective_directions[j]}")

    # add the solution to the list
    pt.append(solution)
  return pt
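
The sign flip used for maximized objectives can be checked on its own, without Platypus. The following is a minimal sketch; `to_minimization` is a hypothetical helper name introduced here for illustration, and the two values are illustrative only.

```python
def to_minimization(values, directions):
    """Return a copy of `values` in which every maximized objective is negated."""
    if len(values) != len(directions):
        raise ValueError("need exactly one direction per objective")
    converted = []
    for value, direction in zip(values, directions):
        if direction == 'minimize':
            converted.append(value)
        elif direction == 'maximize':
            # a larger-is-better value becomes smaller-is-better
            converted.append(-1.0 * value)
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return converted

# a cost to minimize and a reliability to maximize
flipped = to_minimization([85.0, 0.9], ['minimize', 'maximize'])
print(flipped)  # [85.0, -0.9]
```

After this conversion every objective can be treated as "smaller is better", which is what the sorting step relies on.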

In [36]:
def label_eps_nd(df, label_col, epsilons, objective_directions, nobjs, nvars, nconstrs=0):

  # reset the label column
  df[label_col] = False

  # convert to platypus format
  pt = df_to_pt(df, objective_directions, nobjs, nvars, nconstrs)

  # save the epsilon non-dominated solutions to a new list of platypus solutions
  eps_pt = EpsilonBoxArchive(epsilons)
  for solution in pt:
    eps_pt.add(solution)

  # save which ids ended up being epsilon non-dominated
  eps_ids = [sol.id for sol in eps_pt]

  # add labels to the epsilon non-dominated solutions
  for sol_id in eps_ids:
    df.at[sol_id, label_col] = True

  return df
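
To build intuition for what the archive is testing, here is a simplified sketch of epsilon-box dominance in plain Python. It is not the Platypus implementation: `box_index` and `eps_dominates` are hypothetical names, all objectives are assumed to be minimized, and the rule for two solutions that land in the same box is left out.

```python
import math

def box_index(objectives, epsilons):
    """Index of the epsilon box containing a point (all objectives minimized)."""
    return tuple(math.floor(o / e) for o, e in zip(objectives, epsilons))

def eps_dominates(a, b, epsilons):
    """True if the box of `a` is no worse than the box of `b` in every
    objective and strictly better in at least one."""
    box_a = box_index(a, epsilons)
    box_b = box_index(b, epsilons)
    no_worse = all(x <= y for x, y in zip(box_a, box_b))
    better = any(x < y for x, y in zip(box_a, box_b))
    return no_worse and better

# illustrative values only
example_epsilons = [10.0, 1.0]
a = [12.0, 3.5]   # box (1, 3)
b = [27.0, 3.9]   # box (2, 3)
c = [5.0, 8.2]    # box (0, 8)

print(eps_dominates(a, b, example_epsilons))  # True
print(eps_dominates(a, c, example_epsilons))  # False
print(eps_dominates(c, a, example_epsilons))  # False
```

Here `a` and `c` trade off against each other, so neither removes the other, while `b` is removed by `a`.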

# Prepare list of all solutions

In this step, we need to prepare the solutions that will be analyzed in the sorting process.

A solution to a multi-objective problem comprises the following types of data:


*   *Decision Variables (defining actions, not included in this example)*
*   Objectives (multiple measures of the solution's performance)
*   *Constraint Violations (not included in this example)*
*   *Extra Metrics (other measures of the solution's performance, not included in this example)*

In the dataframe, each solution occupies its own row, and the objectives (and any other variables) are stored in columns. We will call the dataframe `all_solutions_df`. Later, when we do the sorting, we will add columns to store flags that say whether or not a solution is epsilon non-dominated.





In [42]:
# Create dataframe from a dict
# https://builtin.com/data-science/dictionary-to-dataframe

num_decs = 0

objective_names = [
    'Cost',
    'Reservoir Capacity',
    'Reliability',
    'Worst-Case Shortfall',
    'Average Length of Shortfall'
]
num_objs = len(objective_names)

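# one direction per objective, in the same order as objective_names
objective_directions = [
    'minimize',  # Cost
    'maximize',  # Reservoir Capacity
    'maximize',  # Reliability
    'minimize',  # Worst-Case Shortfall
    'minimize'   # Average Length of Shortfall
]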

solutions = {
    '1':  [110.1, 3.3, 0.95, 0.1,  2],
    '2':  [85,    2.1, 0.9,  0.3,  3],
    '3':  [0.1,   1.0, 0.6,  0.4,  6],
    '4':  [0.0,   1.0, 0.5,  0.45, 6],
    '5':  [90.0,  2.7, 0.92, 0.4,  4],
    '6':  [82.0,  3.2, 0.92, 0.4,  4],
    '7':  [50,    6.0, 0.8,  0.1,  1],
    '8':  [55,    4.0, 0.5,  0.05, 3],
    '9':  [120,   5.0, 1.0,  0.4,  1],
    '10': [110.1, 3.3, 0.9,  0.05, 2],
    '11': [39,    1.6, 0.6,  0.6,  2],
    '12': [20,    1.5, 0.6,  0.5,  4]
}
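
Because the names, directions, and solution values are typed in by hand, it can help to confirm that their lengths agree before sorting. The check below is a minimal sketch; `check_solution_table` is a hypothetical helper, and the small inputs are illustrative rather than the data above.

```python
def check_solution_table(names, directions, solutions):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    if len(directions) != len(names):
        problems.append(
            f"{len(names)} objective names but {len(directions)} directions")
    for key, values in solutions.items():
        if len(values) != len(names):
            problems.append(
                f"solution {key!r} has {len(values)} values, expected {len(names)}")
    return problems

names = ['Cost', 'Reliability']
good = check_solution_table(names, ['minimize', 'maximize'], {'1': [85.0, 0.9]})
bad = check_solution_table(names, ['minimize'], {'1': [85.0, 0.9, 3.0]})
print(good)      # []
print(len(bad))  # 2
```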

We will save the data in a dataframe and add a new column to store the results of the sorting. When we do an epsilon non-domination sort, each solution will be labeled `True` if it is epsilon non-dominated and `False` if not.

In [43]:
all_solutions_df = pd.DataFrame.from_dict(
    solutions,
    orient='index',
    columns=objective_names)

all_solutions_df["Eps Nd"] = False

# Perform Epsilon Non-dominated Sort

In [51]:
epsilons = [100,    #cost
            0.1,   #capacity
            0.05,  #reliability
            0.01,  #worst-case shortfall
            3,     #length of shortfall
            ]
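
The size of each epsilon sets how coarsely that objective is resolved. As a rough sketch (assuming boxes are indexed by flooring the objective value divided by its epsilon), the cost epsilon of 100 places several of the example costs in the same box:

```python
import math

cost_epsilon = 100  # same value as the cost entry of `epsilons`
costs = {'2': 85, '7': 50, '9': 120}  # Cost values of solutions 2, 7, and 9

# box index along the cost axis for each solution
cost_boxes = {key: math.floor(cost / cost_epsilon) for key, cost in costs.items()}
print(cost_boxes)  # {'2': 0, '7': 0, '9': 1}
```

Solutions 2 and 7 differ in cost by 35, but at this resolution they fall in the same cost box, so cost alone does not separate them. A smaller epsilon would.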

In [52]:
all_solutions_df = label_eps_nd(all_solutions_df, "Eps Nd", epsilons, objective_directions, num_objs, num_decs, 0)

In [53]:
all_solutions_df

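|    | Cost | Reservoir Capacity | Reliability | Worst-Case Shortfall | Average Length of Shortfall | Eps Nd |
|----|------|--------------------|-------------|----------------------|-----------------------------|--------|
| 1  | 110.1 | 3.3 | 0.95 | 0.1  | 2 | True  |
| 2  | 85.0  | 2.1 | 0.9  | 0.3  | 3 | True  |
| 3  | 0.1   | 1.0 | 0.6  | 0.4  | 6 | False |
| 4  | 0.0   | 1.0 | 0.5  | 0.45 | 6 | False |
| 5  | 90.0  | 2.7 | 0.92 | 0.4  | 4 | False |
| 6  | 82.0  | 3.2 | 0.92 | 0.4  | 4 | True  |
| 7  | 50.0  | 6.0 | 0.8  | 0.1  | 1 | True  |
| 8  | 55.0  | 4.0 | 0.5  | 0.05 | 3 | True  |
| 9  | 120.0 | 5.0 | 1.0  | 0.4  | 1 | True  |
| 10 | 110.1 | 3.3 | 0.9  | 0.05 | 2 | True  |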


Saving the data in this manner allows us to see which of the ‘original’ solutions survived the test.

The code in this notebook assumes that all the objectives are included in the epsilon non-dominated sorting process. With some modifications, however, you could perform multiple ‘tests’ on your solutions. For example, suppose each row of the full dataframe carried a label indicating which optimization experiment it came from. You could then show which solutions are epsilon non-dominated across all experiments, within a single experiment, and so on. You simply repeat the same procedure, assigning a different label to the original set each time.
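
The idea of running several tests on the same set can be sketched as follows. To keep the example free of dependencies, it uses a plain Pareto filter as a stand-in for the epsilon sort; the experiment labels, ids, and objective values are hypothetical, and all objectives are assumed to be minimized.

```python
from collections import defaultdict

def dominates(a, b):
    """Plain Pareto dominance for minimized objectives (no epsilon boxes)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_ids(rows):
    """Ids of rows that no other row dominates. `rows` maps id -> objectives."""
    return {
        i for i, obj_i in rows.items()
        if not any(dominates(obj_j, obj_i) for j, obj_j in rows.items() if j != i)
    }

# hypothetical data: (experiment label, objectives) for each solution id
data = {
    'a1': ('A', [1.0, 5.0]),
    'a2': ('A', [2.0, 6.0]),
    'b1': ('B', [0.5, 4.0]),
    'b2': ('B', [3.0, 1.0]),
}

# test 1: sort within each experiment separately
by_experiment = defaultdict(dict)
for sol_id, (label, objs) in data.items():
    by_experiment[label][sol_id] = objs
nd_within = set()
for rows in by_experiment.values():
    nd_within |= nondominated_ids(rows)

# test 2: sort across all experiments at once
nd_across = nondominated_ids({k: objs for k, (_, objs) in data.items()})

print(sorted(nd_within))  # ['a1', 'b1', 'b2']
print(sorted(nd_across))  # ['b1', 'b2']
```

Solution `a1` survives within experiment A but not across both experiments. In the notebook, the same pattern would presumably amount to calling `label_eps_nd` once per test with a different `label_col`.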