# Anytime analysis of moea-benchmark data
> Reproducing the plots and tables from the ECJ paper that consider outputs over a range of maximum function evaluations.

- toc: true 
- badges: true
- comments: true
- categories: [jupyter]
- image: images/anytime.png

This notebook has been rendered as an HTML page for easier navigation. It is also available for cloning, and can be executed online using Binder or Colab.

Below, you will find the figures and tables from the ECJ paper that consider only outputs over a range of maximum function evaluations ($\textit{FE}_\textit{max}$), which we dub **anytime analysis**.

In the paper, we mostly focused on selected scenarios, for brevity. In this notebook, results are first presented as in the paper, and then provided for more experimental scenarios, when possible.

Finally, we remark that this first version of the notebook does not include Section 7 plots.

---
## Setup

The data for anytime analysis is provided in the original [moea-benchmark repository](https://github.com/leobezerra/moea-benchmark), and can be read using the `pandas` data science library for Python.

In [1]:
#collapse-hide
import pandas as pd
df_anytime = pd.read_csv("https://github.com/leobezerra/moea-benchmark/raw/master/anytime.csv.gz")
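As a minimal sketch of what this read does, the snippet below writes a tiny gzip-compressed CSV to a temporary directory and loads it back. The file name `toy.csv.gz` and its contents are made up for illustration. By default `pd.read_csv` infers the compression from the `.gz` extension, and the numeric header entries come back as string column labels, which is why the index is cast to integers later in this notebook.

```python
import gzip
import os
import tempfile

import pandas as pd

# Hypothetical miniature of the anytime file: metadata columns followed by
# one column per function-evaluation checkpoint (values are made up).
csv_text = "algo,seed,0,100\nibea,1,0.4,0.2\nibea,2,0.2,0.1\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.csv.gz")
    with gzip.open(path, "wt") as handle:
        handle.write(csv_text)
    # compression="infer" is the default, so the .gz suffix is enough.
    toy = pd.read_csv(path)

# Header entries such as "0" and "100" are read as strings, not integers.
print(toy.columns.tolist())
```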

In [2]:
#hide_input
df_anytime.head()

Unnamed: 0,setup,config,algo,indicator,nobj,problem,nvar,seed,0,100,...,49100,49200,49300,49400,49500,49600,49700,49800,49900,50000
0,default,,ibea,rpd,3,DTLZ2,30,1,0.018914,0.018914,...,2.6e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05
1,default,,ibea,rpd,3,DTLZ2,30,2,0.014705,0.014705,...,2.6e-05,2.5e-05,2.5e-05,2.6e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05
2,default,,ibea,rpd,3,DTLZ2,30,3,0.012488,0.012488,...,2.6e-05,2.5e-05,2.6e-05,2.6e-05,2.6e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.4e-05
3,default,,ibea,rpd,3,DTLZ2,30,4,0.016872,0.016872,...,2.5e-05,2.4e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05,2.5e-05
4,default,,ibea,rpd,3,DTLZ2,30,5,0.017557,0.017557,...,2.6e-05,2.6e-05,2.6e-05,2.5e-05,2.5e-05,2.5e-05,2.6e-05,2.6e-05,2.5e-05,2.4e-05


In the data above, `setup` indicates whether the settings used are default or tuned. In the latter case, `config` indicates the $\textit{FE}_\textit{max}$ value for which the settings were configured. 
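One way to check how `setup` and `config` relate is to count rows per distinct pair. The sketch below uses a hand-made stand-in for `df_anytime`: the values are invented, and the label `"tuned"` is an assumption about how the tuned setup is named in the real file.

```python
import pandas as pd

# Toy stand-in for df_anytime; "tuned" is an assumed label and the rows
# are made up for illustration only.
toy = pd.DataFrame({
    "setup": ["default", "default", "tuned", "tuned", "tuned"],
    "config": [None, None, 2500.0, 10000.0, 40000.0],
    "algo": ["ibea"] * 5,
})

# dropna=False keeps the missing config of the default setup as its own group.
pairs = (
    toy.groupby(["setup", "config"], dropna=False)
       .size()
       .reset_index(name="rows")
)
print(pairs)
```

Running the same grouping on the real `df_anytime` would list the configured $\textit{FE}_\textit{max}$ values available for each setup.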

Besides `pandas`, we will also use the Plotly interactive data visualization library.

In [3]:
#collapse-hide
import re

import plotly.express as px
import plotly.graph_objects as go

We make three adjustments to the data prior to plotting.
- To improve clarity, we fill the missing `config` values with `default`.
- Since the data was produced using 25 different seeds, we compute the mean of the runs.
- We index the data by $\textit{FE}_\textit{max}$, which greatly increases the memory usage, but is a requirement for plotting time series data with Plotly.

In [4]:
#collapse-hide
df_anytime["config"] = df_anytime["config"].fillna('default')

df_anytime_mean = df_anytime.groupby(["setup", "config", "algo", "indicator", "nobj", "problem", "nvar"]).mean()
df_anytime_mean = df_anytime_mean.drop(columns=["seed"])

ts_anytime = df_anytime_mean.stack().reset_index(
    ["setup", "config", "algo", "indicator", "nobj", "problem", "nvar"], 
    name="value"
)
ts_anytime.index = ts_anytime.index.astype("int")

In [5]:
#hide_input
ts_anytime.head()

Unnamed: 0,setup,config,algo,indicator,nobj,problem,nvar,value
0,default,default,ibea,rpd,3,DTLZ2,30,0.016579
100,default,default,ibea,rpd,3,DTLZ2,30,0.016579
200,default,default,ibea,rpd,3,DTLZ2,30,0.016579
300,default,default,ibea,rpd,3,DTLZ2,30,0.010874
400,default,default,ibea,rpd,3,DTLZ2,30,0.008861
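The reshaping above can be followed step by step on a small hand-made table. This is only a sketch: the frame `wide`, its two checkpoints, and its values are invented, and a single grouping column stands in for the seven used on the real data.

```python
import pandas as pd

# Toy wide-format data: one row per (algo, seed), one column per checkpoint.
wide = pd.DataFrame({
    "algo": ["ibea", "ibea", "nsga2", "nsga2"],
    "seed": [1, 2, 1, 2],
    "0": [0.4, 0.2, 0.5, 0.3],
    "100": [0.2, 0.1, 0.3, 0.1],
})

# Average over seeds, then drop the (now meaningless) mean of the seed ids.
mean_wide = wide.groupby("algo").mean().drop(columns=["seed"])

# stack() moves the checkpoint columns into the innermost index level;
# reset_index("algo") turns the grouping key back into a regular column,
# leaving the checkpoint labels as the index.
long = mean_wide.stack().reset_index("algo", name="value")

# The checkpoint labels are still strings at this point.
long.index = long.index.astype("int")
print(long)
```

Each checkpoint becomes its own row carrying a copy of the metadata columns, which is where the extra memory usage of the long format comes from.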


---
## Section 5: Preliminary analysis

In this notebook, we focus on figures and tables that use only anytime analysis, namely Figures 2 and 4.

The remaining figures of this section are provided in the [snapshot analysis notebook](https://leobezerra.github.io/moea-benchmark-analysis/jupyter/2022/03/30/snapshot.html).

### Figure 2

Figure 2 depicts the evolution of the $\textit{HV}_\textit{rd}$ performance of IBEA using DE as underlying EA with different numerical parameter settings, on a given experimental scenario.

A few Plotly features can be useful for navigation:
- selecting a subset of the settings, by clicking on their names in the legend
- zooming into a given range of the plot, by selecting an area of the plot

In [6]:
#collapse-hide
ts_ibea_WFG8_3_30 = ts_anytime.query("algo == 'ibea' and problem == 'WFG8' and nobj == 3 and nvar == 30")\
                              .sort_index()
fig2 = px.line(
    ts_ibea_WFG8_3_30,
    y="value",
    x=ts_ibea_WFG8_3_30.index,
    color="config",
    line_dash="config",
)

In [7]:
#hide_input
fig2.show()

We also provide code to produce the full set of plots from the data produced in this IBEA experiment.

In [8]:
#collapse_hide
ts_ibea = ts_anytime.query("algo == 'ibea'")
fig2_full = px.line(
    ts_ibea,
    y="value",
    x=ts_ibea.index,
    color="config",
    line_dash="config",
    animation_frame="problem",
    facet_col="nvar",
    facet_col_wrap=3,
    range_y=(0,0.4)
)

In [9]:
#hide_input
fig2_full.show()

We remark that, for some problems, IBEA might perform better at a given $\textit{FE}_\textit{max}$ snapshot using a setting configured for a different $\textit{FE}_\textit{max}$. 

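Yet, when we compute rank sums at the $\textit{FE}_\textit{max}$ values used for configuration, each tuned setting obtains the lowest rank sum at the $\textit{FE}_\textit{max}$ it was configured for. The tables below show, for each snapshot, the difference between each setting's rank sum and the best one.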

In [10]:
#collapse_hide
df_ibea_snapshots = ts_ibea.loc[[2500,10000,40000]].reset_index().rename(columns={"index": "FE"})
rs_ibea_snapshots = df_ibea_snapshots.drop(columns=["setup", "algo", "indicator"])\
                                     .pivot_table(index=["problem", "nobj", "nvar", "FE"], columns=["config"])\
                                     .rank(axis=1).groupby("FE").sum()

In [11]:
#hide_input
for FE in [2500, 10000, 40000]:
    rs_ibea_diff = (rs_ibea_snapshots.loc[FE] - rs_ibea_snapshots.loc[FE].min())
    display(rs_ibea_diff.sort_values().to_frame().T)

| FE / config | 2500.0 | 10000.0 | 40000.0 | default |
|---|---|---|---|---|
| 2500 | 0.0 | 30.0 | 76.0 | 90.0 |


| FE / config | 10000.0 | 40000.0 | 2500.0 | default |
|---|---|---|---|---|
| 10000 | 0.0 | 19.0 | 46.0 | 59.0 |


| FE / config | 40000.0 | 10000.0 | default | 2500.0 |
|---|---|---|---|---|
| 40000 | 0.0 | 30.0 | 33.0 | 93.0 |


### Figure 4

Figure 4 depicts the evolution of the $\textit{HV}_\textit{rd}$ performance of different MOEAs using parameter settings tuned for a common stopping criterion ($\textit{FE}_\textit{max} = 10,000$) on a given experimental scenario.

In [12]:
#collapse_hide
ts_10k_WFG3_3_50 = ts_anytime.query("config == 10000 and problem == 'WFG3' and nobj == 3 and nvar == 50")\
                             .sort_index()
fig4 = px.line(
    ts_10k_WFG3_3_50,
    y="value",
    x=ts_10k_WFG3_3_50.index,
    color="algo",
    line_dash="algo",
)

In [13]:
#hide_input
fig4.show()

We also provide code to produce the full set of plots from the data produced in this experiment.

In [14]:
#collapse_hide
ts_10k = ts_anytime.query("config == 10000")
fig4_full = px.line(
    ts_10k,
    y="value",
    x=ts_10k.index,
    color="algo",
    line_dash="algo",
    animation_frame="problem",
    facet_col="nvar",
    facet_col_wrap=3,
    range_y=(0,0.4)
)

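# Unlink the y-axes of the additional facets (layout keys yaxis2, yaxis3, ...)
# so that they no longer follow the first panel's axis.
for k in fig4_full.layout:
    if re.search(r'yaxis[1-9]+', k):
        fig4_full.layout[k].update(matches=None)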

In [15]:
#hide_input
fig4_full.show()
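The loop applied to `fig4_full` before display selects layout keys with a regular expression. The sketch below runs the same pattern on a hand-written list of key names, not on keys read from a real figure, to show which ones it picks up. Plotly names the first y-axis `yaxis` and the following ones `yaxis2`, `yaxis3`, and so on.

```python
import re

# Hand-written sample of layout key names (an assumption for illustration).
layout_keys = ["xaxis", "yaxis", "yaxis2", "yaxis3", "yaxis10", "legend"]

# Same pattern as in the loop: "yaxis" followed by at least one digit 1-9.
matched = [k for k in layout_keys if re.search(r"yaxis[1-9]+", k)]
print(matched)  # ['yaxis2', 'yaxis3', 'yaxis10']
```

The unnumbered `yaxis` is not matched, so only the additional facet axes are updated.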

Note that the erratic behavior of NSGA-II has been extensively reported in the literature, and is a consequence of its environmental replacement strategy.