# Parameter Sweep Analysis

To choose the optimal parameters for the final experiments, we run a parameter sweep over the following parameters: mutation rate, crossover rate, population size, tournament probability, and age gap. Each parameter takes a low, medium, or high value.
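
As a rough illustration of the size of such a sweep, the sketch below enumerates every combination of levels. It assumes a full factorial grid, which the notebook does not state explicitly; the parameter names and level labels are hypothetical stand-ins for whatever `get_parameter_values` actually returns.

```python
from itertools import product

# Hypothetical names and labels; the real values come from get_parameter_values.
PARAM_NAMES = ["mutation_rate", "crossover_rate", "population_size",
               "tournament_probability", "age_gap"]
LEVELS = ["low", "medium", "high"]

def enumerate_sweep(param_names, levels):
    """Return every combination of levels as a dict, one dict per parameter set."""
    return [dict(zip(param_names, combo))
            for combo in product(levels, repeat=len(param_names))]

sweep = enumerate_sweep(PARAM_NAMES, LEVELS)
print(len(sweep))  # 3**5 = 243 parameter sets
```

Under this assumption, five parameters at three levels each give 243 parameter sets per experiment.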

Parameter sweeps are run over two experiments: a two-objective run using only topological property objectives and a three-objective run using only edge-weight property objectives. Performance is measured as the proportion of graphs in the final population that are perfectly optimized. Diversity is measured as the entropy of selected properties in the final population. Experiments whose final population was not perfectly evolved do not have their diversity evaluated. Diversity is measured on edge-weight properties for the topological-objective experiments and on topological properties for the edge-weight-objective experiments.
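
The two measures can be sketched as follows. This is a minimal illustration, not the code in `paramsweep_analysis`: it assumes that "perfectly optimized" means every objective error is zero (within a tolerance) and that entropy is the Shannon entropy of the empirical distribution of a discrete property. The function names are hypothetical.

```python
import math
from collections import Counter

def optimized_proportion(errors, tol=1e-9):
    """Fraction of individuals whose objective errors are all within tol of zero."""
    perfect = sum(all(abs(e) <= tol for e in errs) for errs in errors)
    return perfect / len(errors)

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy population: one tuple of objective errors per graph (made-up numbers).
population_errors = [(0.0, 0.0), (0.0, 0.1), (0.0, 0.0), (0.0, 0.0)]
print(optimized_proportion(population_errors))  # 0.75
print(shannon_entropy([1, 1, 2, 2]))            # 1.0
```

Continuous properties would need to be binned before counting; the binning used in the actual analysis is not shown here.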

In [None]:
import sys
import pandas as pd
sys.path.insert(0, "../")
from paramsweep_jobs import get_parameter_values
from paramsweep_analysis import plot_best_params, plot_parameter_diversity, plot_parameter_performance, score_params

In [None]:
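def read_df(network_size):
    """Load the sweep results for one network size and decode each parameter set."""
    df = pd.read_pickle(f"../output/paramsweep/{network_size}/df.pkl")
    params = get_parameter_values(int(network_size))
    param_names = list(params.keys())
    # param_set is an underscore-separated string; position p holds the key for param_names[p]
    for p, param in enumerate(param_names):
        df[param] = df["param_set"].str.split("_").str[p].map(params[param])
    return df, param_names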
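
The decoding step in `read_df` can be read as the plain-Python equivalent below. It assumes that `param_set` is an underscore-separated string of level keys and that `get_parameter_values` returns a dict mapping each parameter name to a key-to-value mapping. The parameter names and values here are hypothetical.

```python
# Hypothetical stand-in for get_parameter_values(network_size): each parameter
# maps a level key (as it appears in param_set) to a concrete value.
params = {
    "mutation_rate": {"0": 0.01, "1": 0.05, "2": 0.1},
    "crossover_rate": {"0": 0.2, "1": 0.5, "2": 0.8},
}

def decode_param_set(param_set, params):
    """Split 'k0_k1_...' and look up the p-th key in the p-th parameter's mapping."""
    keys = param_set.split("_")
    return {name: params[name][keys[p]] for p, name in enumerate(params)}

print(decode_param_set("2_0", params))
# {'mutation_rate': 0.1, 'crossover_rate': 0.2}
```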

## Size 10 Networks

In [None]:
network_size = 10
df, param_names = read_df(network_size)

### Single-parameter analysis - performance
High crossover rate and high mutation rate lead to fewer optimized graphs in both experiments, with a larger effect in the edge-weight experiments. No other differences in performance across parameters appear significant.

In [None]:
plot_parameter_performance(df, network_size, param_names, "optimized_proportion", save=False)

### Single-parameter analysis - diversity
Changing a single parameter does not significantly change the entropy of the experiments. Entropy is biased by population size, so no conclusions about population size and diversity should be drawn from this plot alone.
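
One way to see the bias: if entropy is computed on the empirical distribution of a discrete property (an assumption about the analysis code), a population of N individuals can reach at most log2(N) bits, so larger populations have a higher ceiling. The sketch below shows this, along with one possible normalization that the notebook itself does not apply.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def normalized_entropy(values):
    """Entropy divided by its maximum, log2(len(values)); 0.0 for fewer than 2 samples."""
    n = len(values)
    if n < 2:
        return 0.0
    return shannon_entropy(values) / math.log2(n)

small = list(range(4))    # 4 individuals, all with distinct property values
large = list(range(16))   # 16 individuals, all with distinct property values
print(shannon_entropy(small), shannon_entropy(large))        # 2.0 4.0
print(normalized_entropy(small), normalized_entropy(large))  # 1.0 1.0
```

Both toy populations are maximally diverse, yet the raw entropy of the larger one is twice as high.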

In [None]:
plot_parameter_diversity(df, network_size, param_names, "entropy", save=False)

### Multi-parameter analysis - diversity
The top figure shows the edge-weight objective experiment's diversity across the parameters leading to the top ten most diverse experiments. We see across all experiments that medium mutation rate, high population size, and low tournament probability lead to the highest entropy. Entropy is highest when age gap is low or medium. Crossover rate does not appear to have an ideal value.

The bottom figure shows the topology objective experiment's diversity across the parameters leading to the top ten most diverse experiments. Low mutation rate and high population size lead to the highest entropy. Medium or high tournament probability leads to the highest entropy. The other parameters do not appear to have an overall ideal parameter value.
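
The kind of tally behind these observations can be sketched as follows: take the k runs with the highest entropy and count how often each level of each parameter appears among them. This is an illustration under assumed record fields, not the implementation of `plot_best_params`; the function name and the toy records are hypothetical.

```python
from collections import Counter

def top_k_level_counts(runs, param_names, metric="entropy", k=10):
    """Count how often each level of each parameter appears among the k best runs."""
    best = sorted(runs, key=lambda r: r[metric], reverse=True)[:k]
    return {name: Counter(r[name] for r in best) for name in param_names}

# Hypothetical toy records; real rows would come from df.
runs = [
    {"mutation_rate": "low", "population_size": "high", "entropy": 2.9},
    {"mutation_rate": "low", "population_size": "high", "entropy": 2.7},
    {"mutation_rate": "high", "population_size": "low", "entropy": 1.1},
]
counts = top_k_level_counts(runs, ["mutation_rate", "population_size"], k=2)
print(counts["mutation_rate"])  # Counter({'low': 2})
```

A level that dominates the top-k counts for a parameter is a candidate "ideal" value; a flat count suggests the parameter has little effect on diversity.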

In [None]:
plot_best_params(df, network_size, param_names, "entropy", save=False)

### Best Parameter Set
`score_params` counts how many times each parameter set had the best entropy across all experiments and diversity functions. Only the top parameter sets are returned.
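
A minimal sketch of such a count, assuming each record carries an experiment label, a diversity property, a parameter set, and an entropy value. The field names and data are hypothetical, and the actual `score_params` may differ in how it handles ties and how many sets it returns.

```python
from collections import Counter

def count_best(records, metric="entropy"):
    """For each (experiment, property) group, credit the parameter set with the top metric."""
    best = {}
    for rec in records:
        group = (rec["experiment"], rec["property"])
        # Keep the first record seen on ties.
        if group not in best or rec[metric] > best[group][metric]:
            best[group] = rec
    return Counter(rec["param_set"] for rec in best.values())

# Hypothetical toy records.
records = [
    {"experiment": "topology", "property": "p1", "param_set": "0_0", "entropy": 1.2},
    {"experiment": "topology", "property": "p1", "param_set": "1_0", "entropy": 1.5},
    {"experiment": "edge_weight", "property": "p2", "param_set": "1_0", "entropy": 2.0},
    {"experiment": "edge_weight", "property": "p2", "param_set": "0_0", "entropy": 1.9},
    {"experiment": "edge_weight", "property": "p3", "param_set": "0_0", "entropy": 0.7},
]
wins = count_best(records)
print(wins.most_common(1))  # [('1_0', 2)]
```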

In [None]:
pd.set_option('display.expand_frame_repr', False)
score_params(df, param_names, "entropy")

### Conclusion
Optimal parameters for size 10 networks:
- Low or medium crossover rate
- Low or medium mutation rate
- High population size
- Medium tournament probability
- Low age gap

## Size 20 Networks

In [None]:
network_size = 20
df, param_names = read_df(network_size)

### Single-parameter analysis - performance
All graphs in all runs are perfectly optimized.

In [None]:
plot_parameter_performance(df, network_size, param_names, "optimized_proportion", save=False)

### Single-parameter analysis - diversity
Lower age gaps may lead to higher entropy.

In [None]:
plot_parameter_diversity(df, network_size, param_names, "entropy", save=False)

### Multi-parameter analysis - diversity

In [None]:
plot_best_params(df, network_size, param_names, "entropy", save=False)

In [None]:
pd.set_option('display.expand_frame_repr', False)
score_params(df, param_names, "entropy")

### Conclusion
Optimal parameters for size 20 networks:
- Low crossover rate
- Low mutation rate
- High population size
- Medium tournament probability
- Low age gap