<a id="top"></a>
# Generate script

This notebook will generate scripts to automatically run MECSim over a range of values for 2 parameters only, useful for generating 2D plots. The generated script will also compare the simulated current response to experimental data and run Bayesian analysis to determine optimal parameters and robust error bars.

For simulations that require more than 2 varying parameters, use `RandomlySampleRange.ipynb`.

To run this notebook you must have [MECSim](http://www.garethkennedy.net/MECSim.html) installed, and have a correctly set-up `Master.sk` skeleton file in the same directory as the MECSim executable.

## Contents
#### Nominal usage
- <p><a href="#ref_guide">Getting started</a></p>
Quick practical guide to running GenerateScript, where key files go and the overall workflow.
- <p><a href="#ref_paras">Parameter set up</a></p>
Set the names and ranges of our 2 varying parameters that we have set up in `Master.sk`.
- <p><a href="#ref_postproc">Post-processing parameters</a></p>
Choose between simple least-squares comparison of MECSim output to experimental results or Bayesian analysis.

#### Advanced usage
- <p><a href="#ref_weights">Set weights for comparison</a></p>
Set here an array of weights used in results comparison. Useful for focusing on specific frequency ranges and/or ignoring others.
- <p><a href="#ref_file_locations">Set location of the Python files</a></p>
This section does not need to be changed in normal operation.
- <p><a href="#ref_settings">Output settings file</a></p>
Set bandwidth, number of harmonics and whether to use the weights set above.
- <p><a href="#ref_prepscript">Prepare script file</a></p>
- <p><a href="#ref_genscript">Generate script and parameter loop</a></p>

<a id="ref_guide"></a>
## Getting started


This guide answers two practical questions: where should the files be copied, and how is MECSim run over a range of parameters?

Copy the following Python scripts to the **python_dir** (default is `./python/`):
1. `HarmonicSplitter.py`
2. `CompareSmoothed.py`
3. `BayesianAnalysis.py` (optional)
4. `SurfacePlotter.py` (optional)

Details of how they fit together in the loop are given <a href="#ref_genscript">here</a>.

After running this notebook, the **`script_dir`** (default is `script/`) must contain three files: the user-prepared skeleton file, followed by the two files generated here:
1. `Master.sk`
2. `run_mecsim_script.sh`
3. `Settings.inp`


![title](MECSim_ScriptingChart.png)


**NOTE: the skeleton input parameter file for MECSim, "Master.sk", must be set up before running the script generated by this notebook.**


### Load required libraries



In [None]:
import numpy as np

<a id="ref_paras"></a>
## Parameter set up

Back to <a href="#top">top</a>.

### Set names and ranges of variables

Name mappings and ranges for each variable. These are used with the master template (skeleton file) `Master.sk`: every placeholder string of the form `$name` is replaced with the required value. The bash script generated by this notebook runs MECSim once for each set of parameters written to `Master.inp`.

Ensure that the x and y names set here match the placeholders in `Master.sk`, and that the ranges are correct.
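
The placeholder replacement described above can be sketched in Python. This is only an illustration: the generated script does the substitution with `sed`, and the skeleton lines and the helper `fill_skeleton` below are hypothetical, not taken from a real `Master.sk`.

```python
# Hypothetical two-line skeleton; a real Master.sk contains many more settings.
skeleton = "\n".join([
    "rate constant line: $kzero",
    "formal potential line: $Ezero",
])


def fill_skeleton(text, values):
    """Replace every '$name' placeholder in text with its value."""
    # Longest names first, so a short placeholder can never clobber
    # the start of a longer one (a precaution taken in this sketch only).
    for name in sorted(values, key=len, reverse=True):
        text = text.replace(name, str(values[name]))
    return text


# Values are passed as mantissa + exponent strings, e.g. 25e-4 = 0.0025
filled = fill_skeleton(skeleton, {"$kzero": "25e-4", "$Ezero": "-10e-2"})
print(filled)
```

Any placeholder left in the output after substitution is a sign that a name here does not match the one used in `Master.sk`.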


In [None]:
x_name = '$kzero'
x_min = 0.250e-2
x_max = 1.750e-2
del_x = 15.0e-3

y_name = '$Ezero'
y_min = -0.10
y_max = 0.1
del_y = 15.0e-2

### Set fixed parameters

Parameters that stay fixed for every simulation can still be kept as placeholders in the skeleton file (``Master.sk``); set their names and values here.

In [None]:
# setup fixed parameters
x_fixed_name = []
x_fixed_value = []

#### copy this for as many fixed parameters as required ####
x_fixed_name.append('$R')
x_fixed_value.append(100)
#### end of code chunk to copy

#### copy this for as many fixed parameters as required ####
x_fixed_name.append('$alpha')
x_fixed_value.append(0.5)
#### end of code chunk to copy


### Experiment parameters

Set the name of the file containing the experimental data to compare with the MECSim simulation current responses.

The raw experimental data file must have the same format as the MECSim output, which is based on the output from a potentiostat (SPECS).


In [None]:
Experimental_filename = 'MECSim_Example.txt'

<a id="ref_postproc"></a>
## Post-processing parameters

Select the automatic post-processing that is appended to the end of the bash script generated by this notebook.

Two options:

1. Use the raw least squares values in the results file, plotted as a surface.
2. Run Bayesian analysis on the results file automatically (figures and statistics are sent to the output directory).

Back to <a href="#top">top</a>.

In [None]:
# select whether surface plotter to be used
plot_surfaces = False
# select whether Bayesian statistical analysis to be used
bayesian_analysis = True
# have you run a script generating results before? do you want this run to append the results to the existing ones?
results_exists = False

<a id="ref_weights"></a>
### Set weights for comparison

The metric used in `CompareSmoothed.py` takes the smoothed harmonics from `HarmonicSplitter.py` for both the experimental and simulated data (the latter a function of the parameters run by `MECSim`), calculates the least squares difference for each harmonic and combines them via
$$
S = \sum_{j=0}^{n_{harm}} w_j LS_j
$$
where $n_{harm}$ is the number of harmonics, $j=0$ is the dc component, $w_j$ is the weight given to each harmonic (set below) and $LS_j$ is the relative least squares difference given by
$$
LS_j = \frac{ \sum_k^n \left( i^{exp}_k - i^{sim}_k \right)^2 }{ \sum_k^n \left( i^{exp}_k \right)^2 }
$$
where $n$ is the total number of data points (each with a current $i$, time and voltage) in the smoothed experimental ($exp$) and simulated ($sim$) data. The weights ($w_j$) for each harmonic (and the dc component) are set as a vector, which does not need to sum to any particular value, or left as the unweighted default of $w_j = 1$.
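
As a minimal sketch of the two formulas above (not the actual code in `CompareSmoothed.py`), the metric can be computed with NumPy as follows. The function names and the sample currents are made up for illustration.

```python
import numpy as np


def relative_least_squares(i_exp, i_sim):
    """LS_j: squared difference normalised by the squared experimental current."""
    i_exp = np.asarray(i_exp, dtype=float)
    i_sim = np.asarray(i_sim, dtype=float)
    return np.sum((i_exp - i_sim) ** 2) / np.sum(i_exp ** 2)


def combined_metric(exp_harmonics, sim_harmonics, weights=None):
    """S = sum_j w_j * LS_j over the dc component (j=0) and each harmonic."""
    ls = np.array([relative_least_squares(e, s)
                   for e, s in zip(exp_harmonics, sim_harmonics)])
    if weights is None:
        weights = np.ones(len(ls))  # unweighted default, w_j = 1
    return float(np.sum(np.asarray(weights, dtype=float) * ls)), ls


# Made-up smoothed currents: dc component plus one harmonic
exp_h = [np.array([1.0, 2.0, 3.0]), np.array([0.5, -0.5, 0.5])]
sim_h = [np.array([1.0, 2.0, 2.0]), np.array([0.5, -0.5, 0.0])]

S, LS = combined_metric(exp_h, sim_h, weights=[1.0, 0.5])
print(LS)  # one relative least squares value per harmonic
print(S)   # single weighted metric
```

Because each $LS_j$ is normalised by the experimental signal, harmonics of very different amplitude contribute on a comparable scale before the weights are applied.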

#### Set weights

Set `output_single_metric = True` to combine all harmonics into the single metric $S$; otherwise every LS$_j$ is output separately, separated by commas. Custom weights are switched on with `use_weights = 1`.

Alter the following cell for custom weights. Ensure that the weights array has the same length as the number of harmonics (plus dc).

**Note: for purely dc cases set `number_harmonics = 0`**

Back to <a href="#top">top</a>.


In [None]:
# combine all harmonics into a single metric S? default is no: each LS_j is output separately
output_single_metric = False

# number of harmonics (excluding the dc component - harmonic = 0)
number_harmonics = 0
frequency_bandwidth = 1.    # only 1 needed for simulations
# set to 1 if using weights (else 0)
use_weights = 0
# set weights as a numpy array with length number_harmonics + 1 (default is all equal)
weights = np.repeat(1.0, number_harmonics+1)

# sample way to change it: can also use functions
#weights = np.array([0.5,1,1,0.5,0.3,0.1,0.05])

## Experienced users only from here on

### Set location of the python files

Location is relative to the script that will be running. This script is required to be in the same location as the MECSim.exe file - as in the Docker image setup. 

**Note: these generally don't need to be changed**


In [None]:
# python dir contains all .py files
python_dir = 'python/'
# script dir contains script.sh (output here), Master.sk (user prepared) and Settings.inp (output here)
script_dir = 'script/'
# Master.inp dir - important for docker container vs local run
input_dir = 'input/'
# output dir for results
output_dir = 'output/'

# default is running in notebook:
# if running in docker then the container structure requires this be external/
external_dir = 'external/'
# location of parent directory: typically this file will be in python/ so the parent dir is '../'
parent_dir = '../'

### File names

Manually change the name of the results file; the default is `results.txt`. Whether this file already exists, for example because a previous script was run on a different region of parameter space, is set by `results_exists` in the post-processing parameters above.

The default script name `run_mecsim_script.sh` can also be changed, although this is not recommended.

In [None]:
# results filename
results_name = 'results.txt'
# change default script name here (not recommended)
script_name = 'run_mecsim_script.sh'

### Simulation file settings

Change the default name of the simulation output file (not recommended).

In [None]:
Simulation_output_filename = 'MECSimOutput_Pot.txt'

### Rarely changed parameters


In [None]:
# store the log files from each simulation
store_log_files = False

### Comparison file FFT settings

Set whether the experimental data file set above needs to have a Fast Fourier Transform (FFT) applied to it, and if so what filename to use.


In [None]:
Experimental_FFT_output_filename = 'ExpSmoothed.txt'
needs_fft_conversion = True

Check that the number of weights matches the number of harmonics (plus dc).

In [None]:
if(number_harmonics+1 != len(weights)):
    print("WARNING: Found {0} weights when there are {1} harmonics (including dc)".format(
        len(weights), number_harmonics+1))
    if(number_harmonics+1<len(weights)):
        print("         Too many weights entered - clipping")
        weights = weights[:number_harmonics+1]
    else:
        print("         Not enough weights entered - filling with zeros")
        # pad with zeros; in-place ndarray.resize can fail if the array is referenced elsewhere
        weights = np.concatenate([weights, np.zeros(number_harmonics+1 - len(weights))])

Convert the numpy array to a csv string, so that the settings file can also be read by programs other than Python.

In [None]:
txt_weights = ','.join(map(str, weights))
print("Weights: " + txt_weights)

### Script method

In [None]:
method_type = 'grid'

<a id="ref_settings"></a>
## Output settings file

This file encodes the simulation output filename (shouldn't change), number of harmonics, bandwidth frequency and the settings for the weights.

Back to <a href="#top">top</a>.
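
The layout of `Settings.inp` is five `value<TAB># comment` lines followed by one line of comma-separated weights. As a rough sketch of how such a file could be read back, the hypothetical helper `parse_settings` below parses an in-memory example; the way the real analysis scripts read the file is not shown in this notebook and may differ.

```python
def parse_settings(text):
    """Parse the six-line Settings.inp layout into a dictionary (illustrative only)."""
    lines = text.splitlines()
    # On the first five lines the value is everything before the '#' comment
    values = [line.split("#")[0].strip() for line in lines[:5]]
    return {
        "simulation_output_filename": values[0],
        "number_harmonics": int(values[1]),
        "frequency_bandwidth": float(values[2]),
        "use_single_metric": int(values[3]),
        "use_weights": int(values[4]),
        # The last line holds one weight per harmonic (dc included)
        "weights": [float(w) for w in lines[5].split(",")],
    }


# Example content matching the default settings in this notebook
example = "\n".join([
    "MECSimOutput_Pot.txt\t# simulation output filename",
    "0\t# number of harmonics",
    "1.0\t# bandwidth frequency (Hz)",
    "0\t# 1=use single output metric value",
    "0\t# apply different weights to each harmonic",
    "1.0",
])

settings = parse_settings(example)
print(settings)
```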

In [None]:
f = open(parent_dir+script_dir+'Settings.inp', 'w')
f.write(Simulation_output_filename + "\t# simulation output filename\n")
f.write(str(number_harmonics) + "\t# number of harmonics\n")
f.write(str(frequency_bandwidth) + "\t# bandwidth frequency (Hz)\n")
iUseSingleMetric = 0
if(output_single_metric or output_single_metric==1):
    iUseSingleMetric = 1
f.write(str(iUseSingleMetric) + "\t# 1=use single output metric value (else=0 and each harmonic treated separately\n")
iUseWeights = 0
if(use_weights or use_weights==1):
    iUseWeights = 1
f.write(str(iUseWeights) + "\t# apply different weights to each harmonic (0 to n_harm) - if =1 then weights set below\n")
f.write(txt_weights)
f.close()

<a id="ref_prepscript"></a>
## Prepare script file

Depending on the method type selected, write a text file in bash script format that runs MECSim together with the analysis tools.

First set any parameters that are fixed by hand, for example a constant `e0val=0.2` when the skeleton file is kept general with `$e0val` in it.

Note that you will need to integrate these into the script generation yourself.

Back to <a href="#top">top</a>.

### Define function for dealing with exponential form


In [None]:
def ConvertXToXExpForm(x_min, x_max, del_x):
    """
    Convert min/max/delta into a minimum common exponential form
    e.g. 1.4e-4, 1.0e-5 have 1e-5 as the minimum. 
    So the first becomes 14e-5 while the second remains as it is.
    
    Everything must be integer
    """
    orig_x = [x_min, x_max, del_x]
    x_exp = -1
    sum3 = 0
    while sum3 != 3:
        x_exp += 1
        test_x = np.around([x_min, x_max, del_x], x_exp)
        sum3 = np.sum(orig_x == test_x)
    x_exp = -x_exp
    x_exp_unit = 10.0**x_exp
    # round before converting so that e.g. 14.999999999999998 becomes 15, not 14
    x_min_exp = int(round(x_min / x_exp_unit))
    x_max_exp = int(round(x_max / x_exp_unit))
    del_x_exp = int(round(del_x / x_exp_unit))
    return x_min_exp, x_max_exp, del_x_exp, x_exp


### Modify here for more than 2 dimensions

To vary more than 2 parameters, add a line for each extra dimension here. The parameters above and the script logic below must also be extended to match.

In [None]:
x_min_exp, x_max_exp, del_x_exp, x_exp = ConvertXToXExpForm(x_min, x_max, del_x)
y_min_exp, y_max_exp, del_y_exp, y_exp = ConvertXToXExpForm(y_min, y_max, del_y)


### Total number of simulations to run

In [None]:
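# count the grid points visited from x_min in steps of del_x without passing x_max;
# the small tolerance guards against floating point rounding in the division
def CalcModelsToRun(x_min, x_max, del_x):
    n = int(np.floor((x_max - x_min) / del_x + 1.0e-9)) + 1
    return(n)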

# output total number of models that will be run (2D)
n_simulations = CalcModelsToRun(x_min, x_max, del_x) * CalcModelsToRun(y_min, y_max, del_y)


<a id="ref_genscript"></a>
## Generate script and parameter loop

### Basic grid method

First apply `HarmonicSplitter.py` to the experimental data to split it into harmonics (dc and ac) and smooth the resulting current responses.

Next loop over all x, y combinations, each time doing the following:
1. Copy `Master.sk` to `Master.inp`, substitute the current x and y values (and any fixed parameters), and run MECSim.
2. Use `HarmonicSplitter.py` to split and smooth the harmonics of the simulated current response.
3. Use `CompareSmoothed.py` to compare the smoothed harmonics between the simulated current response and the experimental data.
4. The comparison is calculated as a metric (least squares by default), which is either a single composite of all harmonic data or a list of values, one for each harmonic.
5. Append the x, y and metric (S) values to a single file (`results_name` set above).

After the loop:
1. All input parameters (x, y and any more added by the user) as well as the least squares metric (S, either 1 or more values) are now stored in "results_name" (typically results.txt).
2. BayesianAnalysis.py is run on this results file to determine the probabilities of each set of parameters given that the true fit to the experimental data exists somewhere in the chosen parameter range.
3. BayesianAnalysis will output the posterior probabilities ("posterior.txt") as well as the optimal values with error bars ("opt_parameters.txt") and plots (png and pdf) depending on the settings in BayesianAnalysis.py.
4. Additional plots of the least squares surfaces themselves are output by SurfacePlotter.py if requested.
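
To see which parameter values the nested loops visit, the integer stepping of the generated bash script can be mirrored in Python. This is a sketch only: `grid_values` is a hypothetical helper, and the integer forms passed to it are assumed to be what `ConvertXToXExpForm` yields for the default ranges.

```python
def grid_values(v_min_exp, v_max_exp, del_v_exp, v_exp):
    """List the value strings substituted into Master.inp, e.g. '25e-4'."""
    values = []
    v = v_min_exp
    while v <= v_max_exp:      # same test as `[ $x -le $xmax ]`
        values.append("{0}e{1}".format(v, v_exp))
        v += del_v_exp         # same step as `x=$((x+xdel))`
    return values


# Assumed integer forms of the default ranges:
# x from 0.0025 to 0.0175 in steps of 0.015, y from -0.10 to 0.10 in steps of 0.15
x_vals = grid_values(25, 175, 150, -4)
y_vals = grid_values(-10, 10, 15, -2)

# Every (x, y) pair corresponds to one MECSim run
pairs = [(x, y) for x in x_vals for y in y_vals]
print(pairs)
print(len(pairs))
```

Note that the upper limit is only reached when the range is a whole multiple of the step: with the default y settings the loop visits -0.10 and 0.05 but not 0.10.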

**The generated script must be named "run_mecsim_script.sh" and sit in "script/" along with "Settings.inp" created above; with the default settings the cell below writes it there directly.**

Back to <a href="#top">top</a>.

In [None]:
if(method_type=='grid'):
    print('Using grid method to write to: ' + parent_dir + script_dir + script_name)
    with open(parent_dir+script_dir+script_name, "w") as text_file:
        text_file.write("#!/bin/bash\n")
        # process harmonics for experimental data - if requested
        if(needs_fft_conversion):
            text_file.write("cp {0}{1} {2}{3}\n".format(script_dir, Experimental_filename, 
                                                        output_dir, Simulation_output_filename))
            text_file.write("python {0}HarmonicSplitter.py\n".format(python_dir))
            text_file.write("mv {0}Smoothed.txt {0}{1} \n".format(output_dir, 
                                 Experimental_FFT_output_filename))
        # setup parameter ranges
        text_file.write("xmin={0}\n".format(x_min_exp))
        text_file.write("xmax={0}\n".format(x_max_exp))
        text_file.write("xdel={0}\n".format(del_x_exp))
        text_file.write("xext=e{0}\n".format(x_exp))
        text_file.write("ymin={0}\n".format(y_min_exp))
        text_file.write("ymax={0}\n".format(y_max_exp))
        text_file.write("ydel={0}\n".format(del_y_exp))
        text_file.write("yext=e{0}\n".format(y_exp))
        text_file.write("x=$xmin\n")
        # output summary
        text_file.write("echo \n")
        text_file.write("echo '2D Grid method run with n_sim = {0}'\n".format(n_simulations))
        text_file.write("echo \n")
        # write header for results output file - else will append
        if(not results_exists):
            text_file.write("echo '{0},{1},S' > {2}{3}\n".format(x_name, y_name, output_dir, results_name))
        # construct loop over parameters
        text_file.write("while [ $x -le $xmax ]\n")
        text_file.write("do\n")
        text_file.write("  y=$ymin\n")
        text_file.write("  while [ $y -le $ymax ]\n")
        text_file.write("  do\n")
        text_file.write("    cp {0}Master.sk {1}Master.inp\n".format(script_dir, input_dir))
        # fixed parameters
        for i in range(len(x_fixed_name)):
            text_file.write("    sed -i 's/{0}/'{1}'/g' {2}Master.inp\n".format(x_fixed_name[i], x_fixed_value[i], input_dir))
        text_file.write("    sed -i 's/{0}/'$x$xext'/g' {1}Master.inp\n".format(x_name, input_dir))
        text_file.write("    sed -i 's/{0}/'$y$yext'/g' {1}Master.inp\n".format(y_name, input_dir))
        text_file.write("    ./MECSim.exe 2>errors.txt\n")
        if(store_log_files):
            # braces stop bash from reading "$xext_" as a variable named "xext_"
            text_file.write("    mv {0}log.txt {0}log_${{x}}${{xext}}_${{y}}${{yext}}.txt\n".format(output_dir))
        text_file.write("    python {0}HarmonicSplitter.py\n".format(python_dir))
        text_file.write("    z=$(python {0}CompareSmoothed.py)\n".format(python_dir))
        text_file.write("    echo $x$xext,$y$yext,$z >> {0}{1}\n".format(output_dir, results_name))
        # continuously copy out results - internal container's output directory is empty 
        text_file.write("    cp -p {0}Master.inp {1}\n".format(input_dir, output_dir))
        text_file.write("    cp -p {0}* {1}{0}\n".format(output_dir, external_dir))
        text_file.write("    y=$((y+ydel))\n")
        text_file.write("  done\n")
        text_file.write("  x=$((x+xdel))\n")
        text_file.write("done\n")
        text_file.write("ls -lrt \n")
        text_file.write("ls -lrt {0}\n".format(output_dir))
        # run Bayesian analysis (*.txt) and plotter (produces bayesian_plot.pdf and png) with results_name
        if(bayesian_analysis):
            text_file.write("python {0}BayesianAnalysis.py {1} {2}{3} {4}{5}\n".format(python_dir, 
                                 results_name, external_dir, output_dir, external_dir, script_dir))
        # run sum of squares plotter (also uses results file name as argument)
        if(plot_surfaces):
            text_file.write("python {0}SurfacePlotter.py {1} {2}{3} {4}{5}\n".format(python_dir, 
                                 results_name, external_dir, output_dir, external_dir, script_dir))
        # copy out the results from analysis scripts
        text_file.write("cp -p {0}* {1}{0}\n".format(output_dir, external_dir))


### Output total number of simulations

In [None]:
print('Total number of simulations to run = ' + str(n_simulations))