<a id="top"></a>
# Randomly sample from a range

This notebook gives a documented methodology for generating scripts to automatically run MECSim over a range of parameters.

In the parameters section we set the input parameters and simulation parameters used to generate a script that will run MECSim and the python analysis codes for each parameter set. These parameters are the electrochemical parameters, e.g. $k_0$ or $E^0$, contained in the "Master.inp" text file MECSim uses to simulate a current response.

A prerequisite for running this script is to have installed [MECSim](http://www.garethkennedy.net/MECSim.html) and to have correctly set up a skeleton input file "Master.sk". The skeleton input file differs from the usual MECSim input file "Master.inp" in that it has labels for the parameters to be varied (e.g. "\$kzero" and "\$Ezero") in the correct locations. These labels must match the entries of the **x_para_name** list set below, with one entry for each parameter to be varied.

This notebook will create a script that can be used to automatically run MECSim for each parameter set, compare the simulated current response to experimental data and finally run the Bayesian analysis to determine the optimal parameters as well as statistically determined error bars. More details are given in the **BayesianAnalysis.ipynb** notebook.

The contents of this notebook are:
- <p><a href="#ref_guide">Practical guide</a></p>
- <p><a href="#ref_paras">Parameters</a></p>
- <p><a href="#ref_postproc">Post-processing parameters</a></p>
- <p><a href="#ref_weights">Set weights for comparison</a></p>
- <p><a href="#ref_settings">Output settings file</a></p>
- <p><a href="#ref_prepscript">Prepare script file</a></p>
- <p><a href="#ref_genscript">Generate script and parameter loop</a></p>

<a id="ref_guide"></a>
## Practical guide


This guide answers two practical questions: where should the files be copied, and how is MECSim run over a range of parameters?

Copy the following python scripts to the **python_dir** (default is "python"):
1. HarmonicSplitter.py
2. CompareSmoothed.py
3. BayesianAnalysis.py (optional)
4. SurfacePlotter.py (optional)
5. ReturnRandomExpFormat.py (called by the generated script to draw each random parameter value)

Details of how they fit together in the loop are given <a href="#ref_genscript">here</a>.

After running this notebook, the following three files must be in the **script_dir** (default is "script"). "Master.sk" is prepared by the user; the other two are generated by this notebook:
1. Master.sk
2. run_mecsim_script.sh
3. Settings.inp


![title](MECSim_ScriptingChart.PNG)


**NOTE: the skeleton input parameter file for MECSim, "Master.sk", must be set up before running the bash script generated by this notebook.**


### Load required libraries



In [62]:
import numpy as np

<a id="ref_paras"></a>
## Parameters

Back to <a href="#top">top</a>.

### Set names and ranges of variables

Name mappings and ranges for each variable. These are used with the master template (skeleton file) called "Master.sk" to replace the strings in "$name" with the required values. The bash script generated by this code will run MECSim using each set of parameters written to "Master.inp".

This code is intended to run over many parameters with known ranges.

For the user, ensure that the x1, x2... names are correct both here and in "Master.sk", as well as the ranges.
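
A quick way to catch a mismatch between the names here and the labels in "Master.sk" is to search the skeleton text for each label before generating the script. The sketch below is only an illustration: `missing_labels` is a hypothetical helper, and the two-line skeleton excerpt is made up rather than real MECSim input syntax.

```python
def missing_labels(skeleton_text, labels):
    """Return the labels that do not occur anywhere in the skeleton text."""
    return [label for label in labels if label not in skeleton_text]


# Hypothetical excerpt standing in for the contents of a real "Master.sk"
example_skeleton = "$kzero\n$Ezero\n"
example_labels = ['$kzero', '$Ezero', '$DiffB']

# '$DiffB' is requested but never appears in the excerpt
print(missing_labels(example_skeleton, example_labels))
```

In practice the skeleton text would be read from the "Master.sk" file in **script_dir**. Note that this is a plain substring check, so a label that is a prefix of another label would also count as present.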

A sweep requires a step size ``del_x`` for each parameter; random sampling instead requires only the total number of simulations to run.

In [63]:
# total number of simulations to run
n_simulations = 100


In [64]:
# setup parameters
x_para_name = []
x_para_min = []
x_para_max = []
x_para_log = []

#### copy this for as many parameters as required ####
x_para_name.append('$kzero')
x_para_min.append(0.50e-2)
x_para_max.append(1.50e-2)
x_para_log.append(False)
#### end of code chunk to copy

#### copy this for as many parameters as required ####
x_para_name.append('$Ezero')
x_para_min.append(-0.10)
x_para_max.append(0.1)
x_para_log.append(False)
#### end of code chunk to copy

#### copy this for as many parameters as required ####
x_para_name.append('$DiffB')
x_para_min.append(1.0e-6)
x_para_max.append(2.0e-5)
x_para_log.append(False)
#### end of code chunk to copy



### Experiment parameters

Set the name of the file containing the experimental data to compare with the MECSim simulation current responses.

The raw experimental data file must have the same format as the MECSim output, which is based on the output from a potentiostat (SPECS).


In [65]:
Experimental_filename = 'MECSim_Example.txt'

<a id="ref_weights"></a>
### Set weights for comparison


The metric used in ``CompareSmoothed.py`` takes the smoothed harmonics from ``HarmonicSplitter.py`` for both the experimental and simulated data (the latter being a function of the parameters run by ``MECSim``), calculates the least squares difference for each harmonic and combines them via
$$
S = \sum_{j=0}^{n_{harm}} w_j LS_j
$$
where $n_{harm}$ is the number of harmonics, $j=0$ is the dc component, $w_j$ is the weight given to each harmonic (set below) and $LS_j$ is the relative least squares difference given by
$$
LS_j = \frac{ \sum_k^n \left( i^{exp}_k - i^{sim}_k \right)^2 }{ \sum_k^n \left( i^{exp}_k \right)^2 }
$$
where $n$ is the total number of current ($i$), time and voltage points in the smoothed experimental ($exp$) and simulated ($sim$) data. The weights ($w_j$) for each harmonic (and dc component) are set as a vector of any sum, or left as the unweighted default of $w_j = 1$.
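
The two formulas above can be sketched in a few lines of numpy. This is a minimal illustration of the metric as written here, not the code in ``CompareSmoothed.py``; the function names and the toy data are assumptions made for the example.

```python
import numpy as np


def relative_least_squares(i_exp, i_sim):
    """LS_j for one harmonic: squared difference normalised by the experimental signal."""
    i_exp = np.asarray(i_exp, dtype=float)
    i_sim = np.asarray(i_sim, dtype=float)
    return np.sum((i_exp - i_sim) ** 2) / np.sum(i_exp ** 2)


def combined_metric(exp_harmonics, sim_harmonics, weights=None):
    """S = sum_j w_j * LS_j over the dc component (j=0) and each harmonic."""
    ls = np.array([relative_least_squares(e, s)
                   for e, s in zip(exp_harmonics, sim_harmonics)])
    if weights is None:
        weights = np.ones(len(ls))  # unweighted default, w_j = 1
    return float(np.sum(np.asarray(weights, dtype=float) * ls))


# Toy data: dc component plus one harmonic, three points each
exp_h = [[1.0, 2.0, 2.0], [0.0, 3.0, 4.0]]
sim_h = [[1.0, 2.0, 2.0], [0.0, 0.0, 0.0]]

print(combined_metric(exp_h, sim_h))              # LS_0 = 0, LS_1 = 1, so S = 1.0
print(combined_metric(exp_h, sim_h, [1.0, 0.5]))  # the harmonic is down-weighted, so S = 0.5
```

Because each $LS_j$ is normalised by the experimental signal, a simulated harmonic that is identically zero gives $LS_j = 1$ regardless of the magnitude of the current.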


#### Set weights

Set ``output_single_metric = True`` to combine the harmonics into the single metric $S$; otherwise every LS$_j$ is output, separated by commas. Custom weights are only applied when ``use_weights = 1``.

Alter the following cell for custom weights. Ensure that the weights vector has the same length as the number of harmonics plus one (for the dc component).

Back to <a href="#top">top</a>.


In [66]:
# output a single combined metric S (True) or each LS_j separately (False)?
output_single_metric = True

# number of harmonics (excluding the dc component - harmonic = 0)
number_harmonics = 6
frequency_bandwidth = 1.    # only 1 needed for simulations
# set to 1 if using weights (else 0)
use_weights = 0
# set weights as a numpy array with same length as number_harmonics + 1
weights = np.array([0.5,1,1,0.5,0.3,0.1,0.05])


## Experienced users only from hereon

### Set location of the python files

Location is relative to the script that will be running. This script is required to be in the same location as the MECSim.exe file - as in the Docker image setup. 

**Note: these generally don't need to be changed**


In [67]:
# python dir contains all .py files
python_dir = 'python/'
# script dir contains script.sh (output here), Master.sk (user prepared) and Settings.inp (output here)
script_dir = 'script/'
# Master.inp dir - important for docker container vs local run
input_dir = 'input/'
# output dir for results
output_dir = 'output/'
# location of parent directory: typically this file will be in python/ so the parent dir is '../'
parent_dir = '../'

<a id="ref_postproc"></a>
## Post-processing parameters

Select the automatic post-processing steps. These are appended to the end of the bash script generated by this notebook.

Two options:

1. Plot the raw least squares values in the results file as a surface plot.
2. Run the Bayesian analysis on the results file automatically (figures and statistics are sent to the output directory).

Back to <a href="#top">top</a>.

In [68]:
# select whether surface plotter to be used
plot_surfaces = False
# select whether Bayesian statistical analysis to be used
bayesian_analysis = False

### File names

Manually change the name of the results file; the default is 'results.txt'. Also set whether this file already exists, for example if a previous script was run on a different region of parameter space. When it exists, new results are appended and no header line is written.

Can also change the default script name from 'run_mecsim_script.sh', although this is not recommended.

In [69]:
# results filename and if it already exists
results_name = 'results.txt'
results_exists = True
# change default script name here (not recommended)
script_name = 'run_mecsim_script.sh'

### Simulation file settings

Change the default name of the simulation output file (not recommended)

In [70]:
Simulation_output_filename = 'MECSimOutput_Pot.txt'

### Comparison file FFT settings

Set whether the experimental data file set above needs to have a Fast Fourier Transform (FFT) applied to it, and if so what filename to use.


In [71]:
Experimental_FFT_output_filename = 'ExpSmoothed.txt'
needs_fft_conversion = False

Check that the number of weights matches the number of harmonics (including dc).

In [72]:
if(number_harmonics+1 != len(weights)):
    print("WARNING: Found {0} weights when there are {1} harmonics (including dc)".format(len(weights), number_harmonics+1))
    if(number_harmonics+1<len(weights)):
        print("         Too many weights entered - clipping")
        weights = weights[:number_harmonics+1]
    else:
        print("         Not enough weights entered - filling with zeros")
        # pad with zeros; in-place weights.resize() can fail if the array is referenced elsewhere
        weights = np.pad(weights, (0, number_harmonics+1-len(weights)), 'constant')

Convert the numpy array to a comma-separated string, so that the settings file can also be read by programs other than python.

In [73]:
txt_weights = ','.join(map(str, weights))
print("Weights: " + txt_weights)

Weights: 0.5,1.0,1.0,0.5,0.3,0.1,0.05


### Script method

In [74]:
method_type = 'random'

<a id="ref_settings"></a>
## Output settings file

This file encodes the simulation output filename (shouldn't change), number of harmonics, bandwidth frequency and the settings for the weights.

Back to <a href="#top">top</a>.

In [75]:
# the with block closes the file even if one of the writes fails
with open(parent_dir+script_dir+'Settings.inp', 'w') as f:
    f.write(Simulation_output_filename + "\t# simulation output filename\n")
    f.write(str(number_harmonics) + "\t# number of harmonics\n")
    f.write(str(frequency_bandwidth) + "\t# bandwidth frequency (Hz)\n")
    # flags are written as 1/0 whether they were set as booleans or integers
    iUseSingleMetric = 1 if output_single_metric else 0
    f.write(str(iUseSingleMetric) + "\t# 1=use single output metric value (else=0 and each harmonic treated separately)\n")
    iUseWeights = 1 if use_weights else 0
    f.write(str(iUseWeights) + "\t# apply different weights to each harmonic (0 to n_harm) - if =1 then weights set below\n")
    f.write(txt_weights)
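
The layout written above is five "value, tab, comment" lines followed by one line of comma-separated weights. As a hedged illustration of how such a file could be read back, the sketch below parses that layout; `parse_settings` and its dictionary keys are hypothetical and this is not the reader used by ``CompareSmoothed.py``.

```python
def parse_settings(text):
    """Parse the six-line Settings.inp layout into a dict."""
    lines = text.strip().split("\n")
    # first five lines: value, tab, then a '#' comment
    values = [line.split("\t")[0].strip() for line in lines[:5]]
    return {
        "simulation_output_filename": values[0],
        "number_harmonics": int(values[1]),
        "frequency_bandwidth": float(values[2]),
        "use_single_metric": values[3] == "1",
        "use_weights": values[4] == "1",
        # final line: comma-separated weights, dc component first
        "weights": [float(w) for w in lines[5].split(",")],
    }


# Example text in the same layout (comments shortened for the example)
example = (
    "MECSimOutput_Pot.txt\t# simulation output filename\n"
    "6\t# number of harmonics\n"
    "1.0\t# bandwidth frequency (Hz)\n"
    "1\t# single metric\n"
    "0\t# use weights\n"
    "0.5,1.0,1.0,0.5,0.3,0.1,0.05"
)
settings = parse_settings(example)
print(settings["number_harmonics"], settings["weights"])
```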

<a id="ref_prepscript"></a>
## Prepare script file

Depending on the method type selected, output a text file in bash script format for running MECSim with the analysis tools.

First set any parameters that are fixed by hand. For example, you may have a constant e0val=0.2 but want to keep the skeleton file general by leaving the label $e0val in it.

Note that these hand-set parameters are not handled automatically: you will need to integrate them into the script generation yourself.

Back to <a href="#top">top</a>.

<a id="ref_genscript"></a>
## Generate script and parameter loop

### Random sampling method

First apply HarmonicSplitter.py to split the experimental data into harmonics (dc and ac) then smooth the resultant current responses.

Setup a loop for the number of samples requested, each time doing:
1. Use ReturnRandomExpFormat.py for each input parameter with associated range
2. Take the randomly generated input parameters and pass them to MECSim
3. Use HarmonicSplitter.py to split and smooth the harmonics
4. Use CompareSmoothed.py to compare the smoothed harmonics between the simulated current response and the experimental data.
5. Comparison is calculated as a metric (default uses least squares) which is either a composite of all harmonic data or a list of values, one for each harmonic
6. Append the x1, x2, ... and metric (S) values to a single file (results_name set above)
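
Steps 1 and 2 of the loop can be sketched in pure python. This is an illustration only: the contents of ReturnRandomExpFormat.py are not shown in this notebook, so the uniform and log-uniform draws, the `.6e` exponent format and the function names below are all assumptions, and the skeleton excerpt is made up.

```python
import math
import random


def sample_parameter(x_min, x_max, use_log, rng=random):
    """Draw one value from [x_min, x_max], uniformly in x or in log10(x)."""
    if use_log:
        # sampling in log space requires x_min > 0
        exponent = rng.uniform(math.log10(x_min), math.log10(x_max))
        return 10.0 ** exponent
    return rng.uniform(x_min, x_max)


def fill_skeleton(skeleton_text, names, values):
    """Replace each "$name" label with its value in exponent format."""
    for name, value in zip(names, values):
        skeleton_text = skeleton_text.replace(name, "{0:.6e}".format(value))
    return skeleton_text


names = ['$kzero', '$Ezero']
ranges = [(0.5e-2, 1.5e-2, False), (-0.1, 0.1, False)]
rng = random.Random(0)  # seeded only so that the sketch is repeatable
values = [sample_parameter(lo, hi, log, rng) for lo, hi, log in ranges]

# hypothetical skeleton excerpt, not real MECSim input syntax
print(fill_skeleton("$kzero\n$Ezero\n", names, values))
```

The generated bash script does the same substitution with ``sed`` on a fresh copy of "Master.sk" in every iteration, so each simulation starts from the unmodified skeleton.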

After the loop:
1. All input parameters (x1, x2, ...) as well as the least squares metric (S, either one or more values) are now stored in the file named by **results_name** (default "results.txt").
2. BayesianAnalysis.py is run on this results file to determine the probabilities of each set of parameters given that the true fit to the experimental data exists somewhere in the chosen parameter range.
3. BayesianAnalysis will output the posterior probabilities ("posterior.txt") as well as the optimal values with error bars ("opt_parameters.txt") and plots (png and pdf) depending on the settings in BayesianAnalysis.py.
4. Additional plots of the least squares surfaces themselves are output by SurfacePlotter.py if requested.
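
Before running the full analysis, a simple sanity check on the results file is to look up the sample with the smallest metric. The sketch below assumes a header line followed by rows of the form "x1,x2,...,S" with a single metric in the last column; `best_parameter_set` is a hypothetical helper, the numbers are made up purely to show the layout, and the code is written for Python 3.

```python
import csv
import io


def best_parameter_set(results_text):
    """Return (header, row) for the row with the smallest value in the last column."""
    rows = list(csv.reader(io.StringIO(results_text)))
    header, data = rows[0], rows[1:]
    best = min(data, key=lambda row: float(row[-1]))
    return header, [float(v) for v in best]


# Made-up values, only to illustrate the file layout
example_results = (
    "$kzero,$Ezero,S\n"
    "0.008,0.02,0.41\n"
    "0.011,-0.01,0.07\n"
    "0.014,0.05,0.93\n"
)
header, best = best_parameter_set(example_results)
print(dict(zip(header, best)))
```

This only reports the best single sample; it gives no error bars, which is what the Bayesian analysis adds.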

**This script should be renamed to "run_mecsim_script.sh" and copied to "script/" along with "Settings.inp" created above.**

Back to <a href="#top">top</a>.

In [79]:
if(method_type=='random'):
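    print('Using random sampling method to write to: ' + parent_dir+script_dir+script_name)
    # NOTE: the script is currently written to the working directory (script_name only);
    # copy it to script_dir afterwards, or swap in the commented-out line below
#    with open(parent_dir+script_dir+script_name, "w") as text_file:
    with open(script_name, "w") as text_file: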
        text_file.write("#!/bin/bash\n")
#        text_file.write("cp {0}Settings.inp ./\n".format(script_dir))
        # process harmonics for experimental data - if requested
        if(needs_fft_conversion):
            text_file.write("cp {0}{1} {2}\n".format(script_dir, Experimental_filename, Simulation_output_filename))
            text_file.write("python {0}HarmonicSplitter.py\n".format(python_dir))
            text_file.write("mv Smoothed.txt {0} \n".format(Experimental_FFT_output_filename))
        # setup parameter ranges
        text_file.write("i=1\n")
        text_file.write("imax={0}\n".format(n_simulations))
        # write header for results output file - else will append
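        if(not results_exists):
            # header lists every varied parameter followed by the metric S
            # (x_name and y_name are not defined in this notebook, so use x_para_name)
            text_file.write("echo '{0},S' > {1}\n".format(','.join(x_para_name), results_name))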
        # construct loop over parameters
        text_file.write("while [ $i -le $imax ]\n")
        text_file.write("do\n")
        text_file.write("  i=$((i+1))\n")
        text_file.write("  cp {0}Master.sk ./Master.inp\n".format(script_dir))
        for i in range(len(x_para_name)):
            # pass the x_para_log flag to the sampler as the final True/False argument
            x_para_text = 'x=$(python ' + python_dir + 'ReturnRandomExpFormat.py ' + str(x_para_min[i]) + ' ' + str(x_para_max[i]) + ' ' + str(bool(x_para_log[i])) + ')'
            text_file.write("    {0}\n".format(x_para_text))
            text_file.write("    sed -i 's/{0}/'$x'/g' ./Master.inp\n".format(x_para_name[i]))
            text_file.write("    echo {0} $x\n".format(x_para_name[i]))
    
#        text_file.write("    ./MECSim.exe 2>errors.txt\n")
#        text_file.write("    mv log.txt {0}log_$x$xext_$y$yext.txt\n".format(output_dir))
#        text_file.write("    python {0}HarmonicSplitter.py\n".format(python_dir))
#        text_file.write("    z=$(python {0}CompareSmoothed.py)\n".format(python_dir))
#        text_file.write("    echo $x$xext,$y$yext,$z >> {0}\n".format(results_name))
        text_file.write("done\n")
        # copy out raw results file
#        text_file.write("cp {0} {1}\n".format(results_name, output_dir))
        # run Bayesian analysis (*.txt) and plotter (produces bayesian_plot.pdf and png) with results_name
        if(bayesian_analysis):
            text_file.write("python {0}BayesianAnalysis.py {1}\n".format(python_dir, results_name))
            # for all output file names check BayesianAnalysis.py for consistency
            text_file.write("cp bayesian_plot.* {0}\n".format(output_dir)) # plots
            text_file.write("cp posterior.txt opt_parameters.txt {0}\n".format(output_dir)) 
        # run least squares plotter (also uses results file name as argument)
        if(plot_surfaces):
            text_file.write("python {0}SurfacePlotter.py {1}\n".format(python_dir, results_name))
            text_file.write("cp surface_plot.* {0}\n".format(output_dir))


Using random sampling method to write to: ../script/run_mecsim_script.sh


In [77]:
print("Total number of simulations to be run: " + str(n_simulations))

Total number of simulations to be run: 100
