# Output best fit from posterior output

Takes the previously prepared `Master_with_fixed.sk` and substitutes the best fit values for all parameters, using the final column of `posterior.txt` to provide the probabilities.

The `Master_with_fixed.sk` file is used rather than `Master.sk` as it has been pre-filled with the fixed parameters set in either `GenerateScript.ipynb` or `RandomlySampleRange.ipynb` during their normal script operation.

Requires: 
1. `Master_with_fixed.sk` exists in the **`script`** directory
2. `posterior.txt` exists in the **`output`** directory

Outputs a `Master.inp` file containing the best fit parameters into the **`output`** directory.


### Caution

**This script should not need to be modified by general users.**

### Set default file locations

In [1]:
# location of parent directory: typically this file will be in python/ so the parent dir is '../'
parent_dir = '../'
# output dir for results
output_dir = 'output/'
# script dir containing the skeleton file, relative to the parent directory
script_dir = 'script/'
# skeleton file
skeleton_file = 'Master_with_fixed.sk'
# output filename for best fit set
best_fit_file = 'Master.inp'
# posterior/results file name
script_results_file = 'posterior.txt'
# using posterior (max prob) or results (min least squares)
using_posterior = True



### Load packages

In [2]:
# import required python packages
import numpy as np
#import pandas as pd
import sys
import re


### Check if in script or notebook mode

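When this code is run as the script `OutputBestFit.py` rather than as a notebook, the parent directory is set to `external/` and an optional first command-line argument overrides the output file name.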

In [3]:
# detect script mode by checking whether the invoked file name ends with this script's name
thisCodeName = 'OutputBestFit.py'
if sys.argv[0].endswith(thisCodeName):
    # in script mode the parent directory differs from the notebook default
    parent_dir = 'external/'
    if len(sys.argv) > 1:
        # the first argument, if given, is the output file name
        best_fit_file = sys.argv[1]

## Read posterior file to get the best fit

Read the text file with the posteriors for each parameter set. The posterior is stored in the final column, so all other columns are parameters. Extract the parameter names from the header line and the parameter values for each posterior probability row by row.

In [5]:
# set filename
filename = parent_dir + output_dir + script_results_file
# read header containing parameter names
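with open(filename) as f:
    # strip whitespace and the trailing newline from each column name
    header_line = [name.strip() for name in f.readline().split(',')]
# read posterior data; ndmin=2 keeps a file with a single data row two-dimensional
fit_results = np.loadtxt(filename, delimiter=",", skiprows=1, ndmin=2)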

In [7]:
# determine number of parameters and their names
n_parameters = len(header_line) - 1
para_names = header_line[0:-1]
# find the best fit row: highest posterior probability, or lowest least-squares value
final_metric = fit_results[:, -1]
best_fit = int(np.argmax(final_metric) if using_posterior else np.argmin(final_metric))
# create a dictionary in python for each parameter value corresponding to the best fit
para_values = [str(x) for x in fit_results[best_fit][0:n_parameters]]
replace_dict = dict(zip(para_names, para_values))
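As a small illustration of the selection step, the sketch below parses an in-memory table laid out like `posterior.txt` (a header row of parameter names, with the probability in the final column) and builds the substitution dictionary for the most probable row. The parameter names `alpha` and `beta`, the numbers, and the helper name `best_fit_dict` are invented for this example and are not part of the notebook.

```python
import io
import numpy as np

def best_fit_dict(text):
    """Return {parameter name: value string} for the row with the largest final-column value."""
    buffer = io.StringIO(text)
    # first line holds the column names; the last one labels the probability column
    names = [name.strip() for name in buffer.readline().split(',')]
    # remaining lines are numeric; ndmin=2 keeps a single row two-dimensional
    data = np.loadtxt(buffer, delimiter=',', ndmin=2)
    best_row = int(np.argmax(data[:, -1]))
    values = [str(x) for x in data[best_row, :-1]]
    return dict(zip(names[:-1], values))

# hypothetical three-row posterior table
example = """alpha,beta,posterior
0.1,2.0,0.25
0.3,4.0,0.60
0.5,6.0,0.15
"""
example_dict = best_fit_dict(example)
print(example_dict)
```

The second row has the largest probability, so its parameter values are the ones kept.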

Define a function for replacing the parameter name strings in `Master_with_fixed.sk` with best fit values.

In [9]:
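def replace(string, substitutions):
    # match longer names first so a name that is a prefix of another is not replaced early
    substrings = sorted(substitutions, key=len, reverse=True)
    # build one alternation pattern from the escaped parameter names
    regex = re.compile('|'.join(map(re.escape, substrings)))
    # swap every match for its best fit value
    return regex.sub(lambda match: substitutions[match.group(0)], string)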
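To see why the names are sorted by length, the sketch below repeats the function so that it runs on its own and applies it to two hypothetical parameters, `k1` and `k10`, where one name is a prefix of the other. The names, values, and the sample line are assumptions made for illustration.

```python
import re

def replace(string, substitutions):
    # longest names first, so 'k10' is tried before 'k1'
    substrings = sorted(substitutions, key=len, reverse=True)
    regex = re.compile('|'.join(map(re.escape, substrings)))
    return regex.sub(lambda match: substitutions[match.group(0)], string)

# hypothetical parameter names where one is a prefix of the other
demo_values = {'k1': '0.5', 'k10': '7.0'}
demo_line = 'rate = k1 * x + k10 * y\n'
demo_result = replace(demo_line, demo_values)
print(demo_result)
```

Without the length ordering, the pattern could match `k1` inside `k10` and leave a stray `0` behind. Note that the match is a plain substring match, so a parameter name that also occurs inside unrelated text in the skeleton would be replaced there as well.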

## Read skeleton file and output a best fit Master input file

Output the `Master.inp` file with best fit parameters to the **`output`** directory.

In [11]:
# copy the skeleton line by line, substituting best fit values for parameter names
with open(parent_dir + script_dir + skeleton_file, 'r') as f1, \
     open(parent_dir + output_dir + best_fit_file, 'w') as f2:
    for line in f1:
        f2.write(replace(line, replace_dict))
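One possible sanity check, which is not part of the original notebook, is to confirm that every parameter name from the posterior header actually appears in the skeleton, since a name with no matching placeholder is silently ignored by the substitution. The helper name `missing_placeholders` and the sample skeleton text below are hypothetical.

```python
def missing_placeholders(skeleton_text, names):
    """Return the parameter names that never occur in the skeleton text."""
    return [name for name in names if name not in skeleton_text]

# hypothetical skeleton with placeholders for two of three parameters
demo_skeleton = 'alpha_value = alpha\nbeta_value = beta\n'
demo_missing = missing_placeholders(demo_skeleton, ['alpha', 'beta', 'gamma'])
print(demo_missing)
```

An empty list would mean every fitted parameter has somewhere to go in the output file.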