# Execution file for ASSUAGE

This notebook gives an example workflow incorporating all parts of the ASSUAGE methodology.
There are three components: creating the ground truth dataset, building the surrogate model with curation and training of the ML layer, and finally integrating the PyMOS optimisation.

This code contains example ``full`` and ``reduced`` models to demonstrate the workflow.
The true model is a non-linear algebraic expression of the first three variables, plus a small random term, and returns the true fitness: `true_val = 2 * var1 ** 2 + 3/2 * var2 - 4 * var3 ** (3/2) + np.random.random()/10`
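
As a minimal sketch, the true-fitness expression above can be written as a standalone function. The name `true_fitness` and the optional `rng` argument are assumptions added here for illustration and reproducibility; the example model itself uses `np.random.random()` directly. Note that `var3` must be non-negative for the fractional power to stay real.

```python
import numpy as np


def true_fitness(params, rng=None):
    """Evaluate the example true model on a parameter vector.

    Only the first three entries of `params` are used. `rng` is a
    hypothetical extra argument (not part of the original model) that
    lets the random term be seeded.
    """
    rng = np.random.default_rng() if rng is None else rng
    var1, var2, var3 = params[:3]
    # Deterministic part of the expression quoted in the text
    deterministic = 2 * var1 ** 2 + 3 / 2 * var2 - 4 * var3 ** (3 / 2)
    # Uniform noise in [0, 0.1), matching np.random.random() / 10
    return deterministic + rng.random() / 10


# Example: the remaining parameters do not influence the result
value = true_fitness([0.1, 0.2, 0.25] + [0.0] * 21, rng=np.random.default_rng(0))
```

For these inputs the deterministic part is -0.18, so `value` lies between -0.18 and -0.08.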


In [6]:
# Import any additional packages used in this Jupyter notebook

import os


## Ground truth dataset creation

The cells below run the full model to create the ground-truth data.

**User-set parameters:**
- `numNewRuns`: total number of new runs to create
- `numCoresPerSim`: maximum number of CPU cores used per simulation  
- `numCores`: total number of CPU cores available for generating the ground-truth dataset
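
The sketch below shows how these three parameters might interact, assuming that simulations run concurrently until the core budget is used up. The scheduling inside `start_new_runs` is not shown in this notebook, so the helper `plan_runs` is a hypothetical illustration rather than ASSUAGE's own logic.

```python
import math


def plan_runs(num_new_runs, num_cores_per_sim, num_cores):
    """Hypothetical helper: estimate concurrency for a given core budget."""
    if num_cores_per_sim < 1 or num_cores_per_sim > num_cores:
        raise ValueError("numCoresPerSim must be between 1 and numCores")
    # Number of simulations that fit on the available cores at once
    concurrent = num_cores // num_cores_per_sim
    # Number of successive waves needed to complete all runs
    waves = math.ceil(num_new_runs / concurrent)
    return concurrent, waves


print(plan_runs(40, 1, 10))  # (10, 4)
```

With the values used in this notebook, up to 10 simulations would run at once under that assumption, and the 40 runs would complete in 4 waves.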


In [None]:
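import shutil

# Remove outputs from previous runs so that this example starts from a clean state
for folder in ("groundTruth", "surrogateCreation"):
    shutil.rmtree(folder, ignore_errors=True)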

numNewRuns = 40
numCoresPerSim = 1
numCores = 10

FSItemplateFolder = os.path.join(os.getcwd(), "exampleTemplates","fullModelTemplate")

# Set parameter bounds
lowerBounds = [0.05] * 6 + [0.2] * 6 + [-0.5] * 6 + [0] * 6
upperBounds = [0.5] * 6 + [10] * 6 + [10.5] * 6 + [360] * 6
assert len(lowerBounds) == len(upperBounds), "Upper and lower bound lists must have the same length."


In [8]:

try:
    from encoding import preprocess_parameters as preprocess_func
except ImportError:
    # preprocess_parameters is optional, so fall back to no preprocessing
    print("No preprocess_parameters function found in encoding file")
    preprocess_func = None

try:
    from encoding import parameter_to_model
except ImportError:
    print("No parameter_to_model function found in encoding file. This is a necessary function!!")

## Create ground truth data set
from ASSUAGE.createGroundTruth import start_new_runs
start_new_runs(numNewRuns, FSItemplateFolder, lowerBounds, upperBounds, numCoresPerSim, numCores, parameter_to_model, preprocess_func);


Starting new set of runs with first id  0
Starting simulation in run0
Error copying folder: Command '['cp', '-r', '/home/lizzi/ASSUAGE/testTemplates/fullModelTemplate', '/home/lizzi/ASSUAGE/groundTruth/run0']' returned non-zero exit status 1.


cp: cannot stat '/home/lizzi/ASSUAGE/testTemplates/fullModelTemplate': No such file or directory


### Extract data for the surrogate model from ground truth dataset

Requires two functions in `encoding.py`: `extract_surrogate_inputs` and `extract_surrogate_outputs`.
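
Because the extraction step fails without these functions, it can help to check for them up front. The following is a minimal sketch using only the standard library; the helper name `missing_functions` is an assumption introduced here and is not part of ASSUAGE.

```python
import importlib


def missing_functions(module_name, required):
    """Return the names in `required` that `module_name` does not define as callables."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself cannot be imported, so everything is missing
        return list(required)
    return [name for name in required if not callable(getattr(module, name, None))]


required = ["extract_surrogate_inputs", "extract_surrogate_outputs"]
missing = missing_functions("encoding", required)
if missing:
    print("Missing from encoding.py:", ", ".join(missing))
```

An empty list means both functions are importable, so the extraction cell can be run.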

In [9]:
from ASSUAGE.createGroundTruth import extractData

try:
    from encoding import extract_surrogate_inputs, extract_surrogate_outputs
except ImportError:
    print("Critical surrogate input and output functions not found. Check these are defined in encoding.py")

extractData(extract_surrogate_inputs, extract_surrogate_outputs);

Critical surrogate input and output functions not found. Check these are defined in encoding.py


NameError: name 'extract_surrogate_outputs' is not defined

In [None]:
from ASSUAGE.surrogateModel.dataExploration import DataExploration

d = DataExploration("surrogateCreation/trainingInput.csv","surrogateCreation/trainingOutput.csv")
d.correlation_matrix()
d.explanatory_dimension()

In [None]:
from ASSUAGE.surrogateModel.mlModelFitting import mlModels

m = mlModels("surrogateCreation/trainingInput.csv","surrogateCreation/trainingOutput.csv")
good_models = m.select_promising_ml_models(accuracy_cutoff=0.9);
m.hyperparameter_optimisation(good_models);
m.train_best_model();
