## Create sample grid for training data - USING KARAKAS 10

In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
from Chempy.parameter import ModelParameters
from Chempy.cem_function import extract_parameters_and_priors, posterior_function_returning_predictions
a = ModelParameters()

In [31]:
## This calculates a list of N trial values (N = a.training_size) for each parameter around its prior value, giving one list per parameter (6 here) which will be combined
# Set the desired Gaussian sigma values in the widths parameter (values > prior sigma are used to fully explore parameter space)
# Parameter values are evenly spaced in Gaussian probability, at the k/(N+1) quantiles (e.g. the 12.5, 25, 37.5 etc. percentile points for N = 7)

# Create normalized (mean = 0, sigma = 1) grid and associated parameter space output
from scipy.stats import norm as gaussian # Gaussian function (strange import to avoid clash with numpy)
N = a.training_size
widths = a.neural_widths
prob = np.linspace(1/(N+1), N/(N+1), N) # Uniform grid of N probabilities (np.arange with a float step can return N-1 or N points)
grids_1d = [gaussian.ppf(prob) for _ in range(len(a.p0))] # One 1d grid per parameter in normalized parameter space

# Create normalized grids of all possible combinations (5^6 = 15,625 for N = 5 values and 6 parameters)
norm_param_grid = np.array(np.meshgrid(*grids_1d)).T.reshape(-1,len(a.p0)) # One row per parameter combination
np.save('Neural/norm_training_grid.npy',norm_param_grid)

# Create grids in parameter space
param_grid = [item*widths+a.p0 for item in norm_param_grid]
np.save('Neural/param_grid.npy',param_grid)
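
The grid construction in this cell can be tried without Chempy. The following is a minimal sketch using only the Python standard library: `statistics.NormalDist().inv_cdf` stands in for `scipy.stats.norm.ppf`, and `itertools.product` stands in for the `np.meshgrid` call (the row order differs). The values of `p0`, `widths` and `n_values` are made up for illustration and are not taken from `ModelParameters`.

```python
from itertools import product
from statistics import NormalDist


def quantile_grid(n_values):
    """Return n_values points at the k/(n_values+1) quantiles of a unit Gaussian."""
    unit = NormalDist(mu=0.0, sigma=1.0)
    return [unit.inv_cdf(k / (n_values + 1)) for k in range(1, n_values + 1)]


def build_param_grid(p0, widths, n_values):
    """Return every combination of grid points, rescaled to parameter space."""
    grid_1d = quantile_grid(n_values)
    # One copy of the 1d grid per parameter, combined into all possible rows
    normalized = product(grid_1d, repeat=len(p0))
    return [
        [z * w + mu for z, w, mu in zip(row, widths, p0)]
        for row in normalized
    ]


# Hypothetical prior means and widths for three parameters
p0 = [-2.3, -2.9, 0.5]
widths = [0.6, 0.6, 0.2]
grid = build_param_grid(p0, widths, n_values=5)
print(len(grid))  # 125 rows: 5 values for each of 3 parameters
```

The number of rows grows as `n_values ** len(p0)`, which is why 5 values for 6 parameters already gives 15,625 models to evaluate.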

In [32]:
import warnings
warnings.filterwarnings("ignore")

In [None]:
# USING Karakas 10 yields for now
#param_grid = param_grid[:4] # USE THIS FOR TESTING
training_abundances = np.zeros((len(param_grid),22)) # 22 is number of traceable elements - automate this??
for i,item in enumerate(param_grid):
    abundances,_ = posterior_function_returning_predictions((item,a)) # Send new parameters from grid
    training_abundances[i] = abundances
    if i%10 == 0: # Print progress every 10 models
        print("Calculating abundance %d of %d" %(i,len(param_grid)))
        
np.save('Neural/training_abundances.npy', training_abundances)

Calculating abundance 0 of 15625
Calculating abundance 10 of 15625
Calculating abundance 20 of 15625
Calculating abundance 30 of 15625
Calculating abundance 40 of 15625
Calculating abundance 50 of 15625
Calculating abundance 60 of 15625
Calculating abundance 70 of 15625
Calculating abundance 80 of 15625
Calculating abundance 90 of 15625
Calculating abundance 100 of 15625
Calculating abundance 110 of 15625
Calculating abundance 120 of 15625
Calculating abundance 130 of 15625
Calculating abundance 140 of 15625
Calculating abundance 150 of 15625
Calculating abundance 160 of 15625
Calculating abundance 170 of 15625
Calculating abundance 180 of 15625
Calculating abundance 190 of 15625
Calculating abundance 200 of 15625
Calculating abundance 210 of 15625
Calculating abundance 220 of 15625
Calculating abundance 230 of 15625
Calculating abundance 240 of 15625
Calculating abundance 250 of 15625
Calculating abundance 260 of 15625
Calculating abundance 270 of 15625
Calculating abundance 280 of 15

**Automate number of traceable elements**

## NOTES
- All code is now in the neural.py file
- Should put changeable parameters (e.g. the number of choices for each parameter) in the parameter.py file - DONE
- Find a nicer way of using all rows from the grid - DONE
- Check whether to use Karakas 10 or Karakas 16 - USE KARAKAS 16
- Automate the number of traceable elements - just copy that from the code
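
One possible way to avoid the hard-coded 22 is to take the number of elements from the first prediction instead. The sketch below assumes the prediction function returns one abundance per traceable element. `fake_predict` is a hypothetical stand-in for `posterior_function_returning_predictions`, whose exact return shape is not checked here.

```python
def collect_abundances(param_grid, predict):
    """Run predict on each grid row and check that all results have equal length."""
    rows = []
    n_elements = None
    for params in param_grid:
        abundances = list(predict(params))
        if n_elements is None:
            n_elements = len(abundances)  # inferred from the first result, not hard-coded
        elif len(abundances) != n_elements:
            raise ValueError("inconsistent number of elements in prediction")
        rows.append(abundances)
    return rows, n_elements


def fake_predict(params):
    # Hypothetical stand-in returning 4 made-up "abundances"
    return [sum(params) * k for k in range(4)]


rows, n_elements = collect_abundances([[0.1, 0.2], [0.3, 0.4]], fake_predict)
print(n_elements, len(rows))
```

The collected rows can then be converted with `np.array(rows)` before saving, so the array width follows whatever the model returns.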