# Train Neural Network on Equiatomic CrFeCoNi
<b> nanoHUB tools by: </b>  <i>Mackinzie S. Farnell, Zachary D. McClure</i> and <i>Alejandro Strachan</i>, Materials Engineering, Purdue University <br>

Neural networks are trained to predict relaxed vacancy formation energy (VFE), cohesive energy, pressure, and volume, with unrelaxed bispectrum coefficients and Pymatgen descriptors as inputs. A separate model is trained for each output property, and all models are trained on a structure with 25% Cr, 25% Fe, 25% Co, and 25% Ni. The models are built and trained using the Keras library, and the results indicate that the models have good predictive abilities.

Overview
1. Load Bispectrum Coefficients and Output Properties
2. Add Pymatgen Descriptors
3. Split Data into Testing/Training Data
4. Normalize Data
5. Train Neural Network
6. Evaluate Neural Network
7. Analyze Effects of Central Atom Descriptors as Inputs


In [None]:
# import libraries we will need
import sklearn
from sklearn.utils import shuffle

import tensorflow as tf
import keras as ke
from keras.models import load_model

import json as js
import numpy as np
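import pymatgen.core as pymat  # Element is exposed by pymatgen.core (recent pymatgen releases no longer expose it at the top level)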
import csv
import re

import matplotlib.pyplot as plt
import plotly.offline as p
import plotly.graph_objs as go

## 1. Load Bispectrum Coefficients and Output Properties

The unrelaxed bispectrum coefficients and relaxed vacancy formation energy, cohesive energy, and local atomic pressures and volumes are obtained from a JSON file and stored in a Python dictionary. The bispectrum coefficients are local geometric descriptors based on each atom's 12 nearest neighbors.
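
As a minimal sketch of the data layout described above, the snippet below builds a feature matrix with one row per atom from a toy dictionary that mimics the JSON structure. The coefficient values are made up for illustration; only the key names and the 55 coefficients per atom are taken from this notebook.

```python
import numpy as np

N_BISPECTRUM = 55  # coefficients per atom, as used in this notebook

# Toy stand-in for the JSON contents: 3 atoms with made-up coefficient values
toy_data = {
    "element": ["Cr", "Fe", "Ni"],
    "Unrelaxed_Bispectrum_Coefficients": [
        [0.1 * (i + 1)] * N_BISPECTRUM for i in range(3)
    ],
}

atomic_number = {"Cr": 24, "Fe": 26, "Co": 27, "Ni": 28}

# One row per atom, one column per bispectrum coefficient
coeffs = np.asarray(toy_data["Unrelaxed_Bispectrum_Coefficients"], dtype=float)

# Atomic number of each atom, appended as an extra input column
z = np.array([atomic_number[e] for e in toy_data["element"]], dtype=float)
features = np.column_stack([coeffs, z])

print(features.shape)  # (3, 56)
```

Converting the nested list to an array in one step avoids the repeated `np.append` calls and the later reshape, but it assumes every atom has exactly 55 coefficients.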


In [None]:
# specify properties and filename here
properties = ["Relaxed_VFE", "Cohesive_Energy", "Pressure", "Volume"]
filename = '../data/25_25_25_25.json'
input_prop_key = 'Unrelaxed_Bispectrum_Coefficients'

properties_dictionaries = {}

for output_prop_key in properties:
    # open file and load data into data variable
    with open(filename, 'r') as f:
        data = js.load(f)

    # get relevant information from data variable
    elements = data['element']
    output_properties = data[output_prop_key]
    input_properties = data[input_prop_key]

    # store input and output properties for specific element being searched for
    elements_array = np.array([]) 
    output_properties_array = np.array([])
    input_properties_array = np.array([])

    # create counters to track number of each element
    num_Cr = 0
    num_Fe = 0
    num_Co = 0
    num_Ni = 0
    num_Cu = 0
    
    # iterate through elements and get input and output properties for desired element
    for i, val in enumerate(elements):
        output_properties_array = np.append(output_properties_array, output_properties[i])
        input_properties_array = np.append(input_properties_array, np.asarray(input_properties[i])) 
        if (val == 'Cr'):
            elements_array = np.append(elements_array, 24)
            num_Cr = num_Cr + 1
        elif (val == 'Fe'):
            elements_array = np.append(elements_array, 26)
            num_Fe = num_Fe + 1
        elif (val == 'Co'):
            elements_array = np.append(elements_array, 27)
            num_Co = num_Co + 1
        elif (val == 'Ni'):
            elements_array = np.append(elements_array, 28)
            num_Ni = num_Ni + 1
        elif (val == 'Cu'):
            elements_array = np.append(elements_array, 29)
            num_Cu = num_Cu + 1
 
    # reshape input_properties_element (this should only happen if input property is Bispectrum coefficients)
    num_rows = int (input_properties_array.shape[0]/55)
    input_properties_array = np.reshape(input_properties_array, (num_rows, 55))
    
    # element number is included as input to model
    elements_array = elements_array[np.newaxis].T
    input_properties_array = np.append(input_properties_array, elements_array, 1)

    input_properties_array = input_properties_array.astype(float)  # np.float was removed in NumPy 1.24

    num_elements = np.array([num_Co, num_Cr, num_Fe, num_Ni])

    # if/elif stmts are used to set proper values in dictionary for each property
    if (output_prop_key == 'Relaxed_VFE'):
        min_val = 0.9
        max_val = 2.3
        display = 'Relaxed VFE (eV/atom)'
        units = '(eV/atom)'
    elif (output_prop_key == 'Cohesive_Energy'):
        min_val = -5
        max_val = -4
        display = 'Cohesive Energy (eV/atom)'
        units = '(eV/atom)'
    elif (output_prop_key == 'Pressure'):
        min_val = -7
        max_val = 7
        display = 'Pressure (GPa)'
        units = '(GPa)'
    elif (output_prop_key == 'Volume'):
        min_val = 10.9
        max_val = 11.3
        display = 'Volume (\u212B\u00b3)'
        units = '(\u212B\u00b3)'
    
    # create properties_dictionary
    properties_dictionaries[output_prop_key] = {
        'inputs': input_properties_array,
        'outputs': output_properties_array,
        'length': output_properties_array.shape[0],
        'elements': elements_array,
        'num_element': num_elements,
        'min': min_val,
        'max': max_val,
        'display': display,
        'units': units
    }

Here, we plot histograms to show the distribution of values for each output property, establishing that the values vary across the structure. In the histograms, the elements are separated by color with Cr shown in red, Fe in orange, Co in blue, and Ni in green.
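
The per-element separation used for the histograms can also be sketched with boolean masks. The arrays below are made-up toy values; in this notebook the element array is stored as a column vector, so it would need `.ravel()` before being used this way.

```python
import numpy as np

# Toy stand-ins: atomic numbers and one output value per atom (made-up numbers)
elements = np.array([24, 26, 27, 28, 24, 26])
outputs = np.array([1.0, 1.5, 1.2, 1.8, 1.1, 1.6])

symbols = {24: "Cr", 26: "Fe", 27: "Co", 28: "Ni"}

# A boolean mask selects the outputs that belong to each element
outputs_by_element = {
    symbol: outputs[elements == z] for z, symbol in symbols.items()
}

for symbol, values in outputs_by_element.items():
    print(symbol, len(values), values.mean())
```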

In [None]:
# iterate through each of the properties in dictionary to make the histograms
for key in properties_dictionaries:
    Cr_outs = []
    Fe_outs = []
    Co_outs = []
    Ni_outs = []

    colors = []

    # separate out properties by element
    for i, val in enumerate(properties_dictionaries[key]['elements']):
        if (val == 24):
            colors.append('red')
            Cr_outs.append(properties_dictionaries[key]['outputs'][i])
        elif (val == 26):
            colors.append('orange')
            Fe_outs.append(properties_dictionaries[key]['outputs'][i])
        elif (val == 27):
            colors.append('blue')
            Co_outs.append(properties_dictionaries[key]['outputs'][i])
        elif (val == 28):
            colors.append('green')
            Ni_outs.append(properties_dictionaries[key]['outputs'][i])

    # plot the figures
    fig = go.Figure()
    fig.add_trace(go.Histogram(x=Cr_outs, name = 'Cr', nbinsx=70, marker_color = 'red'))
    fig.add_trace(go.Histogram(x=Fe_outs, name = 'Fe', nbinsx=70, marker_color = 'orange'))
    fig.add_trace(go.Histogram(x=Co_outs, name = 'Co', nbinsx=70, marker_color = 'blue'))
    fig.add_trace(go.Histogram(x=Ni_outs, name = 'Ni', nbinsx=70, marker_color = 'green'))
    fig.update_layout(barmode='overlay')
    fig.update_traces(opacity=0.5)
    fig.update_layout(
        xaxis_title=properties_dictionaries[key]['display'],
        yaxis_title="Frequency",
        font=dict(
            family="Times New Roman, monospace",
            size=24,
            color="black"
        )
    )
    fig.show()

## 2. Add Pymatgen Descriptors

Properties are queried from Pymatgen to use as model inputs along with the bispectrum coefficients. We found that querying additional descriptors helped minimize the error in model predictions, which is further discussed in Section 7: Analyze Effects of Central Atom Descriptors as Inputs.

The descriptors are queried only for the central atom (i.e., the atom we are predicting on). We add central atom descriptors because the output property values depend on which atom we are predicting on, and the descriptors give the neural network a way to distinguish between different atoms. The queried properties are:

- atomic_radius_calculated
- atomic_radius
- atomic_mass
- poissons_ratio
- electrical_resistivity
- thermal_conductivity
- brinell_hardness
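
To illustrate how one central atom descriptor becomes an extra input column, here is a minimal sketch that uses a hard-coded table of standard atomic masses in place of the Pymatgen query. The helper name `add_central_atom_column` and the table are assumptions made for this example, not part of the notebook.

```python
import numpy as np

# Standard atomic masses (u); a hard-coded stand-in for the Pymatgen query
ATOMIC_MASS = {24: 51.9961, 26: 55.845, 27: 58.933, 28: 58.6934}

def add_central_atom_column(inputs, atomic_numbers, table):
    """Return a copy of `inputs` with one extra descriptor column."""
    column = np.array([table[int(z)] for z in atomic_numbers], dtype=float)
    return np.column_stack([inputs, column])

# 3 atoms with 56 existing inputs (55 bispectrum coefficients + atomic number)
inputs = np.zeros((3, 56))
atomic_numbers = [24, 27, 28]

extended = add_central_atom_column(inputs, atomic_numbers, ATOMIC_MASS)
print(extended.shape)  # (3, 57)
```

Repeating this for all seven descriptors would take the 56 existing columns to 63 inputs per atom.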


In [None]:
# declare function to query property from pymatgen for a given element
def get_property(element, property):
    element_object = pymat.Element(element)
    element_prop = getattr(element_object, property)
    return element_prop

# list of properties to add central atom descriptors for
properties = ['atomic_radius_calculated', 'atomic_radius', 'atomic_mass', 
              'poissons_ratio', 'electrical_resistivity', 'thermal_conductivity', 
              'brinell_hardness']

# iterate through all output properties
for key in properties_dictionaries:    
    # iterate through all properties to add
    for add_property in properties:
        atom_properties = []
        elements = properties_dictionaries[key]['elements']
        # determine which element to get property for
        for i in elements:
            if (i == 24):
                ele = 'Cr'
            elif (i == 26):
                ele = 'Fe'
            elif (i == 27):
                ele = 'Co'
            elif (i == 28):
                ele = 'Ni'
            elif (i == 29):
                ele = 'Cu'
            prop = get_property(ele, add_property)
            atom_properties.append(float (prop))

        # add property to array of inputs
        atom_properties = np.asarray(atom_properties) 
        atom_properties = atom_properties[np.newaxis].T
        properties_dictionaries[key]['inputs'] = np.append(properties_dictionaries[key]['inputs'], 
                                                           atom_properties, 1)

We also tested the effects of adding nearest neighbor descriptors to describe the local environment of each atom. We found that these additional descriptors did not improve the model performance.

The central atom and nearest neighbor descriptors were incorporated into one feature using a rule of mixtures equation where $ X_{i} $ is the property value for a nearest neighbor and $ X_{center} $ is the property value for the central atom. This emphasizes the central atom, while still including information about the 12 neighboring atoms.

$$ \text{Input Feature} = \sum\limits_{i=1}^{12} (X_{i} + X_{center}) $$
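
As a worked example of this formula, the sketch below evaluates the feature for a central Fe atom with a hypothetical neighbor shell of three atoms of each element, using atomic mass as the property. The neighbor counts are invented for illustration.

```python
# Standard atomic masses (u), used here as the example property X
X = {"Cr": 51.9961, "Fe": 55.845, "Co": 58.933, "Ni": 58.6934}

central = "Fe"
# Hypothetical composition of the 12 nearest neighbors
neighbor_counts = {"Cr": 3, "Fe": 3, "Co": 3, "Ni": 3}
assert sum(neighbor_counts.values()) == 12

# Sum (X_i + X_center) over the 12 neighbors, grouped by element
feature = sum(
    count * (X[element] + X[central])
    for element, count in neighbor_counts.items()
)

# Equivalent form: the neighbor sum plus 12 copies of the central value
neighbor_sum = sum(count * X[element] for element, count in neighbor_counts.items())
print(feature, neighbor_sum + 12 * X[central])
```

The equivalent form shows why the central atom is emphasized: its value enters the sum 12 times, once per neighbor.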

In [None]:
# variable determines if nearest neighbor descriptors are included
add_nearest_neighbors = False

# function to obtain identity of 12 nearest neighbor atoms
def get_composition(c):
    num_Cr = 0
    num_Fe = 0
    num_Co = 0
    num_Ni = 0
    num_Cu = 0
    for i in c[3:15]:
        if (i == '2.0'):
            num_Cr = num_Cr + 1
        elif (i == '4.0'):
            num_Fe = num_Fe + 1
        elif (i == '1.0'):
            num_Co = num_Co + 1
        elif (i == '5.0'):
            num_Ni = num_Ni + 1
        elif (i == '3.0'):
            num_Cu = num_Cu + 1
    composition_dictionary = {'Cr': num_Cr, 'Fe': num_Fe, 'Co': num_Co, 'Ni': num_Ni, 'Cu': num_Cu}
    return composition_dictionary

# list of properties to query
properties_add = ["atomic_radius_calculated", "atomic_radius", "atomic_mass", 
              "poissons_ratio", "electrical_resistivity", "thermal_conductivity", 
              "brinell_hardness", ]

# iterate through all properties
for key in properties_dictionaries:
    
    if not add_nearest_neighbors:
        continue
        
    # iterate through properties to add
    for add_property in properties_add:
        elements_dictionary = {
            'Cr': '2.0',
            'Fe': '4.0',
            'Co': '1.0',
            'Ni': '5.0',
            'Cu': '3.0'
        }

        properties = []
        elements =  ['Cr', 'Fe', 'Co', 'Ni', 'Cu']
        
        # iterate through all elements
        for ele in elements:
            with open('../data/5000Atom_ID_Type_Neighbors_25_25_25_25.csv') as csvfile:
                readCSV = csv.reader(csvfile, delimiter=',')
                i = 0
                for row in readCSV:
                    if (i != 0):
                        if (elements_dictionary[ele] != row[2]):
                            continue
                            
                        # gets dictionary that says how many of each atom are in 12 nearest neighbors
                        comp = get_composition(row)
                        elements_2 = ['Cr', 'Fe', 'Co', 'Ni', 'Cu']
                        
                        # query properties for 12 nearest neighbors + central atom
                        sum_prop = 0
                        for element in elements_2:
                            prop = get_property(element, add_property) + get_property(ele, add_property)
                            sum_prop = sum_prop + prop * comp[element]
                        properties.append(float (sum_prop))
                    i = i + 1

        # add feature to list of inputs
        properties = np.asarray(properties)
        properties = properties[np.newaxis].T
        properties_dictionaries[key]['inputs'] = np.append(properties_dictionaries[key]['inputs'], properties, 1)

## 3. Split Data into Testing/Training Data
The data is split into two groups: training and testing. Training data is used to train the neural network, so that it can determine how to predict the output properties when given bispectrum coefficients and Pymatgen descriptors as inputs. The testing data is used to evaluate the results of the trained model. We use 80% of the data for training and 20% for testing.
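
A minimal sketch of the shuffle-then-split step is shown here with NumPy only. The notebook itself uses `sklearn.utils.shuffle` with `random_state=0`, so the row order produced below will differ; the function name `shuffle_and_split` is an assumption made for this example.

```python
import numpy as np

def shuffle_and_split(inputs, outputs, train_fraction=0.8, seed=0):
    """Shuffle rows with one shared permutation, then split into train/test."""
    n = len(inputs)
    order = np.random.default_rng(seed).permutation(n)
    split_at = int(train_fraction * n)
    train_idx, test_idx = order[:split_at], order[split_at:]
    return inputs[train_idx], inputs[test_idx], outputs[train_idx], outputs[test_idx]

# Toy data: row i of inputs is [2i, 2i + 1] and its output is i
inputs = np.arange(20).reshape(10, 2)
outputs = np.arange(10)

x_train, x_test, y_train, y_test = shuffle_and_split(inputs, outputs)
print(x_train.shape, x_test.shape)  # (8, 2) (2, 2)
```

Using a single permutation for both arrays keeps each input row paired with its output after shuffling.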

In [None]:
# variable determines what percent of data is used for training, must be less than 1
train_percent = 0.8

# iterate through properties
for key in properties_dictionaries:
    index_split_at = int (train_percent * properties_dictionaries[key]['length'])
  
    # shuffle data
    properties_dictionaries[key]['inputs'], properties_dictionaries[key]['outputs'], \
            properties_dictionaries[key]['elements'] = shuffle(properties_dictionaries[key]['inputs'], \
                                                         properties_dictionaries[key]['outputs'], \
                                                         properties_dictionaries[key]['elements'], \
                                                         random_state=0)

    # split data
    train_inputs, test_inputs = np.split(properties_dictionaries[key]['inputs'], [index_split_at])
    train_outputs, test_outputs = np.split(properties_dictionaries[key]['outputs'], [index_split_at])
    train_elements, test_elements = np.split(properties_dictionaries[key]['elements'], [index_split_at])

    # update properties dictionary
    properties_dictionaries[key]['inputs_train'] = train_inputs
    properties_dictionaries[key]['inputs_test'] = test_inputs
    properties_dictionaries[key]['outputs_train'] = train_outputs
    properties_dictionaries[key]['outputs_test'] = test_outputs
    properties_dictionaries[key]['elements_train'] = train_elements
    properties_dictionaries[key]['elements_test'] = test_elements

## 4. Normalize Data
We normalize the inputs and outputs to the model using mean and standard deviation. We normalize the inputs because each bispectrum coefficient has a different range of values and this difference could lead the model to over-emphasize certain bispectrum coefficients. We normalize the outputs because it improves the model's predictions. Each data point (x) is normalized using the mean ($ \mu $) and standard deviation ($ \sigma $) of the set of values. The testing data is normalized with the mean and standard deviation of the training data so that the model predicts on points from the distribution it was trained on. 

$$ x_{new} = \frac{x - \mu}{\sigma} $$
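
The normalization can be sketched in vectorized form as follows. The statistics come from the training data only and are reused for the test data. Unlike the notebook's helper, which cleans up division by zero with `np.nan_to_num`, this sketch replaces a zero standard deviation with 1; that substitution is a choice made for this example.

```python
import numpy as np

def fit_standardizer(train):
    """Column-wise mean and standard deviation of the training data."""
    return train.mean(axis=0), train.std(axis=0)

def standardize(values, mean, std):
    """Apply (x - mean) / std; constant columns are only shifted by the mean."""
    safe_std = np.where(std == 0, 1.0, std)
    return (values - mean) / safe_std

# Toy data: the second column is constant, so its standard deviation is 0
train = np.array([[1.0, 10.0], [3.0, 10.0]])
test = np.array([[2.0, 10.0]])

mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
test_scaled = standardize(test, mean, std)  # uses the training statistics
print(train_scaled)
print(test_scaled)
```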

In [None]:
# helper function to normalize data
def normalize(test_train_properties, key, means, stdevs):
    dims = test_train_properties[key].shape

    for j in range(0, dims[0]):
        test_train_properties[key][j] = (test_train_properties[key][j] - means)/stdevs
  
    test_train_properties[key] = np.nan_to_num(test_train_properties[key])
    
    return test_train_properties

In [None]:
# iterate through properties
for key in properties_dictionaries:
    # normalize inputs
    means_ins = np.mean(properties_dictionaries[key]["inputs_train"], axis=0)
    stdevs_ins = np.std(properties_dictionaries[key]["inputs_train"], axis=0)
    test_train_properties = normalize(properties_dictionaries[key], "inputs_train", means_ins, stdevs_ins)
    test_train_properties = normalize(properties_dictionaries[key], "inputs_test", means_ins, stdevs_ins)

    # normalize outputs
    means_outs = np.mean(properties_dictionaries[key]["outputs_train"], axis=0)
    stdevs_outs = np.std(properties_dictionaries[key]["outputs_train"], axis=0)
    test_train_properties = normalize(properties_dictionaries[key], "outputs_train", means_outs, stdevs_outs)
    test_train_properties = normalize(properties_dictionaries[key], "outputs_test", means_outs, stdevs_outs)

    # create dictionary to store stats data
    stats_dict = {"means_ins": means_ins, "stdevs_ins": stdevs_ins, 
                  "means_outs": means_outs, "stdevs_outs": stdevs_outs}

    # add stats data for property to dictionary
    properties_dictionaries[key]['stats'] = stats_dict

    # save stats data for each property as JSON
    # (NumPy values are converted to plain Python types on a copy, so stats_dict is left unchanged)
    stats_json = {name: np.asarray(value).tolist() for name, value in stats_dict.items()}
    with open("{}_stats.json".format(key), "w") as f:
        js.dump(stats_json, f)

## 5. Train Neural Network
A neural network is a machine learning model that was inspired by how we think the human brain works. Neural networks consist of a collection of neurons that have different weights and biases and are used to predict desired outputs. Here, we set up the architecture for the neural network. The network has a dense layer with 512 neurons followed by a dropout layer with dropout of 0.2. The activation function used is elu and the adagrad optimizer is used. A simple schematic of the network is shown below.

<p float="left">
  <img src=neural_net.png width="700px" height='300px' align="center" /> 
</p>
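
As a quick check on the size of this architecture, the arithmetic below counts its trainable parameters. It assumes 63 inputs per atom, which is what this notebook produces: 55 bispectrum coefficients, the atomic number, and 7 central atom descriptors.

```python
n_inputs = 55 + 1 + 7   # bispectrum coefficients + atomic number + descriptors
n_hidden = 512          # neurons in the dense layer
n_outputs = 1           # one predicted property per model

# A dense layer has one weight per (input, neuron) pair plus one bias per neuron
hidden_params = n_inputs * n_hidden + n_hidden
output_params = n_hidden * n_outputs + n_outputs

# The dropout layer has no trainable parameters
total_params = hidden_params + output_params
print(n_inputs, hidden_params, output_params, total_params)  # 63 32768 513 33281
```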

We tested a few neural network architectures and activation functions to determine how the architecture/activation function affected the model performance and minimize the error of the predictions. The errors are plotted for each property below for a range of different architectures. The architectures tested all had a dropout layer and included:

- 1 layer with 64 neurons
- 1 layer with 256 neurons
- 1 layer with 512 neurons
- 3 layers with 256 neurons, 512 neurons, and 256 neurons
- 5 layers with 128 neurons, 256 neurons, 512 neurons, 256 neurons, and 128 neurons
    
For relaxed vacancy formation energy, cohesive energy, and volume, the architecture has no effect on the error. For pressure, increasing the size and number of layers gives small improvements in the error. We ultimately chose 1 layer with 512 neurons because it improved on the smaller one-layer networks for the pressure predictions while keeping the network architecture simple.


<p float="left">
  <img src=model-architectures-test.jpg width="700px" height='300px' align="center" /> 
</p>

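For the one-layer network, we tested a few different activation functions. An activation function is the function, usually nonlinear, that each neuron applies to the weighted sum of its inputs to produce its output. The activation functions we tested were: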

- elu
- relu
- tanh
- sigmoid
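
For reference, the four functions can be written in a few lines of NumPy. This is a sketch of their standard definitions (with `alpha=1.0` for elu, which is also the Keras default), not the Keras implementation itself.

```python
import numpy as np

def elu(x, alpha=1.0):
    # x for positive inputs, alpha * (exp(x) - 1) otherwise;
    # np.minimum keeps exp() from overflowing on large positive inputs
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def relu(x):
    # zero for negative inputs, identity otherwise
    return np.maximum(x, 0.0)

def sigmoid(x):
    # squashes inputs into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-1.0, 0.0, 2.0])
for name, fn in [("elu", elu), ("relu", relu), ("tanh", np.tanh), ("sigmoid", sigmoid)]:
    print(name, fn(x))
```

Unlike relu, elu returns small negative values for negative inputs rather than zero.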
    
The results are shown below. Again, there is not a big difference between the results for most of the properties. We went with elu.


<p float="left">
  <img src=model-activation-fxn-test.jpg width="700px" height='300px' align="center" /> 
</p>

Lastly, we looked at the effect of the dropout layer and found that whether or not there was a dropout layer had minimal effect on the results, so we decided to include the dropout layer in our architecture.


<p float="left">
  <img src=model-dropout-layer-test.jpg width="500px" height='200px' align="center" /> 
</p>

In [None]:
# set seeds to use when initializing layers
seeds = [0,1]

# initializers for each layer
initializer1 = ke.initializers.glorot_normal(seed=seeds[0])
bias_initial1 = ke.initializers.Zeros()
    
initializer2 = ke.initializers.glorot_normal(seed=seeds[1])
bias_initial2 = ke.initializers.Zeros()  

# build model - 1 input layer, 1 hidden dense layer followed by dropout, 1 output layer
input_layer = ke.layers.Input(shape=(63, ))

layer1 = ke.layers.Dense(512, activation = 'elu', kernel_initializer = initializer1, 
                         bias_initializer = bias_initial1)(input_layer)
layer2 = ke.layers.Dropout(0.2)(layer1)
output = ke.layers.Dense(1, activation = 'linear', kernel_initializer = initializer2, 
                         bias_initializer = bias_initial2)(layer2)

model = ke.models.Model(inputs=input_layer, outputs=output)

# specify optimizer
optimizer = ke.optimizers.Adagrad(0.002)
  
# compile model for training
model.compile(loss = 'mse', optimizer=optimizer, metrics=['mae'])

# print summary of model
model.summary()

Here the neural network is trained to predict each of the output properties when bispectrum coefficients and central atom descriptors are input. We use 10% of the training data for validation. The validation data differs from the testing data in that it is used to evaluate and improve the model during training, while the test data will be used after training to evaluate model performance. 
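
According to the Keras documentation, `validation_split` holds out the last fraction of the samples passed to `fit`, before any shuffling. The sketch below mimics that selection with plain slicing; the helper name `validation_slices` and the sample count are assumptions for illustration, and the exact rounding inside Keras may differ.

```python
def validation_slices(n_samples, validation_split=0.1):
    """Return (training slice, validation slice) over n_samples rows."""
    split_at = int(n_samples * (1.0 - validation_split))
    return slice(0, split_at), slice(split_at, n_samples)

rows = list(range(100))  # stand-in for 100 training samples
train_slice, val_slice = validation_slices(len(rows))

print(len(rows[train_slice]), len(rows[val_slice]))  # 90 10
```

Because the held-out rows are the last ones, the data should already be shuffled before training, which this notebook does in Section 3.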

During training, we also plot a loss graph showing the mean absolute error for each training epoch. This graph shows how the error on the training set and the validation set changes throughout training.

In [None]:
# iterate through properties 
for key in properties_dictionaries:
    saveid = key

    # split data from dictionary into training and testing data
    train_inputs = properties_dictionaries[key]['inputs_train']
    train_outputs = properties_dictionaries[key]['outputs_train']
    test_inputs = properties_dictionaries[key]['inputs_test']
    test_outputs = properties_dictionaries[key]['outputs_test']
    train_elements = properties_dictionaries[key]['elements_train']
    test_elements = properties_dictionaries[key]['elements_test']
    num_elements = properties_dictionaries[key]['num_element']
  
    # train model
    history = model.fit(train_inputs, train_outputs, batch_size=train_inputs.shape[0],
                      epochs=5000, verbose = False, validation_split = 0.1, shuffle = False)

    # save model in file '<property>_trained_model.h5'
    model.save('{}_trained_model.h5'.format(saveid))
    
    # reload the saved model for evaluation
    saved_model = ke.models.load_model('{}_trained_model.h5'.format(saveid))

    # print loss and MAE of model 
    [loss, mae] = saved_model.evaluate(test_inputs, test_outputs, verbose = 0)
    print("Loss: ", loss)
    print("Mean Absolute Error: ", mae)

    # plot mean absolute error as a function of training epoch
    # (the history key is 'mae' in newer Keras versions and 'mean_absolute_error' in older ones)
    mae_key = 'mae' if 'mae' in history.history else 'mean_absolute_error'
    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Abs Error')
    plt.title('Mean Abs Error versus Epoch - {}'.format(key))
    plt.plot(history.epoch, np.array(history.history[mae_key]),label='Training MAE')
    plt.plot(history.epoch, np.array(history.history['val_' + mae_key]),label='Validation MAE')
    plt.legend()
    plt.show()

## 6. Evaluate Neural Network
The model is evaluated by plotting the predicted versus actual data and calculating the mean absolute error (MAE) and mean squared error (MSE) of the test predictions (equations are shown below). We divide the MAE and MSE by the range of each property so that we can compare the error across output properties. For a well-trained model, the points in the predicted versus actual plot should fall along the x=y line; a point exactly on the line is a prediction that matches the data perfectly. For our models, the points are clustered around the x=y line, which indicates that the neural network has predictive power.

$$ MSE = \frac{\frac{1}{n}\sum\limits _{i=1} ^{n}(Y_{i}-\hat{Y}_{i})^2}{max-min} $$


$$ MAE = \frac{\frac{1}{n}\sum\limits _{i=1} ^{n}|Y_{i}-\hat{Y}_{i}|}{max-min} $$
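
The two range-normalized errors can be sketched as a small function. The name `range_normalized_errors` and the toy values are assumptions for illustration; the range is taken from the actual values, as in the evaluation code in this section.

```python
import numpy as np

def range_normalized_errors(actual, predicted):
    """Return (MAE / range, MSE / range), with the range taken from `actual`."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    value_range = actual.max() - actual.min()
    residual = predicted - actual
    mae = np.mean(np.abs(residual)) / value_range
    mse = np.mean(residual ** 2) / value_range
    return mae, mse

# Toy values: range = 2, residuals = [0.5, 0.0, -0.5]
mae_norm, mse_norm = range_normalized_errors([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])
print(f"MAE/range: {mae_norm:.5f}")
print(f"MSE/range: {mse_norm:.5f}")
```

Note that MAE/range is dimensionless, while MSE/range still carries the units of the property.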


In [None]:
for key in properties_dictionaries:
    # put model filename and stats_dict into variables
    # (a separate name keeps the Keras `model` object from being overwritten by a string)
    model_file = '{}_trained_model.h5'.format(key)
    stats_dict = properties_dictionaries[key]['stats']

    # separate out train/test inputs/outputs
    train_inputs = properties_dictionaries[key]['inputs_train']
    train_outputs = properties_dictionaries[key]['outputs_train']
    test_inputs = properties_dictionaries[key]['inputs_test']
    test_outputs = properties_dictionaries[key]['outputs_test']
    train_elements = properties_dictionaries[key]['elements_train']
    test_elements = properties_dictionaries[key]['elements_test']
    num_elements = properties_dictionaries[key]['num_element']

    saved_model = ke.models.load_model(model_file)

    # undo normalization on output data for plotting
    train_data = np.zeros((len(train_outputs),2))
    train_data[:,0] = train_outputs * stats_dict["stdevs_outs"] + stats_dict["means_outs"]
    test_data = np.zeros((len(test_outputs),2))
    test_data[:,0] = test_outputs * stats_dict["stdevs_outs"] + stats_dict["means_outs"]
    
    # predict outputs of train/test data
    predict_train_data = saved_model.predict(train_inputs)
    predict_train_data = predict_train_data * stats_dict["stdevs_outs"] + stats_dict["means_outs"]
    predict_test_data = saved_model.predict(test_inputs)
    predict_test_data = predict_test_data * stats_dict["stdevs_outs"] + stats_dict["means_outs"]

    # store predicted values in train/test_data arrays
    # (predict returns an array of shape (n, 1), so take its only column)
    train_data[:,1] = predict_train_data[:,0]
    test_data[:,1] = predict_test_data[:,0]

    element_ids = [24, 26, 27, 28]
    
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=train_data[:,0], y=train_data[:,1],
                mode='markers',
                name='Training Dataset'))
    fig.add_trace(go.Scatter(x=test_data[:,0], y=test_data[:,1],
                mode='markers',
                name='Test Dataset'))
    x_lin = [-24, 24]
    fig.add_trace(go.Scatter(x=x_lin, y=x_lin,
                mode='lines',
                name='lines'))
    fig.update_xaxes(range=[properties_dictionaries[key]['min'], properties_dictionaries[key]['max']])
    fig.update_yaxes(range=[properties_dictionaries[key]['min'], properties_dictionaries[key]['max']])
    fig.update_layout(
        showlegend=False,
        xaxis_title="Molecular Mechanics {}".format(properties_dictionaries[key]['units']),
        yaxis_title="Neural Network {}".format(properties_dictionaries[key]['units']),
        title = "{}".format(key),
        font=dict(
            family="Times New Roman, monospace",
            size=24,
            color= "black"
        )    
    )

    fig.show()
    
    # calculate errors
    test_mse = np.mean((test_data[:,1]-test_data[:,0])**2)
    test_mae = np.mean(np.abs(test_data[:,1]-test_data[:,0]))
    test_error = (test_data[:,1]-test_data[:,0])
  
    # find a normalized mse and mae (names chosen to avoid shadowing the built-in max/min)
    max_actual = np.amax(test_data[:,0])
    min_actual = np.amin(test_data[:,0])
    test_range = np.abs(max_actual - min_actual)
    mse_norm = test_mse/test_range
    mae_norm = test_mae/test_range

    # print MSE/MAE
    print(f'Test_MAE/range: {mae_norm:.5f}')
    print(f'Test_MSE/range: {mse_norm:.5f}')
    

## 7. Analyze Effects of Central Atom Descriptors as Inputs

We compare the results of using the following inputs for the model:
- only bispectrum coefficients
- bispectrum coefficients and central atom descriptors
- bispectrum coefficients, central atom descriptors, and nearest neighbor descriptors.

A plot that shows the error for each property with these three sets of inputs is shown below. For relaxed vacancy formation energy, pressure, and volume, using the central atom descriptors leads to small improvements in the MAE. For the cohesive energy, the central atom descriptors provide a larger improvement for the Cr and Co atoms. Adding nearest neighbor descriptors does not offer additional improvements beyond the central atom descriptors.

<p float="left">
  <img src=MAE-compare-bs-centr-nn.jpg width="700px" height='300px' align="center" /> 
</p>

The scatter plots below show the neural network predictions versus the molecular mechanics predictions for each property. Using central atom descriptors results in the neural network predictions more closely matching the molecular mechanics predictions.

<p float="left">
  <img src=bs-coeffs/relaxed-vfe-CrFeCoNi-bs-centr-compare.png width="700px" height='300px' align="center" /> 
</p>

<p float="left">
  <img src=bs-coeffs/cohesive-energy-CrFeCoNi-bs-centr-compare.png width="700px" height='300px' align="center" /> 
</p>

<p float="left">
  <img src=bs-coeffs/pressure-CrFeCoNi-bs-centr-compare.png width="700px" height='300px' align="center" /> 
</p>

<p float="left">
  <img src=bs-coeffs/volume-CrFeCoNi-bs-centr-compare.png width="700px" height='300px' align="center" /> 
</p>


In [None]:
print('done!')