# trixs-dl-models
## Models
* [A Deep-Learning model](Train_Run_DL_Models.ipynb) based on the [Random Forest model](https://github.com/TRI-AMDD/trixs/blob/Torrisi_XANES_RF_2020/notebooks/Train_Run_Models.ipynb) from a published article: 
   * [Random forest machine learning models for interpretable X-ray absorption near-edge structure spectrum-property relationships](https://www.nature.com/articles/s41524-020-00376-6)
* Current version: deep learning (DL) vs. Random Forest (RF) on pointwise spectral data
  * NN model without regularization: [Train_Run_DL_Models.ipynb](Train_Run_DL_Models.ipynb)
  * NN model with regularization: [Train_Run_DL_Models_V2.ipynb](Train_Run_DL_Models_V2.ipynb)
  * CNN model:
    * trained with original data : [Train_Run_DL_Models_CNN_originalData.ipynb](Train_Run_DL_Models_CNN_originalData.ipynb)
    * data augmentation (average pooling): [Train_Run_DL_Models_CNN_moreData.ipynb](Train_Run_DL_Models_CNN_moreData.ipynb)
* All scenarios:
  
| Status                   | Model      | Data      | Iteration/Epoch | Cross-Validation | Kernel-Size | Feature Importance | Notebook | Performance Bader | Performance MD | Performance All |
| ------------------------ | -----------| --------- | -------------   | ---------------- | ----------- | --  | -------- | -------- | -------- | -------- |
| :heavy_check_mark: | Random Forest    | original  | 300             | 3                | -           | Yes | [done](Train_Run_DL_Models_CNN_originalData.ipynb)| ![img](figures_feffnorm/feff_cnn_originalData_20_bader_uniparity.svg) | ![img](figures_feffnorm/feff_cnn_originalData_20_md_uniparity.svg) | ![img](figures_feffnorm/feff_cnn_originalData_20_all_perf.svg) | 
| :heavy_check_mark: | Neural Networks  | original  | 300             | 3                | -           | Yes | [done](Train_Run_DL_Models.ipynb)| | | |
| :heavy_check_mark: | CNN              | original  | 300             | 3                | 5           | No | [done](Train_Run_DL_Models_CNN_originalData.ipynb)|  ![img](figures_feffnorm/feff_cnn_originalData_5_bader_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_originalData_5_md_uniparity_nn.svg) |  ![img](figures_feffnorm/feff_cnn_originalData_5_all_perf_nn.svg) |  
| :heavy_check_mark: | CNN              | original  | 300             | 3                | 10          | No | [done](Train_Run_DL_Models_CNN_originalData_10.ipynb)| ![img](figures_feffnorm/feff_cnn_originalData_10_bader_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_originalData_10_md_uniparity_nn.svg) |  ![img](figures_feffnorm/feff_cnn_originalData_10_all_perf_nn.svg) | 
| :heavy_check_mark: | CNN              | original  | 300             | 3                | 20          | No | [done](Train_Run_DL_Models_CNN_originalData_20.ipynb)|  ![img](figures_feffnorm/feff_cnn_originalData_20_bader_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_originalData_20_md_uniparity_nn.svg) |  ![img](figures_feffnorm/feff_cnn_originalData_20_all_perf_nn.svg) |  
| :heavy_check_mark: | Random Forest    | augmented | 300             | 3                | -           | Yes | [done](Train_Run_DL_Models_CNN_moreData.ipynb)| ![img](figures_feffnorm/feff_cnn_moreData_20_bader_uniparity.svg) | ![img](figures_feffnorm/feff_cnn_moreData_20_md_uniparity.svg) |  ![img](figures_feffnorm/feff_cnn_moreData_20_all_perf.svg) | 
| :heavy_check_mark: | Neural Networks  | augmented | 300             | 3                | -           | Yes | [done](Train_Run_DL_Models_moreData.ipynb)| | | |
| :heavy_check_mark: | CNN              | augmented | 300             | 3                | 5           | No | [done](Train_Run_DL_Models_CNN_moreData_5.ipynb)| ![img](figures_feffnorm/feff_cnn_moreData_5_bader_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_moreData_5_md_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_moreData_5_all_perf_nn.svg) | 
| :heavy_check_mark: | CNN              | augmented | 300             | 3                | 10          | No | [done](Train_Run_DL_Models_CNN_moreData_10.ipynb)| ![img](figures_feffnorm/feff_cnn_moreData_10_bader_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_moreData_10_md_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_moreData_10_all_perf_nn.svg) | 
| :heavy_check_mark: | CNN              | augmented | 300             | 3                | 20          | No | [done](Train_Run_DL_Models_CNN_moreData_20.ipynb)| ![img](figures_feffnorm/feff_cnn_moreData_20_bader_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_moreData_20_md_uniparity_nn.svg) | ![img](figures_feffnorm/feff_cnn_moreData_20_all_perf_nn.svg) | 


## Data:
* training data: https://data.matr.io/4/

```
wget https://s3.amazonaws.com/publications.matr.io/4/deployment/data/xanes_2019.zip

unzip xanes_2019.zip

git clone https://github.com/fengchenLBL/trixs-dl-models.git

cp -rf matrio_folder/spectral_data matrio_folder/model_data ./trixs-dl-models

cd trixs-dl-models
```
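
After copying, it can help to confirm that both data folders landed in the repository root before opening a notebook. The following is a minimal sketch using only the Python standard library. The helper name `missing_data_dirs` is hypothetical, and the folder names are taken from the `cp` command above.

```python
from pathlib import Path

# Folder names taken from the `cp` command above.
EXPECTED_DIRS = ("spectral_data", "model_data")


def missing_data_dirs(repo_root="."):
    """Return the expected data folders that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from inside the cloned trixs-dl-models folder.
    missing = missing_data_dirs(".")
    if missing:
        print("Missing data folders:", ", ".join(missing))
    else:
        print("All expected data folders are present.")
```

An empty list means both folders were found. Any names returned point to a copy step that needs to be repeated.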

## References:
* [https://www.nature.com/articles/s41524-020-00376-6](https://www.nature.com/articles/s41524-020-00376-6)
* [https://github.com/TRI-AMDD/trixs/blob/Torrisi_XANES_RF_2020/notebooks/Train_Run_Models.ipynb](https://github.com/TRI-AMDD/trixs/blob/Torrisi_XANES_RF_2020/notebooks/Train_Run_Models.ipynb)
* [https://data.matr.io/4/](https://data.matr.io/4/)


In [7]:
import os
import pandas as pd
import ipywidgets as widgets
from ipywidgets import interactive
from IPython.display import SVG, display

## Results

In [2]:
### Random Forest w/ Original Dataset 
rf_df_originalData = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_originalData_20.csv')
rf_df_originalData_b = 'figures_feffnorm/feff_cnn_originalData_20_bader_uniparity.svg'
rf_df_originalData_md = 'figures_feffnorm/feff_cnn_originalData_20_md_uniparity.svg'

### CNN w/ Original Dataset & Kernel size 20 
cnn_df_originalData_20 = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_originalData_20_nn.csv')
cnn_df_originalData_20_b = 'figures_feffnorm/feff_cnn_originalData_20_bader_uniparity_nn.svg'
cnn_df_originalData_20_md = 'figures_feffnorm/feff_cnn_originalData_20_md_uniparity_nn.svg'

### CNN w/ Original Dataset & Kernel size 10
cnn_df_originalData_10 = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_originalData_10_nn.csv')
cnn_df_originalData_10_b = 'figures_feffnorm/feff_cnn_originalData_10_bader_uniparity_nn.svg'
cnn_df_originalData_10_md = 'figures_feffnorm/feff_cnn_originalData_10_md_uniparity_nn.svg'

### CNN w/ Original Dataset & Kernel size 5 
cnn_df_originalData_5 = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_originalData_5_nn.csv')
cnn_df_originalData_5_b = 'figures_feffnorm/feff_cnn_originalData_5_bader_uniparity_nn.svg'
cnn_df_originalData_5_md = 'figures_feffnorm/feff_cnn_originalData_5_md_uniparity_nn.svg'

### Random Forest w/ Augmented Dataset 
rf_df_moreData = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_moreData_20.csv')
rf_df_moreData_b = 'figures_feffnorm/feff_cnn_moreData_20_bader_uniparity.svg'
rf_df_moreData_md = 'figures_feffnorm/feff_cnn_moreData_20_md_uniparity.svg'

### CNN w/ Augmented Dataset & Kernel size 20 
cnn_df_moreData_20 = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_moreData_20_nn.csv')
cnn_df_moreData_20_b = 'figures_feffnorm/feff_cnn_moreData_20_bader_uniparity_nn.svg'
cnn_df_moreData_20_md = 'figures_feffnorm/feff_cnn_moreData_20_md_uniparity_nn.svg'

### CNN w/ Augmented Dataset & Kernel size 10
cnn_df_moreData_10 = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_moreData_10_nn.csv')
cnn_df_moreData_10_b = 'figures_feffnorm/feff_cnn_moreData_10_bader_uniparity_nn.svg'
cnn_df_moreData_10_md = 'figures_feffnorm/feff_cnn_moreData_10_md_uniparity_nn.svg'

### CNN w/ Augmented Dataset & Kernel size 5 
cnn_df_moreData_5 = pd.read_csv('figures_feffnorm/pointwise_table_feff_cnn_moreData_5_nn.csv')
cnn_df_moreData_5_b = 'figures_feffnorm/feff_cnn_moreData_5_bader_uniparity_nn.svg'
cnn_df_moreData_5_md = 'figures_feffnorm/feff_cnn_moreData_5_md_uniparity_nn.svg'
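
The paths in the cell above follow a regular naming pattern, so they could also be generated instead of typed out. The sketch below is one way to do that. The function name `result_paths` and its parameters are hypothetical, and the pattern is inferred only from the file names listed above, where CNN outputs carry an `_nn` suffix and Random Forest outputs do not.

```python
FIG_DIR = "figures_feffnorm"


def result_paths(dataset, kernel, cnn=True):
    """Build the table and figure paths for one run.

    dataset: "originalData" or "moreData"
    kernel:  kernel size used in the file name (5, 10, or 20)
    cnn:     True for CNN outputs (`_nn` suffix), False for Random Forest
    """
    suffix = "_nn" if cnn else ""
    stem = f"feff_cnn_{dataset}_{kernel}"
    return {
        "table": f"{FIG_DIR}/pointwise_table_{stem}{suffix}.csv",
        "bader": f"{FIG_DIR}/{stem}_bader_uniparity{suffix}.svg",
        "md": f"{FIG_DIR}/{stem}_md_uniparity{suffix}.svg",
    }


# Example: the Random Forest results on the original dataset
# are stored under the kernel-20 file names.
rf_original = result_paths("originalData", 20, cnn=False)
print(rf_original["table"])
```

The `"table"` entry can be passed to `pd.read_csv`, and the `"bader"` and `"md"` entries correspond to the `_b` and `_md` variables above.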

In [3]:
### Dropdown List to show performance tables
dfs1 = {'Random Forest: Original': rf_df_originalData,
       'CNN: Original & Kernel 5': cnn_df_originalData_5, 
       'CNN: Original & Kernel 10': cnn_df_originalData_10,
       'CNN: Original & Kernel 20': cnn_df_originalData_20
      }


dfs2 = {
       'Random Forest: Augmented': rf_df_moreData,
       'CNN: Augmented & Kernel 5': cnn_df_moreData_5, 
       'CNN: Augmented & Kernel 10': cnn_df_moreData_10,
       'CNN: Augmented & Kernel 20': cnn_df_moreData_20
      }

items1 = list(dfs1.keys())
items1.extend(list(dfs2.keys()))
items2 = list(dfs2.keys())
items2.extend(list(dfs1.keys()))

dfs1.update(dfs2)

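def view1(table=''):
    # Fall back to the first entry (original dataset) when nothing is selected.
    if table == '':
        table = items1[0]
    display(dfs1[table])

def view2(table=''):
    # Fall back to the first entry (augmented dataset) when nothing is selected.
    if table == '':
        table = items2[0]
    display(dfs1[table])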
 
w1 = widgets.Dropdown(options=items1)
w2 = widgets.Dropdown(options=items2)


fig_dfs = {'RF: Original b': rf_df_originalData_b,
           'RF: Original md': rf_df_originalData_md,
           'CNN: Original & Kernel 5 b': cnn_df_originalData_5_b,
           'CNN: Original & Kernel 5 md': cnn_df_originalData_5_md,
           'CNN: Original & Kernel 10 b': cnn_df_originalData_10_b,
           'CNN: Original & Kernel 10 md': cnn_df_originalData_10_md,
           'CNN: Original & Kernel 20 b': cnn_df_originalData_20_b,
           'CNN: Original & Kernel 20 md': cnn_df_originalData_20_md,
           'RF: Augmented b': rf_df_moreData_b,
           'RF: Augmented md': rf_df_moreData_md,
           'CNN: Augmented & Kernel 5 b': cnn_df_moreData_5_b,
           'CNN: Augmented & Kernel 5 md': cnn_df_moreData_5_md,
           'CNN: Augmented & Kernel 10 b': cnn_df_moreData_10_b,
           'CNN: Augmented & Kernel 10 md': cnn_df_moreData_10_md,
           'CNN: Augmented & Kernel 20 b': cnn_df_moreData_20_b,
           'CNN: Augmented & Kernel 20 md': cnn_df_moreData_20_md
          }

## Original Dataset 
__Select the model name from the dropdown list.__

In [4]:
interactive(view1, table=w1)

Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,51.38,87.54 $\pm$ 0.27,81.53 $ \pm$ 0.89,87.58 $ \pm$ 0.27,88.60 $ \pm$ 0.42,13.79 $\pm$14.95,0.070 $\pm$0.001,86.41 $\pm$0.36,0.013 $\pm$0.000
1,V,38.28,88.47 $\pm$ 0.51,96.42 $ \pm$ 0.38,82.07 $ \pm$ 0.78,87.69 $ \pm$ 0.48,78.89 $\pm$1.44,0.080 $\pm$0.000,92.98 $\pm$0.11,0.015 $\pm$0.000
2,Cr,59.15,86.60 $\pm$ 0.69,94.28 $ \pm$ 0.52,68.09 $ \pm$ 1.36,90.59 $ \pm$ 0.53,80.97 $\pm$1.47,0.060 $\pm$0.001,83.83 $\pm$0.34,0.019 $\pm$0.000
3,Mn,49.55,80.12 $\pm$ 0.26,61.43 $ \pm$ 1.91,80.37 $ \pm$ 0.24,81.30 $ \pm$ 0.33,68.41 $\pm$3.43,0.060 $\pm$0.000,91.91 $\pm$0.10,0.017 $\pm$0.000
4,Fe,46.26,84.24 $\pm$ 0.18,82.75 $ \pm$ 0.57,83.21 $ \pm$ 0.22,86.08 $ \pm$ 0.29,72.03 $\pm$2.34,0.090 $\pm$0.000,89.24 $\pm$0.12,0.015 $\pm$0.000
5,Co,52.62,81.45 $\pm$ 0.47,80.81 $ \pm$ 0.54,68.70 $ \pm$ 0.67,88.56 $ \pm$ 0.40,73.07 $\pm$3.10,0.060 $\pm$0.001,91.23 $\pm$0.16,0.016 $\pm$0.000
6,Ni,67.06,89.02 $\pm$ 0.64,83.10 $ \pm$ 1.14,79.06 $ \pm$ 1.37,93.25 $ \pm$ 0.41,7.14 $\pm$19.31,0.060 $\pm$0.001,89.44 $\pm$0.22,0.014 $\pm$0.000
7,Cu,67.75,85.06 $\pm$ 0.46,63.11 $ \pm$ 2.60,89.56 $ \pm$ 0.29,81.90 $ \pm$ 1.16,38.32 $\pm$9.89,0.080 $\pm$0.001,70.37 $\pm$0.55,0.047 $\pm$0.000
8,'Avgs.',54.01,85.31,80.43,79.83,87.25,54.08,0.07,86.92,0.02


## Augmented Dataset 
__Select the model name from the dropdown list.__

In [5]:
interactive(view2, table=w2)

Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,34.43,91.10 $\pm$ 0.04,97.36 $ \pm$ 0.07,86.63 $ \pm$ 0.05,89.37 $ \pm$ 0.11,21.71 $\pm$25.82,0.070 $\pm$0.000,64.13 $\pm$0.13,0.027 $\pm$0.000
1,V,34.35,88.29 $\pm$ 0.03,95.18 $ \pm$ 0.09,81.90 $ \pm$ 0.11,87.60 $ \pm$ 0.11,78.30 $\pm$2.72,0.080 $\pm$0.000,89.73 $\pm$0.04,0.025 $\pm$0.000
2,Cr,34.93,93.98 $\pm$ 0.10,98.88 $ \pm$ 0.04,90.78 $ \pm$ 0.15,92.32 $ \pm$ 0.11,77.03 $\pm$3.43,0.070 $\pm$0.000,88.07 $\pm$0.05,0.031 $\pm$0.000
3,Mn,34.54,89.17 $\pm$ 0.05,98.81 $ \pm$ 0.03,84.31 $ \pm$ 0.09,85.26 $ \pm$ 0.04,79.40 $\pm$2.73,0.060 $\pm$0.000,86.55 $\pm$0.06,0.027 $\pm$0.000
4,Fe,35.2,89.53 $\pm$ 0.05,95.13 $ \pm$ 0.06,84.61 $ \pm$ 0.09,88.99 $ \pm$ 0.08,66.35 $\pm$6.47,0.100 $\pm$0.000,79.29 $\pm$0.04,0.027 $\pm$0.000
5,Co,34.19,91.44 $\pm$ 0.13,96.79 $ \pm$ 0.11,87.09 $ \pm$ 0.21,90.48 $ \pm$ 0.14,64.92 $\pm$7.12,0.080 $\pm$0.000,82.61 $\pm$0.06,0.029 $\pm$0.000
6,Ni,34.98,93.45 $\pm$ 0.04,99.30 $ \pm$ 0.01,90.28 $ \pm$ 0.07,91.17 $ \pm$ 0.05,-21.42 $\pm$48.99,0.070 $\pm$0.000,79.26 $\pm$0.07,0.026 $\pm$0.000
7,Cu,34.78,94.52 $\pm$ 0.09,96.05 $ \pm$ 0.17,91.98 $ \pm$ 0.13,95.59 $ \pm$ 0.09,52.02 $\pm$13.11,0.090 $\pm$0.000,54.66 $\pm$0.09,0.060 $\pm$0.000
8,'Avgs.',34.67,91.44,97.19,87.2,90.1,52.29,0.08,78.04,0.03
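
The table cells store each metric as a LaTeX-formatted string such as `87.54 $\pm$ 0.27`, which cannot be sorted or plotted directly. The following sketch splits such a cell into a numeric mean and standard deviation. The helper name `parse_cell` is hypothetical, and the pattern covers only the cell formats visible in the tables in this notebook.

```python
import re

# Matches "mean", optionally followed by "$\pm$ std" with flexible spacing.
_CELL = re.compile(
    r"^\s*(-?\d+(?:\.\d+)?)"                      # mean (may be negative)
    r"(?:\s*\$\s*\\pm\s*\$\s*(\d+(?:\.\d+)?))?"   # optional "$\pm$ std"
    r"\s*$"
)


def parse_cell(cell):
    """Return (mean, std) as floats; std is None for plain numbers."""
    match = _CELL.match(str(cell))
    if match is None:
        raise ValueError(f"Unrecognized cell format: {cell!r}")
    mean = float(match.group(1))
    std = float(match.group(2)) if match.group(2) is not None else None
    return mean, std


print(parse_cell(r"87.54 $\pm$ 0.27"))   # (87.54, 0.27)
print(parse_cell(r"-21.42 $\pm$48.99"))  # (-21.42, 48.99)
print(parse_cell(91.44))                 # (91.44, None)
```

Applied to a column with `DataFrame[column].map(parse_cell)`, this gives numeric pairs while leaving the original display strings untouched.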


## Performance Metrics from All Models

In [6]:
for x in dfs1:
    print(x)
    display(dfs1[x])

Random Forest: Original


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,51.38,87.54 $\pm$ 0.27,81.53 $ \pm$ 0.89,87.58 $ \pm$ 0.27,88.60 $ \pm$ 0.42,13.79 $\pm$14.95,0.070 $\pm$0.001,86.41 $\pm$0.36,0.013 $\pm$0.000
1,V,38.28,88.47 $\pm$ 0.51,96.42 $ \pm$ 0.38,82.07 $ \pm$ 0.78,87.69 $ \pm$ 0.48,78.89 $\pm$1.44,0.080 $\pm$0.000,92.98 $\pm$0.11,0.015 $\pm$0.000
2,Cr,59.15,86.60 $\pm$ 0.69,94.28 $ \pm$ 0.52,68.09 $ \pm$ 1.36,90.59 $ \pm$ 0.53,80.97 $\pm$1.47,0.060 $\pm$0.001,83.83 $\pm$0.34,0.019 $\pm$0.000
3,Mn,49.55,80.12 $\pm$ 0.26,61.43 $ \pm$ 1.91,80.37 $ \pm$ 0.24,81.30 $ \pm$ 0.33,68.41 $\pm$3.43,0.060 $\pm$0.000,91.91 $\pm$0.10,0.017 $\pm$0.000
4,Fe,46.26,84.24 $\pm$ 0.18,82.75 $ \pm$ 0.57,83.21 $ \pm$ 0.22,86.08 $ \pm$ 0.29,72.03 $\pm$2.34,0.090 $\pm$0.000,89.24 $\pm$0.12,0.015 $\pm$0.000
5,Co,52.62,81.45 $\pm$ 0.47,80.81 $ \pm$ 0.54,68.70 $ \pm$ 0.67,88.56 $ \pm$ 0.40,73.07 $\pm$3.10,0.060 $\pm$0.001,91.23 $\pm$0.16,0.016 $\pm$0.000
6,Ni,67.06,89.02 $\pm$ 0.64,83.10 $ \pm$ 1.14,79.06 $ \pm$ 1.37,93.25 $ \pm$ 0.41,7.14 $\pm$19.31,0.060 $\pm$0.001,89.44 $\pm$0.22,0.014 $\pm$0.000
7,Cu,67.75,85.06 $\pm$ 0.46,63.11 $ \pm$ 2.60,89.56 $ \pm$ 0.29,81.90 $ \pm$ 1.16,38.32 $\pm$9.89,0.080 $\pm$0.001,70.37 $\pm$0.55,0.047 $\pm$0.000
8,'Avgs.',54.01,85.31,80.43,79.83,87.25,54.08,0.07,86.92,0.02


CNN: Original & Kernel 5


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,51.38,86.37 $\pm$ 0.71,78.93 $ \pm$ 2.94,86.53 $ \pm$ 0.81,87.59 $ \pm$ 0.90,31.82 $\pm$12.40,0.070 $\pm$0.004,84.07 $\pm$2.35,0.015 $\pm$0.003
1,V,38.28,84.38 $\pm$ 1.40,95.59 $ \pm$ 0.71,78.05 $ \pm$ 1.58,81.39 $ \pm$ 2.07,80.36 $\pm$1.94,0.080 $\pm$0.003,94.51 $\pm$1.08,0.015 $\pm$0.003
2,Cr,59.15,82.98 $\pm$ 1.35,91.52 $ \pm$ 1.73,65.15 $ \pm$ 3.15,87.28 $ \pm$ 1.00,84.78 $\pm$2.49,0.070 $\pm$0.008,87.56 $\pm$1.94,0.024 $\pm$0.005
3,Mn,49.55,80.35 $\pm$ 0.98,55.45 $ \pm$ 2.88,81.04 $ \pm$ 1.03,81.91 $ \pm$ 1.05,78.64 $\pm$2.57,0.060 $\pm$0.002,92.24 $\pm$0.62,0.017 $\pm$0.001
4,Fe,46.26,81.09 $\pm$ 1.30,80.19 $ \pm$ 1.77,80.40 $ \pm$ 1.37,82.33 $ \pm$ 2.10,76.61 $\pm$2.15,0.090 $\pm$0.004,88.24 $\pm$1.96,0.017 $\pm$0.003
5,Co,52.62,79.27 $\pm$ 2.08,78.51 $ \pm$ 1.75,68.51 $ \pm$ 2.14,86.24 $ \pm$ 2.35,78.30 $\pm$4.14,0.070 $\pm$0.005,89.29 $\pm$0.80,0.020 $\pm$0.002
6,Ni,67.06,83.71 $\pm$ 1.79,69.75 $ \pm$ 6.60,72.40 $ \pm$ 2.86,89.88 $ \pm$ 1.02,28.82 $\pm$20.63,0.060 $\pm$0.003,85.59 $\pm$7.85,0.022 $\pm$0.011
7,Cu,67.75,81.69 $\pm$ 2.33,67.44 $ \pm$ 4.45,86.94 $ \pm$ 1.82,72.61 $ \pm$ 3.26,56.27 $\pm$8.93,0.080 $\pm$0.005,60.63 $\pm$5.93,0.050 $\pm$0.002
8,'Avgs.',54.01,82.48,77.17,77.38,83.65,64.45,0.07,85.27,0.02


CNN: Original & Kernel 10


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,51.38,86.22 $\pm$ 0.92,78.35 $ \pm$ 3.67,86.51 $ \pm$ 0.87,87.42 $ \pm$ 0.81,36.33 $\pm$9.26,0.070 $\pm$0.003,86.26 $\pm$2.53,0.015 $\pm$0.003
1,V,38.28,85.21 $\pm$ 1.17,95.93 $ \pm$ 0.62,78.89 $ \pm$ 1.38,82.47 $ \pm$ 1.70,79.74 $\pm$1.71,0.080 $\pm$0.002,94.59 $\pm$0.69,0.016 $\pm$0.002
2,Cr,59.15,84.04 $\pm$ 1.91,92.90 $ \pm$ 1.48,67.67 $ \pm$ 3.86,87.81 $ \pm$ 1.55,85.64 $\pm$2.60,0.070 $\pm$0.005,86.56 $\pm$2.37,0.023 $\pm$0.003
3,Mn,49.55,80.55 $\pm$ 0.98,60.53 $ \pm$ 4.42,81.03 $ \pm$ 0.87,81.94 $ \pm$ 1.36,78.49 $\pm$2.18,0.060 $\pm$0.001,92.16 $\pm$0.63,0.018 $\pm$0.001
4,Fe,46.26,81.05 $\pm$ 1.22,81.01 $ \pm$ 1.94,80.26 $ \pm$ 1.27,82.03 $ \pm$ 1.46,75.76 $\pm$1.46,0.090 $\pm$0.002,86.51 $\pm$3.81,0.018 $\pm$0.002
5,Co,52.62,80.38 $\pm$ 1.20,80.68 $ \pm$ 3.41,69.16 $ \pm$ 1.72,87.02 $ \pm$ 1.45,77.28 $\pm$2.69,0.070 $\pm$0.002,89.48 $\pm$2.18,0.021 $\pm$0.005
6,Ni,67.06,82.94 $\pm$ 1.18,67.02 $ \pm$ 4.47,69.78 $ \pm$ 2.63,89.63 $ \pm$ 0.84,19.16 $\pm$15.23,0.060 $\pm$0.004,85.12 $\pm$6.80,0.022 $\pm$0.009
7,Cu,67.75,81.63 $\pm$ 1.56,65.98 $ \pm$ 3.79,86.99 $ \pm$ 1.09,73.08 $ \pm$ 2.00,55.81 $\pm$8.25,0.080 $\pm$0.005,59.28 $\pm$2.48,0.052 $\pm$0.002
8,'Avgs.',54.01,82.75,77.8,77.54,83.92,63.52,0.07,85.0,0.02


CNN: Original & Kernel 20


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,51.38,86.31 $\pm$ 0.70,77.45 $ \pm$ 2.98,86.76 $ \pm$ 0.72,87.46 $ \pm$ 0.76,37.25 $\pm$9.07,0.070 $\pm$0.003,85.34 $\pm$3.59,0.015 $\pm$0.002
1,V,38.28,84.82 $\pm$ 0.90,95.44 $ \pm$ 0.78,78.31 $ \pm$ 1.29,82.40 $ \pm$ 1.85,77.50 $\pm$1.41,0.080 $\pm$0.002,93.92 $\pm$0.49,0.017 $\pm$0.001
2,Cr,59.15,82.94 $\pm$ 1.63,91.57 $ \pm$ 0.98,64.84 $ \pm$ 2.53,87.57 $ \pm$ 1.57,85.00 $\pm$2.03,0.070 $\pm$0.005,84.91 $\pm$2.52,0.024 $\pm$0.003
3,Mn,49.55,79.76 $\pm$ 1.41,56.69 $ \pm$ 2.95,80.42 $ \pm$ 1.30,81.18 $ \pm$ 1.87,77.23 $\pm$3.36,0.060 $\pm$0.003,90.91 $\pm$1.02,0.020 $\pm$0.002
4,Fe,46.26,80.52 $\pm$ 1.58,79.96 $ \pm$ 2.20,79.62 $ \pm$ 1.51,81.85 $ \pm$ 2.05,72.56 $\pm$1.93,0.100 $\pm$0.002,86.01 $\pm$1.77,0.020 $\pm$0.003
5,Co,52.62,79.80 $\pm$ 1.55,79.80 $ \pm$ 2.46,68.90 $ \pm$ 1.66,86.49 $ \pm$ 1.81,75.78 $\pm$4.16,0.070 $\pm$0.004,89.07 $\pm$1.77,0.021 $\pm$0.003
6,Ni,67.06,83.06 $\pm$ 0.81,67.92 $ \pm$ 5.22,70.41 $ \pm$ 0.95,89.63 $ \pm$ 1.03,17.95 $\pm$21.24,0.070 $\pm$0.004,82.30 $\pm$4.81,0.025 $\pm$0.005
7,Cu,67.75,82.34 $\pm$ 0.75,65.95 $ \pm$ 3.24,87.55 $ \pm$ 0.53,74.00 $ \pm$ 2.37,38.85 $\pm$11.46,0.090 $\pm$0.004,58.47 $\pm$3.65,0.053 $\pm$0.003
8,'Avgs.',54.01,82.44,76.85,77.1,83.82,60.27,0.08,83.87,0.02


Random Forest: Augmented


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,34.43,91.10 $\pm$ 0.04,97.36 $ \pm$ 0.07,86.63 $ \pm$ 0.05,89.37 $ \pm$ 0.11,21.71 $\pm$25.82,0.070 $\pm$0.000,64.13 $\pm$0.13,0.027 $\pm$0.000
1,V,34.35,88.29 $\pm$ 0.03,95.18 $ \pm$ 0.09,81.90 $ \pm$ 0.11,87.60 $ \pm$ 0.11,78.30 $\pm$2.72,0.080 $\pm$0.000,89.73 $\pm$0.04,0.025 $\pm$0.000
2,Cr,34.93,93.98 $\pm$ 0.10,98.88 $ \pm$ 0.04,90.78 $ \pm$ 0.15,92.32 $ \pm$ 0.11,77.03 $\pm$3.43,0.070 $\pm$0.000,88.07 $\pm$0.05,0.031 $\pm$0.000
3,Mn,34.54,89.17 $\pm$ 0.05,98.81 $ \pm$ 0.03,84.31 $ \pm$ 0.09,85.26 $ \pm$ 0.04,79.40 $\pm$2.73,0.060 $\pm$0.000,86.55 $\pm$0.06,0.027 $\pm$0.000
4,Fe,35.2,89.53 $\pm$ 0.05,95.13 $ \pm$ 0.06,84.61 $ \pm$ 0.09,88.99 $ \pm$ 0.08,66.35 $\pm$6.47,0.100 $\pm$0.000,79.29 $\pm$0.04,0.027 $\pm$0.000
5,Co,34.19,91.44 $\pm$ 0.13,96.79 $ \pm$ 0.11,87.09 $ \pm$ 0.21,90.48 $ \pm$ 0.14,64.92 $\pm$7.12,0.080 $\pm$0.000,82.61 $\pm$0.06,0.029 $\pm$0.000
6,Ni,34.98,93.45 $\pm$ 0.04,99.30 $ \pm$ 0.01,90.28 $ \pm$ 0.07,91.17 $ \pm$ 0.05,-21.42 $\pm$48.99,0.070 $\pm$0.000,79.26 $\pm$0.07,0.026 $\pm$0.000
7,Cu,34.78,94.52 $\pm$ 0.09,96.05 $ \pm$ 0.17,91.98 $ \pm$ 0.13,95.59 $ \pm$ 0.09,52.02 $\pm$13.11,0.090 $\pm$0.000,54.66 $\pm$0.09,0.060 $\pm$0.000
8,'Avgs.',34.67,91.44,97.19,87.2,90.1,52.29,0.08,78.04,0.03


CNN: Augmented & Kernel 5


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,34.43,97.52 $\pm$ 0.46,99.58 $ \pm$ 0.06,96.47 $ \pm$ 0.63,96.62 $ \pm$ 0.73,83.36 $\pm$1.09,0.050 $\pm$0.002,79.26 $\pm$0.74,0.025 $\pm$0.001
1,V,34.35,97.73 $\pm$ 0.15,99.44 $ \pm$ 0.09,96.78 $ \pm$ 0.20,97.02 $ \pm$ 0.31,91.13 $\pm$1.18,0.050 $\pm$0.002,91.92 $\pm$0.23,0.025 $\pm$0.000
2,Cr,34.93,98.84 $\pm$ 0.14,99.92 $ \pm$ 0.06,98.25 $ \pm$ 0.23,98.37 $ \pm$ 0.25,91.97 $\pm$0.23,0.050 $\pm$0.000,93.80 $\pm$0.33,0.027 $\pm$0.001
3,Mn,34.54,97.58 $\pm$ 0.22,99.66 $ \pm$ 0.05,96.54 $ \pm$ 0.30,96.75 $ \pm$ 0.30,90.86 $\pm$0.54,0.050 $\pm$0.000,87.99 $\pm$2.08,0.029 $\pm$0.003
4,Fe,35.2,96.72 $\pm$ 0.32,98.83 $ \pm$ 0.30,95.43 $ \pm$ 0.43,96.02 $ \pm$ 0.44,87.23 $\pm$0.24,0.070 $\pm$0.004,80.81 $\pm$3.05,0.028 $\pm$0.001
5,Co,34.19,97.56 $\pm$ 0.77,99.70 $ \pm$ 0.22,96.48 $ \pm$ 1.07,96.60 $ \pm$ 1.06,86.06 $\pm$2.72,0.060 $\pm$0.004,84.30 $\pm$1.78,0.030 $\pm$0.001
6,Ni,34.98,98.52 $\pm$ 0.16,99.78 $ \pm$ 0.03,97.80 $ \pm$ 0.23,98.06 $ \pm$ 0.26,80.41 $\pm$2.14,0.050 $\pm$0.001,82.19 $\pm$2.35,0.027 $\pm$0.001
7,Cu,34.78,99.48 $\pm$ 0.08,99.58 $ \pm$ 0.06,99.25 $ \pm$ 0.11,99.62 $ \pm$ 0.08,88.15 $\pm$2.55,0.050 $\pm$0.004,84.39 $\pm$1.68,0.039 $\pm$0.002
8,'Avgs.',34.67,97.99,99.56,97.12,97.38,87.4,0.05,85.58,0.03


CNN: Augmented & Kernel 10


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,34.43,98.03 $\pm$ 0.27,99.56 $ \pm$ 0.11,97.17 $ \pm$ 0.38,97.44 $ \pm$ 0.42,83.50 $\pm$1.50,0.050 $\pm$0.002,79.90 $\pm$1.33,0.024 $\pm$0.001
1,V,34.35,97.33 $\pm$ 0.28,99.34 $ \pm$ 0.04,96.19 $ \pm$ 0.37,96.55 $ \pm$ 0.51,92.70 $\pm$0.56,0.050 $\pm$0.003,91.46 $\pm$1.05,0.025 $\pm$0.002
2,Cr,34.93,99.01 $\pm$ 0.04,99.92 $ \pm$ 0.06,98.46 $ \pm$ 0.06,98.65 $ \pm$ 0.00,94.18 $\pm$0.54,0.040 $\pm$0.001,93.72 $\pm$0.81,0.028 $\pm$0.002
3,Mn,34.54,97.58 $\pm$ 0.38,99.74 $ \pm$ 0.04,96.56 $ \pm$ 0.53,96.64 $ \pm$ 0.57,94.40 $\pm$0.49,0.040 $\pm$0.001,90.13 $\pm$1.09,0.027 $\pm$0.002
4,Fe,35.2,96.72 $\pm$ 0.28,99.00 $ \pm$ 0.08,95.46 $ \pm$ 0.42,95.82 $ \pm$ 0.37,91.35 $\pm$0.38,0.060 $\pm$0.003,85.97 $\pm$0.70,0.027 $\pm$0.001
5,Co,34.19,97.95 $\pm$ 0.27,99.74 $ \pm$ 0.17,97.03 $ \pm$ 0.37,97.18 $ \pm$ 0.42,91.70 $\pm$0.73,0.050 $\pm$0.001,89.38 $\pm$0.39,0.027 $\pm$0.000
6,Ni,34.98,98.43 $\pm$ 0.23,99.68 $ \pm$ 0.12,97.69 $ \pm$ 0.35,98.02 $ \pm$ 0.24,81.41 $\pm$1.75,0.050 $\pm$0.002,85.14 $\pm$2.64,0.028 $\pm$0.004
7,Cu,34.78,99.40 $\pm$ 0.11,99.49 $ \pm$ 0.05,99.14 $ \pm$ 0.16,99.60 $ \pm$ 0.12,92.06 $\pm$1.08,0.040 $\pm$0.004,85.36 $\pm$0.95,0.037 $\pm$0.001
8,'Avgs.',34.67,98.06,99.56,97.21,97.49,90.16,0.05,87.63,0.03


CNN: Augmented & Kernel 20


Unnamed: 0,Material,Coord Baseline,Coord Acc.,Coord F1 (4),Coord F1 (5),Coord F1 (6),Bader $R^2$,Bader MAE,Mean NN $R^2$,Mean NN-MAE
0,Ti,34.43,97.39 $\pm$ 0.77,99.69 $ \pm$ 0.07,96.30 $ \pm$ 1.03,96.28 $ \pm$ 1.19,85.70 $\pm$1.40,0.040 $\pm$0.001,79.13 $\pm$1.23,0.025 $\pm$0.001
1,V,34.35,97.09 $\pm$ 0.49,99.37 $ \pm$ 0.11,95.84 $ \pm$ 0.68,96.15 $ \pm$ 0.73,93.91 $\pm$0.70,0.040 $\pm$0.002,92.02 $\pm$0.61,0.025 $\pm$0.001
2,Cr,34.93,99.02 $\pm$ 0.04,99.89 $ \pm$ 0.08,98.50 $ \pm$ 0.03,98.67 $ \pm$ 0.07,93.92 $\pm$0.36,0.040 $\pm$0.001,93.62 $\pm$0.56,0.028 $\pm$0.001
3,Mn,34.54,96.87 $\pm$ 0.41,99.72 $ \pm$ 0.10,95.53 $ \pm$ 0.59,95.64 $ \pm$ 0.52,94.42 $\pm$0.28,0.040 $\pm$0.001,89.35 $\pm$1.29,0.028 $\pm$0.003
4,Fe,35.2,97.12 $\pm$ 0.22,99.15 $ \pm$ 0.05,96.01 $ \pm$ 0.30,96.33 $ \pm$ 0.33,91.22 $\pm$0.56,0.060 $\pm$0.001,86.61 $\pm$0.97,0.026 $\pm$0.001
5,Co,34.19,98.32 $\pm$ 0.38,99.77 $ \pm$ 0.12,97.55 $ \pm$ 0.54,97.71 $ \pm$ 0.48,91.67 $\pm$0.65,0.050 $\pm$0.001,89.60 $\pm$0.13,0.026 $\pm$0.000
6,Ni,34.98,98.49 $\pm$ 0.11,99.81 $ \pm$ 0.13,97.75 $ \pm$ 0.16,98.00 $ \pm$ 0.25,84.68 $\pm$2.23,0.040 $\pm$0.002,84.18 $\pm$1.86,0.028 $\pm$0.002
7,Cu,34.78,99.34 $\pm$ 0.10,99.55 $ \pm$ 0.18,99.05 $ \pm$ 0.15,99.44 $ \pm$ 0.03,92.58 $\pm$0.30,0.040 $\pm$0.003,87.07 $\pm$0.14,0.036 $\pm$0.000
8,'Avgs.',34.67,97.96,99.62,97.07,97.28,91.01,0.04,87.7,0.03
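
To compare the eight scenarios at a glance, the `'Avgs.'` rows can be collected and ranked. The sketch below does this for two of the columns, `Coord Acc.` and `Bader $R^2$`, with the values copied by hand from the tables above. The names `AVERAGES` and `rank_models` are hypothetical and are not part of the notebooks.

```python
# (Coord Acc., Bader R^2) copied from the 'Avgs.' row of each table above.
AVERAGES = {
    "Random Forest: Original": (85.31, 54.08),
    "CNN: Original & Kernel 5": (82.48, 64.45),
    "CNN: Original & Kernel 10": (82.75, 63.52),
    "CNN: Original & Kernel 20": (82.44, 60.27),
    "Random Forest: Augmented": (91.44, 52.29),
    "CNN: Augmented & Kernel 5": (97.99, 87.40),
    "CNN: Augmented & Kernel 10": (98.06, 90.16),
    "CNN: Augmented & Kernel 20": (97.96, 91.01),
}


def rank_models(averages, metric_index=0):
    """Sort (name, metrics) pairs by one metric, highest first.

    metric_index: 0 for Coord Acc., 1 for Bader R^2.
    """
    return sorted(
        averages.items(),
        key=lambda item: item[1][metric_index],
        reverse=True,
    )


for name, (coord_acc, bader_r2) in rank_models(AVERAGES):
    print(f"{name:28s} coord acc = {coord_acc:6.2f}   Bader R^2 = {bader_r2:6.2f}")
```

On these averages, the CNNs trained on the augmented dataset lead on both metrics, while on the original dataset the Random Forest has the higher average coordination accuracy and the CNNs have the higher average Bader $R^2$.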
