Skip to content

usnistgov/hea-feature-importance-study

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Aggressively optimizing validation statistics can degrade interpretability of data-driven materials models

J. Chem. Phys. 155, 054105 (2021) 10.1063/5.0050885[https://dx.doi.org/10.1063/5.0050885]

Katherine Lei, Howie Joress, Nils Persson, Jason R. Hattrick-Simpers, and Brian DeCost

The principal dataset is https://github.com/CitrineInformatics/MPEA_dataset, which is included here.

We also analyzed the metallic glass data reported in 10.1038/npjcompumats.2016.28, which is included here, and is also available via matminer as glass_ternary_landolt.

The cross-validation experiments can be reproduced as follows:

ipython src/models/rf_pca_feature_importances.py
ipython src/models/rf_feature_importances.py

ipython src/visualization/pca_random_features.py
ipython src/visualization/ranked_feature_importances.py

Releases

No releases published

Packages

No packages published

Languages