Title: Comparative Analysis of Machine Learning Approaches on the Prediction of the Electronic Properties of Perovskites: A Case Study of ABX3 and A2BB’X6 (Materials Today Communications, Vol. 27, June 2021, 102462) https://doi.org/10.1016/j.mtcomm.2021.102462
Machine learning (ML) methods have recently been widely employed to tackle problems in quantum mechanics and materials science. Their main objective is to develop surrogate models that bypass the costly solution of the Schrödinger equation and its approximations, such as density functional theory (DFT). However, most approaches so far have focused on perovskite oxides and on finite band gaps, which are thought to be most relevant to solar cells. This restricts their generalizability and the range of perovskites they can capture, potentially limiting the discovery of novel perovskites. The current study therefore includes finite and infinite band gaps and investigates a more diverse set of perovskites, including oxides and halides occupying the X-anionic sites. Twelve ML techniques are described, implemented, and compared on the prediction of the formation energy and the energy band gap of two distinct crystal configurations: ABX3 and A2BB'X6. All samples are initially described using the same well-developed set of features. In addition, the effect of the energy above hull feature within the initial feature set is systematically investigated. As a result, the predictive performance for the formation energy is greatly improved. The Support Vector Regression (SVR) model is found to best predict the formation energy, with a test-set MAE of 0.055 eV/atom, an RMSE of 0.096 eV/atom, and an R² of 99%. Higher errors are observed in the prediction of the energy band gap, for which SVR reaches an MAE of 0.462 eV, an RMSE of 0.662 eV, and an R² of 85.18% on the test set. The study of the sample size effect shows that the Gradient Boosting Regression (GBR) and Random Forest Regression (RFR) models are better suited than the SVR model for band gap prediction. Finally, feature importance is used to inspect the relative importance of all input features considered in the study.
A strong relationship is found between the standard deviation of the electronegativity and the formation energy. All data and code implemented in this study are openly available at: github.com/chenebuah/perovskite-ML.
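For readers unfamiliar with the three error metrics quoted above, the following is a minimal sketch of how MAE, RMSE, and R² are conventionally computed from reference and predicted values. It is not taken from the study's repository: the function names and the sample values are hypothetical, chosen only for illustration, and R² is returned as a fraction (0.99 corresponds to the 99% quoted above).

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |reference - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical formation energies in eV/atom (illustrative, not study data).
y_true = [-3.0, -2.0, -1.0, 0.0]
y_pred = [-2.9, -2.1, -1.2, 0.0]

print(f"MAE  = {mae(y_true, y_pred):.3f} eV/atom")
print(f"RMSE = {rmse(y_true, y_pred):.3f} eV/atom")
print(f"R2   = {100 * r2(y_true, y_pred):.2f} %")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE and grows faster when a few predictions are far off, which is consistent with the RMSE values above exceeding the corresponding MAE values.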