Machine learning algorithms can create more accurate models than linear models, but any increase in accuracy over more traditional, better-understood, and more easily explainable techniques may be of little use to those who must explain their models to regulators or customers. For decades, the models created by machine learning algorithms were generally taken to be black boxes. However, a recent flurry of research has introduced credible techniques for interpreting complex, machine-learned models. The materials presented here illustrate applications or adaptations of these techniques for practicing data scientists.
Want to contribute your own examples? Just make a pull request.
A Dockerfile is provided that builds an image with all the dependencies needed to run the examples here.
(Refer to GetData.md to obtain the datasets needed for the notebooks.)
### General
- Machine Learning Interpretability with H2O Driverless AI Booklet by Patrick Hall, Navdeep Gill, Megan Kurka & Wen Phan
- Towards A Rigorous Science of Interpretable Machine Learning by Finale Doshi-Velez and Been Kim
- Ideas for Machine Learning Interpretability by Patrick Hall, Wen Phan, & SriSatish Ambati
- Fairness, Accountability, and Transparency in Machine Learning (FAT/ML)
### Techniques
- Partial Dependence: Elements of Statistical Learning, Section 10.13 (sketched in code below)
- LIME: “Why Should I Trust You?” Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin (sketched in code below)
- LOCO: Distribution-Free Predictive Inference for Regression by Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J. Tibshirani, and Larry Wasserman (sketched in code below)
- ICE: Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation (sketched in code below)
- Surrogate Models (sketched in code below)
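
Partial dependence and ICE are close cousins, so one sketch covers both. This is a minimal hand-rolled version, assuming a scikit-learn regressor and a synthetic dataset as stand-ins for your own model and data: ICE traces one prediction curve per observation as a single feature is varied, and partial dependence is the pointwise average of those curves.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Stand-in data and model; swap in your own.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

feature = 0  # index of the feature to inspect
grid = np.linspace(X[:, feature].min(), X[:, feature].max(), 20)

# ICE: one prediction curve per observation, varying only the chosen feature.
ice_curves = np.empty((X.shape[0], grid.size))
for j, value in enumerate(grid):
    X_mod = X.copy()
    X_mod[:, feature] = value
    ice_curves[:, j] = model.predict(X_mod)

# Partial dependence is the average of the ICE curves at each grid point.
partial_dependence = ice_curves.mean(axis=0)
```

Plotting the individual ICE curves alongside their average is a quick way to spot interactions that the averaged partial dependence curve alone would hide.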
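
For LIME, the authors' `lime` package is the reference implementation; the sketch below only paraphrases the core idea, and the perturbation scale and kernel width here are loose assumptions rather than the paper's choices: sample around one row, weight the samples by proximity, and fit a weighted linear surrogate whose coefficients act as the local explanation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

# Stand-in data and complex model.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

row = X[0]  # the single prediction to explain
rng = np.random.default_rng(0)

# Perturb the row of interest and score the perturbations with the model.
perturbed = row + rng.normal(scale=X.std(axis=0) * 0.5,
                             size=(1000, X.shape[1]))
preds = model.predict(perturbed)

# Exponential kernel weights: nearby perturbations count more.
distances = np.linalg.norm(perturbed - row, axis=1)
weights = np.exp(-(distances ** 2) / (2 * distances.std() ** 2))

# Weighted linear surrogate; its coefficients are the local explanation.
surrogate = Ridge(alpha=1.0).fit(perturbed, preds, sample_weight=weights)
print(surrogate.coef_)
```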
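
LOCO in Lei et al. comes with distribution-free inference machinery; this sketch keeps only the mechanical core, with the dataset, model, and error metric as illustrative assumptions: drop one covariate at a time, refit, and measure how much held-out error grows.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Full model and its held-out absolute residuals.
full = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
full_err = np.abs(y_te - full.predict(X_te))

for j in range(X.shape[1]):
    # Refit with covariate j left out.
    reduced = GradientBoostingRegressor(random_state=0).fit(
        np.delete(X_tr, j, axis=1), y_tr)
    reduced_err = np.abs(y_te - reduced.predict(np.delete(X_te, j, axis=1)))
    # Median increase in absolute error attributable to feature j.
    print(f"feature {j}: LOCO importance = "
          f"{np.median(reduced_err - full_err):.3f}")
```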
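
Finally, a global surrogate model in its simplest form, again with stand-in models: fit an interpretable model (here a shallow decision tree, an assumption rather than a recommendation) to a complex model's predictions and inspect the surrogate instead of the original.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = make_regression(n_samples=500, n_features=5, random_state=0)
complex_model = RandomForestRegressor(random_state=0).fit(X, y)

# Fit the surrogate to the complex model's outputs, not the original targets.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, complex_model.predict(X))

# Human-readable approximation of the complex model's decision logic.
print(export_text(surrogate))
```

Before trusting such a tree, it is worth checking how faithfully it reproduces the complex model's predictions (for example, its R² against those predictions); a low-fidelity surrogate explains little.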
### Notes
- Strata Data Conference slides about MLI by Patrick Hall, Wen Phan, SriSatish Ambati, & H2O.ai team