Interpretable-AI readings and resources

-- In Progress --

I will be updating this regularly as I read more papers and related material over the course of my research.

Readings and Articles

Research Papers

  • Explaining Neural Networks by Decoding Layer Activations

  • InterpNET: Neural Introspection for Interpretable Deep Learning

    • And its publicly available code
  • What do Deep Networks Like to See?

  • Towards A Rigorous Science of Interpretable Machine Learning

    • (Finale Doshi-Velez et al.) Discusses the field of interpretability research and how it can be made more rigorous and well-defined. The authors first highlight the problem of defining interpretability in the first place; they don't offer a resolution, but suggest we can think of interpretability in terms of what it is used for. They claim that interpretability is used for confirming other important desiderata of ML systems, which stem from an incompleteness in the problem formalization. For example, if we want a system to be unbiased but cannot formally specify this in the reward function, or the reward we're optimising for is only a proxy for the true reward, then we can use interpretability to inspect the model and check whether it is reasoning the way we want it to.

    • The authors next discuss how interpretability methods can be evaluated, providing a taxonomy of three evaluation approaches:

      • Application-grounded: the method is evaluated by real humans in the context it will actually be used in (e.g. doctors receiving explanations for AI diagnoses).

      • Human-grounded: simpler human-subject experiments, possibly with participants who are not domain experts, using possibly simpler tasks than the method's intended purpose.

      • Functionally-grounded: no humans are involved; instead, some formal notion of interpretability is measured to evaluate the method's quality.

      Each of these evaluation approaches is appropriate in different circumstances, depending on the method and the context in which it will be used.

    • Finally, the authors propose a data-driven approach to understanding the factors that matter for interpretability: create a dataset of applications of machine learning models to tasks, then analyse it to identify important factors. They list some possible task- and method-related factors, and conclude with recommendations for researchers working on interpretability.

  • Visualizing and Understanding Convolutional Networks - also covered in the Coursera CNN course, Week 4

Books

Projects


Cool Tools
