This repository collects all relevant resources about interpretability in LLMs
MICCAI 2022 (Oral): Interpretable Graph Neural Networks for Connectome-Based Brain Disorder Analysis
Discover and Cure: Concept-aware Mitigation of Spurious Correlation (ICML 2023)
[KDD'22] Source codes of "Graph Rationalization with Environment-based Augmentations"
Official code for the CVPR 2022 (oral) paper "OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks."
[ICCV 2023] Learning Support and Trivial Prototypes for Interpretable Image Classification
Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
Explainable AI: From Simple Rules to Complex Generative Models
TraceFL is a novel mechanism for Federated Learning that achieves interpretability by tracking neuron provenance. It identifies clients responsible for global model predictions, achieving 99% accuracy across diverse datasets (e.g., medical imaging) and neural networks (e.g., GPT).
Build a neural net from scratch, without Keras or PyTorch, using only NumPy for the numerical computation and pandas for data loading.
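A minimal NumPy-only sketch of that idea (manual forward and backward passes on a toy XOR task; the variable names and hyperparameters are illustrative, not taken from the repository):

```python
import numpy as np

# Toy data: learn XOR with a tiny two-layer MLP.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # input -> hidden
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass: manual chain rule for a binary cross-entropy loss
    dp = (p - y) / len(X)
    dW2, db2 = h.T @ dp, dp.sum(axis=0)
    dh = dp @ W2.T * (1 - h ** 2)               # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(axis=0)

    # Plain gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(np.round(p, 2))  # predictions approach [0, 1, 1, 0]
```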
Visualization methods for interpreting CNNs and Vision Transformers trained in a supervised or self-supervised way. The methods are based on CAM or on the Transformer attention mechanism, and the results are evaluated both qualitatively and quantitatively.
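A Grad-CAM-style sketch of how the CAM family of methods works; the model, target layer, and input below are assumptions for illustration, not this repository's code:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1").eval()
activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations["feat"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    gradients["feat"] = grad_out[0].detach()

# Hook the last convolutional block so we can read its feature maps and gradients.
model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)            # stand-in for a preprocessed image
scores = model(x)
scores[0, scores.argmax()].backward()      # gradient of the top-class score

# Weight each feature map by its average gradient, combine, and keep positive evidence.
weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["feat"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalize to [0, 1] heatmap
```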
Explainable Speaker Recognition
Interpretability: Methods for Identification and Retrieval of Concepts in CNN Networks
Implementation of the gradient-based t-SNE attribution method described in our GLBIO oral presentation: "Towards Computing Attributions for Dimensionality Reduction Techniques"
My PhD thesis at NUS, made public so that future graduate students may benefit.
Interpretable Anomaly Severity Detection on UAV Flight Log Messages
Work on combining a logit model with an information granulation method for better interpretability
Semi-supervised Concept Bottleneck Models (SSCBM)
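A minimal concept-bottleneck sketch to illustrate the general idea (not the SSCBM codebase; the semi-supervised aspect is reduced here to masking the concept loss for examples that lack concept annotations):

```python
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, in_dim=64, n_concepts=10, n_classes=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, n_concepts))  # x -> concept logits
        self.head = nn.Linear(n_concepts, n_classes)              # concepts -> label logits

    def forward(self, x):
        c_logits = self.encoder(x)
        y_logits = self.head(torch.sigmoid(c_logits))  # label prediction goes through the bottleneck
        return c_logits, y_logits

model = ConceptBottleneck()
x = torch.randn(8, 64)                          # toy batch
c_true = torch.randint(0, 2, (8, 10)).float()   # binary concept annotations
c_mask = (torch.rand(8) < 0.5).float()          # only part of the batch has concept labels
y_true = torch.randint(0, 5, (8,))

c_logits, y_logits = model(x)
concept_loss = (nn.functional.binary_cross_entropy_with_logits(
    c_logits, c_true, reduction="none").mean(dim=1) * c_mask).mean()
label_loss = nn.functional.cross_entropy(y_logits, y_true)
loss = label_loss + 1.0 * concept_loss          # joint CBM objective
loss.backward()
```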