A game theoretic approach to explain the output of any machine learning model.
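The game-theoretic idea behind this approach is the Shapley value: a feature's attribution is its marginal contribution to the model's output, averaged over all subsets of the other features. A minimal pure-Python sketch of the exact computation (the toy coalition value function and feature names are invented for illustration; real libraries approximate this sum for large feature sets):

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, feature_names):
    """Exact Shapley values: for each feature, average its marginal
    contribution value_fn(S | {f}) - value_fn(S) over all subsets S,
    with the standard combinatorial weights."""
    n = len(feature_names)
    attributions = {}
    for f in feature_names:
        others = [g for g in feature_names if g != f]
        phi = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += weight * (value_fn(set(subset) | {f}) - value_fn(set(subset)))
        attributions[f] = phi
    return attributions

# Toy "model" output as a function of which features are active
# (hypothetical, for illustration only).
def toy_model(active):
    score = 0.0
    if "age" in active:
        score += 2.0
    if "income" in active:
        score += 3.0
    if "age" in active and "income" in active:
        score += 1.0  # interaction term; Shapley splits it evenly
    return score

print(shapley_values(toy_model, ["age", "income"]))
# age -> 2.5, income -> 3.5 (the 1.0 interaction is split between them)
```

Note the efficiency property: the attributions sum to `toy_model({age, income}) - toy_model({})` = 6.0.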
ir_explain: a Python Library of Explainable IR Methods
A curated list of awesome responsible machine learning resources.
For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research. Open-sourced and constantly updated.
A methodology designed to measure the contribution of the features to the predictive performance of any econometric or machine learning model.
[ICML'24] Official PyTorch Implementation of TimeX++
TrustyAI Explainability Toolkit
ReFT: Representation Finetuning for Language Models
graphpatch is a library for activation patching on PyTorch neural network models.
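Activation patching means running a model on one input while swapping in an intermediate activation recorded from another run, to localize which internal component causes a behavior. A minimal sketch of the idea on a toy two-layer "model" (the layer functions and values are invented for illustration, not graphpatch's API):

```python
def run_with_patch(layers, x, patch_layer=None, patch_value=None):
    """Run a stack of layer functions in order, optionally replacing
    ('patching') the activation after one layer with a stored value."""
    act = x
    for i, layer in enumerate(layers):
        act = layer(act)
        if i == patch_layer:
            act = patch_value  # swap in the activation from another run
    return act

# Toy two-layer "model".
layers = [lambda v: v + 1, lambda v: v * 2]

clean = run_with_patch(layers, 3)  # (3 + 1) * 2 = 8
# Record the clean run's activation after layer 0 (here, 4), then patch
# it into a run on a very different ("corrupted") input.
patched = run_with_patch(layers, 100, patch_layer=0, patch_value=4)
print(clean, patched)  # 8 8 -> the patched activation restores the clean output
```

Because the patched run reproduces the clean output regardless of the corrupted input, the causal information is localized to the patched activation.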
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
Creating a PyTorch LSTM and Transformer to classify movies by genre and visualizing the LSTM's reasoning process
An attribution library for LLMs
The nnsight package enables interpreting and manipulating the internals of deep learning models.
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
Fit interpretable models. Explain blackbox machine learning.
Explain a black-box module in natural language.
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)
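All of these CAM variants share one core recipe, which plain Grad-CAM states most simply: weight each activation map of a conv layer by the spatial mean of its gradient, sum over channels, and keep only the positive evidence. A minimal NumPy sketch of that formula (the synthetic tensors are invented for illustration; a real library hooks the activations and gradients out of a PyTorch model):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from a conv layer's activations and the
    gradients of the target score w.r.t. those activations.
    Both inputs have shape (channels, H, W)."""
    weights = gradients.mean(axis=(1, 2))             # alpha_k: GAP over each gradient map
    cam = np.tensordot(weights, activations, axes=1)  # sum_k alpha_k * A_k -> (H, W)
    cam = np.maximum(cam, 0.0)                        # ReLU: keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize to [0, 1]
    return cam

# Tiny synthetic example: 2 channels on a 4x4 feature map.
rng = np.random.default_rng(0)
heatmap = grad_cam(rng.random((2, 4, 4)), rng.random((2, 4, 4)))
print(heatmap.shape)  # (4, 4)
```

The listed variants mostly differ in how the channel weights are computed (e.g. higher-order gradient terms in Grad-CAM++, or forward-pass scores instead of gradients in Score-CAM).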
Explaining black boxes with a SMILE: Statistical Model-agnostic Interpretability with Local Explanations
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Knowledge Circuits in Pretrained Transformers