Source code for the paper in TMLR 2023.
Perform concept discovery in the latent space of deep learning models with Singular Value Decomposition.
Supported Models
- State-of-the-art CNNs
- Inception V3
- ResNet 50
- MLP for tabular data
Datasets
This repo contains the implementation of the CDISCO toolkit proposed in the paper "Uncovering Unique Concept Vectors through Latent Space Decomposition", published in Transactions on Machine Learning Research in 2023.
Achieving broader interpretability with concept vectors requires a reverse engineering approach that focuses on automating concept identification. The central question here is: Given a representation of a complex model such as its deep latent space spanned by individual neurons, is this already an interpretable version? If not, can we find a different description of this space that aligns with semantically distinct and unique concepts? We propose to analyze the latent space of a deep neural network with Singular Value Decomposition, to discover a new representation of the space that best describes "what the model has learnt". This novel framework merges factorization, clustering of the latent space and output-sensitivity analyses. As a result, we are able to isolate directions in the latent space that respond to well-distinguishable, unique concepts.
CDISCO can be used to identify the singular vectors, to visualize concept maps, and to analyze the model's internal state.
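As a rough illustration of the decomposition idea, the sketch below applies NumPy's SVD to a synthetic activation matrix and checks that the leading singular vector recovers a planted direction. The helper name `latent_svd`, the array shapes, and the mean-centering step are assumptions made for this example; this is not the CDISCO implementation.

```python
import numpy as np

def latent_svd(activations):
    """Return singular values and right singular vectors of centered activations.

    activations: array of shape (n_samples, n_neurons).
    Hypothetical helper for illustration, not part of the cdisco package.
    """
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions in neuron space,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return s, vt

rng = np.random.default_rng(0)

# Synthetic latent space: 200 samples, 16 neurons, one dominant hidden direction.
direction = rng.normal(size=16)
direction /= np.linalg.norm(direction)
acts = rng.normal(size=(200, 1)) * 5.0 * direction + 0.1 * rng.normal(size=(200, 16))

singular_values, singular_vectors = latent_svd(acts)

# Absolute cosine between the top singular vector and the planted direction.
alignment = abs(singular_vectors[0] @ direction)
```

Here the top singular vector is a direction in neuron space that generally does not coincide with any single neuron, which is the motivation for looking beyond the canonical neuron basis.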
CDISCO - main tool
import cdisco.cdisco
- run_cdisco(model, input_data, save_fold='') (not yet implemented) -> runs get_model_state() and then discovery()
- discovery(conv_maps, gradients, prediction, classes) -> concept_candidates, eigenvectors
- get_model_state(model, input_data, save_fold='') -> performs inference on input_data and stores the output in save_fold
CDISCO - vis tool
The visualization toolbox allows us to visualize and interpret the results of CDISCO.
import cdisco.vis
- cdisco.vis.cdisco_concept_vis(image_path, concept_vector, conv_maps) -> concept_heatmap
- cdisco.vis.cdisco_vis_extremes_extensive(concepts_list, concept_candidates, eigenvectors, conv_maps, input_paths, predictions, save_fold='') -> visualizes the top 5 images that have the highest projection on the concept direction and saves them in save_fold
- cdisco.vis.conceptbard(concept, save_fold='') -> saves in save_fold a visualization of the concept segmentations, arranged as a board that is representative of the concept
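The "highest projection on the concept direction" ranking can be sketched with plain NumPy. This is a minimal example that assumes each image has already been reduced to one feature vector; the helper `top_k_by_projection` and the toy data are hypothetical and do not reflect how `cdisco.vis` computes its rankings internally.

```python
import numpy as np

def top_k_by_projection(features, concept_vector, k=5):
    """Return indices and scores of the k samples with the largest projection.

    features: array of shape (n_samples, n_dims), one row per image (assumed).
    concept_vector: array of shape (n_dims,), normalized here to unit length.
    """
    unit = concept_vector / np.linalg.norm(concept_vector)
    projections = features @ unit
    # Sort descending and keep the first k indices.
    order = np.argsort(projections)[::-1][:k]
    return order, projections[order]

# Toy feature vectors for six hypothetical images.
features = np.array([
    [1.0, 0.0],
    [0.0, 2.0],
    [3.0, 1.0],
    [-1.0, 0.5],
    [2.0, 2.0],
    [0.5, 0.5],
])
concept = np.array([2.0, 0.0])

idx, scores = top_k_by_projection(features, concept, k=3)
```

In practice the returned indices would be used to look up the corresponding entries of `input_paths`.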
CDISCO - analyze tool
import cdisco.analyze
- cdisco.analyze.cdisco_alignment(concepts, concept_candidates) -> prints the list of classes that share the same discovered concept direction
- cdisco.analyze.cdisco_pop_concepts(concept_candidates, classes, eigenvectors, save_fold='') -> prints the top 3 most popular directions among the classes
- cdisco.analyze.cdisco_angle_dissection(eigenvectors, candidates, save_fold='') -> stores in save_fold the results of the alignment evaluation of each concept with the canonical basis.
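One plausible reading of "alignment with the canonical basis" is the angle between a concept direction and its nearest single-neuron axis. The sketch below computes that angle with NumPy. How `cdisco_angle_dissection` actually measures alignment is not specified here, so the helper `angle_to_canonical_basis` is a hypothetical illustration only.

```python
import numpy as np

def angle_to_canonical_basis(vector):
    """Return the closest canonical axis and the angle to it in degrees.

    The sign of the vector is ignored, since a direction and its negation
    span the same line.
    """
    unit = vector / np.linalg.norm(vector)
    # The cosine with axis e_i is the i-th component of the unit vector.
    cosines = np.abs(unit)
    axis = int(np.argmax(cosines))
    angle = float(np.degrees(np.arccos(np.clip(cosines[axis], 0.0, 1.0))))
    return axis, angle

# A direction shared equally by two neurons sits 45 degrees from either axis.
axis, angle = angle_to_canonical_basis(np.array([1.0, 1.0, 0.0]))

# A direction along a single neuron (up to sign) has zero angle.
axis_single, angle_single = angle_to_canonical_basis(np.array([0.0, 0.0, -3.0]))
```

A small angle suggests the concept is already captured by one neuron, while a large angle suggests it is spread across several.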
Follow the steps below to install the toolkit from the GitHub repo. Installation via pip will be supported soon.
- Clone this repo
git clone https://github.com/maragraziani/CDISCO.git
- Install required packages
pip install -r requirements.txt
Distributed under the MIT License. See LICENSE for more information.
Mara Graziani - @mormontre - mara.graziani@hevs.ch