GitHub - maragraziani/concept_discovery_svd: Automatic identification of regions in the latent space of a model that correspond to unique concepts, namely to concepts with a semantically distinct meaning.

Uncovering Unique Concept Vectors through Latent Space Decomposition

Source code for the paper published in TMLR 2023. The toolkit performs concept discovery in the latent space of deep learning models using Singular Value Decomposition.

Supported Models

State-of-the-art CNNs
- Inception V3
- ResNet 50
MLP for tabular data

Datasets

About CDISCO

This repo contains the implementation of the CDISCO toolkit proposed in the paper "Uncovering Unique Concept Vectors through Latent Space Decomposition", published in Transactions on Machine Learning Research in 2023.

Achieving broader interpretability with concept vectors requires a reverse engineering approach that focuses on automating concept identification. The central question here is: Given a representation of a complex model such as its deep latent space spanned by individual neurons, is this already an interpretable version? If not, can we find a different description of this space that aligns with semantically distinct and unique concepts? We propose to analyze the latent space of a deep neural network with Singular Value Decomposition, to discover a new representation of the space that best describes "what the model has learnt". This novel framework merges factorization, clustering of the latent space and output-sensitivity analyses. As a result, we are able to isolate directions in the latent space that respond to well-distinguishable, unique concepts.
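To make the idea of "a different description of this space" concrete, here is a minimal sketch of applying Singular Value Decomposition to a matrix of latent activations. It is not the CDISCO implementation: the function name `latent_svd`, the centering step and the synthetic data are assumptions made only for illustration.

```python
import numpy as np


def latent_svd(activations, n_directions=3):
    """Return the leading singular values and right-singular vectors
    of a (n_samples, n_neurons) activation matrix."""
    # Assumption: center the activations so that directions describe
    # variation around the mean response.
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Thin SVD: centered = U @ diag(S) @ Vt. Rows of Vt are orthonormal
    # directions in neuron space, sorted by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return singular_values[:n_directions], vt[:n_directions]


# Synthetic latent space: 200 samples, 16 neurons, one dominant hidden direction.
rng = np.random.default_rng(0)
hidden = np.zeros(16)
hidden[[2, 7]] = [0.6, 0.8]  # unit-norm direction mixing neurons 2 and 7
strength = rng.normal(scale=10.0, size=(200, 1))
activations = strength * hidden + rng.normal(scale=0.1, size=(200, 16))

values, directions = latent_svd(activations)
# Absolute cosine between the first singular vector and the hidden direction.
alignment = abs(directions[0] @ hidden)
print(f"alignment with hidden direction: {alignment:.3f}")
```

In this toy setting the dominant direction mixes two neurons, so no single neuron describes it; the first singular vector recovers it as one axis of the new basis.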

Functionalities

CDISCO can be used to identify the singular vectors, to visualize concept maps and to analyze the model's internal state.

CDISCO - main tool

   import cdisco.cdisco

run_cdisco(model, input_data, save_fold='') -> (to be implemented) runs get_model_state() and then discovery()
discovery(conv_maps, gradients, prediction, classes) -> concept_candidates, eigenvectors
get_model_state(model, input_data, save_fold='') -> performs inference on input_data and stores the output in save_fold
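The sketch below illustrates how a discovery step with the same inputs and outputs could combine factorization with output sensitivity. It is a hypothetical stand-in, not the code in `cdisco.cdisco`: the name `toy_discovery`, the array shapes, the global average pooling and the ranking rule are all assumptions.

```python
import numpy as np


def toy_discovery(conv_maps, gradients, predictions, classes, top_k=2):
    """Hypothetical stand-in for a discovery step (NOT the cdisco code).

    conv_maps, gradients: arrays of shape (n_images, H, W, C)
    predictions: array of shape (n_images,) holding predicted class ids
    """
    # Assumption: average over spatial positions so each image becomes
    # one point in the C-dimensional channel space.
    pooled_maps = conv_maps.mean(axis=(1, 2))
    pooled_grads = gradients.mean(axis=(1, 2))
    # Factorization: rows of `eigenvectors` are right-singular vectors.
    _, _, eigenvectors = np.linalg.svd(pooled_maps, full_matrices=False)
    concept_candidates = {}
    for cls in classes:
        mask = predictions == cls
        # Output sensitivity: mean projection of the gradients of this
        # class onto each singular vector.
        sensitivity = (pooled_grads[mask] @ eigenvectors.T).mean(axis=0)
        ranked = np.argsort(-np.abs(sensitivity))[:top_k]
        concept_candidates[cls] = ranked.tolist()
    return concept_candidates, eigenvectors


# Toy data: 100 images, 4x4 maps, 8 channels with very different variances.
rng = np.random.default_rng(0)
scales = np.array([10.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
conv_maps = rng.normal(size=(100, 4, 4, 8)) * scales
predictions = np.array([0] * 50 + [1] * 50)
gradients = np.zeros((100, 4, 4, 8))
gradients[:50, ..., 1] = 1.0  # class 0 is sensitive to channel 1
gradients[50:, ..., 0] = 1.0  # class 1 is sensitive to channel 0

concept_candidates, eigenvectors = toy_discovery(
    conv_maps, gradients, predictions, classes=[0, 1]
)
print(concept_candidates)
```

Here `concept_candidates` maps each class to the indices of the singular vectors it is most sensitive to; the actual return format of discovery() may differ.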

CDISCO - vis tool

The visualization toolbox allows us to visualize and interpret the results of CDISCO.

   import cdisco.vis

cdisco.vis.cdisco_concept_vis(image_path, concept_vector, conv_maps) -> concept_heatmap
cdisco.vis.cdisco_vis_extremes_extensive(concepts_list, concept_candidates, eigenvectors, conv_maps, input_paths, predictions, save_fold='') -> visualizes the 5 images with the highest projection on the concept direction and saves them in save_fold
cdisco.vis.conceptbard(concept, save_fold='') -> saves in save_fold a visualization of the concept segmentations, forming a board that is representative of the concept
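As a rough illustration of what a concept heatmap and a "top 5 by projection" ranking compute, the following sketch projects convolutional maps onto a concept vector. The helper names `concept_heatmap` and `top_k_images`, the (H, W, C) layout and the spatial averaging are assumptions; the functions in `cdisco.vis` may work differently, for example by overlaying the heatmap on the input image.

```python
import numpy as np


def concept_heatmap(conv_map, concept_vector):
    """Project every spatial location of an (H, W, C) map onto the
    unit-normalized concept vector, giving an (H, W) heatmap."""
    unit = concept_vector / np.linalg.norm(concept_vector)
    return conv_map @ unit


def top_k_images(conv_maps, concept_vector, input_paths, k=5):
    """Rank images by the projection of their spatially averaged
    activations on the concept direction (highest first)."""
    unit = concept_vector / np.linalg.norm(concept_vector)
    scores = conv_maps.mean(axis=(1, 2)) @ unit
    order = np.argsort(-scores)[:k]
    return [(input_paths[i], float(scores[i])) for i in order]


# Toy data: 8 images with 2x2 maps and 3 channels; image i has the value i
# in channel 0 at every location and zeros elsewhere.
conv_maps = np.zeros((8, 2, 2, 3))
conv_maps[..., 0] = np.arange(8)[:, None, None]
input_paths = [f"img_{i}.png" for i in range(8)]  # hypothetical file names
concept_vector = np.array([2.0, 0.0, 0.0])

heatmap = concept_heatmap(conv_maps[3], concept_vector)
top = top_k_images(conv_maps, concept_vector, input_paths)
for path, score in top:
    print(path, score)
```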

CDISCO - analyze tool

   import cdisco.analyze

cdisco.analyze.cdisco_alignment(concepts, concept_candidates) -> prints the list of classes that share the same discovered concept direction
cdisco.analyze.cdisco_pop_concepts(concept_candidates, classes, eigenvectors, save_fold='') -> prints the top 3 most popular directions among the classes
cdisco.analyze.cdisco_angle_dissection(eigenvectors, candidates, save_fold='') -> stores in save_fold the results of the alignment evaluation of each concept with the canonical basis
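The three analyses above can be illustrated with a small sketch. It assumes that concept candidates are stored as a dictionary mapping each class to a list of direction indices, which may not match the package's actual data structure; the helper names are hypothetical and this is not the code in `cdisco.analyze`.

```python
from collections import Counter, defaultdict

import numpy as np


def angles_to_canonical_basis(vector):
    """Angle in degrees between a vector and each canonical axis e_i,
    ignoring the sign of the vector."""
    unit = vector / np.linalg.norm(vector)
    # cos(angle to e_i) is the i-th component of the unit vector.
    return np.degrees(np.arccos(np.clip(np.abs(unit), 0.0, 1.0)))


def popular_directions(concept_candidates, top_n=3):
    """Count how many classes select each direction index."""
    counts = Counter(d for dirs in concept_candidates.values() for d in dirs)
    return counts.most_common(top_n)


def classes_sharing_direction(concept_candidates):
    """Group classes by direction, keeping directions used by 2+ classes."""
    shared = defaultdict(list)
    for cls, dirs in concept_candidates.items():
        for d in dirs:
            shared[d].append(cls)
    return {d: cls_list for d, cls_list in shared.items() if len(cls_list) > 1}


# Hypothetical candidates: class name -> indices of discovered directions.
candidates = {"cat": [0, 2], "dog": [0, 2], "car": [0, 1], "bus": [0, 2]}

angles = angles_to_canonical_basis(np.array([1.0, 1.0, 0.0]))
popular = popular_directions(candidates)
shared = classes_sharing_direction(candidates)
print(angles, popular, shared)
```

A direction with a small angle to some canonical axis is close to a single neuron, while large angles to all axes indicate a concept spread over many neurons.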

Built With

Getting Started

Follow the steps below to install the toolkit by cloning the GitHub repository. Installation via pip is planned but not yet available.

Installation

Clone this repo

   git clone https://github.com/maragraziani/CDISCO.git

Install required packages
```
pip install -r requirements.txt
```

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Mara Graziani - @mormontre - mara.graziani@hevs.ch