maragraziani/concept_discovery_svd

Uncovering Unique Concept Vectors through Latent Space Decomposition

Source code for the paper in TMLR 2023. Perform concept discovery in the latent space of deep learning models with Singular Value Decomposition.
View Demo · Report Bug

Supported Models
  1. State-of-the-art CNNs
    • Inception V3
    • ResNet 50
  2. MLP for tabular data
Datasets
  1. ImageNet
  2. XAI Benchmark

About CDISCO

This repo contains the implementation of the CDISCO toolkit proposed in the paper "Uncovering Unique Concept Vectors through Latent Space Decomposition", published in Transactions on Machine Learning Research (TMLR) in 2023.

Achieving broader interpretability with concept vectors requires a reverse-engineering approach that automates concept identification. The central question is: given a representation of a complex model, such as its deep latent space spanned by individual neurons, is this representation already interpretable? If not, can we find a different description of this space that aligns with semantically distinct and unique concepts? We propose to analyze the latent space of a deep neural network with Singular Value Decomposition to discover a new representation of the space that best describes "what the model has learnt". This novel framework merges factorization, clustering of the latent space, and output-sensitivity analyses. As a result, we are able to isolate directions in the latent space that respond to well-distinguishable, unique concepts.
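As a rough illustration of the decomposition step described above, the sketch below applies SVD to a matrix of latent activations with NumPy. The synthetic activations, the helper name `singular_directions`, and the mean-centering step are assumptions made for illustration. They are not taken from the CDISCO code, which additionally combines clustering and output-sensitivity analysis.

```python
import numpy as np


def singular_directions(activations, k=3):
    """Return the top-k singular values and right singular vectors
    of a (samples x neurons) activation matrix.

    Hypothetical helper for illustration only; not part of the CDISCO API.
    """
    # Center each neuron so the directions capture variation, not the mean.
    centered = activations - activations.mean(axis=0, keepdims=True)
    # Thin SVD: centered = U @ diag(S) @ Vt; rows of Vt are orthonormal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return singular_values[:k], vt[:k]


# Synthetic stand-in for pooled latent activations: 200 samples, 16 neurons.
rng = np.random.default_rng(0)
hidden_direction = np.ones(16) / 4.0  # unit-norm direction mixing all neurons
strength = rng.normal(size=(200, 1)) * 5.0
activations = strength * hidden_direction + 0.1 * rng.normal(size=(200, 16))

values, directions = singular_directions(activations, k=3)
# Cosine between the first singular vector and the planted direction
# (the sign of a singular vector is arbitrary, hence the absolute value).
alignment = abs(directions[0] @ hidden_direction)
```

In this toy setting the planted direction is spread over all 16 neurons, so no single neuron describes it, yet the first singular vector recovers it. That is the intuition behind looking for a description of the latent space other than the neuron basis.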

Functionalities

CDISCO can be used to identify the singular vectors, to visualize concept maps, and to analyze the model's internal state.

CDISCO - main tool

   import cdisco.cdisco
  • run_cdisco(model, input_data, save_fold='') (not yet implemented) -> runs get_model_state() and then discovery()
  • discovery(conv_maps, gradients, prediction, classes) -> concept_candidates, eigenvectors
  • get_model_state(model, input_data, save_fold='') -> performs inference on input_data and stores the output in save_fold

CDISCO - vis tool

The visualization toolbox allows us to visualize and interpret the results of CDISCO.

   import cdisco.vis
  • cdisco.vis.cdisco_concept_vis(image_path, concept_vector, conv_maps) -> concept_heatmap
  • cdisco.vis.cdisco_vis_extremes_extensive(concepts_list, concept_candidates, eigenvectors, conv_maps, input_paths, predictions, save_fold='') -> visualizes the top 5 images with the highest projection on the concept direction and saves them in save_fold
  • cdisco.vis.conceptbard(concept, save_fold='') -> saves in save_fold a visualization of the concept segmentations, arranged as a board that is representative of the concept
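To make "projection on the concept direction" concrete, here is a minimal NumPy sketch of a concept heatmap and a top-k image ranking. It assumes that conv_maps has shape (n_images, H, W, C) and that a concept vector has C entries. These shapes, the spatial averaging, and the helper names `concept_heatmaps` and `top_k_by_projection` are hypothetical and may differ from what `cdisco.vis` actually does.

```python
import numpy as np


def concept_heatmaps(conv_maps, concept_vector):
    """Project each spatial location's channel vector onto a unit concept direction."""
    unit = concept_vector / np.linalg.norm(concept_vector)
    # (n, H, W, C) @ (C,) -> (n, H, W): one scalar response per location.
    return conv_maps @ unit


def top_k_by_projection(conv_maps, concept_vector, k=5):
    """Indices of the k images with the largest spatially averaged projection."""
    scores = concept_heatmaps(conv_maps, concept_vector).mean(axis=(1, 2))
    return np.argsort(scores)[::-1][:k]


# Synthetic feature maps: 10 images, 4x4 spatial grid, 8 channels.
rng = np.random.default_rng(1)
conv_maps = rng.normal(scale=0.1, size=(10, 4, 4, 8))
concept = np.zeros(8)
concept[2] = 1.0
# Plant a strong response to the concept in images 3 and 7.
conv_maps[3, :, :, 2] += 2.0
conv_maps[7, :, :, 2] += 1.0

heatmaps = concept_heatmaps(conv_maps, concept)
best = top_k_by_projection(conv_maps, concept, k=5)
```

Here `best` starts with images 3 and 7, the two images in which the concept response was planted. Each entry of `heatmaps` could then be upsampled and overlaid on its input image.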

CDISCO - analyze tool

   import cdisco.analyze
  • cdisco.analyze.cdisco_alignment(concepts, concept_candidates) -> prints the list of classes that share the same discovered concept direction
  • cdisco.analyze.cdisco_pop_concepts(concept_candidates, classes, eigenvectors, save_fold='') -> prints the top 3 most popular directions among the classes
  • cdisco.analyze.cdisco_angle_dissection(eigenvectors, candidates, save_fold='') -> stores in save_fold the results of the alignment evaluation of each concept with the canonical basis
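One way to read "alignment with the canonical basis" is as the smallest angle between a concept direction and any single-neuron axis. The sketch below computes that angle in plain Python. This interpretation and the helper name `min_angle_to_canonical_basis` are assumptions, not the documented behavior of `cdisco_angle_dissection`.

```python
import math


def min_angle_to_canonical_basis(vector):
    """Return (axis_index, angle_in_degrees) for the closest canonical axis.

    The cosine between `vector` and axis e_i is |v_i| / ||v||, so the
    closest axis is the one with the largest absolute component. The sign
    is ignored because a direction and its negation span the same line.
    Hypothetical helper for illustration only.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    index = max(range(len(vector)), key=lambda i: abs(vector[i]))
    # Clamp to 1.0 to guard against floating-point overshoot before acos.
    cosine = min(1.0, abs(vector[index]) / norm)
    return index, math.degrees(math.acos(cosine))


# A direction lying on a single neuron is perfectly aligned (angle 0).
axis_aligned = min_angle_to_canonical_basis([0.0, -3.0, 0.0])
# A direction spread equally over two neurons sits 45 degrees from both.
diagonal = min_angle_to_canonical_basis([1.0, 1.0, 0.0])
```

Under this reading, a small angle means the concept is close to an individual neuron, while a large angle means it is distributed across several neurons.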

Getting Started

Follow the steps below to install the toolkit by cloning the GitHub repo. Installation via pip is planned but not yet available.

Installation

  1. Clone this repo
    git clone https://github.com/maragraziani/CDISCO.git
  2. Install required packages
    pip install -r requirements.txt
    

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Mara Graziani - @mormontre - mara.graziani@hevs.ch

About

Automatic identification of regions in the latent space of a model that correspond to unique concepts, namely to concepts with a semantically distinct meaning.
