Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision (AAAI2024)

Wonjoon Chang* · Dahee Kwon* · Jaesik Choi (* Equal Contribution)
[Paper][Poster]

This is the official PyTorch implementation of Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision, published at AAAI 2024.

Abstract

Understanding intermediate representations of the concepts learned by deep learning classifiers is indispensable for interpreting general model behaviors. In this paper, we propose a novel unsupervised method for discovering distributed representations of concepts by selecting a principal subset of neurons. Our empirical findings demonstrate that instances with similar neuron activation states tend to share coherent concepts. Based on this observation, the proposed method selects principal neurons that construct an interpretable region, namely a Relaxed Decision Region (RDR), encompassing instances with coherent concepts in the feature space. Our method identifies unlabeled subclasses within the data, detects causes of misclassification, and reveals distinct representations across layers, providing deeper insight into the internal mechanisms of deep learning models.

Example

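import numpy as np
# Star import; getconfigs, config_dist and class_dict are assumed to come from here.
from models.configurations import *
from models.rdr import RDR, visualize

# Placeholders: replace with your dataset and trained model.
train_data = ''
model = ''

# Choose a random target instance.
rand_target = np.random.choice(len(train_data), 1)[0]
tar_label = int(train_data[:][1][rand_target].numpy())  # predicted label
org_class = int(train_data.true_labels[rand_target])    # ground-truth label

print('Pred: ', class_dict[tar_label])
print('True: ', class_dict[org_class])

# Compute the configuration distance from the target to every instance.
_, configs = getconfigs(train_data, model, tar_layer=27)
# The original snippet indexed an undefined `features`; the candidate set is
# assumed here to be all instances in train_data.
config_values = config_dist(configs, rand_target, np.arange(len(train_data)), n_jobs=4)
config_similars = np.argsort(config_values)  # instance indices, nearest first

# Form the Relaxed Decision Region (RDR).
rdr = RDR(config_similars, configs)
rdr_samples, rdr_neurons, rdr_states = rdr.selection(k=8, t=10)

# Visualize the instances that fall inside the RDR.
visualize(rdr_samples, train_data[:][0])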

For a complete walkthrough, please refer to the notebook Relaxed-Decision-Region.ipynb.
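
For intuition about what the steps above compute, here is a simplified, self-contained NumPy sketch of the idea described in the abstract. It is not the code in this repository: the function names `activation_configs`, `configuration_distance` and `relaxed_region` are hypothetical, the configuration distance is assumed to be a Hamming distance between binarized activations, and the neuron-scoring rule is an illustrative heuristic rather than the paper's selection criterion.

```python
import numpy as np

def activation_configs(features):
    """Binarize activations: 1 where a neuron is active (> 0), else 0."""
    return (np.asarray(features) > 0).astype(np.int8)

def configuration_distance(configs, target_idx):
    """Hamming distance between the target's on/off pattern and every instance."""
    return np.sum(configs != configs[target_idx], axis=1)

def relaxed_region(configs, target_idx, k=8, t=10):
    """Pick t neurons and return the instances that match the target on them.

    Assumed heuristic: prefer neurons whose state is shared by the k nearest
    instances but is comparatively rare in the dataset as a whole.
    """
    dists = configuration_distance(configs, target_idx)
    # The target is at distance 0, so it is normally among its own neighbors.
    neighbors = np.argsort(dists, kind="stable")[:k]

    target_state = configs[target_idx]
    local_agree = np.mean(configs[neighbors] == target_state, axis=0)
    global_agree = np.mean(configs == target_state, axis=0)
    score = local_agree - global_agree

    neurons = np.argsort(score)[::-1][:t]   # t highest-scoring neurons
    states = target_state[neurons]          # required on/off state per neuron

    # Region members: instances matching the target on every selected neuron.
    match = np.all(configs[:, neurons] == states, axis=1)
    members = np.flatnonzero(match)
    return members, neurons, states

# Usage on synthetic activations (random data, for illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))
configs = activation_configs(features)
members, neurons, states = relaxed_region(configs, target_idx=0, k=8, t=10)
print("instances in region:", len(members))
print("selected neurons:", neurons)
```

Constraining only `t` neurons rather than the full activation pattern is what makes the region "relaxed": instances may differ from the target on every other neuron and still fall inside it.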

Citation

If you find this repo useful, please cite our paper:

@article{chang2023understanding,
  title={Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision},
  author={Chang, Wonjoon and Kwon, Dahee and Choi, Jaesik},
  journal={arXiv preprint arXiv:2312.17285},
  year={2023}
}
