Dissecting the weight space of neural networks


We have trained a collection of 16,000 deep convolutional neural networks in order to dissect the weight space of neural nets. This repository provides scripts for training additional networks, as well as for training meta-classifiers. A meta-classifier takes the sampled weight space as training input, with the objective of classifying which hyper-parameters were used to train a specific weight sample. For example, given only the weights of a network, can we predict which dataset was used to train the model? The purpose of a meta-classifier is to probe for information in neural network weights, in order to learn how optimization locally shapes the weights of a network.
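To make the meta-classification idea concrete, here is a toy sketch in pure NumPy, using synthetic data and a nearest-centroid classifier; it illustrates the setup only, and is not the models or features used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the weight space: 100 "networks", each a flat
# vector of 50 weights, trained under one of two hyper-parameter settings.
# Setting 1 gets a small mean shift so there is signal to recover.
n, d = 100, 50
labels = rng.integers(0, 2, size=n)               # 0/1 = which "dataset" was used
weights = rng.normal(size=(n, d)) + 0.5 * labels[:, None]

# A minimal meta-classifier: assign each test sample to the nearest
# class centroid in weight space.
train, test = np.arange(0, 80), np.arange(80, n)
centroids = np.stack([weights[train][labels[train] == c].mean(axis=0)
                      for c in (0, 1)])
dists = np.linalg.norm(weights[test][:, None, :] - centroids[None], axis=2)
pred = dists.argmin(axis=1)
accuracy = (pred == labels[test]).mean()
print(accuracy)  # well above chance on this synthetic data
```

Even this crude classifier recovers the hyper-parameter label well above chance here, which is the kind of signal the meta-classifiers in the repository probe for (with far more capable models).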

For information on the study of the weight space, we refer to the paper, which will be presented at the European Conference on Artificial Intelligence (ECAI 2020). For now, the paper can be found on arXiv; if you use the code or dataset, please cite it as follows:

@article{eilertsen2020classifying,
  author       = "Eilertsen, Gabriel and J\"onsson, Daniel and Ropinski, Timo and Unger, Jonas and Ynnerman, Anders",
  title        = "Classifying the classifier: dissecting the weight space of neural networks",
  journal      = "arXiv preprint arXiv:2002.05688",
  year         = "2020"
}


The entire dataset of neural network weights is available here. The dataset consists of 16,000 CNNs, each captured at 20 different steps during training, from initialization to converged model. In total, the dataset thus comprises 320,000 CNN weight snapshots.

The dataset contains two different subsets:

  • nws_fixed: 3,000 trained nets with fixed architecture, and randomly selected dataset and hyper-parameters (see the paper for details).
  • nws_main: 13,000 trained nets with randomly selected architecture, dataset and hyper-parameters.

Each of these subsets has been split into a training and a test set by means of the provided CSV files. The CSV files also contain information about the architecture and hyper-parameters of each trained net. Note that the CSV files do not include all of the trained CNNs; some models have been excluded due to poor convergence in training (very low test accuracy).

Given the path to one of the CSV files, util.import_weights() can be used to read a dataset with annotations for a selected hyper-parameter.
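As an illustration of what such an annotation file might hold, the following reads a made-up CSV with the Python standard library; the column names and values are assumptions for the example, not the actual file format used by util.import_weights():

```python
import csv
import io

# A made-up two-row annotation file in the spirit of the provided CSVs;
# the real columns (architecture and hyper-parameter fields) may differ.
csv_text = """net_id,dataset,activation,optimizer
0001,mnist,relu,adam
0002,cifar10,tanh,sgd
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
datasets = [r["dataset"] for r in rows]
print(datasets)  # ['mnist', 'cifar10']
```

In the real pipeline, a column like this would serve as the classification target when training a meta-classifier on the corresponding weight samples.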


The sampling script is used to sample/train a network with randomly selected hyper-parameters, e.g.:

python --data_name=mnist --log_dir=OUTPUT_PATH --prints=20 --fixed=0

prints selects how many times during a training run the weights should be exported. fixed can be used to fix the architecture, i.e. so that the architectural parameters are the same in each training run.

The training script trains and evaluates a meta-classifier, i.e. a classifier of raw weights with respect to some chosen hyper-parameter. For example, the following uses the nws_main dataset to learn to classify which dataset a weight sample was trained on:

python --data_train=nws_main_train.csv --data_test=nws_main_test.csv --prop=0 --K=11,20 --slice_length=5000

prop selects which hyper-parameter to classify. K specifies which snapshots of a network to use in training, in this case snapshots 11-20 (each model in the dataset has 20 weight snapshots, from initialization to converged model). slice_length selects how large a subset of the weights to use in training, i.e. it specifies the slice of weights for local meta-classification.
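The roles of K and slice_length can be sketched on a toy snapshot array; the shapes below are chosen arbitrarily for the example and do not reflect the actual data layout:

```python
import numpy as np

# Toy stand-in: one trained net, 20 weight snapshots of 12,000 weights each.
snapshots = np.arange(20 * 12000, dtype=np.float32).reshape(20, 12000)

# K=11,20 -> use snapshots 11..20 (1-indexed), i.e. rows 10..19.
selected = snapshots[10:20]

# slice_length=5000 -> a local slice of 5,000 consecutive weights per sample.
slice_length = 5000
sliced = selected[:, :slice_length]

print(sliced.shape)  # (10, 5000)
```

Restricting K to later snapshots focuses the meta-classifier on converged weights, while slice_length limits it to a local window of the flattened weight vector.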


Copyright (c) 2020, Gabriel Eilertsen. All rights reserved.

The code is distributed under a BSD license. See LICENSE for information.

