This repository contains code to reproduce the experiments from
Connectivity-Optimized Representation Learning via Persistent Homology
C. Hofer, R. Kwitt, M. Dixit and M. Niethammer
ICML '19
If you use this code (or parts of it), please cite this work as
@inproceedings{Hofer19a,
title = {Connectivity-Optimized Representation Learning via Persistent Homology},
author = {C.~Hofer and R.~Kwitt and M.~Dixit and M.~Niethammer},
booktitle = {ICML},
year = {2019}}
The setup described below was tested with the following system configuration:
- Ubuntu 18.04.2 LTS
- CUDA 10 (driver version 410.48)
- Anaconda (Python 3.7)
- PyTorch 1.1
In the following, we assume that we work in /tmp (obviously, you have to
change this to reflect your choice; /tmp is, of course, not the best choice :).
First, get the Anaconda installer and install Anaconda (in /tmp/anaconda3) using
cd /tmp/
wget https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
bash Anaconda3-2019.03-Linux-x86_64.sh
# specify /tmp/anaconda3 as your installation path
source /tmp/anaconda3/bin/activate
Second, we install PyTorch (v1.1) using
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
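Before moving on, it can help to confirm which versions actually ended up in the active environment. The following is only a minimal sketch: `describe_environment` is a hypothetical helper (not part of this repository) that tries to import each package and reports its `__version__`, or `None` if the import fails.

```python
import importlib
import sys


def describe_environment(packages=("torch", "torchvision")):
    """Return the Python version and the version of each importable package."""
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        try:
            module = importlib.import_module(name)
            # Fall back to "unknown" for modules without a __version__ attribute.
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            # The package is not installed in the active environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `torch` is reported as `None`, the Anaconda environment is most likely not activated in the current shell.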
Third, we clone the torchph repository from GitHub (previously named
chofer_torchex; it implements all the functionality required for the
experiments) and make it available within Anaconda.
cd /tmp/
git clone https://github.com/c-hofer/torchph.git
cd torchph
git fetch --all --tags --prune
git checkout tags/icml2019_code_release -b icml2019_code_release
cd ../
conda develop /tmp/torchph
Fourth, we clone this GitHub repository, using
cd /tmp/
git clone https://github.com/c-hofer/COREL_icml2019.git
cd COREL_icml2019
mkdir data
Finally, we modify config.py to reflect our choice of directories:
ablation_bkb_dir = '/tmp/COREL_icml2019/models/ablation'
ablation_res_dir = '/tmp/COREL_icml2019/results_ablation'
performance_bkb_dir = '/tmp/COREL_icml2019/models/performance'
performance_res_dir = '/tmp/COREL_icml2019/results_performance'
dataset_root_generic = '/tmp/COREL_icml2019/data'
dataset_root_special = {}
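Whether the training and evaluation scripts create these directories on their own is not stated here, so creating them up front is a cheap safeguard. The sketch below copies the paths from the listing above; `CONFIG_DIRS` and `ensure_directories` are hypothetical names used only for illustration and are not part of config.py.

```python
import os

# Paths copied from the config.py listing; adjust them to your own checkout.
CONFIG_DIRS = {
    "ablation_bkb_dir": "/tmp/COREL_icml2019/models/ablation",
    "ablation_res_dir": "/tmp/COREL_icml2019/results_ablation",
    "performance_bkb_dir": "/tmp/COREL_icml2019/models/performance",
    "performance_res_dir": "/tmp/COREL_icml2019/results_performance",
    "dataset_root_generic": "/tmp/COREL_icml2019/data",
}


def ensure_directories(paths):
    """Create every missing directory in `paths` (name -> path); return those created."""
    created = []
    for path in paths.values():
        if not os.path.isdir(path):
            # exist_ok guards against a race with another process creating it.
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created


if __name__ == "__main__":
    for path in ensure_directories(CONFIG_DIRS):
        print("created", path)
```

Running it a second time prints nothing, since all directories already exist.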
Note that CIFAR10 and CIFAR100 are directly available via PyTorch and will
be downloaded automatically (to /tmp/COREL_icml2019/data). For TinyImageNet-200,
please use the following link and extract the downloaded zip file into
/tmp/COREL_icml2019/data:
cd /tmp/COREL_icml2019/data
wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
unzip tiny-imagenet-200.zip
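A quick sanity check after extraction can catch a partial download or a wrong target directory. This sketch assumes the archive unpacks into a tiny-imagenet-200 folder whose train subfolder holds one directory per class; the figure of 200 classes is taken from the dataset name. `count_train_classes` is a hypothetical helper, and the layout the repository's own data loader expects may differ.

```python
import os


def count_train_classes(dataset_root):
    """Count class folders under <dataset_root>/tiny-imagenet-200/train."""
    train_dir = os.path.join(dataset_root, "tiny-imagenet-200", "train")
    if not os.path.isdir(train_dir):
        raise FileNotFoundError(f"missing directory: {train_dir}")
    # Only sub-directories count as classes; stray files are ignored.
    return sum(
        os.path.isdir(os.path.join(train_dir, entry))
        for entry in os.listdir(train_dir)
    )


if __name__ == "__main__":
    root = "/tmp/COREL_icml2019/data"  # dataset_root_generic from config.py
    try:
        n = count_train_classes(root)
        print(f"found {n} class folders (200 expected)")
    except FileNotFoundError as err:
        print(err)
```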
Note: ImageNet instructions will be added soon!
All experiments have the same structure.
First, an autoencoder, i.e., the "backbone", is trained on an auxiliary dataset, e.g., CIFAR10.
Then, the trained backbone's encoder is used to represent samples from the test dataset, e.g., ImageNet.
Here we train backbones on various auxiliary datasets (CIFAR10, CIFAR100, TinyImageNet) and evaluate the one-class performance on the test datasets (CIFAR10, CIFAR100, TinyImageNet, ImageNet).
cd /tmp/COREL_icml2019
python train_backbone_performance.py
python eval_backbone_performance.py
In this group of experiments, the overall impact of the hyper-parameters is evaluated, most importantly the impact of the weighting factor of the proposed connectivity loss.
cd /tmp/COREL_icml2019
python train_backbone_ablation.py
python eval_backbone_ablation.py
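Both experiment groups follow the same train-then-evaluate pattern, and evaluation only makes sense once training has finished successfully. A small driver can enforce that order. This is a sketch only: `run_in_order` is a hypothetical helper, and the script names are the ones listed in this README.

```python
import os
import subprocess
import sys


def run_in_order(scripts, cwd):
    """Run each script with the current interpreter; stop at the first failure."""
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so evaluation never starts after a failed training run.
        subprocess.run([sys.executable, script], cwd=cwd, check=True)


if __name__ == "__main__":
    repo = "/tmp/COREL_icml2019"  # adjust to your checkout
    if os.path.isdir(repo):
        run_in_order(
            ["train_backbone_ablation.py", "eval_backbone_ablation.py"],
            cwd=repo,
        )
    else:
        print(f"{repo} not found; adjust the path first")
```

Swapping in train_backbone_performance.py and eval_backbone_performance.py runs the performance experiments the same way.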
We provide two Jupyter notebooks to query results from the previous experiments,
in particular, ablation_study.ipynb and performance_study.ipynb.