neurve

This repository accompanies the paper Self-supervised representation learning on manifolds, presented at the ICLR 2021 Workshop on Geometrical and Topological Representation Learning.

Additionally, we implement a manifold version of triplet training, which will be expounded on in an upcoming preprint.

Notebooks

MSimCLR Inference (Open in Colab)

This notebook will run inference using a pre-trained Manifold SimCLR model (trained on either CIFAR10, FashionMNIST, or MNIST).

Installation

Install via

pip install neurve

or, to install with Weights & Biases support, run:

pip install "neurve[wandb]"

You can also install from source by cloning this repository and running the following from the repo root:

pip install . # or pip install ".[wandb]"
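
To verify the installation, a minimal smoke test (this simply checks that the package imports; the command is illustrative, not from the project docs):

python -c "import neurve"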

The dependencies are

numpy>=1.17.4
torch>=1.3.1
torchvision>=0.4.2
scipy>=1.5.3 (for parsing the cars dataset annotations)
tqdm
tensorboardX

Datasets

The datasets used for representation learning (CIFAR10, MNIST, FashionMNIST) are included in torchvision.datasets. The datasets for metric learning (e.g. CUB-200-2011 and the cars dataset) must be obtained separately and arranged as a folder of per-class subfolders, as described in the Manifold metric learning section below.
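
For reference, the metric-learning script expects the data root to look like the following layout (folder and file names here are placeholders for illustration):

$DATA_ROOT/
  class_A/
    image_001.jpg
    image_002.jpg
  class_B/
    image_003.jpg
  ...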

Training commands

Tracking with Weights & Biases

To use Weights & Biases to log training/validation metrics and store model checkpoints, set the environment variable NEURVE_TRACKER to wandb. Otherwise, tensorboardX will be used for metric logging and model checkpoints will be saved locally.
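
For example, to enable Weights & Biases tracking before launching a training run (shell syntax shown for illustration):

export NEURVE_TRACKER=wandb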

Manifold SimCLR

For self-supervised training, run the command

python experiments/simclr.py \
              --dataset $DATASET \
              --backbone $BACKBONE \
              --dim_z $DIM_Z \
              --n_charts $N_CHARTS \
              --n_epochs $N_EPOCHS \
              --tau $TAU \
              --out_path $OUT_PATH # if not using Weights & Biases for tracking

where

  • $DATASET is one of "cifar", "mnist", "fashion_mnist".
  • $BACKBONE is the name of the backbone network (in the paper we used "resnet50" for CIFAR10 and "resnet18" for MNIST and FashionMNIST).
  • $DIM_Z and $N_CHARTS are the dimension and number of charts, respectively, for the manifold.
  • $N_EPOCHS is the number of epochs to train for (in the paper we used 1,000 for CIFAR10 and 100 for MNIST and FashionMNIST).
  • $TAU is the temperature parameter for the contrastive loss function (in the paper we used 0.5 for CIFAR10 and 1.0 for MNIST and FashionMNIST).
  • $OUT_PATH is the path to save model checkpoints and tensorboard output.
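
For example, a CIFAR10 run matching the paper's settings (ResNet-50 backbone, 1,000 epochs, temperature 0.5); the values of --dim_z, --n_charts, and --out_path below are placeholders for illustration, not the paper's choices:

python experiments/simclr.py \
              --dataset cifar \
              --backbone resnet50 \
              --dim_z 4 \
              --n_charts 8 \
              --n_epochs 1000 \
              --tau 0.5 \
              --out_path runs/msimclr_cifar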

Manifold metric learning

To train with manifold metric learning, run the command

python experiments/triplet.py \
              --data_root $DATA_ROOT \
              --dim_z $DIM_Z \
              --n_charts $N_CHARTS \
              --out_path $OUT_PATH # if not using Weights & Biases for tracking

where

  • $DATA_ROOT is the path to the data (e.g. data/CUB_200_2011/images/ or data/cars/), which should be a folder of subfolders, where each subfolder has the images for one class.
  • $DIM_Z and $N_CHARTS are the dimension and number of charts, respectively, for the manifold.
  • $OUT_PATH is the path to save model checkpoints and tensorboard output.
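
For example, a run on CUB-200-2011 (the --dim_z, --n_charts, and --out_path values below are placeholders for illustration):

python experiments/triplet.py \
              --data_root data/CUB_200_2011/images/ \
              --dim_z 2 \
              --n_charts 8 \
              --out_path runs/triplet_cub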

Citation

@inproceedings{
  korman2021selfsupervised,
  title={Self-supervised representation learning on manifolds},
  author={Eric O Korman},
  booktitle={ICLR 2021 Workshop on Geometrical and Topological Representation Learning},
  year={2021},
  url={https://openreview.net/forum?id=EofGDIGAhvR}
}