kd-switch

This repository contains a Python implementation of the kd-switch online predictor and the KDS-seq sequential two-sample test proposed in our paper:

Low-Complexity Nonparametric Bayesian Online Prediction with Universal Guarantees
Alix Lhéritier & Frédéric Cazals
NeurIPS 2019

ArXiv version

It also contains a Python implementation of the k-nearest neighbors based online predictor and the KNN-seq sequential two-sample test of our previous paper:

A Sequential Non-Parametric Multivariate Two-Sample Test
Alix Lhéritier & Frédéric Cazals
IEEE Transactions on Information Theory, 64(5):3361–3370, 2018.

This repository contains the Python implementation used to produce the results reported in these papers. For a more efficient C++ implementation with Python bindings, see https://github.com/alherit/kd-switch-cpp.

Dependencies

  • python >= 2.7
  • numpy
  • h5py
  • lxml
  • scikit-learn
  • scipy
  • pandas
  • matplotlib
  • psutil
  • pomegranate

Reproducing experimental results

Each experiment is defined in its own bash script in the scripts folder. The trial index determines the seed used for randomness. For each trial, two files are generated: an XML file summarizing the results, and a text file containing, for each data point, the elapsed time, the predicted probability of the observed label, and the cumulative log loss.
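For reference, the cumulative log loss is simply the running sum of the negative logarithms of the probabilities predicted for the observed labels; the minimal Python sketch below computes it from such a sequence of probabilities. The base-2 logarithm and the exact layout of the text file are assumptions for illustration, not something prescribed by this repository.

    import numpy as np

    def cumulative_log_loss(probs, base=2.0):
        """Running sum of -log(p) over the probabilities predicted for
        the labels that were actually observed (base 2 is an assumption)."""
        probs = np.asarray(probs, dtype=float)
        return np.cumsum(-np.log(probs) / np.log(base))

    # Example: three rounds of prediction with improving probabilities.
    print(cumulative_log_loss([0.5, 0.7, 0.9]))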

To reproduce the results of the paper, run each of the following scripts in an empty folder:

experiment1.sh 
experiment2.sh

The DATAPATH environment variable must point to the folder containing the HIGGS and GANMNIST datasets, and the PYTHONPATH variable must include the code and scripts folders. A Python sketch of this setup is given below.
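For illustration only, here is one way to set these variables and launch an experiment from Python; the paths are placeholders, and exporting DATAPATH and PYTHONPATH in your shell before running the bash scripts directly works just as well.

    import os
    import subprocess

    # Placeholder paths -- adapt to your checkout and data location.
    repo = os.path.expanduser("~/kd-switch")
    env = dict(os.environ)
    env["DATAPATH"] = os.path.expanduser("~/data")  # folder holding the HIGGS and GANMNIST files
    env["PYTHONPATH"] = os.pathsep.join(
        [os.path.join(repo, "code"), os.path.join(repo, "scripts")])

    # Run the experiment script inside its own empty folder, as required above.
    outdir = "experiment1_results"
    if not os.path.isdir(outdir):
        os.makedirs(outdir)
    subprocess.check_call(
        ["bash", os.path.join(repo, "scripts", "experiment1.sh")],
        cwd=outdir, env=env)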

Higgs boson dataset

In our experiments, we used a random subset of the HIGGS dataset from the UCI Machine Learning Repository. This subset can be downloaded in HDF5 format from here. The HDF5 format is convenient for sequential tests since it allows constant-time random access to individual samples.
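To make the constant-time sampling point concrete, the sketch below draws a single random row from an HDF5 file by index, without loading the whole dataset into memory. The file name and the dataset name inside it ("features") are assumptions for illustration, not the actual layout expected by the scripts.

    import h5py
    import numpy as np

    rng = np.random.RandomState(0)  # fixed seed for reproducibility

    # Hypothetical file/dataset names; adapt to the actual HDF5 layout.
    with h5py.File("HIGGS.h5", "r") as f:
        data = f["features"]      # h5py Dataset: stays on disk
        n = data.shape[0]
        i = rng.randint(n)        # uniform random index
        sample = data[i, :]       # reads only that row, independent of n
        print(sample.shape)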

GAN generated vs real MNIST dataset

This dataset was generated using the pretrained model available at https://github.com/csinva/gan-pretrained-pytorch. The dataset can be downloaded in HDF5 format from here.

License

MIT license.

If you have questions or comments about anything regarding this work, please see the paper for contact information.
