author: Iain Carmichael
Additional documentation, examples and code revisions are coming soon. For questions, issues or feature requests please reach out to Iain: iain@unc.edu.
This package implements Direction-Projection-Permutation for High Dimensional Hypothesis Tests (DiProPerm). For details see Wei et al., 2016. DiProPerm "rigorously assesses whether a binary linear classifier is detecting statistically significant differences between two high-dimensional distributions."
Wei, S., Lee, C., Wichers, L., & Marron, J. S. (2016). Direction-projection-permutation for high-dimensional hypothesis tests. Journal of Computational and Graphical Statistics, 25(2), 549-569.
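To make the three steps in the name concrete, here is a minimal NumPy sketch of the procedure, assuming a mean difference direction, a mean difference separation statistic, and class labels coded as 0 and 1. It is not the package's implementation: the function names (`md_direction`, `md_statistic`, `diproperm_sketch`), the default `B`, and the p-value convention are illustrative assumptions.

```python
import numpy as np


def md_direction(X, y):
    """Unit vector pointing from the class-0 mean to the class-1 mean."""
    w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return w / np.linalg.norm(w)


def md_statistic(scores, y):
    """Absolute difference between the class means of the 1-D scores."""
    return abs(scores[y == 1].mean() - scores[y == 0].mean())


def diproperm_sketch(X, y, B=200, seed=0):
    """Hypothetical helper: direction, projection, then permutation."""
    rng = np.random.default_rng(seed)

    # Direction and projection on the observed labels.
    obs = md_statistic(X @ md_direction(X, y), y)

    # Permutation: shuffle the labels and refit the direction each time,
    # so the null distribution reflects the classifier's ability to
    # find separation in unlabeled structure.
    perm = np.empty(B)
    for b in range(B):
        y_perm = rng.permutation(y)
        perm[b] = md_statistic(X @ md_direction(X, y_perm), y_perm)

    # Assumed conventions: empirical p-value and a Z-score of the observed
    # statistic against the permutation distribution.
    return {
        'obs': obs,
        'pval': np.mean(perm >= obs),
        'Z': (obs - perm.mean()) / perm.std(),
    }
```

Refitting the direction inside the loop is the important design choice: projecting permuted labels onto the direction learned from the true labels would understate how much separation a classifier can find by chance in high dimensions.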
The diproperm package can be installed via pip or from GitHub. It is currently tested only on Python 3.6.
pip install diproperm
git clone https://github.com/idc9/diproperm.git
cd diproperm
python setup.py install
from sklearn.datasets import make_blobs
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline
from diproperm.DiProPerm import DiProPerm
# toy binary class dataset (two isotropic Gaussians)
X, y = make_blobs(n_samples=100, n_features=2, centers=2, cluster_std=2)
# DiProPerm with the mean difference classifier ('md'); mean difference ('md'),
# t ('t'), and AUC ('auc') separation statistics; and 1000 permutation samples.
dpp = DiProPerm(B=1000, separation_stats=['md', 't', 'auc'], clf='md')
dpp.fit(X, y)
dpp.test_stats_['md']
{'Z': 11.704865481794599,
'cutoff_val': 1.2678333596648679,
'obs': 4.542253375623943,
'pval': 0.0,
'rejected': True}
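One plausible reading of these fields, sketched below, is that each is a summary of the observed statistic against the permutation statistics. This is an assumption about the meaning of the output, not the package's code: the helper name `summarize_permutation_test` and the `alpha=0.05` level are hypothetical.

```python
import numpy as np


def summarize_permutation_test(obs, perm_stats, alpha=0.05):
    """Hypothetical helper: summarize one observed statistic against
    the statistics computed on permuted labels."""
    perm_stats = np.asarray(perm_stats, dtype=float)

    # Assumed: the cutoff is the upper-alpha quantile of the null statistics.
    cutoff_val = np.quantile(perm_stats, 1 - alpha)

    return {
        # Observed statistic standardized by the permutation distribution.
        'Z': (obs - perm_stats.mean()) / perm_stats.std(),
        'cutoff_val': cutoff_val,
        'obs': obs,
        # Fraction of permutation statistics at least as large as observed.
        'pval': np.mean(perm_stats >= obs),
        'rejected': bool(obs > cutoff_val),
    }
```

Under this reading, a `pval` of 0.0 means that none of the permutation statistics reached the observed value, so the true p-value is only known to be below roughly 1/B; the `Z` value is then the more informative measure of how strong the separation is.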
plt.figure(figsize=[12, 5])
# show histogram of separation statistics
plt.subplot(1, 2, 1)
dpp.plot_perm_sep_stats(stat='md')
# the observed scores
plt.subplot(1, 2, 2)
dpp.plot_observed_scores()
For more example code see these example notebooks.
The source code is located on GitHub: https://github.com/idc9/diproperm
Testing is done using nose.
We welcome contributions that make this a stronger package: data examples, bug fixes, spelling corrections, new features, etc.