Repository for the paper 'Topology-Preserving Dimensionality Reduction via Interleaving Optimization'. The images used in the paper are stored in the images folder, and the code is in the ipynb folder.
To re-run the experiments in the ipynb folder, you will first need to install several packages.
bats is used for general persistent homology computation, including greedy subsampling and computation flags. To install it, see the installation page (we suggest installing from source).
torch_tda is used for optimization over persistent homology. It is built on PyTorch, which supports automatic differentiation. To install it, see the installation page (we suggest installing from source).
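Before opening the notebooks, it can help to confirm that the packages are visible to the Python interpreter. The following is a minimal sketch, not part of the repository; it assumes the import names are bats, torch and torch_tda, matching the package names above, and the helper name missing_packages is hypothetical.

```python
import importlib.util

# Import names assumed from the package names mentioned above.
REQUIRED = ("bats", "torch", "torch_tda")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks the modules up on the import path without importing them, so it does not verify that a source build of bats compiled correctly.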
The synthetic data sets used in the paper are all in the ipynb notebooks. The real-life data sets are from:
- COIL-100: https://www.kaggle.com/jessicali9530/coil100/download
- Natural Image Patches: http://pirsquared.org/research/vhatdb/full/vanhateren_iml.zip
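Both real data sets are collections of images, whereas the method operates on a matrix with one row per sample. The sketch below shows one common way to flatten equally sized images into such a matrix. It is an illustration only: the helper images_to_matrix is hypothetical, random arrays stand in for the downloaded files, and the preprocessing used in the notebooks may differ.

```python
import numpy as np


def images_to_matrix(images):
    """Flatten equally sized image arrays into rows of an (n, p) float matrix."""
    arr = np.stack([np.asarray(img, dtype=np.float64) for img in images])
    return arr.reshape(arr.shape[0], -1)


# Stand-in data: 5 random 8x8 RGB "images" instead of real files.
rng = np.random.default_rng(0)
fake_images = [rng.random((8, 8, 3)) for _ in range(5)]
X = images_to_matrix(fake_images)
print(X.shape)  # (5, 192)
```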
The main function for our dimensionality reduction method on a data set X with shape (n,p) is
P, opt_info = bottleneck_proj_pursuit(X)
It returns a projection P with shape (p,2) and a dictionary opt_info that stores optimization information. The parameters you can pass to the function are listed in PH_projection_pursuit.py.
Next, to view the result in the 2D plane, use
import matplotlib.pyplot as plt
X_PH = X @ P.T
plt.scatter(X_PH[:, 0], X_PH[:, 1])
plt.show()
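The shape stated for P, (p,2), and the transpose in the snippet above agree only when p = 2, so it is worth checking the orientation of P before projecting. The following sketch accepts either orientation. The helper project_2d is hypothetical and not part of the repository. It assumes X and P are NumPy arrays (or convertible to them), and a random orthonormal matrix stands in for the output of bottleneck_proj_pursuit.

```python
import numpy as np


def project_2d(X, P):
    """Project X (n, p) to 2D with P given as either (p, 2) or (2, p)."""
    X, P = np.asarray(X), np.asarray(P)
    p = X.shape[1]
    if P.shape == (p, 2):
        return X @ P
    if P.shape == (2, p):
        return X @ P.T
    raise ValueError(f"P has shape {P.shape}, expected ({p}, 2) or (2, {p})")


# Stand-in inputs; in practice P would come from bottleneck_proj_pursuit(X).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
P = np.linalg.qr(rng.normal(size=(5, 2)))[0]  # random orthonormal (5, 2) matrix
X_PH = project_2d(X, P)
print(X_PH.shape)  # (100, 2)
```

When p = 2 both shapes coincide and the helper treats P as (p, 2), so the orientation should be checked by hand in that case.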