
HIsomap

This is a Python implementation of a homology-preserving dimensionality reduction (DR) algorithm. The implementation is described in "Homology-Preserving Dimensionality Reduction via Manifold Landmarking and Tearing".

A demo of both homology-preserving manifold landmarking and tearing is also included.

Installation

Tested with Python 2.7 and 3.7 on macOS and Linux.

Dependencies

HIsomap requires:

  • Python (>= 2.7 or >= 3.3)
  • NumPy
  • scikit-learn (sklearn)

Running examples requires:

  • matplotlib

Installation

Python2

$ git clone https://github.com/LynneYan/HIsomap.git
$ cd HIsomap
$ sudo python setup.py install

Python3

$ git clone https://github.com/LynneYan/HIsomap.git
$ cd HIsomap
$ sudo python3 setup.py install

Checking your HIsomap Installation

If all the above steps succeeded, "HIsomap X.XX" will appear in the output of pip list:

Python2

$ pip list

Python3

$ pip3 list

Run example

Python2

$ python example.py

Python3

$ python3 example.py

Features

class HIsomap(n_components=2, filter_function="base_point_geodesic_distance", BP='EP', nr_cubes=20, 
              overlap_perc=0.2, auto_tuning="off", n_neighbors=8, eigen_solver='auto', n_jobs=1, 
              clusterer=sklearn.cluster.DBSCAN(eps=0.6, min_samples=5))

Parameters

  • n_components, int, optional, default: 2

    • Number of dimensions in which to immerse the dissimilarities.
  • filter_function, string, optional, default: "base_point_geodesic_distance"

    • A string from ["sum", "mean", "median", "max", "min", "std", "dist_mean", "l2norm", "knn_distance_n", "height", "width", "base_point_geodesic_distance", "eccentricity", "Guass_density", "density_estimator", "integral_geodesic_distance", "graph_Laplacian", "Guass_density_auto"].
    • If using knn_distance_n, replace n with the desired number of neighbors: knn_distance_5 sums the distances to the 5 nearest neighbors.
    • If using base_point_geodesic_distance, the parameter BP controls how the base point is located.
  • BP, string, optional, default: "EP"

    • A string from ["EP", "BC", "DR"].
    • EP means extremal point, BC means barycenter, and DR means densest region.
  • nr_cubes, int, optional, default: 20

    • The number of intervals/hypercubes to create.
  • overlap_perc, float, optional, default: 0.2

    • The percentage of overlap between the intervals/hypercubes.
  • auto_tuning, string, optional, default: "off"

    • A string from ["off", "on"].
    • If "off", the input data is divided into nr_cubes intervals of fixed, equal length.
    • If "on", the input data is divided into nr_cubes intervals that each contain roughly the same number of points. The interval lengths then vary, so dense regions are covered by more, shorter intervals.
  • n_neighbors, int, optional, default: 8

    • Number of neighbors to consider for each point in Isomap.
  • eigen_solver, string, optional, default: "auto"

    • A string from ["auto", "arpack", "dense"].
    • auto: Attempt to choose the most efficient solver for the given problem.
    • arpack: Use Arnoldi decomposition to find the eigenvalues and eigenvectors.
    • dense: Use a direct solver (i.e. LAPACK) for the eigenvalue decomposition.
  • n_jobs, int or None, optional, default: 1

    • The number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.
  • clusterer, object, optional, default: sklearn.cluster.DBSCAN(eps=0.6, min_samples=5)

    • Scikit-learn API compatible clustering algorithm. Must provide fit and predict.
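
As a rough illustration of how nr_cubes, overlap_perc, and auto_tuning interact, the sketch below builds a cover of overlapping 1-D intervals over a filter function's values. This is a hedged sketch of the idea only, not the library's internals; the cover_intervals helper is hypothetical.

```python
import numpy as np

def cover_intervals(lens, nr_cubes=20, overlap_perc=0.2, auto_tuning="off"):
    """Hypothetical sketch: split 1-D filter values `lens` into
    nr_cubes overlapping intervals."""
    if auto_tuning == "off":
        # Fixed, equal-length intervals over the filter range.
        edges = np.linspace(lens.min(), lens.max(), nr_cubes + 1)
    else:
        # Quantile-based intervals: each holds roughly the same number
        # of points, so dense regions get shorter intervals.
        edges = np.quantile(lens, np.linspace(0.0, 1.0, nr_cubes + 1))
    lo, hi = edges[:-1], edges[1:]
    pad = (hi - lo) * overlap_perc / 2.0  # widen each interval symmetrically
    return np.column_stack([lo - pad, hi + pad])

lens = np.random.default_rng(0).normal(size=1000)
fixed = cover_intervals(lens, auto_tuning="off")     # (nr_cubes, 2) interval bounds
adaptive = cover_intervals(lens, auto_tuning="on")   # varying-length intervals
```

With auto_tuning="on", the intervals near the mode of the data become shorter, matching the behavior described above.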
__init__(self, n_components=2, filter_function="base_point_geodesic_distance", BP='EP', nr_cubes=20, 
         overlap_perc=0.2, auto_tuning="off", n_neighbors=8, eigen_solver='auto', n_jobs=1, 
         clusterer=sklearn.cluster.DBSCAN(eps=0.6, min_samples=5))

Initialize self.

fit_transform(self, X, y=None, init=None)

Fits the model to X and returns the embedded coordinates.

Parameters

  • X, array, shape (n_samples, n_features).

    • Input data.
  • y, Ignored

  • init, ndarray, shape (n_samples, n_components), optional, default: None

    • Starting configuration of the embedding to initialize the SMACOF algorithm. By default, the algorithm is initialized with a randomly chosen array.

Returns

  • Y, array, shape (n_samples, n_components)
    • Projected output.
get_landmark_index(self)

Returns

  • landmarks_indexes, int list
    • The indices of the landmarks in the input data.
get_skeleton_nodes(self)

Returns

  • landmarks, ndarray, shape (n_landmarks, n_features).
    • Nodes of mapper graph in original domain.
get_skeleton_links(self)

Returns

  • skeleton, ndarray, shape (n_links, 2).
    • Edges of mapper graph.
get_scalar_value(self)

Returns

  • lens, ndarray, shape (n_samples,)
    • Scalar filter values of the input data: a lower-dimensional representation of the data.
get_base_point(self)

Returns

  • basePoint, ndarray, shape (n_features,).
    • Base point in original domain.
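
The accessor outputs compose naturally: each row of get_skeleton_links() indexes two rows of get_skeleton_nodes(). A minimal sketch, using hypothetical stand-in arrays in place of a fitted model:

```python
import numpy as np

# Hypothetical stand-ins for proj.get_skeleton_nodes() and
# proj.get_skeleton_links() -- not output from a fitted model.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])  # (n_landmarks, n_features)
links = np.array([[0, 1], [1, 2]])                       # (n_links, 2) index pairs

# Each link picks out two node rows; e.g. the total edge length
# of the mapper skeleton in the original domain:
edge_vecs = nodes[links[:, 0]] - nodes[links[:, 1]]
total_length = np.linalg.norm(edge_vecs, axis=1).sum()  # 1.0 + 1.0 = 2.0
```

The same indexing pattern is what a plotting routine would use to draw the skeleton on top of the embedding.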

Usage

Python2 code

# Import dependencies
import numpy as np

# Import the class
from HIsomap import HIsomap

# Sample data "Swiss hole"
file_name = './data/SwissHole.txt'
X = np.loadtxt(file_name)

# Initialize. In this example, the number of cubes is 25 and auto_tuning is enabled; other parameters keep their defaults.
proj = HIsomap(nr_cubes=25, auto_tuning="on")

# Fit and transform the data. Y is the projected result in 2-dimensional space.
Y = proj.fit_transform(X)

# You can also get the 'mapper graph' with nodes and edges.
proj.get_skeleton_nodes()
proj.get_skeleton_links()

Python3 code

# Import dependencies
import numpy as np
import sklearn

# Import the class
from HIsomap import HIsomap

# Sample data "octa"
file_name = './data/octa.txt'
X = np.loadtxt(file_name)

# Initialize with auto_tuning turned off, using the parameters from our paper.
proj = HIsomap(nr_cubes=20, overlap_perc=0.2, clusterer=sklearn.cluster.DBSCAN(eps=150, min_samples=5), filter_function="base_point_geodesic_distance", BP="BC", auto_tuning="off")

# Fit and transform the data. Y is the projected result in 2-dimensional space.
Y = proj.fit_transform(X)

# You can also get the 'mapper graph' with nodes and edges.
proj.get_skeleton_nodes()
proj.get_skeleton_links()

Citation

License

Standard MIT disclaimer applies, see LICENSE for full text.
