In [1]:
from time import time

from collections import namedtuple
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.ticker import NullFormatter
from sklearn import (manifold, datasets, decomposition, ensemble,
                     discriminant_analysis, random_projection)

**Making the Swiss Roll data and setting up variables.**  

We generate a Swiss roll with `n_points` samples and an optional amount of Gaussian noise.

We also set two variables for manifold learning, `n_neighbors` and `n_components`:
* n_neighbors: number of neighbors to consider for each point
* n_components: number of coordinates for the manifold.

In [2]:
n_points = 1000
noise = 0
# make_swiss_roll lives directly in sklearn.datasets; noise is passed by keyword.
X, color = datasets.make_swiss_roll(n_points, noise=noise)

# n_components is set to 2 for this program because embeddings with
# more than 2 components are difficult to visualize.
n_neighbors = 10
n_components = 2
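
As a quick sanity check on the data, the sketch below regenerates a Swiss roll and inspects what comes back. It assumes a fixed `random_state` (not used in the cell above) so the result is repeatable. The second return value is the position of each sample along the roll, which is why it works as a color.

```python
from sklearn import datasets

n_points = 1000

# random_state is an assumption added here for reproducibility.
X, color = datasets.make_swiss_roll(n_points, noise=0.0, random_state=0)

print(X.shape)      # one row per sample, three coordinates per row
print(color.shape)  # one scalar per sample: position along the roll
```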

**Setting up the figure.**

In [3]:
# Setting up the figure.
fig = plt.figure(figsize=(20, 10))
plt.suptitle("Manifold Learning with %i points, %i neighbors, %.2f noise"
             % (n_points, n_neighbors, noise), fontsize=14)

<matplotlib.text.Text at 0x111b5b090>

**Drawing the original Swiss Roll.**

In [4]:
# Drawing the first subplot of the figure. 
#   .add_subplot(251) means 2x5 grid, 1st plot.
#   X.shape is (n_points, 3); ax.scatter takes one coordinate array per axis.
ax = fig.add_subplot(251, projection='3d')
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=color)
plt.title("Original Swiss Roll")

<matplotlib.text.Text at 0x11288b750>

**Setting up a function to apply manifold learning and plot result.**

In [5]:
def manifold_plot(raw_data, plot_loc, manifold_name, manifold_method):
    # Time only the fit; plotting is excluded from the reported duration.
    t0 = time()
    manifold_fit = manifold_method.fit_transform(raw_data)
    t1 = time()
    
    ax = fig.add_subplot(plot_loc)

    # Plot the 2 dimensions, colored by position along the roll.
    ax.scatter(manifold_fit[:, 0], manifold_fit[:, 1], c=color, cmap=plt.cm.Spectral)
    ax.set_title(manifold_name + " (%.2g sec)" % (t1 - t0))
    ax.xaxis.set_major_formatter(NullFormatter())
    ax.yaxis.set_major_formatter(NullFormatter())
    ax.axis('tight')

**Setting up methods.** Each estimator is written out by hand here, which is repetitive. There has to be a better way.

In [6]:
name_method = namedtuple('name_method', 'name method')

# Hyperparameters are passed by keyword; recent scikit-learn releases
# reject positional n_neighbors / n_components for these estimators.
LLE = name_method('LLE', 
        manifold.LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=n_components, eigen_solver='auto', method='standard'))
LTSA = name_method('LTSA', 
        manifold.LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=n_components, eigen_solver='auto', method='ltsa'))
HessianLLE = name_method('HessianLLE', 
        manifold.LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=n_components, eigen_solver='auto', method='hessian'))
ModifiedLLE = name_method('ModifiedLLE', 
        manifold.LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=n_components, eigen_solver='auto', method='modified'))
Isomap = name_method('Isomap', 
        manifold.Isomap(n_neighbors=n_neighbors, n_components=n_components))
MDS = name_method('MDS', 
        manifold.MDS(n_components=n_components, max_iter=100, n_init=1))
SpectralEmbedding = name_method('SpectralEmbedding', 
        manifold.SpectralEmbedding(n_components=n_components, n_neighbors=n_neighbors))
tSNE = name_method('tSNE', 
        manifold.TSNE(n_components=n_components, init='pca', random_state=0))

methods = [LLE, LTSA, HessianLLE, ModifiedLLE, Isomap, MDS, SpectralEmbedding, tSNE]
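
One possible way to cut the repetition in the cell above is to build the four `LocallyLinearEmbedding` entries in a loop, since they differ only in `method`. This is a sketch, not the only option; the helper list `lle_variants` is a name introduced here for illustration.

```python
from collections import namedtuple
from sklearn import manifold

name_method = namedtuple('name_method', 'name method')
n_neighbors, n_components = 10, 2

# (display name, LocallyLinearEmbedding `method` value); hypothetical helper list.
lle_variants = [('LLE', 'standard'), ('LTSA', 'ltsa'),
                ('HessianLLE', 'hessian'), ('ModifiedLLE', 'modified')]

# The four LLE flavours share every argument except `method`.
methods = [
    name_method(name, manifold.LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components,
        eigen_solver='auto', method=variant))
    for name, variant in lle_variants
]

# The remaining estimators have different signatures, so they are listed explicitly.
methods += [
    name_method('Isomap', manifold.Isomap(
        n_neighbors=n_neighbors, n_components=n_components)),
    name_method('MDS', manifold.MDS(
        n_components=n_components, max_iter=100, n_init=1)),
    name_method('SpectralEmbedding', manifold.SpectralEmbedding(
        n_components=n_components, n_neighbors=n_neighbors)),
    name_method('tSNE', manifold.TSNE(
        n_components=n_components, init='pca', random_state=0)),
]

print([m.name for m in methods])
```

The resulting `methods` list has the same order and contents as the hand-written one, so the plotting loop does not need to change.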


**Applying manifold_plot to each method.**

In [7]:
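for i, method in enumerate(methods):
    try:
        # Subplots 252..259 fill the 2x5 grid after the original roll.
        manifold_plot(X, 252 + i, method.name, method.method)
    
    # With high noise level, some of the models fail; report and skip them.
    except Exception as err:
        print("%s failed: %s" % (method.name, err))
plt.show()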

## Explanation of the manifold methods ##
Much of this section is adapted from the scikit-learn documentation: http://scikit-learn.org/stable/modules/manifold.html#manifold
  
**Locally Linear Embedding** 
 
LLE seeks a lower-dimensional projection of the data which preserves distances within local neighborhoods. It can be thought of as a series of local Principal Component Analyses which are globally compared to find the best non-linear embedding. 

`LocallyLinearEmbedding` offers 4 methods: standard, local tangent space alignment (LTSA), hessian (HLLE), and modified (MLLE).
- standard: the basic algorithm described above, which preserves distances within local neighborhoods.
- LTSA: not strictly an LLE variant but algorithmically similar; it characterizes the local geometry at each neighborhood via its tangent space, then aligns those tangent spaces globally.
- HLLE: addresses LLE's regularization problem with a hessian-based quadratic form at each neighborhood.
- MLLE: addresses the same regularization problem by using multiple weight vectors in each neighborhood.
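
To make the four variants concrete, here is a minimal sketch that fits each one on a small Swiss roll. The sample size, `random_state`, and `eigen_solver='dense'` are assumptions chosen to keep the example small and stable; the notebook itself uses `eigen_solver='auto'` on 1000 points. The scikit-learn documentation notes that `method='hessian'` requires `n_neighbors > n_components * (n_components + 3) / 2`, so that condition is checked first.

```python
from sklearn import datasets, manifold

# Small illustrative dataset; size and seed are assumptions, not the notebook's values.
X, _ = datasets.make_swiss_roll(300, noise=0.0, random_state=0)
n_neighbors, n_components = 10, 2

# Requirement documented for method='hessian'.
assert n_neighbors > n_components * (n_components + 3) / 2

embeddings = {}
for variant in ('standard', 'ltsa', 'hessian', 'modified'):
    lle = manifold.LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components,
        eigen_solver='dense', method=variant)
    embeddings[variant] = lle.fit_transform(X)
    # reconstruction_error_ is the value of the cost function each variant minimizes.
    print(variant, embeddings[variant].shape, lle.reconstruction_error_)
```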
  
  
**Isomap**

Isomap can be viewed as an extension of MDS or Kernel PCA. It seeks a lower-dimensional embedding which maintains geodesic distances between all points.
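
The difference between geodesic and straight-line distance can be seen directly. The sketch below, on an assumed 300-point roll with a fixed seed, compares the graph distances that a fitted `Isomap` stores in `dist_matrix_` with plain Euclidean distances. Because geodesic paths follow the neighbor graph along the roll, they can never be shorter than the straight line.

```python
import numpy as np
from sklearn import datasets, manifold
from sklearn.metrics import pairwise_distances

# Sample size and seed are assumptions for a quick, repeatable run.
X, _ = datasets.make_swiss_roll(300, noise=0.0, random_state=0)

iso = manifold.Isomap(n_neighbors=10, n_components=2)
embedding = iso.fit_transform(X)

geodesic = iso.dist_matrix_        # shortest-path distances on the neighbor graph
euclidean = pairwise_distances(X)  # straight-line distances in 3-D

# Ratio > 1 means the path along the manifold is longer than the chord.
ratio = geodesic.max() / euclidean.max()
print(embedding.shape, ratio)
```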
  
  
**MDS**

Multidimensional scaling (MDS) is often used for analyzing similarity or dissimilarity in data. It attempts to model similarity or dissimilarity data as distances in a geometric space.
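
As a rough illustration, the sketch below runs `MDS` with the same `max_iter` and `n_init` as the notebook on a smaller roll (the size and `random_state` are assumptions). The fitted `stress_` attribute reports how far the embedded distances are from the original dissimilarities; lower is better.

```python
from sklearn import datasets, manifold

# Smaller dataset than the notebook's, chosen only to keep the run short.
X, _ = datasets.make_swiss_roll(200, noise=0.0, random_state=0)

mds = manifold.MDS(n_components=2, max_iter=100, n_init=1, random_state=0)
embedding = mds.fit_transform(X)

# stress_ is the final value of the objective MDS minimizes.
print(embedding.shape, mds.stress_)
```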
  
  
**SpectralEmbedding**

Spectral Embedding finds a low dimensional representation of the data using a spectral decomposition of the graph Laplacian. 
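
The idea can be sketched by hand with NumPy. This is a simplified illustration, not scikit-learn's implementation: it assumes a symmetrized k-nearest-neighbor connectivity graph and the unnormalized Laplacian `L = D - W`, whereas `SpectralEmbedding` uses a normalized Laplacian internally. The embedding is read from the eigenvectors with the smallest non-zero eigenvalues.

```python
import numpy as np
from sklearn import datasets
from sklearn.neighbors import kneighbors_graph

# Sample size and seed are assumptions for a quick, repeatable run.
X, _ = datasets.make_swiss_roll(200, noise=0.0, random_state=0)

# Symmetric neighbor graph: W[i, j] > 0 if i and j are neighbors in either direction.
W = kneighbors_graph(X, n_neighbors=10, mode='connectivity',
                     include_self=False).toarray()
W = 0.5 * (W + W.T)

# Unnormalized graph Laplacian (a simplification, see the note above).
degree = W.sum(axis=1)
laplacian = np.diag(degree) - W

# eigh returns eigenvalues in ascending order; the first is ~0 (constant vector).
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
embedding = eigenvectors[:, 1:3]  # skip the trivial eigenvector, keep the next two

print(embedding.shape, eigenvalues[:3])
```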
  
  
**tSNE**

t-SNE (TSNE) converts affinities of data points to probabilities. The affinities in the original space are represented by Gaussian joint probabilities and the affinities in the embedded space are represented by Student's t-distributions. This makes t-SNE particularly sensitive to local structure and gives it a few other advantages over existing techniques: 
- Revealing the structure at many scales on a single map
- Revealing data that lie in multiple, different manifolds or clusters
- Reducing the tendency to crowd points together at the center

While Isomap, LLE and variants are best suited to unfold a single continuous low dimensional manifold, t-SNE will focus on the local structure of the data and will tend to extract clustered local groups of samples.
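
A minimal sketch of running t-SNE on its own follows. The sample size and the explicit `perplexity=30` are assumptions made for illustration (the notebook leaves `perplexity` at its default). `perplexity` roughly controls how many neighbors each point's Gaussian affinity covers, and `kl_divergence_` is the final mismatch between the two probability distributions described above.

```python
from sklearn import datasets, manifold

# Small dataset so the optimization finishes quickly; size and seed are assumptions.
X, _ = datasets.make_swiss_roll(200, noise=0.0, random_state=0)

tsne = manifold.TSNE(n_components=2, init='pca', perplexity=30, random_state=0)
embedding = tsne.fit_transform(X)

# kl_divergence_ is the final value of the objective t-SNE minimizes.
print(embedding.shape, tsne.kl_divergence_)
```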