
<br>
=============================================================================<br>
t-SNE: The effect of various perplexity values on the shape<br>
=============================================================================<br>
An illustration of t-SNE on the two concentric circles, S-curve and uniform<br>
grid datasets for different perplexity values.<br>
We observe a tendency towards clearer shapes as the perplexity value increases.<br>
The size, distance and shape of clusters may vary with the initialization and<br>
the perplexity value, and do not always convey meaning.<br>
As shown below, for higher perplexities t-SNE recovers the topology of the<br>
two concentric circles; however, the size and the distance of the circles vary<br>
slightly from the original. In contrast, on the S-curve dataset the shapes<br>
visually diverge from the S-curve topology even for<br>
larger perplexity values.<br>
For further details, "How to Use t-SNE Effectively"<br>
https://distill.pub/2016/misread-tsne/ provides a good discussion of the<br>
effects of various parameters, as well as interactive plots to explore<br>
those effects.<br>


Author: Narine Kokhlikyan <narine@slice.com><br>
License: BSD

In [None]:
print(__doc__)

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
from matplotlib.ticker import NullFormatter
from sklearn import manifold, datasets
from time import time

In [None]:
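n_samples = 300
n_components = 2
# One row per dataset: the first column shows the original data and the
# remaining four columns show one t-SNE embedding per perplexity value.
(fig, subplots) = plt.subplots(3, 5, figsize=(15, 8))
perplexities = [5, 30, 50, 100]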
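As a rough aid for reading the `perplexities` list, the sketch below illustrates what a perplexity value measures in the usual t-SNE formulation: for one point, a Gaussian bandwidth is tuned until the distribution over its neighbours has perplexity `2 ** entropy` equal to the target, so the value acts like an effective neighbour count. The helper names `conditional_perplexity` and `sigma_for_perplexity`, the random test points and the bisection bounds are assumptions made for illustration; this is not scikit-learn's internal implementation.

```python
import numpy as np


def conditional_perplexity(sq_dists, sigma):
    """Perplexity 2**H of a Gaussian distribution over the neighbours.

    sq_dists holds the squared distances from one point to all other
    points (the point itself excluded); sigma is the Gaussian bandwidth.
    """
    logits = -sq_dists / (2.0 * sigma ** 2)
    logits -= logits.max()  # stabilise the exponentials
    p = np.exp(logits)
    p /= p.sum()
    entropy = -np.sum(p * np.log2(p + 1e-12))  # Shannon entropy in bits
    return 2.0 ** entropy


def sigma_for_perplexity(sq_dists, target, n_steps=60):
    """Bisection on a log scale for the bandwidth matching the target."""
    lo, hi = 1e-4, 1e4  # assumed search bounds, wide enough for unit-scale data
    for _ in range(n_steps):
        mid = np.sqrt(lo * hi)  # geometric midpoint
        if conditional_perplexity(sq_dists, mid) < target:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)


rng = np.random.RandomState(0)
points = rng.rand(300, 2)  # hypothetical data, not the datasets used here
sq_dists = np.sum((points[1:] - points[0]) ** 2, axis=1)

results = []
for target in [5, 30, 50, 100]:
    sigma = sigma_for_perplexity(sq_dists, target)
    results.append((target, sigma, conditional_perplexity(sq_dists, sigma)))
    print("target=%d sigma=%.4f achieved=%.3f" % results[-1])
```

A larger target perplexity needs a wider bandwidth, which is why higher values make each point attend to more of its neighbours.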

In [None]:
X, y = datasets.make_circles(n_samples=n_samples, factor=.5, noise=.05)

In [None]:
# Boolean masks: label 0 is the outer circle, label 1 the inner circle.
red = y == 0
green = y == 1

In [None]:
ax = subplots[0][0]
ax.scatter(X[red, 0], X[red, 1], c="r")
ax.scatter(X[green, 0], X[green, 1], c="g")
ax.xaxis.set_major_formatter(NullFormatter())
ax.yaxis.set_major_formatter(NullFormatter())
ax.axis('tight')

In [None]:
for i, perplexity in enumerate(perplexities):
    ax = subplots[0][i + 1]
    t0 = time()
    tsne = manifold.TSNE(n_components=n_components, init='random',
                         random_state=0, perplexity=perplexity)
    Y = tsne.fit_transform(X)
    t1 = time()
    print("circles, perplexity=%d in %.2g sec" % (perplexity, t1 - t0))
    ax.set_title("Perplexity=%d" % perplexity)
    ax.scatter(Y[red, 0], Y[red, 1], c="r")
    ax.scatter(Y[green, 0], Y[green, 1], c="g")
    ax.xaxis.set_major_formatter(NullFormatter())
    ax.yaxis.set_major_formatter(NullFormatter())
    ax.axis('tight')
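The introduction notes that the size, distance and shape of clusters may vary with the initialization. A minimal sketch of one way to see this is to embed the same circles data twice, changing only `random_state`. The smaller sample size and the variable names `X_demo` and `embeddings` are assumptions chosen to keep the run short and to avoid overwriting the variables used above.

```python
import numpy as np
from sklearn import datasets, manifold

# Smaller dataset than in the example above, only to keep this check quick.
X_demo, y_demo = datasets.make_circles(n_samples=100, factor=.5, noise=.05,
                                       random_state=0)

embeddings = []
for seed in (0, 1):
    tsne = manifold.TSNE(n_components=2, init='random',
                         random_state=seed, perplexity=30)
    embeddings.append(tsne.fit_transform(X_demo))

# The two runs share data and perplexity; only the random seed differs.
max_diff = np.abs(embeddings[0] - embeddings[1]).max()
print("largest coordinate difference between seeds: %.3g" % max_diff)
```

Because the coordinates differ between seeds, absolute positions in a t-SNE plot should not be compared across runs.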

Another example using the S-curve dataset

In [None]:
X, color = datasets.make_s_curve(n_samples, random_state=0)

In [None]:
ax = subplots[1][0]
# Plot the first and third coordinates, the projection in which the S is visible.
ax.scatter(X[:, 0], X[:, 2], c=color)
ax.xaxis.set_major_formatter(NullFormatter())
ax.yaxis.set_major_formatter(NullFormatter())

In [None]:
for i, perplexity in enumerate(perplexities):
    ax = subplots[1][i + 1]
    t0 = time()
    tsne = manifold.TSNE(n_components=n_components, init='random',
                         random_state=0, perplexity=perplexity)
    Y = tsne.fit_transform(X)
    t1 = time()
    print("S-curve, perplexity=%d in %.2g sec" % (perplexity, t1 - t0))
    ax.set_title("Perplexity=%d" % perplexity)
    ax.scatter(Y[:, 0], Y[:, 1], c=color)
    ax.xaxis.set_major_formatter(NullFormatter())
    ax.yaxis.set_major_formatter(NullFormatter())
    ax.axis('tight')
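Visual inspection can be complemented by a number. The sketch below uses `sklearn.manifold.trustworthiness`, which scores between 0 and 1 how well local neighbourhoods of the input are preserved in the embedding. The reduced sample size, the two perplexity values and the names `X_s`, `Y_s` and `scores` are assumptions for illustration; the score says nothing about global shape, so it does not replace the plots.

```python
from sklearn import datasets, manifold

# Fewer samples than in the example above, only to keep the run short.
X_s, _ = datasets.make_s_curve(150, random_state=0)

scores = {}
for perplexity in (5, 30):
    tsne = manifold.TSNE(n_components=2, init='random',
                         random_state=0, perplexity=perplexity)
    Y_s = tsne.fit_transform(X_s)
    # Fraction-like score in [0, 1]; higher means local neighbours are kept.
    scores[perplexity] = manifold.trustworthiness(X_s, Y_s, n_neighbors=5)
    print("perplexity=%d trustworthiness=%.3f"
          % (perplexity, scores[perplexity]))
```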

Another example using a 2D uniform grid

In [None]:
x = np.linspace(0, 1, int(np.sqrt(n_samples)))
xx, yy = np.meshgrid(x, x)
X = np.hstack([
    xx.ravel().reshape(-1, 1),
    yy.ravel().reshape(-1, 1),
])
color = xx.ravel()
ax = subplots[2][0]
ax.scatter(X[:, 0], X[:, 1], c=color)
ax.xaxis.set_major_formatter(NullFormatter())
ax.yaxis.set_major_formatter(NullFormatter())
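One detail of the grid construction is easy to miss: `int(np.sqrt(n_samples))` truncates, so the grid holds fewer points than `n_samples`. The sketch below rebuilds an equivalent grid and checks its size against the perplexity values, since a perplexity is only meaningful when it is smaller than the number of points. The names `n_requested`, `side`, `grid` and `perplexities_demo` are hypothetical and only mirror the values used in this example.

```python
import numpy as np

n_requested = 300
side = int(np.sqrt(n_requested))  # truncates 17.32... to 17
grid_axis = np.linspace(0, 1, side)
gx, gy = np.meshgrid(grid_axis, grid_axis)
# column_stack gives the same (n_points, 2) layout as the hstack call above.
grid = np.column_stack([gx.ravel(), gy.ravel()])

perplexities_demo = [5, 30, 50, 100]
print("grid side=%d, points=%d" % (side, grid.shape[0]))
for perplexity in perplexities_demo:
    print("perplexity=%d smaller than number of points: %s"
          % (perplexity, perplexity < grid.shape[0]))
```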

In [None]:
for i, perplexity in enumerate(perplexities):
    ax = subplots[2][i + 1]
    t0 = time()
    tsne = manifold.TSNE(n_components=n_components, init='random',
                         random_state=0, perplexity=perplexity)
    Y = tsne.fit_transform(X)
    t1 = time()
    print("uniform grid, perplexity=%d in %.2g sec" % (perplexity, t1 - t0))
    ax.set_title("Perplexity=%d" % perplexity)
    ax.scatter(Y[:, 0], Y[:, 1], c=color)
    ax.xaxis.set_major_formatter(NullFormatter())
    ax.yaxis.set_major_formatter(NullFormatter())
    ax.axis('tight')

In [None]:
plt.show()