# **t-SNE (t-Distributed Stochastic Neighbor Embedding)**

Short explanation:

t-SNE is a nonlinear dimensionality reduction technique used to visualize high-dimensional data in 2D or 3D.
It tries to keep points that are neighbors in the original space close together in the embedding; distances between far-apart groups are preserved much less faithfully.
It is used mostly for data visualization, not for model training.
It is popular in data science and machine learning for exploring cluster structure.
It works best on small to medium datasets, because the run time grows quickly with the number of samples.
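
To make "keeping similar points close" concrete, here is a minimal NumPy sketch of the quantity t-SNE minimizes: the KL divergence between Gaussian similarities in the original space and Student-t similarities in the embedding. It is a simplification, not the scikit-learn implementation: it uses one shared bandwidth `sigma` instead of tuning a bandwidth per point from the perplexity, and the helper names (`gaussian_affinities`, `student_t_affinities`, `kl_divergence`) are made up for this illustration.

```python
import numpy as np

def pairwise_sq_dists(Z):
    # Squared Euclidean distance between every pair of rows of Z.
    sq = np.sum(Z ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.maximum(d, 0.0)  # clip tiny negatives caused by rounding

def gaussian_affinities(X, sigma=1.0):
    # Similarities in the original space. Assumption: one shared sigma;
    # real t-SNE picks a sigma per point to match the chosen perplexity.
    P = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def student_t_affinities(Y):
    # Similarities in the embedding, using a heavy-tailed Student-t kernel.
    Q = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum()

def kl_divergence(P, Q):
    # KL(P || Q), summed over the pairs where P is non-zero.
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

# Two well-separated groups of 20 points in 5 dimensions.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 0.3, (20, 5)),
                    rng.normal(3.0, 0.3, (20, 5))])
P = gaussian_affinities(X_demo)

# A layout that keeps each group together (first two coordinates) ...
Y_good = X_demo[:, :2].copy()
# ... and one that swaps half of each group, breaking the neighborhoods.
Y_bad = Y_good.copy()
Y_bad[:10], Y_bad[20:30] = Y_good[20:30], Y_good[:10]

kl_good = kl_divergence(P, student_t_affinities(Y_good))
kl_bad = kl_divergence(P, student_t_affinities(Y_bad))
print(f"KL (neighbors kept):   {kl_good:.3f}")
print(f"KL (neighbors broken): {kl_bad:.3f}")
```

The layout that keeps neighbors together scores a lower KL divergence. t-SNE searches for the layout that minimizes this score using gradient descent.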

In [None]:
import plotly.express as px
from sklearn.datasets import make_classification
from sklearn.manifold import TSNE

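# Synthetic dataset: 1500 samples, 6 features (2 informative), 3 classes
X, y = make_classification(n_samples=1500,
                           n_features=6,
                           n_informative=2,
                           random_state=42,
                           n_classes=3,
                           n_clusters_per_class=1)

# Plot only the first three of the six features, colored by class label
fig = px.scatter_3d(x=X[:, 0], y=X[:, 1], z=X[:, 2], color=y, opacity=0.8)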
fig.show()


In [None]:
from sklearn.decomposition import PCA
import plotly.express as px

# Project the 6 features onto the first two principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

In [None]:
fig = px.scatter(x=X_pca[:, 0], y=X_pca[:, 1], color=y)
fig.update_layout(
    title="PCA",
    xaxis_title="First principal component",
    yaxis_title="Second principal component"
)
fig.show()
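
As a rough check on the PCA plot, the sketch below prints how much of the total variance the two principal components retain. It regenerates the same synthetic data so that it runs on its own; `explained_variance_ratio_` is the fitted attribute scikit-learn provides for this. The variable name `ratios` is only illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# Same synthetic dataset as in the cells above.
X, y = make_classification(n_samples=1500,
                           n_features=6,
                           n_informative=2,
                           random_state=42,
                           n_classes=3,
                           n_clusters_per_class=1)

pca = PCA(n_components=2).fit(X)

# Fraction of total variance captured by each retained component.
ratios = pca.explained_variance_ratio_
print("Variance explained per component:", ratios)
print("Total variance explained:", ratios.sum())
```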

In [None]:
# Fit t-SNE and embed the data in 2D
from sklearn.manifold import TSNE
tsne = TSNE(n_components=2, random_state=42)
X_tsne = tsne.fit_transform(X)
# KL divergence reached at the end of the optimization
tsne.kl_divergence_

In [None]:
fig = px.scatter(x=X_tsne[:, 0], y=X_tsne[:, 1], color=y)
fig.update_layout(
    title="t-SNE",
    xaxis_title="First t-SNE component",
    yaxis_title="Second t-SNE component"
)
fig.show()
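
The fit above uses scikit-learn's default `perplexity` of 30. As a hedged sketch of how one might explore that setting, the loop below refits t-SNE at a few perplexity values and records the final KL divergence for each. It uses a smaller 300-sample dataset (an assumption made only to keep the run short), and the names `X_small` and `kl_by_perplexity` are illustrative. The KL values at different perplexities are measured against different target distributions, so treat them as a loose diagnostic and compare the plots as well.

```python
from sklearn.datasets import make_classification
from sklearn.manifold import TSNE

# Smaller dataset than above so that three fits stay quick.
X_small, y_small = make_classification(n_samples=300,
                                       n_features=6,
                                       n_informative=2,
                                       random_state=42,
                                       n_classes=3,
                                       n_clusters_per_class=1)

kl_by_perplexity = {}
for perplexity in (5, 30, 50):
    model = TSNE(n_components=2, perplexity=perplexity, random_state=42)
    embedding = model.fit_transform(X_small)
    # kl_divergence_ is set once the optimization has finished.
    kl_by_perplexity[perplexity] = float(model.kl_divergence_)
    print(f"perplexity={perplexity:>2}  KL divergence={kl_by_perplexity[perplexity]:.3f}")
```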