
# t-Distributed Stochastic Neighbor Embedding (t-SNE): The Role of Perplexity

This notebook gives an overview of t-Distributed Stochastic Neighbor Embedding (t-SNE), focusing on the role of perplexity, the mathematical foundation of the method, and a basic implementation on the Iris dataset.



## Background

### t-Distributed Stochastic Neighbor Embedding (t-SNE)

t-SNE is a nonlinear dimensionality reduction technique used primarily to visualize high-dimensional datasets. It converts pairwise similarities between data points into joint probabilities, then positions the points in a low-dimensional embedding so as to minimize the Kullback-Leibler divergence between the joint distribution of the high-dimensional data and that of the embedding.

### Perplexity in t-SNE

Perplexity is a key hyperparameter in t-SNE that can be interpreted as the effective number of neighbors each point considers. It balances attention between local and global aspects of the data: a low perplexity emphasizes local structure, while a high perplexity captures more of the global structure. Typical values lie between 5 and 50.
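
To make the "effective number of neighbors" reading concrete, the following minimal sketch computes perplexity as 2 raised to the Shannon entropy (in bits) of a distribution, the definition given in the Mathematical Foundation section. The helper name `perplexity` and the two example distributions are illustrative assumptions, not part of any library.

```python
import math

def perplexity(probs):
    """Return 2**H(p), where H is the Shannon entropy of probs in bits."""
    entropy_bits = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy_bits

# Hypothetical neighbor distributions for a single point.
uniform_over_10 = [0.1] * 10       # ten equally likely neighbors
peaked = [0.91] + [0.01] * 9       # one dominant neighbor

print(perplexity(uniform_over_10))  # approximately 10
print(perplexity(peaked))           # much smaller than 10
```

A uniform distribution over k neighbors has perplexity k, while a distribution dominated by a single neighbor has a perplexity close to 1, even though it still assigns nonzero probability to ten points.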

### Applications of t-SNE

t-SNE is widely used for visualizing high-dimensional data in fields like bioinformatics, speech processing, and natural language processing. It is particularly effective for visualizing clusters and understanding the structure of the data.



## Mathematical Foundation

### t-SNE Algorithm

t-SNE involves the following steps:

1. **Compute Pairwise Similarities**:
   - In the high-dimensional space, the similarity between two points \( x_i \) and \( x_j \) is computed using a Gaussian distribution:

\[
P_{j|i} = \frac{\exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma_i^2}\right)}{\sum_{k \neq i} \exp\left(-\frac{\|x_i - x_k\|^2}{2\sigma_i^2}\right)}
\]

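   - The bandwidth \( \sigma_i \) (the standard deviation of the Gaussian centered on \( x_i \)) is chosen separately for each point so that the conditional distribution \( P_i \) has the perplexity specified by the user, where the perplexity \( \text{Perp}(P_i) \) is defined as: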

\[
\text{Perp}(P_i) = 2^{-\sum_j P_{j|i} \log_2 P_{j|i}}
\]

2. **Define Joint Probabilities**:
   - The conditional probabilities are symmetrized into a joint probability distribution over pairs, where \( n \) is the number of data points:

\[
P_{ij} = \frac{P_{j|i} + P_{i|j}}{2n}
\]

3. **Low-Dimensional Mapping**:
   - In the low-dimensional space, the similarity between two points \( y_i \) and \( y_j \) is computed using a Student-t distribution with one degree of freedom, whose heavy tails let dissimilar points sit far apart in the embedding:

\[
Q_{ij} = \frac{(1 + \|y_i - y_j\|^2)^{-1}}{\sum_{k \neq l} (1 + \|y_k - y_l\|^2)^{-1}}
\]

4. **Minimize Kullback-Leibler Divergence**:
   - The embedding coordinates \( y_i \) are optimized by gradient descent to minimize the divergence between the joint probabilities \( P_{ij} \) and \( Q_{ij} \):

\[
\text{KL}(P \| Q) = \sum_{i \neq j} P_{ij} \log \frac{P_{ij}}{Q_{ij}}
\]
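
The four steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a full t-SNE implementation: it uses one fixed bandwidth per point instead of calibrating \( \sigma_i \) to a target perplexity, it evaluates the objective for a random embedding rather than optimizing it, and all function names are hypothetical.

```python
import numpy as np

def squared_distances(Z):
    """Pairwise squared Euclidean distances between the rows of Z."""
    sq_norms = np.sum(Z ** 2, axis=1)
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

def conditional_p(X, sigma):
    """Step 1: Gaussian conditionals; row i holds P_{j|i}, with P_{i|i} = 0."""
    logits = -squared_distances(X) / (2.0 * sigma[:, None] ** 2)
    np.fill_diagonal(logits, -np.inf)            # exclude k = i from the sum
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

def joint_p(P_cond):
    """Step 2: symmetrize, P_ij = (P_{j|i} + P_{i|j}) / (2n)."""
    n = P_cond.shape[0]
    return (P_cond + P_cond.T) / (2.0 * n)

def joint_q(Y):
    """Step 3: Student-t similarities, normalized over all pairs k != l."""
    W = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

def kl_divergence(P, Q, eps=1e-12):
    """Step 4: KL(P || Q), summed over pairs with P_ij > 0."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))   # toy high-dimensional data
Y = rng.normal(size=(20, 2))   # random, unoptimized embedding
sigma = np.ones(20)            # assumption: fixed bandwidth for every point

P = joint_p(conditional_p(X, sigma))
Q = joint_q(Y)
print("sum(P) =", P.sum(), " sum(Q) =", Q.sum())
print("KL(P || Q) =", kl_divergence(P, Q))
```

Both \( P \) and \( Q \) sum to 1 over all pairs, so the printed divergence is non-negative and reaches zero only when the two distributions coincide.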

### Effect of Perplexity

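Perplexity sets the bandwidth \( \sigma_i \) of the Gaussian kernel used to compute pairwise similarities: each \( \sigma_i \) is chosen so that the conditional distribution around \( x_i \) has the requested perplexity. A small perplexity yields narrow kernels that emphasize local relationships and tends to produce many small, tight clusters, while a large perplexity yields wider kernels that capture broader, more global structure.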
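
The link between perplexity and bandwidth can be illustrated with a small search for \( \sigma_i \). The sketch below assumes a bisection on a logarithmic scale, which works because the perplexity of the Gaussian conditional distribution increases monotonically with \( \sigma_i \). The helper names, search bounds, and toy data are assumptions made for illustration and do not reproduce any particular library's internals.

```python
import numpy as np

def row_perplexity(sq_dists, sigma):
    """Perplexity of the Gaussian conditional distribution of one point."""
    logits = -sq_dists / (2.0 * sigma ** 2)
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    p /= p.sum()
    p = p[p > 0]                    # avoid log2(0)
    entropy_bits = -np.sum(p * np.log2(p))
    return 2.0 ** entropy_bits

def find_sigma(sq_dists, target, lo=1e-4, hi=1e4, n_iter=100):
    """Bisection on a log scale for the sigma that matches the target."""
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)      # geometric midpoint
        if row_perplexity(sq_dists, mid) < target:
            lo = mid                # kernel too narrow: widen it
        else:
            hi = mid                # kernel too wide: narrow it
    return np.sqrt(lo * hi)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))       # toy data, 200 points
# Squared distances from point 0 to every other point.
sq_dists = np.sum((X[1:] - X[0]) ** 2, axis=1)

for target in (5, 30, 50):
    sigma = find_sigma(sq_dists, target)
    achieved = row_perplexity(sq_dists, sigma)
    print(f"target={target:>2}  sigma={sigma:.3f}  achieved={achieved:.3f}")
```

On this toy data the bandwidth found for the same point grows as the target perplexity grows, which is the mechanism by which a larger perplexity makes each point attend to more distant neighbors.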



## Implementation in Python

We'll implement t-SNE using Scikit-Learn on the Iris dataset and explore the effects of different perplexity values.


```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Apply t-SNE with different perplexity values
perplexities = [5, 30, 50]

plt.figure(figsize=(18, 5))
for i, perplexity in enumerate(perplexities):
    # Fit a separate 2-D embedding for each perplexity value
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=42)
    X_tsne = tsne.fit_transform(X)

    # One panel per perplexity, colored by species label
    plt.subplot(1, 3, i + 1)
    plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=y, cmap='viridis', edgecolor='k')
    plt.title(f"t-SNE with Perplexity={perplexity}")
    plt.xlabel("Component 1")
    plt.ylabel("Component 2")

plt.tight_layout()
plt.show()
```
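
Visual inspection can be complemented with a numerical check. The sketch below is one possible approach rather than a standard recipe: for each perplexity it reports Scikit-Learn's `trustworthiness` score, which measures how well local neighborhoods are preserved, together with the final KL divergence stored in the fitted model's `kl_divergence_` attribute. The choice of `n_neighbors=10` is an arbitrary assumption.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE, trustworthiness

X = load_iris().data

scores = {}
for perplexity in (5, 30, 50):
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=42)
    X_tsne = tsne.fit_transform(X)
    scores[perplexity] = (
        trustworthiness(X, X_tsne, n_neighbors=10),  # value in [0, 1]
        tsne.kl_divergence_,                         # final objective value
    )

for perplexity, (trust, kl) in scores.items():
    print(f"perplexity={perplexity:>2}  trustworthiness={trust:.3f}  KL={kl:.3f}")
```

The KL values should be read with care: changing the perplexity changes \( P \) itself, so the divergences of different runs are not directly comparable.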



## Conclusion

This notebook provided an overview of t-Distributed Stochastic Neighbor Embedding (t-SNE), focusing on the role of perplexity. We implemented t-SNE using Scikit-Learn on the Iris dataset, exploring how different perplexity values affect the visualization. Perplexity is a crucial hyperparameter that balances local and global structure in the data, and choosing an appropriate value is key to obtaining meaningful visualizations.
