## Dimension Reduction [4] : UMAP

![](https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png)


- [🎛 Dimension Reduction [1] : PCA](https://www.kaggle.com/subinium/dimension-reduction-1-pca)
- [🎛 Dimension Reduction [2] : LDA](https://www.kaggle.com/subinium/dimension-reduction-2-lda)
- [🎛 Dimension Reduction [3] : T-SNE](https://www.kaggle.com/subinium/dimension-reduction-3-t-sne)


UMAP is one of the more recent dimension reduction methods (the paper appeared in 2018). It has a solid theoretical foundation and is very useful in practice.

## Import Libraries & Default Settings

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib as mpl
import matplotlib.pyplot as plt

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


In [None]:
# matplotlib configuration
plt.rcParams['image.cmap'] = 'gray'
# Color from R ggplot colormap
color = ['#6388b4', '#ffae34', '#ef6f6a', '#8cc2ca', '#55ad89', '#c3bc3f', '#bb7693', '#baa094', '#a9b5ae', '#767676']

In [None]:
mnist = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
label = mnist['label']
mnist.drop(['label'], inplace=True, axis=1)
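
After the label column is dropped, each remaining row holds 784 pixel values, one flattened 28x28 digit. As a minimal sketch, a row can be reshaped back into an image for inspection. The helper name `row_to_image` and the synthetic row are hypothetical, introduced only for illustration.

```python
import numpy as np

def row_to_image(row, side=28):
    """Reshape one flat pixel row (side*side values) into a 2-D image array."""
    pixels = np.asarray(row, dtype=np.uint8)
    if pixels.size != side * side:
        raise ValueError(f"expected {side * side} pixels, got {pixels.size}")
    # Row-major reshape: pixel x sits at row x // side, column x % side
    return pixels.reshape(side, side)

# Synthetic stand-in for one row of the pixel table (values 0-255)
fake_row = np.arange(784) % 256
image = row_to_image(fake_row)
print(image.shape)
```

In the notebook, `plt.imshow(row_to_image(mnist.iloc[0]))` should display the first digit, using the gray colormap configured earlier.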

## UMAP & Result

- [UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction](https://arxiv.org/abs/1802.03426)

Depending on the data, UMAP often runs faster than t-SNE and tends to separate clusters clearly in the resulting embedding.

**UMAP** stands for Uniform Manifold Approximation and Projection.

In [None]:
%%time
from umap import UMAP

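reducer = UMAP(random_state=0)  # fixed seed for a reproducible embedding
# Passing `label` as y makes this a supervised embedding; omit it for unsupervised UMAP
mnist_umap = reducer.fit_transform(mnist, label)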

Let's visualize the two-dimensional embedding, with one color per digit label.

In [None]:
import plotly.graph_objects as go

fig = go.Figure()

for idx in range(10):
    fig.add_trace(go.Scatter(
        x = mnist_umap[:,0][label==idx],
        y = mnist_umap[:,1][label==idx],
        name=str(idx),
        opacity=0.6,
        mode='markers',
        marker=dict(color=color[idx])
    ))

fig.update_layout(
    width = 800,
    height = 800,
    title = "UMAP result",
    yaxis = dict(
      scaleanchor = "x",
      scaleratio = 1
    ),
    legend=dict(
        orientation="h",
        yanchor="bottom",
        y=1.02,
        xanchor="right",
        x=1
    )
)


fig.show()