# Dimensionality Reduction

The number of input variables, or features, in a dataset is referred to as its dimensionality. Dimensionality reduction refers to techniques that reduce the number of input variables in a dataset. More input features often make a predictive modeling task more challenging, a problem generally referred to as the curse of dimensionality.

High-dimensionality statistics and dimensionality reduction techniques are often used for data visualization. Nevertheless, these techniques can also be used in applied machine learning to simplify a classification dataset in order to better fit a predictive model.

High dimensionality might mean hundreds, thousands, or even millions of input variables.

Fewer input dimensions often mean correspondingly fewer parameters or a simpler structure in the machine learning model, referred to as degrees of freedom. A model with too many degrees of freedom is likely to overfit the training dataset and therefore may not perform well on new data.

It is desirable to have simple models that generalize well and, in turn, input data with few input variables. This is particularly true for linear models, where the number of inputs and the degrees of freedom of the model are often closely related.
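The link between inputs and degrees of freedom in a linear model can be illustrated with a small sketch. It assumes scikit-learn's `LinearRegression` and purely synthetic random data; the helper `count_linear_parameters` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)  # fixed seed; synthetic data for illustration only


def count_linear_parameters(n_features, n_samples=50):
    """Fit a linear regression on random data and count its learned parameters."""
    X = rng.normal(size=(n_samples, n_features))
    y = rng.normal(size=n_samples)
    model = LinearRegression().fit(X, y)
    # One coefficient per input variable, plus a single intercept.
    return model.coef_.size + 1


param_counts = {p: count_linear_parameters(p) for p in (2, 10, 40)}
print(param_counts)
```

Each additional input variable adds one more coefficient for the model to estimate, so removing inputs directly removes parameters.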

## Techniques for Dimensionality Reduction

### Feature Selection Methods

Perhaps the most common are so-called feature selection techniques, which use scoring or statistical methods to select which features to keep and which to delete.
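As one possible illustration of score-based feature selection, the sketch below uses scikit-learn's `SelectKBest` with the ANOVA F-test (`f_classif`) on the iris dataset. The choice of scoring function and of `k=2` is an assumption made for the example, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Score each feature with an ANOVA F-test against the class label
# and keep the two highest-scoring features (k=2 is an arbitrary choice).
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(selector.scores_)        # one score per original feature
print(selector.get_support())  # boolean mask of the features that were kept
print(X_selected.shape)
```

Unlike the projection methods described in the following sections, feature selection keeps a subset of the original columns unchanged, so the retained features stay directly interpretable.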

### Matrix Factorization

Techniques from linear algebra can be used for dimensionality reduction. Specifically, matrix factorization methods can be used to reduce a dataset matrix into its constituent parts. Examples include the eigendecomposition and the singular value decomposition (SVD).
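A minimal sketch of how a factorization can reduce a dataset matrix, assuming NumPy's `np.linalg.svd` and the iris data. Keeping `k = 2` singular directions is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
X_centered = X - X.mean(axis=0)   # center each column before factorizing

# Thin SVD: X_centered = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

k = 2                              # number of dimensions to keep (illustrative)
X_reduced = U[:, :k] * s[:k]       # coordinates along the top-k singular directions
X_approx = X_reduced @ Vt[:k, :]   # rank-k approximation of the centered data

print(X_reduced.shape)
print(np.linalg.norm(X_centered - X_approx))  # reconstruction error (Frobenius norm)
```

The reconstruction error depends only on the discarded singular values, which is why dropping directions with small singular values loses little information.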

### Manifold Learning

Manifold learning techniques create a low-dimensional projection of high-dimensional data, often for the purposes of data visualization.

The projection is designed to produce a low-dimensional representation of the dataset while preserving, as far as possible, the salient structure or relationships in the data.

Examples of manifold learning techniques include:

* Kohonen Self-Organizing Map (SOM)
* Sammon's Mapping
* Multidimensional Scaling (MDS)
* t-distributed Stochastic Neighbor Embedding (t-SNE)
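To make one of the listed techniques concrete, the sketch below applies multidimensional scaling to the iris data using scikit-learn's `sklearn.manifold.MDS`. The target of two dimensions and the fixed `random_state` are assumptions chosen for a reproducible 2-D view.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import MDS

X = load_iris().data

# Project the 4-D measurements to 2-D while trying to preserve
# the pairwise distances between samples.
mds = MDS(n_components=2, random_state=0)
X_2d = mds.fit_transform(X)

print(X_2d.shape)
```

The resulting two columns can be passed straight to a scatter plot, colored by class label, to inspect the structure of the data visually.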

### Autoencoder Methods

Deep learning neural networks can be constructed to perform dimensionality reduction. A popular approach is the autoencoder. This involves framing a self-supervised learning problem in which a model must reproduce its input correctly.
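The idea can be sketched without a deep learning framework. The example below is a deliberately small stand-in, assuming scikit-learn's `MLPRegressor` trained with the input as its own target and a two-unit hidden layer acting as the bottleneck. A practical autoencoder would typically be deeper and built with a dedicated library; the layer size, activation, and iteration limit here are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(load_iris().data)

# The network is trained to reproduce its own input (target == input).
# The single hidden layer has 2 units, which forces a 2-D bottleneck.
autoencoder = MLPRegressor(hidden_layer_sizes=(2,), activation="tanh",
                           max_iter=5000, random_state=0)
autoencoder.fit(X, X)

# Encoder half: apply the first layer's weights and activation by hand.
codes = np.tanh(X @ autoencoder.coefs_[0] + autoencoder.intercepts_[0])
reconstruction = autoencoder.predict(X)

print(codes.shape)
print(np.mean((X - reconstruction) ** 2))  # mean squared reconstruction error
```

The `codes` array is the reduced representation; the second half of the network only exists to provide a training signal.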

## Principal Component Analysis (PCA)

The idea of principal component analysis is to reduce the dimensionality of a dataset consisting of a large number of related variables while retaining as much of the variance in the data as possible. PCA finds a set of new variables, each of which is a linear combination of the original variables. The new variables are called *principal components (PCs)*. These principal components are *orthogonal*: in a 3-D case, the principal components are perpendicular to each other, so no component can be represented in terms of another (X cannot be represented by Y, and Y cannot be represented by Z).
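The two properties described above, linear combination and orthogonality, can be checked numerically. The sketch below computes principal components by hand from the eigendecomposition of the covariance matrix, assuming NumPy and the iris data. It is meant as an illustration of the idea, not as a description of how any particular library implements PCA.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
X_centered = X - X.mean(axis=0)

# Eigendecomposition of the covariance matrix of the features.
cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order; sort by descending variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
components = eigenvectors[:, order].T   # one principal direction per row

# Each principal component is a linear combination of the original variables.
scores = X_centered @ components.T

# Orthogonality: the directions are mutually perpendicular unit vectors.
is_orthonormal = np.allclose(components @ components.T, np.eye(4))
# The resulting components are uncorrelated: their covariance is diagonal.
is_uncorrelated = np.allclose(np.cov(scores, rowvar=False), np.diag(eigenvalues))

print(is_orthonormal, is_uncorrelated)
```

The eigenvalues give the variance along each principal direction, which is what determines how many components are worth keeping.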

In [12]:
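import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Load the iris dataset: 150 samples, 4 numeric features, 3 classes.
iris = load_iris()
X = iris.data    # feature matrix, shape (150, 4)
y = iris.target  # integer class labels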

In [13]:
X[:5, :]

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])

In [14]:
iris.feature_names

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

In [25]:
from sklearn.decomposition import PCA

# With no n_components given, PCA keeps all components (4 for this dataset).
pca = PCA()
x_pca = pca.fit_transform(X)

In [26]:
x_pca[:5, :]

array([[-2.68412563e+00,  3.19397247e-01, -2.79148276e-02,
        -2.26243707e-03],
       [-2.71414169e+00, -1.77001225e-01, -2.10464272e-01,
        -9.90265503e-02],
       [-2.88899057e+00, -1.44949426e-01,  1.79002563e-02,
        -1.99683897e-02],
       [-2.74534286e+00, -3.18298979e-01,  3.15593736e-02,
         7.55758166e-02],
       [-2.72871654e+00,  3.26754513e-01,  9.00792406e-02,
         6.12585926e-02]])

In [27]:
x_pca.shape

(150, 4)
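Keeping all four components does not reduce anything yet, so it helps to see how much variance each component carries. The sketch below reads the `explained_variance_ratio_` attribute of a fitted scikit-learn `PCA`; the variable names `pca_full` and `n_needed` and the 95% threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca_full = PCA().fit(X)

# Fraction of the total variance captured by each component, largest first.
ratios = pca_full.explained_variance_ratio_
cumulative = np.cumsum(ratios)
print(ratios)
print(cumulative)

# Smallest number of components whose cumulative share reaches 95%.
n_needed = int(np.searchsorted(cumulative, 0.95) + 1)
print(n_needed)
```

For this dataset the first component alone falls short of 95%, while the first two together exceed it.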

In [28]:
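# A float in (0, 1) keeps the fewest components whose cumulative
# explained variance reaches that fraction (here 95%).
pca = PCA(n_components=0.95)
x_pca = pca.fit_transform(X)
x_pca[:5, :]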

array([[-2.68412563,  0.31939725],
       [-2.71414169, -0.17700123],
       [-2.88899057, -0.14494943],
       [-2.74534286, -0.31829898],
       [-2.72871654,  0.32675451]])

In [29]:
x_pca.shape

(150, 2)
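To gauge what was lost in going from four columns to two, the reduced data can be mapped back to the original feature space. The sketch below uses the `n_components_` attribute and the `inverse_transform` method of scikit-learn's `PCA`; measuring the loss as a mean squared error is an assumption made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=0.95)
x_pca = pca.fit_transform(X)

print(pca.n_components_)                    # number of components actually kept
print(pca.explained_variance_ratio_.sum())  # share of variance they retain

# Map the reduced data back to the original 4-D feature space.
X_restored = pca.inverse_transform(x_pca)
mse = np.mean((X - X_restored) ** 2)
print(mse)
```

A small reconstruction error indicates that the two retained components summarize the four original measurements well.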