# PCA Transformation

### Description

Scikit-learn ships with several preloaded datasets. Load the digits dataset from that collection (http://scikit-learn.org/stable/auto_examples/datasets/plot_digits_last_image.html). Using Scikit-learn, perform a PCA transformation such that the transformed dataset explains 95% of the variance in the original dataset. Then find the number of components in the projected subspace.

### Objective
- Understand and practice Principal Component Analysis using Scikit-learn.

In [1]:
# Import Required Libraries
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

In [2]:
digits = load_digits()

In [3]:
X = digits.data
X

array([[ 0.,  0.,  5., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ..., 10.,  0.,  0.],
       [ 0.,  0.,  0., ..., 16.,  9.,  0.],
       ...,
       [ 0.,  0.,  1., ...,  6.,  0.,  0.],
       [ 0.,  0.,  2., ..., 12.,  0.,  0.],
       [ 0.,  0., 10., ..., 12.,  1.,  0.]])

In [4]:
Y = digits.target
Y

array([0, 1, 2, ..., 8, 9, 8])

In [5]:
X.shape

(1797, 64)

In [6]:
Y.shape

(1797,)

In [7]:
# Hold out 25% of the samples for testing; random_state makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=1)

In [8]:
x_train.shape

(1347, 64)

In [9]:
x_test.shape

(450, 64)

In [10]:
y_train.shape

(1347,)

In [11]:
y_test.shape

(450,)

In [12]:
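# Import PCA. A float n_components between 0 and 1 keeps just enough
# components to explain more than that fraction of the variance.
from sklearn.decomposition import PCA
pca = PCA(n_components=0.95)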

In [13]:
pca.fit(x_train)

PCA(copy=True, iterated_power='auto', n_components=0.95, random_state=None,
    svd_solver='auto', tol=0.0, whiten=False)
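
To make the effect of `n_components=0.95` concrete, the sketch below repeats the split from the cells above, fits a second PCA that keeps every component, and counts how many are needed before the cumulative explained variance ratio exceeds 0.95. The names `pca_full`, `cumulative`, `k_manual`, and `pca_95` are illustrative and not part of the original notebook. The manual count is assumed to mirror how Scikit-learn applies the threshold, so treat it as a cross-check rather than a description of the library internals.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Same data and split as in the cells above
X, Y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1
)

# Fit PCA with all components to get the full variance spectrum
pca_full = PCA().fit(x_train)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio exceeds 0.95
k_manual = int(np.searchsorted(cumulative, 0.95, side="right") + 1)

# Let PCA choose the count itself from the 0.95 threshold
pca_95 = PCA(n_components=0.95).fit(x_train)
print(k_manual, pca_95.n_components_)
```

If the two printed numbers agree, the float threshold is doing what the task asks: keeping the fewest leading components that together explain more than 95% of the training variance.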

In [14]:
x_train_trans = pca.transform(x_train)

In [15]:
x_train_trans

array([[ 10.97303259, -12.77131381,  12.52565473, ...,   3.37027073,
         -3.57474987,   0.38524292],
       [-20.87235854,   7.76977072,  15.03312611, ...,   2.39090556,
          0.92470007,  -0.26898484],
       [-17.61014644,   2.80780295,   9.16898397, ...,  -0.70927955,
         -0.08726772,   2.5498659 ],
       ...,
       [ 11.89301767,   8.66061583,  -2.68626102, ...,   3.54488269,
         -1.62748985,  -2.21827527],
       [-10.15055892,  -6.16764611, -20.5678722 , ...,  -1.24440862,
         -1.40288761,   2.8371074 ],
       [ -4.50534411,  -4.22914718,  -3.85097457, ...,  -0.84308395,
         -3.60808782,   1.56248151]])

In [16]:
x_train_trans.shape

(1347, 28)
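
The second dimension of the transformed array is the number of components kept. The same number can also be read from the fitted model, together with the fraction of variance those components actually explain. The block below is a minimal, self-contained sketch that repeats the steps above; `n_kept` and `explained` are illustrative names not used in the original notebook.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Same data and split as in the cells above
X, Y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1
)

pca = PCA(n_components=0.95).fit(x_train)
x_train_trans = pca.transform(x_train)

# Number of components chosen by the 0.95 threshold
n_kept = pca.n_components_

# Fraction of the training variance explained by the kept components
explained = pca.explained_variance_ratio_.sum()

print(f"components kept: {n_kept}")
print(f"variance explained: {explained:.4f}")
```

Here `n_kept` should match `x_train_trans.shape[1]`, and `explained` should be slightly above 0.95, since components are added one at a time until the threshold is passed.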

In [17]:
x_test_trans = pca.transform(x_test)

In [18]:
x_test_trans.shape

(450, 28)
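
The test set ends up with the same number of columns because `pca.transform` reuses the mean and component axes learned from `x_train`; nothing is refitted on the test data. As a sketch, and assuming the default `whiten=False` used above, the projection can be reproduced by hand from the fitted attributes `mean_` and `components_`. The name `x_test_manual` is illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Same data and split as in the cells above
X, Y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1
)

pca = PCA(n_components=0.95).fit(x_train)
x_test_trans = pca.transform(x_test)

# Manual projection: center with the training mean, then project
# onto the principal axes learned from the training set
x_test_manual = (x_test - pca.mean_) @ pca.components_.T

# Compare the manual projection with pca.transform
print(np.allclose(x_test_trans, x_test_manual))
```

Fitting on the training split only, then transforming both splits with the same model, keeps information from the test set out of the learned subspace.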
