## Principal Component Analysis for Dimensionality Reduction

Principal Component Analysis (PCA) is a popular statistical tool for reducing the dimension of a data set to a specified smaller dimension, compressing data, visualizing data (usually in 2D or 3D), and constructing new features from the original data. The goal of PCA is to work with a low-dimensional data set that captures a large proportion of the variation found in the original high-dimensional data. As the name hints, PCA finds the principal components of the data, the directions that explain most of the variation, and hence allows us to drop components that are redundant (for example, height measured in cm, metres, inches, and feet carries the same information four times).

Note that PCA does not select some important features and drop the irrelevant ones. Rather, it constructs new features from the given ones so that they capture the most variation in the data. For example, suppose you have test scores in math, science, and reading for a group of students; PCA may summarize them with a single new feature such as (math+science+reading)/3. When comparing students, characteristics that barely vary across them are not useful. PCA linearly combines the given characteristics so that the resulting summary feature varies across students as much as possible.
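The height example above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the heights are randomly generated placeholder values, and PCA is computed directly from a singular value decomposition of the standardized columns. Because all four columns encode the same quantity, the first principal component should account for essentially all of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 heights in cm, repeated in three other units.
height_cm = rng.normal(loc=165.0, scale=8.0, size=200)
X = np.column_stack([
    height_cm,          # centimetres
    height_cm / 100.0,  # metres
    height_cm / 2.54,   # inches
    height_cm / 30.48,  # feet
])

# Standardize each column so the unit of measurement does not matter.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD of the standardized data: the rows of Vt are the principal
# directions, and the squared singular values are proportional to the
# variance captured along each direction.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained_ratio = s**2 / np.sum(s**2)

print(np.round(explained_ratio, 4))  # share of variance per component
print(np.round(np.abs(Vt[0]), 3))    # weights of the first component
```

The first component weights the four columns equally (up to sign), so three of the four measurements can be dropped without losing information.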

PCA is considered an unsupervised learning algorithm because it does not require a labeled data set. That is, applying PCA does not require an outcome variable $y$ that assigns the observations to categories. In machine learning, PCA is often used to reduce the dimensionality of the data so that classification algorithms run faster on the smaller data set, ideally without sacrificing much classification accuracy.

### Problem and Data Description 

We will use the Pima Indian Diabetes data set, which contains several characteristics of women at least 21 years old of Pima Indian heritage and whether they tested positive for diabetes. In particular, for each woman we know the number of pregnancies, glucose level, blood pressure, skin thickness, insulin level, body mass index (BMI), age, and a diabetes pedigree index, which is a measure of the likelihood of developing diabetes based on their ancestors' history. This data is $8$-dimensional (it has 8 features). We will use PCA to visualize it in 2D and 3D, and examine whether its dimensionality can be reduced while keeping the vast majority of the original variation.
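As a rough preview of the plan just described, the sketch below projects an 8-feature matrix onto its top 2 and top 3 principal components and reports the cumulative share of variance retained. Everything here is an assumption made for illustration: the matrix `X` is random stand-in data (not the Pima data, and the row count is arbitrary), `pca_project` is a hypothetical helper, and standardizing the features first is one common choice rather than a requirement.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the real data: random, correlated columns. NOT the Pima data;
# the number of rows is arbitrary, only the 8 features match the description.
n_samples, n_features = 500, 8
X = rng.normal(size=(n_samples, n_features)) @ rng.normal(size=(n_features, n_features))

def pca_project(X, k):
    """Standardize X, then project it onto its top-k principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()             # share of variance per component
    return Z @ eigvecs[:, :k], ratio

X_2d, ratio = pca_project(X, 2)   # coordinates for a 2D scatter plot
X_3d, _ = pca_project(X, 3)       # coordinates for a 3D scatter plot

print(X_2d.shape, X_3d.shape)
print(np.round(np.cumsum(ratio), 3))  # cumulative variance retained
```

The cumulative ratios indicate how many components are needed to keep a chosen share of the variation, for instance the smallest `k` whose cumulative ratio exceeds 0.9.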

##### Importing Libraries 

```python
# For matrix computation
import numpy as np
# For data manipulation
import pandas as pd
# For 2D plotting
from matplotlib import pyplot as plt
# For 3D plotting
from mpl_toolkits.mplot3d import Axes3D
```