# Kernel Principal Component Analysis
Linear PCA works well when the sample clusters are linearly separable, that is, when you could draw a straight line between the clusters and separate them almost perfectly. However, some data sets don't have linearly separable clusters. In that case you can use the kernel trick: the data is implicitly mapped into a higher-dimensional feature space where the clusters become linearly separable, and the principal components extracted there give a new, lower-dimensional representation of the data.
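
To make the kernel trick concrete, here is a minimal NumPy sketch of kernel PCA with an RBF kernel. It is an illustration only, not the scikit-learn implementation: the function name `rbf_kernel_pca`, the value `gamma=0.5`, and the synthetic two-ring data are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel_pca(X, gamma, n_components):
    """Project the samples in X onto the leading RBF kernel principal components."""
    # Pairwise squared Euclidean distances between all samples.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2).
    K = np.exp(-gamma * sq_dists)
    # Center the kernel matrix, which centers the data in the implicit feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order, so reverse to put the largest first.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    eigvals = eigvals[::-1][:n_components]
    eigvecs = eigvecs[:, ::-1][:, :n_components]
    # Projected training samples: each eigenvector scaled by sqrt of its eigenvalue.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Hypothetical data: two concentric rings, which no straight line can separate.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
radii = np.repeat([1.0, 3.0], 100)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

Z = rbf_kernel_pca(X, gamma=0.5, n_components=2)
print(Z.shape)
```

Note that the mapping into the higher-dimensional space is never computed explicitly; only the kernel matrix of pairwise similarities is needed. Plotting the columns of `Z` colored by ring is one way to check how well a given `gamma` separates the clusters.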

### Linear PCA
<img src="images/k_pca/linear_pca.png" height="75%" width="75%"></img>
- x-axis is PC1
- y-axis is PC2

The diagram above shows a logistic regression model predicting either 0 (red) or 1 (green), trained on the data set after it was reduced with linear PCA.

A straight line separates the regions classified as 0 (red) and 1 (green). Linear PCA is a linear transformation of the data set and logistic regression is a linear classifier, so the resulting separator is a straight line.

### Kernel PCA
<img src="images/k_pca/linear_pca.png" height="75%" width="75%"></img>
- x-axis is PC1
- y-axis is PC2

The diagram above shows a logistic regression model predicting either 0 (red) or 1 (green), trained on the data set after it was reduced with kernel PCA.

A non-linear separator divides the regions classified as 0 (red) and 1 (green). This is because kernel PCA uses a kernel to fit the data set non-linearly, so such a separation is created.

In [1]:
# import libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [2]:
# import the data set
ads_df = pd.read_csv("datasets/social_network_ads.csv")

ads_df.head()

|   | User ID | Gender | Age | EstimatedSalary | Purchased |
|---|---|---|---|---|---|
| 0 | 15624510 | Male | 19 | 19000 | 0 |
| 1 | 15810944 | Male | 35 | 20000 | 0 |
| 2 | 15668575 | Female | 26 | 43000 | 0 |
| 3 | 15603246 | Female | 27 | 57000 | 0 |
| 4 | 15804002 | Male | 19 | 76000 | 0 |


In [6]:
# select the independent variable(s), leaving out the dependent variable (the Purchased column)
# note: the slice 2:3 selects only the Age column; use 2:4 to also include EstimatedSalary
x = ads_df.iloc[:, 2:3].values

In [7]:
# import a standardization scaler for feature scaling
from sklearn.preprocessing import StandardScaler

# feature scale the independent variables (zero mean, unit variance)
sc_X = StandardScaler()
scaled_x = sc_X.fit_transform(x)



# Kernel PCA Model
`KernelPCA` has no attribute called "explained_variance_ratio_". The principal components are computed in the feature space induced by the kernel rather than in the original input space, so variance ratios computed from the retained components alone may not sum to 100% and could be misleading. For this reason, the attribute is not provided.
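
If a rough picture of the variance split is still wanted, one possible workaround is to keep every non-zero component and normalize the variance of each projected column. The sketch below assumes synthetic stand-in data (`X_demo`), not the ads data set, and the resulting ratios describe variance in the kernel feature space, not in the original features.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Synthetic stand-in for the scaled features (an assumption for this example).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 2))

# n_components=None keeps every component with a non-zero eigenvalue.
kpca_all = KernelPCA(n_components=None, kernel="rbf")
Z_all = kpca_all.fit_transform(X_demo)

# Variance of each projected column, normalized by the total over the kept components.
component_var = Z_all.var(axis=0)
variance_ratio = component_var / component_var.sum()

# Share of feature-space variance captured by the first two components.
print(variance_ratio[:2])
```

Because the ratios are normalized over the kept components, they sum to 1 by construction; with `n_components=2` the same calculation would only compare PC1 against PC2.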

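Another issue is that the "components_" attribute is not provided in Kernel PCA. The principal axes live in the kernel-induced feature space and are never computed explicitly, so we cannot graph each principal component in terms of the original features the way we can with linear PCA.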

We could train a classification model on the data set transformed by Kernel PCA to see how well a machine learning model works with the new independent variables (PC1 and PC2). That is left as a personal project, since this is not a classification lecture.
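
As a starting point for such a project, here is a hedged sketch that trains a logistic regression model on kernel PCA output. It uses scikit-learn's synthetic `make_circles` data rather than the ads data set, and `gamma=15` is an assumed value chosen for illustration, not a tuned one.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class data: one ring inside another (stand-in for real data).
X_demo, y_demo = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

# Fit kernel PCA on the training split only, then apply it to the test split.
kpca_demo = KernelPCA(n_components=2, kernel="rbf", gamma=15)
Z_train = kpca_demo.fit_transform(X_train)
Z_test = kpca_demo.transform(X_test)

# Train a linear classifier on the two kernel principal components.
clf = LogisticRegression()
clf.fit(Z_train, y_train)
accuracy = clf.score(Z_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Fitting the transformer on the training split alone keeps information from the test split out of the projection, which makes the reported accuracy a fairer estimate.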

In [8]:
# import the Kernel PCA model
from sklearn.decomposition import KernelPCA

In [13]:
# create a Kernel PCA with 2 principal components
kpca = KernelPCA(n_components=2, kernel="rbf")

# fit and transform the scaled independent variables
x_reduced = kpca.fit_transform(scaled_x)

"""
show the first 10 rows of the reduced data frame

Now there are only 2 independent variables (PC1 and PC2).
"""
x_reduced[:10]

array([[-0.44464514,  0.58755735],
       [-0.20047442, -0.47979431],
       [-0.68154194,  0.31804391],
       [-0.6777246 ,  0.23127813],
       [-0.44464514,  0.58755735],
       [-0.6777246 ,  0.23127813],
       [-0.6777246 ,  0.23127813],
       [-0.46188006, -0.26268603],
       [-0.67309095,  0.39433006],
       [-0.20047442, -0.47979431]])