Principal Component Analysis (PCA) is a dimensionality-reduction method for large data sets: it transforms a large set of variables into a smaller one that still contains most of the information in the original set.

Reducing the number of variables of a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity. 

Smaller data sets are easier to explore and visualize, and machine learning algorithms can analyze them faster when there are no extraneous variables to process.

Step 1: Standardization
The aim of this step is to standardize the range of the continuous initial variables so that each one of them contributes equally to the analysis.
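
As a minimal sketch of this step, assuming z-score standardization (subtract the mean, divide by the standard deviation) and a small made-up array `X` that is not part of the data set used in this notebook:

```python
import numpy as np

# Toy data: 5 samples, 2 features on very different scales (illustrative values only).
X = np.array([
    [1.0, 1000.0],
    [2.0, 1500.0],
    [3.0, 1200.0],
    [4.0, 1800.0],
    [5.0, 2000.0],
])

# z-score standardization: subtract each column's mean and divide by
# its (population) standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0)
Z = (X - mean) / std

print(Z.mean(axis=0))  # approximately 0 for every column
print(Z.std(axis=0))   # approximately 1 for every column
```

After this rescaling both columns have the same spread, so neither dominates the covariance computation just because of its units. Note that the code cells further down use `MinMaxScaler` instead, which rescales each feature to the range [0, 1] rather than to zero mean and unit variance.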

Step 2: Covariance Matrix computation
The aim of this step is to understand how the variables of the input data set vary from the mean with respect to each other, in other words, to see whether there is any relationship between them. Variables are sometimes so highly correlated that they contain redundant information, so to identify these correlations we compute the covariance matrix.

![image.png](attachment:image.png)

Sign of the covariance:

- Positive: the two variables increase or decrease together (correlated).
- Negative: one increases when the other decreases (inversely correlated).
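
The sign rule can be checked on a small example. This is only a sketch: the three arrays below are made-up values chosen so that one variable rises with `x` and the other falls.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([2.0, 4.0, 5.0, 4.0, 6.0])    # tends to rise as x rises
y_down = np.array([9.0, 7.0, 6.0, 4.0, 1.0])  # tends to fall as x rises

# np.cov treats each row as one variable and returns the symmetric
# matrix of variances (diagonal) and covariances (off-diagonal).
C = np.cov(np.vstack([x, y_up, y_down]))

print(C.shape)      # (3, 3)
print(C[0, 1] > 0)  # True: x and y_up move together
print(C[0, 2] < 0)  # True: x and y_down move in opposite directions
```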

Step 3: Compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components

The eigenvectors of the covariance matrix are the directions of the axes along which the data has the most variance (the most information); these are what we call the principal components. The eigenvalues are the coefficients attached to the eigenvectors, and each one gives the amount of variance carried by its principal component.

By ranking your eigenvectors in order of their eigenvalues, highest to lowest, you get the principal components in order of significance.
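
A minimal NumPy sketch of this step, assuming synthetic random data (the array `X` and the mixing matrix are invented for illustration) and using `np.linalg.eigh`, which is suited to symmetric matrices such as a covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 3 correlated features (illustrative only).
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Center the data and compute the covariance matrix (columns are variables).
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

# eigh returns eigenvalues in ascending order, so reorder from highest to lowest.
eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]  # column i is the i-th principal component

# Share of the total variance carried by each component.
explained = eigenvalues / eigenvalues.sum()
print(explained)
```

Dividing each eigenvalue by the sum of all eigenvalues gives the fraction of the total variance that the corresponding component carries, which is one way to decide how many components to keep.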

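Step 4: Project the data onto the principal components
TransformedData = RowAdjustData * FeatureVector

For this product to be defined, RowAdjustData holds the mean-adjusted (standardized) data with one sample per row, and FeatureVector holds the eigenvectors kept as principal components, one per column.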
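
The projection can be sketched in NumPy as follows. The variable names mirror the formula; the random data, the choice of `k = 2`, and the layout (one sample per row, one eigenvector per column) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 100 samples, 4 features (illustrative only).
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))

# RowAdjustData: the data with each column's mean removed.
row_adjust_data = X - X.mean(axis=0)

# Eigen-decomposition of the covariance matrix, sorted by decreasing eigenvalue.
C = np.cov(row_adjust_data, rowvar=False)
vals, vecs = np.linalg.eigh(C)
order = np.argsort(vals)[::-1]

# FeatureVector: the k leading eigenvectors, one per column.
k = 2
feature_vector = vecs[:, order[:k]]  # shape (4, 2)

# TransformedData = RowAdjustData * FeatureVector (a matrix product).
transformed_data = row_adjust_data @ feature_vector
print(transformed_data.shape)  # (100, 2)
```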


Principal components are new variables that are constructed as linear combinations or mixtures of the initial variables. 
These combinations are done in such a way that the new variables (i.e., principal components) are uncorrelated and most of the information within the initial variables is squeezed or compressed into the first components.
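
The claim that the new variables are uncorrelated can be checked numerically. In this sketch (synthetic data, hypothetical variable names) the covariance matrix of the projected data comes out diagonal, with the eigenvalues on the diagonal:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 300 samples, 3 correlated features (illustrative only).
mixing = np.array([[1.0, 0.8, 0.2],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 1.0]])
X = rng.normal(size=(300, 3)) @ mixing
Xc = X - X.mean(axis=0)

# Principal components: eigenvectors of the covariance matrix, largest eigenvalue first.
vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Each new variable is a linear combination of the original (centered) columns.
scores = Xc @ vecs

# Covariance of the new variables: off-diagonal entries vanish.
score_cov = np.cov(scores, rowvar=False)
off_diagonal = score_cov - np.diag(np.diag(score_cov))
print(np.allclose(off_diagonal, 0.0))         # True
print(np.allclose(np.diag(score_cov), vals))  # True
```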

Geometrically speaking, principal components represent the directions of the data that explain a maximal amount of variance, that is to say, the lines that capture most information of the data. 

The relationship between variance and information here is that the larger the variance carried by a line, the larger the dispersion of the data points along it, and the larger the dispersion along a line, the more information it carries.

![image.png](attachment:image.png)


In [44]:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Load the iris data set into a DataFrame
data = pd.read_csv("https://raw.githubusercontent.com/aspdiscovery123/ANZ-Bangalore/master/iris.csv")

In [46]:
from sklearn.decomposition import PCA
data.head()

   sepal_length  sepal_width  petal_length  petal_width  species
0           5.1          3.5           1.4          0.2   setosa
1           4.9          3.0           1.4          0.2   setosa
2           4.7          3.2           1.3          0.2   setosa
3           4.6          3.1           1.5          0.2   setosa
4           5.0          3.6           1.4          0.2   setosa


In [47]:
# Drop the label column; PCA is applied to the numeric features only
data = data.drop("species", axis=1)


In [48]:
# Rescale every feature to the [0, 1] range (min-max scaling, not z-score standardization)
sc = MinMaxScaler()
data = sc.fit_transform(data)

In [49]:
# fit_transform returns a NumPy array; wrap it in a DataFrame again (columns become 0..3)
data = pd.DataFrame(data)
data.head()

          0         1         2         3
0  0.222222     0.625  0.067797  0.041667
1  0.166667  0.416667  0.067797  0.041667
2  0.111111       0.5  0.050847  0.041667
3  0.083333  0.458333  0.084746  0.041667
4  0.194444  0.666667  0.067797  0.041667


In [50]:
# Keep the first three principal components
pca = PCA(n_components=3)

In [51]:
# Fit PCA on the scaled data and project the data onto the fitted components
x = pca.fit_transform(data)

In [54]:
x.shape  # not displayed: a notebook cell only shows the value of its last expression
x = pd.DataFrame(x)
x.head()

           0          1          2
0  -0.630703   0.107578  -0.018719
1  -0.622905   -0.10426  -0.049142
2   -0.66952  -0.051417   0.019644
3  -0.654153  -0.102885   0.023219
4  -0.648788   0.133488   0.015116
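
To see how much of the variance the retained components carry, scikit-learn's fitted `PCA` object exposes `explained_variance_ratio_`. The sketch below follows the same scale-then-reduce pipeline as the cells above, but on synthetic random data (an assumption, so that it runs without downloading the CSV); the printed ratios will therefore differ from those for the iris data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the data: 150 samples, 4 correlated features.
raw = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))

# Same pipeline as above: min-max scaling, then PCA with three components.
scaled = MinMaxScaler().fit_transform(raw)
pca = PCA(n_components=3)
reduced = pca.fit_transform(scaled)

print(reduced.shape)  # (150, 3)

# Fraction of the total variance carried by each kept component,
# ordered from the largest to the smallest.
ratio = pca.explained_variance_ratio_
print(ratio)
print(ratio.sum())  # at most 1; the remainder belongs to the dropped component
```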
