### Example
We start by examining a simple dataset: the Iris data bundled with scikit-learn. It contains four measurements (sepal length, sepal width, petal length, and petal width) for each of 150 flowers drawn from three species of iris.

There are three species of iris in the dataset:

1. Iris Virginica
2. Iris Setosa
3. Iris Versicolor

In [None]:
from sklearn.datasets import load_iris
iris = load_iris()
iris.keys()

In [None]:
# Check the shape of the data (X matrix) and the list of feature names
print(iris.data.shape)
print(iris.feature_names)

# Check the target (species) names
print(iris.target_names)

In [None]:
# Import PCA and instantiate it with 2 components
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca

In [None]:
#Fitting PCA to the iris dataset and transforming it into 2 principal components 
X, y = iris.data, iris.target 
X_proj = pca.fit_transform(X) 
X_proj.shape

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# Plot the data projected onto the 2 principal components to get a feel for its structure.
# Ignoring what's in y, it looks more like 2 clusters of data points than 3.
# c=y colors each point by its target (species) label.
plt.scatter(X_proj[:, 0], X_proj[:, 1], c=y)
plt.show()

In [None]:
# pca.components_ holds the direction of each principal component in the original feature space.
# Its shape is (2, 4): one row per principal component, one column per original feature.
# Each entry is the weight (loading) of that feature in the component; the values are not proportions and can be negative.
print(pca.components_ )
print(pca.components_.shape)

In [None]:
#Trying to decipher the meaning of the principal components 
print("Meaning of the 2 components:" )
for component in pca.components_:
    print( " + ".join("%.2f x %s" % (value, name) for value, name in zip(component, iris.feature_names)))
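
A minimal sketch of what these weights do, assuming the default (non-whitened) `PCA` settings used above: each projected coordinate is the weighted sum of the mean-centered features, so the projection can be reproduced by hand. `X_manual` is a name introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=2)
X_proj = pca.fit_transform(X)

# Center the data with the per-feature mean that PCA learned,
# then take the dot product with each component's weight vector.
X_manual = (X - pca.mean_) @ pca.components_.T

# The hand-computed projection should agree with fit_transform
# up to floating-point error.
print(np.allclose(X_proj, X_manual))
```

This is why the printed "meaning" of each component reads as a linear combination of the four features.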

In [None]:
# explained_variance_ratio_ gives the fraction of the total variance in X captured by each component.
# The 1st component explains ~92.5% of the variance and the 2nd only about 5.3%.
# Together they explain about 97.8% of the variance of X.
print( pca.explained_variance_ratio_)
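
One way to see where these ratios come from, sketched under the assumption that PCA is applied to the raw (unscaled) features as above: they are the eigenvalues of the feature covariance matrix, each divided by the sum of all eigenvalues. The names `manual_ratio` and `pca_full` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Eigenvalues of the 4x4 feature covariance matrix, sorted largest first.
cov = np.cov(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Each eigenvalue's share of the total variance.
manual_ratio = eigenvalues / eigenvalues.sum()

# Fit PCA with all 4 components so the two arrays have the same length.
pca_full = PCA(n_components=4).fit(X)

print(manual_ratio)
print(pca_full.explained_variance_ratio_)
```

The two printed arrays should match up to floating-point error, and their first two entries are the values reported by the 2-component model.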

In [None]:
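# If we only need to retain 92% of the variance, a single component is enough. Let's verify:
# passing a float between 0 and 1 asks PCA to keep as many components as needed to reach that fraction.
pca = PCA(n_components=0.92)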
X_new=pca.fit_transform(X) 
print(X_new.shape )

# If we need 98% of the variance, two components (~97.8%) are no longer enough, so a third is kept
pca = PCA(n_components=0.98)
X_new=pca.fit_transform(X)
print(X_new.shape)
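
The component counts above can be cross-checked from the cumulative explained variance. This is a rough sketch, not scikit-learn's internal code; `components_needed` is a hypothetical helper written for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Running total of the variance explained by the first k components.
cumulative = np.cumsum(PCA().fit(X).explained_variance_ratio_)
print(cumulative)

def components_needed(cumulative, target):
    """Smallest number of components whose cumulative ratio exceeds target."""
    # searchsorted returns a 0-based position, so add 1 to get a count.
    return int(np.searchsorted(cumulative, target, side="right") + 1)

for target in (0.92, 0.97, 0.98):
    print(target, components_needed(cumulative, target))
```

Plotting `cumulative` against the number of components is a common way to choose a cutoff visually.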

### More examples:
https://shankarmsy.github.io/posts/pca-sklearn.html