This notebook is part of the [Machine Learning class](https://github.com/erachelson/MLclass) by [Emmanuel Rachelson](https://personnel.isae-supaero.fr/emmanuel-rachelson?lang=en).

License: CC-BY-SA-NC.

<div style="font-size:22pt; line-height:25pt; font-weight:bold; text-align:center;">Unsupervised Learning</div>

Three unsupervised learning tasks are illustrated here:
1. [Dimensionality reduction](#dim)
2. [Clustering](#clust)
3. [Density estimation](#density)

# <a id="dim"></a> 1. Dimensionality reduction

In [None]:
# Note: load_boston was removed in scikit-learn 1.2; this cell requires an older version.
from sklearn.datasets import load_boston

boston = load_boston()
X, y = boston['data'], boston['target']
print(boston.DESCR)

In [None]:
from sklearn.decomposition import PCA

print(X.shape)

boston_pca = PCA()
boston_pca.fit(X)

In [None]:
boston_pca.explained_variance_ratio_

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.bar(range(X.shape[1]), boston_pca.explained_variance_ratio_, color="r", align="center")

In [None]:
import numpy as np

np.sum(boston_pca.explained_variance_ratio_[:2])
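The explained variance ratios summed above can be reproduced by hand, which helps when choosing how many components to keep. The following is a minimal sketch, assuming synthetic stand-in data (`X_demo`) rather than the Boston dataset, and a hypothetical 95% threshold. It computes the ratios from the singular values of the centered data matrix, which should match what `PCA().explained_variance_ratio_` reports for the same input.

```python
import numpy as np

# Synthetic stand-in data (an assumption, not the notebook's dataset):
# 200 samples, 5 features with very different scales.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])

# PCA via SVD of the centered data matrix.
X_centered = X_demo - X_demo.mean(axis=0)
_, singular_values, _ = np.linalg.svd(X_centered, full_matrices=False)

# Variance captured along each principal axis, then normalized to ratios.
explained_variance = singular_values**2 / (X_demo.shape[0] - 1)
explained_variance_ratio = explained_variance / explained_variance.sum()

# Cumulative ratio, and the smallest number of components reaching
# the (hypothetical) 95% threshold.
cumulative = np.cumsum(explained_variance_ratio)
n_components_95 = int(np.searchsorted(cumulative, 0.95) + 1)

print(explained_variance_ratio)
print(cumulative)
print(n_components_95)
```

Because the features are not standardized here, the components are dominated by the features with the largest scale; the same applies to any unscaled dataset.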

In [None]:
boston_pca = PCA(n_components=2)
boston_pca.fit(X)
X_proj = boston_pca.transform(X)

In [None]:
X_proj.shape

In [None]:
plt.scatter(X_proj[:,0], X_proj[:,1]);

In [None]:
boston_pca.components_

# <a id="clust"></a> 2. Clustering

In [None]:
from sklearn.cluster import KMeans
boston_kmeans = KMeans(n_clusters=2)
boston_kmeans.fit(X_proj)

In [None]:
boston_kmeans2 = KMeans(n_clusters=2)
boston_kmeans2.fit(X)
y_pred = boston_kmeans2.predict(X)
plt.scatter(X_proj[:,0], X_proj[:,1], c=y_pred);

In [None]:
y_pred = boston_kmeans.predict(X_proj)
plt.scatter(X_proj[:,0], X_proj[:,1], c=y_pred);
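The two scatter plots above compare a clustering fitted on the full data with one fitted on the 2D projection. K-means cluster indices are arbitrary, so the two label vectors cannot be compared with a plain equality test. The sketch below uses a hypothetical helper, `two_cluster_agreement`, and made-up label vectors to show one way of measuring agreement for the two-cluster case, up to a swap of the labels.

```python
import numpy as np

def two_cluster_agreement(labels_a, labels_b):
    """Fraction of samples on which two binary labelings agree,
    allowing for the labels 0 and 1 to be swapped in one of them."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    same = np.mean(labels_a == labels_b)
    # With exactly two clusters, swapping labels turns `same` into `1 - same`.
    return max(same, 1.0 - same)

# Made-up labelings for illustration only.
labels_full = np.array([0, 0, 1, 1, 1, 0])
labels_proj = np.array([1, 1, 0, 0, 1, 0])
agreement = two_cluster_agreement(labels_full, labels_proj)
print(agreement)
```

For more than two clusters this shortcut no longer applies; a permutation-invariant score such as `sklearn.metrics.adjusted_rand_score` is the usual choice.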

See this [example](http://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html) for a comparison of several clustering algorithms on toy datasets.

# <a id="density"></a> 3. Density estimation

In [None]:
from sklearn import svm

boston_ocsvm = svm.OneClassSVM(gamma=1e-3)
X1 = X_proj[y_pred==0,:]
plt.scatter(X1[:,0], X1[:,1])

In [None]:
boston_ocsvm.fit(X1)

In [None]:
xmin = np.min(X1[:,0])
xmax = np.max(X1[:,0])
ymin = np.min(X1[:,1])
ymax = np.max(X1[:,1])

xx, yy = np.meshgrid(np.linspace(xmin, xmax, 500), np.linspace(ymin, ymax, 500))

Z = boston_ocsvm.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, cmap=plt.cm.PuBu);
#plt.scatter(X1[:,0], X1[:,1])

In [None]:
plt.contourf(xx, yy, Z);

In [None]:
boston_ocsvm = svm.OneClassSVM(gamma=1e-3)

boston_ocsvm.fit(X_proj)

xmin = np.min(X_proj[:,0])
xmax = np.max(X_proj[:,0])
ymin = np.min(X_proj[:,1])
ymax = np.max(X_proj[:,1])

xx, yy = np.meshgrid(np.linspace(xmin, xmax, 500), np.linspace(ymin, ymax, 500))

Z = boston_ocsvm.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, cmap=plt.cm.PuBu);
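The contour plots show the values of `decision_function`, and it may help to see how those values relate to the inlier/outlier decision. The sketch below assumes synthetic 2D Gaussian data (`X_demo`) and illustrative values `gamma=0.5` and `nu=0.1`, none of which come from the notebook. Positive scores correspond to points predicted as inliers (`+1`), the remaining ones to outliers (`-1`), so the zero level set of the contour plot is the estimated boundary.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (an assumption, not the notebook's dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 2))

# gamma and nu are illustrative choices, not tuned values.
ocsvm = OneClassSVM(gamma=0.5, nu=0.1)
ocsvm.fit(X_demo)

# Signed score: positive inside the estimated region, negative outside.
scores = ocsvm.decision_function(X_demo)
# Hard decision: +1 for inliers, -1 for outliers.
labels = ocsvm.predict(X_demo)

outlier_fraction = np.mean(labels == -1)
print(outlier_fraction)
```

The parameter `nu` is an upper bound on the fraction of training errors. The notebook cells above leave it at its default of 0.5, so roughly half of the training points can end up outside the estimated region; a smaller `nu` gives a more inclusive boundary.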