<a href="https://colab.research.google.com/github/cagBRT/Machine-Learning/blob/master/DecisionTreesRegression2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook uses four regression models: ExtraTreesRegressor, KNeighborsRegressor, LinearRegression, and RidgeCV. 

It also uses the Olivetti faces dataset, which contains 400 grayscale images of human faces: 10 images of each of 40 people. 

This notebook trains the regression models on the upper half of each training face, then asks each model to reconstruct the lower half of faces it has not seen. 

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import fetch_olivetti_faces
from sklearn.utils.validation import check_random_state

from sklearn.ensemble import ExtraTreesRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import RidgeCV

**The [dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html)**<br>

Load the Olivetti faces dataset from AT&T<br>
>Classes: 40<br>
Samples total: 400<br>
Dimensionality: 4096<br>
Features: real, between 0 and 1<br>

In [None]:
# Load the faces dataset (flattened images and person labels)
data, targets = fetch_olivetti_faces(return_X_y=True)

In [None]:
data.shape

In [None]:
targets

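Split the data into training and test sets. Images of the first 30 people (labels 0 to 29) are used for training. Images of the remaining people are used for testing, so the models are evaluated on people they have never seen.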

In [None]:
train = data[targets < 30]
test = data[targets >= 30]  # Test on independent people
print("train shape:", train.shape)
print("test shape:", test.shape)
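The choice of boundary in the masks matters, because a strict `>` would silently drop one person. The sketch below checks this on synthetic labels (40 people with 10 images each, an assumed layout that matches the dataset description), so it runs without downloading anything. The names `targets_demo`, `train_mask`, `test_mask`, and `strict_mask` are illustrative only.

```python
import numpy as np

# Synthetic stand-in for `targets`: 40 people, 10 images each (assumed layout).
targets_demo = np.repeat(np.arange(40), 10)

train_mask = targets_demo < 30
test_mask = targets_demo >= 30      # includes person 30
strict_mask = targets_demo > 30     # silently drops person 30

print("train images:", train_mask.sum())           # 300
print("test images (>= 30):", test_mask.sum())     # 100
print("test images (> 30):", strict_mask.sum())    # 90

# No person appears in both the training and the test set.
overlap = set(targets_demo[train_mask]) & set(targets_demo[test_mask])
print("people in both sets:", len(overlap))        # 0
```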

In [None]:
# Test on a subset of people
n_faces = 5  # change this number to change the size of the test subset
# change the seed below to select different faces
rng = check_random_state(1)
face_ids = rng.randint(test.shape[0], size=(n_faces,))
test = test[face_ids, :]

In [None]:
face_ids
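Note that `randint` draws indices with replacement, so the same face can be selected more than once. If distinct faces are wanted, one possible alternative is `choice` with `replace=False`. This is a minimal sketch, not what the notebook does: `n_test_images` is a hypothetical stand-in for `test.shape[0]`, and the other names are illustrative.

```python
import numpy as np
from sklearn.utils.validation import check_random_state

n_test_images = 100   # hypothetical stand-in for test.shape[0]
n_faces_demo = 5

rng_demo = check_random_state(1)   # fixed seed: same selection on every run
# replace=False guarantees that no index is chosen twice
unique_ids = rng_demo.choice(n_test_images, size=n_faces_demo, replace=False)
print(unique_ids)
```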

In [None]:
n_pixels = data.shape[1]
# Upper half of the faces
X_train = train[:, :(n_pixels + 1) // 2]
# Lower half of the faces
y_train = train[:, n_pixels // 2:]
X_test = test[:, :(n_pixels + 1) // 2]
y_test = test[:, n_pixels // 2:]

In [None]:
print("X_train shape:",X_train.shape)
print("X_test shape:",X_test.shape)
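Each image is stored as a flat vector of 4096 values in row-major order, which is why slicing the vector in the middle separates the top 32 rows of the 64×64 image from the bottom 32. The sketch below verifies this on a synthetic image (not a real face) in which every pixel stores its own row index. All names in it are illustrative.

```python
import numpy as np

side = 64
# Synthetic image: every pixel stores its own row index.
image = np.repeat(np.arange(side), side).reshape(side, side)
flat = image.ravel()                 # same layout as one row of `data`

n_pix = flat.shape[0]                # 4096
upper = flat[:(n_pix + 1) // 2]      # model input: rows 0..31
lower = flat[n_pix // 2:]            # model target: rows 32..63

print(upper.shape, lower.shape)      # (2048,) (2048,)
print(upper.max(), lower.min())      # 31 32

# Concatenating the two halves restores the original image.
restored = np.hstack((upper, lower)).reshape(side, side)
print(np.array_equal(restored, image))   # True
```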

In [None]:
# Plot the selected test faces
image_shape = (64, 64)
n_cols = 1
plt.figure(figsize=(2. * n_cols, 2.26 * n_faces))
for i in range(n_faces):
    # Join the upper and lower halves back into one flat 4096-pixel vector
    true_face = np.hstack((X_test[i], y_test[i]))
    sub = plt.subplot(n_faces, n_cols, i * n_cols + 1)
    sub.axis("off")
    sub.imshow(true_face.reshape(image_shape),
               cmap=plt.cm.gray,
               interpolation="nearest")
plt.show()

[Extra Trees Regressor](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesRegressor.html)<br>
class sklearn.ensemble.ExtraTreesRegressor(n_estimators=100, *, criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=False, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None)

[KNeighborsRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html)<br>
class sklearn.neighbors.KNeighborsRegressor(n_neighbors=5, *, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=None)

[Linear Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html)<br>
class sklearn.linear_model.LinearRegression(*, fit_intercept=True, normalize='deprecated', copy_X=True, n_jobs=None, positive=False)

[RidgeCV](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeCV.html)<br>
class sklearn.linear_model.RidgeCV(alphas=(0.1, 1.0, 10.0), *, fit_intercept=True, normalize='deprecated', scoring=None, cv=None, gcv_mode=None, store_cv_values=False, alpha_per_target=False)

In [None]:
# Fit estimators
ESTIMATORS = {
    "Extra trees": ExtraTreesRegressor(n_estimators=10, max_features=32,
                                       random_state=0),
    "K-nn": KNeighborsRegressor(),
    "Linear regression": LinearRegression(),
    "Ridge": RidgeCV(),
}

In [None]:
y_test_predict = dict()
for name, estimator in ESTIMATORS.items():
    estimator.fit(X_train, y_train)
    y_test_predict[name] = estimator.predict(X_test)
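The loop above works because all four estimators accept a two-dimensional target, so each one predicts every lower-half pixel in a single call. The sketch below shows the shapes involved on small random arrays (synthetic stand-ins, not the face data). The names `X_demo`, `Y_demo`, `X_new`, `demo_estimators`, and `pred_shapes` are illustrative only.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.neighbors import KNeighborsRegressor

rng_demo = np.random.RandomState(0)
X_demo = rng_demo.rand(40, 8)        # 40 samples, 8 input features
Y_demo = rng_demo.rand(40, 6)        # 6 output values per sample
X_new = rng_demo.rand(3, 8)          # 3 unseen samples

demo_estimators = {
    "Extra trees": ExtraTreesRegressor(n_estimators=10, random_state=0),
    "K-nn": KNeighborsRegressor(),
    "Linear regression": LinearRegression(),
    "Ridge": RidgeCV(),
}

pred_shapes = {}
for name, estimator in demo_estimators.items():
    estimator.fit(X_demo, Y_demo)
    pred_shapes[name] = estimator.predict(X_new).shape
    # One row per sample, one column per output
    print(name, pred_shapes[name])
```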

In [None]:
# Plot the completed faces
image_shape = (64, 64)

n_cols = 1 + len(ESTIMATORS)
plt.figure(figsize=(2. * n_cols, 2.26 * n_faces))
plt.suptitle("Face completion with multi-output estimators", size=16)

for i in range(n_faces):
    true_face = np.hstack((X_test[i], y_test[i]))

    if i:
        sub = plt.subplot(n_faces, n_cols, i * n_cols + 1)
    else:
        sub = plt.subplot(n_faces, n_cols, i * n_cols + 1,
                          title="true faces")

    sub.axis("off")
    sub.imshow(true_face.reshape(image_shape),
               cmap=plt.cm.gray,
               interpolation="nearest")

    for j, est in enumerate(sorted(ESTIMATORS)):
        completed_face = np.hstack((X_test[i], y_test_predict[est][i]))

        if i:
            sub = plt.subplot(n_faces, n_cols, i * n_cols + 2 + j)

        else:
            sub = plt.subplot(n_faces, n_cols, i * n_cols + 2 + j,
                              title=est)

        sub.axis("off")
        sub.imshow(completed_face.reshape(image_shape),
                   cmap=plt.cm.gray,
                   interpolation="nearest")

plt.show()

**Assignment**<br>
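Change the parameters of the models (for example `n_estimators` and `max_features` for Extra trees, `n_neighbors` for K-nn, or `alphas` for Ridge), rerun the notebook, and note the differences in the completed faces. 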
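Visual differences can be hard to judge by eye, so it may help to track a number alongside the plots. The helper below is a hypothetical addition (`score_predictions` is not part of the notebook) that computes the mean squared error for each estimator from a dictionary shaped like `y_test_predict`. It is checked here on tiny synthetic arrays.

```python
import numpy as np
from sklearn.metrics import mean_squared_error


def score_predictions(y_true, predictions):
    """Return {estimator name: MSE} for a dict of predicted arrays."""
    return {name: mean_squared_error(y_true, y_pred)
            for name, y_pred in predictions.items()}


# Tiny synthetic check: a perfect prediction and one that is off by 0.5.
y_true_demo = np.zeros((2, 4))
predictions_demo = {
    "perfect": np.zeros((2, 4)),
    "offset": np.full((2, 4), 0.5),
}
scores = score_predictions(y_true_demo, predictions_demo)
print(scores)
```

In the notebook, calling `score_predictions(y_test, y_test_predict)` after the fitting cell would give one value per model, where lower means the predicted pixels are closer to the true lower halves.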