# Chapter 2
## Section: Model training and evaluation
This section implements the three steps of machine learning modeling with scikit-learn: initialization, model training (i.e., fitting), and prediction.

In [2]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn import metrics

# loading breast cancer dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=5)

from sklearn.ensemble import RandomForestClassifier
# initializing a random forest model
rf_model = RandomForestClassifier(n_estimators=10, max_features=10, max_depth=4)
# training the random forest model using training set
rf_model.fit(X_train, y_train)
# predicting values of test set using the trained random forest model
y_pred_rf = rf_model.predict(X_test)
# assessing performance of the model on test set
print("Confusion matrix of the predictions:\n", metrics.confusion_matrix(y_test, y_pred_rf))
print("Balanced accuracy of the predictions:", round(metrics.balanced_accuracy_score(y_test, y_pred_rf), 4))

Confusion matrix of the predictions:
 [[ 58   3]
 [  2 108]]
Balanced accuracy of the predictions: 0.9663
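
To make the link between the two printed metrics explicit, the sketch below recomputes the balanced accuracy by hand from the confusion matrix shown above. It assumes scikit-learn's convention that rows are true classes and columns are predicted classes; the variable names `cm`, `recall_per_class`, and `balanced_accuracy` are illustrative and do not come from the original notebook.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class)
cm = np.array([[58, 3],
               [2, 108]])

# Recall of each class: correct predictions for that class
# divided by the number of true members of that class
recall_per_class = np.diag(cm) / cm.sum(axis=1)

# Balanced accuracy is the unweighted mean of the per-class recalls
balanced_accuracy = recall_per_class.mean()
print("Balanced accuracy from the confusion matrix:", round(balanced_accuracy, 4))
```

The result, (58/61 + 108/110) / 2, rounds to 0.9663, the same value reported by `metrics.balanced_accuracy_score`.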


In [3]:
from sklearn import cluster
# initializing a k-means clustering model
kmeans_model = cluster.KMeans(n_clusters=2, n_init=10)
# training the kmeans clustering model using training set
kmeans_model.fit(X_train)
# assigning new observations (here, the test set datapoints) to the identified clusters
y_pred_kmeans = kmeans_model.predict(X_test)

print('Assigned clusters for each datapoint in test set: {}'.format(y_pred_kmeans))

Assigned clusters for each datapoint in test set: [0 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 0 1 0 1 1 0
 1 1 0 1 1 1 0 1 1 1 0 1 0 1 1 1 1 1 0 1 0 1 0 1 1 1 1 1 1 1 1 1 0 1 0 1 1
 1 1 1 1 0 1 0 1 1 1 1 1 0 1 1 0 0 0 1 1 0 0 1 1 1 1 1 0 1 1 1 0 1 0 1 1 1
 0 1 0 1 1 0 1 0 1 1 1 0 1 0 1 0 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 0
 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1]
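
The cluster ids printed above are arbitrary: cluster 0 does not necessarily correspond to class 0, so they cannot be compared with `y_test` using accuracy directly. One possible way to check how well the clusters agree with the known labels is a label-permutation-invariant score such as the adjusted Rand index. The sketch below is self-contained and repeats the data loading and split from the earlier cell; the `random_state=0` passed to `KMeans` and the variable name `ari` are assumptions added here for repeatability, not part of the original notebook.

```python
from sklearn import cluster, metrics
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# same dataset and split as in the cells above
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=5)

# random_state=0 is an added assumption so that repeated runs give the same clusters
kmeans_model = cluster.KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans_model.fit(X_train)
y_pred_kmeans = kmeans_model.predict(X_test)

# adjusted Rand index: 1.0 means the clusters match the labels exactly,
# values near 0 mean agreement no better than chance;
# the score does not depend on which cluster is numbered 0 or 1
ari = metrics.adjusted_rand_score(y_test, y_pred_kmeans)
print("Adjusted Rand index on the test set:", round(ari, 4))
```

The true labels are used here only to evaluate the clustering after the fact; k-means itself never sees `y_train` during fitting.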
