<h2>scikit-learn</h2>
This notebook shows the main functions of scikit-learn, a Python library for machine learning that covers classification, clustering, and dimensionality reduction.

scikit-learn is usually bundled with common Python distributions such as Anaconda (https://www.continuum.io/anaconda). If it is not bundled with your distribution and you use pip, you can install it with the command "pip install scikit-learn". Once scikit-learn is installed, you can import it into your Python scripts under the package name sklearn, as the cells in this notebook do.<br>
A free book that includes lectures about scikit-learn is available at http://www.scipy-lectures.org (most of the examples shown in this notebook are derived from this book).<br>
scikit-learn's official web site is http://scikit-learn.org/stable/index.html
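
As a quick check that the installation works, the following minimal sketch imports the package, prints its version, and loads the iris dataset that the rest of this notebook uses. It assumes only that scikit-learn is installed under its usual package name sklearn.

```python
import sklearn
from sklearn import datasets

# Print the installed version; the exact value depends on your environment
print(sklearn.__version__)

# The iris dataset has 150 samples with 4 features each and 3 classes
iris = datasets.load_iris()
print(iris.data.shape)
print(iris.target_names)
```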

<h3>1. k-Nearest neighbors classifier</h3>
https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm<br>
http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

In [27]:
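from sklearn import datasets
from sklearn import neighbors

# Load the iris dataset: 150 samples, 4 features, 3 classes
iris = datasets.load_iris()

# Fit a k-nearest neighbors classifier (default: k = 5 neighbors)
classifier = neighbors.KNeighborsClassifier()
classifier.fit(iris.data, iris.target)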

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=5, p=2,
           weights='uniform')

In [47]:
classifier.predict([[0.1, 0.2, 0.3, 0.4]])

array([0])

In [53]:
classifier.predict_proba([[2.1, 2.2, 1.9, 1.4]])

array([[ 1.,  0.,  0.]])
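
The cells above fit and query the classifier on the same data. To estimate how well it generalizes, one option is to hold out part of the data for testing. The sketch below assumes a 70/30 split with a fixed random_state (both arbitrary choices for illustration, as are the variable names) and also shows that, with uniform weights, predict_proba returns the fraction of the k neighbors that belong to each class.

```python
from sklearn import datasets, neighbors
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold out 30% of the samples; stratify keeps the class proportions equal
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0,
    stratify=iris.target)

knn = neighbors.KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Mean accuracy on samples the classifier has not seen during fitting
accuracy = knn.score(X_test, y_test)
print(accuracy)

# Class probabilities for the first test sample: each entry is the share
# of the 5 nearest neighbors that carry the corresponding label
proba = knn.predict_proba(X_test[:1])
print(proba)
```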

<h3>2. Support Vector Machine</h3>
https://en.wikipedia.org/wiki/Support_vector_machine<br>
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

In [39]:
from sklearn import svm

svc = svm.SVC(kernel='linear') # Other built-in kernels include 'poly', 'rbf' (the default) and 'sigmoid'
svc.fit(iris.data, iris.target)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [45]:
svc.predict([[0.1, 0.2, 0.3, 0.4]])

array([0])
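
To get a feeling for how the kernel choice affects the result, the kernels can be compared with cross-validation. This is a minimal sketch, assuming 5-fold cross-validation and default hyperparameters. The scores it prints depend on the scikit-learn version and are not tuned results.

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()

scores = {}
for kernel in ("linear", "poly", "rbf"):
    model = svm.SVC(kernel=kernel)
    # cross_val_score returns one accuracy value per fold; average them
    scores[kernel] = cross_val_score(model, iris.data, iris.target, cv=5).mean()
    print(kernel, round(scores[kernel], 3))
```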

<h3>3. K-Means Clustering</h3>
https://en.wikipedia.org/wiki/K-means_clustering<br>
http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

In [56]:
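from sklearn import cluster
# Group the samples into 3 clusters without using the class labels
clustering = cluster.KMeans(n_clusters=3)
clustering.fit(iris.data)
# The cluster numbers are arbitrary and may differ between runs
print(clustering.labels_)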

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 2 2 2 1 2 2 2 2
 2 2 1 1 2 2 2 2 1 2 1 2 1 2 2 1 1 2 2 2 2 2 1 2 2 2 2 1 2 2 2 1 2 2 2 1 2
 2 1]
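
Because the cluster numbers are arbitrary, they cannot be compared with iris.target element by element. One way to measure the agreement is the adjusted Rand index, which ignores how the clusters are numbered. The sketch below assumes fixed values for n_init and random_state so that the run is reproducible.

```python
from sklearn import cluster, datasets
from sklearn.metrics import adjusted_rand_score

iris = datasets.load_iris()

# n_init restarts the algorithm several times and keeps the best result
kmeans = cluster.KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(iris.data)

# One center per cluster, each with one coordinate per feature
print(kmeans.cluster_centers_.shape)

# 1.0 means perfect agreement with the true classes, values near 0.0
# correspond to a random assignment
ari = adjusted_rand_score(iris.target, kmeans.labels_)
print(ari)
```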


<h3>4. Principal Component Analysis</h3>
https://en.wikipedia.org/wiki/Principal_component_analysis<br>
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

In [35]:
from sklearn import decomposition
pca = decomposition.PCA(n_components=2)
pca.fit(iris.data)

PCA(copy=True, iterated_power='auto', n_components=2, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)

In [54]:
# Project the 4 features onto the first 2 principal components; the result has shape (150, 2)
transformed = pca.transform(iris.data)
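
To judge how much information survives the reduction to two components, the attribute explained_variance_ratio_ can be inspected. A minimal sketch, using fit_transform as a shortcut for fit followed by transform:

```python
from sklearn import datasets, decomposition

iris = datasets.load_iris()

pca = decomposition.PCA(n_components=2)
# fit_transform fits the model and projects the data in one step
transformed = pca.fit_transform(iris.data)
print(transformed.shape)

# Share of the total variance captured by each principal component
print(pca.explained_variance_ratio_)

# Share of the variance retained by both components together
retained = pca.explained_variance_ratio_.sum()
print(retained)
```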

<h3>5. Multi-layer Perceptron Artificial Neural Network</h3>
https://en.wikipedia.org/wiki/Multi-layer_perceptron<br>
http://scikit-learn.org/stable/modules/neural_networks_supervised.html<br>
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html<br>
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html<br>
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html

In [19]:
# MLPRegressor is the regression counterpart; it uses the identity function as activation function of the output layer
from sklearn.neural_network import MLPClassifier

x_values = [[0., 0.], [1., 1.]] # Training inputs
y_values = [0, 1] # Target class labels, one per input

# StandardScaler(copy=True, with_mean=True, with_std=True) standardizes each feature to zero mean and unit variance
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(x_values)

StandardScaler(copy=True, with_mean=True, with_std=True)

In [21]:
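# Only the inputs are scaled; the class labels in y_values stay unchanged.
# Run this cell only once: it overwrites x_values, so a second run would scale the already scaled data again.
x_values = scaler.transform(x_values)
x_values

array([[-1., -1.],
       [ 1.,  1.]])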
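
The transformation performed by StandardScaler can be reproduced by hand: each feature is shifted by its mean and divided by its standard deviation, both taken from the data passed to fit. The sketch below checks this with NumPy and also shows what happens when transform is applied a second time to data that is already scaled. The variable names are chosen for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.array([[0., 0.], [1., 1.]])
scaler = StandardScaler().fit(x)

# Manual standardization: (x - mean) / standard deviation, per feature
manual = (x - x.mean(axis=0)) / x.std(axis=0)
scaled = scaler.transform(x)
print(scaled)

# Transforming again reuses the mean and deviation of the ORIGINAL data,
# so the already scaled values are shifted and stretched a second time
scaled_twice = scaler.transform(scaled)
print(scaled_twice)
```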

In [22]:
ann = MLPClassifier(solver="lbfgs", alpha=1e-5, activation="relu",
                    hidden_layer_sizes=(5, 2), random_state=1) # solver="sgd" selects stochastic gradient descent
ann.fit(x_values, y_values)

MLPClassifier(activation='relu', alpha=1e-05, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(5, 2), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=1, shuffle=True,
       solver='lbfgs', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)

In [23]:
predictions = ann.predict([[2., 2.], [-1., -2.]])
predictions

array([1, 0])

In [28]:
# To evaluate the model, compare its predictions with the expected labels of the same samples.
# y_values holds the labels of the training inputs, so the expected labels of the two new
# points are listed separately (assumed here: [2., 2.] -> 1 and [-1., -2.] -> 0).
from sklearn.metrics import classification_report, confusion_matrix

y_expected = [1, 0]
confusion_matrix(y_expected, predictions)

array([[1, 0],
       [0, 1]], dtype=int64)

In [29]:
# precision = fraction of the samples assigned to a class that really belong to it
# recall = fraction of the samples of a class that were assigned to it
# f1 = harmonic mean of precision and recall
print(classification_report(y_expected, predictions))

             precision    recall  f1-score   support

          0       1.00      1.00      1.00         1
          1       1.00      1.00      1.00         1

avg / total       1.00      1.00      1.00         2
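
Two samples are too few for a meaningful evaluation. A more realistic workflow, sketched below under some assumptions, trains the network on the iris dataset and evaluates it on held-out samples. The pipeline, the 70/30 split, the single hidden layer with 10 units, and max_iter=1000 are illustrative choices, not settings taken from this notebook.

```python
from sklearn import datasets
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0,
    stratify=iris.target)

# The pipeline fits the scaler on the training data only and applies the
# same scaling automatically whenever the model predicts
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(solver="lbfgs", alpha=1e-5, hidden_layer_sizes=(10,),
                  max_iter=1000, random_state=1))
model.fit(X_train, y_train)

# Compare the predictions with the true labels of the SAME test samples
test_predictions = model.predict(X_test)
matrix = confusion_matrix(y_test, test_predictions)
print(matrix)
print(classification_report(y_test, test_predictions))
```

Rows of the confusion matrix correspond to the true classes and columns to the predicted classes, so correct classifications lie on the diagonal.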



PSB 2017