# Introduction to scikit-learn

Scikit-learn is a free, open-source machine learning library for Python. It provides a range of classification, regression and clustering algorithms, including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
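
As a rough illustration of that interoperability, the sketch below clusters a plain NumPy array with k-means. The synthetic two-blob data, the variable names and the parameter values (`n_init=10`, `random_state=0`) are assumptions chosen for the example, not recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated synthetic blobs (illustrative data only)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

# Estimators accept NumPy arrays directly and return NumPy arrays
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(points)

print(cluster_ids.shape)              # one cluster id per input row
print(kmeans.cluster_centers_.shape)  # one center per cluster
```

Most scikit-learn estimators follow the same pattern: construct the object with its hyperparameters, then call `fit` (and `predict` or `fit_predict`) on array-like data.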

### Digit classifier example

Here is a simple neural network that classifies small 8x8-pixel grayscale images of handwritten digits.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

ds = load_digits()
digits = ds["data"]
labels = ds["target"]

# Size of the test set (the remaining samples are used for training)
N = 200

# Shuffle the dataset by drawing a random permutation of the sample indices
idx = np.random.permutation(len(labels))

x_test, y_test = digits[idx[:N]], labels[idx[:N]]
x_train, y_train = digits[idx[N:]], labels[idx[N:]]

clf = MLPClassifier(hidden_layer_sizes=(128,))
clf.fit(x_train, y_train)

score = clf.score(x_test, y_test)
pred = clf.predict(x_test)
err = np.where(y_test != pred)[0]

print("score:", score)
print("errors:")
print("  actual:", y_test[err])
print("  predicted:", pred[err])

score: 0.985
errors:
  actual: [8 6 5]
  predicted: [1 5 9]
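
The manual shuffle-and-slice split above can also be written with scikit-learn's own helpers. The following is a minimal sketch, not the original cell: the use of `train_test_split` and `confusion_matrix`, the fixed `random_state=0`, the stratified split and `max_iter=500` are all assumptions made for illustration, so the score it prints will generally differ from the output shown above.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits, labels = load_digits(return_X_y=True)

# Hold out 200 samples for testing; random_state is an arbitrary choice
x_train, x_test, y_train, y_test = train_test_split(
    digits, labels, test_size=200, random_state=0, stratify=labels
)

# max_iter is raised above the default as an assumption, to give the
# optimizer more room to converge
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
clf.fit(x_train, y_train)

pred = clf.predict(x_test)
score = clf.score(x_test, y_test)

# Rows are actual digits, columns are predicted digits
cm = confusion_matrix(y_test, pred, labels=list(range(10)))

print("score:", score)
print(cm)
```

Off-diagonal entries of the confusion matrix give the same information as the `actual` and `predicted` lists above, aggregated per pair of digits.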
