## Scikit-learn
Scikit-learn is an open source Python library that
implements a range of machine learning,
preprocessing, cross-validation and visualization
algorithms using a unified interface.

### A Basic Example
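This example uses the bundled iris dataset. It keeps the first two features, splits the data into training and test sets, standardizes both, and scores a k-nearest-neighbors classifier.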

In [3]:
from sklearn import neighbors, datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load iris and keep only the first two features
iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)
# Fit the scaler on the training set only, then apply it to both sets
scaler = preprocessing.StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
# Train a 5-nearest-neighbors classifier and score it on the test set
knn = neighbors.KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print('Score:', accuracy_score(y_test, y_pred))

Score: 0.631578947368421
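
The "unified interface" mentioned in the introduction means that different estimators expose the same `fit`, `predict` and `score` methods. The sketch below reruns the workflow above with two classifiers in one loop. `LogisticRegression` and the `scores` dictionary are not part of the original example; they are added here purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Same data preparation as the basic example: first two iris features
X, y = load_iris(return_X_y=True)
X = X[:, :2]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)

# Fit the scaler on the training data only
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Both estimators are trained and scored through identical method calls
scores = {}
for model in (KNeighborsClassifier(n_neighbors=5), LogisticRegression()):
    model.fit(X_train, y_train)
    # For classifiers, score() returns the mean accuracy
    scores[type(model).__name__] = model.score(X_test, y_test)

print(scores)
```

Swapping in another classifier only requires changing the tuple of estimators; the rest of the loop stays the same.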


### Loading the Data
Your data needs to be numeric and stored as NumPy arrays or SciPy sparse
matrices. Other types that can be converted to numeric arrays, such as pandas
DataFrames, are also acceptable.
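
As a minimal sketch of the three container types named above, the following fits the same classifier on a NumPy array, a pandas DataFrame and a SciPy sparse matrix. The data is random, and the column names `a`, `b`, `c` are made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.neighbors import KNeighborsClassifier

# Illustrative random data: 20 samples, 3 numeric features, 2 classes
rng = np.random.default_rng(0)
X_dense = rng.random((20, 3))
y = np.array([0, 1] * 10)

# The same values wrapped in a DataFrame and in a CSR sparse matrix
X_frame = pd.DataFrame(X_dense, columns=["a", "b", "c"])
X_sparse = sparse.csr_matrix(X_dense)

preds = {}
for name, data in [("ndarray", X_dense), ("DataFrame", X_frame), ("sparse", X_sparse)]:
    model = KNeighborsClassifier(n_neighbors=3).fit(data, y)
    preds[name] = model.predict(data)

# One prediction per sample, regardless of the container used
print({name: p.shape for name, p in preds.items()})
```

Not every estimator accepts sparse input, so it is worth checking the documentation of the estimator you plan to use.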

### NumPy Example

In [9]:
import numpy as np
# 11 samples with 5 random features each, and one label per sample
X = np.random.random((11, 5))
y = np.array(['M', 'M', 'F', 'F', 'M', 'F', 'M', 'M', 'F', 'F', 'F'])
# Zero out every value below 0.7
X[X < 0.7] = 0

### Train and Test Example

In [12]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
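
To make the effect of the split concrete, the sketch below checks the resulting shapes for 11 samples. With the default `test_size` of 0.25, the test set is rounded up to 3 samples, leaving 8 for training. The feature values here are placeholders, and the `stratify=y` variant is an optional extra that is not used in the original example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features for 11 samples, with the same labels as above
X = np.arange(22).reshape(11, 2)
y = np.array(['M', 'M', 'F', 'F', 'M', 'F', 'M', 'M', 'F', 'F', 'F'])

# Default split: 25% of the samples go to the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
print(X_train.shape, X_test.shape)  # (8, 2) (3, 2)

# Optional: stratify on y to keep class proportions similar in both sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
print(sorted(set(y_te)))  # ['F', 'M']
```

Stratifying is mostly useful for small or imbalanced datasets, where a purely random split could leave a class out of the test set.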