### INTRO

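* Use case: far fewer labeled points than unlabeled points (#labeled << #unlabeled); unlabeled points carry the label -1
* Both LabelPropagation and LabelSpreading work by building a similarity graph over all points in the input dataset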

[API](http://scikit-learn.org/stable/modules/classes.html#module-sklearn.semi_supervised)
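
To make the similarity graph idea concrete, here is a minimal sketch that builds a dense RBF graph and a sparse nearest-neighbor graph by hand. The toy array `X` and the choice of `n_neighbors=1` are assumptions for illustration. The estimators build their graphs internally, so this is not a claim about their exact implementation.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.neighbors import kneighbors_graph

# Toy data (illustrative values): two tight clusters of two points each
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

# Dense, fully connected graph: weight = exp(-gamma * ||x - y||^2)
W_rbf = rbf_kernel(X, gamma=20)

# Sparse graph: each point is linked only to its single nearest neighbor
W_knn = kneighbors_graph(X, n_neighbors=1, mode="connectivity",
                         include_self=False)

print(W_rbf.round(3))   # large weights inside a cluster, ~0 across clusters
print(W_knn.toarray())  # one nonzero entry per row
```

Labels then flow along the heavy edges, so points in the same cluster tend to end up with the same label.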

### LABEL PROPAGATION

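* Performs "hard clamping" of the input labels (alpha=1), so labeled points keep their given labels; the clamping can be relaxed
* Uses the raw similarity matrix built from the data, with no modification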

[API](http://scikit-learn.org/stable/modules/generated/sklearn.semi_supervised.LabelPropagation.html#sklearn.semi_supervised.LabelPropagation) |
[demo](plot_label_propagation_structure.ipynb)



In [2]:
import numpy as np
from sklearn import datasets
from sklearn.semi_supervised import LabelPropagation
label_prop_model = LabelPropagation()
iris = datasets.load_iris()
random_unlabeled_points = np.where(np.random.randint(0, 2,
   size=len(iris.target)))  # indices where a random 0/1 draw is 1 (about half the points)
labels = np.copy(iris.target)
labels[random_unlabeled_points] = -1  # -1 marks a point as unlabeled
label_prop_model.fit(iris.data, labels)

LabelPropagation(alpha=1, gamma=20, kernel='rbf', max_iter=30, n_jobs=1,
         n_neighbors=7, tol=0.001)
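
After fitting, the inferred label for every training point is available in the `transduction_` attribute. The sketch below is a hedged illustration: the fixed seed, the 50% masking rate, and the variable names are assumptions, and the default hyperparameters depend on the installed scikit-learn version. It checks how well the hidden labels were recovered and whether the labeled points kept their labels.

```python
import numpy as np
from sklearn import datasets
from sklearn.semi_supervised import LabelPropagation

iris = datasets.load_iris()

# Hide roughly half of the labels; the seed is fixed only for repeatability
rng = np.random.RandomState(0)
unlabeled = rng.rand(len(iris.target)) < 0.5
labels = np.copy(iris.target)
labels[unlabeled] = -1  # -1 marks a point as unlabeled

model = LabelPropagation()
model.fit(iris.data, labels)

# Labels assigned to every training point, labeled or not
inferred = model.transduction_

# Fraction of hidden labels that were recovered correctly
accuracy = np.mean(inferred[unlabeled] == iris.target[unlabeled])

# With hard clamping, labeled points should keep their given labels
clamped_ok = np.array_equal(inferred[~unlabeled], iris.target[~unlabeled])

print("accuracy on unlabeled points:", round(accuracy, 3))
print("labeled points unchanged:", clamped_ok)
```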

### LABEL SPREADING

* Minimizes a loss function with regularization properties, which makes it more robust to label noise
* Normalizes the edge weights by computing the normalized graph Laplacian matrix
* Two built-in kernels: kernel='rbf' (tuned with 'gamma') and kernel='knn' (tuned with 'n_neighbors')
   * RBF kernel = fully connected graph, stored as a dense matrix
   * KNN kernel = sparse matrix (much lower memory use and running time)

[API](http://scikit-learn.org/stable/modules/generated/sklearn.semi_supervised.LabelSpreading.html#sklearn.semi_supervised.LabelSpreading) |
[demo](plot_label_propagation_structure.ipynb)
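
The normalization and soft clamping described above can be sketched in plain NumPy. This is a simplified illustration, not the library's code: the helper names `normalized_affinity` and `spread_labels`, the 4-node chain graph, and the fixed iteration count are all assumptions. It uses the symmetric normalization S = D^{-1/2} W D^{-1/2} and the update F = alpha * S F + (1 - alpha) * Y, where alpha controls how much each point takes from its neighbors versus its initial label.

```python
import numpy as np

def normalized_affinity(W):
    """Return S = D^{-1/2} W D^{-1/2} with a zero diagonal (hypothetical helper)."""
    W = W.astype(float).copy()
    np.fill_diagonal(W, 0.0)           # ignore self-similarity
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def spread_labels(W, Y, alpha=0.2, n_iter=30):
    """Iterate F = alpha * S F + (1 - alpha) * Y (hypothetical helper)."""
    S = normalized_affinity(W)
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F

# Chain graph 0 - 1 - 2 - 3 with unit edge weights
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

# One-hot initial labels: node 0 is class 0, node 3 is class 1,
# nodes 1 and 2 are unlabeled (all-zero rows)
Y = np.array([[1, 0],
              [0, 0],
              [0, 0],
              [0, 1]])

F = spread_labels(W, Y)
print(F.argmax(axis=1))  # each unlabeled node follows its nearer labeled node
```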

In [4]:
import numpy as np
from sklearn import datasets
from sklearn.semi_supervised import LabelSpreading
label_prop_model = LabelSpreading()
iris = datasets.load_iris()
random_unlabeled_points = np.where(np.random.randint(0, 2,
   size=len(iris.target)))  # indices where a random 0/1 draw is 1 (about half the points)
labels = np.copy(iris.target)
labels[random_unlabeled_points] = -1  # -1 marks a point as unlabeled
label_prop_model.fit(iris.data, labels)

LabelSpreading(alpha=0.2, gamma=20, kernel='rbf', max_iter=30, n_jobs=1,
        n_neighbors=7, tol=0.001)
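
The two kernels can be tried side by side on the same partially labeled data. The sketch below is an assumption-laden illustration: the seed, the masking rate, and the `results` dictionary are made up for the example, while `gamma=20`, `n_neighbors=7`, and `alpha=0.2` are taken from the output above. On a dataset as small as iris the runtime difference is negligible; the sparse KNN graph matters mostly for large datasets.

```python
import numpy as np
from sklearn import datasets
from sklearn.semi_supervised import LabelSpreading

iris = datasets.load_iris()

# Hide roughly half of the labels; the seed is fixed only for repeatability
rng = np.random.RandomState(0)
unlabeled = rng.rand(len(iris.target)) < 0.5
labels = np.copy(iris.target)
labels[unlabeled] = -1  # -1 marks a point as unlabeled

results = {}
for kernel, params in [("rbf", {"gamma": 20}), ("knn", {"n_neighbors": 7})]:
    # alpha=0.2: each point takes 20% from its neighbors, 80% from its initial label
    model = LabelSpreading(kernel=kernel, alpha=0.2, **params)
    model.fit(iris.data, labels)
    # Fraction of hidden labels recovered correctly
    results[kernel] = np.mean(
        model.transduction_[unlabeled] == iris.target[unlabeled])

for kernel, acc in results.items():
    print(kernel, round(acc, 3))
```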