<center><img src=img/MScAI_brand.png width=70%></center>

# Scikit-Learn: Summary

We have now seen supervised, unsupervised, and representation learning. All use the Scikit-Learn *Estimator* API. In this notebook, we'll look at a few more details of that API, and then briefly summarise what is in Scikit-Learn.

### The Estimator API

"the main API implemented by scikit-learn is that of the estimator. An estimator is any object that learns from data; it may be a classification, regression or clustering algorithm or a transformer that extracts/filters useful features from raw data."



An estimator object has a `fit` method which may accept `X` or `X` and `y`:

```python
estimator.fit(X)
# OR...
estimator.fit(X, y)
```
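As a minimal sketch of the two call forms, assuming a small synthetic dataset invented for illustration (the arrays `X` and `y` below are not course data): an unsupervised estimator such as `KMeans` is fitted on `X` alone, while a supervised one such as `LinearRegression` also needs `y`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Synthetic data, made up for this illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0  # exact linear target, no noise

# Unsupervised: fit takes X only
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(X)

# Supervised: fit takes X and y
reg = LinearRegression()
reg.fit(X, y)

# fit returns the estimator itself, so calls can be chained
returns_self = reg.fit(X, y) is reg
```

Because `fit` returns the estimator, one-liners such as `LinearRegression().fit(X, y)` are common.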

A predictor object is an estimator which also has a `predict` method:

```python
estimator.predict(X)
```

Some predictors whose `predict` output is discrete (e.g. clustering or classification) will implement either `predict_proba` or `decision_function`, which return real values:

```python
estimator.predict_proba(X)
```
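To make the real-valued outputs concrete, here is a small sketch on synthetic data (generated with `make_classification`, purely as an assumption for illustration). `LogisticRegression` provides `predict_proba`, and `SVC` with default settings provides `decision_function`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Synthetic binary classification problem, for illustration only
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba(X)       # one column per class; each row sums to 1

svc = SVC().fit(X, y)
scores = svc.decision_function(X)  # one signed score per sample (binary case)
```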

The `predict` method is then usually implemented as a threshold over the result, e.g.:

```python
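def predict(self, X):
    # Sketch for a binary classifier; `threshold` is a hypothetical attribute.
    # Column 1 of predict_proba holds the probability of the positive class.
    return self.predict_proba(X)[:, 1] > self.threshold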
```
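We can check the thresholding idea on a real estimator. This is a sketch assuming a binary problem and synthetic data. For a binary `LogisticRegression`, thresholding the positive-class probability at 0.5 should reproduce `predict`. Multi-class classifiers typically pick the most probable class instead.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem with labels 0 and 1, for illustration only
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)

# Column 1 of predict_proba is P(class 1); threshold it at 0.5
manual = (clf.predict_proba(X)[:, 1] > 0.5).astype(int)

# Compare with the built-in predict
agree = np.array_equal(manual, clf.predict(X))
```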

A *transformer* is an estimator with either `transform` or `fit_transform`. Often transformers are representation learning approaches. `fit_transform` is just a shortcut to calling `fit` and then `transform` on the same data.
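A quick sketch of the shortcut, using `StandardScaler` as the transformer and random data made up for illustration: calling `fit_transform` gives the same result as calling `fit` and then `transform` on the same data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic data with non-zero mean and non-unit variance
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(50, 3))

# One step
Z1 = StandardScaler().fit_transform(X)

# Two steps
scaler = StandardScaler()
scaler.fit(X)
Z2 = scaler.transform(X)

same = np.allclose(Z1, Z2)
```

The two-step form is the one to use when the transformer is fitted on training data and then applied to different (e.g. test) data.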

A *model* is an object with a `score` method which evaluates how good the fit is, e.g. $R^2$ for regression or accuracy for classification. By Scikit-Learn convention, higher is always better.

```python
estimator.score(X)
```

or

```python
estimator.score(X, y)
```
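As a sketch of both forms, on synthetic data invented for illustration: for `LinearRegression`, `score(X, y)` is the $R^2$ value, which we can compare against `sklearn.metrics.r2_score`. For `KMeans`, `score(X)` is the negative of the $k$-means objective, so that higher is still better.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic regression data: linear signal plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Supervised: score(X, y) is R^2
reg = LinearRegression().fit(X, y)
r2 = reg.score(X, y)
r2_manual = r2_score(y, reg.predict(X))

# Unsupervised: score(X) needs no y; for k-means it is never positive
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
km_score = km.score(X)
```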

For a little more detail on this, see https://scikit-learn.org/stable/tutorial/statistical_inference/settings.html#estimators-objects

A typical example is `LogisticRegression`:

```python
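from sklearn.linear_model import LogisticRegression

# X, y, X_test, y_test and X_query are assumed to be defined already
lr = LogisticRegression()
lr.fit(X, y)                # learn from the training data
lr.score(X_test, y_test)    # mean accuracy on held-out data
lr.predict_proba(X_query)   # class probabilities for new points
lr.predict(X_query)         # predicted class labels for new points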
```

### Semantics

Calling `predict` **before** `fit` is disallowed: Scikit-Learn raises a `NotFittedError`.
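A small sketch of what happens in practice, with toy data made up for illustration: an unfitted estimator raises `sklearn.exceptions.NotFittedError` when asked to predict.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression

# Tiny toy dataset, for illustration only
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression()
try:
    clf.predict(X)          # not fitted yet
    raised = False
except NotFittedError:
    raised = True

clf.fit(X, y)
pred = clf.predict(X)       # allowed after fitting
```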

After fitting, the estimator object will usually have some new attributes named with a trailing underscore, e.g. `lr.coef_` and `lr.intercept_` for linear or logistic regression, or `support_vectors_` and some others for an SVM.
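We can observe these attributes appearing. In this sketch (toy data invented for illustration), `coef_` does not exist on a freshly constructed `LogisticRegression` and is created by `fit`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary problem with two features
X = np.array([[0.0, 1.0], [1.0, 0.5], [2.0, 2.0], [3.0, 1.5]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression()
before = hasattr(clf, "coef_")    # learned attributes do not exist yet

clf.fit(X, y)
after = hasattr(clf, "coef_")     # created by fit

coef_shape = clf.coef_.shape            # one row of weights in the binary case
intercept_shape = clf.intercept_.shape
```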

Also, a call to `fit` overwrites the result of any previous call.

```python
lr.fit(X1, y1)
lr.fit(X2, y2)
```

has the same effect as just: `lr.fit(X2, y2)`.
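A sketch that checks this, using `LinearRegression` and two synthetic datasets made up for illustration: fitting on the first dataset and then the second leaves the same parameters as fitting on the second alone.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Two unrelated synthetic datasets with different true coefficients
rng = np.random.default_rng(0)
X1 = rng.normal(size=(30, 2))
y1 = X1 @ np.array([1.0, 1.0])
X2 = rng.normal(size=(30, 2))
y2 = X2 @ np.array([-2.0, 5.0])

# Fit twice: the second call discards what the first one learned
twice = LinearRegression()
twice.fit(X1, y1)
twice.fit(X2, y2)

# Fit once, on the second dataset only
once = LinearRegression().fit(X2, y2)

same = np.allclose(twice.coef_, once.coef_) and np.isclose(
    twice.intercept_, once.intercept_
)
```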

(A few estimators allow `warm_start=True` in the constructor, or `partial_fit(X, y)`, so that we can pick up training where we left off. But we won't cover these.)


### Scikit-Learn Summary Table

Problem | Example | Technique | Create | Fit | Evaluate | Use
------------|---------|-----------|--------|-----|-------|-----
**Unsupervised**|
Clustering  |Customer segmentation|$k$-means|`km = KMeans(n_clusters=2)`|`km.fit(X)`|`km.score(X)`|`km.labels_`
Density estimation|Plotting a distribution smoothly|Kernel density estimation|`kde = KernelDensity()`|`kde.fit(X)`|`kde.score(X)`|`kde.score_samples(new_X)`
Representation learning|Visualising data|Multi-dimensional scaling|`mds = MDS()`|`mds.fit(X)`|(none)|`mds.embedding_`
**Supervised**|
Regression  |Predict car values | Linear regression|`lr = LinearRegression()`|`lr.fit(train_X, train_y)`|`lr.score(test_X, test_y)`|`lr.predict(new_X)`
Classification|Predict customer churn|Support vector machines|`svm = SVC()`|`svm.fit(train_X, train_y)`|`svm.score(test_X, test_y)`|`svm.predict(new_X)`
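
The unsupervised rows of the table can be run end to end as follows. This is only a sketch: the data are two synthetic, well-separated blobs invented for illustration, and `n_init=10` and `random_state=0` are choices made here for reproducibility, not part of the table.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import MDS
from sklearn.neighbors import KernelDensity

# Two well-separated synthetic blobs in 3 dimensions
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(30, 3)),
    rng.normal(5.0, 0.5, size=(30, 3)),
])
new_X = rng.normal(0.0, 0.5, size=(5, 3))

# Clustering: create, fit, use
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(X)
labels = km.labels_                       # one cluster index per row of X

# Density estimation: create, fit, use
kde = KernelDensity()
kde.fit(X)
log_density = kde.score_samples(new_X)    # log-density at each new point

# Representation learning: create, fit, use
mds = MDS(random_state=0)
mds.fit(X)
embedding = mds.embedding_                # 2-D coordinates by default
```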

And here's a decision tree from a Scikit-Learn collaborator. As a decision aid (to help you choose an algorithm) it's of dubious value, but as a summary it's nice!

<img src=img/sklearn-tree.png width=95%>

<font size=1>https://peekaboo-vision.blogspot.com/2013/01/machine-learning-cheat-sheet-for-scikit.html</font>



Finally, this page lists all the main modules and so gives an overview of what is available:

https://scikit-learn.org/stable/modules/classes.html