# API Design


Interface    | Description
-------------|--------
Estimators   | Any object that can estimate some parameters based on a dataset. Needs a ```fit()``` method.
Transformers | Estimators that can also transform a dataset. They need a ```transform()``` method. The convenience method ```fit_transform()``` is equivalent to calling ```fit()``` and then ```transform()``` (although ```fit_transform()``` is sometimes optimized and runs much faster).
Predictors   | Estimators capable of making predictions given a dataset. E.g.: a ```LinearRegression``` model. Has a ```predict()``` method and a ```score()``` method that measures the quality of the predictions given a test set.


Note that scikit-learn doesn't have a strict subclass hierarchy; instead it relies on duck typing: every class that has a ```fit()``` method (and some other less important methods, not described here) is effectively an Estimator. The same goes for Transformers with ```transform()```, Predictors with ```predict()```, etc.
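
To make the duck-typing idea concrete, here is a minimal sketch in plain Python. The class ```MeanCenterer``` and the helper ```has_method()``` are hypothetical names invented for this illustration; they are not part of scikit-learn, and the check shown is not how scikit-learn validates objects internally.

```python
class MeanCenterer:
    """Hypothetical transformer: no base class, only the expected methods."""

    def fit(self, X, y=None):
        # Estimate one parameter per column: its mean.
        n_rows = len(X)
        self.means_ = [sum(col) / n_rows for col in zip(*X)]
        return self  # returning self allows fit(X).transform(X) chaining

    def transform(self, X):
        # Subtract the learned column means from every row.
        return [[v - m for v, m in zip(row, self.means_)] for row in X]

    def fit_transform(self, X, y=None):
        # Unoptimized default: simply fit, then transform.
        return self.fit(X, y).transform(X)


def has_method(obj, name):
    """Illustrative duck-typing check (an assumption, not scikit-learn code)."""
    return callable(getattr(obj, name, None))


X = [[1.0, 10.0], [3.0, 30.0]]
centerer = MeanCenterer()
X_centered = centerer.fit_transform(X)

print(X_centered)                        # [[-1.0, -10.0], [1.0, 10.0]]
print(has_method(centerer, "fit"))       # True: behaves like an Estimator
print(has_method(centerer, "predict"))   # False: not a Predictor
```

The object counts as an Estimator and a Transformer purely because of the methods it offers, not because of what it inherits from.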

## Pipelines

Estimators, Transformers and Predictors can be chained into pipelines. Every step except the last must be a Transformer; the last step can be any Estimator.

```python
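from sklearn.impute import SimpleImputer  # replaces Imputer, removed in scikit-learn 0.22
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# CombinedAttributesAdder (a custom transformer) and housing_num (the input
# dataset) are assumed to be defined elsewhere.
num_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy="median")),
    ('attribs_adder', CombinedAttributesAdder()),
    ('std_scaler', StandardScaler()),
])
housing_num_tr = num_pipeline.fit_transform(housing_num)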
```

The pipeline exposes the same methods as the final estimator. In this example, the last estimator is a ```StandardScaler```, which is a transformer, so the pipeline has a ```transform()``` method that applies all the transforms to the data in sequence.
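
The same rule applies when the final step is a Predictor. The sketch below uses a made-up toy dataset and illustrative step names (```reg_pipeline```, ```lin_reg```) to show a pipeline ending in ```LinearRegression```, which therefore offers ```predict()``` and ```score()```.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data that follows y = 2 * x + 1 exactly (invented for illustration).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2 * X.ravel() + 1

reg_pipeline = Pipeline([
    ("std_scaler", StandardScaler()),   # intermediate step: a transformer
    ("lin_reg", LinearRegression()),    # final step: a predictor
])
reg_pipeline.fit(X, y)

# The last step is a predictor, so the pipeline exposes predict() and score().
predictions = reg_pipeline.predict(X)
r2 = reg_pipeline.score(X, y)
```

Calling ```predict()``` on the pipeline first runs the input through each transformer's ```transform()``` and then hands the result to the final step's ```predict()```.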

It's also possible to build compound pipelines (i.e. pipelines made up of different pipelines) using ```FeatureUnion```, which runs each of its transformers on the same input and concatenates their outputs.
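
As a rough sketch of how ```FeatureUnion``` combines pipelines, the example below joins two single-step pipelines on a small made-up array. The names ```std_pipeline```, ```minmax_pipeline``` and ```full_pipeline``` are illustrative choices, not taken from the example above.

```python
import numpy as np
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Small invented dataset: 3 rows, 2 features.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

std_pipeline = Pipeline([("std_scaler", StandardScaler())])
minmax_pipeline = Pipeline([("minmax_scaler", MinMaxScaler())])

# Each sub-pipeline receives the same X; their outputs are stacked side by side.
full_pipeline = FeatureUnion(transformer_list=[
    ("std", std_pipeline),
    ("minmax", minmax_pipeline),
])
X_combined = full_pipeline.fit_transform(X)

print(X_combined.shape)  # (3, 4): 2 standardized columns + 2 min-max columns
```

Here the first two output columns come from the standardizing pipeline and the last two from the min-max pipeline, in the order given in ```transformer_list```.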