# Estimators and Transformers
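Estimator is the name sklearn users give to the ML models in sklearn. So all your ML algorithms, such as LinearRegression, GaussianNB, RandomForestClassifier, etc., are estimators. (Strictly speaking, sklearn's documentation calls any object that learns from data with fit() an estimator and calls the ones that make predictions predictors; this chapter uses "estimator" in the narrower sense of a model that makes predictions.)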

Apart from estimators, sklearn also provides a lot of transformers. Transformers are responsible for transforming the input into another form, quite literally. We have already seen a few transformers in the course. Do you remember any of them? LabelEncoder, CountVectorizer, TfidfVectorizer, etc. are all transformers.

Estimators and transformers might look very similar, but they do different jobs.

| Transformer | Estimator |
| --- | --- |
| Used for data preparation | Used for modeling and making predictions |
| fit() method: finds parameters from the training data (if needed) | fit() method: finds parameters from the training data |
| transform() method: applied to training or test data | predict() method: applied to training or test data |
| E.g. LabelEncoder, MinMaxScaler, etc. | E.g. LinearRegression, DecisionTreeClassifier, etc. |


Now that you know this subtle but important difference between estimators and transformers in sklearn, it becomes very easy to remember the API. It all looks intuitive and obvious.
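To see the two APIs side by side, here is a minimal sketch that fits a MinMaxScaler (a transformer) and a LinearRegression (an estimator) on a tiny toy dataset. The data and variable names are made up for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Toy data (an assumption for this sketch); it lies on the line y = 0.8x + 1.3
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([2.1, 2.9, 3.7, 4.5, 5.3])
X_test = np.array([[2.5]])

# Transformer: fit() learns the min and max, transform() rescales the data
scaler = MinMaxScaler()
scaler.fit(X_train)                        # learns min=1 and max=5
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)   # (2.5 - 1) / (5 - 1) = 0.375

# Estimator: fit() learns the coefficients, predict() makes predictions
model = LinearRegression()
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)      # approximately 3.3

print(X_test_scaled, y_pred)
```

Notice that the scaler is fitted on the training data only and then reused, unchanged, on the test data.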


In [4]:
from sklearn.preprocessing import PolynomialFeatures, PowerTransformer, LabelBinarizer, LabelEncoder, MinMaxScaler, MultiLabelBinarizer
from sklearn.impute import SimpleImputer
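As a short sketch of how two of these imported transformers might be used, the snippet below fills a missing value with SimpleImputer and encodes string labels with LabelEncoder. The `ages` and `colors` values are invented for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

# Hypothetical feature column with one missing value
ages = np.array([[20.0], [np.nan], [40.0]])
imputer = SimpleImputer(strategy="mean")
ages_filled = imputer.fit_transform(ages)   # NaN becomes the mean of 20 and 40

# Hypothetical string labels; LabelEncoder is intended for target labels
colors = ["red", "green", "red", "blue"]
encoder = LabelEncoder()
codes = encoder.fit_transform(colors)       # classes are sorted alphabetically

print(ages_filled.ravel())   # 20, 30, 40
print(encoder.classes_)      # blue, green, red
print(codes)                 # 2, 1, 2, 0
```

fit_transform() is simply a shortcut for calling fit() and then transform() on the same data.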

# Pipelines: The Seamless Workflow
Pipelines combine transformers and estimators into a single, streamlined process, ensuring reproducibility and minimizing the risk of data leakage.

# Why Use Pipelines?
* Simplifies code.
* Automates preprocessing and model training.
* Ensures consistent transformations during training and testing.
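To make these points concrete, here is a minimal sketch comparing the manual workflow with a pipeline. It uses LinearRegression and a toy dataset chosen for this illustration, so the results are deterministic and easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (an assumption for this sketch)
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([2.1, 2.9, 3.7, 4.5])
X_test = np.array([[5.0]])

# Manual workflow: we must remember to reuse the fitted scaler on the test data
scaler = StandardScaler().fit(X_train)
model = LinearRegression().fit(scaler.transform(X_train), y_train)
manual_pred = model.predict(scaler.transform(X_test))

# Pipeline workflow: one fit() and one predict() run the same steps in order
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("regressor", LinearRegression()),
])
pipe.fit(X_train, y_train)
pipe_pred = pipe.predict(X_test)

print(np.allclose(manual_pred, pipe_pred))  # True
```

Because the pipeline fits the scaler only inside fit(), the test data never influences the scaling parameters, which is how it helps guard against data leakage.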


__Creating a Pipeline__

Here's an example pipeline for Random Forest Regression:

In [12]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestRegressor


X = [[1], [2], [3], [4], [5]]
y = [2.1, 2.9, 3.7, 4.5, 5.3]

pipeline = Pipeline([
    ('scaler', StandardScaler()),             # Transformer
    ('regressor', RandomForestRegressor())    # Estimator
])

pipeline.fit(X, y)
print(pipeline.predict([[2.5]]))

[3.044]
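The exact number printed will generally differ from run to run, because RandomForestRegressor builds its trees using random sampling. As a hedged sketch, one way to get repeatable results is to pass a `random_state`; the `build_pipeline` helper below is a hypothetical name introduced only for this example. It also shows how a fitted step can be inspected through `named_steps`.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X = [[1], [2], [3], [4], [5]]
y = [2.1, 2.9, 3.7, 4.5, 5.3]


def build_pipeline(seed):
    """Hypothetical helper: same pipeline as above, with a fixed random seed."""
    return Pipeline([
        ("scaler", StandardScaler()),
        ("regressor", RandomForestRegressor(random_state=seed)),
    ])


# Two pipelines built with the same seed grow the same forest
first = build_pipeline(seed=0).fit(X, y)
second = build_pipeline(seed=0).fit(X, y)
pred_first = first.predict([[2.5]])
pred_second = second.predict([[2.5]])
print(pred_first[0] == pred_second[0])

# Each fitted step is accessible by the name given in the pipeline
scaler_mean = first.named_steps["scaler"].mean_   # mean of 1..5 is 3.0
print(scaler_mean)
```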


This chapter bridged the gap between classifiers and regressors, showing how versatile algorithms like KNN, SVM, Decision Trees, and Random Forests can adapt to regression tasks. It also introduced the concept of transformers, estimators, and pipelines, demonstrating how these tools simplify and enhance machine-learning workflows. With practical examples and Python code, you're now equipped to tackle real-world regression problems with confidence and creativity!