Data-Preprocessing-with-ColumnTransformer-and-Pipelines

Scikit-learn's Pipeline and ColumnTransformer make it easy to apply transformation rules consistently, which keeps preprocessing code organized and clean.

ColumnTransformer

A ColumnTransformer applies a scikit-learn transformer to only a selected group of columns.

The ColumnTransformer object receives a list of tuples, each composed of a transformer name (your choice), the transformer itself, and the columns to which the transformation applies. The remainder argument specifies what to do with all other columns: "drop" (the default) discards them, and "passthrough" keeps them unchanged.
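As a minimal sketch of the tuple format described above, the following applies one transformer per column group. The DataFrame and its column names ("price", "color", "stock") are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "color": ["red", "blue", "red", "green"],
    "stock": [1, 0, 3, 2],
})

preprocessor = ColumnTransformer(
    transformers=[
        # (name of your choice, transformer, columns to apply it to)
        ("scale_price", StandardScaler(), ["price"]),
        ("encode_color", OneHotEncoder(), ["color"]),
    ],
    remainder="passthrough",  # keep "stock" unchanged instead of dropping it
    sparse_threshold=0,       # always return a dense NumPy array
)

# One scaled column, three one-hot columns, one passthrough column.
X = preprocessor.fit_transform(df)
print(X.shape)
```

The passthrough columns are appended after the transformed ones, so "stock" ends up as the last column of the output.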

ColumnTransformers with Pipelines

The ColumnTransformer is helpful but not sufficient on its own: in many cases, a column needs to be processed in multiple steps.

For example, the numerical feature “price” may require an operation to replace missing (NULL) values with the column mean, a log transformation to make the distribution more symmetric, and standardization to rescale its values to zero mean and unit variance.

With pipelines, we can chain multiple transformers to create a complex process. Because a pipeline made up of transformers behaves like a single transformer (it exposes the same .fit() and .transform() methods), it can be inserted into the ColumnTransformer object.
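The “price” example could be sketched roughly as follows. The data is hypothetical, and using np.log1p through FunctionTransformer for the log step is an assumption made here; any log variant that suits the data would work in its place.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

# Hypothetical toy data with a missing price.
df = pd.DataFrame({
    "price": [10.0, np.nan, 30.0, 1000.0],
    "color": ["red", "blue", "red", "green"],
})

# Three chained steps for one numerical column.
price_pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="mean")),  # fill missing values with the mean
    ("log", FunctionTransformer(np.log1p)),      # assumed log variant: log(1 + x)
    ("scale", StandardScaler()),                 # zero mean, unit variance
])

# The pipeline takes the place of a single transformer in the tuple.
preprocessor = ColumnTransformer(
    transformers=[
        ("price", price_pipeline, ["price"]),
        ("color", OneHotEncoder(), ["color"]),
    ],
    remainder="drop",
    sparse_threshold=0,  # always return a dense NumPy array
)

X = preprocessor.fit_transform(df)
print(X.shape)
```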

You can also put a ColumnTransformer inside a Pipeline, because it is itself a transformer object, and this nesting can go as deep as you need.
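A minimal sketch of the reverse nesting, with a ColumnTransformer as the first step of a Pipeline. The final LogisticRegression step, the data, and the labels are illustrative assumptions; the README itself does not name a model.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data and labels.
df = pd.DataFrame({
    "price": [10.0, 12.0, 11.0, 95.0, 100.0, 90.0],
    "color": ["red", "blue", "red", "green", "green", "blue"],
})
y = [0, 0, 0, 1, 1, 1]

preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["price"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["color"]),
    ]
)

# The ColumnTransformer is just another step in the outer pipeline.
model = Pipeline(steps=[
    ("preprocess", preprocessor),
    ("classify", LogisticRegression()),  # assumed final estimator
])

model.fit(df, y)
predictions = model.predict(df)
print(list(model.named_steps))
```

Keeping preprocessing and model in one object means the same fitted transformations are reused at prediction time.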

The pipeline object has quite an intuitive interface. It accepts a list of tuples, each representing a transformer, with a name of your choice and the transformer object itself. It applies the transformations in the specified order.
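To illustrate the ordering, this small sketch compares a two-step pipeline with the same transformers applied by hand in the same order. The input array and step names are made up for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical single-column input with one missing value.
X = np.array([[1.0], [np.nan], [3.0], [5.0]])

pipe = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="mean")),  # runs first
    ("scale", StandardScaler()),                 # runs second, on the imputed data
])
X_pipe = pipe.fit_transform(X)

# The same two transformers applied manually, in the same order.
X_manual = StandardScaler().fit_transform(
    SimpleImputer(strategy="mean").fit_transform(X)
)

# Step names are stored in the order they were given.
step_names = [name for name, _ in pipe.steps]
print(step_names, np.allclose(X_pipe, X_manual))
```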


Link: https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
