<div class="alert alert-block alert-success">
    <h1 align="center">Scikit-Learn Tips</h1>
    <h3 align="center">Tip 09 : Passthrough or Drop</h3>
    <h4 align="center"><a href="http://www.iran-machinelearning.ir">Soheil Tehranipour</a></h4>
</div>

In a ColumnTransformer, you can use the strings 'passthrough' and 'drop' in place of a transformer. This is useful when you need to pass some columns through unchanged and drop others!

See example 👇

In [1]:
import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.compose import make_column_transformer

In [2]:
impute = SimpleImputer()

In [3]:
X = pd.DataFrame({'A':[1, 2, np.nan],
                  'B':[10, 20, 30],
                  'C':[100, 200, 300],
                  'D':[1000, 2000, 3000],
                  'E':[10000, 20000, 30000]})

In [4]:
X

     A   B    C     D      E
0  1.0  10  100  1000  10000
1  2.0  20  200  2000  20000
2  NaN  30  300  3000  30000


In [5]:
# impute A, passthrough B & C, then drop the remaining columns
ct = make_column_transformer(
    (impute, ['A']),
    ('passthrough', ['B', 'C']),
    remainder='drop')

In [6]:
ct.fit_transform(X)

array([[  1. ,  10. , 100. ],
       [  2. ,  20. , 200. ],
       [  1.5,  30. , 300. ]])
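
The same idea can be written with the `ColumnTransformer` class itself rather than the `make_column_transformer` helper. The sketch below is a minimal illustration, not the only way to do it: the step names `'impute'` and `'keep'` are hypothetical labels picked for this example, and `remainder='drop'` is spelled out even though it is the default.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer

X = pd.DataFrame({'A': [1, 2, np.nan],
                  'B': [10, 20, 30],
                  'C': [100, 200, 300],
                  'D': [1000, 2000, 3000],
                  'E': [10000, 20000, 30000]})

# Each entry is (name, transformer, columns).
# The names 'impute' and 'keep' are hypothetical labels for this sketch.
ct_explicit = ColumnTransformer(
    transformers=[
        ('impute', SimpleImputer(), ['A']),    # fill the missing value in A with the column mean
        ('keep', 'passthrough', ['B', 'C']),   # the string stands in for a transformer
    ],
    remainder='drop')                          # 'drop' is the default, shown here for clarity

out_explicit = ct_explicit.fit_transform(X)
print(out_explicit)
```

With the class you choose the step names yourself, while `make_column_transformer` generates them for you.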

In [7]:
# impute A, drop D & E, then passthrough the remaining columns
ct = make_column_transformer(
    (impute, ['A']),
    ('drop', ['D', 'E']),
    remainder='passthrough')

In [8]:
ct.fit_transform(X)

array([[  1. ,  10. , 100. ],
       [  2. ,  20. , 200. ],
       [  1.5,  30. , 300. ]])
