# Learning About Pipelining
- Pipelining is a way to streamline our ML workflow
- It lets us chain steps, such as preprocessing followed by model fitting, into a single object
- Below, we'll first look at a simple example of how to implement a pipeline

In [13]:
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn import set_config

In [14]:
# We start by defining the steps we want the pipeline to follow.
# Each step is a (name, estimator) tuple: a string label paired with a transformer or estimator object.
steps = [('Standard Scaler', StandardScaler()),
         ('Classifier', LogisticRegression())]

# We name the steps instead of passing bare transformers and estimators because
# otherwise we would have to refer to them by index. In a pipeline with many steps,
# it would be hard to keep track of which step is which.

In [15]:
pipe = Pipeline(steps)  # build the Pipeline from the steps (nothing is fitted yet)
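
To make the benefit of naming the steps concrete, here is a minimal sketch of looking steps up by name. It rebuilds the same two-step pipeline so it can run on its own; the value `C=0.5` is an arbitrary choice for illustration and does not come from the notebook.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same step names as in the notebook
steps = [('Standard Scaler', StandardScaler()),
         ('Classifier', LogisticRegression())]
pipe = Pipeline(steps)

# Look up a step by its name instead of its position
scaler = pipe.named_steps['Standard Scaler']
clf = pipe['Classifier']  # indexing the pipeline by name also works

# Positional access returns the same object, but is harder to read in long pipelines
same_object = pipe.steps[1][1] is clf

# Names also let us set a step's parameter as '<step name>__<parameter>'
pipe.set_params(Classifier__C=0.5)  # C=0.5 is an illustrative value only
print(clf.C)
```

The `<step name>__<parameter>` form is also the one used when tuning pipeline parameters, which is one reason step names may not contain a double underscore.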

In [16]:
set_config(display='diagram')  # display estimators as a diagram so we can visualize the pipeline's steps

In [17]:
pipe

In [18]:
from sklearn.datasets import make_classification  # used to generate a random classification dataset

In [19]:
X, y = make_classification()  # defaults: 100 samples, 20 features; no random_state, so the data differs on each run

In [20]:
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.3)

In [21]:
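pipe.fit(X_train, y_train)     # fits the scaler on the training data, then the classifier on the scaled data
y_pred = pipe.predict(X_test)  # scales X_test with the already-fitted scaler, then predicts
y_pred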

array([0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1,
       1, 0, 1, 0, 0, 0, 0, 0])
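
To see what the pipeline is doing for us, the sketch below runs the same two steps by hand and compares the predictions with the pipeline's. It is a self-contained illustration, not part of the original notebook run: `random_state=0` in `make_classification` is an assumption added so the sketch is repeatable, so its data will not match the output shown above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# random_state=0 is an assumption added here for repeatability
X, y = make_classification(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=42, test_size=0.3)

# Pipeline version: one fit call and one predict call
pipe = Pipeline([('Standard Scaler', StandardScaler()),
                 ('Classifier', LogisticRegression())])
pipe.fit(X_train, y_train)
pipe_pred = pipe.predict(X_test)

# Manual version: each step is handled explicitly
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training data only
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics on the test data
clf = LogisticRegression()
clf.fit(X_train_scaled, y_train)
manual_pred = clf.predict(X_test_scaled)

# Check whether both approaches give the same predictions
print(np.array_equal(pipe_pred, manual_pred))

# Mean accuracy of the pipeline on the held-out test set
accuracy = pipe.score(X_test, y_test)
print(accuracy)
```

With the pipeline, the scaler is fitted on the training data only and then reused at prediction time, which is easy to get wrong when the steps are written out by hand.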