**Programmer:** python_scripts (Abhijith Warrier)

**PYTHON SCRIPT TO _BUILD AN END-TO-END ML PIPELINE WITH PREPROCESSING + MODEL USING sklearn.pipeline.Pipeline_. 🐍⚙️🤖**

This script shows how to bundle **feature scaling** and a **classifier** into one reusable workflow using scikit-learn’s Pipeline. We’ll split the data, create a pipeline (**StandardScaler → LogisticRegression**), train it, and evaluate it on a held-out test set, all in a few clean lines.

### 📦 Import Required Libraries

We’ll use the Iris dataset, split utilities, preprocessing, model, and metrics.

In [None]:
# Data & utilities
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Preprocessing & model
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Pipeline & metrics
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, classification_report

### 📥 Load Dataset & Create Train/Test Split

Hold out 25% of the data as a test set to estimate how well the model generalizes.

In [None]:
# Load the Iris dataset (4 features, 3 classes)
iris = load_iris()
X, y = iris.data, iris.target

# Train/test split (stratify=y keeps the class proportions the same in both splits)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
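
As a quick sanity check on the split, the sketch below counts how many samples of each class land in the train and test sets. It reloads Iris so it runs on its own; the names `train_counts` and `test_counts` are illustrative and not part of the original script.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Count samples per class in each split (illustrative variable names)
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("Train class counts:", train_counts)
print("Test class counts: ", test_counts)
```

With stratification, the per-class counts within each split should differ by at most one sample, since Iris has 50 samples per class.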

### 🧱 Build the End-to-End Pipeline

Chain StandardScaler (feature scaling) → LogisticRegression (classifier).

In [None]:
# Define a two-step pipeline: scale features, then fit the classifier
clf = Pipeline(steps=[
    ("scaler", StandardScaler()),                                   # step 1: standardize features
    ("logreg", LogisticRegression(max_iter=200, random_state=42)),  # step 2: classifier
])
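
The step names chosen here also serve as prefixes for nested parameters. The sketch below, which assumes the same two step names, reads and changes the classifier's `C` through the pipeline using the `<step name>__<parameter>` convention; the value `0.5` is arbitrary and only for illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

clf = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(max_iter=200)),
])

# Step names, in execution order
step_names = [name for name, _ in clf.steps]
print("Steps:", step_names)

# Nested parameters are addressed as "<step name>__<parameter>"
print("C before:", clf.get_params()["logreg__C"])
clf.set_params(logreg__C=0.5)  # 0.5 is an arbitrary illustrative value
print("C after: ", clf.named_steps["logreg"].C)
```

The same naming scheme is what tools such as grid search use to address parameters inside a pipeline.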

### 🚀 Train the Pipeline

fit() runs the steps in order: fit the scaler on the training data → transform that data → fit the model on the scaled features.

In [None]:
# Train the full pipeline on the training set
clf.fit(X_train, y_train)
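
To make the order of operations concrete, here is a minimal sketch that performs the same steps by hand and compares the result with the pipeline. It rebuilds the data and pipeline so it runs on its own; `scaler`, `manual_lr`, and `same_coefs` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Pipeline version: one fit() call
clf = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(max_iter=200)),
])
clf.fit(X_train, y_train)

# Manual version of the same steps
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit scaler on train, then transform train
manual_lr = LogisticRegression(max_iter=200)
manual_lr.fit(X_train_scaled, y_train)          # fit classifier on scaled features

same_coefs = np.allclose(clf.named_steps["logreg"].coef_, manual_lr.coef_)
print("Coefficients match:", same_coefs)
```

The pipeline saves the bookkeeping: there is no separate scaler object to remember, and no chance of accidentally fitting it on the wrong data.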

### 🔮 Predict & Evaluate

Use the same pipeline to scale the test data (with the statistics learned from the training set) and predict in one call.

In [None]:
# Predict on test data (transforms + predict happen inside)
y_pred = clf.predict(X_test)

# Evaluate results
acc = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {acc:.4f}\n")
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
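
The single predict() call can be unpacked into its two stages. The sketch below, which refits the pipeline so it is self-contained, spells out the transform and predict steps and also shows that the pipeline's score() gives the same accuracy; `y_pred_manual` and `acc_score` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
clf = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(max_iter=200)),
]).fit(X_train, y_train)

# One call: scale X_test with the training statistics, then predict
y_pred = clf.predict(X_test)

# The same thing spelled out step by step
X_test_scaled = clf.named_steps["scaler"].transform(X_test)  # transform only, no refit
y_pred_manual = clf.named_steps["logreg"].predict(X_test_scaled)
print("Predictions match:", np.array_equal(y_pred, y_pred_manual))

# For a classifier pipeline, score() reports mean accuracy
acc_score = clf.score(X_test, y_test)
print("score() equals accuracy_score:", np.isclose(acc_score, accuracy_score(y_test, y_pred)))
```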

### ✅ (Optional) Access a Step or its Coefficients

Steps are accessible by name (e.g., "logreg").

In [None]:
# Access the logistic regression step and inspect learned coefficients
lr = clf.named_steps["logreg"]
print("Model classes:", lr.classes_)
print("Coef shape:", lr.coef_.shape)  # (n_classes, n_features), i.e. (3, 4) for Iris
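
The preprocessing step can be inspected the same way. As a minimal sketch (refitting the pipeline so it runs standalone), the code below reads the per-feature mean and standard deviation the scaler learned from the training set, and shows that a step can also be fetched by indexing the pipeline with its name.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
clf = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression(max_iter=200)),
]).fit(X_train, y_train)

# Statistics the scaler learned from the training data only
scaler = clf.named_steps["scaler"]
print("Per-feature mean:", scaler.mean_)
print("Per-feature std: ", scaler.scale_)

# Indexing the pipeline by step name returns the same fitted object
print("Same object:", clf["logreg"] is clf.named_steps["logreg"])
```

Because the classifier was trained on standardized features, its coefficients are expressed per standard deviation of each feature rather than in the original units (cm).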