# Part 2: Model Training Workflow (Practical)

The end-to-end model training workflow has four steps:

1. After preprocessing the data, split it into a training set and a test set. The test set is held out for the final evaluation of the (hyperparameter-) tuned model.
2. Choose candidate values (usually a range or list) for each hyperparameter. Then run a grid search (or random search) over the combinations, using cross-validation on the training set to score each candidate.
3. Retrain the model on the whole training set with the best combination of hyperparameters found in the previous step. This model is called the (hyperparameter-) tuned model.
4. Evaluate the tuned model on the test set.

Let's apply this workflow to a regularized `LogisticRegression` model trained on the `titanic` dataset.

In [1]:
import seaborn as sns
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_curve, auc

In [2]:
# Load the Titanic dataset and select the feature columns and the target.

titanic = sns.load_dataset('titanic')
features = ['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked']
target = 'survived'

X = titanic[features]
y = titanic[target]

In [3]:
# Preprocessing for numerical data
numerical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())
])

# Preprocessing for categorical data
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))
])

# Bundle preprocessing for numerical and categorical data
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, ['age', 'sibsp', 'parch', 'fare']),
        ('cat', categorical_transformer, ['pclass', 'sex', 'embarked'])
    ])

# Define the model
model = LogisticRegression(penalty='elasticnet', solver='saga', max_iter=1000)

# Define the hyperparameter grid
param_grid = {
    'C': [0.05, 0.1, 0.15, 0.2],
    'l1_ratio': [0, 0.25, 0.5, 0.75, 1.0],
}

# Initialize GridSearchCV with the model, parameter grid, cross-validation strategy, and scoring metric.
# Here we use the F1 score as the metric.
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=5, n_jobs=-1, verbose=2, scoring='f1')

# Build the pipeline: preprocessing followed by the grid search
pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                           ('grid_search', grid_search)
                          ])
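
In the pipeline above, the preprocessor is fitted once on the full training set before the grid search splits that set into folds, so the imputation and scaling statistics are computed with each fold's validation rows included. The effect is usually small, but a common alternative is to put the whole pipeline inside `GridSearchCV` so that preprocessing is refitted on every fold. The following is a minimal sketch of that arrangement, not the notebook's method. It uses a small synthetic DataFrame as a stand-in for the Titanic data, and the names `toy`, `toy_y`, `full_pipeline`, `nested_grid`, and `nested_search` are hypothetical. Note that the grid keys need a `model__` prefix to reach the estimator inside the pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the Titanic features (assumption).
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    'age': rng.normal(30, 10, n),
    'fare': rng.exponential(30, n),
    'sex': rng.choice(['male', 'female'], n),
})
toy.loc[::15, 'age'] = np.nan  # inject some missing values
# Target loosely tied to 'sex', with about 20% of labels flipped as noise.
toy_y = ((toy['sex'] == 'female') ^ (rng.random(n) < 0.2)).astype(int)

toy_preprocessor = ColumnTransformer(transformers=[
    ('num', Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
                            ('scaler', StandardScaler())]), ['age', 'fare']),
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['sex']),
])

# The estimator is the last pipeline step; the grid search wraps the pipeline.
full_pipeline = Pipeline(steps=[
    ('preprocessor', toy_preprocessor),
    ('model', LogisticRegression(penalty='elasticnet', solver='saga', max_iter=1000)),
])

# Keys are '<step name>__<parameter name>'.
nested_grid = {
    'model__C': [0.05, 0.1],
    'model__l1_ratio': [0, 0.5, 1.0],
}

nested_search = GridSearchCV(full_pipeline, param_grid=nested_grid, cv=5, scoring='f1')
nested_search.fit(toy, toy_y)
print(nested_search.best_params_)
```

With this layout, `nested_search.predict(...)` runs the refitted preprocessing and model together, so it can be used on raw test rows in the same way as `pipeline.predict(...)` in the notebook.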

In [4]:
# Split the data into training and testing sets.
# Fit the full pipeline (preprocessing and grid search) on the training data.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
pipeline.fit(X_train, y_train)

Fitting 5 folds for each of 20 candidates, totalling 100 fits
[CV] END .................................C=0.05, l1_ratio=0; total time=   0.0s
[CV] END .................................C=0.05, l1_ratio=0; total time=   0.0s
[CV] END .................................C=0.05, l1_ratio=0; total time=   0.0s
[CV] END .................................C=0.05, l1_ratio=0; total time=   0.0s
[CV] END ..............................C=0.05, l1_ratio=0.25; total time=   0.0s
[CV] END .................................C=0.05, l1_ratio=0; total time=   0.0s
[CV] END ...............................C=0.05, l1_ratio=0.5; total time=   0.0s
[CV] END ...............................C=0.05, l1_ratio=0.5; total time=   0.0s
[CV] END ..............................C=0.05, l1_ratio=0.25; total time=   0.0s
[CV] END ..............................C=0.05, l1_ratio=0.25; total time=   0.0s
[CV] END ...............................C=0.05, l1_ratio=1.0; total time=   0.0s
[CV] END ...............................C=0.05,

In [5]:
# Best hyperparameter combination and its mean cross-validated F1 score
best_params = grid_search.best_params_
best_score = grid_search.best_score_

print(f"Best Parameters: {best_params}")
print(f"Best Score: {best_score:.4f}")

Best Parameters: {'C': 0.05, 'l1_ratio': 0}
Best Score: 0.7304
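
Step 3 of the workflow has no cell of its own because `GridSearchCV` defaults to `refit=True`: once the search finishes, it retrains the best candidate on the whole training set and exposes it as `best_estimator_`. The sketch below illustrates this on synthetic data from `make_classification`. The data and the names `X_syn`, `refit_search`, and `tuned_model` are hypothetical and not part of the notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption, not the Titanic dataset).
X_syn, y_syn = make_classification(n_samples=400, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=0)

refit_search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={'C': [0.05, 0.1, 0.2]},
    cv=5,
    scoring='f1',
)
refit_search.fit(X_tr, y_tr)

# The tuned model: best candidate retrained on all of X_tr.
tuned_model = refit_search.best_estimator_

# Calling predict on the search object delegates to the tuned model.
same_predictions = np.array_equal(refit_search.predict(X_te), tuned_model.predict(X_te))
print(refit_search.refit, same_predictions)
```

This is why calling `pipeline.predict(X_test)` after fitting already uses the tuned model.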


In [6]:
# Display a list of hyperparameters tested in the grid search alongside their corresponding mean test scores.

list(zip(grid_search.cv_results_['params'], grid_search.cv_results_['mean_test_score']))

[({'C': 0.05, 'l1_ratio': 0}, 0.730435877443761),
 ({'C': 0.05, 'l1_ratio': 0.25}, 0.7235263118758264),
 ({'C': 0.05, 'l1_ratio': 0.5}, 0.7190841792874277),
 ({'C': 0.05, 'l1_ratio': 0.75}, 0.7046556679985212),
 ({'C': 0.05, 'l1_ratio': 1.0}, 0.70865096096225),
 ({'C': 0.1, 'l1_ratio': 0}, 0.7233268233512049),
 ({'C': 0.1, 'l1_ratio': 0.25}, 0.7278184801428662),
 ({'C': 0.1, 'l1_ratio': 0.5}, 0.7181125781333633),
 ({'C': 0.1, 'l1_ratio': 0.75}, 0.717798826170872),
 ({'C': 0.1, 'l1_ratio': 1.0}, 0.7165292219886464),
 ({'C': 0.15, 'l1_ratio': 0}, 0.7240515583534956),
 ({'C': 0.15, 'l1_ratio': 0.25}, 0.724163307279177),
 ({'C': 0.15, 'l1_ratio': 0.5}, 0.7212954749011656),
 ({'C': 0.15, 'l1_ratio': 0.75}, 0.7190109473829932),
 ({'C': 0.15, 'l1_ratio': 1.0}, 0.7215808080166874),
 ({'C': 0.2, 'l1_ratio': 0}, 0.7227446128717927),
 ({'C': 0.2, 'l1_ratio': 0.25}, 0.721574310515423),
 ({'C': 0.2, 'l1_ratio': 0.5}, 0.7225518925518926),
 ({'C': 0.2, 'l1_ratio': 0.75}, 0.7212954749011656),
 ({'C': 
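
The mean scores listed above differ only in the second or third decimal place, so it helps to view them next to their fold-to-fold standard deviation. One way to do that is to load `cv_results_` into a pandas DataFrame and sort by rank. The sketch below uses synthetic data and a smaller grid as assumptions, and the names `demo_search` and `summary` are hypothetical. The column names `mean_test_score`, `std_test_score`, and `rank_test_score` are standard keys of `cv_results_`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption, not the Titanic dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)

demo_search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={'C': [0.05, 0.1, 0.15, 0.2]},
    cv=5,
    scoring='f1',
)
demo_search.fit(X_demo, y_demo)

# One row per candidate, best-ranked candidate first.
results = pd.DataFrame(demo_search.cv_results_)
summary = (results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
           .sort_values('rank_test_score')
           .reset_index(drop=True))
print(summary)
```

If the gap between two candidates is smaller than their standard deviations, the ranking between them should be read with caution.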

In [7]:
# Generate predictions on the test set and calculate classification metrics (Accuracy, Precision, Recall, F1),
# as well as the ROC curve and AUC score using predicted probabilities.

preds = pipeline.predict(X_test)

accuracy = accuracy_score(y_test, preds)
precision = precision_score(y_test, preds)
recall = recall_score(y_test, preds)
f1 = f1_score(y_test, preds)
fpr, tpr, thresholds = roc_curve(y_test, pipeline.predict_proba(X_test)[:,1])
roc_auc = auc(fpr, tpr)

In [8]:
print(f'Accuracy: {accuracy:.2f}')
print(f'Precision: {precision:.2f}')
print(f'Recall: {recall:.2f}')
print(f'F1 Score: {f1:.2f}')
print(f"AUC: {roc_auc:.2f}")

Accuracy: 0.79
Precision: 0.75
Recall: 0.70
F1 Score: 0.72
AUC: 0.87
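
Precision, recall, and F1 are all derived from the four counts of a confusion matrix. The sketch below shows the relationship on a small set of hand-made labels. The labels and the names `y_true_demo` and `y_pred_demo` are hypothetical and are not the notebook's test-set predictions.

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

# Hand-made labels for illustration only (assumption).
y_true_demo = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred_demo = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision_manual = tp / (tp + fp)  # share of predicted positives that are correct
recall_manual = tp / (tp + fn)     # share of actual positives that are found
f1_manual = 2 * precision_manual * recall_manual / (precision_manual + recall_manual)

print(f'TN={tn} FP={fp} FN={fn} TP={tp}')
print(f'Precision: {precision_manual:.2f}, Recall: {recall_manual:.2f}, F1: {f1_manual:.2f}')
# Prints:
# TN=4 FP=1 FN=1 TP=4
# Precision: 0.80, Recall: 0.80, F1: 0.80
```

The same call, `confusion_matrix(y_test, preds)`, can be applied to the notebook's test-set predictions to see which kind of error drives the precision and recall reported above.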
