

# Project 5 Instructions - Ensemble ML, Spiral (Wine)
**Author:** AARON 
**Date:** November 19, 2025 
**Objective:** Gain an understanding of ensemble model collections. Evaluate two collections of models and document performance metrics.



## Introduction
- This notebook evaluates two ensemble approaches, Gradient Boosting and a Voting Classifier, on the red wine quality dataset and documents their performance metrics.


## Section 1. Import and Inspect the Data
 

### 1.1 Include Imports

In [2]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.ensemble import (
    RandomForestClassifier,
    AdaBoostClassifier,
    GradientBoostingClassifier,
    BaggingClassifier,
    VotingClassifier,
)
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
)


### 1.2 Load the dataset and display basic information

In [3]:

# Load the dataset (download from UCI and save in the same folder)
df = pd.read_csv("winequality-red.csv", sep=";")

# Display structure and first few rows
df.info()
df.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1599 entries, 0 to 1598
Data columns (total 12 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   fixed acidity         1599 non-null   float64
 1   volatile acidity      1599 non-null   float64
 2   citric acid           1599 non-null   float64
 3   residual sugar        1599 non-null   float64
 4   chlorides             1599 non-null   float64
 5   free sulfur dioxide   1599 non-null   float64
 6   total sulfur dioxide  1599 non-null   float64
 7   density               1599 non-null   float64
 8   pH                    1599 non-null   float64
 9   sulphates             1599 non-null   float64
 10  alcohol               1599 non-null   float64
 11  quality               1599 non-null   int64  
dtypes: float64(11), int64(1)
memory usage: 150.0 KB


| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.4 | 0.7 | 0.0 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |
| 1 | 7.8 | 0.88 | 0.0 | 2.6 | 0.098 | 25.0 | 67.0 | 0.9968 | 3.2 | 0.68 | 9.8 | 5 |
| 2 | 7.8 | 0.76 | 0.04 | 2.3 | 0.092 | 15.0 | 54.0 | 0.997 | 3.26 | 0.65 | 9.8 | 5 |
| 3 | 11.2 | 0.28 | 0.56 | 1.9 | 0.075 | 17.0 | 60.0 | 0.998 | 3.16 | 0.58 | 9.8 | 6 |
| 4 | 7.4 | 0.7 | 0.0 | 1.9 | 0.076 | 11.0 | 34.0 | 0.9978 | 3.51 | 0.56 | 9.4 | 5 |


## Section 2. Data Preparation


In [4]:
# Define helper function that:

# Takes one input, the quality (which we will temporarily name q while in the function)
# And returns a string of the quality label (low, medium, high)
# This function will be used to create the quality_label column
def quality_to_label(q):
    if q <= 4:
        return "low"
    elif q <= 6:
        return "medium"
    else:
        return "high"


# Call the apply() method on the quality column to create the new quality_label column
df["quality_label"] = df["quality"].apply(quality_to_label)


# Then, create a numeric column for modeling: 0 = low, 1 = medium, 2 = high
def quality_to_number(q):
    if q <= 4:
        return 0
    elif q <= 6:
        return 1
    else:
        return 2


df["quality_numeric"] = df["quality"].apply(quality_to_number)
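The two helpers above repeat the same thresholds. As a minimal sketch (not the notebook's method), the numeric code can instead be looked up from the label, so the cut-offs live in one place. `LABEL_TO_NUMBER` and `sample_qualities` are hypothetical names introduced for illustration; in the notebook, the equivalent step would be `df["quality_label"].map(LABEL_TO_NUMBER)`.

```python
def quality_to_label(q):
    """Bin a raw quality score into low (<= 4), medium (5-6), or high (>= 7)."""
    if q <= 4:
        return "low"
    elif q <= 6:
        return "medium"
    return "high"


# Hypothetical lookup table: the only place the label -> number mapping is defined
LABEL_TO_NUMBER = {"low": 0, "medium": 1, "high": 2}

# Illustrative scores, not rows taken from the dataset
sample_qualities = [3, 4, 5, 6, 7, 8]
labels = [quality_to_label(q) for q in sample_qualities]
numbers = [LABEL_TO_NUMBER[label] for label in labels]

print(labels)
print(numbers)
```

With this layout, changing a threshold in `quality_to_label` automatically carries over to the numeric column.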


## Section 3. Feature Selection and Justification



### 3.1 Define X and y



In [5]:
# Define input features (X) and target (y)
# Features: all columns except 'quality', 'quality_label', and 'quality_numeric' (drop these from the input array)
# Target: quality_numeric (the numeric version of the label we just created)
X = df.drop(columns=["quality", "quality_label", "quality_numeric"])  # Features
y = df["quality_numeric"]  # Target

### Reflection 3:
- I bin the numeric quality scores into low, medium, and high categories. This lets every model focus on three target classes instead of the individual scores, which should help limit overfitting and give the classifier better-defined class boundaries.

## Section 4. Split the Data into Train and Test
 


In [6]:
# Train/test split (stratify to preserve class balance)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
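To illustrate what `stratify=y` is for, here is a simplified, pure-Python sketch of a stratified split. It is not scikit-learn's implementation (which uses its own shuffling and rounding rules); `stratified_split` and the class counts in `y_demo` are made up for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with roughly test_size of each class in the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # shuffle within each class
        n_test = round(len(indices) * test_size)  # per-class share of the test set
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up, imbalanced labels: 50 low (0), 800 medium (1), 150 high (2)
y_demo = [0] * 50 + [1] * 800 + [2] * 150
train_idx, test_idx = stratified_split(y_demo)

train_counts = Counter(y_demo[i] for i in train_idx)
test_counts = Counter(y_demo[i] for i in test_idx)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

Because each class is split separately, even the smallest class keeps the same 80/20 ratio in both sets, which a purely random split does not guarantee.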

## Section 5.  Evaluate Model Performance (Choose 2)



In [7]:
# Helper function to train and evaluate models
def evaluate_model(name, model, X_train, y_train, X_test, y_test, results):
    model.fit(X_train, y_train)

    y_train_pred = model.predict(X_train)
    y_test_pred = model.predict(X_test)

    train_acc = accuracy_score(y_train, y_train_pred)
    test_acc = accuracy_score(y_test, y_test_pred)
    train_f1 = f1_score(y_train, y_train_pred, average="weighted")
    test_f1 = f1_score(y_test, y_test_pred, average="weighted")

    print(f"\n{name} Results")
    print("Confusion Matrix (Test):")
    print(confusion_matrix(y_test, y_test_pred))
    print(f"Train Accuracy: {train_acc:.4f}, Test Accuracy: {test_acc:.4f}")
    print(f"Train F1 Score: {train_f1:.4f}, Test F1 Score: {test_f1:.4f}")

    results.append(
        {
            "Model": name,
            "Train Accuracy": train_acc,
            "Test Accuracy": test_acc,
            "Train F1": train_f1,
            "Test F1": test_f1,
        }
    )

    

### 5.1a Gradient Boosting (100)
- Below I run the Gradient Boosting Classifier with n_estimators at 100 and learning_rate at 0.1


In [20]:
# Gradient Boosting 100, 0.1

results = []
evaluate_model(
    "Gradient Boosting (100, 0.1)",
    GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
    ),
    X_train,
    y_train,
    X_test,
    y_test,
    results,
)



Gradient Boosting (100, 0.1) Results
Confusion Matrix (Test):
[[  0  13   0]
 [  3 247  14]
 [  0  16  27]]
Train Accuracy: 0.9601, Test Accuracy: 0.8562
Train F1 Score: 0.9584, Test F1 Score: 0.8411
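
To make the `average="weighted"` setting in `evaluate_model` concrete, the sketch below recomputes test accuracy and weighted F1 by hand from the confusion matrix printed above. It uses only the reported matrix; the variable names are illustrative.

```python
# Test confusion matrix reported above (rows = true class, columns = predicted class)
cm = [
    [0, 13, 0],    # true low
    [3, 247, 14],  # true medium
    [0, 16, 27],   # true high
]

n_classes = len(cm)
total = sum(sum(row) for row in cm)
support = [sum(row) for row in cm]  # number of true samples per class
predicted = [sum(cm[r][c] for r in range(n_classes)) for c in range(n_classes)]

# Accuracy = correct predictions (the diagonal) / all predictions
accuracy = sum(cm[i][i] for i in range(n_classes)) / total

f1_per_class = []
for i in range(n_classes):
    denom = support[i] + predicted[i]
    # F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (support + predicted); 0 when undefined
    f1_per_class.append(2 * cm[i][i] / denom if denom else 0.0)

# Weighted F1 = per-class F1 averaged with the class supports as weights
weighted_f1 = sum(f * s for f, s in zip(f1_per_class, support)) / total

print(f"Accuracy:    {accuracy:.4f}")
print(f"Weighted F1: {weighted_f1:.4f}")
print("Per-class F1:", [round(f, 4) for f in f1_per_class])
```

The per-class values show that the weighted score is dominated by the large "medium" class, while the "low" class contributes an F1 of zero in this run.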


### 5.1b Gradient Boosting (175)
- Below I run the Gradient Boosting Classifier with n_estimators at 175 and learning_rate at 0.03.

In [21]:
# Gradient Boosting 175, 0.03


evaluate_model(
    "Gradient Boosting (175, 0.03)",
    GradientBoostingClassifier(
        n_estimators=175, learning_rate=0.03, max_depth=3, random_state=42
    ),
    X_train,
    y_train,
    X_test,
    y_test,
    results,
)


Gradient Boosting (175, 0.03) Results
Confusion Matrix (Test):
[[  1  12   0]
 [  2 250  12]
 [  0  16  27]]
Train Accuracy: 0.9281, Test Accuracy: 0.8688
Train F1 Score: 0.9222, Test F1 Score: 0.8546


### 5.1c Reflection on Gradient Boosting
- Increasing n_estimators to 175 and decreasing learning_rate to 0.03 produced a clear improvement. Test Accuracy and Test F1 Score each improved by more than 1 percentage point, the model correctly predicted one "low" sample (up from none), and Train Accuracy and Train F1 Score each dropped by more than 3 percentage points.
- Test accuracy rose while the gap between Train and Test Accuracy narrowed from about 10 to about 6 percentage points, which indicates less overfitting.
- This is a useful lesson to keep in mind: adjust the model's parameters to find a good balance between training fit and test performance.
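
The arithmetic behind this reflection can be checked directly from the printed outputs. The sketch below uses only the reported accuracies and confusion matrices from 5.1a and 5.1b; the dictionary names are illustrative.

```python
# Accuracies as printed in 5.1a and 5.1b (rounded to four decimals)
runs = {
    "GB (100, 0.1)": {"train_acc": 0.9601, "test_acc": 0.8562},
    "GB (175, 0.03)": {"train_acc": 0.9281, "test_acc": 0.8688},
}

# Correct test predictions = diagonal of each reported confusion matrix
correct = {
    "GB (100, 0.1)": 0 + 247 + 27,
    "GB (175, 0.03)": 1 + 250 + 27,
}
n_test = 320  # sum of all entries in either confusion matrix

gaps = {}
for name, metrics in runs.items():
    gaps[name] = metrics["train_acc"] - metrics["test_acc"]
    print(f"{name}: train-test gap = {gaps[name]:.4f}, correct = {correct[name]}/{n_test}")

extra_correct = correct["GB (175, 0.03)"] - correct["GB (100, 0.1)"]
print(f"Additional correct test predictions: {extra_correct}")
```

On a 320-sample test set, the gain corresponds to a handful of additional correct predictions, so it is worth confirming with other seeds or cross-validation before treating it as a stable improvement.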

### 5.2a Voting (RF 100 + LR 1K + KNN)

- Below I run the Voting Classifier with RandomForest n_estimators at 100, LogisticRegression max_iter at 1000, and voting set to "soft".

In [22]:
# Voting Classifier (RF 100, LR 1K, KNN) 
voting2 = VotingClassifier(
    estimators=[
        ("RF", RandomForestClassifier(n_estimators=100)),
        ("LR", LogisticRegression(max_iter=1000)),
        ("KNN", KNeighborsClassifier()),
    ],
    voting="soft",
)
evaluate_model(
    "Voting (RF 100 + LR 1K + KNN)", voting2, X_train, y_train, X_test, y_test, results
)


Voting (RF 100 + LR 1K + KNN) Results
Confusion Matrix (Test):
[[  0  13   0]
 [  0 258   6]
 [  0  27  16]]
Train Accuracy: 0.9179, Test Accuracy: 0.8562
Train F1 Score: 0.9010, Test F1 Score: 0.8236


STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT

Increase the number of iterations to improve the convergence (max_iter=1000).
You might also want to scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
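
The warning above suggests scaling the data. A minimal sketch of that suggestion, assuming scikit-learn's `make_pipeline` and `StandardScaler`, is shown below on synthetic data (the feature values and `y_demo` are made up, not the wine dataset). Whether this changes the metrics reported in this notebook has not been tested here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic features on very different scales (illustrative only)
X_demo = np.column_stack(
    [
        rng.normal(8.0, 1.7, 300),      # values around 8
        rng.normal(46.0, 33.0, 300),    # values in the tens
        rng.normal(0.997, 0.002, 300),  # values near 1 with tiny spread
    ]
)
# Made-up binary target driven by the second feature
y_demo = (X_demo[:, 1] > 46.0).astype(int)

# Standardize each feature to zero mean and unit variance before fitting
scaled_lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_lr.fit(X_demo, y_demo)

print("Training accuracy:", scaled_lr.score(X_demo, y_demo))
print("Solver iterations:", scaled_lr[-1].n_iter_)
```

A pipeline like `scaled_lr` could be passed as the `"LR"` entry of the `VotingClassifier`, so only the logistic regression sees scaled inputs; KNN is also distance-based and may benefit from the same treatment.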


### 5.2b Voting (RF 300 + LR 2K + KNN)

- Below I run the Voting Classifier with RandomForest n_estimators at 300, LogisticRegression max_iter at 2000, and voting switched from "soft" to "hard".

In [23]:
# Voting Classifier (RF - 300, LR 2K, KNN) 

voting3 = VotingClassifier(
    estimators=[
        ("RF", RandomForestClassifier(n_estimators=300)),
        ("LR", LogisticRegression(max_iter=2000)),
        ("KNN", KNeighborsClassifier()),
    ],
    voting="hard",
)
evaluate_model(
    "Voting (RF 300 + LR 2K + KNN)", voting3, X_train, y_train, X_test, y_test, results
)


Voting (RF 300 + LR 2K + KNN) Results
Confusion Matrix (Test):
[[  0  13   0]
 [  1 258   5]
 [  1  23  19]]
Train Accuracy: 0.9124, Test Accuracy: 0.8656
Train F1 Score: 0.8977, Test F1 Score: 0.8391
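
The two voting runs differ in more than their estimator settings: 5.2a uses `voting="soft"` and 5.2b uses `voting="hard"`. The sketch below shows, in plain Python, how the two rules can disagree on a single sample. The probabilities are hypothetical and were not produced by the notebook's models, and ties in the hard vote are not handled specially.

```python
from collections import Counter

# Hypothetical class probabilities [low, medium, high] from three models for ONE sample
probas = {
    "RF": [0.05, 0.50, 0.45],
    "LR": [0.05, 0.55, 0.40],
    "KNN": [0.00, 0.10, 0.90],
}


def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hard voting: each model casts one vote for its most probable class
votes = [argmax(p) for p in probas.values()]
hard_prediction = Counter(votes).most_common(1)[0][0]

# Soft voting: average the probabilities per class, then pick the largest mean
n_models = len(probas)
mean_proba = [sum(p[c] for p in probas.values()) / n_models for c in range(3)]
soft_prediction = argmax(mean_proba)

print("Votes:", votes, "-> hard prediction:", hard_prediction)
print("Mean probabilities:", [round(m, 4) for m in mean_proba], "-> soft prediction:", soft_prediction)
```

Here two models narrowly prefer "medium" while one is very confident in "high", so hard voting picks class 1 and soft voting picks class 2. Because both the voting rule and the estimator settings changed between 5.2a and 5.2b, the difference in results cannot be attributed to either change alone.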


## Section 6. Compare Results

In [27]:
# Create a table of results 
results_df = pd.DataFrame(results)

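# Test accuracy relative to the first model run (row 0: Gradient Boosting (100, 0.1))
results_df["Test Acc_Diff"] = results_df["Test Accuracy"] - results_df.loc[0, "Test Accuracy"]
# Train-test gaps: larger values suggest more overfitting
results_df["Train-Test Acc_Diff"] = results_df["Train Accuracy"] - results_df["Test Accuracy"]
results_df["Train-Test F1_Diff"] = results_df["Train F1"] - results_df["Test F1"]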
results_df = results_df.sort_values(by="Test Acc_Diff", ascending=False)

print("\nSummary of All Models:")
display(results_df)




Summary of All Models:


| | Model | Train Accuracy | Test Accuracy | Train F1 | Test F1 | Test Acc_Diff | Train-Test Acc_Diff | Train-Test F1_Diff |
|---|---|---|---|---|---|---|---|---|
| 1 | Gradient Boosting (175, 0.03) | 0.928069 | 0.86875 | 0.922212 | 0.854639 | 0.0125 | 0.059319 | 0.067573 |
| 3 | Voting (RF 300 + LR 2K + KNN) | 0.912432 | 0.865625 | 0.897654 | 0.839116 | 0.009375 | 0.046807 | 0.058538 |
| 0 | Gradient Boosting (100, 0.1) | 0.960125 | 0.85625 | 0.95841 | 0.841106 | 0.0 | 0.103875 | 0.117304 |
| 2 | Voting (RF 100 + LR 1K + KNN) | 0.917905 | 0.85625 | 0.901043 | 0.823627 | 0.0 | 0.061655 | 0.077416 |
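
For context when reading the table, it can help to compare each test accuracy with a majority-class baseline. The sketch below derives the test-set class counts from the row sums of the confusion matrices reported in Section 5; the baseline itself is an added point of comparison, not something the notebook computed.

```python
# Test-set class counts = row sums of any test confusion matrix in Section 5
support = {"low": 13, "medium": 264, "high": 43}
n_test = sum(support.values())

# Baseline: always predict the most frequent class
majority_class = max(support, key=support.get)
baseline_accuracy = support[majority_class] / n_test

# Test accuracies from the summary table
test_accuracy = {
    "Gradient Boosting (175, 0.03)": 0.86875,
    "Voting (RF 300 + LR 2K + KNN)": 0.865625,
    "Gradient Boosting (100, 0.1)": 0.85625,
    "Voting (RF 100 + LR 1K + KNN)": 0.85625,
}

print(f"Baseline (always '{majority_class}'): {baseline_accuracy:.4f}")
lift = {}
for name, acc in test_accuracy.items():
    lift[name] = acc - baseline_accuracy
    print(f"{name}: {acc:.4f} ({lift[name]:+.4f} vs. baseline)")
```

Since the "medium" class makes up most of the test set, accuracy alone can look strong; the per-class rows of the confusion matrices show how each model handles the "low" and "high" classes.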


Dan Miller's reference metrics notebook:
https://github.com/DMill31/applied-ml-miller/blob/main/notebooks/project05/ensemble-miller.ipynb

Megan Chastain
https://github.com/Megan-Chastain1/applied-ml-Chastain/blob/main/docs/project05/%20ensemble-Chastain.ipynb