# Step 1: Load the Dataset
First, let's load the dataset from the CSV file.

In [1]:
import pandas as pd

# Load the dataset
file_path = 'winequality-white.csv'
data = pd.read_csv(file_path)

# Display the first few rows of the dataset
print(data.head())

   fixed acidity  volatile acidity  citric acid  residual sugar  chlorides  \
0            7.0              0.27         0.36            20.7      0.045   
1            6.3              0.30         0.34             1.6      0.049   
2            8.1              0.28         0.40             6.9      0.050   
3            7.2              0.23         0.32             8.5      0.058   
4            7.2              0.23         0.32             8.5      0.058   

   free sulfur dioxide  total sulfur dioxide  density    pH  sulphates  \
0                 45.0                 170.0   1.0010  3.00       0.45   
1                 14.0                 132.0   0.9940  3.30       0.49   
2                 30.0                  97.0   0.9951  3.26       0.44   
3                 47.0                 186.0   0.9956  3.19       0.40   
4                 47.0                 186.0   0.9956  3.19       0.40   

   alcohol  quality  
0      8.8        6  
1      9.5        6  
2     10.1        6 

# Step 2: Prepare the Dataset
Prepare the data for modeling by separating the features from the target variable (`quality`) and making a stratified 80/20 split into training and test sets.

In [2]:
from sklearn.model_selection import train_test_split

# Define features (X) and target (y)
X = data.drop('quality', axis=1)
y = data['quality']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
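
The `stratify=y` argument keeps the share of each quality grade roughly equal in the training and test sets, which matters when some grades are rare. A minimal sketch on synthetic labels follows; `X_demo`, `y_demo`, and the helper `class_shares` are illustrative stand-ins, not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels standing in for the wine quality column.
rng = np.random.default_rng(0)
y_demo = rng.choice([5, 6, 7], size=1000, p=[0.3, 0.5, 0.2])
X_demo = rng.normal(size=(1000, 4))

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

def class_shares(labels):
    """Map each class label to its fraction of the given labels."""
    values, counts = np.unique(labels, return_counts=True)
    return dict(zip(values.tolist(), (counts / counts.sum()).tolist()))

train_shares = class_shares(y_tr)
test_shares = class_shares(y_te)
print('train:', train_shares)
print('test: ', test_shares)
```

With stratification, the two dictionaries should agree to within rounding of the split sizes.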

# Step 3: Train MLPClassifier Models
Train three different MLPClassifier models with different architectures and report the validation scores.

In [3]:
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Define different MLP architectures
mlp_1 = MLPClassifier(hidden_layer_sizes=(50,), activation='relu', random_state=42)
mlp_2 = MLPClassifier(hidden_layer_sizes=(100, 50), activation='tanh', random_state=42)
mlp_3 = MLPClassifier(hidden_layer_sizes=(100, 50, 25), activation='logistic', random_state=42)

# Train and evaluate the models using cross-validation
mlp_models = [mlp_1, mlp_2, mlp_3]
mlp_scores = {}

for i, mlp in enumerate(mlp_models, 1):
    scores = cross_val_score(mlp, X_train, y_train, cv=5, scoring='accuracy')
    mlp_scores[f'mlp_{i}'] = scores.mean()
    print(f'MLP Model {i} - Mean CV Accuracy: {scores.mean():.4f}')

best_model_index = max(mlp_scores, key=mlp_scores.get)
print(f'Best MLP Model: {best_model_index} with Accuracy: {mlp_scores[best_model_index]:.4f}')



MLP Model 1 - Mean CV Accuracy: 0.4872
MLP Model 2 - Mean CV Accuracy: 0.5265
MLP Model 3 - Mean CV Accuracy: 0.5120
Best MLP Model: mlp_2 with Accuracy: 0.5265


* Best Model: MLP Model 2, with two hidden layers (100 and 50 neurons) and the tanh activation function. The three candidates differ in activation as well as depth, so the ranking does not isolate the effect of architecture alone.
* Validation Scoring Method: Mean accuracy over 5-fold cross-validation on the training set.
* Convergence Issues: All models reached the maximum number of iterations (scikit-learn's default `max_iter=200`) without converging, indicating a need for further tuning (e.g., increasing the number of iterations or adjusting the learning rate).
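
Non-convergence can be checked programmatically instead of relying on console warnings. The sketch below is an assumption-laden illustration on synthetic data (`X_demo`, `y_demo`, and the helper `fit_and_check` are hypothetical names): it records whether scikit-learn raised a `ConvergenceWarning` and how many iterations were used.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in with 11 features, like the wine data (not the real dataset).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=11, n_informative=6, n_classes=3, random_state=0
)

def fit_and_check(max_iter):
    """Fit an MLP and report (iterations used, whether the iteration cap was hit)."""
    clf = MLPClassifier(hidden_layer_sizes=(100, 50), activation='tanh',
                        max_iter=max_iter, random_state=42)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        clf.fit(X_demo, y_demo)
    hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return clf.n_iter_, hit_limit

n_iter_short, warned_short = fit_and_check(max_iter=5)
n_iter_long, warned_long = fit_and_check(max_iter=1000)
print(f'max_iter=5:    n_iter_={n_iter_short}, hit limit: {warned_short}')
print(f'max_iter=1000: n_iter_={n_iter_long}, hit limit: {warned_long}')
```

If `n_iter_` equals `max_iter` and the warning fired, the optimizer stopped because of the cap, not because the loss stopped improving.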

# Step 4: Hyperparameter Tuning on the Best Model
Pick the best-performing model and study the impact of varying three hyperparameters.

In [4]:
from sklearn.model_selection import GridSearchCV

# Select the best-performing model from the previous step by its key (e.g. 'mlp_2')
best_mlp = mlp_models[int(best_model_index.split('_')[1]) - 1]

# Define the parameter grid for hyperparameter tuning
param_grid = {
    'solver': ['adam', 'sgd'],
    'learning_rate_init': [0.001, 0.01, 0.1],
    'max_iter': [200, 300, 400]
}

# Perform grid search with cross-validation
grid_search = GridSearchCV(best_mlp, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Report the best parameters and the corresponding score
print(f'Best Parameters: {grid_search.best_params_}')
print(f'Best CV Accuracy: {grid_search.best_score_:.4f}')




Best Parameters: {'learning_rate_init': 0.001, 'max_iter': 300, 'solver': 'adam'}
Best CV Accuracy: 0.5291


**Validation Scoring Method**

The validation scoring method used was accuracy, which measures the proportion of correctly predicted instances out of the total instances. Accuracy is a straightforward and widely used metric for classification problems.

**MLP Model 2 Performance**

* Initial Performance: MLP Model 2 had a mean cross-validation accuracy of 0.5265, the best of the three architectures tested.
* Hyperparameter Tuning: The grid search raised the mean cross-validation accuracy to 0.5291, a marginal gain of 0.0026. The selected settings were the adam solver, learning_rate_init of 0.001, and max_iter of 300. The first two are scikit-learn's defaults, so only max_iter changed relative to the initial model. These are the best values within the grid searched, not necessarily the optimum overall.
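
The best parameters alone say little about how much each hyperparameter matters. One way to study the impact is to load `cv_results_` into a DataFrame and average the scores per parameter value. This is a sketch on synthetic data with a deliberately small grid; `X_demo`, `y_demo`, `param_grid_demo`, and `search` are illustrative names, and the same grouping would apply to the notebook's `grid_search`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (not the wine dataset).
X_demo, y_demo = make_classification(
    n_samples=200, n_features=11, n_informative=6, n_classes=3, random_state=0
)

# A reduced grid so the example runs quickly.
param_grid_demo = {
    'solver': ['adam', 'sgd'],
    'learning_rate_init': [0.001, 0.01],
}
search = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(20,), max_iter=50, random_state=42),
    param_grid_demo, cv=3, scoring='accuracy',
)
search.fit(X_demo, y_demo)

# One row per parameter combination; param_* columns hold the grid values.
results = pd.DataFrame(search.cv_results_)
by_solver = results.groupby('param_solver')['mean_test_score'].mean()
by_lr = results.groupby('param_learning_rate_init')['mean_test_score'].mean()
print(by_solver)
print(by_lr)
```

Averaging over the other parameters gives a rough marginal effect for each one; the full `results` table shows interactions.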

**Convergence Issues**

All models reached the maximum number of iterations without converging, as indicated by the convergence warnings. This suggests that further adjustments, such as increasing the number of iterations, adjusting the learning rates, or exploring other optimizers, could potentially improve model performance.
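
One further adjustment, which the notebook does not try, is standardizing the features before they reach the network. This is offered as an assumption, not a verified fix: MLPs are sensitive to feature scale, and the wine columns span very different ranges (density near 1, total sulfur dioxide above 100). The sketch uses synthetic data rescaled to mimic that spread; `X_demo`, `y_demo`, and `scaled_mlp` are hypothetical names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with columns stretched to very different scales.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=11, n_informative=6, n_classes=3, random_state=0
)
X_demo = X_demo * np.logspace(-2, 2, X_demo.shape[1])

# The scaler is fit inside each CV fold, so no test-fold statistics leak in.
scaled_mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(100, 50), activation='tanh',
                  max_iter=500, random_state=42),
)
scaled_scores = cross_val_score(scaled_mlp, X_demo, y_demo, cv=5, scoring='accuracy')
print(f'Scaled MLP - Mean CV Accuracy: {scaled_scores.mean():.4f}')
```

When such a pipeline is passed to `GridSearchCV`, the MLP parameters are addressed with the step prefix, for example `mlpclassifier__max_iter`.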

# Step 5: Evaluate the Best Model with Different Scoring Methods
Test the best model's performance using different scoring methods.

In [5]:
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Get the best model from the grid search
best_mlp_tuned = grid_search.best_estimator_

# Predict on the test set
y_pred = best_mlp_tuned.predict(X_test)

# Evaluate using different scoring methods
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')

print(f'Test Accuracy: {accuracy:.4f}')
print(f'Test Precision: {precision:.4f}')
print(f'Test Recall: {recall:.4f}')

Test Accuracy: 0.5490
Test Precision: 0.5419
Test Recall: 0.5490




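The performance of the best-tuned MLP model on the test set leaves substantial room for improvement. The test accuracy of 54.90% is far better than uniform random guessing, but that is a weak baseline here because the quality grades are imbalanced. A more informative reference is always predicting the most frequent grade: in the standard UCI white-wine data, quality 6 accounts for roughly 45% of samples, so the model improves on that trivial baseline by only about 10 percentage points.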
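
A quick way to put an accuracy figure in context is to compare it with a most-frequent-class baseline. The sketch below uses scikit-learn's `DummyClassifier` on synthetic imbalanced labels; `X_demo`, `y_demo`, and the class proportions are made up for illustration and are not the wine data.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Hypothetical imbalanced labels; features are irrelevant to this baseline.
rng = np.random.default_rng(0)
y_demo = rng.choice([5, 6, 7], size=1000, p=[0.3, 0.45, 0.25])
X_demo = np.zeros((len(y_demo), 1))

# Always predicts the most frequent class seen during fit.
baseline = DummyClassifier(strategy='most_frequent').fit(X_demo, y_demo)
baseline_acc = accuracy_score(y_demo, baseline.predict(X_demo))

# The baseline accuracy is exactly the share of the largest class.
majority_share = np.bincount(y_demo).max() / len(y_demo)
print(f'Baseline accuracy: {baseline_acc:.4f}')
print(f'Majority share:    {majority_share:.4f}')
```

In the notebook, the same classifier could be fit on `X_train, y_train` and scored on `X_test, y_test` to get the reference value for the MLP and Random Forest results.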

**Cross-Validation and Test Performance Comparison**

* Cross-Validation Accuracy: The best cross-validation accuracy during hyperparameter tuning was 0.5291.
* Test Accuracy: The test accuracy is 0.5490.

The cross-validation accuracy (0.5291) and the test accuracy (0.5490) are close, and the test score is the higher of the two, so there is no sign of overfitting. Both scores are low in absolute terms, however, which may indicate underfitting.
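
One detail of the metrics in this step: the weighted recall (0.5490) is identical to the accuracy. That is expected. Weighting each class's recall by its number of true samples and summing gives the total count of correct predictions divided by the total sample count, which is accuracy. Weighted precision has no such identity, and it is undefined for a class that is never predicted; scikit-learn's `zero_division` argument makes the fallback value explicit. A small check on hand-made labels (`y_true_demo` and `y_pred_demo` are illustrative):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Class 8 appears in the truth but is never predicted.
y_true_demo = [5, 5, 6, 6, 6, 7, 7, 8]
y_pred_demo = [5, 6, 6, 6, 5, 7, 6, 6]

acc = accuracy_score(y_true_demo, y_pred_demo)
rec = recall_score(y_true_demo, y_pred_demo, average='weighted')
# zero_division=0 scores the never-predicted class as 0 without a warning.
prec = precision_score(y_true_demo, y_pred_demo, average='weighted', zero_division=0)

print(f'Accuracy:           {acc:.4f}')
print(f'Weighted recall:    {rec:.4f}')
print(f'Weighted precision: {prec:.4f}')
```

Because of this identity, weighted recall adds no information beyond accuracy; macro-averaged recall or a per-class report would give a genuinely different view.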

# Step 6: Train and Evaluate a Classical ML Model
Train a new classifier that is not a neural network and compare its performance.

In [6]:
from sklearn.ensemble import RandomForestClassifier

# Define and train a RandomForestClassifier
rf = RandomForestClassifier(random_state=42)
rf.fit(X_train, y_train)

# Predict on the test set
y_pred_rf = rf.predict(X_test)

# Evaluate using different scoring methods
accuracy_rf = accuracy_score(y_test, y_pred_rf)
precision_rf = precision_score(y_test, y_pred_rf, average='weighted')
recall_rf = recall_score(y_test, y_pred_rf, average='weighted')

print(f'Random Forest - Test Accuracy: {accuracy_rf:.4f}')
print(f'Random Forest - Test Precision: {precision_rf:.4f}')
print(f'Random Forest - Test Recall: {recall_rf:.4f}')

Random Forest - Test Accuracy: 0.6796
Random Forest - Test Precision: 0.6882
Random Forest - Test Recall: 0.6796




* Model Comparison: The Random Forest classifier outperforms the MLP classifier on all three test metrics: accuracy (0.6796 vs. 0.5490), weighted precision (0.6882 vs. 0.5419), and weighted recall (0.6796 vs. 0.5490). The roughly 13-point gap in accuracy was obtained with default Random Forest settings, which suggests that it captures the patterns in this data better than the tuned MLP. Because precision is support-weighted across the quality grades, the higher value means the Random Forest's predicted grades are correct more often on average, not that any single "positive" class is predicted more reliably.
* Potential Issues with MLP: The relatively low performance of the MLP model suggests potential issues such as underfitting, as indicated by the similar cross-validation and test accuracy scores. The convergence warnings also suggest that the MLP model might require further tuning, such as increasing the number of iterations or adjusting the learning rates.
* Recommendation: Based on these results, the Random Forest classifier is recommended for predicting wine quality in this dataset. Further tuning and optimization of the MLP model might be necessary to improve its performance, but the Random Forest model provides a more accurate and reliable baseline.
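
The MLP was selected using cross-validation, while the Random Forest above was scored once on the test set. For a like-for-like comparison, the Random Forest could be cross-validated the same way. A minimal sketch on synthetic data, where `X_demo`, `y_demo`, and `rf_demo` are illustrative names and `n_estimators=50` is chosen only to keep the example fast:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (not the wine dataset).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=11, n_informative=6, n_classes=3, random_state=0
)

rf_demo = RandomForestClassifier(n_estimators=50, random_state=42)
rf_cv_scores = cross_val_score(rf_demo, X_demo, y_demo, cv=5, scoring='accuracy')

# The spread across folds indicates how much a single score can vary.
print(f'Random Forest - Mean CV Accuracy: {rf_cv_scores.mean():.4f} '
      f'(+/- {rf_cv_scores.std():.4f})')
```

In the notebook this would be `cross_val_score(rf, X_train, y_train, cv=5, scoring='accuracy')`, giving a number directly comparable to the MLP's 0.5291.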