In [11]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import accuracy_score

# Step 1: Load the dataset
df = pd.read_csv("WineQT.csv")

# Step 2: Drop the 'Id' column (not needed)
df.drop(columns=["Id"], inplace=True)

# Step 3: Filter the dataset to include only quality levels 5, 6, and 7
df_filtered = df[df['quality'].isin([5, 6, 7])]

# Step 4: Separate features (X) and target (y) for the filtered dataset
X_filtered = df_filtered.drop(columns=["quality"])  # Features
y_filtered = df_filtered["quality"]  # Target variable (discrete classes)

# Step 5: Map target labels to a range of [0, num_classes - 1]
unique_classes = y_filtered.unique()
num_classes = len(unique_classes)
label_mapping = {label: idx for idx, label in enumerate(sorted(unique_classes))}
y_mapped = y_filtered.map(label_mapping)

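# Step 6: Perform Z-score normalization (standardization)
# Note: the scaler is fit on all rows before the split in Step 7, so test-set
# statistics influence the scaling; fitting on the training rows only avoids this.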
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_filtered)

# Convert the scaled features back to a DataFrame
X_scaled_df = pd.DataFrame(X_scaled, columns=X_filtered.columns)

# Step 7: Split the filtered data into training and test sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X_scaled_df, y_mapped, test_size=0.2, random_state=42)

# Step 8: Define and train the base model
model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),  # First hidden layer; input_shape sets the number of features
    Dense(32, activation='relu'),  # Hidden layer
    Dense(16, activation='relu'),  # Hidden layer
    Dense(num_classes, activation='softmax')  # Output layer (num_classes)
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=55, batch_size=32, validation_split=0.2, verbose=1)

# Step 9: Evaluate the model on the test set
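y_pred_probs = model.predict(X_test)  # Class probabilities, shape (n_samples, num_classes)
y_pred = tf.argmax(y_pred_probs, axis=1).numpy()  # Predicted class index per sample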
accuracy = accuracy_score(y_test, y_pred)

print(f"Accuracy on the filtered dataset: {accuracy:.4f}")

Epoch 1/55
...
Epoch 55/55
Accuracy on the filtered dataset: 0.6468


#### Base model (trained on all quality levels):
Neural Network Classification Report:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00         0
           1       0.00      0.00      0.00         6
           2       0.69      0.74      0.71        96
           3       0.59      0.52      0.55        99
           4       0.44      0.62      0.52        26
           5       0.00      0.00      0.00         2

    accuracy                           0.60       229

## Modifying the Dataset to Improve Model Performance
To improve the accuracy of our neural network, we removed the wine samples with quality scores of 3, 4, and 8. We did not remove them because they were noise, but because they formed small classes that the model predicted poorly in earlier tests: the base model's report shows zero precision and recall on its rarest classes. The goal was to restructure the dataset so that overall classification accuracy improves.

### Reasoning Behind the Modification

The dataset originally contained quality labels from 3 to 8, but the distribution was uneven. Some quality categories had very few samples, which made them hard for the model to learn. When a class has very low representation, the model tends either to overfit to the few examples it sees or to misclassify the class because it has too few training examples. Instead of trying to fit these rare classes, we restricted the dataset to the dominant categories: quality 5, 6, and 7.
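
The imbalance is visible in the support column of the base model's report above. The following is a minimal sketch that turns those counts into shares of the test set. It assumes that class indices 0 to 5 in the report correspond to quality scores 3 to 8; the report itself only shows indices, so this mapping is an assumption.

```python
# Support counts copied from the base model's classification report (test set).
# Assumption: class indices 0..5 correspond to quality scores 3..8.
support = {3: 0, 4: 6, 5: 96, 6: 99, 7: 26, 8: 2}

total = sum(support.values())  # number of test samples in the report
shares = {quality: n / total for quality, n in support.items()}

for quality, share in shares.items():
    print(f"quality {quality}: {support[quality]:3d} samples ({share:.1%})")

# Share of the test set covered by the three dominant classes
dominant_share = sum(support[q] for q in (5, 6, 7)) / total
print(f"quality 5-7 combined: {dominant_share:.1%}")
```

Under that assumption, the three kept classes account for nearly all test samples, while each removed class contributes only a handful.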

This change produced a dataset on which the neural network could more easily learn the relationship between a wine’s chemical properties and its perceived quality. With only the three most frequent quality labels to separate, the model faces a simpler task and makes more consistent predictions.

### Results After Modification

Accuracy improved after filtering. The base model, trained on all quality levels, reached about 60% test accuracy and failed to predict the rare quality labels correctly. After filtering, the model reached 64.68% on its test set, an increase of about **4.68 percentage points in accuracy**. The two figures come from different test sets (six classes versus three), so part of the gain reflects the easier task rather than a better model.

- **Base Model Accuracy:** 60%  
- **Modified Model Accuracy:** 64.68%  
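
The gap between these two figures is an absolute gain in percentage points, which is not the same as a relative gain. A quick check of both, using the rounded base accuracy from the report and the printed accuracy for the filtered dataset:

```python
base_acc = 0.60        # base model accuracy, rounded as in the classification report
modified_acc = 0.6468  # accuracy printed for the filtered dataset

# Absolute gain, in percentage points
point_gain = (modified_acc - base_acc) * 100
# Gain relative to the base model's accuracy, in percent
relative_gain = (modified_acc - base_acc) / base_acc * 100

print(f"Gain: {point_gain:.2f} percentage points")
print(f"Relative gain: {relative_gain:.1f}%")
```

Because the base accuracy is only given to two decimals, both numbers are approximate.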

### Conclusion

This experiment shows that modifying a dataset by removing categories that consistently lead to poor performance can raise a model’s measured accuracy in certain cases. Adjusting the data itself, rather than only tuning hyperparameters, can sometimes be the most effective way to improve results.

However, it’s important to note that this approach does not necessarily create the best overall model. The removed quality categories (3, 4, and 8) are still essential for a fully representative classifier. If this were a real-world scenario, a better approach might be to use techniques such as data augmentation, resampling, or class-weighted loss to handle rare categories rather than removing them entirely.
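
As an illustration of the class-weighted option, here is a minimal sketch that computes "balanced" class weights, where each class is weighted by `n_samples / (n_classes * class_count)`. The helper name `balanced_class_weights` and the label counts are hypothetical and chosen only for illustration; they are not taken from `WineQT.csv`.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rare classes contribute more to the loss."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in sorted(counts.items())}

# Hypothetical training labels (illustrative counts, not the real dataset)
y_train_example = [0] * 5 + [1] * 30 + [2] * 400 + [3] * 380 + [4] * 100 + [5] * 15

class_weight = balanced_class_weights(y_train_example)
for cls, weight in class_weight.items():
    print(f"class {cls}: weight {weight:.2f}")

# With Keras, a dict like this can be passed to training, for example:
# model.fit(X_train, y_train, epochs=55, batch_size=32, class_weight=class_weight)
```

Weighting keeps all quality levels in the training set while making mistakes on rare classes more costly. Whether it actually improves results on this dataset would need to be tested.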

In this specific case, filtering the dataset led to improved performance by making the classification task easier, but the trade-off is that our model is now unable to predict certain quality levels at all. This highlights the balance between optimizing for accuracy and maintaining a model’s ability to generalize across all classes.
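
To make the trade-off concrete, here is a rough sketch under two loudly stated assumptions: that the filtered model were scored on the base model's test set, and that it kept its 64.68% accuracy on the quality 5 to 7 samples there. Neither was measured in this experiment. The supports are read from the base report, again assuming indices 0 to 5 map to quality scores 3 to 8.

```python
# Supports from the base model's report; the index-to-quality mapping is an assumption.
kept_support = 96 + 99 + 26    # quality 5, 6, 7
removed_support = 0 + 6 + 2    # quality 3, 4, 8 (never predicted by the filtered model)
total = kept_support + removed_support

filtered_acc = 0.6468  # accuracy on the filtered test set (assumed to carry over)

# Every removed-class sample is necessarily misclassified, which caps the accuracy.
ceiling = kept_support / total
# Hypothetical accuracy over all classes if filtered_acc held on the kept classes.
full_set_acc = filtered_acc * kept_support / total

print(f"Accuracy ceiling on all classes: {ceiling:.4f}")
print(f"Estimated accuracy on all classes: {full_set_acc:.4f}")
```

The estimate is only indicative, but it shows that any sample from a removed class is a guaranteed error for the filtered model.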