# Comparison of whitebox and blackbox models

## Main similarities

Both models identified alcohol content as the most influential feature for predicting wine quality. This is consistent with domain knowledge: higher alcohol levels often correlate with better fermentation control and higher-quality wines. Volatile acidity was the second most important feature in both models, which supports the idea that acidity plays a significant role in perceived wine quality. Because two very different modeling approaches agree on the top two features, these findings are unlikely to be artifacts of a single model.

## Main differences

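In the decision tree, fixed acidity appeared at the third level of the tree, indicating that it played a meaningful role in the model's classification decisions. In contrast, the Shapley value analysis of the neural network showed that fixed acidity had minimal impact on its predictions, ranking among the least influential features overall. Looking back at the earlier exploratory analysis, I observed that fixed acidity was positively correlated with red wine quality but negatively associated with white wine quality. Because the relationship changes sign between wine types, the pooled relationship between fixed acidity and quality is weak, and how much a model relies on the feature depends on how it captures that interaction. This may partly explain why the two models disagree.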
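
The opposite-sign correlations of fixed acidity with quality in red and white wines noted above can be illustrated with a toy example. The sketch below uses made-up numbers, not the wine dataset, and a hypothetical helper `pearson`. It only shows the general effect: a feature that is strongly related to the target within each subgroup can look unrelated once the subgroups are pooled.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Toy values, NOT taken from the wine data: the same feature values occur
# in both groups, but the slope against the target flips sign.
feature = [-2.0, -1.0, 0.0, 1.0, 2.0]
target_red = [3.0 + 0.5 * x for x in feature]    # rises with the feature
target_white = [3.0 - 0.5 * x for x in feature]  # falls with the feature

r_red = pearson(feature, target_red)
r_white = pearson(feature, target_white)
r_pooled = pearson(feature + feature, target_red + target_white)

print(f"red: {r_red:+.2f}  white: {r_white:+.2f}  pooled: {r_pooled:+.2f}")
```

In this idealized case the pooled correlation vanishes entirely. With real data the cancellation would only be partial, so this is an intuition aid rather than an analysis of the models in this project.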

Another difference is how the models treat the wine type feature. Although it was available to both models, the decision tree did not use it in any split, whereas the neural network assigned it a small but noticeable importance. Ignoring wine type could be a concern, since red and white wines have different characteristics that can affect quality. On the other hand, by not relying on wine type, the decision tree may avoid biases tied to that feature, such as the neural network's tendency to favor white wines.

## Results

| Metric                              | Decision Tree | Neural Network |
|-------------------------------------|----------------|----------------|
| Accuracy                            | 0.785          | 0.810          |
| Sensitivity / Recall                | 0.589          | 0.634          |
| Specificity                         | 0.855          | 0.873          |
| Precision                           | 0.592          | 0.640          |
| F1 Score                            | 0.591          | 0.637          |
| AUROC                               | 0.758          | 0.864          |
| AUPR                                | 0.531          | 0.690          |

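The neural network outperformed the decision tree on every metric in the table. The gap is largest for the threshold-independent metrics, AUROC (0.864 vs. 0.758) and AUPR (0.690 vs. 0.531), and smaller for the threshold-dependent ones such as accuracy (0.810 vs. 0.785).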
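
For reference, the threshold-dependent metrics in the table (accuracy, sensitivity, specificity, precision, F1) are all functions of the four confusion-matrix counts, whereas AUROC and AUPR are computed from the predicted scores across all thresholds. The following is a minimal sketch of the first group. The function name `binary_metrics` and the counts are placeholders chosen for illustration, not this project's confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Threshold-dependent metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)        # sensitivity: share of positives found
    specificity = tn / (tn + fp)   # share of negatives correctly rejected
    precision = tp / (tp + fp)     # share of positive predictions that are right
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Placeholder counts for illustration only.
example = binary_metrics(tp=60, fp=40, tn=160, fn=40)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is a quick sanity check for any row of such a table.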



## Ethical considerations

### Does this activity involve the development, deployment and/or use of Artificial Intelligence-based systems?  
**Answer**: **Yes**

#### 1) Explanation on informing participants and/or end-users
This project uses two AI systems, a decision tree (whitebox model) and a neural network (blackbox model), to classify wine quality based on physicochemical properties.

- **User Interaction with AI**:  
  Users of the system (e.g., researchers, data scientists) are clearly informed that they are interacting with machine learning models. The nature and purpose of the AI systems are explicitly communicated in the documentation.

- **Abilities, Limitations, Risks, and Benefits**:  
  The AI models can assist in identifying whether a wine sample is likely to be of good or poor quality. However, wine quality is partly subjective and influenced by individual preferences, which the models cannot capture. Benefits include faster, consistent evaluations based on objective input features, while risks include possible overfitting or biased outcomes if the models are misused or deployed in new contexts without further validation.

- **Decision Logic Transparency**:  
  The decision tree model provides a fully interpretable decision path, allowing users to trace how each classification was reached. For the neural network, both SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) were used to provide post hoc interpretability. These tools estimate the contribution of each feature to the model's predictions, globally (SHAP) and locally (LIME). The methods and their results are documented.
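
One common way to obtain a global feature ranking from SHAP is to average the absolute per-sample attributions for each feature. The sketch below assumes the attributions have already been computed and does not call the `shap` library. The helper `global_importance`, the feature names, and the numbers are hypothetical, and the source does not state that this exact aggregation was used.

```python
def global_importance(attributions, feature_names):
    """Rank features by mean absolute per-sample attribution."""
    n_samples = len(attributions)
    scores = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in attributions) / n_samples
        scores.append((name, mean_abs))
    # Largest mean absolute attribution first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy attribution matrix (rows = samples, columns = features).
# These numbers are placeholders, not results from this project.
toy_names = ["feature_a", "feature_b", "feature_c"]
toy_attributions = [
    [0.30, -0.10, 0.02],
    [-0.20, 0.15, -0.01],
    [0.25, -0.05, 0.00],
]

ranking = global_importance(toy_attributions, toy_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature that pushes predictions in both directions still ranks as influential.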


#### 2) Measures to avoid bias in input data and algorithm design
- The original dataset was found to be imbalanced, particularly between classes representing different quality levels.
- To mitigate bias, undersampling techniques were applied to balance the class distribution. Additionally, multiple performance metrics (accuracy, precision, recall, and F1-score) were used to evaluate fairness across classes.
- Interpretability tools (LIME and SHAP) were used to check whether any feature unfairly dominated the model's decisions.
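
The undersampling step can be sketched as follows. The source does not say which undersampling technique was applied, so this example assumes plain random undersampling down to the size of the smallest class. The function name `random_undersample` and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class size."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    target = min(len(group) for group in by_class.values())

    balanced_rows, balanced_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement down to the minority class size.
        for row in rng.sample(group, target):
            balanced_rows.append(row)
            balanced_labels.append(label)
    return balanced_rows, balanced_labels


# Placeholder data: 8 samples of class 0 and 3 samples of class 1.
toy_rows = [[i] for i in range(11)]
toy_labels = [0] * 8 + [1] * 3

rows_bal, labels_bal = random_undersample(toy_rows, toy_labels)
print(Counter(labels_bal))
```

A usual precaution, assumed here rather than taken from the source, is to undersample only the training split so that the evaluation metrics still reflect the original class distribution.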

#### 3) Respect for fundamental human rights and freedoms
- The dataset used is publicly available and fully anonymized. It contains no personally identifiable information or sensitive human data.
- Wine brands and the identities of the original quality evaluators are not disclosed in the dataset, protecting privacy.
- No user data is collected, stored, or processed.
- The AI system does not interfere with or replace human autonomy; it serves as a supplementary classification tool and not a decision-maker.

#### 4) Ethical risks and mitigation measures
- This project carries minimal ethical risks, given its use of public, anonymized data and its application in a low-stakes domain (wine quality classification).
- Nonetheless, potential risks such as bias in classification or misinterpretation of model outputs were considered.
- Mitigation measures include:
  - Balancing the dataset to reduce training bias,
  - Applying model-agnostic interpretability techniques (LIME and SHAP),
  - Transparent reporting of performance and limitations,
  - Emphasizing that AI-based recommendations should not replace expert judgment in high-stakes domains.

### Could the AI-based system/technique potentially stigmatise or discriminate against people?

**Answer**: **No**

The AI system developed in this project does not involve any personal or demographic data related to individuals. The dataset used exclusively contains physicochemical properties of wine samples and their associated quality ratings. No human-related features (e.g., gender, race, age, etc.) are included. Therefore, there is no risk of bias, discrimination, or stigmatisation against any individual or group.

### Does the AI system/technique interact, replace or influence human decision-making processes?

**Answer**: **No**

The AI models developed in this project are used solely for research and educational purposes. Their objective is to predict wine quality based on physicochemical characteristics. These predictions do not influence or replace any critical human decision-making processes.

### Does the AI system/technique have the potential to lead to negative social (e.g. on democracy, media, labour market, freedoms, educational choices, mass surveillance) and/or environmental impacts either through intended applications or plausible alternative uses?

**Answer**: **No**

No foreseeable negative societal or environmental impacts are expected due to the very specific, low-risk application domain. The models do not interact with sensitive human attributes, make social predictions, or impact public systems.

### Does this activity involve the use of AI in a weapon system?
**Answer**: **No**

### Does the AI to be developed/used in the project raise any other ethical issues not covered by the questions above?
**Answer**: **No**

The AI models developed in this project are used solely for the classification of wine quality in an academic and technical context. They do not involve subliminal messaging, covert data collection, deceptive mechanisms, or any manipulative behavior. Although the subject matter relates to alcoholic beverages, the project does not promote or encourage alcohol consumption. As such, no additional ethical concerns beyond those already addressed are applicable to this project.

