# Project 2 - Model Assessment and Selection
## Predict customers likely to respond to a marketing campaign

- We used the Excel workbook 'Model Assessment.xlsx' as the source of the model metrics.

# **Group W**
- Ana Rita Mateus - 20241483;
- Gabriel Fábrega - 20241530;
- Gift Kimbini Musharwa - 20241190;
- Marta Filipe - 20240211;
- Wilson Lima - 20241183.

Following the modeling phase of the CRISP-DM methodology, this section assesses the generated models. The primary objective is to select a model with high precision, which minimizes the cost of contacting uninterested customers. A secondary but important consideration is whether the model outputs probability scores, so that the Marketing team can adjust the classification cutoff to balance cost against response volume (flexible thresholding). The evaluation also considers Recall, F1-score, and ROC AUC as key technical metrics.

## Data loading

In [1]:
import pandas as pd

# Load the per-model test metrics from the first worksheet of the workbook
ds = pd.read_excel('Model Assessment.xlsx', sheet_name='Sheet1')

In [2]:
ds.head()

| Unnamed: 0 | Metric | Neural Network (MLP) | Logistic Regression | Support Vector Machine (SVM) | Random Forest | LightGBM |
|---|---|---|---|---|---|---|
| 0 | Precision | 0.6 | 0.91 | 0.622 | 0.444 | 0.333 |
| 1 | Recall | 0.5 | 0.914 | 0.39 | 0.627 | 0.735 |
| 2 | F1-Score | 0.545 | 0.912 | 0.485 | 0.52 | 0.459 |
| 3 | ROC AUC | 0.721 | 0.908 | 0.67 | 0.745 | 0.74 |
| 4 | Accuracy | 0.875 | 0.908 | 0.875 | 0.829 | 0.743 |
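
As a quick consistency check on the table above, F1 is the harmonic mean of precision and recall, so it can be recomputed from the first two rows. The sketch below hard-codes the reported values instead of reading the workbook; the names `reported`, `f1_from`, and `f1_gaps` are illustrative and not part of the original notebook.

```python
# Reported test metrics copied from the table above: (precision, recall, F1).
reported = {
    "Neural Network (MLP)": (0.600, 0.500, 0.545),
    "Logistic Regression": (0.910, 0.914, 0.912),
    "Support Vector Machine (SVM)": (0.622, 0.390, 0.485),
    "Random Forest": (0.444, 0.627, 0.520),
    "LightGBM": (0.333, 0.735, 0.459),
}


def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Absolute gap between the recomputed and the reported F1 for each model.
f1_gaps = {
    model: abs(f1_from(p, r) - f1)
    for model, (p, r, f1) in reported.items()
}

for model, gap in f1_gaps.items():
    print(f"{model}: recomputed F1 differs from reported F1 by {gap:.3f}")
```

Gaps of a few thousandths are to be expected here, because the precision and recall inputs are themselves rounded to two or three decimals.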


## Detailed Model Evaluation

1. Logistic Regression

Logistic Regression was selected for its simplicity and inherent interpretability, serving as a robust baseline. The model demonstrated outstanding performance, achieving a Precision of 0.910 on the test set. This high precision directly aligns with the primary business objective of minimizing costs by reducing outreach to uninterested customers. Its Recall was also excellent at 0.914, and the F1-score, the harmonic mean of the two, was 0.912. The ROC AUC of 0.908 in the comparison table (other parts of the project text cite values as high as 0.96) signifies strong discriminative power. Crucially, Logistic Regression naturally provides probability outputs, fulfilling the flexible thresholding requirement for the Marketing team. Its straightforward nature also promotes stakeholder acceptance, even though this iteration does not include a detailed feature importance interpretation. Overall, its strong generalization from training to test data, coupled with its top-tier metrics, makes it a compelling candidate.

2. LightGBM (Light Gradient Boosting Machine)

LightGBM is a gradient boosting framework that uses tree-based learning algorithms and is known for its high efficiency, speed, and often excellent accuracy. Advantages of LightGBM include its ability to handle large datasets with lower memory usage compared to other boosting algorithms, support for parallel and GPU learning, and often faster training speeds. It can also provide feature importance measures.
In this evaluation, the LightGBM model yielded a test Precision of 0.333. This is the lowest precision among all evaluated models, indicating it would be the least effective at minimizing costs associated with contacting uninterested customers. However, its Recall was notably high at 0.735, second only to Logistic Regression, meaning it is quite effective at identifying actual responders. This resulted in an F1-score of 0.459. The ROC AUC was 0.740, suggesting a good level of discriminative ability, comparable to the Random Forest model. LightGBM, like other tree-based ensembles, can readily provide probability scores for flexible thresholding. The provided confusion matrix shows 122 False Positives against 61 True Positives, which is exactly the reported precision: 61 / (61 + 122) ≈ 0.333. While its recall is a strength, its very low precision is a significant drawback for the primary project objective.

3. Neural Network (Multilayer Perceptron - MLP)

The MLP was explored for its capacity to learn non-linear relationships. Its Precision on the test set was 0.600, considerably lower than the 0.910 achieved by Logistic Regression, so it misclassifies more non-responders as responders. Its Recall was 0.500, leading to an F1-score of 0.545. The MLP's ROC AUC of 0.721 suggests reasonable, albeit not exceptional, class discrimination. The model can provide probability scores, meeting the flexible thresholding criterion, and permutation-based feature importance (highlighting Marital Status, Education, and Total Accepted Campaigns) offered some insight into its decision-making. However, the inherent "black-box" nature of MLPs can pose a challenge for stakeholder acceptance compared to simpler models. Optimization yielded only marginal gains, and its precision remains a concern for the primary business goal.

4. Random Forest

The Random Forest model, an ensemble method known for its robustness and ability to provide feature importance, yielded a test Precision of 0.444. Its Recall was better at 0.627, and its ROC AUC of 0.745 indicated good discriminative capability, slightly better than LightGBM (0.740) and MLP (0.721) on this metric. The F1-score stood at 0.520. Feature importance analysis pointed to behavioral features like TotalAcceptedCampaigns, Income, and Recency as key drivers. Random Forests inherently generate probability scores. Although its precision is better than LightGBM's, it is still low for the primary business objective.

5. Support Vector Machine (SVM)

The SVM, particularly with a linear kernel, was chosen for its efficacy in high-dimensional spaces and potential for interpretability. On the test set, it achieved a Precision of 0.622, outperforming the MLP, Random Forest, and LightGBM on this metric. Its Recall was the lowest of all models at 0.390, leading to an F1-score of 0.485. Its ROC AUC of 0.670 was also the lowest in the comparative table. While a linear SVM offers some interpretability through feature coefficients and can be configured to output probabilities for flexible thresholding, its overall predictive performance, particularly its low recall and F1-score, makes it less competitive.
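
The LightGBM confusion-matrix counts quoted above (61 true positives, 122 false positives) can be tied back to the reported metrics with a few lines of arithmetic. This is a minimal sketch: the false-negative count is not given in the text, so it is backed out from the reported recall and should be read as an estimate, and all variable names are illustrative.

```python
# Counts quoted in the LightGBM evaluation above.
true_positives = 61
false_positives = 122

# Precision: share of contacted customers who actually respond.
precision = true_positives / (true_positives + false_positives)

# Assumption: derive the false negatives from the reported recall of 0.735,
# using recall = TP / (TP + FN). This is an estimate, not a sourced figure.
reported_recall = 0.735
implied_false_negatives = round(true_positives / reported_recall - true_positives)

# Uninterested customers contacted for every responder reached.
wasted_per_responder = false_positives / true_positives

print(f"precision = {precision:.3f}")
print(f"implied false negatives = {implied_false_negatives}")
print(f"wasted contacts per responder = {wasted_per_responder:.1f}")
```

Under these counts, roughly two uninterested customers are contacted for every responder reached, which is the cost the precision criterion is meant to limit.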

## Model Ranking and Recommendation

The selection of the optimal model is guided by the predefined success criteria, with a strong emphasis on maximizing Precision to align with business objectives.

1. Recommended Model: Logistic Regression
The Logistic Regression model is clearly the top-ranked model. It leads on every key metric, with Precision (0.910), Recall (0.914), F1-score (0.912), and ROC AUC (0.908), demonstrating its superior capability in accurately identifying interested customers while minimizing errors. Its inherent ability to provide probability scores meets the flexible thresholding requirement, and its interpretability fosters stakeholder trust. Given the primary goal of reducing costs by avoiding contact with uninterested customers, its high precision is paramount.

2. Second Rank: Support Vector Machine (SVM)
With a Precision of 0.622, the SVM ranks second. This is the highest precision after Logistic Regression, making it more aligned with the business objective than the other models. While its Recall (0.390) and F1-score (0.485) are relatively low, it offers linear interpretability and supports probability outputs. Despite the trade-off in recall, its slightly higher precision gives it an edge over the MLP (0.600).

3. Third Rank: Neural Network (MLP)
The MLP model achieved a Precision of 0.600 and Recall of 0.500, resulting in an F1-score of 0.545. It captures non-linear relationships and outputs probabilities, but its black-box nature may hinder stakeholder acceptance. While its overall metrics are balanced, the slightly lower precision places it just below SVM for this task.

4. Fourth Rank: Random Forest
Random Forest places fourth. Its ROC AUC (0.745) is good, and its Recall (0.627) is decent. However, its low Precision (0.444) is a significant drawback. Its F1 score is better than LightGBM's.

5. Fifth Rank: LightGBM
LightGBM is ranked fifth. While it boasts the second-highest Recall (0.735) and a good ROC AUC (0.740), its extremely low Precision (0.333) makes it unsuitable for the primary goal of cost minimization through targeted contact. The high number of false positives would lead to substantial wasted marketing expenditure, despite its efficiency and other technical advantages. Its F1-score (0.459) is also the lowest of all five models.
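
The ranking above follows directly from sorting the models by test precision. The sketch below reproduces that ordering from the table values; using F1 as a tie-breaker is an assumption added for illustration (no ties occur in this data), and the names `metrics` and `ranking` are not from the original notebook.

```python
# Test-set precision and F1 copied from the comparison table.
metrics = {
    "Neural Network (MLP)": {"precision": 0.600, "f1": 0.545},
    "Logistic Regression": {"precision": 0.910, "f1": 0.912},
    "Support Vector Machine (SVM)": {"precision": 0.622, "f1": 0.485},
    "Random Forest": {"precision": 0.444, "f1": 0.520},
    "LightGBM": {"precision": 0.333, "f1": 0.459},
}

# Primary criterion: precision. Assumed tie-breaker: F1.
ranking = sorted(
    metrics,
    key=lambda model: (metrics[model]["precision"], metrics[model]["f1"]),
    reverse=True,
)

for position, model in enumerate(ranking, start=1):
    print(f"{position}. {model} (precision = {metrics[model]['precision']:.3f})")
```

Sorting on F1 alone would give a different order below first place (MLP, Random Forest, SVM, LightGBM), which is why the choice of precision as the primary criterion matters for ranks two to five.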

## Conclusion

Based on this comprehensive assessment, the Logistic Regression model is strongly recommended for deployment. It best satisfies the project's success criteria, offering an excellent balance of high precision for cost-effective marketing campaigns and the necessary flexibility in threshold setting for the Marketing team. Its strong performance metrics and interpretability make it a reliable and understandable choice.
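
To illustrate what flexible thresholding means in practice, the sketch below shows how raising the probability cutoff trades recall for precision. The labels and scores are hypothetical values invented for this example, not project data, and `precision_recall_at` is an illustrative helper. In a real pipeline the scores would come from the fitted model's probability output (for example `predict_proba` in scikit-learn, assuming that library was used).

```python
from typing import Sequence, Tuple


def precision_recall_at(
    threshold: float, y_true: Sequence[int], scores: Sequence[float]
) -> Tuple[float, float]:
    """Precision and recall when customers scoring >= threshold are contacted."""
    contacted = [score >= threshold for score in scores]
    tp = sum(1 for y, c in zip(y_true, contacted) if c and y == 1)
    fp = sum(1 for y, c in zip(y_true, contacted) if c and y == 0)
    fn = sum(1 for y, c in zip(y_true, contacted) if not c and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical responders (1) / non-responders (0) and model scores.
y_true = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.95, 0.90, 0.80, 0.75, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]

for threshold in (0.50, 0.70, 0.85):
    precision, recall = precision_recall_at(threshold, y_true, scores)
    print(f"cutoff {threshold:.2f}: precision = {precision:.2f}, recall = {recall:.2f}")
```

On this toy data, a higher cutoff contacts fewer customers with a higher hit rate, while a lower cutoff reaches more responders at the price of more wasted contacts. The Marketing team can move the cutoff according to the campaign budget.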