# Marketing Analytics -- How Can a Bank Better Target Its Promotions?

## Situation
Universal Bank (name changed) has been targeting customers with various offers for its mortgage loan products. Historically, these campaigns have had low response rates of 8-10%. Jasmin Ali, a Stern MSBA alumna, has recently been appointed CMO, and she has challenged the marketing team to improve these results.

When told by an analyst that their campaigns “cannot do any better as the competition is in the 5-6% response rate,” Jasmin asked a simple yet insightful question:  
**“Have we tried to learn anything from our past campaigns to target better?”**

## Complication
Direct marketing analytics faces several challenges, particularly low response rates to customer promotions. For instance, if only 10% of past customers responded to a campaign, can actionable insights still be derived?  

### Key Question
Given a 1,000-person target list, who are the top 50 individuals the marketing team should invite to a special cocktail event? This event targets high-value mortgage customers with a lifetime value in the hundreds of thousands of dollars.

## Solution Approach
To address this problem, we will:
1. Apply **binary classification models** (k-nearest neighbors, decision trees, and logistic regression) to the new dataset.
2. Identify the best-performing model and use it to score the target list.
3. Rank-order the list by likelihood of response to select the top 50 individuals.
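
The three steps above can be sketched as follows. This is a minimal illustration, not the course's actual code. It assumes scikit-learn, uses a synthetic dataset with roughly 10% responders as a stand-in for the bank's data, and picks the "best" model by validation AUC, which is only one possible selection criterion. All variable names and hyper-parameter values are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for past campaign data (~10% responders).
X, y = make_classification(n_samples=6000, n_features=10, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# Hold out 1,000 rows to play the role of the target list.
X_hist, X_target, y_hist, y_target = train_test_split(
    X, y, test_size=1000, stratify=y, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_hist, y_hist, test_size=0.3, stratify=y_hist, random_state=0)

# Step 1: three binary classifiers (hyper-parameter values are illustrative).
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15)),
    "tree": DecisionTreeClassifier(max_depth=5, min_samples_leaf=20, random_state=0),
    "logit": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

# Step 2: fit each model and compare on validation data (AUC is an assumed choice).
valid_auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    valid_auc[name] = roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1])
best_name = max(valid_auc, key=valid_auc.get)
best_model = models[best_name]

# Step 3: score the target list and keep the 50 highest response probabilities.
target_scores = best_model.predict_proba(X_target)[:, 1]
top50 = np.argsort(-target_scores)[:50]

print("validation AUC:", {k: round(v, 3) for k, v in valid_auc.items()})
print("best model:", best_name)
# Labels for the target list exist here only because the data is simulated.
print("baseline response rate:", y_target.mean())
print("top-50 response rate:", y_target[top50].mean())
```

In a real deployment the target list has no outcome labels, so the last two lines would not be available. Here they show how the top-50 response rate can be compared with the baseline.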

### Challenges
- **Imbalanced/skewed outcome class distribution:** This is common in binary classification problems such as loan defaults, corporate bankruptcies, or ad click-through rates (rare events).
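- **Evaluation beyond accuracy:** With roughly 10% responders, a model that predicts "no response" for every customer is about 90% accurate yet identifies no responders. Evaluating model performance therefore requires metrics beyond simple accuracy, including:
  - Precision
  - Recall
  - F-measure
  - Kappa statistic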
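
A small worked example may help show why accuracy alone is misleading here. The sketch below assumes scikit-learn and uses hand-made labels for 100 hypothetical customers, 10 of whom respond. It compares a naive "nobody responds" rule with a hypothetical model that finds 6 of the 10 responders at the cost of 4 false alarms.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             precision_score, recall_score)

# Hypothetical outcomes: 10 responders followed by 90 non-responders.
y_true = np.array([1] * 10 + [0] * 90)

# Naive rule: predict "no response" for everyone.
y_naive = np.zeros(100, dtype=int)

# Hypothetical model: 6 true positives, 4 false negatives,
# 4 false positives, 86 true negatives.
y_model = np.array([1] * 6 + [0] * 4 + [1] * 4 + [0] * 86)

def report(y_pred):
    """Return accuracy, precision, recall, F1, and Cohen's kappa."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
        "kappa": cohen_kappa_score(y_true, y_pred),
    }

naive_report = report(y_naive)
model_report = report(y_model)
print("naive:", naive_report)   # accuracy 0.90, all other metrics 0.0
print("model:", model_report)   # accuracy 0.92, precision/recall/F1 0.60, kappa ~0.556
```

The two rules differ by only two points of accuracy, while precision, recall, F-measure, and kappa separate them clearly. Kappa is 0 for the naive rule because its agreement with the truth is no better than chance given the class proportions.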

## Technical Details
- **Deck:** Predictive modeling continued.
- **Dataset:** Dataset1 (XLS file)
- **Python code:**

### Key Technical Concepts
- Predicting probabilities
- Setting custom thresholds
- Hyper-parameter tuning
- Scoring new data
- Performing cost-benefit analysis for profit maximization
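
The concepts above can be combined in one short sketch. It assumes scikit-learn and synthetic data. The dollar figures `BENEFIT_PER_RESPONSE` and `COST_PER_CONTACT` are invented for illustration and are not taken from the case. The code tunes one hyper-parameter by cross-validation, predicts probabilities, and then sweeps custom thresholds to find the one with the highest profit on validation data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for campaign data (~10% responders).
X, y = make_classification(n_samples=5000, n_features=10, n_informative=5,
                           weights=[0.9, 0.1], random_state=1)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

# Hyper-parameter tuning: cross-validated search over the regularization strength C.
search = GridSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc", cv=5)
search.fit(X_train, y_train)

# Predicting probabilities instead of hard 0/1 labels.
proba = search.predict_proba(X_valid)[:, 1]

# ASSUMPTION: hypothetical economics of one promotion.
BENEFIT_PER_RESPONSE = 1000.0   # revenue when a contacted customer responds
COST_PER_CONTACT = 100.0        # cost of contacting any customer

def profit_at(threshold):
    """Profit if we contact everyone whose predicted probability >= threshold."""
    contacted = proba >= threshold
    true_pos = np.sum(contacted & (y_valid == 1))
    return true_pos * BENEFIT_PER_RESPONSE - contacted.sum() * COST_PER_CONTACT

# Custom thresholds: sweep 0.05, 0.10, ..., 0.95 rather than using 0.5 by default.
thresholds = np.round(np.arange(0.05, 0.96, 0.05), 2)
profits = np.array([profit_at(t) for t in thresholds])
best_threshold = thresholds[np.argmax(profits)]

print("best C:", search.best_params_)
print("profit at default 0.5 threshold:", profit_at(0.5))
print("best threshold:", best_threshold, "profit:", profits.max())
```

With these assumed numbers, contacting a customer breaks even when the response probability is 100 / 1000 = 0.1, so the profit-maximizing threshold often sits well below the default of 0.5. Scoring new data follows the same pattern: call `search.predict_proba` on the new rows and apply the chosen threshold.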

## Discussion Questions
1. **Model Explainability:**  
   - Should we care about explainability for this problem?  
   - How should we balance model accuracy with explainability?

2. **Machine Learning vs Baseline:**  
   - How much better than the baseline response rate can machine learning do when selecting the top 50 individuals?

3. **Classification Types and Cost-Benefit Analysis:**  
   - Are all types of classifications equally beneficial?  
   - Is a true positive as valuable as a true negative?  
   - Are false negatives as costly as false positives?  
   - How can we use costs, benefits, and expected value to choose the best model?

4. **Model Deployment:**  
   - What data should be used during deployment?

5. **Model Selection:**  
   - Which model (k-nearest neighbors, decision trees, logistic regression) should be used for the most fine-grained probability predictions?
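
One way to explore the last question is to count how many distinct probability values each model can produce, since ties make it harder to rank a list and cut it at exactly 50 names. The sketch below assumes scikit-learn, synthetic data, and illustrative settings (`K = 5` neighbors, at most `MAX_LEAVES = 8` leaves). It is meant as a starting point for discussion, not as a verdict on which model to use.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic data; the last 1,000 rows stand in for a target list.
X, y = make_classification(n_samples=3000, n_features=10, n_informative=5,
                           weights=[0.9, 0.1], random_state=2)
X_train, y_train = X[:2000], y[:2000]
X_score = X[2000:]

K = 5            # illustrative number of neighbors
MAX_LEAVES = 8   # illustrative cap on tree size

fitted = {
    "knn": KNeighborsClassifier(n_neighbors=K).fit(X_train, y_train),
    "tree": DecisionTreeClassifier(max_leaf_nodes=MAX_LEAVES,
                                   random_state=0).fit(X_train, y_train),
    "logit": LogisticRegression(max_iter=1000).fit(X_train, y_train),
}

# Count the distinct predicted probabilities each model assigns to the list.
distinct = {
    name: len(np.unique(model.predict_proba(X_score)[:, 1]))
    for name, model in fitted.items()
}
print(distinct)
```

With unweighted voting, k-nearest neighbors can only output the fractions 0/K, 1/K, ..., K/K, and a decision tree outputs one probability per leaf. Logistic regression maps continuous inputs through a smooth function, so it typically gives nearly every customer a different score.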

## References
1. Shmueli et al., *Business Analytics using R*, Wiley, 2020.