<a href="https://colab.research.google.com/github/drOluOla/Lloyds_DA_AI_Scientist_Take_Home/blob/main/Lloyds_Take_Home_Answer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Introduction**
Actionable insights into customer savings behavior are crucial for driving business growth and enhancing customer relationships. This analysis leverages statistical and machine learning models to identify key savings drivers and predict future behavior, offering a path to optimize deposit strategies and personalize customer engagement. Ethical considerations and regulatory compliance (Equality Law, FCA Principles, UK GDPR) are integrated throughout the process to ensure responsible and trustworthy AI implementation. The findings highlight that current financial behavior is a stronger predictor of future savings than demographics, supporting ethical, behavior-based modeling approaches.

## **Problem Statement**
Analyse customer saving behavior to identify key drivers and predict future savings, aiming to deepen customer relationships through behavioral insights. This analysis must adhere to ethical guidelines, relevant laws (Equality Law, FCA Principles), and UK GDPR, particularly concerning special category data.

**Business Objectives:**
- Deepen customer relationships through behavioral insights.

**Machine Learning/Statistical Model Objective:**
- Develop a regression model to predict future customer savings.

## **Methodology**

To mitigate the bias identified in the previous section, I follow guidance from the ICO and the FCA on mitigating bias, using both pre-processing and post-processing approaches. The work is structured into key tickets, managed iteratively via a Kanban board (image below), with subtasks detailed in the [GitHub project link](https://github.com/users/drOluOla/projects/2).

![Kanban board](https://drive.google.com/uc?export=view&id=121BTN4Nhs75LCoV9K3zbkLwmYvOqDCNO)

**Libraries and Tools (to be used across tickets):**
-   scikit-learn
-   statsmodels (for statistical modelling)
-   Seaborn and Matplotlib (for visualisation)
-   Fairlearn (for fairness analysis)
-   Computer-assisted coding (for code completion)

## **Experimentation**

A series of experiments was conducted to evaluate the performance and fairness of different modeling approaches. The process comprised four experiments:

-   **Experiment 1: Baseline Statistical Model:** Train and evaluate a baseline statistical model (e.g., Linear Regression) to establish a performance benchmark. Assess its predictive power and initial fairness metrics.
-   **Experiment 2: Machine Learning Model (e.g., Random Forest):** Train and evaluate a machine learning model. Tune hyperparameters using techniques like cross-validation to optimize performance. Assess its predictive power and compare fairness metrics to the baseline model.
-   **Experiment 3: Exploring Fairness Mitigation:** Apply different fairness mitigation techniques to the chosen models and evaluate their impact on both predictive performance and fairness metrics. Analyze the trade-offs.
-   **Experiment 4: Model Comparison and Selection:** Compare the results across all experiments based on predictive performance metrics (e.g., R-squared, RMSE), fairness metrics, and interpretability. Select the model that best balances these factors while aligning with business objectives and ethical considerations.
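The comparison in Experiments 1, 2 and 4 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual pipeline: the data is a synthetic stand-in generated with `make_regression`, and the hyperparameter grid, the split sizes and the `evaluate` helper are all assumptions chosen for brevity.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the customer dataset (assumption: not the real data).
X, y = make_regression(n_samples=400, n_features=6, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def evaluate(model, X_eval, y_eval):
    """Return R-squared and RMSE for a fitted model on held-out data."""
    pred = model.predict(X_eval)
    rmse = float(np.sqrt(mean_squared_error(y_eval, pred)))
    return {"r2": float(r2_score(y_eval, pred)), "rmse": rmse}


# Experiment 1: baseline statistical model.
baseline = LinearRegression().fit(X_train, y_train)

# Experiment 2: random forest tuned with 3-fold cross-validation.
# The grid below is deliberately tiny and purely illustrative.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)

# Experiment 4: compare both models on the same held-out set.
results = {
    "linear_regression": evaluate(baseline, X_test, y_test),
    "random_forest": evaluate(search.best_estimator_, X_test, y_test),
}
for name, scores in results.items():
    print(f"{name}: R2={scores['r2']:.3f}, RMSE={scores['rmse']:.2f}")
```

Evaluating both models on one shared test split keeps the comparison like-for-like; fairness metrics and interpretability would be weighed alongside these scores before a model is selected.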

Results from these experiments informed the selection of the final model and highlighted key trade-offs between predictive power and ethical considerations. Specific details on data preprocessing steps, feature engineering choices, and the exact metrics used for evaluation are documented within the code accompanying each experiment.
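One way to make the fairness side of that trade-off concrete for a regression model is to compare prediction error across groups. The sketch below is an assumption about how such a check could look, not the metric used in the notebook: `rmse_by_group` is a hypothetical helper, and the values and group labels are toy data. Fairlearn's `MetricFrame` offers a similar disaggregated view for arbitrary metrics.

```python
import numpy as np


def rmse_by_group(y_true, y_pred, groups):
    """Compute RMSE separately for each value of a grouping attribute."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    groups = np.asarray(groups)
    scores = {}
    for group in np.unique(groups):
        mask = groups == group
        errors = y_true[mask] - y_pred[mask]
        scores[str(group)] = float(np.sqrt(np.mean(errors ** 2)))
    return scores


# Toy values (assumption): savings in pounds and an anonymised group label.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 330.0, 370.0]
groups = ["A", "A", "B", "B"]

per_group = rmse_by_group(y_true, y_pred, groups)
# Gap between the worst-served and best-served group.
rmse_gap = max(per_group.values()) - min(per_group.values())

print(per_group)
print(f"RMSE gap between groups: {rmse_gap:.1f}")
```

A large gap suggests the model serves one group noticeably worse than another, which is the kind of signal that would prompt the mitigation steps explored in Experiment 3.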

## **Conclusion**
Statistical modeling and machine learning applications in economic data analysis serve a fundamental purpose: advancing business objectives and maintaining competitive advantage. This assignment evaluates technical competencies in coding and data analytics; however, these capabilities must be understood within their broader business context.

Effective ML implementation requires clear alignment with defined business goals to demonstrate tangible value and practical applicability. Drawing from Lloyds Banking Group's 2025-26 strategic priorities, the following business objectives have been identified as relevant contexts for this savings prediction exercise:

**Business Goals Aligned with ML Objectives:**

1. Deepening customer relationships through behavioral insights – Predictive models can identify high-propensity savers and inform targeted engagement strategies to increase depth of relationship (current: c.1% growth in H1 2025; 2026 target: c.3%) [page 6].
2. Optimizing deposit franchise economics – Understanding drivers of savings behavior supports pricing strategies and customer segmentation to improve deposit gross margins (H1 2025: 1.29%, up 16bps YoY) [page 16].
3. Supporting sustainable financing commitments – Identifying customers with capacity and propensity to save can inform targeting for green savings products aligned with the £30bn sustainable financing target by 2026 [page 6].
4. Enhancing capital-lite revenue growth – Predictive insights into savings patterns can support fee-based product cross-sell, contributing to the 50:50 NII:OOI revenue split target and >£1.5bn additional strategic revenues by 2026 [page 7].
5. Driving operational efficiency through automation – ML models can automate customer segmentation and propensity scoring, supporting the <50% cost-to-income ratio target and productivity improvements (>40% increase in customers served per FTE vs. 2021) [pages 7, 9, 19].
6. Risk-aware portfolio management – Understanding the financial health indicators within savings behavior supports prudent lending decisions and maintaining robust asset quality (target AQR: c.25bps) [page 20].

This analysis demonstrates the technical approaches (data cleansing, statistical modeling, and machine learning) that would underpin such business applications, while also addressing critical considerations around model ethics, interpretability, and deployment suitability.