# SACCO Analytics Project  
## Loan Default Risk & Portfolio Performance Analytics

---

## 1. Project Title
**Data-Driven Credit Risk Assessment and Loan Portfolio Optimization for a SACCO**

---

## 2. Problem Statement
Many SACCOs face persistent challenges such as:

- Rising loan default rates  
- Weak or subjective credit assessment processes  
- Over-reliance on guarantors without data-backed validation  
- Limited analytical reporting for management decision-making  

As a result, loan decisions are often reactive rather than predictive.  
This project applies **actuarial risk principles and Python-based data analytics** to assess loan default risk and improve SACCO loan portfolio performance.

---

## 3. Project Objectives

### Primary Objective
To estimate and analyze loan default risk among SACCO members using data-driven and actuarial methods.

### Secondary Objectives
- Identify key drivers of loan default  
- Segment SACCO members into risk categories  
- Evaluate overall loan portfolio performance  
- Support data-backed loan approval and monitoring decisions  

---

## 4. Data Description (Simulated SACCO Data)

This project uses a **simulated SACCO dataset of 100 members** across four wards:
- East Ugenya  
- West Ugenya  
- North Ugenya  
- Ukwala  

### Member-Level Variables
- Age  
- Gender  
- Ward  
- Membership tenure (years)  

### Loan & Savings Variables
- Monthly contribution  
- Loan amount issued  
- Outstanding loan balance  
- Loan status (Active, Completed, Delinquent)  
- Default risk score (proxy for Probability of Default)

🎯 **Risk Indicators**
- `DefaultRiskScore` (proxy for Probability of Default)  
- `LoanStatus` (Delinquent vs Non-Delinquent)

> *Note: This project focuses on credit risk estimation and portfolio analysis rather than detailed repayment transaction histories.*
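
A dataset with the shape described above could be simulated roughly as follows. This is a minimal sketch, not the generator actually used for this project: apart from `DefaultRiskScore` and `LoanStatus`, every column name is an assumption, and the value ranges and status probabilities are purely illustrative.

```python
import numpy as np
import pandas as pd

WARDS = ["East Ugenya", "West Ugenya", "North Ugenya", "Ukwala"]
STATUSES = ["Active", "Completed", "Delinquent"]


def simulate_sacco_members(n_members: int = 100, seed: int = 42) -> pd.DataFrame:
    """Generate a synthetic member table; all distributions are illustrative."""
    rng = np.random.default_rng(seed)
    loan_amount = rng.integers(10_000, 500_001, size=n_members)
    # Assumed status mix: 60% Active, 25% Completed, 15% Delinquent.
    status = rng.choice(STATUSES, size=n_members, p=[0.60, 0.25, 0.15])
    # Outstanding balance is a random fraction of the loan; completed loans owe nothing.
    fraction_left = rng.uniform(0.1, 1.0, size=n_members)
    outstanding = np.where(status == "Completed", 0, (loan_amount * fraction_left).round())
    return pd.DataFrame({
        "MemberID": np.arange(1, n_members + 1),
        "Age": rng.integers(18, 71, size=n_members),
        "Gender": rng.choice(["Female", "Male"], size=n_members),
        "Ward": rng.choice(WARDS, size=n_members),
        "TenureYears": rng.integers(0, 21, size=n_members),
        "MonthlyContribution": rng.integers(500, 10_001, size=n_members),
        "LoanAmount": loan_amount,
        "OutstandingBalance": outstanding,
        "LoanStatus": status,
        "DefaultRiskScore": rng.uniform(0.0, 1.0, size=n_members).round(3),
    })


members = simulate_sacco_members()
print(members.head())
```

Fixing the seed keeps the simulated table reproducible between runs, which makes downstream figures comparable.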

---

## 5. Methodology

### Step 1: Data Cleaning & Preparation
**Tools:** Python (Pandas, NumPy)

- Load the SACCO dataset  
- Validate data types and structure  
- Check for missing values and inconsistencies  
- Create derived variables:
  - Risk category (Low / Medium / High)  
  - Expected loss per member  
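
The validation checks listed above might look like the sketch below. The helper name `validate_members`, the column names other than `DefaultRiskScore` and `LoanStatus`, and the specific rejection rules are assumptions for illustration; a real SACCO extract would likely need additional checks.

```python
import pandas as pd

# Hypothetical column names; adjust to match the actual dataset.
REQUIRED_COLUMNS = ["Ward", "OutstandingBalance", "LoanStatus", "DefaultRiskScore"]
NUMERIC_COLUMNS = ["OutstandingBalance", "DefaultRiskScore"]
VALID_STATUSES = {"Active", "Completed", "Delinquent"}


def validate_members(df: pd.DataFrame):
    """Return (clean_rows, rejected_rows) after basic type and range checks."""
    missing_cols = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing_cols:
        raise ValueError(f"Missing columns: {missing_cols}")

    clean = df.copy()
    # Coerce numeric columns; entries that cannot be parsed become NaN.
    for col in NUMERIC_COLUMNS:
        clean[col] = pd.to_numeric(clean[col], errors="coerce")
    # Normalise status labels such as " active" to "Active".
    clean["LoanStatus"] = clean["LoanStatus"].astype(str).str.strip().str.title()

    problems = (
        clean[REQUIRED_COLUMNS].isna().any(axis=1)      # missing values
        | ~clean["DefaultRiskScore"].between(0, 1)      # PD proxy outside [0, 1]
        | (clean["OutstandingBalance"] < 0)             # negative balance
        | ~clean["LoanStatus"].isin(VALID_STATUSES)     # unknown status label
    )
    return clean.loc[~problems].reset_index(drop=True), clean.loc[problems]


raw = pd.DataFrame({
    "Ward": ["Ukwala", "East Ugenya", "West Ugenya", "North Ugenya"],
    "OutstandingBalance": ["120000", "45000", "-500", "80000"],
    "LoanStatus": [" active", "Delinquent", "Completed", "Active"],
    "DefaultRiskScore": [0.15, 0.62, 0.30, 1.4],
})
clean, rejected = validate_members(raw)
print(len(clean), "rows kept;", len(rejected), "rows rejected")
```

Returning the rejected rows alongside the clean ones makes it easier to review data quality problems instead of silently dropping them.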

---

### Step 2: Exploratory Data Analysis (EDA)
**Tools:** Python (Pandas, Matplotlib, Seaborn)

Key analyses include:
- Member distribution by ward  
- Total monthly contributions by ward  
- Loan portfolio size and outstanding balances  
- Delinquency rates across wards  
- Relationship between membership tenure and default risk  
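
As a rough illustration of these analyses, the ward-level summaries and the tenure relationship can be computed with a single `groupby` and a correlation. The eight-row table and the column names other than `LoanStatus` and `DefaultRiskScore` are made up for the example.

```python
import pandas as pd

# Tiny illustrative table; values are invented, not project results.
members = pd.DataFrame({
    "Ward": ["East Ugenya", "East Ugenya", "West Ugenya", "West Ugenya",
             "North Ugenya", "North Ugenya", "Ukwala", "Ukwala"],
    "MonthlyContribution": [2000, 3000, 1500, 2500, 4000, 1000, 3500, 500],
    "OutstandingBalance": [50_000, 0, 120_000, 30_000, 0, 80_000, 20_000, 60_000],
    "LoanStatus": ["Active", "Completed", "Delinquent", "Active",
                   "Completed", "Delinquent", "Active", "Delinquent"],
    "TenureYears": [6, 10, 1, 5, 12, 2, 8, 1],
    "DefaultRiskScore": [0.20, 0.05, 0.70, 0.25, 0.04, 0.60, 0.15, 0.80],
})

ward_summary = (
    members.assign(IsDelinquent=members["LoanStatus"].eq("Delinquent"))
    .groupby("Ward")
    .agg(
        Members=("LoanStatus", "size"),                  # member count per ward
        TotalContribution=("MonthlyContribution", "sum"),
        TotalOutstanding=("OutstandingBalance", "sum"),
        DelinquencyRate=("IsDelinquent", "mean"),        # share of delinquent loans
    )
)
print(ward_summary)

# Pearson correlation between tenure and the default risk score.
tenure_risk_corr = members["TenureYears"].corr(members["DefaultRiskScore"])
print(f"Tenure vs default risk correlation: {tenure_risk_corr:.2f}")
```

The delinquency rate here is a simple count-based share of loans per ward; a balance-weighted version would be an equally reasonable choice.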

---

### Step 3: Actuarial Risk Modeling

This project applies core actuarial concepts to SACCO lending.

#### Expected Loss (EL) Model:
$$
EL = PD \times LGD \times EAD
$$

Where:
- **PD** = `DefaultRiskScore`, read as a probability between 0 and 1  
- **LGD** = Assumed constant (e.g., 45%)  
- **EAD** = Outstanding loan balance  

**Outputs:**
- Expected loss per member  
- Total expected loss for the SACCO  
- Ward-level expected loss distribution  
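
The expected loss calculation and its three outputs can be sketched in a few lines of Pandas. The four loans below are invented for illustration, `OutstandingBalance` is an assumed column name, and the 45% LGD is the example constant from the definitions above.

```python
import pandas as pd

LGD = 0.45  # assumed constant loss given default

# Invented example loans; not project results.
loans = pd.DataFrame({
    "Ward": ["East Ugenya", "West Ugenya", "West Ugenya", "Ukwala"],
    "DefaultRiskScore": [0.10, 0.40, 0.75, 0.20],               # PD
    "OutstandingBalance": [100_000, 50_000, 200_000, 80_000],   # EAD
})

# EL = PD x LGD x EAD, computed per member.
loans["ExpectedLoss"] = loans["DefaultRiskScore"] * LGD * loans["OutstandingBalance"]

total_expected_loss = loans["ExpectedLoss"].sum()
ward_expected_loss = (
    loans.groupby("Ward")["ExpectedLoss"].sum().sort_values(ascending=False)
)

print(loans[["Ward", "ExpectedLoss"]])
print(f"Total expected loss: {total_expected_loss:,.0f}")  # about 88,200
print(ward_expected_loss)
```

For instance, the third loan contributes about 0.75 × 0.45 × 200,000 = 67,500, which is most of the total in this toy example.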

---

### Step 4: Risk Segmentation

Members are classified into:
- **Low Risk:** PD < 0.20  
- **Medium Risk:** 0.20 ≤ PD ≤ 0.50  
- **High Risk:** PD > 0.50  
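
One way to apply these thresholds is `numpy.select`, which handles the closed Medium band (both 0.20 and 0.50 included) without the edge-case ambiguity of interval binning. The function name `risk_category` is hypothetical.

```python
import numpy as np
import pandas as pd


def risk_category(scores: pd.Series) -> pd.Series:
    """Map PD scores to Low / Medium / High using the thresholds above."""
    # Conditions are checked in order, so the second one only sees PD >= 0.20.
    conditions = [scores < 0.20, scores <= 0.50]
    labels = ["Low", "Medium"]
    # Anything not matched (PD > 0.50) is High. Missing scores would also land
    # here, so clean the data before segmenting.
    cats = np.select(conditions, labels, default="High")
    return pd.Series(
        pd.Categorical(cats, categories=["Low", "Medium", "High"], ordered=True),
        index=scores.index,
        name="RiskCategory",
    )


scores = pd.Series([0.05, 0.20, 0.35, 0.50, 0.51, 0.90])
categories = risk_category(scores)
print(categories.tolist())
# ['Low', 'Medium', 'Medium', 'Medium', 'High', 'High']
print(categories.value_counts().sort_index())
```

Using an ordered categorical keeps Low, Medium, and High in a sensible order in tables and charts.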

This segmentation supports:
- Risk-based loan limits  
- Targeted monitoring  
- Focused interventions for high-risk members  

---

### Step 5: Loan Portfolio Performance Analysis

The SACCO loan portfolio is evaluated using:
- Total loan book size  
- Outstanding loan balances  
- Non-performing loan (NPL) rate  
- Ward-level risk concentration  

This enables management to identify **underperforming segments** and concentration risk.
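
The portfolio metrics above could be computed as in this sketch. It assumes the NPL rate is the delinquent outstanding balance divided by the total outstanding balance, and that `Delinquent` status stands in for non-performing; a SACCO's own NPL definition (for example, days past due) may differ. The loan rows and the `LoanAmount` and `OutstandingBalance` column names are illustrative.

```python
import pandas as pd

# Invented example loans; not project results.
loans = pd.DataFrame({
    "Ward": ["East Ugenya", "East Ugenya", "West Ugenya", "North Ugenya", "Ukwala"],
    "LoanAmount": [200_000, 100_000, 300_000, 150_000, 250_000],
    "OutstandingBalance": [100_000, 0, 200_000, 50_000, 150_000],
    "LoanStatus": ["Active", "Completed", "Delinquent", "Active", "Delinquent"],
})

loan_book = loans["LoanAmount"].sum()                   # total issued
total_outstanding = loans["OutstandingBalance"].sum()   # total still owed

# Assumption: delinquent balances are treated as non-performing.
npl_balance = loans.loc[loans["LoanStatus"] == "Delinquent", "OutstandingBalance"].sum()
npl_rate = npl_balance / total_outstanding

# Ward-level concentration: each ward's share of the outstanding balance.
ward_share = loans.groupby("Ward")["OutstandingBalance"].sum() / total_outstanding

print(f"Loan book: {loan_book:,}  Outstanding: {total_outstanding:,}")
print(f"NPL rate: {npl_rate:.1%}")
print(ward_share.sort_values(ascending=False))
```

A ward holding a large share of the outstanding balance together with a high delinquency rate is the kind of concentration this step is meant to surface.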

---

## 6. Reporting & Visualization

**Tools:**
- Python (Matplotlib, Seaborn)  
- Optional: Power BI or Excel dashboards  

**Key Visual Outputs:**
- Contributions by ward  
- Loan balance vs expected loss by ward  
- Risk distribution of SACCO members  
- Tenure vs default risk scatter plots  
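
Two of these visuals, contributions by ward and tenure versus default risk, might be produced with Matplotlib as sketched below. The data, the column names other than `DefaultRiskScore`, and the output filename `sacco_overview.png` are all placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented example data; not project results.
members = pd.DataFrame({
    "Ward": ["East Ugenya", "West Ugenya", "North Ugenya", "Ukwala",
             "East Ugenya", "West Ugenya", "North Ugenya", "Ukwala"],
    "MonthlyContribution": [2000, 1500, 4000, 3500, 3000, 2500, 1000, 500],
    "TenureYears": [6, 1, 12, 8, 10, 5, 2, 1],
    "DefaultRiskScore": [0.20, 0.70, 0.04, 0.15, 0.05, 0.25, 0.60, 0.80],
})

fig, (ax_bar, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: total monthly contributions per ward.
contrib = members.groupby("Ward")["MonthlyContribution"].sum()
ax_bar.bar(contrib.index, contrib.values)
ax_bar.set_title("Contributions by ward")
ax_bar.set_ylabel("Total monthly contribution")
ax_bar.tick_params(axis="x", rotation=30)

# Right panel: membership tenure against the default risk score.
ax_scatter.scatter(members["TenureYears"], members["DefaultRiskScore"])
ax_scatter.set_title("Tenure vs default risk")
ax_scatter.set_xlabel("Membership tenure (years)")
ax_scatter.set_ylabel("DefaultRiskScore")

fig.tight_layout()
fig.savefig("sacco_overview.png", dpi=150)  # placeholder output path
```

The same figures could be built with Seaborn or exported to Power BI or Excel; plain Matplotlib is used here only to keep the example short.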

---

## 7. Key Findings (Illustrative)

- Certain wards exhibit higher loan concentration and default risk  
- Members with shorter tenure tend to have higher default risk  
- Delinquent loans contribute disproportionately to expected losses  
- A small group of high-risk members accounts for most portfolio risk  

---

## 8. Recommendations to the SACCO

Based on the analysis, the SACCO can:
- Introduce risk-based loan limits  
- Strengthen monitoring for high-risk members  
- Review lending policies in high-risk wards  
- Use expected loss estimates for improved provisioning  
- Transition toward data-driven credit decision-making  

---

## 9. Business Impact

If implemented using real SACCO data, this approach can result in:
- Reduced loan default rates  
- Improved liquidity and sustainability  
- Better capital planning  
- Increased transparency in lending decisions  

---

## 10. Tools & Skills Demonstrated

### Technical Skills
- Python (Pandas, NumPy, Matplotlib, Seaborn)  
- Data cleaning and exploratory analysis  
- Actuarial credit risk modeling  
- Portfolio performance analytics  

### Business & Actuarial Skills
- Financial risk management  
- Data-driven decision-making  
- Portfolio risk assessment  
- Communication of insights to stakeholders  

---

## 11. Project Use Cases

This project can be presented as:
- A portfolio project  
- An interview case study  
- A proposal to a SACCO  
- A foundation for a full SACCO analytics system  

---

## Author
**Emmanuel Ouma**  
BSc Actuarial Science | Data Analyst  

---

