# 🎯 Phase 2B: Risk Prediction Models

## 🚀 Advanced Library Analytics: Predictive Modeling

This notebook builds machine learning models to predict:
- **Overdue Loans**: Which books will be returned late?
- **Member Churn**: Which members are at risk of leaving?
- **High-Value Interventions**: Which cases should staff prioritize for maximum impact?

**Business Value:**
- 📧 **Proactive Reminders**: Contact members before books become overdue
- 📊 **Resource Planning**: Allocate staff time efficiently
- 💰 **Revenue Protection**: Reduce losses from unreturned books and uncollected late fees
- 😊 **Member Experience**: Improve satisfaction through early warnings

## 📚 Connect to Foundation Data

Load the dataset generated in `01_data_modeling.ipynb`.

In [None]:
# Core libraries
import pandas as pd
import numpy as np
import sqlite3
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta, date
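import warnings
# Silence all warnings to keep notebook output readable.
# Note: this also hides deprecation notices from pandas and seaborn.
warnings.filterwarnings('ignore')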

# Set plotting style
plt.style.use('default')
sns.set_palette("husl")

print("📊 Libraries loaded successfully!")
print(f"🕒 Analysis date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

In [None]:
# Connect to the database created in 01_data_modeling.ipynb
conn = sqlite3.connect('../notebooks/library.db')

# Verify our data is available
data_summary = pd.read_sql_query("""
    SELECT 
        'Members' as Table_Name, COUNT(*) as Record_Count 
    FROM Member
    UNION ALL
    SELECT 'Loans', COUNT(*) FROM Loan
    UNION ALL
    SELECT 'Fact_Borrow_Events', COUNT(*) FROM Fact_Borrow_Events
    UNION ALL
    SELECT 'Member_Behavior_Analytics', COUNT(*) FROM Member_Behavior_Analytics
    ORDER BY Record_Count DESC;
""", conn)

print("🔌 Database connected successfully!")
print("📊 Available data:")
for _, row in data_summary.iterrows():
    print(f"   {row['Table_Name']}: {row['Record_Count']:,} records")

# Close the connection; later cells must reconnect before querying
conn.close()
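
One caveat with the check above: `sqlite3.connect` creates an empty database file when the path does not exist, so a wrong path only surfaces later as a "no such table" error from the count query. A minimal sketch of a pre-check follows. The helper name `find_missing_tables` is hypothetical, and the demo runs against an in-memory database rather than `library.db` so that it is self-contained.

```python
import sqlite3

# Table names taken from the verification query above.
REQUIRED_TABLES = [
    "Member",
    "Loan",
    "Fact_Borrow_Events",
    "Member_Behavior_Analytics",
]


def find_missing_tables(conn, required):
    """Return the names in `required` that are neither tables nor views.

    Hypothetical helper for illustration. SQLite treats table names as
    case-insensitive, so the comparison is done in lowercase.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
    ).fetchall()
    existing = {name.lower() for (name,) in rows}
    return [name for name in required if name.lower() not in existing]


# Demo on an in-memory database that contains only two of the four tables.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE Member (member_id INTEGER PRIMARY KEY)")
demo_conn.execute("CREATE TABLE Loan (loan_id INTEGER PRIMARY KEY)")

missing = find_missing_tables(demo_conn, REQUIRED_TABLES)
demo_conn.close()

print("Missing tables:", missing)
```

Running the same helper on the real connection before the count query would give a readable list of missing tables instead of an SQL error.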

## 🎯 Model 1: Overdue Loan Prediction

**Goal**: Predict the probability that a newly issued loan will be returned late.

**Features to Engineer:**
- Member historical behavior (late return rate, reading frequency)
- Book characteristics (popularity, length, genre)
- Temporal patterns (season, day of week, holidays)
- Library context (branch type, capacity)
- Member demographics (persona, membership level)
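
As a rough illustration of the first and third feature groups, the sketch below derives a late-return label, each member's prior late-return rate, and simple calendar features from a toy loan table. All column names (`member_id`, `loan_date`, `due_date`, `return_date`) are assumptions, not the schema from `01_data_modeling.ipynb`. The history features only use loans issued earlier by the same member, which avoids leaking the current loan's outcome into its own features. The sketch also assumes each earlier loan was already returned when the next one was issued; real data would need an explicit check on return dates.

```python
import pandas as pd

# Toy loan history; every column name here is hypothetical.
loans = pd.DataFrame({
    "member_id": [1, 1, 1, 2, 2],
    "loan_date": pd.to_datetime(
        ["2024-01-05", "2024-02-10", "2024-03-15", "2024-01-20", "2024-02-25"]),
    "due_date": pd.to_datetime(
        ["2024-01-19", "2024-02-24", "2024-03-29", "2024-02-03", "2024-03-10"]),
    "return_date": pd.to_datetime(
        ["2024-01-25", "2024-02-20", "2024-04-02", "2024-02-01", "2024-03-12"]),
})

# Target: 1 if the book came back after its due date, else 0.
loans["is_late"] = (loans["return_date"] > loans["due_date"]).astype(int)

# Order each member's loans chronologically before computing history features.
loans = loans.sort_values(["member_id", "loan_date"]).reset_index(drop=True)
by_member = loans.groupby("member_id")["is_late"]

# Number of earlier loans by the same member (0 for a member's first loan).
loans["prior_loans"] = by_member.cumcount()

# Late returns among earlier loans: running total minus the current loan.
loans["prior_late"] = by_member.cumsum() - loans["is_late"]

# Historical late rate; NaN when the member has no earlier loans.
loans["prior_late_rate"] = (
    loans["prior_late"] / loans["prior_loans"].where(loans["prior_loans"] > 0)
)

# Temporal features taken from the issue date.
loans["loan_dow"] = loans["loan_date"].dt.dayofweek  # Monday = 0
loans["loan_month"] = loans["loan_date"].dt.month

print(loans[["member_id", "loan_date", "is_late",
             "prior_loans", "prior_late_rate", "loan_dow", "loan_month"]])
```

First-time borrowers end up with a missing `prior_late_rate`. Whether to impute that value or add a separate "new member" indicator is a modeling choice this sketch leaves open.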

In [None]:
# Ready for ML model development!
print("🚀 Ready to build predictive models!")
print("🎯 Next: Feature engineering for overdue prediction")
print("📈 Then: Model training and validation")
print("🔍 Finally: Business insights and recommendations")