# Portfolio Collections Dashboard

**Goal:** Visualise arrears and default trends to support proactive collections management for a water utility company.

This analysis reuses the same dataset as the Predicting Customer Payment Default Risk project to create actionable insights for collections teams.

## 1) Setup & Data Loading

In [None]:
# Imports & Paths
import os
import sqlite3
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import ipywidgets as widgets
from IPython.display import display, clear_output

# Database connection
DB_PATH = '../data/water_collections_demo.sqlite'
con = sqlite3.connect(DB_PATH)

print(f"Connected to database: {DB_PATH}")
print(f"Working directory: {os.getcwd()}")
print("📊 Interactive dashboard components loaded!")

In [None]:
# Sanity Checks - Verify data loaded correctly
tables = ['customers', 'bills', 'payments', 'collections_actions']

print("Data verification:")
for table in tables:
    count = pd.read_sql(f'SELECT COUNT(*) as n FROM {table}', con).iloc[0, 0]
    print(f"{table:20}: {count:,} rows")

## 2) SQL Views & Helper Indexes

The database ships with pre-created indexes and analytical views (such as `bill_targets`) so that the queries below run quickly. The cells in this section only verify that they exist.
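
For reference, helper indexes of the kind verified below could be created roughly as follows. This is a minimal sketch against an in-memory database: the table layouts, column names, and index names are assumptions for illustration, not the project's actual schema.

```python
import sqlite3

# In-memory stand-in for the demo database (hypothetical schema).
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE bills (
    bill_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    due_date TEXT,
    bill_amount REAL
);
CREATE TABLE payments (
    payment_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    payment_date TEXT,
    amount REAL
);
-- Helper indexes on the join and filter columns, named with the ix_ prefix.
CREATE INDEX IF NOT EXISTS ix_bills_customer_due
    ON bills (customer_id, due_date);
CREATE INDEX IF NOT EXISTS ix_payments_customer_date
    ON payments (customer_id, payment_date);
""")

# Same lookup pattern as the verification cell: list custom indexes by prefix.
index_names = [
    row[0]
    for row in demo.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='index' AND name LIKE 'ix_%' ORDER BY name"
    )
]
print(index_names)
demo.close()
```

`IF NOT EXISTS` keeps the setup script safe to re-run against a database that already has the indexes.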

In [None]:
# Verify indexes exist
indexes = pd.read_sql("SELECT name FROM sqlite_master WHERE type='index' AND name LIKE 'ix_%'", con)
print(f"Custom indexes created: {len(indexes)}")
for idx in indexes['name']:
    print(f"  ✓ {idx}")

In [None]:
# Verify bill_targets view exists and test it
view_test = pd.read_sql("""
SELECT 
    COUNT(*) as total_bills,
    SUM(default_60d) as defaults,
    ROUND(AVG(default_60d), 4) as default_rate,
    ROUND(AVG(bill_amount), 2) as avg_bill,
    ROUND(AVG(paid_in_window), 2) as avg_paid
FROM bill_targets
""", con)

print("KPI View Validation:")
display(view_test)

## 3) KPI Definitions (Arrears, Default Rate, Arrangements)

**Default Definition**: A bill is in default if it is not paid within the window from D-3 to D+60, that is, from 3 days before the due date to 60 days after it.

**Payment Matching**: Payments are matched to bills with a £1 tolerance to account for rounding differences.

**Key Metrics**:
- **Default Rate**: Percentage of bills that default within the 60-day collection window
- **Monthly Trends**: Default rates aggregated by billing month
- **Segmentation**: Default rates by customer demographics (income, region)
- **Collections Activity**: Volume and distribution of collection actions
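
The default definition and payment matching above can be sketched in pandas as follows. This is an illustration under stated assumptions, not the logic behind the `bill_targets` view: the toy data and column names are hypothetical, payments are matched to bills by customer only, and the £1 tolerance is read as "total paid in the window may fall short of the bill by up to £1".

```python
import pandas as pd

TOLERANCE = 1.00  # £1 payment-matching tolerance (interpretation assumed)

# Hypothetical toy data; the real tables may be shaped differently.
bills = pd.DataFrame({
    "bill_id": [1, 2, 3],
    "customer_id": [10, 11, 12],
    "due_date": pd.to_datetime(["2024-01-31", "2024-01-31", "2024-01-31"]),
    "bill_amount": [50.00, 80.00, 65.00],
})
payments = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "payment_date": pd.to_datetime(["2024-02-10", "2024-04-15", "2024-01-29"]),
    "amount": [49.50, 80.00, 30.00],
})


def flag_defaults(bills, payments, tolerance=TOLERANCE):
    """Return bills with paid_in_window and a 0/1 default_60d flag."""
    merged = bills.merge(payments, on="customer_id", how="left")
    # Keep only payments dated between D-3 and D+60 (inclusive).
    in_window = (
        (merged["payment_date"] >= merged["due_date"] - pd.Timedelta(days=3))
        & (merged["payment_date"] <= merged["due_date"] + pd.Timedelta(days=60))
    )
    merged["paid_in_window"] = merged["amount"].where(in_window, 0.0)
    paid = merged.groupby("bill_id", as_index=False)["paid_in_window"].sum()
    out = bills.merge(paid, on="bill_id", how="left")
    # Default when the amount paid falls short by more than the tolerance.
    shortfall = out["bill_amount"] - out["paid_in_window"]
    out["default_60d"] = (shortfall > tolerance).astype(int)
    return out


targets = flag_defaults(bills, payments)
print(targets[["bill_id", "paid_in_window", "default_60d"]])
```

In this toy data, bill 1 is paid within tolerance, bill 2 is paid too late, and bill 3 is only part-paid, so bills 2 and 3 are flagged as defaults.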

## 4) Portfolio EDA

In [None]:
# Monthly trend analysis
kpi_monthly = pd.read_sql("""
SELECT 
    strftime('%Y-%m', bill_period_end) AS month,
    AVG(default_60d) AS default_rate,
    SUM(bill_amount) AS billed,
    SUM(paid_in_window) AS paid,
    COUNT(*) as bill_count
FROM bill_targets 
GROUP BY 1 
ORDER BY 1
""", con)

print(f"Monthly data: {len(kpi_monthly)} months")
print("\nSample monthly KPIs:")
display(kpi_monthly.head())

In [None]:
# Income band segmentation
seg_income = pd.read_sql("""
SELECT 
    c.income_band,
    ROUND(AVG(bt.default_60d), 3) AS default_rate,
    ROUND(AVG(bt.bill_amount), 2) AS avg_bill,
    COUNT(*) AS n_bills
FROM bill_targets bt 
JOIN customers c USING(customer_id) 
GROUP BY c.income_band 
ORDER BY default_rate DESC
""", con)

print("Default rates by income band:")
display(seg_income)

In [None]:
# Regional segmentation
seg_region = pd.read_sql("""
SELECT 
    c.region,
    ROUND(AVG(bt.default_60d), 3) AS default_rate,
    COUNT(*) AS n_bills
FROM bill_targets bt 
JOIN customers c USING(customer_id) 
GROUP BY c.region 
ORDER BY default_rate DESC
""", con)

print("Default rates by region:")
display(seg_region)

In [None]:
# Collections actions distribution
actions_dist = pd.read_sql("""
SELECT 
    action,
    COUNT(*) as n
FROM collections_actions 
GROUP BY action 
ORDER BY n DESC
""", con)

print("Collections actions distribution:")
display(actions_dist)

## 5) Interactive Dashboard

In [None]:
# Interactive Dashboard Setup
def create_interactive_dashboard():
    """Create interactive dashboard with widgets and plotly charts"""
    
    # Reuse the DataFrames loaded in Section 4 (they are only read here, so `global` is optional)
    global kpi_monthly, seg_income, seg_region, actions_dist
    
    # Create output widget for dashboard
    output = widgets.Output()
    
    # Dashboard controls
    date_range = widgets.SelectionRangeSlider(
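        # (label, value) pairs: show the month, but pass its row position to update_dashboard
        options=[(month, i) for i, month in enumerate(kpi_monthly['month'])],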
        index=(0, len(kpi_monthly)-1),
        description='Date Range:',
        style={'description_width': 'initial'}
    )
    
    metric_selector = widgets.Dropdown(
        options=[('Default Rate', 'default_rate'), ('Billed Amount', 'billed'), ('Paid Amount', 'paid')],
        value='default_rate',
        description='Metric:',
        style={'description_width': 'initial'}
    )
    
    segment_selector = widgets.Dropdown(
        options=[('Income Band', 'income'), ('Region', 'region')],
        value='income',
        description='Segment By:',
        style={'description_width': 'initial'}
    )
    
    def update_dashboard(date_range_val, metric, segment):
        with output:
            clear_output(wait=True)
            
            # Filter data based on date range
            start_idx, end_idx = date_range_val
            filtered_monthly = kpi_monthly.iloc[start_idx:end_idx+1]
            
            # Create subplot layout
            fig = make_subplots(
                rows=2, cols=2,
                subplot_titles=('Monthly Trend', 'Segmentation Analysis', 'Collections Actions', 'KPI Summary'),
                specs=[[{"secondary_y": True}, {}],
                       [{"type": "domain"}, {"type": "indicator"}]]  # pie traces need a domain-type cell
            )
            
            # 1. Monthly trend line
            fig.add_trace(
                go.Scatter(
                    x=filtered_monthly['month'],
                    y=filtered_monthly[metric],
                    mode='lines+markers',
                    name=metric.replace('_', ' ').title(),
                    line=dict(width=3)
                ),
                row=1, col=1
            )
            
            # 2. Segmentation bar chart
            if segment == 'income':
                seg_data = seg_income
                x_col, y_col = 'income_band', 'default_rate'
            else:
                seg_data = seg_region
                x_col, y_col = 'region', 'default_rate'
            
            fig.add_trace(
                go.Bar(
                    x=seg_data[x_col],
                    y=seg_data[y_col],
                    name='Default Rate by ' + segment.title(),
                    marker_color='lightblue'
                ),
                row=1, col=2
            )
            
            # 3. Collections actions pie chart
            fig.add_trace(
                go.Pie(
                    labels=actions_dist['action'],
                    values=actions_dist['n'],
                    name="Actions"
                ),
                row=2, col=1
            )
            
            # 4. KPI indicator
            current_rate = filtered_monthly[metric].iloc[-1] if len(filtered_monthly) > 0 else 0
            fig.add_trace(
                go.Indicator(
                    mode="gauge+number+delta",
                    value=current_rate,
                    domain={'x': [0, 1], 'y': [0, 1]},
                    title={'text': metric.replace('_', ' ').title()},
                    gauge={'axis': {'range': [None, current_rate * 1.5]},
                           'bar': {'color': "darkblue"},
                           'steps': [{'range': [0, current_rate * 0.5], 'color': "lightgray"},
                                   {'range': [current_rate * 0.5, current_rate], 'color': "gray"}],
                           'threshold': {'line': {'color': "red", 'width': 4},
                                       'thickness': 0.75, 'value': current_rate * 1.2}}
                ),
                row=2, col=2
            )
            
            # Update layout
            fig.update_layout(
                height=800,
                title_text="Portfolio Collections Dashboard - Interactive View",
                title_x=0.5,
                showlegend=False
            )
            
            fig.show()
    
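    # Wire the controls to update_dashboard; the returned container is not displayed
    # because the controls are laid out manually below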
    interactive_plot = widgets.interactive(
        update_dashboard,
        date_range_val=date_range,
        metric=metric_selector,
        segment=segment_selector
    )
    
    # Display dashboard
    display(widgets.VBox([
        widgets.HTML("<h3>🎛️ Dashboard Controls</h3>"),
        widgets.HBox([date_range, metric_selector, segment_selector]),
        output
    ]))
    
    # Trigger initial update
    update_dashboard((0, len(kpi_monthly)-1), 'default_rate', 'income')

print("Interactive dashboard function created!")

In [None]:
# Launch Interactive Dashboard
print("🚀 Launching Interactive Portfolio Collections Dashboard...")
print("Use the controls below to explore different time periods, metrics, and segments.")
print()

# Backward-compatible alias for the misspelled name `kmp_monthly`
kmp_monthly = kpi_monthly.copy()

create_interactive_dashboard()

### Interactive Features:
- **Date Range Slider**: Filter analysis to specific time periods
- **Metric Selector**: Switch between default rate, billed amounts, and paid amounts
- **Segmentation Toggle**: View analysis by income band or region
- **Live Updates**: The dashboard redraws whenever a control changes; the date range and metric drive the trend chart and gauge, and the segment selector drives the bar chart

### Dashboard Components:
1. **Monthly Trend**: Interactive line chart with zoom and hover details
2. **Segmentation Analysis**: Dynamic bar chart based on selected dimension
3. **Collections Actions**: Pie chart showing action distribution
4. **KPI Gauge**: Gauge of the selected metric for the last month in the chosen range, with a reference marker at 120% of that value

## 6) Static Charts for Documentation

In [None]:
# Set up plotting style for static exports
sns.set_theme(style="whitegrid")
plt.rcParams['figure.dpi'] = 100
plt.rcParams['savefig.dpi'] = 200

# Ensure output directory exists
os.makedirs('../outputs/figures', exist_ok=True)

print("Creating static charts for README documentation...")

In [None]:
# 1) Monthly default rate trend
plt.figure(figsize=(12, 6))
plt.plot(kpi_monthly['month'], kpi_monthly['default_rate'], 
         marker='o', linewidth=2, markersize=6)
plt.title('Monthly Default Rate Trend', fontsize=16, fontweight='bold')
plt.xlabel('Month', fontsize=12)
plt.ylabel('Default Rate', fontsize=12)
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('../outputs/figures/monthly_default_rate.png', dpi=200, bbox_inches='tight')
plt.show()

print("✓ Saved monthly_default_rate.png")

In [None]:
# 2) Default rate by income band
plt.figure(figsize=(10, 6))
sns.barplot(data=seg_income, x='income_band', y='default_rate', 
            palette='viridis')
plt.title('Default Rate by Income Band', fontsize=16, fontweight='bold')
plt.xlabel('Income Band', fontsize=12)
plt.ylabel('Default Rate', fontsize=12)
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('../outputs/figures/default_by_income.png', dpi=200, bbox_inches='tight')
plt.show()

print("✓ Saved default_by_income.png")

In [None]:
# 3) Default rate by region
plt.figure(figsize=(12, 6))
sns.barplot(data=seg_region, x='region', y='default_rate', 
            palette='plasma')
plt.title('Default Rate by Region', fontsize=16, fontweight='bold')
plt.xlabel('Region', fontsize=12)
plt.ylabel('Default Rate', fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.savefig('../outputs/figures/default_by_region.png', dpi=200, bbox_inches='tight')
plt.show()

print("✓ Saved default_by_region.png")

In [None]:
# 4) Collections actions distribution
plt.figure(figsize=(10, 6))
sns.barplot(data=actions_dist, x='action', y='n', 
            palette='coolwarm')
plt.title('Collections Actions Volume', fontsize=16, fontweight='bold')
plt.xlabel('Action Type', fontsize=12)
plt.ylabel('Number of Actions', fontsize=12)
plt.xticks(rotation=20)
plt.tight_layout()
plt.savefig('../outputs/figures/actions_distribution.png', dpi=200, bbox_inches='tight')
plt.show()

print("✓ Saved actions_distribution.png")

## 7) Actionable Insights

In [None]:
# Key insights summary
print("=== KEY INSIGHTS ===")
print()

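# Overall portfolio performance (weight each month by its bill count so small months do not skew the rate)
overall_default = np.average(kpi_monthly['default_rate'], weights=kpi_monthly['bill_count'])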
print(f"📊 Overall Portfolio Default Rate: {overall_default:.1%}")

# Trend analysis
recent_trend = kpi_monthly.tail(6)['default_rate'].mean()
early_trend = kpi_monthly.head(6)['default_rate'].mean()
trend_change = recent_trend - early_trend
print(f"📈 Recent 6-month trend: {trend_change:+.1%} vs early period")

# Highest risk segments
highest_risk_income = seg_income.iloc[0]
lowest_risk_income = seg_income.iloc[-1]
print(f"🎯 Highest risk income band: {highest_risk_income['income_band']} ({highest_risk_income['default_rate']:.1%})")
print(f"🎯 Lowest risk income band: {lowest_risk_income['income_band']} ({lowest_risk_income['default_rate']:.1%})")

highest_risk_region = seg_region.iloc[0]
print(f"🌍 Highest risk region: {highest_risk_region['region']} ({highest_risk_region['default_rate']:.1%})")

# Collections activity
total_actions = actions_dist['n'].sum()
most_common_action = actions_dist.iloc[0]
print(f"📞 Total collections actions: {total_actions:,}")
print(f"📞 Most common action: {most_common_action['action']} ({most_common_action['n']:,} times)")

In [None]:
# Display key data tables for reference
print("\n=== DETAILED SEGMENTATION DATA ===")
print("\nMonthly Performance (Last 6 months):")
display(kpi_monthly.tail(6))

print("\nIncome Band Analysis:")
display(seg_income)

print("\nRegional Analysis:")
display(seg_region)

print("\nCollections Actions:")
display(actions_dist)

## 8) Save Figures for README

The static charts from Section 6 are saved to `../outputs/figures/` for inclusion in the GitHub README.

In [None]:
# Verify all figures were saved
import glob

figure_files = glob.glob('../outputs/figures/*.png')
expected_files = [
    'monthly_default_rate.png',
    'default_by_income.png', 
    'default_by_region.png',
    'actions_distribution.png'
]

print("Figure export verification:")
for expected in expected_files:
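    full_path = os.path.join('../outputs/figures', expected)
    # Check the file directly: glob results may use a different path separator on some platforms
    if os.path.isfile(full_path):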
        file_size = os.path.getsize(full_path) / 1024  # KB
        print(f"✓ {expected} ({file_size:.1f} KB)")
    else:
        print(f"✗ {expected} - MISSING")

print(f"\nTotal figures saved: {len(figure_files)}")

## 9) Conclusions & Next Steps

### Key Findings:

1. **Portfolio Performance**: The overall default rate provides a baseline for collections effectiveness
2. **Demographic Risk**: Clear variation in default rates across income bands and regions
3. **Collections Activity**: Understanding of current intervention patterns and volumes
4. **Seasonal Patterns**: Monthly trends reveal potential seasonal effects on payment behaviour
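
One way to probe the seasonal-pattern finding is to pool the monthly KPIs by calendar month. The sketch below uses made-up numbers in a frame shaped like `kpi_monthly` (the `monthly_demo` name and its values are hypothetical) and weights each month by its bill count.

```python
import pandas as pd

# Hypothetical monthly KPIs with the same columns as kpi_monthly.
monthly_demo = pd.DataFrame({
    "month": ["2023-01", "2023-07", "2024-01", "2024-07"],
    "default_rate": [0.12, 0.06, 0.14, 0.08],
    "bill_count": [100, 100, 300, 100],
})

# Extract the calendar month (1-12) from the 'YYYY-MM' label.
monthly_demo["calendar_month"] = monthly_demo["month"].str[-2:].astype(int)

# Convert rates back to default counts so months pool correctly.
monthly_demo["defaults"] = monthly_demo["default_rate"] * monthly_demo["bill_count"]

seasonal = monthly_demo.groupby("calendar_month")[["defaults", "bill_count"]].sum()
seasonal["default_rate"] = seasonal["defaults"] / seasonal["bill_count"]
print(seasonal)
```

With only a few years of history, differences between calendar months may reflect noise rather than seasonality, so a result like this is a prompt for further checks rather than a conclusion.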

### Recommended Actions:

1. **Targeted Interventions**: Focus collections efforts on highest-risk segments
2. **Early Warning Systems**: Use demographic and regional data for proactive identification
3. **Process Optimisation**: Analyse collections action effectiveness and timing
4. **Monitoring**: Regular dashboard updates to track performance improvements

### Potential Extensions:

- **Cohort Analysis**: Track customer payment behaviour over time
- **Tariff Analysis**: Examine default rates by tariff type
- **Arrears Balances**: Calculate outstanding amounts in addition to default rates
- **Predictive Modelling**: Build early warning models for proactive collections
- **ROI Analysis**: Measure collections action effectiveness and cost-benefit
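
As a starting point for the arrears-balance extension, the billed and paid totals already in the monthly KPIs can be turned into a shortfall and a running balance. This is a rough sketch with invented figures (`kpi_demo` is hypothetical). Because `paid` only counts payments inside the collection window, the result is a proxy for arrears rather than a true ledger balance.

```python
import pandas as pd

# Hypothetical monthly totals with the same columns as kpi_monthly.
kpi_demo = pd.DataFrame({
    "month": ["2024-01", "2024-02", "2024-03"],
    "billed": [1000.0, 1200.0, 900.0],
    "paid": [900.0, 1000.0, 850.0],
})

# Amount billed but not collected within the window, per month.
kpi_demo["shortfall"] = kpi_demo["billed"] - kpi_demo["paid"]

# Running total of uncollected amounts across months.
kpi_demo["cumulative_arrears"] = kpi_demo["shortfall"].cumsum()

# Share of billed value collected within the window.
kpi_demo["collection_rate"] = kpi_demo["paid"] / kpi_demo["billed"]

print(kpi_demo)
```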

In [None]:
# Close database connection
con.close()
print("Database connection closed.")
print("\n🎉 Portfolio Collections Dashboard analysis complete!")