### Connecting to PostgreSQL

In [65]:

# SQLAlchemy lets Python programs talk to databases
%pip install sqlalchemy

Note: you may need to restart the kernel to use updated packages.


In [66]:
# psycopg2 is the driver used to connect to and work with PostgreSQL databases
%pip install psycopg2-binary

Note: you may need to restart the kernel to use updated packages.


In [67]:
# Load the sql extension, which provides the %sql and %%sql magics
%load_ext sql
# create_engine sets up a connection between Python and a database
from sqlalchemy import create_engine
# Show SQL results using the older table style
%config SqlMagic.style = '_DEPRECATED_DEFAULT'

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [68]:
# ipython-sql provides the sql extension loaded with %load_ext sql
%pip install ipython-sql

Note: you may need to restart the kernel to use updated packages.


In [69]:
# Connection string format: postgresql://user:password@host:port/database
%sql postgresql://postgres:@localhost:5432/customerdb
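
The connection string is written directly into the notebook. As a minimal sketch, assuming the credentials are instead supplied through environment variables, the URL could be assembled as shown here. The helper `build_postgres_url` is illustrative and not part of the original notebook. The variable names follow the libpq convention, but reading them here is this sketch's choice. The password is percent-encoded so that characters such as `@` or `/` do not break the URL.

```python
import os
from urllib.parse import quote_plus


def build_postgres_url(user, password, host, port, database):
    """Return a postgresql:// URL with user and password percent-encoded."""
    credentials = quote_plus(user)
    if password:
        # Only add the ":password" part when a password is actually set.
        credentials += ":" + quote_plus(password)
    return f"postgresql://{credentials}@{host}:{port}/{database}"


# Defaults mirror the connection string used in this notebook.
url = build_postgres_url(
    user=os.environ.get("PGUSER", "postgres"),
    password=os.environ.get("PGPASSWORD", ""),
    host=os.environ.get("PGHOST", "localhost"),
    port=int(os.environ.get("PGPORT", "5432")),
    database=os.environ.get("PGDATABASE", "customerdb"),
)
print(url)
```

The resulting string can be passed to `create_engine`, which the notebook already imports.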

### 2. Transaction Analysis

## Overview

### 2.1 How do different types of transactions and customer behaviors influence the overall transaction landscape?


### 2.1.1 How many transactions do customers make on average, and which types are the most common?

In [21]:
%%sql
SELECT 
    COUNT(t.transaction_id) AS total_transactions,
    COUNT(DISTINCT t.customer_identifier) AS total_customers,
    ROUND(SUM(t.amt)) AS sum_transaction_amount,
    t.transaction_description
FROM 
    transactions t
GROUP BY 
    t.transaction_description
ORDER BY 
    sum_transaction_amount DESC;


 * postgresql://postgres:***@localhost:5432/customerdb
129 rows affected.


total_transactions,total_customers,sum_transaction_amount,transaction_description
18641,2307,135659407.0,ACB CREDIT
1303,1008,22289971.0,TRANSFER FROM
7731,1214,22157756.0,IBANK PAYMENT FROM
6295,1112,13994160.0,IMMEDIATE TRF CR
11515,769,12146092.0,IBANK TRANSFER
1509,1509,9116482.0,OPENED-FROM SAV
1025,192,3675248.0,NPF CREDIT
2068,609,3159090.0,CARDLESS CASH DEP
517,197,2460239.0,CREDIT TRANSFER
1223,493,2101470.0,CASH ACCEPTOR DEP
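
The query above ranks transaction types by summed amount, so it does not directly answer the "on average" and "most common" parts of the question. A minimal sketch of both calculations follows, run against an in-memory SQLite table so that it is self-contained. The rows are toy values invented for illustration and are not taken from `customerdb`; only the column names come from the notebook.

```python
import sqlite3

# Toy rows for illustration only; they are NOT taken from customerdb.
rows = [
    (1, "C1", 100.0, "ACB CREDIT"),
    (2, "C1", -40.0, "IBANK TRANSFER"),
    (3, "C1", -10.0, "IBANK TRANSFER"),
    (4, "C2", 250.0, "ACB CREDIT"),
    (5, "C2", -75.0, "IBANK TRANSFER"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "transaction_id INTEGER, customer_identifier TEXT, "
    "amt REAL, transaction_description TEXT)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

# Overall average: total transactions divided by distinct customers.
avg_per_customer = conn.execute(
    "SELECT COUNT(*) * 1.0 / COUNT(DISTINCT customer_identifier) "
    "FROM transactions"
).fetchone()[0]

# Most common types, ranked by count rather than by summed amount.
most_common = conn.execute(
    "SELECT transaction_description, COUNT(*) AS n "
    "FROM transactions "
    "GROUP BY transaction_description "
    "ORDER BY n DESC, transaction_description"
).fetchall()

print(avg_per_customer)
print(most_common)
```

The same two SELECT statements should carry over to the PostgreSQL `transactions` table, since they use only standard aggregates.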


### 2.1.2 What is the breakdown of transaction volume across different transaction channels?

In [22]:
%%sql
SELECT 
    t.channel,
    COUNT(t.transaction_id) AS total_transactions,
    ROUND(SUM(t.amt)) AS sum_transaction_amount
FROM 
    transactions t
GROUP BY 
    t.channel
ORDER BY 
    sum_transaction_amount DESC;


 * postgresql://postgres:***@localhost:5432/customerdb
4 rows affected.


channel,total_transactions,sum_transaction_amount
system,509581,74810502.0
teller,21464,61690496.0
internet,119238,-29068726.0
atm,197261,-96933397.0
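
The negative totals for `internet` and `atm` suggest that `amt` is signed, with debits stored as negative values, so `SUM(t.amt)` would be a net flow rather than a volume. That reading is an assumption; the notebook does not document the column. Under that assumption, a sketch that reports credits, debits, and the net figure separately could look like this. It uses toy rows in an in-memory SQLite table, not real `customerdb` data.

```python
import sqlite3

# Toy rows (channel, amt) for illustration only; NOT real customerdb data.
rows = [
    ("atm", -200.0),
    ("atm", -50.0),
    ("atm", 100.0),
    ("teller", 500.0),
    ("teller", -100.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (channel TEXT, amt REAL)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)

# Split the signed amount into inflows and outflows per channel.
by_channel = conn.execute(
    """
    SELECT
        channel,
        COUNT(*) AS total_transactions,
        SUM(CASE WHEN amt > 0 THEN amt ELSE 0 END) AS credits,
        SUM(CASE WHEN amt < 0 THEN amt ELSE 0 END) AS debits,
        SUM(amt) AS net_amount
    FROM transactions
    GROUP BY channel
    ORDER BY channel
    """
).fetchall()

for row in by_channel:
    print(row)
```

Reporting the two sides separately keeps a channel with heavy but balanced traffic from looking small.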


### 2.1.3 How do transaction volumes vary across different income groups?

In [24]:
%%sql

SELECT 
    i.income_group_desc AS income_group,
    COUNT(t.transaction_id) AS total_transactions,
    ROUND(SUM(t.amt)) AS sum_transaction_amount
FROM 
    transactions t
JOIN 
    customers c ON t.customer_identifier = c.customer_identifier
JOIN 
    income_group i ON c.income_group_code = i.income_group_code
GROUP BY 
    i.income_group_desc
ORDER BY 
    sum_transaction_amount DESC;


 * postgresql://postgres:***@localhost:5432/customerdb
22 rows affected.


income_group,total_transactions,sum_transaction_amount
"R8,000 - 8,999",47663,1154261.0
"R10,000 - 11,999",69661,1121973.0
"R20,000 - 24,999",105093,968400.0
"R7,000 - 7,999",47822,918829.0
"R25,000 - 33,999",90633,918054.0
"R5,000 - 5,999",40719,894124.0
"R15,000 - 19,999",131996,842871.0
"R4,000 - 4,999",38242,761389.0
"R6,000 - 6,999",41316,740206.0
"R9,000 - 9,999",31597,526430.0

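Because the result is ordered by summed amount, the income bands appear out of sequence. A small sketch for ordering the band labels by their lower bound is given below. The helper `band_lower_bound` is hypothetical, and it assumes every label follows the `R<low> - <high>` pattern seen in the rows above. Labels in the rows not shown may differ, so anything that does not match returns `None` and is sorted last.

```python
import re


def band_lower_bound(label):
    """Return the lower bound of a label like 'R8,000 - 8,999', or None."""
    match = re.match(r"\s*R\s*(\d[\d,]*)", label)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def band_sort_key(label):
    """Sort parsed bands numerically and unparsed labels last."""
    bound = band_lower_bound(label)
    return (bound is None, bound if bound is not None else 0)


# Labels copied from the output above.
labels = [
    "R8,000 - 8,999",
    "R10,000 - 11,999",
    "R20,000 - 24,999",
    "R7,000 - 7,999",
]
ordered = sorted(labels, key=band_sort_key)
print(ordered)
```
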

### 2.1.4 How do different income groups use product codes?

In [25]:
%%sql
SELECT 
    i.income_group_desc AS income_group,
    t.product_code,
    COUNT(t.transaction_id) AS total_transactions,
    ROUND(SUM(t.amt)) AS sum_transaction_amount
FROM 
    transactions t
JOIN 
    customers c ON t.customer_identifier = c.customer_identifier
JOIN 
    income_group i ON c.income_group_code = i.income_group_code
GROUP BY 
    i.income_group_desc, t.product_code
ORDER BY 
    sum_transaction_amount DESC;



 * postgresql://postgres:***@localhost:5432/customerdb
43 rows affected.


income_group,product_code,total_transactions,sum_transaction_amount
"R20,000 - 24,999",CHEQ,73552,887761.0
"R4,000 - 4,999",CHEQ,7135,796450.0
"R7,000 - 7,999",CHEQ,19917,783928.0
"R8,000 - 8,999",CHEQ,22222,737025.0
"R25,000 - 33,999",CHEQ,67685,713317.0
"R10,000 - 11,999",CHEQ,33476,661376.0
"R5,000 - 5,999",CHEQ,13084,639712.0
"R15,000 - 19,999",CHEQ,89797,628060.0
"R6,000 - 6,999",CHEQ,13472,627688.0
"R12,000 - 14,999",CHEQ,49197,513703.0


### 2.2 How can the bank optimize transaction channels to increase efficiency, reduce costs, and enhance customer satisfaction?

### 2.2.1 What proportion of teller-based transactions could have been conducted on digital platforms, and how can the bank encourage customers to transition to these more efficient channels?

In [27]:
%%sql

SELECT
    t.transaction_description,
    COUNT(*) AS transaction_count,
    CASE 
        WHEN c.age BETWEEN 16 AND 24 THEN '16-24: Youth'
        WHEN c.age BETWEEN 25 AND 34 THEN '25-34: Young Professionals'
        WHEN c.age BETWEEN 35 AND 44 THEN '35-44: Middle-Aged Adults'
        WHEN c.age BETWEEN 45 AND 54 THEN '45-54: Prime Working Years'
        WHEN c.age BETWEEN 55 AND 64 THEN '55-64: Pre-Retirement'
        WHEN c.age >= 65 THEN '65+: Seniors/Elderly'
        ELSE 'Under 16'  -- also catches rows where c.age is NULL
    END AS age_group,
    i.income_group_desc
FROM transactions t
JOIN customers c ON t.customer_identifier = c.customer_identifier
JOIN income_group i ON c.income_group_code = i.income_group_code
WHERE t.channel = 'teller'
  AND t.transaction_description IN (
      'BAD DEBT W/OFF',
      'CLOSE C/WITHDRAWAL',
      'CLOSE CASH DEP',
      'CLOSED-TO SAV',
      'DIGITAL TRAN FEES',
      'DIGITAL TRANSF DT',
      'FOREIGN NOTES',
      'IBANK PAYMENT FROM',
      'IBANK PAYMENT TO',
      'IBANK TRANSFER',
      'IMMEDIATE TRF CR',
      'JOURNAL CR C/OPS',
      'JOURNAL CREDIT',
      'JOURNAL DEBIT',
      'NPF CREDIT',
      'NPF DEBIT',
      'OPENED-FROM SAV',
      'TRANSFER COR FROM',
      'TRANSFER FROM',
      'TRANSFER TO',
      'TRANSFER TO CLOSE',
      'TRAVEL FOREX(PFC)',
      'TRAVEL FOREX(TFC)',
      'TRF TO CLOSE COR',
      'UNPAID DEBIT',
      'WESTERN UNION(CBI)',
      'WESTERN UNION(CBO)'
  )
GROUP BY t.transaction_description, age_group, i.income_group_desc
ORDER BY transaction_count DESC
;

 * postgresql://postgres:***@localhost:5432/customerdb
627 rows affected.


transaction_description,transaction_count,age_group,income_group_desc
IMMEDIATE TRF CR,764,25-34: Young Professionals,"R15,000 - 19,999"
IMMEDIATE TRF CR,537,25-34: Young Professionals,"R20,000 - 24,999"
IMMEDIATE TRF CR,394,25-34: Young Professionals,"R25,000 - 33,999"
IMMEDIATE TRF CR,320,16-24: Youth,"R15,000 - 19,999"
IMMEDIATE TRF CR,275,25-34: Young Professionals,"R10,000 - 11,999"
IMMEDIATE TRF CR,273,25-34: Young Professionals,"R34,000 - 41,999"
IMMEDIATE TRF CR,197,35-44: Middle-Aged Adults,"R12,000 - 14,999"
IMMEDIATE TRF CR,191,35-44: Middle-Aged Adults,"R25,000 - 33,999"
IMMEDIATE TRF CR,184,35-44: Middle-Aged Adults,"R15,000 - 19,999"
IMMEDIATE TRF CR,177,25-34: Young Professionals,"R12,000 - 14,999"
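
The query above counts the teller transactions whose description suggests a digital alternative, but the question asks for a proportion. One way to get it is to divide that count by all teller transactions. The sketch below does so on toy rows in an in-memory SQLite table. `DIGITAL_CAPABLE` holds only a subset of the descriptions listed above, and `CASH DEPOSIT` is a hypothetical teller-only description invented for this example.

```python
import sqlite3

# Subset of the descriptions listed in the query above, for brevity.
DIGITAL_CAPABLE = ("IMMEDIATE TRF CR", "IBANK TRANSFER", "TRANSFER TO")

# Toy rows (channel, transaction_description); NOT real customerdb data.
rows = [
    ("teller", "IMMEDIATE TRF CR"),
    ("teller", "IBANK TRANSFER"),
    ("teller", "CASH DEPOSIT"),  # hypothetical teller-only description
    ("teller", "CASH DEPOSIT"),
    ("atm", "TRANSFER TO"),      # ignored: not a teller transaction
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (channel TEXT, transaction_description TEXT)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)

# One "?" placeholder per description, so values are bound, not interpolated.
placeholders = ", ".join("?" for _ in DIGITAL_CAPABLE)
digital_count, teller_total = conn.execute(
    f"""
    SELECT
        SUM(CASE WHEN transaction_description IN ({placeholders})
                 THEN 1 ELSE 0 END),
        COUNT(*)
    FROM transactions
    WHERE channel = 'teller'
    """,
    DIGITAL_CAPABLE,
).fetchone()

proportion = digital_count / teller_total
print(f"{proportion:.1%} of teller transactions had a digital alternative")
```
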


### 2.3 Customers missing payments by age group

In [93]:
%%sql
SELECT
    ROUND(SUM(t.amt)) AS total_unpaid,
    COUNT(DISTINCT c.customer_identifier) AS bad_count,
    CASE 
        WHEN c.age BETWEEN 16 AND 24 THEN '16-24: Youth'
        WHEN c.age BETWEEN 25 AND 34 THEN '25-34: Young Professionals'
        WHEN c.age BETWEEN 35 AND 44 THEN '35-44: Middle-Aged Adults'
        WHEN c.age BETWEEN 45 AND 54 THEN '45-54: Prime Working Years'
        WHEN c.age BETWEEN 55 AND 64 THEN '55-64: Pre-Retirement'
        WHEN c.age >= 65 THEN '65+: Seniors/Elderly'
        ELSE 'Under 16'  -- also catches rows where c.age is NULL
    END AS age_group
FROM transactions t
JOIN customers c ON t.customer_identifier = c.customer_identifier
WHERE t.transaction_description IN ('BAD DEBT W/OFF', 'DC UNPAID', 'NAEDO DO UNPD', 'ACB DEBIT REVERSAL', 'UNPAID DEBIT')
GROUP BY age_group
ORDER BY age_group;


 * postgresql://postgres:***@localhost:5432/customerdb
6 rows affected.


total_unpaid,bad_count,age_group
196986.0,108,16-24: Youth
677021.0,370,25-34: Young Professionals
597368.0,324,35-44: Middle-Aged Adults
276494.0,155,45-54: Prime Working Years
64699.0,31,55-64: Pre-Retirement
4166.0,3,65+: Seniors/Elderly
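
The totals above scale with the number of customers in each age group, so a per-customer figure makes the groups easier to compare. The sketch below divides `total_unpaid` by `bad_count` using the six rows of output shown above. It assumes `total_unpaid` can be read as an amount attributable to those customers, which the notebook does not state explicitly.

```python
# (age_group, total_unpaid, bad_count) copied from the query output above.
unpaid_by_age = [
    ("16-24: Youth", 196986.0, 108),
    ("25-34: Young Professionals", 677021.0, 370),
    ("35-44: Middle-Aged Adults", 597368.0, 324),
    ("45-54: Prime Working Years", 276494.0, 155),
    ("55-64: Pre-Retirement", 64699.0, 31),
    ("65+: Seniors/Elderly", 4166.0, 3),
]

# Average unpaid amount per affected customer in each age group.
avg_unpaid = {
    age_group: total / customers
    for age_group, total, customers in unpaid_by_age
}

for age_group, value in avg_unpaid.items():
    print(f"{age_group}: {value:.2f}")
```

Groups with very few customers, such as the three in the 65+ band, give unstable averages and should be read with caution.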
