### Connecting to PostgreSQL

In [35]:
pip install sqlalchemy

Note: you may need to restart the kernel to use updated packages.


In [36]:
pip install psycopg2-binary

Note: you may need to restart the kernel to use updated packages.


In [37]:
%load_ext sql
from sqlalchemy import create_engine
%config SqlMagic.style = '_DEPRECATED_DEFAULT'

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [38]:
pip install ipython-sql

Note: you may need to restart the kernel to use updated packages.


In [39]:
%sql postgresql://postgres:@localhost:5432/customerdb
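
The connection URL above is written inline with an empty password. As a minimal sketch, assuming the password is supplied through an environment variable (the name `PGPASSWORD` and the helper `build_postgres_url` are assumptions for this example, not part of the notebook), the same URL could be assembled in Python with the credentials percent-encoded:

```python
import os
from urllib.parse import quote


def build_postgres_url(user, password, host="localhost", port=5432, database="customerdb"):
    """Assemble a postgresql:// URL, percent-encoding the credentials.

    Hypothetical helper for illustration; the defaults mirror the URL used in this notebook.
    """
    auth = quote(user, safe="")
    if password:
        # Characters such as '@' or '/' in a password would otherwise break the URL.
        auth += ":" + quote(password, safe="")
    return f"postgresql://{auth}@{host}:{port}/{database}"


# PGPASSWORD is assumed to be set outside the notebook; it falls back to an empty password.
url = build_postgres_url("postgres", os.environ.get("PGPASSWORD", ""))
print(url)
```

The resulting string has the same shape as the literal URL passed to `%sql`, so the password does not need to appear in the notebook itself.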

### 3. Periodic Analysis

## Overview

### 3.1 How does the transaction activity of customers from different income groups vary over time?

In [40]:
%%sql
SELECT
    EXTRACT(YEAR FROM t.record_date) AS year,
    TO_CHAR(t.record_date, 'Mon') AS month,
    i.income_group_desc AS income_group,
    COUNT(*) AS transaction_count,
    ROUND(SUM(t.amt)) AS total_transaction_value,
    ROUND(AVG(t.amt)) AS avg_transaction_value
FROM transactions t
JOIN customers c ON t.customer_identifier = c.customer_identifier
JOIN income_group i ON c.income_group_code = i.income_group_code
GROUP BY year, TO_CHAR(t.record_date, 'Mon'), EXTRACT(MONTH FROM t.record_date), i.income_group_desc
ORDER BY year, EXTRACT(MONTH FROM t.record_date), i.income_group_desc
LIMIT 10;

 * postgresql://postgres:***@localhost:5432/customerdb


10 rows affected.


year,month,income_group,transaction_count,total_transaction_value,avg_transaction_value
2021,Jul,"R10,000 - 11,999",156,16132.0,103.0
2021,Jul,"R12,000 - 14,999",230,-13647.0,-59.0
2021,Jul,"R15,000 - 19,999",182,-8839.0,-49.0
2021,Jul,"R20,000 - 24,999",200,5447.0,27.0
2021,Jul,"R25,000 - 33,999",732,-2451.0,-3.0
2021,Jul,"R3,000 - 3,999",11,210.0,19.0
2021,Jul,"R34,000 - 41,999",41,1155.0,28.0
2021,Jul,"R4,000 - 4,999",41,-504.0,-12.0
2021,Jul,"R42,000 - 62,999",191,-4302.0,-23.0
2021,Jul,"R5,000 - 5,999",152,-7943.0,-52.0


### 3.2 Are there any anomalies you can identify over time?
### What is the correlation between transaction counts and total transaction amounts for each product code?

In [41]:
%%sql

SELECT
    EXTRACT(YEAR FROM t.record_date) AS year,
    TO_CHAR(t.record_date, 'Mon') AS month,
    t.product_code,
    COUNT(t.customer_identifier) AS transaction_count,
    ROUND(SUM(amt)) AS total_amt
FROM transactions t
GROUP BY year, TO_CHAR(t.record_date, 'Mon'), t.product_code, EXTRACT(MONTH FROM t.record_date)
ORDER BY year, EXTRACT(MONTH FROM t.record_date), t.product_code;


 * postgresql://postgres:***@localhost:5432/customerdb
28 rows affected.


year,month,product_code,transaction_count,total_amt
2021,Jul,CHEQ,1436,-15401.0
2021,Jul,SAVE,827,2509.0
2021,Aug,CHEQ,4821,-104053.0
2021,Aug,SAVE,2629,6884.0
2021,Sep,CHEQ,11110,117962.0
2021,Sep,SAVE,6468,107299.0
2021,Oct,CHEQ,18514,-371630.0
2021,Oct,SAVE,13099,19936.0
2021,Nov,CHEQ,27513,302617.0
2021,Nov,SAVE,20473,269743.0


### 3.3 How does the frequency of transactions by loyal customers change over time?

In [42]:
%%sql

SELECT
    EXTRACT(YEAR FROM t.record_date) AS year,
    TO_CHAR(t.record_date, 'Mon') AS month,
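    CASE
        WHEN c.number_of_accounts > (
            SELECT AVG(number_of_accounts) FROM customers
        )
        -- The second condition compares the average number of transactions per
        -- customer (integer division) with the total number of transactions, so it
        -- holds whenever the table contains more than one customer. In effect the
        -- label depends only on number_of_accounts.
        AND (
            SELECT COUNT(*) / COUNT(DISTINCT customer_identifier) FROM transactions
        ) < (
            SELECT COUNT(*) FROM transactions
        )
        THEN 'Loyal Customers'
        ELSE 'Regular Customers'
    END AS customer_type,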
    COUNT(*) AS transaction_count,
    ROUND(SUM(t.amt)) AS total_amount,
    ROUND(AVG(t.amt)) AS avg_amount
FROM transactions t
JOIN customers c
ON t.customer_identifier = c.customer_identifier
GROUP BY
    year,
    TO_CHAR(t.record_date, 'Mon'),
    EXTRACT(MONTH FROM t.record_date),
    customer_type
ORDER BY
    year,
    EXTRACT(MONTH FROM t.record_date),
    customer_type;


 * postgresql://postgres:***@localhost:5432/customerdb


28 rows affected.


year,month,customer_type,transaction_count,total_amount,avg_amount
2021,Jul,Loyal Customers,1409,-2811.0,-2.0
2021,Jul,Regular Customers,854,-10081.0,-12.0
2021,Aug,Loyal Customers,4285,-82960.0,-19.0
2021,Aug,Regular Customers,3165,-14210.0,-4.0
2021,Sep,Loyal Customers,10468,98382.0,9.0
2021,Sep,Regular Customers,7110,126878.0,18.0
2021,Oct,Loyal Customers,17272,-158635.0,-9.0
2021,Oct,Regular Customers,14341,-193059.0,-13.0
2021,Nov,Loyal Customers,25948,295537.0,11.0
2021,Nov,Regular Customers,22038,276822.0,13.0


### 3.4 Monthly Unpaid Transactions and Customer Count over a 14-Month Period

In [43]:
%%sql
SELECT
    -- 'YYYY-MM' zero-pads the month so the labels sort chronologically as text
    TO_CHAR(t.record_date, 'YYYY-MM') AS year_month,
    ROUND(SUM(t.amt)) AS total_unpaid,
    COUNT(DISTINCT c.customer_identifier) AS bad_count
FROM transactions t
JOIN customers c ON t.customer_identifier = c.customer_identifier
WHERE transaction_description IN ('BAD DEBT W/OFF', 'DC UNPAID', 'NAEDO DO UNPD', 'ACB DEBIT REVERSAL', 'UNPAID DEBIT')
GROUP BY year_month
ORDER BY year_month ASC;

 * postgresql://postgres:***@localhost:5432/customerdb


14 rows affected.


year_month,total_unpaid,bad_count
2021-07,9680.0,9
2021-08,19784.0,21
2021-09,32853.0,34
2021-10,65590.0,60
2021-11,95085.0,85
2021-12,139903.0,148
2022-01,178955.0,200
2022-02,168752.0,202
2022-03,236213.0,231
2022-04,211973.0,233
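
Year-month labels are text, so `ORDER BY year_month` compares them character by character. A label built from an unpadded month number, such as `2021-10`, therefore sorts before `2021-7`. A minimal Python sketch of the effect, using a few hypothetical labels and an illustrative `zero_pad` helper:

```python
def zero_pad(label):
    """Turn 'YYYY-M' into 'YYYY-MM' so that text order matches calendar order."""
    year, month = label.split("-")
    return f"{year}-{int(month):02d}"


labels = ["2021-7", "2021-10", "2022-1", "2021-12"]

text_order = sorted(labels)
padded_order = sorted(zero_pad(label) for label in labels)

print(text_order)    # ['2021-10', '2021-12', '2021-7', '2022-1']
print(padded_order)  # ['2021-07', '2021-10', '2021-12', '2022-01']
```

Sorting on a real date value, or on a zero-padded label, keeps the months in calendar order.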
