**Basic SQL Reminder Notes**

SELECT *

FROM *

WHERE *

"%" is wildcard for strings

"XOR" for one or other, but not both

ORDER BY [column name] asc/desc (asc/desc can be applied to each column individually)

ROUND(value, n) rounds value to n decimal places

---
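
A minimal sketch of the wildcard, ORDER BY, and ROUND notes above, run against SQLite through Python's built-in `sqlite3` module. The `country` table and its figures are made up purely for illustration. XOR is left out because it is a MySQL operator that SQLite does not provide.

```python
import sqlite3

# Hypothetical in-memory table; names and numbers are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, gdp REAL)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("Iran", 401.5049), ("Iceland", 27.8421), ("India", 3385.0897), ("Peru", 242.6315)],
)

# '%' matches any run of characters, so 'I%' keeps names starting with "I".
# ORDER BY gdp DESC sorts largest first; ROUND(gdp, 1) keeps one decimal place.
like_rows = conn.execute(
    "SELECT name, ROUND(gdp, 1) FROM country "
    "WHERE name LIKE 'I%' ORDER BY gdp DESC"
).fetchall()

for name, rounded_gdp in like_rows:
    print(name, rounded_gdp)
```

Peru is filtered out by the LIKE pattern, and the remaining rows come back from largest to smallest value.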

IN for lists. Example:

SELECT name FROM bbc
  WHERE name IN ('Sri Lanka', 'Ceylon',
                 'Persia',    'Iran')


**Multiple, independent clauses are possible. Note the parentheses**

SELECT * FROM nobel

  WHERE subject IN ('physics')
  
  AND yr IN (1980)
  
  OR
  
  (subject IN ('chemistry')
  AND yr IN (1984))
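
Why the parentheses are worth noting: AND binds more tightly than OR, so the query above groups the way it reads, but moving the parentheses changes the result. The sketch below uses Python's built-in `sqlite3` with a tiny stand-in `nobel` table; the winner names A to D are placeholders, not real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nobel (yr INTEGER, subject TEXT, winner TEXT)")
conn.executemany("INSERT INTO nobel VALUES (?, ?, ?)", [
    (1980, "physics", "A"),
    (1984, "physics", "B"),
    (1980, "chemistry", "C"),
    (1984, "chemistry", "D"),
])

def winners(where_clause):
    """Return the winners matching a WHERE clause, sorted for easy comparison."""
    sql = f"SELECT winner FROM nobel WHERE {where_clause} ORDER BY winner"
    return [row[0] for row in conn.execute(sql)]

# No parentheses: AND is evaluated before OR.
implicit = winners("subject = 'physics' AND yr = 1980 "
                   "OR subject = 'chemistry' AND yr = 1984")

# Explicit parentheses that match the default grouping.
explicit = winners("(subject = 'physics' AND yr = 1980) "
                   "OR (subject = 'chemistry' AND yr = 1984)")

# Parentheses moved: a different question, and here no row satisfies it.
regrouped = winners("subject = 'physics' AND (yr = 1980 OR subject = 'chemistry') "
                    "AND yr = 1984")

print(implicit, explicit, regrouped)
```

The first two lists are identical, so the parentheses mainly document intent; the third shows how a misplaced pair silently changes the filter.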

ERD - Entity Relationship Diagram

The purpose of this micro-project is to demonstrate basic SQL skills for data analysis within the domain of capital forecasting.

We will create tables for fictional business units with their own balance sheets and minimum capital ratios, then project capital ratios at 4 different levels of economic distress.

First, we will run `%reload_ext sql` to enable the `%sql` magic commands provided by the ipython-sql extension. SQLite works well in Jupyter because it is lightweight and serverless, is supported by the ipython-sql magic, and integrates smoothly with pandas, which we will use later.

In [None]:
%reload_ext sql
%sql sqlite:///test_capital_project.db

Next we will create the 4 tables mentioned above:

### 🗂️ Table Overview: Financial Stress Testing Scenario

Below is a brief description of each table used in this project, explaining its role in our simplified stress testing model.

---

#### 🏢 `business_units`

**Purpose**:  
This table defines each individual business unit within the organization.  
Each unit has a unique `unit_id` and a descriptive `unit_name`.

**Why it matters**:  
It provides a master list of business units that other tables reference.  
Serves as the anchor for linking financial and regulatory data to specific units.

---

#### 📊 `balance_sheets`

**Purpose**:  
Stores the historical (or baseline) balance sheet data for each business unit.  
Includes total `assets` and `liabilities` for a given `report_date`.

**Why it matters**:  
This is the core financial data used to evaluate how business units would perform under economic stress scenarios.  
It forms the basis for projections.

---

#### 📈 `forecast_assumptions`

**Purpose**:  
Holds the stress testing assumptions for different economic scenarios.  
Each row contains a `scenario_name` along with `asset_multiplier` and `liab_multiplier`.

**Why it matters**:  
Enables simulation of how financials might change under adverse or severely adverse conditions.  
These multipliers are applied to the baseline balance sheet data.

---

#### 🛡️ `capital_buffers`

**Purpose**:  
Contains the regulatory minimum capital ratio required for each business unit.  
Each unit has a `min_capital_ratio` associated with it.

**Why it matters**:  
Provides a benchmark to compare against projected capital ratios.  
Used to determine whether a business unit would remain compliant under stress.

---

Since we are working in a Jupyter notebook, we want to avoid creating redundant tables in our db file. Notebooks encourage breaking code into pieces and re-running cells over and over as you iterate, so we drop each table at the beginning of the cell, before our CREATE TABLE statements. Our output should be 8 Done statements: one for each drop and one for each create.

In [106]:
%%sql

DROP TABLE IF EXISTS balance_sheets;
DROP TABLE IF EXISTS business_units;
DROP TABLE IF EXISTS regulatory_requirements;
DROP TABLE IF EXISTS forecast_assumptions;

CREATE TABLE business_units (
    unit_id INTEGER PRIMARY KEY,
    unit_name TEXT NOT NULL
);

CREATE TABLE balance_sheets (
    bank_id INTEGER,
    unit_id INTEGER,
    report_date TEXT,
    risk_weighted_exposure REAL,
    cet1_capital REAL,
    at1_capital REAL,
    tier2_capital REAL,
    FOREIGN KEY (unit_id) REFERENCES business_units(unit_id)
);

CREATE TABLE regulatory_requirements (
    buffer_id INTEGER PRIMARY KEY,
    unit_id INTEGER NOT NULL,
    tier TEXT,
    min_capital_ratio FLOAT NOT NULL,
    FOREIGN KEY (unit_id) REFERENCES business_units(unit_id)
);

CREATE TABLE forecast_assumptions (
    assumption_id INTEGER PRIMARY KEY,
    scenario_name TEXT NOT NULL,
    asset_multiplier FLOAT NOT NULL,
    liab_multiplier FLOAT NOT NULL
);

 * sqlite:///test_capital_project.db
Done.
Done.
Done.
Done.
Done.
Done.
Done.
Done.


[]
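
The drop-then-create pattern in the cell above is what makes it safe to re-run. A minimal sketch of that idea with Python's built-in `sqlite3` and an in-memory database (an assumption made so the example leaves no file behind); only the `business_units` table is reproduced.

```python
import sqlite3

SCHEMA = """
DROP TABLE IF EXISTS business_units;
CREATE TABLE business_units (
    unit_id INTEGER PRIMARY KEY,
    unit_name TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")

# Running the same script several times mimics re-running a notebook cell.
for _ in range(3):
    conn.executescript(SCHEMA)

table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master "
    "WHERE type = 'table' AND name = 'business_units'"
).fetchone()[0]

# Without the DROP, a second CREATE TABLE for the same name raises an error.
try:
    conn.execute("CREATE TABLE business_units (unit_id INTEGER PRIMARY KEY)")
    rerun_error = None
except sqlite3.OperationalError as exc:
    rerun_error = str(exc)

print(table_count, rerun_error)
```

The same reasoning applies to every table in the cell: each CREATE TABLE needs a matching DROP TABLE IF EXISTS for the cell to stay re-runnable.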

## 3. Create Mock Data

We will populate each of our 4 tables with small, realistic sample data. First, our imaginary Business Units.

In [107]:
%%sql
INSERT INTO business_units (unit_id, unit_name) VALUES
(1, 'Retail Banking'),
(2, 'Commercial Lending'),
(3, 'Credit Cards'),
(4, 'Wealth Management');
SELECT * FROM business_units

 * sqlite:///test_capital_project.db
4 rows affected.
Done.


unit_id,unit_name
1,Retail Banking
2,Commercial Lending
3,Credit Cards
4,Wealth Management


Second, we will insert balance sheet values for two reporting periods: one end-of-year and one mid-year. When we created our tables, we declared the unit_id columns as foreign keys that reference the primary key of the business_units table.

In [108]:
%%sql

INSERT INTO balance_sheets (bank_id, unit_id, report_date, risk_weighted_exposure, cet1_capital, at1_capital, tier2_capital) VALUES
(1, 1, '2023-12-31', 4000000, 225000, 75000, 100000),
(2, 2, '2023-12-31', 7000000, 360000, 120000, 160000),
(3, 3, '2023-12-31', 5500000, 270000, 90000, 120000),
(4, 4, '2023-12-31', 7000000, 405000, 135000, 180000),
(5, 1, '2024-06-30', 4100000, 229500, 76500, 102000),
(6, 2, '2024-06-30', 7100000, 369000, 123000, 164000),
(7, 3, '2024-06-30', 5600000, 274500, 91500, 122000),
(8, 4, '2024-06-30', 7100000, 409500, 136500, 182000);

SELECT * FROM balance_sheets

 * sqlite:///test_capital_project.db
8 rows affected.
Done.


bank_id,unit_id,report_date,risk_weighted_exposure,cet1_capital,at1_capital,tier2_capital
1,1,2023-12-31,4000000.0,225000.0,75000.0,100000.0
2,2,2023-12-31,7000000.0,360000.0,120000.0,160000.0
3,3,2023-12-31,5500000.0,270000.0,90000.0,120000.0
4,4,2023-12-31,7000000.0,405000.0,135000.0,180000.0
5,1,2024-06-30,4100000.0,229500.0,76500.0,102000.0
6,2,2024-06-30,7100000.0,369000.0,123000.0,164000.0
7,3,2024-06-30,5600000.0,274500.0,91500.0,122000.0
8,4,2024-06-30,7100000.0,409500.0,136500.0,182000.0
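
One caveat on the foreign keys declared earlier: SQLite only enforces them on connections where `PRAGMA foreign_keys = ON` has been issued, and enforcement is normally off by default. The sketch below uses Python's built-in `sqlite3` with a cut-down version of the schema; the helper name `try_orphan_insert` and the orphan row `(9, 99)` are hypothetical, chosen only for this demonstration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE business_units (
    unit_id INTEGER PRIMARY KEY,
    unit_name TEXT NOT NULL
);
CREATE TABLE balance_sheets (
    bank_id INTEGER,
    unit_id INTEGER,
    FOREIGN KEY (unit_id) REFERENCES business_units(unit_id)
);
INSERT INTO business_units VALUES (1, 'Retail Banking');
"""

def try_orphan_insert(enforce):
    """Insert a balance sheet row whose unit_id has no matching business unit."""
    conn = sqlite3.connect(":memory:")
    # The pragma is a per-connection setting and must be set outside a transaction.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.executescript(SCHEMA)
    try:
        conn.execute("INSERT INTO balance_sheets (bank_id, unit_id) VALUES (9, 99)")
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"
    finally:
        conn.close()

without_enforcement = try_orphan_insert(enforce=False)
with_enforcement = try_orphan_insert(enforce=True)
print(without_enforcement)
print(with_enforcement)
```

Whether the notebook's own connection enforces the constraint depends on how ipython-sql opened it, so running `PRAGMA foreign_keys;` in a `%sql` cell is a quick way to check.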


For simplicity, and for the purpose of demonstrating SQL skills, we kept the values and names straightforward: round numbers and simple assets and liabilities.

In reality

The idea is the same: one number needs to be x% larger than another.

We could have done pass/fail on each tier individually, or counted how many of the 3 tiers meet the criteria for minimum percentages. A heatmap in Python showing which ones pass and fail?
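
A rough sketch of the "how many out of 3" idea, using Python's built-in `sqlite3` and the four 2023-12-31 rows inserted above. Each tier's capital is divided by risk-weighted exposure and compared with a minimum. Treating 0.045, 0.015, and 0.02 as per-tier minimums for CET1, AT1, and Tier 2 is an assumption made for illustration, not a rule stated in this notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balance_sheets (unit_id INTEGER, report_date TEXT, "
    "risk_weighted_exposure REAL, cet1_capital REAL, "
    "at1_capital REAL, tier2_capital REAL)"
)
conn.executemany("INSERT INTO balance_sheets VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "2023-12-31", 4000000, 225000, 75000, 100000),
    (2, "2023-12-31", 7000000, 360000, 120000, 160000),
    (3, "2023-12-31", 5500000, 270000, 90000, 120000),
    (4, "2023-12-31", 7000000, 405000, 135000, 180000),
])

# Assumed per-tier minimums, expressed as a share of risk-weighted exposure.
MIN_CET1, MIN_AT1, MIN_TIER2 = 0.045, 0.015, 0.02

# Each CASE yields 1 for a pass and 0 for a fail; their sum counts passing tiers.
tier_results = conn.execute("""
    SELECT unit_id,
           CASE WHEN cet1_capital / risk_weighted_exposure >= ? THEN 1 ELSE 0 END
         + CASE WHEN at1_capital / risk_weighted_exposure >= ? THEN 1 ELSE 0 END
         + CASE WHEN tier2_capital / risk_weighted_exposure >= ? THEN 1 ELSE 0 END
           AS tiers_passed
    FROM balance_sheets
    ORDER BY unit_id
""", (MIN_CET1, MIN_AT1, MIN_TIER2)).fetchall()

for unit_id, tiers_passed in tier_results:
    print(unit_id, tiers_passed)
```

Keeping the three CASE expressions as separate columns instead of summing them would give the per-tier pass/fail grid that a heatmap needs.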

In [109]:
%%sql
INSERT INTO regulatory_requirements (buffer_id, unit_id, tier, min_capital_ratio) VALUES
(1, 1, 'CET1', 0.045),
(2, 2, 'Tier 1',0.015),
(3, 3, 'Tier 2', 0.02);
SELECT * FROM regulatory_requirements

 * sqlite:///test_capital_project.db
3 rows affected.
Done.


buffer_id,unit_id,tier,min_capital_ratio
1,1,CET1,0.045
2,2,Tier 1,0.015
3,3,Tier 2,0.02


In [110]:
%%sql
INSERT INTO forecast_assumptions (assumption_id, scenario_name, cet1_multiplier, at1_multiplier, tier2_multiplier, rwa_multiplier) VALUES
(1, 'Baseline', 1.00, 1.00, 1.00, 1.00),
(2, 'Mild Recession', 0.98, 0.97, 0.95, 1.05),
(3, 'Severe Recession', 0.90, 0.85, 0.80, 1.20),
(4, 'Expansion', 1.03, 1.02, 1.01, 1.05);
SELECT * FROM forecast_assumptions

 * sqlite:///test_capital_project.db
(sqlite3.OperationalError) table forecast_assumptions has no column named cet1_multiplier
[SQL: INSERT INTO forecast_assumptions (assumption_id, scenario_name, cet1_multiplier, at1_multiplier, tier2_multiplier, rwa_multiplier) VALUES
(1, 'Baseline', 1.00, 1.00, 1.00, 1.00),
(2, 'Mild Recession', 0.98, 0.97, 0.95, 1.05),
(3, 'Severe Recession', 0.90, 0.85, 0.80, 1.20),
(4, 'Expansion', 1.03, 1.02, 1.01, 1.05);]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
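
The error above comes from a mismatch: the table was created with `asset_multiplier` and `liab_multiplier`, while the INSERT names four per-tier multiplier columns. The sketch below reproduces the failure with Python's built-in `sqlite3`, then shows one possible reconciliation, which is to recreate the table with the columns the INSERT expects. The spelling `tier2_multiplier` and the choice to change the table (as opposed to the INSERT) are assumptions, and only two of the scenario rows are used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table definition as it appears in the CREATE TABLE cell.
conn.execute("""
    CREATE TABLE forecast_assumptions (
        assumption_id INTEGER PRIMARY KEY,
        scenario_name TEXT NOT NULL,
        asset_multiplier FLOAT NOT NULL,
        liab_multiplier FLOAT NOT NULL
    )
""")

insert_sql = (
    "INSERT INTO forecast_assumptions (assumption_id, scenario_name, "
    "cet1_multiplier, at1_multiplier, tier2_multiplier, rwa_multiplier) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)
scenario_rows = [
    (1, "Baseline", 1.00, 1.00, 1.00, 1.00),
    (3, "Severe Recession", 0.90, 0.85, 0.80, 1.20),
]

# Against the original table the INSERT fails, as in the notebook output.
try:
    conn.executemany(insert_sql, scenario_rows)
    original_error = None
except sqlite3.OperationalError as exc:
    original_error = str(exc)

# Assumed fix: recreate the table with the per-tier multiplier columns.
conn.executescript("""
    DROP TABLE IF EXISTS forecast_assumptions;
    CREATE TABLE forecast_assumptions (
        assumption_id INTEGER PRIMARY KEY,
        scenario_name TEXT NOT NULL,
        cet1_multiplier FLOAT NOT NULL,
        at1_multiplier FLOAT NOT NULL,
        tier2_multiplier FLOAT NOT NULL,
        rwa_multiplier FLOAT NOT NULL
    );
""")
conn.executemany(insert_sql, scenario_rows)
inserted = conn.execute("SELECT COUNT(*) FROM forecast_assumptions").fetchone()[0]
print(original_error)
print(inserted)
```

Any query that still reads `asset_multiplier` or `liab_multiplier` would need to be updated to match whichever set of columns is kept.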


In [None]:
%%sql
SELECT 
    bu.unit_name,
    bs.report_date,
    bs.cet1_capital,
    bs.at1_capital,
    bs.tier2_capital,
    bs.risk_weighted_exposure,
    rr.min_capital_ratio,
    ROUND(bs.cet1_capital + bs.at1_capital + bs.tier2_capital, 2) AS total_capital,
    CASE
        WHEN 
            bs.cet1_capital >= ROUND(bs.cet1_capital + bs.at1_capital + bs.tier2_capital, 2) * 0.045
            THEN 'PASS'
        ELSE 'FAIL'
    END AS stress_test_result
FROM balance_sheets bs
INNER JOIN business_units bu
    ON bs.unit_id = bu.unit_id
LEFT JOIN regulatory_requirements rr
    ON bs.unit_id = rr.unit_id;

Some of our records failed to meet our percentage requirements. We will simply set these records aside for now. When we move this dataset to pandas later in our analysis, we will pinpoint exactly which ones failed and why.

## 4. Build Our Queries

Now we have joined the tables for each of our 4 business units for the end-of-year and mid-year data, as well as our made-up minimum capital ratios.

Next we will CROSS JOIN our forecast data, since having a foreign key for that table would not make sense. This will lead to a record for each business unit, report date, and economic scenario. We will also include a stress_test_result column to easily identify where our winners and losers are in each scenario.

Our output will be our new view, capital_projection_view:

In [None]:
%%sql
DROP VIEW IF EXISTS capital_projection_view;

CREATE VIEW capital_projection_view AS
SELECT
    fa.scenario_name,
    bu.unit_name,
    bs.report_date,
    ROUND(bs.assets * fa.asset_multiplier, 2) AS projected_assets,
    ROUND(bs.liabilities * fa.liab_multiplier, 2) AS projected_liabilities,
    ROUND(
        (bs.assets * fa.asset_multiplier - bs.liabilities * fa.liab_multiplier)
        / NULLIF(bs.assets * fa.asset_multiplier, 0), 
        4
    ) AS projected_capital_ratio,
    cb.min_capital_ratio,
    CASE
        WHEN (
            (bs.assets * fa.asset_multiplier - bs.liabilities * fa.liab_multiplier)
            / NULLIF(bs.assets * fa.asset_multiplier, 0)
        ) >= cb.min_capital_ratio THEN 'PASS'
        ELSE 'FAIL'
    END AS stress_test_result
FROM balance_sheets bs
JOIN business_units bu ON bs.unit_id = bu.unit_id
JOIN capital_buffers cb ON bs.unit_id = cb.unit_id
CROSS JOIN forecast_assumptions fa
ORDER BY fa.scenario_name, bu.unit_name, bs.report_date;

SELECT * FROM capital_projection_view
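
To make the effect of the CROSS JOIN concrete: it pairs every balance sheet row with every scenario, so 8 balance sheet rows and 4 scenarios give 32 combinations before any other join narrows them. A minimal sketch with Python's built-in `sqlite3`, using stripped-down stand-ins for the two tables (only the key columns are kept, which is a simplification for this example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balance_sheets (unit_id INTEGER, report_date TEXT);
    CREATE TABLE forecast_assumptions (scenario_name TEXT);
""")

# 4 business units x 2 report dates = 8 balance sheet rows.
conn.executemany(
    "INSERT INTO balance_sheets VALUES (?, ?)",
    [(unit, date) for date in ("2023-12-31", "2024-06-30") for unit in (1, 2, 3, 4)],
)
conn.executemany(
    "INSERT INTO forecast_assumptions VALUES (?)",
    [("Baseline",), ("Mild Recession",), ("Severe Recession",), ("Expansion",)],
)

# No ON clause: every row on the left is paired with every row on the right.
row_count = conn.execute(
    "SELECT COUNT(*) FROM balance_sheets bs CROSS JOIN forecast_assumptions fa"
).fetchone()[0]

# One unit on one report date therefore appears once per scenario.
per_unit_date = conn.execute(
    "SELECT COUNT(*) FROM balance_sheets bs CROSS JOIN forecast_assumptions fa "
    "WHERE bs.unit_id = 1 AND bs.report_date = '2023-12-31'"
).fetchone()[0]

print(row_count, per_unit_date)
```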

In [None]:
import pandas as pd

result = %sql SELECT * FROM capital_projection_view
df = result.DataFrame()
df

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Set seaborn style
sns.set(style="whitegrid")

# Plot projected capital ratios for each unit under different scenarios
plt.figure(figsize=(10, 6))
sns.barplot(
    data=df,
    x='unit_name',
    y='projected_capital_ratio',
    hue='scenario_name'
)
plt.axhline(0.08, color='red', linestyle='--', label='Target Ratio (8%)')  # Basel III min benchmark
plt.title('Projected Capital Ratios by Unit and Scenario')
plt.ylabel('Capital Ratio')
plt.xlabel('Business Unit')
plt.legend()
plt.tight_layout()
plt.show()

While none of our departments meets our Target Ratio in the case of a Severe Recession, Retail Banking and Wealth Management remain resilient in every other circumstance. Next, in ascending order of fragility, are Commercial Lending and Credit Cards, with Credit Cards dipping into net negative assets even in a mild recession.

In [None]:
# Heatmap for which tiers in which department don't pass muster