### Identify the AIRB Banks in the Transparency Exercise

**The Transparency Exercise data are complex and span many dimensions. Make sure your filters match your goals.**

**The calculations may contain errors or misinterpretations of the EBA Transparency datasets.**


## Introduction

This notebook explores the datasets provided by the European Banking Authority (EBA) under their EU-wide Transparency Exercise, accessible [here](https://www.eba.europa.eu/risk-analysis-and-data/eu-wide-transparency-exercise). These datasets offer a comprehensive view of the banking sector's assets and risks, which is invaluable for financial analysis and regulatory assessment.

### Notes
- **Ensure Data Integrity**: Although these datasets are generally user-friendly, they require additional checks to validate data integrity and reliability. You should compare a sample of your intermediary results with the results presented by the EBA's interactive tool, specifically the visualizations found [here](https://tools.eba.europa.eu/interactive-tools/2023/powerbi/837203/tr23_visualisation_page5.html), to confirm consistency and understand any discrepancies.
- **Download the credit risk datasets from here**: https://www.eba.europa.eu/assets/TE2023/Full_database/837203/tr_cre.csv
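
The integrity check described above can be sketched as a small comparison helper. This is only an illustration: the function name `compare_with_reference`, the 1% tolerance, and the bank names and figures are all hypothetical placeholders, and the reference values would have to be read manually from the EBA interactive tool.

```python
import pandas as pd


def compare_with_reference(computed: pd.Series,
                           reference: pd.Series,
                           rel_tol: float = 0.01) -> pd.DataFrame:
    """Align two series on their index and flag relative differences above rel_tol."""
    check = pd.DataFrame({'computed': computed, 'reference': reference})
    check['rel_diff'] = ((check['computed'] - check['reference']).abs()
                         / check['reference'].abs())
    # Keys missing on either side give NaN, which compares as False and is flagged
    check['ok'] = check['rel_diff'] <= rel_tol
    return check


# Placeholder figures only: replace with notebook totals and values from the EBA tool
computed = pd.Series({'Bank A': 100.0, 'Bank B': 250.0})
reference = pd.Series({'Bank A': 100.5, 'Bank B': 300.0})

report = compare_with_reference(computed, reference)
print(report[~report['ok']])
```

Rows that appear in the printed frame are the ones worth investigating, for example because a filter on `Country`, `Portfolio`, or `Period` differs from the one used by the tool.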

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# Settings
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

pd.options.display.float_format = '{:,.2f}'.format

#### A. Create base table

In [2]:
# Import mapping tables
exposures = pd.read_excel('kala_eba_tr.xlsx', sheet_name='Exposures')
portfolio = pd.read_excel('kala_eba_tr.xlsx', sheet_name='Portfolio')
country = pd.read_excel('kala_eba_tr.xlsx', sheet_name='Country')
status = pd.read_excel('kala_eba_tr.xlsx', sheet_name='Status')
perf_status = pd.read_excel('kala_eba_tr.xlsx', sheet_name='Perf_Status')
institutions = pd.read_excel('kala_eba_tr.xlsx', sheet_name='List of institutions')

# import core dataset
df = pd.read_csv('tr_cre.csv', low_memory=False)

In [3]:
# Create final dataset
result = pd.merge(df, exposures, on='Exposure', how='left')
result = pd.merge(result, portfolio, on='Portfolio', how='left')
result = pd.merge(result, country, on='Country', how='left')
result = pd.merge(result, status, on='Status', how='left')
result = pd.merge(result, perf_status, on='Perf_Status', how='left')
result = pd.merge(result, institutions, on='LEI_Code', how='left')

result.head(2)

Unnamed: 0,LEI_Code,NSA,Period,Item,Label,Portfolio,Country,Country_rank,Exposure,Status,Perf_Status,NACE_codes,Amount,Footnote,Row,Column,Sheet,exposure_label,portfolio_label,country_label,ISO_code,status_label,perf_status_label,institution_country,Desc_country,Name,Finrep,Fin_year_end
0,0W2PZJM8XOY22M4GG883,DE,202209,2320502,Original Exposure - by exposure class (SA_and_...,2,0,0,103,0,0,0,419.98,,10,4,Credit Risk_IRB_a,Central governments or central banks,IRB,Total / No breakdown,0,No breakdown by status,No breakdown by Perf_status,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
1,0W2PZJM8XOY22M4GG883,DE,202209,2320502,Original Exposure - by exposure class (SA_and_...,2,0,0,203,0,0,0,16941.37,,11,4,Credit Risk_IRB_a,Institutions,IRB,Total / No breakdown,0,No breakdown by status,No breakdown by Perf_status,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
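
Because the merges above are left joins on code columns, a code without a match in a mapping table silently produces missing labels, and a duplicated code in a mapping table silently multiplies rows. A minimal sketch of how both cases could be caught with pandas' `validate` and `indicator` options follows. The two frames are toy data: codes 103 and 203 mirror the output above, while code 999 is a made-up unmatched code.

```python
import pandas as pd

# Toy stand-ins for the core dataset and one mapping table
facts = pd.DataFrame({'Exposure': [103, 203, 999],
                      'Amount': [419.98, 16941.37, 1.0]})
mapping = pd.DataFrame({'Exposure': [103, 203],
                        'exposure_label': ['Central governments or central banks',
                                           'Institutions']})

# validate='many_to_one' raises MergeError if a code is duplicated in the mapping;
# indicator=True adds a '_merge' column showing where each row was found
merged = pd.merge(facts, mapping, on='Exposure', how='left',
                  validate='many_to_one', indicator=True)

# Rows whose code has no label in the mapping table
unmatched = merged[merged['_merge'] == 'left_only']
print(unmatched[['Exposure', 'Amount']])

# A many-to-one left join must not change the number of rows
assert len(merged) == len(facts)
```

The same pattern could be applied to each of the six merges, dropping the `_merge` column before the next one.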


In [4]:
# Check the number of observations per group
idx_1 = result['Label'] == 'Exposure value - by exposure class (SA_and_IRB)'
idx_2 = result['Period'] == 202306
idx_3 = result['Country'] == 0

# Sanity checking, only one row per groupby is expected
test = (result[idx_1 & idx_2 & idx_3]
        .groupby(['Desc_country', 'exposure_label', 'portfolio_label', 'Name'])
        .agg({'Amount': ['sum', 'count']}))
assert test[test.columns[-1]].max() == 1, 'Check your filtering criteria'
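
When the assertion fails, it helps to see which groups contain more than one row. The sketch below shows one way to list them. It uses toy data with the same grouping columns, and the bank name and amounts are placeholders.

```python
import pandas as pd

keys = ['Desc_country', 'exposure_label', 'portfolio_label', 'Name']

# Toy data: the first two rows share the same key combination
toy = pd.DataFrame({
    'Desc_country': ['Germany', 'Germany', 'Germany'],
    'exposure_label': ['Institutions', 'Institutions', 'Retail'],
    'portfolio_label': ['IRB', 'IRB', 'IRB'],
    'Name': ['Bank A', 'Bank A', 'Bank A'],
    'Amount': [10.0, 5.0, 7.0],
})

# size() counts rows per group, including rows where Amount is missing
counts = toy.groupby(keys).size()
offenders = counts[counts > 1]
print(offenders)
```

Note that `size()` counts all rows, whereas the `count` aggregation in the cell above counts only non-missing `Amount` values, so the two can differ when amounts are missing.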

In [5]:
# Create base table
idx_1 = result['Label'] == 'Exposure value - by exposure class (SA_and_IRB)'
idx_2 = result['Period'] == 202306
# idx_3 = result['Desc_country'] == result['country_label']
idx_3 = result['Country'] == 0

pivot_df = pd.pivot_table(result[idx_1 & idx_2 & idx_3], 
                          values='Amount', 
                          index=['exposure_label', 'portfolio_label'], 
                          columns=['Name'], 
                          aggfunc="sum", 
                          margins=True,
                          fill_value=0).reset_index()
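
To make the effect of `margins=True` and `fill_value=0` concrete, here is a small sketch on toy data with the same column names. The banks and amounts are placeholders, not values from the dataset.

```python
import pandas as pd

toy = pd.DataFrame({
    'exposure_label': ['Corporates', 'Corporates', 'Retail'],
    'portfolio_label': ['IRB', 'IRB', 'IRB'],
    'Name': ['Bank A', 'Bank B', 'Bank A'],
    'Amount': [100.0, 40.0, 60.0],
})

toy_pivot = pd.pivot_table(toy,
                           values='Amount',
                           index=['exposure_label', 'portfolio_label'],
                           columns=['Name'],
                           aggfunc='sum',
                           margins=True,   # adds an 'All' row and an 'All' column
                           fill_value=0)   # missing bank/class combinations become 0
print(toy_pivot)
```

After `reset_index()`, the `All` row becomes an ordinary row, so any later sum over the pivoted frame would count the totals twice unless that row is excluded first.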

#### B. Calculate RWA and exposure values

In [6]:
# Calculate Exposure values
idx_1 = result['Country'] == 0
idx_2 = result['Label'].isin(['Exposure value - by exposure class (SA_and_IRB)', 
                              'Risk exposure amount - by exposure class (SA_and_IRB)', 
                              'Risk Exposure amount - of which_DEFAULTED  - by exposure class (IRB)'])
idx_3 = result['Period'] == 202306

pivot_df2 = pd.pivot_table(result[idx_1 & idx_2 & idx_3], 
                      values='Amount', 
                      index=['exposure_label', 'portfolio_label', 'Name', 'country_label'], 
                      columns=['Label'], 
                      aggfunc="sum", 
                      margins=False,
                      fill_value=0).reset_index()


# Create a rank based on the summed 'Exposure value', sorted by size
grouped = pivot_df2.groupby('exposure_label')['Exposure value - by exposure class (SA_and_IRB)'].sum()
rank = grouped.sort_values(ascending=False).rank(method='dense', ascending=False).astype(int)

# Map the ranks back to the original DataFrame
pivot_df2['exposure_type_size'] = pivot_df2['exposure_label'].map(rank)

pivot_df2 = pivot_df2.rename(columns={
    'Exposure value - by exposure class (SA_and_IRB)': 'exposure_value',
    'Risk exposure amount - by exposure class (SA_and_IRB)': 'risk_exposure',
    'Risk Exposure amount - of which_DEFAULTED  - by exposure class (IRB)': 'defaulted_risk_exposure',
})

pivot_df2['risk_weight'] = (pivot_df2['risk_exposure'] / pivot_df2['exposure_value']) * 100

final_df = pivot_df2.sort_values(by='exposure_type_size', ascending=True).reset_index()
final_df.to_excel('transprency_irb_v2.xlsx')
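
Because the pivot fills missing combinations with 0, some rows can have an exposure value of 0, and the risk-weight division then yields `inf` or `NaN`. A minimal sketch of one way to guard against this is shown below. The figures are toy values, and treating a zero exposure as "no risk weight" is an assumption rather than an EBA convention.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'exposure_value': [1000.0, 0.0, 500.0],
    'risk_exposure': [250.0, 0.0, 500.0],
})

# Replace zero exposures with NaN so the ratio is undefined instead of infinite
denominator = toy['exposure_value'].replace(0, np.nan)

# Risk weight in percent: risk exposure amount divided by exposure value
toy['risk_weight'] = toy['risk_exposure'] / denominator * 100
print(toy)
```

Rows with an undefined risk weight can then be dropped or reported separately, rather than distorting averages or sort orders.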

#### C. Summary table of IRB banks

In [7]:
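# Show exposure values for IRB segments above EUR 1bn per bank and exposure class
# (the 1_000 threshold assumes amounts are reported in EUR million)
# Note: idx_1 is defined but not applied below, so all exposure classes stay in the table
idx_1 = pivot_df2['exposure_label'].isin(['Retail', 'Corporates'])
idx_2 = pivot_df2['portfolio_label'] == 'IRB'
idx_3 = pivot_df2['exposure_value'] > 1_000

df3 = pivot_df2[idx_2 & idx_3]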

final_results = (df3.pivot_table(index='Name', 
                                 columns='exposure_label', 
                                 values='exposure_value', 
                                 aggfunc='sum', 
                                 margins=True)
                 .fillna(0)
                 .sort_values(by='All', ascending=False))
final_results.reset_index()

| exposure_label | Name | Central governments or central banks | Corporates | Equity exposures | Institutions | Retail | All |
|---|---|---|---|---|---|---|---|
| 0 | All | 2316632.92 | 6080935.41 | 120810.89 | 1158990.1 | 6261643.39 | 15939012.71 |
| 1 | Groupe Crédit Agricole | 349268.62 | 417203.32 | 17758.96 | 96044.17 | 749356.41 | 1629631.48 |
| 2 | BNP Paribas | 475905.7 | 584916.37 | 12508.25 | 70323.86 | 281555.1 | 1425209.28 |
| 3 | Groupe BPCE | 212171.94 | 195947.82 | 11204.37 | 31098.51 | 508786.84 | 959209.47 |
| 4 | ING Groep N.V. | 0.0 | 490841.62 | 3666.68 | 68345.53 | 352992.36 | 915846.18 |
| 5 | Société générale S.A. | 326080.65 | 325680.24 | 5410.77 | 59161.9 | 185586.25 | 901919.81 |
| 6 | DEUTSCHE BANK AKTIENGESELLSCHAFT | 148053.59 | 359595.08 | 4465.65 | 32934.44 | 228016.22 | 773064.98 |
| 7 | Banco Santander, S.A. | 0.0 | 234617.44 | 9349.46 | 57137.08 | 383448.41 | 684552.39 |
| 8 | Coöperatieve Rabobank U.A. | 135460.38 | 245845.04 | 3761.93 | 9100.78 | 245149.87 | 639318.0 |
| 9 | Confédération Nationale du Crédit Mutuel | 0.0 | 175388.06 | 20497.9 | 41189.36 | 388215.37 | 625290.68 |
