# Analysis of EBA Transparency Exercise Datasets

## Introduction

This notebook explores the datasets published by the European Banking Authority (EBA) under its EU-wide Transparency Exercise, accessible [here](https://www.eba.europa.eu/risk-analysis-and-data/eu-wide-transparency-exercise). These datasets provide bank-level data on assets and risk exposures, which makes them useful for financial analysis and regulatory assessment.

### Notes

- **Understand the Structure**: We will examine the structure and content of the EBA's credit risk dataset to understand the type of data provided and its potential applications.
- **Ensure Data Integrity**: Although these datasets are generally user-friendly, they still need checks to validate data integrity and reliability. Compare a sample of your intermediate results with the figures shown in the EBA's interactive tool, specifically the visualizations found [here](https://tools.eba.europa.eu/interactive-tools/2023/powerbi/837203/tr23_visualisation_page5.html), to confirm consistency and to understand any discrepancies.
- **Download the credit risk datasets from here**: https://www.eba.europa.eu/assets/TE2023/Full_database/837203/tr_cre.csv
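
The integrity check described in the notes can be sketched as a small helper that compares computed figures with values read manually from the EBA interactive tool. The function name `compare_to_reference`, the 1% tolerance, and the bank names and figures below are illustrative assumptions; the numbers are placeholders, not published EBA values.

```python
import pandas as pd


def compare_to_reference(computed: pd.Series,
                         reference: pd.Series,
                         rel_tol: float = 0.01) -> pd.DataFrame:
    """Flag entries whose computed value deviates from the reference by more than rel_tol."""
    # Keep only the entries present in both series
    out = pd.concat(
        [computed.rename("computed"), reference.rename("reference")],
        axis=1,
        join="inner",
    )
    out["rel_diff"] = (out["computed"] - out["reference"]).abs() / out["reference"].abs()
    out["ok"] = out["rel_diff"] <= rel_tol
    return out


# Placeholder figures (hypothetical, for illustration only)
computed = pd.Series({"Bank A": 1000.0, "Bank B": 250.0})
reference = pd.Series({"Bank A": 1002.0, "Bank B": 300.0})

check = compare_to_reference(computed, reference)
print(check)
```

Rows where `ok` is `False` are the ones worth investigating, for example for a different reporting period or a different breakdown level.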

In [24]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# Settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

pd.options.display.float_format = '{:,.2f}'.format

### A. Create base table

In [25]:
# Import mapping tables from the metadata workbook
exposures = pd.read_excel('metadata_tr_2023.xlsx', sheet_name='Exposures')
portfolio = pd.read_excel('metadata_tr_2023.xlsx', sheet_name='Portfolio')
fin_instrument = pd.read_excel('metadata_tr_2023.xlsx', sheet_name='Financial_Instruments')
assets_stages = pd.read_excel('metadata_tr_2023.xlsx', sheet_name='ASSETS_Stages')
assets_fv = pd.read_excel('metadata_tr_2023.xlsx', sheet_name='ASSETS_FV')
institutions = pd.read_excel('metadata_tr_2023.xlsx', sheet_name='List of institutions')

# Import core dataset
df = pd.read_csv('tr_oth_2023.csv', low_memory=False)

In [26]:
# Create final dataset: left-join each mapping table onto the core data to attach readable labels
result = pd.merge(df, exposures, on='Exposure', how='left')
result = pd.merge(result, fin_instrument, on='Financial_instruments', how='left')
result = pd.merge(result, assets_stages, on='ASSETS_Stages', how='left')
result = pd.merge(result, assets_fv, on='ASSETS_FV', how='left')
result = pd.merge(result, institutions, on='LEI_Code', how='left')

result.head()

Unnamed: 0,LEI_Code,NSA,Period,Item,Label,ASSETS_FV,ASSETS_Stages,Exposure,Financial_instruments,Amount,Fin_end_year,n_quarters,Footnote,Row,Column,Sheet,exposure_label,Financial_instruments_label,ASSETS_Stages_label,ASSETS_FV_label,institution_country,Desc_country,Name,Finrep,Fin_year_end
0,0W2PZJM8XOY22M4GG883,DE,202209,2321001,"Cash, cash balances at central banks and other demand deposits",0,0,0,0,22438.22,0,3,,10,5,Assets,Total / No breakdown,Total / No breakdown,No breakdown by ASSETS_Stages,No breakdown by ASSETS_FV,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
1,0W2PZJM8XOY22M4GG883,DE,202209,2321002,Financial assets held for trading,0,0,0,0,16279.12,0,3,,11,5,Assets,Total / No breakdown,Total / No breakdown,No breakdown by ASSETS_Stages,No breakdown by ASSETS_FV,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
2,0W2PZJM8XOY22M4GG883,DE,202209,2321002,Financial assets held for trading,1,0,0,0,3956.74,0,3,,11,6,Assets,Total / No breakdown,Total / No breakdown,No breakdown by ASSETS_Stages,Fair value hierarchy: Level 1,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
3,0W2PZJM8XOY22M4GG883,DE,202209,2321002,Financial assets held for trading,2,0,0,0,11736.86,0,3,,11,7,Assets,Total / No breakdown,Total / No breakdown,No breakdown by ASSETS_Stages,Fair value hierarchy: Level 2,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
4,0W2PZJM8XOY22M4GG883,DE,202209,2321002,Financial assets held for trading,3,0,0,0,585.52,0,3,,11,8,Assets,Total / No breakdown,Total / No breakdown,No breakdown by ASSETS_Stages,Fair value hierarchy: Level 3,DE,Germany,DekaBank Deutsche Girozentrale,Yes - IFRS,31/12
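
Because every lookup is attached with a left merge, two checks are worth running: that no mapping table has duplicate keys (which would multiply rows) and that every code in the core data finds a label. Below is a minimal sketch on toy frames using the `validate` and `indicator` parameters of `pd.merge`. The helper name `checked_merge`, the code `9`, and the stage labels are assumptions for illustration.

```python
import pandas as pd


def checked_merge(left: pd.DataFrame, lookup: pd.DataFrame, key: str):
    """Left-join a lookup table; raise on duplicate lookup keys and report unmapped codes."""
    merged = pd.merge(
        left,
        lookup,
        on=key,
        how="left",
        validate="many_to_one",  # raises MergeError if `lookup` has duplicate keys
        indicator=True,          # adds a `_merge` column describing each row's origin
    )
    unmapped = sorted(merged.loc[merged["_merge"] == "left_only", key].unique())
    return merged.drop(columns="_merge"), unmapped


# Toy data: code 9 is deliberately missing from the lookup table
facts = pd.DataFrame({"ASSETS_Stages": [0, 1, 2, 9], "Amount": [10.0, 4.0, 5.0, 1.0]})
stages = pd.DataFrame({
    "ASSETS_Stages": [0, 1, 2, 3],
    "ASSETS_Stages_label": ["No breakdown", "Stage 1", "Stage 2", "Stage 3"],  # placeholder labels
})

merged, unmapped = checked_merge(facts, stages, "ASSETS_Stages")
print(len(merged) == len(facts))  # row count is preserved by a many-to-one left join
print(unmapped)
```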


In [27]:
result['Label'].unique()

array(['Cash, cash balances at central banks and other demand deposits',
       'Financial assets held for trading',
       'Non-trading financial assets mandatorily at fair value through profit or loss',
       'Financial assets designated at fair value through profit or loss',
       'Financial assets at fair value through other comprehensive income',
       'Financial assets at amortised cost',
       'Derivatives – Hedge accounting',
       'Fair value changes of the hedged items in portfolio hedge of interest rate risk',
       'Other assets', 'Total Assets',
       'Gross carrying amount: Financial assets at fair value through other comprehensive income, Debt securities',
       'Gross carrying amount: Financial assets at fair value through other comprehensive income, Loans and advances',
       'Gross carrying amount: Financial assets at amortised cost, Debt securities',
       'Gross carrying amount: Financial assets at amortised cost, Loans and advances',
       'Accumulated i
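
With this many distinct labels, a keyword search is a convenient way to find the exact strings needed for filtering later on. Below is a small sketch, assuming a hypothetical helper `find_labels`; the sample labels are copied from the output above.

```python
import pandas as pd


def find_labels(labels: pd.Series, keyword: str) -> list:
    """Return the unique labels that contain `keyword` (case-insensitive, literal match)."""
    unique = pd.Series(labels.dropna().unique())
    mask = unique.str.contains(keyword, case=False, regex=False)
    return unique[mask].tolist()


labels = pd.Series([
    "Financial assets held for trading",
    "Financial assets at amortised cost",
    "Gross carrying amount: Financial assets at amortised cost, Debt securities",
    "Gross carrying amount: Financial assets at amortised cost, Loans and advances",
    "Total Assets",
])

matches = find_labels(labels, "loans and advances")
print(matches)
```

In the notebook itself the call would be `find_labels(result['Label'], 'loans and advances')`.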

### B. Compare stage 2 to stage 3

In [33]:
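# Create base table: accumulated impairment on loans and advances at amortised cost,
# broken down by stage, for period 202306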
idx_1 = result['Label'].isin(['Accumulated impairment: Financial assets at amortised cost, Loans and advances'])
idx_2 = result['Period'] == 202306

pivot_df = pd.pivot_table(result[idx_1 & idx_2], 
                          values='Amount', 
                          index=['Name'], 
                          columns=['ASSETS_Stages'], 
                          aggfunc="sum", 
                          margins=True,
                          fill_value=0).reset_index()

pivot_df.sort_values(by=pivot_df.columns[-1], ascending=True).head(40)

| ASSETS_Stages | Name | 1 | 2 | 3 | All |
|---|---|---:|---:|---:|---:|
| 112 | All | -31895.95 | -54005.76 | -148668.11 | -234569.82 |
| 21 | Banco Santander, S.A. | -3941.48 | -5224.76 | -14130.64 | -23296.88 |
| 53 | Groupe Crédit Agricole | -2986.88 | -5806.15 | -12265.42 | -21058.45 |
| 15 | BNP Paribas | -2050.02 | -2476.04 | -13272.16 | -17798.22 |
| 52 | Groupe BPCE | -1310.86 | -3993.91 | -8717.38 | -14022.15 |
| 39 | Confédération Nationale du Crédit Mutuel | -1920.2 | -2124.87 | -7309.19 | -11354.26 |
| 19 | Banco Bilbao Vizcaya Argentaria, S.A. | -2143.18 | -2021.14 | -7106.52 | -11270.84 |
| 106 | UNICREDIT, SOCIETA' PER AZIONI | -1396.87 | -3856.85 | -5792.77 | -11046.49 |
| 99 | Société générale S.A. | -1052.14 | -2036.6 | -7578.85 | -10667.59 |
| 61 | Intesa Sanpaolo S.p.A. | -780.01 | -1671.22 | -4973.36 | -7424.59 |


In [38]:
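# Create base table: gross carrying amount of loans and advances at amortised cost,
# broken down by stage, for period 202306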
idx_1 = result['Label'].isin(['Gross carrying amount: Financial assets at amortised cost, Loans and advances'])
idx_2 = result['Period'] == 202306

pivot_df = pd.pivot_table(result[idx_1 & idx_2], 
                          values='Amount', 
                          index=['Name'], 
                          columns=['ASSETS_Stages'], 
                          aggfunc="sum", 
                          margins=True,
                          fill_value=0).reset_index()

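# Column names mix integer stage codes with strings; cast all of them to str
pivot_df.columns = pivot_df.columns.astype(str)
# Note: despite the column name, this is Stage 2 divided by Stage 1
pivot_df['Stage_1_vs_Stage_2'] = pivot_df['2'] / pivot_df['1']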

pivot_df.sort_values(by='All', ascending=False).head(40)

| ASSETS_Stages | Name | 1 | 2 | 3 | All | Stage_1_vs_Stage_2 |
|---|---|---:|---:|---:|---:|---:|
| 112 | All | 13788690.65 | 1411158.98 | 333903.79 | 15533753.42 | 0.1 |
| 53 | Groupe Crédit Agricole | 1138802.13 | 111480.18 | 25153.91 | 1275436.22 | 0.1 |
| 21 | Banco Santander, S.A. | 1009877.95 | 72759.93 | 33575.03 | 1116212.91 | 0.07 |
| 52 | Groupe BPCE | 795888.56 | 126951.26 | 21024.95 | 943864.78 | 0.16 |
| 15 | BNP Paribas | 831092.5 | 70675.02 | 25926.28 | 927693.8 | 0.09 |
| 39 | Confédération Nationale du Crédit Mutuel | 645056.14 | 40543.74 | 15388.53 | 700988.41 | 0.06 |
| 59 | ING Groep N.V. | 621071.11 | 49906.81 | 10648.75 | 681626.67 | 0.08 |
| 42 | DEUTSCHE BANK AKTIENGESELLSCHAFT | 530274.2 | 52086.53 | 12146.44 | 594507.18 | 0.1 |
| 99 | Société générale S.A. | 491919.26 | 36930.96 | 16436.19 | 545286.41 | 0.08 |
| 106 | UNICREDIT, SOCIETA' PER AZIONI | 410441.67 | 79767.72 | 12163.08 | 502372.48 | 0.19 |
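
Dividing accumulated impairment by gross carrying amount, stage by stage, gives a coverage ratio that puts the two tables above on a common scale. The sketch below is an illustration and not part of the original notebook. It reuses the Banco Santander and BNP Paribas rows from the two outputs above, and the helper name `coverage_by_stage` is an assumption.

```python
import pandas as pd


def coverage_by_stage(impairment: pd.DataFrame, gross: pd.DataFrame) -> pd.DataFrame:
    """Coverage ratio per stage: |accumulated impairment| / gross carrying amount."""
    # Align on both index (bank) and columns (stage) before dividing
    imp, gca = impairment.align(gross, join="inner")
    return imp.abs() / gca


banks = ["Banco Santander, S.A.", "BNP Paribas"]
stages = ["1", "2", "3"]

# Values copied from the accumulated impairment output above
impairment = pd.DataFrame(
    [[-3941.48, -5224.76, -14130.64],
     [-2050.02, -2476.04, -13272.16]],
    index=banks, columns=stages,
)

# Values copied from the gross carrying amount output above
gross = pd.DataFrame(
    [[1009877.95, 72759.93, 33575.03],
     [831092.50, 70675.02, 25926.28]],
    index=banks, columns=stages,
)

coverage = coverage_by_stage(impairment, gross)
print(coverage.round(3))
```

Coverage rises sharply from Stage 1 to Stage 3 for both banks, which is the expected pattern because Stage 3 holds credit-impaired exposures.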


### C. Compare IRB Shortfall

In [51]:
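# Create base table: IRB shortfall next to Stage 3 accumulated impairment, for period 202306
idx_1 = result['Period'] == 202306
idx_2 = result['Label'].isin(['(-) IRB shortfall of credit risk adjustments to expected losses'])
idx_3 = result['Label'].isin(['Accumulated impairment: Financial assets at amortised cost, Loans and advances'])
# Build the Stage 3 mask with `==`; a single `=` would overwrite the whole
# ASSETS_Stages column with 3 and leave the filter without effect.
# The impairment column in the output shown below equals the all-stage totals
# from section B (e.g. -23296.88 for Banco Santander), not the Stage 3 figures.
idx_4 = result['ASSETS_Stages'] == 3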

pivot_df = pd.pivot_table(result[idx_1 & (idx_2 | (idx_3 & idx_4))], 
                          values='Amount', 
                          index=['Name'], 
                          columns=['Label'], 
                          aggfunc="sum", 
                          # margins=True,
                          fill_value=0).reset_index()

pivot_df.columns = pivot_df.columns.astype(str)
pivot_df['irb_shortfall_vs_stage_3_provisions'] = pivot_df['(-) IRB shortfall of credit risk adjustments to expected losses'] / pivot_df['Accumulated impairment: Financial assets at amortised cost, Loans and advances']

pivot_df.sort_values(by='Accumulated impairment: Financial assets at amortised cost, Loans and advances', ascending=True).head(40)

| Label | Name | (-) IRB shortfall of credit risk adjustments to expected losses | Accumulated impairment: Financial assets at amortised cost, Loans and advances | irb_shortfall_vs_stage_3_provisions |
|---|---|---:|---:|---:|
| 21 | Banco Santander, S.A. | -250.09 | -23296.88 | 0.01 |
| 60 | Groupe Crédit Agricole | -371.25 | -21058.45 | 0.02 |
| 15 | BNP Paribas | -478.67 | -17798.22 | 0.03 |
| 59 | Groupe BPCE | -191.32 | -14022.15 | 0.01 |
| 44 | Confédération Nationale du Crédit Mutuel | -381.31 | -11354.26 | 0.03 |
| 19 | Banco Bilbao Vizcaya Argentaria, S.A. | -193.96 | -11270.84 | 0.02 |
| 116 | UNICREDIT, SOCIETA' PER AZIONI | -6.75 | -11046.49 | 0.0 |
| 109 | Société générale S.A. | -359.58 | -10667.59 | 0.03 |
| 68 | Intesa Sanpaolo S.p.A. | -230.8 | -7424.59 | 0.03 |
| 40 | CaixaBank, S.A. | 0.0 | -7151.07 | -0.0 |
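
The filter-and-ratio step can be written defensively, as in the sketch below on toy data: every mask is built with `==`, the input frame is left unmodified, and a zero denominator yields `NaN` instead of `inf`. The helper name `shortfall_ratio`, the short labels `'shortfall'` and `'impairment'`, and all figures are illustrative assumptions, not values from the EBA dataset.

```python
import numpy as np
import pandas as pd


def shortfall_ratio(df: pd.DataFrame,
                    shortfall_label: str,
                    impairment_label: str,
                    stage: int = 3) -> pd.DataFrame:
    """IRB shortfall divided by accumulated impairment of one stage, per bank."""
    is_shortfall = df["Label"] == shortfall_label
    # Restrict the impairment rows to the requested stage
    is_impairment = (df["Label"] == impairment_label) & (df["ASSETS_Stages"] == stage)

    pivot = pd.pivot_table(
        df[is_shortfall | is_impairment],
        values="Amount",
        index="Name",
        columns="Label",
        aggfunc="sum",
        fill_value=0,
    )
    # Replace zero denominators with NaN so the ratio is undefined instead of infinite
    denominator = pivot[impairment_label].replace(0, np.nan)
    pivot["ratio"] = pivot[shortfall_label] / denominator
    return pivot


# Toy data (hypothetical banks and amounts)
toy = pd.DataFrame({
    "Name": ["Bank A"] * 4 + ["Bank B"] * 2,
    "Label": ["shortfall", "impairment", "impairment", "impairment",
              "shortfall", "impairment"],
    "ASSETS_Stages": [0, 1, 2, 3, 0, 3],
    "Amount": [-10.0, -5.0, -15.0, -40.0, -2.0, 0.0],
})

ratios = shortfall_ratio(toy, "shortfall", "impairment")
print(ratios)
```

For Bank A only the Stage 3 amount (-40.0) enters the denominator. For Bank B the Stage 3 impairment is zero, so the ratio is `NaN`.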
