In [None]:
# Import the necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
# Suppress warnings to keep the notebook output readable.
# A single catch-all filter is enough; simplefilter('ignore') would repeat it.
import warnings
warnings.filterwarnings('ignore')

#### Column descriptions:

Countries_Num: Numeric code representing the 8 countries in the West African Economic and Monetary Union (ranges from 1 to 8).

id: An identifier for each bank in the dataset.

Countries: Categorical variable indicating the names of the countries in the West African Economic and Monetary Union.

Banks: Categorical variable representing the names of the banks within the specified countries.

Year: Integer variable indicating the year in which the data was recorded.

RIR (Risk Index Rating): A measure assessing the level of risk associated with financial institutions.

SFS (Solvency and Financial Stability): A metric indicating the financial health and stability of the banks.

INF (Inflation Rate): Represents the inflation rate, a measure of the general rise in prices over a period of time.

ERA (Economic Risk Assessment): An evaluation of the potential economic risks within the banking sector.

INL (Internationalization Level): Indicates the extent to which banks are involved in international activities.

Zscore: A measure of a bank's financial health and of the likelihood that it will go bankrupt within the next two years.

DEBT (Debt Level): Represents the amount of debt held by the banks.

SIZE: Represents the size of the banks, typically measured by total assets or other relevant financial metrics.

CC (Capital Adequacy): A measure of a bank's capital in relation to its risk-weighted assets.

GE (Governance and Ethics): Evaluates the governance practices and ethical standards within the banking institutions.

PS (Profitability and Sustainability): Assesses the profitability and sustainability of the banks.

RQ (Regulatory Compliance): Measures the extent to which banks adhere to regulatory requirements.

RL (Liquidity Risk): Evaluates the risk associated with a bank's ability to meet its short-term obligations.

VA (Value Added): Indicates the value added by the banks to the overall economic environment.

In [None]:
# Load the dataset
uemoa_df = pd.read_csv("uemoa_banking.csv", encoding='utf-8')
uemoa_df.head()

In [None]:
# Count the null values in each column
uemoa_df.isnull().sum()

In [None]:
# Shape of the dataset as (rows, columns)
uemoa_df.shape

In [None]:
# Exploring basic statistics
metrics = ['RIR', 'SFS', 'INF', 'ERA', 'INL', 'Zscore', 'DEBT', 'SIZE', 'CC', 'GE', 'PS', 'RQ', 'RL', 'VA']
uemoa_df[metrics].describe()

## Summary of the basic statistics for the selected metrics:

1. **Risk Index Rating (RIR):**
   - **Mean (3.71):** On average, the risk index rating for banks in the dataset is moderate.
   - **Standard Deviation (4.11):** The standard deviation exceeds the mean, indicating substantial variability in risk index ratings among banks.
   - **Range (-23.14 to 7.58):** The maximum is close to the mean while the minimum lies far below it, so the distribution has a long negative tail.

2. **Solvency and Financial Stability (SFS):**
   - **Mean (31.97):** The average solvency and financial stability score is around 32, indicating a relatively stable financial position.
   - **Standard Deviation (8.24):** There is moderate variability in solvency and financial stability scores among the banks.
   - **Range (15.83 to 51.68):** The scores vary between 15.83 and 51.68, showcasing diversity in financial stability.

3. **Inflation Rate (INF):**
   - **Mean (0.52):** The average inflation rate is relatively low, indicating a generally stable economic environment.
   - **Standard Deviation (1.27):** There is some variability in inflation rates across the dataset.
   - **Range (-3.23 to 2.97):** Inflation rates range from negative values (deflation) to positive values.

4. **Economic Risk Assessment (ERA):**
   - **Mean (9.77):** The average economic risk assessment is around 9.77, suggesting a moderate level of economic risk within the banking sector.
   - **Standard Deviation (18.97):** The standard deviation is nearly twice the mean, indicating that economic risk assessments differ substantially across observations.
   - **Range (-179.75 to 179.06):** The extremes lie roughly nine to ten standard deviations from the mean, so the range is probably driven by a small number of extreme observations.

5. **Internationalization Level (INL):**
   - **Mean (11.65):** On average, banks have an internationalization level of approximately 11.65, suggesting a moderate involvement in international activities.
   - **Standard Deviation (10.89):** There is considerable variation in the extent of internationalization among banks.
   - **Range (0.00 to 79.61):** The internationalization levels vary from zero to relatively high values.

6. **Zscore:**
   - **Mean (2.97):** The average Z-score is positive, indicating overall financial health.
   - **Standard Deviation (5.12):** There is considerable variability in Z-scores, suggesting diverse financial health among banks.
   - **Range (-47.78 to 39.38):** The Z-scores span a wide range, including negative values, indicating potential financial distress for some banks.

7. **Debt Level (DEBT):**
   - **Mean (38.78):** The average debt level is relatively high at 38.78.
   - **Standard Deviation (11.66):** There is significant variability in debt levels among banks.
   - **Range (18.50 to 65.87):** The debt levels vary widely, with some banks having lower debt and others having higher debt.

8. **Size of Banks (SIZE):**
   - **Mean (12.08):** The average size of banks is around 12.08, suggesting moderate-sized institutions.
   - **Standard Deviation (1.11):** There is relatively low variability in the size of banks.
   - **Range (8.68 to 14.58):** The sizes of banks are within a narrow range, indicating a relatively homogeneous distribution.

9. **Capital Adequacy (CC):**
   - **Mean (37.29):** The average capital adequacy ratio is 37.29, indicating a generally strong capital position.
   - **Standard Deviation (12.59):** There is notable variability in capital adequacy ratios, suggesting differences in capital strength.
   - **Range (13.46 to 58.65):** The capital adequacy ratios vary widely, reflecting diverse capital positions.

These summaries cover the first nine metrics; the remaining five (GE, PS, RQ, RL, VA) appear in the `describe()` output above. Together they describe the central tendency, variability, and range of the key metrics, which helps characterize the banking sector in the UEMOA countries.
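
The variability statements above are easier to compare across metrics when the standard deviation is expressed relative to the mean. The following is a minimal sketch and not part of the original analysis: `summarize_metrics` is a hypothetical helper, and the `demo` frame holds made-up numbers that only illustrate the output shape.

```python
import numpy as np
import pandas as pd


def summarize_metrics(df, columns):
    """Return mean, std, min, max and coefficient of variation per column."""
    summary = df[columns].agg(["mean", "std", "min", "max"]).T
    # Coefficient of variation: spread relative to the size of the mean.
    # It is undefined for a zero mean, so those rows become NaN.
    mean_abs = summary["mean"].abs().replace(0, np.nan)
    summary["cv"] = summary["std"] / mean_abs
    return summary


# Made-up values for illustration only; not taken from the UEMOA dataset.
demo = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [10.0, 10.0, 10.0, 10.0, 10.0],
})
demo_summary = summarize_metrics(demo, ["A", "B"])
print(demo_summary.round(2))
```

In this notebook the helper would be called as `summarize_metrics(uemoa_df, metrics)`. A `cv` above 1 flags metrics whose standard deviation exceeds the mean, as is the case for RIR, ERA, and Zscore according to the figures quoted above.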

In [None]:
# Calculate the correlation matrix
correlation_matrix = uemoa_df[metrics].corr()

# Visualize the correlation matrix
plt.figure(figsize=(14, 10))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Matrix')
plt.show()


In [None]:
# Visualize the distribution of selected metrics using boxplots
plt.figure(figsize=(16, 10))
sns.boxplot(data=uemoa_df[metrics], orient='v')
plt.title('Boxplots for selected metrics')
plt.show()



- **Economic Risk Assessment (ERA):**
  - The Economic Risk Assessment metric exhibits the highest number of outliers, both on the positive and negative sides of the box plot. This suggests significant variations in the evaluation of potential economic risks within the banking sector. Outliers may indicate extreme economic conditions in certain years or specific countries, impacting the overall risk assessment.

- **Inflation Rate (INF):**
  - The Inflation Rate metric shows a few outliers on the positive end of the box plot. These outliers could be attributed to exceptional economic circumstances in specific years or countries, resulting in unusually high inflation rates. It's essential to investigate these instances further to understand the economic factors contributing to the outliers.

- **Zscore:**
  - Zscore exhibits outliers on both ends of the box plot, with a higher concentration on the positive side. Outliers on the positive side may indicate financial institutions with exceptionally strong financial health, possibly due to robust business strategies or prudent financial management. On the negative side, outliers may highlight banks facing financial challenges, warranting a closer examination of their financial structures.

- **Risk Index Rating (RIR):**
  - Risk Index Rating displays a few outliers on the negative side of the box plot. If lower ratings correspond to higher risk, these outliers represent banks with unusually high-risk profiles, potentially influenced by economic downturns, regulatory issues, or other factors. Investigating these outliers can provide insights into specific risks affecting certain banks.

- **Liquidity Risk (RL):**
  - Liquidity Risk has a few outliers, indicating variations in a bank's ability to meet short-term obligations. Outliers may be linked to specific events such as economic crises or changes in banking regulations. Understanding the circumstances surrounding these outliers is crucial for assessing the liquidity resilience of individual banks.

- **Other Metrics:**
  - The remaining metrics show very few or no outliers, which suggests that their values are fairly consistent across the dataset. Because all metrics share one axis, metrics with a narrow range (such as SIZE) are compressed next to ERA, so their outliers can be hard to see. An absence of visible outliers is therefore suggestive of stable, uniform behavior rather than conclusive.

By identifying and understanding the outliers in each metric, stakeholders can gain valuable insights into the factors influencing the financial and economic dynamics of the banking institutions in the West African Economic and Monetary Union (UEMOA) countries. Further analysis and contextual information may be required to interpret the outliers accurately.
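
The outlier observations above are read off the plot by eye. They can be cross-checked by counting the points that fall outside the whiskers, which seaborn's boxplot places by default at 1.5 times the interquartile range beyond the quartiles. The following is a sketch under that assumption: `count_iqr_outliers` is a hypothetical helper, and the `demo` frame contains made-up values rather than data from the UEMOA dataset.

```python
import pandas as pd


def count_iqr_outliers(df, columns, whis=1.5):
    """Count values outside [Q1 - whis*IQR, Q3 + whis*IQR] for each column."""
    rows = {}
    for col in columns:
        values = df[col].dropna()
        q1, q3 = values.quantile(0.25), values.quantile(0.75)
        iqr = q3 - q1
        # Points below the lower fence and above the upper fence.
        low = int((values < q1 - whis * iqr).sum())
        high = int((values > q3 + whis * iqr).sum())
        rows[col] = {"low": low, "high": high, "total": low + high}
    return pd.DataFrame.from_dict(rows, orient="index")


# Made-up values for illustration only.
demo = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 100],   # one high outlier
    "B": [-50, 1, 2, 3, 4, 5, 6, 7, 8, 9],   # one low outlier
})
outlier_counts = count_iqr_outliers(demo, ["A", "B"])
print(outlier_counts)
```

In this notebook, `count_iqr_outliers(uemoa_df, metrics).sort_values("total", ascending=False)` would rank the metrics by outlier count. Because the counts are computed per column, they do not depend on the shared axis scale of the plot.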

In [None]:
# Scatter plot of Risk Index Rating (RIR) vs. Size of Banks (SIZE)
plt.figure(figsize=(10, 6))
sns.scatterplot(data=uemoa_df, x='RIR', y='SIZE', hue='Countries')
plt.title('Scatter plot of RIR vs. SIZE')
plt.xlabel('Risk Index Rating (RIR)')
plt.ylabel('Size of Banks (SIZE)')
plt.show()


In [None]:
# Number of observations recorded in each year
uemoa_df['Year'].value_counts()