<a href="https://colab.research.google.com/github/Ravi-kjain84/Articles/blob/main/GSIB_Predictive_Analysis_RWA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Predictive Analysis of SNL_RWA_TO_ASSETS Using Macroeconomic Indicators
This notebook examines how macroeconomic indicators relate to `SNL_RWA_TO_ASSETS`, the ratio of risk-weighted assets (RWA) to total assets, across a dataset of global banking institutions. It first runs a correlation analysis to gauge how strongly each indicator is associated with the ratio, then fits a Linear Regression model to predict it.

In [None]:
# distutils left the standard library in Python 3.12 and is not a PyPI package.
# setuptools ships a compatible shim, so install that instead.
!pip install setuptools

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import seaborn as sns
import matplotlib.pyplot as plt

## Data Loading and Initial Analysis

In [None]:
file_path = 'Final_data_without_formula.xlsx'
data = pd.read_excel(file_path)
data.head()
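
Before cleaning, it can help to confirm that the spreadsheet actually contains the columns the rest of the notebook relies on. Two of the names used later (`'Exchange Rate, average '` and `'Exchange Rate, end '`) end in a space, which is easy to mismatch. The sketch below is only an illustration: `missing_columns` is a hypothetical helper, and the toy frame with its values is made up rather than taken from the real file.

```python
import pandas as pd


def missing_columns(df: pd.DataFrame, expected: list[str]) -> list[str]:
    """Return the expected column names that are absent from df, in order."""
    present = set(df.columns)
    return [name for name in expected if name not in present]


# Toy frame standing in for the spreadsheet (values are invented).
toy = pd.DataFrame({
    'Nominal GDP ($B)': [100.0, 200.0],
    'Exchange Rate, average': [1.1, 1.2],  # no trailing space here
    'SNL_RWA_TO_ASSETS': [45.0, 50.0],
})

# The notebook's spelling of the exchange-rate column has a trailing space.
expected = ['Nominal GDP ($B)', 'Exchange Rate, average ', 'SNL_RWA_TO_ASSETS']
absent = missing_columns(toy, expected)
print(absent)
```

In the notebook itself, the equivalent call would be `missing_columns(data, relevant_columns)` once `relevant_columns` has been defined.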

## Data Cleaning
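Drop every row in which any relevant macroeconomic attribute or the target variable (`SNL_RWA_TO_ASSETS`) is exactly zero, on the assumption that a zero is a placeholder for a missing value rather than a real observation. This filter also discards genuine zeros, such as a `Real GDP Growth (%)` of exactly 0, so it is a deliberate trade-off.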

In [None]:
relevant_columns = ['Trade Balance ($B)', 'Goods Exports ($B)', 'Goods Imported ($B)', 'Nominal GDP ($B)', 'GDP Per Capita ($)', 'Real GDP Growth (%)', 'Real GDP, Local Currency ($B)', 'Population (M)', 'Consumer Price Inflation (%)', 'Producer Price Inflation (%)', 'Exchange Rate, average ', 'Exchange Rate, end ', 'SNL_RWA_TO_ASSETS']
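# Keep only rows where every selected column is non-zero; .copy() gives an independent frame.
cleaned_data = data.loc[(data[relevant_columns] != 0).all(axis=1), relevant_columns].copy()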
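
One subtlety of the zero filter is that a missing value (`NaN`) compares as "not equal to zero", so rows with `NaN` pass through it. The sketch below shows this on a made-up toy frame and adds a `dropna()` step. Whether the real spreadsheet contains any `NaN` values is an assumption that should be checked, for example with `data[relevant_columns].isna().sum()`.

```python
import numpy as np
import pandas as pd

# Toy data (invented): row 1 and row 2 contain a zero, row 3 contains a NaN.
toy = pd.DataFrame({
    'Population (M)': [10.0, 0.0, 30.0, np.nan],
    'SNL_RWA_TO_ASSETS': [40.0, 55.0, 0.0, 60.0],
})
cols = ['Population (M)', 'SNL_RWA_TO_ASSETS']

# Same filter as in the notebook: keep rows where no selected column is zero.
nonzero_mask = (toy[cols] != 0).all(axis=1)
kept = toy.loc[nonzero_mask, cols]

# NaN != 0 evaluates to True, so the NaN row survives until dropna() is applied.
kept_no_nan = kept.dropna()

print(list(kept.index))
print(list(kept_no_nan.index))
```

`LinearRegression` does not accept `NaN` inputs, so a step like `dropna()` (or an imputation strategy) would be needed before fitting if any missing values remain.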

## Correlation Analysis
Compute the Pearson correlation between `SNL_RWA_TO_ASSETS` and each macroeconomic indicator, rank the indicators by absolute correlation, and plot the full correlation matrix as a heatmap.

In [None]:
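corr_matrix = cleaned_data.corr()
# Rank indicators by the strength (absolute value) of their correlation with the target.
correlation_results = corr_matrix['SNL_RWA_TO_ASSETS'].drop('SNL_RWA_TO_ASSETS').sort_values(key=abs, ascending=False)
print(correlation_results)
plt.figure(figsize=(12, 10))  # 13 variables need a larger canvas for readable annotations
sns.heatmap(corr_matrix, annot=True, fmt='.2f', cmap='coolwarm', vmin=-1, vmax=1)
plt.title('Correlation Matrix')
plt.show()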
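
To make the ranking step concrete, the sketch below reproduces it on synthetic data in which the true relationships are known by construction. The column names `x_pos`, `x_neg`, `noise`, and `target` are invented for illustration. It also recomputes one coefficient by hand, as covariance divided by the product of standard deviations, to show what `DataFrame.corr()` returns by default.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
x_pos = rng.normal(size=n)   # positively related to the target
x_neg = rng.normal(size=n)   # strongly negatively related to the target
noise = rng.normal(size=n)   # unrelated to the target

toy = pd.DataFrame({
    'x_pos': x_pos,
    'x_neg': x_neg,
    'noise': noise,
    'target': 1.0 * x_pos - 2.0 * x_neg,
})

# Rank features by absolute Pearson correlation with the target.
ranked = toy.corr()['target'].drop('target').sort_values(key=abs, ascending=False)
print(ranked)

# Pearson r by hand: sample covariance over the product of sample standard deviations.
cov = np.cov(toy['x_neg'], toy['target'])[0, 1]
manual_r = cov / (toy['x_neg'].std() * toy['target'].std())
print(manual_r)
```

Sorting by absolute value matters here: `x_neg` ranks first even though its coefficient is negative, which a plain descending sort would place last.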

## Model Development and Evaluation
Fit a Linear Regression model to predict `SNL_RWA_TO_ASSETS` from the macroeconomic indicators using an 80/20 train/test split, then evaluate it on the held-out set with R-squared and RMSE.

In [None]:
X = cleaned_data.drop('SNL_RWA_TO_ASSETS', axis=1)
y = cleaned_data['SNL_RWA_TO_ASSETS']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
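# The `squared` argument was deprecated in scikit-learn 1.4 and removed in later releases,
# so take the square root of the MSE explicitly.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f'R-squared: {r2:.4f}')
print(f'RMSE: {rmse:.4f}')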
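
A single 80/20 split can give a noisy performance estimate when the dataset is small. As a complementary check, the sketch below runs 5-fold cross-validation with the same model class. It uses synthetic data (the feature matrix, coefficients, and noise level are invented for illustration), so the numbers it prints say nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression problem: 200 rows, 3 features, small Gaussian noise.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Shuffle before splitting so each fold is a random sample of the rows.
cv = KFold(n_splits=5, shuffle=True, random_state=42)

r2_scores = cross_val_score(LinearRegression(), X_demo, y_demo, cv=cv, scoring='r2')

# scikit-learn reports MSE as a negated score, so flip the sign before the square root.
neg_mse = cross_val_score(LinearRegression(), X_demo, y_demo, cv=cv,
                          scoring='neg_mean_squared_error')
rmse_scores = np.sqrt(-neg_mse)

print(f'R-squared per fold: {np.round(r2_scores, 4)}')
print(f'Mean RMSE: {rmse_scores.mean():.4f} (std {rmse_scores.std():.4f})')
```

In the notebook, `X_demo` and `y_demo` would be replaced by `X` and `y`. A large spread across folds would suggest that the single-split R-squared and RMSE should be read with caution.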