In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


In [None]:
df = pd.read_csv('/content/Churn_Modelling.csv')
df.head()

In [None]:
sns.countplot(x='Gender', hue='Exited', data=df)



Gender appears to be associated with churn: female customers exit at a higher rate than male customers.
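A count plot compares raw counts, so a difference in group size can be mistaken for a difference in churn. The sketch below turns the counts into a rate per group. It assumes the `Gender` and `Exited` columns used above; the helper name `churn_rate_table` is hypothetical.

```python
import pandas as pd

def churn_rate_table(data: pd.DataFrame, column: str, target: str = "Exited") -> pd.DataFrame:
    """Per-group customer count, churned count, and churn rate in percent."""
    grouped = data.groupby(column)[target]
    table = pd.DataFrame({
        "customers": grouped.size(),  # rows in each group
        "churned": grouped.sum(),     # Exited is 0/1, so the sum counts exits
    })
    table["churn_rate_pct"] = (table["churned"] / table["customers"] * 100).round(2)
    return table
```

In the notebook this would be called as `churn_rate_table(df, 'Gender')`.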

In [None]:
sns.countplot(x='Geography', hue='Exited', data=df)


Among the three regions, Spain has the lowest churn count. France has the most customers and the most retained customers, but it still shows churn. Germany has a higher churn level relative to its customer base, indicating a higher risk of customer exit in that region.

In [None]:
sns.boxplot(x='Exited', y='Age', data=df)


Non-exited customers have a wider age variation and more extreme old ages (outliers).

Exited customers are more clustered in middle age, with fewer extreme ages.

This could indicate that the oldest customers in this dataset are mostly loyal, but it does not mean churn falls with age in general: the correlation between Age and Exited reported in the heatmap below is positive (0.29).
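The box plot reading above can be checked numerically. The sketch below computes the quartiles per group and counts the points beyond the 1.5 × IQR whiskers, which is the default whisker rule in seaborn's `boxplot`. The helper name `box_summary` is hypothetical, and the quartiles use pandas' default linear interpolation.

```python
import pandas as pd

def box_summary(data: pd.DataFrame, value: str, group: str = "Exited") -> pd.DataFrame:
    """Per-group quartiles plus the number of points outside the 1.5*IQR whiskers."""
    rows = {}
    for key, values in data.groupby(group)[value]:
        q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
        iqr = q3 - q1
        # Points beyond the whisker limits are the ones a box plot draws as outliers
        outside = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
        rows[key] = {"q1": q1, "median": median, "q3": q3, "n_outliers": int(outside.sum())}
    return pd.DataFrame.from_dict(rows, orient="index")
```

Calling `box_summary(df, 'Age')` puts the two medians next to each other, which makes it easier to see which group is actually older.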

In [None]:
sns.boxplot(x='Exited', y='Balance', data=df)


The median balance for churned customers (Exited = 1) appears to be above zero but lower than the median for non-churned customers (Exited = 0).

Low-balance customers appear more likely to churn, since the lower quartile of the churned group sits near zero.

Higher-balance customers appear mostly in the non-churned group, as suggested by the higher median and the spread of balances for Exited = 0.
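Medians and quartiles are hard to read precisely off a box plot, especially when many balances are exactly zero. A minimal sketch that reports the median balance and the share of zero-balance customers per `Exited` group; the helper name `balance_profile` is hypothetical.

```python
import pandas as pd

def balance_profile(data: pd.DataFrame, group: str = "Exited", balance: str = "Balance") -> pd.DataFrame:
    """Median balance and percentage of zero-balance customers per group."""
    return data.groupby(group)[balance].agg(
        median_balance="median",
        # Mean of a boolean mask is the fraction of True values
        zero_balance_pct=lambda s: (s == 0).mean() * 100,
    )
```

`balance_profile(df)` returns one row per value of `Exited`, so the two medians can be compared directly.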

In [None]:
corr = df.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')


A correlation is usually considered strong when it is above 0.7 or below -0.7. Nothing here is that strong; even the most notable pair, Exited vs Age (0.29), is only weak to moderate.
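The strength reading above can be applied to every numeric column at once. The sketch below ranks correlations with `Exited` by absolute value and attaches a label. The 0.7 cut-off comes from the text; the 0.3 boundary between "weak" and "moderate" is an assumption, and the helper name `rank_correlations` is hypothetical.

```python
import pandas as pd

def rank_correlations(data: pd.DataFrame, target: str = "Exited") -> pd.DataFrame:
    """Correlations with the target, sorted by absolute value, with a strength label."""
    corr = data.select_dtypes(include="number").corr()[target].drop(target)
    ranked = corr.reindex(corr.abs().sort_values(ascending=False).index)
    labels = pd.cut(
        ranked.abs(),
        bins=[0, 0.3, 0.7, float("inf")],        # 0.3 is an assumed boundary
        labels=["weak", "moderate", "strong"],
        include_lowest=True,                     # keep |r| == 0 in the first bin
    )
    return pd.DataFrame({"r": ranked, "strength": labels})
```

Sorting by absolute value keeps strong negative correlations near the top, which a plain descending sort would push to the bottom.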

In [None]:
corr['Exited'].sort_values(ascending=False)

In [None]:
sns.scatterplot(x='Balance', y='NumOfProducts', hue='Exited', data=df)
plt.show()

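On this plot, customers with more products (3–4) appear less likely to churn, while those with fewer products (1–2) appear more likely to leave. Because NumOfProducts takes only a few discrete values, the points overlap heavily, so this reading should be confirmed with per-group churn rates before it drives retention strategies such as offering additional products or benefits to low-product customers.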
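With a discrete variable such as `NumOfProducts`, a table of counts and row percentages is easier to read than overlapping scatter points. A minimal sketch using `pd.crosstab`; the helper name `churn_by_products` and the `count_`/`pct_` column prefixes are hypothetical.

```python
import pandas as pd

def churn_by_products(data: pd.DataFrame, products: str = "NumOfProducts", target: str = "Exited") -> pd.DataFrame:
    """Counts and row percentages of the target for each number of products."""
    counts = pd.crosstab(data[products], data[target])
    # normalize="index" makes each row sum to 1, giving the share per product count
    rates = pd.crosstab(data[products], data[target], normalize="index") * 100
    return pd.concat([counts.add_prefix("count_"), rates.round(1).add_prefix("pct_")], axis=1)
```

In the result of `churn_by_products(df)`, the `pct_1` column is the churn rate for each product count and `count_1` shows how many customers that rate is based on.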

In [None]:
sns.boxplot(x='Geography', y='Balance', data=df)
plt.show()


In [None]:
df.groupby('Geography')['Balance'].describe()


Customers from France and Spain hold low or zero balances more often, while German customers tend to hold higher balances. This summary describes balances only; on its own it does not show which region churns more.
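To relate balance to churn by region, both can be put in the same table. The sketch below uses named aggregation on the `Geography`, `Balance`, and `Exited` columns from above; the helper name `geography_profile` and the output column names are hypothetical.

```python
import pandas as pd

def geography_profile(data: pd.DataFrame) -> pd.DataFrame:
    """Customer count, balance summary, and churn rate per geography."""
    return data.groupby("Geography").agg(
        customers=("Exited", "size"),
        median_balance=("Balance", "median"),
        zero_balance_pct=("Balance", lambda s: (s == 0).mean() * 100),
        churn_rate_pct=("Exited", lambda s: s.mean() * 100),
    )
```

Reading `zero_balance_pct` next to `churn_rate_pct` in `geography_profile(df)` shows whether the regions with more zero balances are also the regions with more churn.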

In [None]:
df.groupby('Geography')['Exited'].mean() * 100


Customers from Germany show a noticeably higher churn rate than those from France and Spain, suggesting that geography is a useful predictor of churn.
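Whether the difference between regions is more than chance can be checked with a chi-square test of independence on the Geography × Exited table. The sketch below computes the statistic by hand with pandas only; the helper name `chi_square_independence` is hypothetical. With three regions and two outcomes there are 2 degrees of freedom, for which the 5% critical value is 5.991. `scipy.stats.chi2_contingency` would return a p-value directly if SciPy is available.

```python
import pandas as pd

def chi_square_independence(data: pd.DataFrame, row: str = "Geography", col: str = "Exited"):
    """Return (chi-square statistic, degrees of freedom) for the row x col table."""
    observed = pd.crosstab(data[row], data[col])
    row_totals = observed.sum(axis=1).to_numpy()
    col_totals = observed.sum(axis=0).to_numpy()
    total = observed.to_numpy().sum()
    # Expected counts if row and column were independent
    expected = row_totals[:, None] * col_totals[None, :] / total
    statistic = float(((observed.to_numpy() - expected) ** 2 / expected).sum())
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return statistic, dof
```

A statistic from `chi_square_independence(df)` well above the critical value would support the claim that churn differs by geography.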