# 🧪 Beginner Notebook – FairFrame Challenge

This notebook provides a basic exploratory analysis of synthetic loan data. 
Your goal is to identify potential group-level disparities (e.g., based on sex, caste, or religion) in loan approvals.


In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load dataset
df = pd.read_csv("data/loan_data.csv")
df.head()


In [None]:
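# Basic Overview
from IPython.display import display

# A cell only auto-displays its last expression, so show both results explicitly.
display(df.describe(include='all'))
print(df['loan_approved'].value_counts(normalize=True))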


In [None]:
# Group-Level Approval Rates
group_cols = ['sex', 'caste', 'religion']
for col in group_cols:
    print(f"\nApproval rate by {col}")
    print(df.groupby(col)['loan_approved'].mean())
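
A raw approval rate can mislead when a group has only a handful of rows, so it helps to read each rate next to its group size. The sketch below is one way to do that. The `toy` DataFrame and the `approval_summary` helper are hypothetical stand-ins rather than part of the challenge materials, and the code assumes `loan_approved` is coded 0/1, as the mean-based rates above imply.

```python
import pandas as pd

# Hypothetical toy data standing in for data/loan_data.csv.
# Assumes loan_approved is coded 0/1.
toy = pd.DataFrame({
    "sex": ["F", "F", "F", "M", "M", "M", "M", "M"],
    "loan_approved": [1, 0, 0, 1, 1, 1, 0, 1],
})


def approval_summary(frame, col, target="loan_approved"):
    """Return group size and approval rate for each level of `col`."""
    return (
        frame.groupby(col)[target]
        .agg(n="count", approval_rate="mean")
        .sort_values("approval_rate")  # lowest approval rate first
    )


summary = approval_summary(toy, "sex")
print(summary)
```

On the real data, the same helper could be called as `approval_summary(df, col)` inside the loop over `group_cols`.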


In [None]:
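# Visualizations
# sns.barplot plots the mean of loan_approved per group (the approval rate);
# the error bars are seaborn's default 95% confidence intervals.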
sns.barplot(x='sex', y='loan_approved', data=df)
plt.title("Loan Approval Rate by Sex")
plt.show()

sns.barplot(x='caste', y='loan_approved', data=df)
plt.title("Loan Approval Rate by Caste")
plt.xticks(rotation=45)
plt.show()

sns.barplot(x='religion', y='loan_approved', data=df)
plt.title("Loan Approval Rate by Religion")
plt.xticks(rotation=45)
plt.show()


In [None]:
# Optional: Disparate Impact Ratio (Sex)
# Ratio of the lowest group approval rate to the highest; 1.0 means equal rates.
approval_rate = df.groupby('sex')['loan_approved'].mean()
di_ratio = approval_rate.min() / approval_rate.max()
print(f"Disparate Impact Ratio (Sex): {di_ratio:.2f}")
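
The same min/max ratio can be computed for every group column, not only sex. The sketch below uses a hypothetical `toy` DataFrame and a hypothetical `disparate_impact_ratio` helper, and assumes `loan_approved` is coded 0/1. The 0.8 cutoff is the commonly cited "four-fifths" rule of thumb, used here only as a screening flag and not as proof of bias. Note also that min/max is one convention; many definitions instead divide the rate of a designated unprivileged group by that of a privileged group.

```python
import pandas as pd

# Hypothetical toy data standing in for data/loan_data.csv.
toy = pd.DataFrame({
    "sex":      ["F", "F", "F", "F", "M", "M", "M", "M"],
    "religion": ["A", "B", "A", "B", "A", "B", "A", "B"],
    "loan_approved": [1, 0, 1, 1, 1, 1, 1, 0],
})


def disparate_impact_ratio(frame, col, target="loan_approved"):
    """Lowest group approval rate divided by the highest (min/max convention)."""
    rates = frame.groupby(col)[target].mean()
    max_rate = rates.max()
    if max_rate == 0:
        # No group has any approvals, so the ratio is undefined.
        return float("nan")
    return rates.min() / max_rate


ratios = {col: disparate_impact_ratio(toy, col) for col in ["sex", "religion"]}
for col, ratio in ratios.items():
    flag = "below 0.8" if ratio < 0.8 else "at or above 0.8"
    print(f"Disparate Impact Ratio ({col}): {ratio:.2f} ({flag})")
```

On the real data, passing `df` and iterating over `group_cols` would give one ratio per attribute. A ratio based on very small groups should be read with caution.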


# ✍️ Summary

In your final submission, briefly summarize:

- Which group(s) appeared to have lower approval rates?
- Are there any group disparities that could indicate unfair bias?
- What further data would help you clarify your findings?

This analysis is intended as a first step toward identifying potential fairness concerns in the loan approval system.
