# 🏥 Patient Outcome Analysis

This notebook explores hospital patient data to uncover patterns in patient outcomes such as length of stay, readmission, and discharge status.

## 🔍 1. Import Libraries and Load Data

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# set_theme is the current name for the older sns.set alias
sns.set_theme(style='whitegrid')

# Load the dataset
df = pd.read_csv('../data/hospital_data.csv')
df.head()

## 📊 2. Initial Data Exploration

In [None]:
# Basic structure
df.info()

In [None]:
# Summary statistics
df.describe(include='all')

In [None]:
# Check for missing values
df.isnull().sum()
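
Raw missing-value counts are hard to judge without the row count next to them, so it can help to look at the share of missing values per column as well. The sketch below uses a small made-up frame named `toy` in place of `df` (its values are hypothetical, not taken from `hospital_data.csv`); the same expression should work on `df` directly.

```python
import pandas as pd

# Hypothetical stand-in for `df`; values are invented for illustration only.
toy = pd.DataFrame({
    "Age": [34, None, 71, 58],
    "Gender": ["F", "M", None, "F"],
    "Outcome": ["Discharged", "Readmitted", "Discharged", None],
    "Length_of_Stay": [3, 5, 8, 2],
})

# isnull().mean() is the fraction of missing rows in each column
missing_share = toy.isnull().mean().sort_values(ascending=False)

# Show as percentages, most incomplete columns first
print((missing_share * 100).round(1))
```

Columns with a high missing share may need imputation or exclusion before the outcome plots in later sections are interpreted.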

## 📌 3. Outcome Distribution

In [None]:
sns.countplot(data=df, x='Outcome')
plt.title('Distribution of Patient Outcomes')
plt.xticks(rotation=45)
plt.show()

In [None]:
df['Outcome'].value_counts(normalize=True)

## 🧬 4. Outcome by Age, Gender, and Diagnosis

In [None]:
sns.boxplot(data=df, x='Outcome', y='Age')
plt.title('Age Distribution by Outcome')
plt.xticks(rotation=45)
plt.show()

In [None]:
sns.countplot(data=df, x='Gender', hue='Outcome')
plt.title('Outcome Distribution by Gender')
plt.show()
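
The count plot above compares raw counts, which can mislead when one gender has more rows than the other. A row-normalized cross-tabulation shows the outcome share within each gender instead. This is a minimal sketch on a made-up frame (`toy` and its values are hypothetical); on the real data, `toy` would be replaced by `df`.

```python
import pandas as pd

# Hypothetical stand-in for `df`: four female rows, two male rows.
toy = pd.DataFrame({
    "Gender": ["F", "F", "F", "F", "M", "M"],
    "Outcome": ["Discharged", "Discharged", "Readmitted",
                "Readmitted", "Discharged", "Readmitted"],
})

# normalize="index" makes each row (gender) sum to 1
outcome_share = pd.crosstab(toy["Gender"], toy["Outcome"], normalize="index")
print(outcome_share)
```

In this toy example more female patients are readmitted by count (2 vs. 1), yet the readmission share is the same for both groups, which is the distinction the normalized table makes visible.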

In [None]:
plt.figure(figsize=(12,6))
sns.countplot(data=df, x='Diagnosis', hue='Outcome')
plt.title('Outcome by Diagnosis')
plt.xticks(rotation=90)
plt.show()

## 🏨 5. Length of Stay (LOS)

In [None]:
sns.histplot(df['Length_of_Stay'], kde=True, bins=20)
plt.title('Length of Stay Distribution')
plt.xlabel('Days')
plt.show()

In [None]:
plt.figure(figsize=(12,6))
sns.boxplot(data=df, x='Diagnosis', y='Length_of_Stay')
plt.title('Length of Stay by Diagnosis')
plt.xticks(rotation=90)
plt.show()
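
Box plots with many diagnoses on the x-axis can be hard to read, so a numeric summary per diagnosis may be a useful companion. The sketch below assumes the same `Diagnosis` and `Length_of_Stay` columns and runs on a made-up frame (`toy`, with invented diagnoses and values).

```python
import pandas as pd

# Hypothetical stand-in for `df`; diagnoses and stays are invented.
toy = pd.DataFrame({
    "Diagnosis": ["Asthma", "Asthma", "Asthma", "Sepsis", "Sepsis"],
    "Length_of_Stay": [2, 3, 4, 9, 11],
})

# Per-diagnosis row count, median and mean stay, longest median first
los_summary = (
    toy.groupby("Diagnosis")["Length_of_Stay"]
       .agg(n="count", median="median", mean="mean")
       .sort_values("median", ascending=False)
)
print(los_summary)
```

Keeping the row count `n` next to the median helps flag diagnoses whose long stays rest on only a handful of patients.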

## 🔁 6. Readmission Analysis

In [None]:
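# Keep rows whose Outcome mentions "Readmitted" (case-insensitive);
# rows with a missing Outcome are excluded by na=False
readmitted = df[df['Outcome'].str.contains('Readmitted', case=False, na=False)]

# Ten diagnoses with the most readmitted patients (raw counts)
readmitted.groupby('Diagnosis').size().sort_values(ascending=False).head(10)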
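
The ranking above is based on raw counts, so common diagnoses will tend to rank high even if their readmission rate is low. One way to check is to compute the readmission rate per diagnosis using the same string match. The frame `toy`, the column name `is_readmitted`, and all values below are hypothetical; this is a sketch, not the notebook's established method.

```python
import pandas as pd

# Hypothetical stand-in for `df`; one Outcome is deliberately missing.
toy = pd.DataFrame({
    "Diagnosis": ["COPD", "COPD", "COPD", "COPD", "Fracture", "Fracture"],
    "Outcome": ["Readmitted", "Discharged", "Readmitted within 30 days",
                None, "Discharged", "Discharged"],
})

# Same match as the notebook; na=False treats a missing Outcome as "not readmitted"
is_readmitted = toy["Outcome"].str.contains("Readmitted", case=False, na=False)

# Readmitted count, total rows, and rate (mean of the boolean flag) per diagnosis
readmission_rate = (
    toy.assign(is_readmitted=is_readmitted)
       .groupby("Diagnosis")["is_readmitted"]
       .agg(readmitted="sum", total="count", rate="mean")
       .sort_values("rate", ascending=False)
)
print(readmission_rate)
```

Note that rows with a missing `Outcome` still count toward `total` here; dropping them first would be an equally defensible choice and would raise the rates.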

In [None]:
sns.countplot(data=readmitted, x='Gender')
plt.title('Gender of Readmitted Patients')
plt.show()

In [None]:
# Wider figure so the rotated diagnosis labels stay legible
plt.figure(figsize=(12,6))
sns.boxplot(data=readmitted, x='Diagnosis', y='Age')
plt.title('Age of Readmitted Patients by Diagnosis')
plt.xticks(rotation=90)
plt.show()

## ✅ Summary of Insights
- **Most common outcome**: Discharged home
- **Age trends**: Older patients are more likely to be readmitted
- **Gender differences**: Slightly more female than male patients were readmitted (by raw count)
- **Length of stay**: Varies by diagnosis; some conditions significantly increase LOS
- **Readmission risks**: Concentrated in certain diagnoses (e.g., chronic illnesses)

---

🧠 *Next steps could include modeling readmission risk using logistic regression or decision trees.*
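
As a rough illustration of the logistic regression idea, the sketch below fits scikit-learn's `LogisticRegression` on synthetic data. Everything about the data is an assumption: the frame `toy`, the `is_readmitted` label, the choice of `Age` and `Length_of_Stay` as features, and the coefficients used to generate labels are all invented, so the printed score says nothing about the real hospital data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic features (hypothetical; not drawn from hospital_data.csv)
toy = pd.DataFrame({
    "Age": rng.integers(18, 90, size=n),
    "Length_of_Stay": rng.integers(1, 15, size=n),
})

# Synthetic label: readmission odds rise with age and stay (invented relationship)
logit = 0.04 * (toy["Age"] - 55) + 0.1 * (toy["Length_of_Stay"] - 7)
toy["is_readmitted"] = rng.random(n) < 1 / (1 + np.exp(-logit))

X = toy[["Age", "Length_of_Stay"]]
y = toy["is_readmitted"].astype(int)

# Hold out a quarter of the rows, keeping the class balance similar in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# ROC AUC on the held-out rows, computed from predicted probabilities
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Held-out ROC AUC: {auc:.2f}")
```

On the real data, categorical columns such as `Gender` and `Diagnosis` would need encoding (for example with `pd.get_dummies`) before they could be used as features.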