# Exploratory Data Analysis (EDA) for Patient Admissions Prediction

In this notebook, we perform exploratory data analysis on the patient admissions dataset. We load the processed data, check for missing values, plot the distribution of admission types, and examine correlations between features, with the goal of identifying patterns related to patient admissions.

In [1]:
# Import necessary libraries
%pip install pandas numpy matplotlib seaborn

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualisation style (set_theme is the current name; sns.set is an alias for it)
sns.set_theme(style='whitegrid')

Note: you may need to restart the kernel to use updated packages.


In [None]:
# Load the dataset
data_path = '../data/processed/admissions_data.csv'
df = pd.read_csv(data_path)

# Display the first few rows of the dataset
df.head()

In [None]:
# Check for missing values
missing_values = df.isnull().sum()
missing_values[missing_values > 0]
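
The check above reports raw counts only. As a minimal sketch, the same information can be expressed as a share of rows, which makes it easier to judge whether a column is worth imputing or dropping. The helper name `missing_summary` and the toy columns `age` and `length_of_stay` are assumptions made for illustration; they are not taken from the admissions dataset.

```python
import numpy as np
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "missing_count": counts,
        "missing_pct": (counts / len(frame) * 100).round(2),
    })
    # Keep only columns that actually have gaps
    summary = summary[summary["missing_count"] > 0]
    return summary.sort_values("missing_count", ascending=False)


# Hypothetical toy data standing in for the real admissions table
toy = pd.DataFrame({
    "admission_type": ["Emergency", None, "Elective", "Emergency"],
    "age": [34, 51, np.nan, np.nan],
    "length_of_stay": [2, 5, 3, 1],
})

print(missing_summary(toy))
```

On the real data this would be called as `missing_summary(df)`.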

In [None]:
# Visualize the distribution of admission types
plt.figure(figsize=(10, 6))
sns.countplot(data=df, x='admission_type')
plt.title('Distribution of Admission Types')
plt.xlabel('Admission Type')
plt.ylabel('Count')
plt.xticks(rotation=45)
plt.show()
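
A count plot shows relative bar heights, but exact numbers are often useful alongside it. The sketch below tabulates counts and percentage shares for a categorical column. The helper name `category_shares` and the example category values are assumptions for illustration; only the column name `admission_type` comes from the notebook.

```python
import pandas as pd


def category_shares(frame: pd.DataFrame, column: str) -> pd.DataFrame:
    """Tabulate how often each category occurs and its share of all rows."""
    # dropna=False keeps missing values visible as their own category
    counts = frame[column].value_counts(dropna=False)
    shares = (counts / counts.sum() * 100).round(1)
    return pd.DataFrame({"count": counts, "share_pct": shares})


# Hypothetical toy data; the real category labels may differ
toy = pd.DataFrame({
    "admission_type": ["Emergency", "Emergency", "Elective", "Urgent"],
})

print(category_shares(toy, "admission_type"))
```

On the real data this would be called as `category_shares(df, 'admission_type')`.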

In [None]:
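# Analyze the correlation between numeric features
plt.figure(figsize=(12, 8))
# numeric_only=True skips text columns such as admission_type,
# which would otherwise raise an error in pandas 2.0 and later
correlation_matrix = df.corr(numeric_only=True)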
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()
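
A heatmap becomes hard to read once there are many features. One possible complement, sketched here under assumptions, is to flatten the correlation matrix into a ranked list of feature pairs. The helper name `top_correlations` and the toy columns `age`, `length_of_stay`, and `num_prior_visits` are hypothetical and not taken from the admissions dataset.

```python
import numpy as np
import pandas as pd


def top_correlations(frame: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return the n feature pairs with the largest absolute Pearson correlation."""
    corr = frame.corr(numeric_only=True)
    # Keep the strict upper triangle so each pair appears once
    # and self-correlations on the diagonal are excluded
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    # Rank by magnitude while keeping the sign of each correlation
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


# Hypothetical toy data standing in for the real admissions table
toy = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "length_of_stay": [2, 4, 6, 8],
    "num_prior_visits": [1, 3, 2, 5],
    "admission_type": ["Emergency", "Elective", "Urgent", "Emergency"],
})

print(top_correlations(toy, n=3))
```

On the real data this would be called as `top_correlations(df)`. Correlation measures linear association only, so a low value does not rule out a nonlinear relationship.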

## Conclusion

In this notebook, we performed exploratory data analysis on the patient admissions dataset. We checked for missing values, visualized the distribution of admission types, and analyzed the correlation between numeric features. These findings can guide further analysis and feature engineering.