# Topic 15 – Daily Hospital Admissions Time Series (Kaggle)

**Level:** Easy  
**Goal:** Forecast daily hospital admissions using univariate time series analysis. The task suits both classical ARIMA/SARIMA models and machine-learning or deep-learning approaches.

## Dataset
- **Source:** A daily hospital admissions time series dataset from Kaggle (search for terms such as "hospital admissions" or "emergency department visits").
- **Link:** Choose a specific Kaggle dataset that fits your project focus and paste its URL into your own copy of this notebook.

## Download Instructions
1. Find a suitable daily hospital admissions dataset on Kaggle.
2. Log in to Kaggle.
3. Click "Download" for your chosen dataset.
4. Extract the archive into the `data/hospital/` folder.
5. Use the main CSV file (e.g., `hospital_admissions.csv` or `daily_admissions.csv`).
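
Because the CSV name differs from one Kaggle dataset to another, it can help to list what was actually extracted before hard-coding a path. The following is a minimal sketch, assuming the archive was extracted under `data/hospital/`; `find_csv_files` is a hypothetical helper written for this notebook, not part of any library.

```python
from pathlib import Path


def find_csv_files(data_dir="data/hospital"):
    """Return all CSV files under data_dir, sorted by path (hypothetical helper)."""
    root = Path(data_dir)
    if not root.is_dir():
        # Nothing extracted yet, or the folder name differs
        return []
    # rglob also searches sub-folders, since some archives extract into one
    return sorted(root.rglob("*.csv"))


csv_files = find_csv_files()
if csv_files:
    for path in csv_files:
        print(path)
else:
    print("No CSV files found; check the extraction folder.")
```

An empty list usually means the archive was extracted somewhere else or the folder name was changed.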


## Installation

Install required packages:


In [None]:
# Install required packages (uncomment if needed)
# !pip install pandas numpy matplotlib seaborn statsmodels scikit-learn tensorflow


## Data Loading

Load the daily hospital admissions data.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set plotting style
# The 'seaborn-v0_8-*' style names need Matplotlib 3.6+; older releases use 'seaborn-darkgrid'
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

# Load the dataset
# Adjust the file path and name based on the dataset you downloaded from Kaggle
df = pd.read_csv("data/hospital/hospital_admissions.csv")  # adjust filename

# Display basic information
print(f"Dataset shape: {df.shape}")
print(f"\nColumns: {df.columns.tolist()}")
print("\nFirst few rows:")
df.head()


In [None]:
# Convert Date column to datetime and set as index
# Adjust column names based on your dataset (e.g., 'date', 'Day', 'AdmissionDate')
df["Date"] = pd.to_datetime(df["Date"])  # adapt column name if different
df = df.set_index("Date").sort_index()

# Display basic information
print(f"Date range: {df.index.min()} to {df.index.max()}")
print(f"Number of observations: {len(df)}")
print("\nFirst few rows:")
df.head()
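
A date-indexed frame is not necessarily a regular daily series: some calendar days may be absent altogether, which matters for ARIMA/SARIMA models that expect evenly spaced observations. The sketch below shows one way to check for this. It assumes the index holds unique dates at midnight with no time-of-day component; `missing_days` is a hypothetical helper, and the small frame is synthetic illustration data, not real admissions.

```python
import pandas as pd


def missing_days(index):
    """Return calendar days between the first and last date that are absent from index."""
    full_range = pd.date_range(index.min(), index.max(), freq="D")
    return full_range.difference(index)


# Synthetic example (not real hospital data): 3 January is missing
demo = pd.DataFrame(
    {"Admissions": [10, 12, 9]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"]),
)

gaps = missing_days(demo.index)
print(f"Missing days: {list(gaps)}")

# Reindex to a regular daily frequency; absent days appear as NaN rows
demo_daily = demo.asfreq("D")
print(demo_daily)
```

On the real data, the same calls would be `missing_days(df.index)` and `df.asfreq("D")`. Note that `asfreq` raises an error if the index contains duplicate dates, so those would need to be aggregated first.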


In [None]:
# Check for missing values
print("Missing values:")
print(df.isnull().sum())

# Basic statistics
print("\nBasic statistics:")
df.describe()
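
If the check above reports missing values, one common option is time-based linear interpolation. This is only a sketch on synthetic data, and whether interpolation is appropriate depends on why the values are missing: a day with no reporting is different from a day with zero admissions.

```python
import numpy as np
import pandas as pd

# Synthetic example (not real hospital data) with one missing day
demo = pd.Series(
    [10.0, np.nan, 14.0, 15.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
    name="Admissions",
)

# method="time" weights the interpolation by the actual spacing of the timestamps
filled = demo.interpolate(method="time")
print(filled)
```

For the notebook's data, the equivalent call would be `df[col].interpolate(method="time")` on the admissions column, which requires the `DatetimeIndex` set earlier.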


In [None]:
# Plot the daily admissions over time
# Adjust column name based on your dataset (e.g., 'Admissions', 'Count', 'Value')
admissions_col = df.columns[0]  # Use first column or specify column name

plt.figure(figsize=(12, 6))
plt.plot(df.index, df[admissions_col], linewidth=1.5)
plt.title("Daily Hospital Admissions Over Time", fontsize=14, fontweight='bold')
plt.xlabel("Date", fontsize=12)
plt.ylabel("Number of Admissions", fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
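
Daily admissions often show a weekly cycle that is hard to judge from the raw line plot. A 7-day rolling mean and a mean per weekday are two quick ways to look for it. The sketch below uses synthetic data with an artificial weekday/weekend difference, purely to illustrate the calls; the pattern in a real dataset may look different or be absent.

```python
import numpy as np
import pandas as pd

# Synthetic series (not real hospital data): 100 on weekdays, 60 on weekends
idx = pd.date_range("2024-01-01", periods=28, freq="D")  # starts on a Monday
values = np.where(idx.dayofweek < 5, 100, 60)
demo = pd.Series(values, index=idx, name="Admissions")

# 7-day rolling mean smooths out the within-week variation
rolling_7d = demo.rolling(window=7).mean()

# Mean admissions per weekday (0 = Monday, 6 = Sunday)
by_weekday = demo.groupby(demo.index.dayofweek).mean()

print(rolling_7d.dropna().head())
print(by_weekday)
```

On the real data, replace `demo` with `df[admissions_col]`. A clear difference between weekdays in `by_weekday` would suggest a seasonal period of 7 when fitting a SARIMA model.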
