# WellOps  
## Deep Learning Model (LSTM) for Burnout Risk Prediction


This notebook explores a deep learning approach to model burnout as a
temporal phenomenon using Long Short-Term Memory (LSTM) networks.

The goal is to capture workload patterns across time and compare the
performance with classical machine learning models.


In [1]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score


In [2]:
np.random.seed(42)
torch.manual_seed(42)

n_employees = 200
n_weeks = 24
sequence_length = 8

records = []

for emp_id in range(n_employees):
    role = np.random.choice(["Engineer", "Analyst", "Manager"])
    base_hours = np.random.normal(40, 5)

    for week in range(n_weeks):
        weekly_hours = max(30, base_hours + np.random.normal(0, 6))
        tasks_assigned = max(1, int(np.random.normal(10, 3)))
        overtime_hours = max(0, weekly_hours - 40)
        task_switches = max(1, int(np.random.normal(6, 2)))
        stress_indicator = np.clip(np.random.normal(0.5, 0.15), 0, 1)

        burnout_score = (
            0.35 * (overtime_hours / (weekly_hours + 1e-6)) +
            0.25 * stress_indicator +
            0.20 * (weekly_hours / (tasks_assigned + 1e-6)) +
            0.20 * (task_switches / (tasks_assigned + 1e-6))
        )

        records.append([
            emp_id, week,
            weekly_hours, tasks_assigned,
            overtime_hours, task_switches,
            stress_indicator, burnout_score
        ])

df = pd.DataFrame(records, columns=[
    "employee_id", "week_id",
    "weekly_hours", "tasks_assigned",
    "overtime_hours", "task_switches",
    "stress_indicator", "burnout_score"
])
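
To make the weighting in `burnout_score` easier to read, the sketch below evaluates the same four terms for one hypothetical week (45 hours, 10 tasks, 6 task switches, stress 0.5). The helper name `burnout_components` and the sample values are illustrative assumptions, not part of the notebook.

```python
def burnout_components(weekly_hours, tasks_assigned, task_switches, stress_indicator):
    """Return the four weighted terms of the synthetic burnout score (illustrative helper)."""
    overtime_hours = max(0, weekly_hours - 40)
    eps = 1e-6  # same guard against division by zero as in the generator above
    return {
        "overtime_share": 0.35 * (overtime_hours / (weekly_hours + eps)),
        "stress": 0.25 * stress_indicator,
        "hours_per_task": 0.20 * (weekly_hours / (tasks_assigned + eps)),
        "switches_per_task": 0.20 * (task_switches / (tasks_assigned + eps)),
    }

# Hypothetical sample week, chosen only for illustration
parts = burnout_components(weekly_hours=45, tasks_assigned=10,
                           task_switches=6, stress_indicator=0.5)
score = sum(parts.values())

for name, value in parts.items():
    print(f"{name:>18}: {value:.3f}")
print(f"{'total':>18}: {score:.3f}")
```

For this sample week the hours-per-task term contributes about 0.9 of a total near 1.18, so the score is not confined to [0, 1] and is driven mostly by the ratio of hours to tasks.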


In [3]:
features = [
    "weekly_hours",
    "tasks_assigned",
    "overtime_hours",
    "task_switches",
    "stress_indicator"
]

scaler = StandardScaler()
df[features] = scaler.fit_transform(df[features])

X_sequences = []
y_sequences = []

for emp_id in df["employee_id"].unique():
    emp_data = df[df["employee_id"] == emp_id]
    emp_data = emp_data.sort_values("week_id")

    for i in range(len(emp_data) - sequence_length):
        X_sequences.append(
            emp_data[features].iloc[i:i+sequence_length].values
        )
        y_sequences.append(
            emp_data["burnout_score"].iloc[i+sequence_length]
        )

X = np.array(X_sequences)
y = np.array(y_sequences)
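
The windowing loop above pairs each block of `sequence_length` consecutive weeks with the burnout score of the week that follows it. A minimal vectorized sketch of the same pairing for a single employee, using NumPy's `sliding_window_view`, is shown below; the helper `make_windows` and the toy arrays are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(feature_matrix, targets, sequence_length):
    """Pair each window of `sequence_length` weeks with the NEXT week's target."""
    # sliding_window_view appends the window axis last: (n_windows, n_features, L)
    windows = sliding_window_view(feature_matrix, sequence_length, axis=0)
    windows = windows.transpose(0, 2, 1)  # -> (n_windows, L, n_features)
    # The final window has no following week, so it is dropped.
    return windows[:-1], targets[sequence_length:]

# Toy data for one employee: 24 weeks, 5 features (values are placeholders)
n_weeks, n_features, sequence_length = 24, 5, 8
toy_features = np.arange(n_weeks * n_features, dtype=float).reshape(n_weeks, n_features)
toy_targets = np.arange(n_weeks, dtype=float)

X_emp, y_emp = make_windows(toy_features, toy_targets, sequence_length)
print(X_emp.shape, y_emp.shape)  # (16, 8, 5) (16,)
```

Note that the target week's own features never appear in its input window: window `i` covers weeks `i` to `i + 7`, while the label comes from week `i + 8`.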


In [4]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)
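
Because consecutive windows from the same employee overlap heavily, a random split over windows places near-duplicate sequences on both sides of the split. One possible alternative is to split by employee, sketched below with NumPy only. It assumes a hypothetical `groups` array holding the employee id of each window, which the notebook does not currently build.

```python
import numpy as np

def split_by_employee(groups, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so that no employee appears in both sets."""
    rng = np.random.default_rng(seed)
    unique_ids = np.unique(groups)
    rng.shuffle(unique_ids)  # shuffle employees, not individual windows
    n_test = max(1, int(round(test_fraction * len(unique_ids))))
    test_mask = np.isin(groups, unique_ids[:n_test])
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Assumption: 200 employees with 16 windows each, matching the setup above
groups = np.repeat(np.arange(200), 16)
train_idx, test_idx = split_by_employee(groups)
print(len(train_idx), len(test_idx))  # 2560 640
```

The resulting indices could then select rows of `X` and `y` before converting them to tensors. scikit-learn's `GroupShuffleSplit` offers similar behavior. For a fully leakage-free evaluation, the `StandardScaler` would also need to be fitted on the training employees only.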


In [5]:
class BurnoutDataset(Dataset):
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]


In [6]:
train_loader = DataLoader(
    BurnoutDataset(X_train, y_train),
    batch_size=32,
    shuffle=True
)

test_loader = DataLoader(
    BurnoutDataset(X_test, y_test),
    batch_size=32
)


In [7]:
class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
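        # h_n has shape (num_layers, batch, hidden_size); use the last layer's final state
        _, (h_n, _) = self.lstm(x)
        out = self.fc(h_n[-1])
        # Squeeze only the output dimension so a batch of size 1 keeps shape (1,)
        return out.squeeze(-1)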


In [8]:
model = LSTMModel(input_size=X_train.shape[2])
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)




In [9]:
epochs = 10

for epoch in range(epochs):
    model.train()
    epoch_loss = 0

    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        preds = model(X_batch)
        loss = criterion(preds, y_batch)
        loss.backward()
        optimizer.step()
        # Each loss.item() is a batch-mean MSE; the printed value is their sum over batches
        epoch_loss += loss.item()

    print(f"Epoch {epoch+1}/{epochs} - Loss: {epoch_loss:.4f}")


Epoch 1/10 - Loss: 63.9529
Epoch 2/10 - Loss: 38.8713
Epoch 3/10 - Loss: 38.7798
Epoch 4/10 - Loss: 38.4556
Epoch 5/10 - Loss: 38.5305
Epoch 6/10 - Loss: 38.2958
Epoch 7/10 - Loss: 38.3368
Epoch 8/10 - Loss: 38.3772
Epoch 9/10 - Loss: 38.4789
Epoch 10/10 - Loss: 38.1864


In [10]:
model.eval()
with torch.no_grad():
    y_pred = model(X_test).numpy()

mae = mean_absolute_error(y_test.numpy(), y_pred)
r2 = r2_score(y_test.numpy(), y_pred)

mae, r2


(0.43608736991882324, 0.01622217893600464)
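
To put the R² of roughly 0.016 in context: R² compares the model's squared error with that of a constant predictor that always outputs the mean of the targets, so a value near zero means the model is barely better than that constant. The sketch below illustrates the reference point on synthetic stand-in targets (not the notebook's `y_test`); the helper `r2` is written out by hand for clarity.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in targets, used only to illustrate the baseline
rng = np.random.default_rng(0)
y_true = rng.normal(1.0, 0.5, size=640)

# Constant predictor: always output the mean of the targets
mean_pred = np.full_like(y_true, y_true.mean())
baseline_r2 = r2(y_true, mean_pred)
baseline_mae = np.mean(np.abs(y_true - mean_pred))

print(f"mean-baseline R2:  {baseline_r2:.3f}")
print(f"mean-baseline MAE: {baseline_mae:.3f}")
```

Running the same comparison on the real `y_test` would show how much of the reported MAE a constant prediction already achieves.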

### Deep Learning Interpretation

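The LSTM model is trained on eight-week windows of workload features to
predict the following week's burnout score. On the held-out split it reaches
an MAE of about 0.44 and an R² of about 0.02, so it explains almost none of
the variance in the target.

The classical model therefore remains the primary scoring engine, both for
interpretability and for accuracy. The LSTM is kept as an experimental
component for studying temporal burnout dynamics in future versions.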


### Performance Analysis

The LSTM model shows limited predictive performance (R² ≈ 0.016) compared to
the classical machine learning baseline.

This outcome is expected. Each week's burnout score is computed only from that
week's workload features, and those features are sampled independently from
week to week around a per-employee baseline of hours. The preceding eight
weeks therefore carry little information about the target week beyond that
baseline, and sequence-based learning provides limited additional signal.

The experiment suggests that classical models are better suited to the
current burnout scoring formulation, while deep learning may become valuable
in future extensions that model cumulative burnout dynamics.
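
As one hypothetical illustration of what "cumulative burnout dynamics" could mean, the sketch below turns weekly scores into an exponentially weighted running score, so that each week's target depends on the weeks before it. The function name, the smoothing factor `alpha`, and the choice of exponential weighting are all assumptions, not part of the current WellOps formulation.

```python
import numpy as np

def cumulative_burnout(weekly_scores, alpha=0.3):
    """Exponentially weighted running score: recent weeks weigh more, older weeks decay."""
    cumulative = np.empty_like(weekly_scores, dtype=float)
    cumulative[0] = weekly_scores[0]
    for t in range(1, len(weekly_scores)):
        # Blend this week's score with the accumulated history
        cumulative[t] = alpha * weekly_scores[t] + (1 - alpha) * cumulative[t - 1]
    return cumulative

# Toy example: a single spike in week 3 keeps influencing later weeks
weekly = np.array([1.0, 1.0, 3.0, 1.0, 1.0])
smoothed = cumulative_burnout(weekly)
print(np.round(smoothed, 3))
```

With a target of this kind, the history in each input window would be directly relevant to the label, which is the setting where a sequence model has something to learn.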


### Summary

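- Burnout was modeled as a time-dependent sequence problem.
- The LSTM learned little temporal signal (MAE ≈ 0.44, R² ≈ 0.02), because the synthetic score depends mainly on same-week features.
- Deep learning serves as a complementary, experimental layer to classical ML.

Comparing the two approaches clarifies where sequence models can add value in WellOps and where the classical model is sufficient.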
