# Predicting emergency department visits anchored on clinic dates
---
## Background
Previously, we built a model to predict emergency department (ED) visits anchored on treatment dates.

The problem with that approach is that primary physicians do not interact with their patients during treatment sessions; they meet only during clinic visits. A clinic visit is therefore the best time for the model to nudge the physician toward an intervention. We now want to build a model that estimates a patient's risk of an ED visit ahead of each clinic date rather than ahead of each treatment session.
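
To make the anchoring concrete, here is a minimal sketch of how a clinic-anchored label could be derived. It is not the pipeline's actual implementation: the function `label_ed_within`, the column names `mrn`, `clinic_date`, and `ed_date`, and the toy data are all hypothetical, and the 90-day window is only inferred from the target name `target_ED_90d` used later in this notebook.

```python
import pandas as pd


def label_ed_within(clinic: pd.DataFrame, ed: pd.DataFrame, days: int = 90) -> pd.DataFrame:
    """Flag each clinic visit that is followed by an ED visit within `days` days.

    Assumed (hypothetical) columns: clinic has `mrn`, `clinic_date`;
    ed has `mrn`, `ed_date`.
    """
    visits = clinic.reset_index(drop=True)
    visits["row"] = visits.index  # stable id for each clinic visit
    # Pair every clinic visit with every ED visit of the same patient
    pairs = visits.merge(ed, on="mrn", how="left")
    delta = pairs["ed_date"] - pairs["clinic_date"]
    # Count only ED visits strictly after the clinic date and inside the window
    hit = (delta > pd.Timedelta(0)) & (delta <= pd.Timedelta(days=days))
    visits[f"target_ED_{days}d"] = hit.groupby(pairs["row"]).any().to_numpy()
    return visits.drop(columns="row")


# Toy data, for illustration only
clinic = pd.DataFrame({
    "mrn": [1, 1, 2],
    "clinic_date": pd.to_datetime(["2025-01-01", "2025-06-01", "2025-01-01"]),
})
ed = pd.DataFrame({
    "mrn": [1, 2],
    "ed_date": pd.to_datetime(["2025-02-15", "2025-05-01"]),
})
labelled = label_ed_within(clinic, ed)
```

In this toy example only the first clinic visit is labelled positive: patient 1's ED visit falls 45 days after it, while patient 2's ED visit falls 120 days after the clinic date and so lies outside the window.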

---

In [None]:
%%capture
%cd ../../
%load_ext autoreload
%autoreload 2

In [None]:
import os
from dotenv import load_dotenv

import pandas as pd

from make_clinical_dataset.shared.constants import ROOT_DIR
from ml_common.summary import get_label_distribution
from preduce.acu.pipeline import prepare, train_and_eval
from preduce.shared.summarize import feature_summary

In [None]:
load_dotenv()

DATE = '2025-03-29'
DATA_PATH = f'{ROOT_DIR}/data/final/data_{DATE}/processed/clinic_centered_data.parquet'
SAVE_PATH = os.getenv("SAVE_PATH")

# Prepare the data

In [None]:
df = pd.read_parquet(DATA_PATH)
out = prepare(df)
feats, targs, meta = out['feats'], out['targs'], out['meta']

In [None]:
get_label_distribution(targs, meta, with_respect_to='sessions')

In [None]:
get_label_distribution(targs, meta, with_respect_to='patients')
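
The two calls above presumably differ in the unit of analysis: one row per clinic visit versus one row per unique patient. The internals of `get_label_distribution` are not shown here, so the following is only a sketch of that distinction under stated assumptions; `positive_rates`, the `mrn` identifier, and the toy data are hypothetical.

```python
import pandas as pd


def positive_rates(targs: pd.DataFrame, patient_ids: pd.Series) -> pd.DataFrame:
    """Compare the positive rate per row with the positive rate per patient."""
    # Per row: the fraction of visits with a positive label
    per_session = targs.mean()
    # Per patient: a patient counts as positive if any of their visits is positive
    per_patient = targs.groupby(patient_ids.to_numpy()).any().mean()
    return pd.DataFrame({"sessions": per_session, "patients": per_patient})


# Toy data, for illustration only: patient 1 has three visits, patient 2 has one
toy_targs = pd.DataFrame({"target_ED_90d": [True, False, False, False]})
toy_mrn = pd.Series([1, 1, 1, 2])
rates = positive_rates(toy_targs, toy_mrn)
```

Here the per-visit rate is 0.25 and the per-patient rate is 0.5, which shows why the two views can diverge when patients contribute different numbers of visits.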

In [None]:
# Feature Characteristics
feature_summary(pd.get_dummies(feats))

In [None]:
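# Cohort Characteristics
# NOTE: cohort_summary is not imported in the cells above; import it from the
# module that defines it before running this cell, or it raises a NameError.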
pd.DataFrame({'All': cohort_summary(
    pd.concat([feats, targs, meta], axis=1),
    top_cancers=feats['cancer_type'].value_counts().index[:5],
    targets=['target_ED_90d']
)})

# Train the model

In [None]:
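# Train and evaluate on the 90-day ED target, saving artifacts to SAVE_PATH
res = train_and_eval(out, targets=['target_ED_90d'], save_path=SAVE_PATH, load_model=False)
# Alternative (presumably reuses a previously saved model instead of retraining):
# res = train_and_eval(out, targets=['target_ED_90d'], save_path=SAVE_PATH, load_model=True)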

In [None]:
res['val']

In [None]:
res['test']