# Encounter Stitching Demo

This notebook demonstrates how to use the encounter stitching functionality in CLIFpy to link related hospital encounters that occur within a specified time window.

## 1. Setup and Imports

In [1]:
import sys
from pathlib import Path
import pandas as pd
import numpy as np
from clifpy.clif_orchestrator import ClifOrchestrator

In [2]:
def find_project_root(start=None):
    p = Path(start or Path.cwd())
    for d in [p, *p.parents]:
        if (d / "pyproject.toml").exists() or (d / "clifpy").is_dir():
            return d
    return p

project_root = find_project_root()
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))
DATA_DIR = (project_root / "clifpy" / "data" / "clif_demo").resolve()
OUTPUT_DIR = (project_root / "examples" / "output").resolve()
FILETYPE = "parquet"
TIMEZONE = "US/Eastern"

print(f"Data directory: {DATA_DIR}")
print(f"Output directory: {OUTPUT_DIR}")

Data directory: /Users/kavenchhikara/Desktop/CLIF/CLIFpy/clifpy/data/clif_demo
Output directory: /Users/kavenchhikara/Desktop/CLIF/CLIFpy/examples/output


## 2. Initialize ClifOrchestrator with Encounter Stitching

Enable encounter stitching by passing `stitch_encounter=True` when creating the orchestrator. The `stitch_time_interval` argument sets the linking window in hours.

In [3]:
# Initialize orchestrator with encounter stitching enabled
clif = ClifOrchestrator(
    data_directory=str(DATA_DIR),
    filetype=FILETYPE,
    timezone=TIMEZONE,
    output_directory=str(OUTPUT_DIR),
    stitch_encounter=True,  # Enable encounter stitching
    stitch_time_interval=6  # 6-hour window (default)
)

Using directly provided parameters
ClifOrchestrator initialized.


## 3. Load Required Tables

Encounter stitching requires both the hospitalization and ADT tables. Stitching runs automatically inside the `clif.initialize(...)` call once both tables are loaded.

In [4]:
# Load the required tables - stitching happens automatically
clif.initialize(['hospitalization', 'adt'])

Using directly provided parameters
Loading clif_hospitalization.parquet
Data loaded successfully from clif_hospitalization.parquet
admission_dttm: null count before conversion= 0
admission_dttm: Converted from UTC to your timezone (US/Eastern).
admission_dttm: null count after conversion= 0
discharge_dttm: null count before conversion= 0
discharge_dttm: Converted from UTC to your timezone (US/Eastern).
discharge_dttm: null count after conversion= 0
Using directly provided parameters
Loading clif_adt.parquet
Data loaded successfully from clif_adt.parquet
in_dttm: null count before conversion= 0
in_dttm: Converted from UTC to your timezone (US/Eastern).
in_dttm: null count after conversion= 0
out_dttm: null count before conversion= 275
out_dttm: Converted from UTC to your timezone (US/Eastern).
out_dttm: null count after conversion= 275
Performing encounter stitching with time interval of 6 hours...
Encounter stitching completed successfully.
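To make the idea of a stitching window concrete, here is a minimal sketch of one way such linking could work. It is not CLIFpy's implementation: the helper `sketch_stitch`, the toy data, and the rule "start a new block when the gap since the same patient's previous discharge exceeds the window" are all assumptions made for illustration, and the sketch ignores the ADT table entirely.

```python
import pandas as pd

def sketch_stitch(hosp: pd.DataFrame, time_interval: int = 6) -> pd.DataFrame:
    """Illustrative only: assign an encounter_block to each hospitalization."""
    df = hosp.sort_values(["patient_id", "admission_dttm"]).reset_index(drop=True)
    # Discharge time of the same patient's previous hospitalization (NaT for the first one).
    prev_discharge = df.groupby("patient_id")["discharge_dttm"].shift()
    gap_hours = (df["admission_dttm"] - prev_discharge).dt.total_seconds() / 3600
    # Assumed rule: a new block starts at a patient's first stay,
    # or when the gap is longer than the window.
    starts_new_block = prev_discharge.isna() | (gap_hours > time_interval)
    df["encounter_block"] = starts_new_block.cumsum()
    return df[["patient_id", "hospitalization_id", "encounter_block"]]

# Made-up data: patient A is readmitted 4 hours after the first discharge,
# then again several days later; patient B has a single stay.
toy = pd.DataFrame({
    "patient_id": ["A", "A", "A", "B"],
    "hospitalization_id": ["h1", "h2", "h3", "h4"],
    "admission_dttm": pd.to_datetime(
        ["2024-01-01 08:00", "2024-01-03 14:00", "2024-01-10 09:00", "2024-01-02 00:00"]
    ),
    "discharge_dttm": pd.to_datetime(
        ["2024-01-03 10:00", "2024-01-05 12:00", "2024-01-12 09:00", "2024-01-04 00:00"]
    ),
})

stitched = sketch_stitch(toy, time_interval=6)
print(stitched)
```

In this toy example `h1` and `h2` share a block because only 4 hours separate them, while `h3` and `h4` each get their own block. Whether the real function treats a gap exactly equal to the window as linked is not shown in this notebook.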


## 4. Examine the Results

After stitching, both tables will have a new `encounter_block` column that groups related encounters.

In [5]:
clif.hospitalization.df

Unnamed: 0,patient_id,hospitalization_id,hospitalization_joined_id,admission_dttm,discharge_dttm,age_at_admission,admission_type_name,admission_type_category,discharge_name,discharge_category,zipcode_nine_digit,zipcode_five_digit,census_block_code,census_block_group_code,census_tract,state_code,county_code,encounter_block
0,10004235,24181354,,2196-02-24 14:38:00-05:00,2196-03-04 14:02:00-05:00,47,URGENT,ed,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,36
1,10009628,25926192,,2153-09-17 17:08:00-05:00,2153-09-25 13:20:00-05:00,58,URGENT,ed,HOME HEALTH CARE,home,,,,,,,,75
2,10018081,23983182,,2134-08-18 02:02:00-05:00,2134-08-23 19:35:00-05:00,80,URGENT,ed,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,139
3,10006053,22942076,,2111-11-13 23:39:00-05:00,2111-11-15 17:20:00-05:00,52,URGENT,ed,DIED,expired,,,,,,,,61
4,10031404,21606243,,2113-08-04 18:46:00-05:00,2113-08-06 20:57:00-05:00,82,URGENT,ed,HOME,home,,,,,,,,218
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
270,10004457,28108313,,2147-12-19 00:00:00-05:00,2147-12-21 16:10:00-05:00,72,SURGICAL SAME DAY ADMISSION,elective,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,44
271,10037975,27617929,,2185-01-17 19:11:00-05:00,2185-01-22 14:25:00-05:00,60,URGENT,ed,DIED,expired,,,,,,,,245
272,10019777,27738145,,2187-02-10 18:57:00-05:00,2187-02-27 13:22:00-05:00,51,EW EMER.,ed,HOSPICE,hospice,,,,,,,,159
273,10018501,28479513,,2141-07-30 22:34:00-05:00,2141-08-05 18:06:00-05:00,83,EW EMER.,ed,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,144


In [6]:
# Access the encounter mapping
encounter_mapping = clif.get_encounter_mapping()

if encounter_mapping is not None:
    print(f"Total hospitalizations: {len(encounter_mapping)}")
    print(f"Total encounter blocks: {encounter_mapping['encounter_block'].nunique()}")
    print(f"\nEncounter mapping shape: {encounter_mapping.shape}")
    print("\nFirst few rows of encounter mapping:")
    print(encounter_mapping.head())

Total hospitalizations: 275
Total encounter blocks: 272

Encounter mapping shape: (275, 2)

First few rows of encounter mapping:
   hospitalization_id  encounter_block
0            22595853                1
3            22841357                2
6            29079034                3
12           25742920                4
15           24597018                5
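The counts above (275 hospitalizations in 272 blocks) imply that a few blocks contain more than one hospitalization. The sketch below shows one way to find them with plain pandas. It uses a small made-up mapping with the same two columns rather than the real `encounter_mapping`, so the IDs and block numbers are purely illustrative.

```python
import pandas as pd

# Toy mapping with the same columns as the encounter mapping (values are made up).
toy_mapping = pd.DataFrame({
    "hospitalization_id": ["h1", "h2", "h3", "h4", "h5"],
    "encounter_block": [1, 1, 2, 3, 3],
})

# Count hospitalizations per block and keep blocks that were actually stitched.
block_sizes = toy_mapping.groupby("encounter_block")["hospitalization_id"].size()
stitched_blocks = block_sizes[block_sizes > 1]

# Rows of the mapping that belong to a multi-hospitalization block.
linked = toy_mapping[toy_mapping["encounter_block"].isin(stitched_blocks.index)]

print(stitched_blocks)
print(linked)
```

The same three lines should apply unchanged to the real mapping by substituting `encounter_mapping` for `toy_mapping`.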


## 5. Direct Function Usage

In [7]:
# Load the hospitalization parquet file directly.
# DATA_DIR and pd come from the setup cells above.
df = pd.read_parquet(DATA_DIR / "clif_hospitalization.parquet")
print(f"Loaded hospitalization data with shape: {df.shape}")
df.head()


Loaded hospitalization data with shape: (275, 17)


Unnamed: 0,patient_id,hospitalization_id,hospitalization_joined_id,admission_dttm,discharge_dttm,age_at_admission,admission_type_name,admission_type_category,discharge_name,discharge_category,zipcode_nine_digit,zipcode_five_digit,census_block_code,census_block_group_code,census_tract,state_code,county_code
0,10004235,24181354,,2196-02-24 19:38:00+00:00,2196-03-04 19:02:00+00:00,47,URGENT,ed,SKILLED NURSING FACILITY,Skilled Nursing Facility (SNF),,,,,,,
1,10009628,25926192,,2153-09-17 22:08:00+00:00,2153-09-25 18:20:00+00:00,58,URGENT,ed,HOME HEALTH CARE,Home,,,,,,,
2,10018081,23983182,,2134-08-18 07:02:00+00:00,2134-08-24 00:35:00+00:00,80,URGENT,ed,SKILLED NURSING FACILITY,Skilled Nursing Facility (SNF),,,,,,,
3,10006053,22942076,,2111-11-14 04:39:00+00:00,2111-11-15 22:20:00+00:00,52,URGENT,ed,DIED,Expired,,,,,,,
4,10031404,21606243,,2113-08-04 23:46:00+00:00,2113-08-07 01:57:00+00:00,82,URGENT,ed,HOME,Home,,,,,,,
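The raw file stores timestamps in UTC, whereas the tables loaded through CLIFpy show them in `US/Eastern`. For example, the first admission is 19:38 UTC here and 14:38 in the converted table. The sketch below reproduces that kind of conversion with pandas alone; it illustrates the effect and makes no claim about how CLIFpy performs the conversion internally.

```python
import pandas as pd

# First admission timestamp from the raw parquet output above, parsed as UTC.
raw = pd.Series(pd.to_datetime(["2196-02-24 19:38:00+00:00"], utc=True))

# Convert to the notebook's timezone; the instant is unchanged, only the display differs.
local = raw.dt.tz_convert("US/Eastern")

print(raw.iloc[0])
print(local.iloc[0])
```

Because only the representation changes, comparing `raw` and `local` element-wise yields `True`.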


In [8]:
from clifpy import Adt, Hospitalization
from clifpy.utils.stitching_encounters import stitch_encounters

hospitalization = Hospitalization.from_file(
    data_directory=str(DATA_DIR),
    filetype=FILETYPE,
    timezone=TIMEZONE,
    output_directory=str(OUTPUT_DIR),
)
print("Hospitalization data loaded:", hospitalization.df is not None)

adt = Adt.from_file(
    data_directory=str(DATA_DIR),
    filetype=FILETYPE,
    timezone=TIMEZONE,
    output_directory=str(OUTPUT_DIR),
)
print("ADT data loaded:", adt.df is not None)



Using directly provided parameters
Loading clif_hospitalization.parquet
Data loaded successfully from clif_hospitalization.parquet
admission_dttm: null count before conversion= 0
admission_dttm: Converted from UTC to your timezone (US/Eastern).
admission_dttm: null count after conversion= 0
discharge_dttm: null count before conversion= 0
discharge_dttm: Converted from UTC to your timezone (US/Eastern).
discharge_dttm: null count after conversion= 0
Hospitalization data loaded: True
Using directly provided parameters
Loading clif_adt.parquet
Data loaded successfully from clif_adt.parquet
in_dttm: null count before conversion= 0
in_dttm: Converted from UTC to your timezone (US/Eastern).
in_dttm: null count after conversion= 0
out_dttm: null count before conversion= 275
out_dttm: Converted from UTC to your timezone (US/Eastern).
out_dttm: null count after conversion= 275
ADT data loaded: True


In [9]:
hospitalization.df.dtypes

patient_id                               string[python]
hospitalization_id                       string[python]
hospitalization_joined_id                string[python]
admission_dttm               datetime64[us, US/Eastern]
discharge_dttm               datetime64[us, US/Eastern]
age_at_admission                                  int64
admission_type_name                              object
admission_type_category                          object
discharge_name                                   object
discharge_category                               object
zipcode_nine_digit                               object
zipcode_five_digit                               object
census_block_code                                object
census_block_group_code                          object
census_tract                                     object
state_code                                       object
county_code                                      object
dtype: object

In [10]:
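# Perform stitching directly; returns the stitched hospitalization and ADT
# tables plus the hospitalization-to-block mapping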
hosp_stitched, adt_stitched, encounter_mapping = stitch_encounters(
    hospitalization=hospitalization.df,
    adt=adt.df,
    time_interval=12  # 12-hour window
)

In [11]:
hosp_stitched

Unnamed: 0,patient_id,hospitalization_id,hospitalization_joined_id,admission_dttm,discharge_dttm,age_at_admission,admission_type_name,admission_type_category,discharge_name,discharge_category,zipcode_nine_digit,zipcode_five_digit,census_block_code,census_block_group_code,census_tract,state_code,county_code,encounter_block
0,10004235,24181354,,2196-02-24 14:38:00-05:00,2196-03-04 14:02:00-05:00,47,URGENT,ed,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,36
1,10009628,25926192,,2153-09-17 17:08:00-05:00,2153-09-25 13:20:00-05:00,58,URGENT,ed,HOME HEALTH CARE,home,,,,,,,,75
2,10018081,23983182,,2134-08-18 02:02:00-05:00,2134-08-23 19:35:00-05:00,80,URGENT,ed,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,139
3,10006053,22942076,,2111-11-13 23:39:00-05:00,2111-11-15 17:20:00-05:00,52,URGENT,ed,DIED,expired,,,,,,,,61
4,10031404,21606243,,2113-08-04 18:46:00-05:00,2113-08-06 20:57:00-05:00,82,URGENT,ed,HOME,home,,,,,,,,218
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
270,10004457,28108313,,2147-12-19 00:00:00-05:00,2147-12-21 16:10:00-05:00,72,SURGICAL SAME DAY ADMISSION,elective,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,44
271,10037975,27617929,,2185-01-17 19:11:00-05:00,2185-01-22 14:25:00-05:00,60,URGENT,ed,DIED,expired,,,,,,,,245
272,10019777,27738145,,2187-02-10 18:57:00-05:00,2187-02-27 13:22:00-05:00,51,EW EMER.,ed,HOSPICE,hospice,,,,,,,,159
273,10018501,28479513,,2141-07-30 22:34:00-05:00,2141-08-05 18:06:00-05:00,83,EW EMER.,ed,SKILLED NURSING FACILITY,skilled nursing facility (snf),,,,,,,,144


In [12]:
encounter_mapping


Unnamed: 0,hospitalization_id,encounter_block
0,22595853,1
3,22841357,2
6,29079034,3
12,25742920,4
15,24597018,5
...,...,...
1111,22251969,271
1117,27876215,272
1122,27259207,273
1125,25933959,274
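Once each hospitalization carries an `encounter_block`, a common next step is to collapse the stitched table to one row per block. The sketch below does this on a small made-up table; the column names match the hospitalization table above, while the summary names (`n_hospitalizations`, `block_start`, `block_end`, `block_los_days`) are hypothetical and chosen only for this example.

```python
import pandas as pd

# Made-up stitched table: h1 and h2 belong to the same block.
toy_hosp = pd.DataFrame({
    "patient_id": ["A", "A", "B"],
    "hospitalization_id": ["h1", "h2", "h3"],
    "admission_dttm": pd.to_datetime(
        ["2024-01-01 08:00", "2024-01-03 14:00", "2024-01-02 00:00"]
    ),
    "discharge_dttm": pd.to_datetime(
        ["2024-01-03 10:00", "2024-01-05 12:00", "2024-01-04 00:00"]
    ),
    "encounter_block": [1, 1, 2],
})

# One row per block: earliest admission, latest discharge, number of stays.
block_summary = (
    toy_hosp.groupby("encounter_block")
    .agg(
        patient_id=("patient_id", "first"),
        n_hospitalizations=("hospitalization_id", "nunique"),
        block_start=("admission_dttm", "min"),
        block_end=("discharge_dttm", "max"),
    )
    .reset_index()
)

# Length of stay across the whole block, in days.
block_summary["block_los_days"] = (
    block_summary["block_end"] - block_summary["block_start"]
).dt.total_seconds() / 86400

print(block_summary)
```

The same aggregation should work on `hosp_stitched`, since it has these columns.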
