# 04a. Unadjusted Outcomes Overview (Non-ICU Admissions)

## 0. Overview

This notebook provides a descriptive overview of unadjusted (crude) in-hospital outcomes stratified by early RAAS inhibitor exposure among non-ICU adult hospital admissions.

The purpose of this step is to summarize raw outcome distributions prior to any multivariable adjustment, effect estimation, or causal modeling.

## 1. Introduction

Prior observational studies of hospitalized populations have reported associations between early RAAS inhibitor use and in-hospital outcomes.

Before conducting adjusted analyses, it is informative to examine unadjusted outcome patterns to understand the raw distribution of events, exposure prevalence, and baseline differences between groups.

## 2. Methods

### 2.1 Data Source and Study Population

- **MIMIC-IV v3.1** (BigQuery public dataset)
- Project: `mimic-iv-portfolio`

**Source Tables:**
  - `mimic-iv-portfolio.nonicu_raas.analysis_dataset`<br>
    (created by `03_build_analysis_dataset.sql`)

This retrospective observational study used data from the Medical Information Mart for Intensive Care IV (MIMIC-IV), a large, publicly available, de-identified clinical database containing detailed electronic health record data from hospital admissions at Beth Israel Deaconess Medical Center.

The study population consisted of non-ICU hospital admissions of adult patients (aged ≥ 18 years). Admissions with missing key identifiers or implausible demographic values were excluded. Each hospital admission was treated as an independent observation, so patients with multiple admissions contribute more than one record. Patient age at admission was derived from the anchor age information provided by MIMIC-IV.

Because the database is fully de-identified, institutional review board approval and informed consent were not required.

### 2.2 Exposure and Outcome Definitions

The exposure of interest was early use of renin–angiotensin–aldosterone system (RAAS) inhibitors, defined using inpatient medication prescription records indicating use of either an angiotensin-converting enzyme inhibitor (ACE inhibitor) or an angiotensin receptor blocker (ARB) during the early phase of hospitalization, as specified by pre-defined, time-restricted criteria applied upstream.

Exposure was operationalized as a binary indicator representing any early RAAS inhibitor use (`raas_any_early`).

The primary outcome was in-hospital mortality, defined using the hospital expiration flag in MIMIC-IV (`hospital_expire_flag`). Mortality during the index hospitalization was coded as 1, and survival to discharge as 0. Hospital length of stay in days (`hosp_los`) was also summarized descriptively.


### 2.3 Covariates

Baseline demographic and admission-related variables were summarized for descriptive purposes only. These included:

- Age at admission
- Sex
- Race and ethnicity
- Admission type
- Insurance category

No covariates were used for statistical adjustment in this analysis.

### 2.4 Statistical Analysis

This analysis was descriptive in nature.
Unadjusted outcome summaries were computed separately for admissions with and without early RAAS inhibitor exposure.

Specifically, we reported:

- The number of admissions in each exposure group
- The number of in-hospital deaths
- Crude in-hospital mortality rates
- Length of stay summaries by exposure group

No regression modeling, hypothesis testing, or multivariable adjustment was performed at this stage.
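The four summaries listed above can be produced with a single grouped aggregation. The following is a minimal sketch on a small synthetic DataFrame, not the study data. The column names `raas_any_early`, `hospital_expire_flag`, and `hosp_los` follow the analysis dataset; the helper name `crude_summary` and the toy values are illustrative assumptions.

```python
import pandas as pd


def crude_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Unadjusted outcome summaries by exposure group (hypothetical helper)."""
    return df.groupby("raas_any_early").agg(
        n_admissions=("hospital_expire_flag", "size"),    # rows per group
        n_deaths=("hospital_expire_flag", "sum"),         # flag is coded 0/1
        mortality_rate=("hospital_expire_flag", "mean"),  # crude proportion
        median_los_days=("hosp_los", "median"),
    )


# Synthetic toy data for illustration only (not MIMIC-IV records)
toy = pd.DataFrame(
    {
        "raas_any_early": [0, 0, 0, 0, 1, 1],
        "hospital_expire_flag": [0, 1, 0, 0, 0, 0],
        "hosp_los": [1.0, 4.0, 2.0, 3.0, 2.5, 3.5],
    }
)

summary = crude_summary(toy)
print(summary)
```

Keeping all four quantities in one table makes it easier to read the group sizes alongside the rates they produce.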

### 2.5 Interpretation Framework

All results presented in this analysis represent unadjusted comparisons and are intended to provide descriptive context only.

Differences in outcomes between exposure groups may reflect confounding by age, admission characteristics, baseline clinical severity, or other unmeasured factors. Accordingly, no causal interpretation was made based on these findings.

Adjusted analyses and absolute effect estimation were conducted in subsequent modeling steps.


## 3. Data Preparation
### 3.1 Dataset Loading and Initial Inspection

In [1]:
from google.cloud import bigquery
from google.auth import default

PROJECT_ID = "mimic-iv-portfolio"
DATASET = "nonicu_raas"

TABLE_ANALYSIS = f"{PROJECT_ID}.{DATASET}.analysis_dataset"  # Created in 03_build_analysis_dataset.sql

creds, _ = default()
client = bigquery.Client(project=PROJECT_ID, credentials=creds)

In [2]:
# Helper for read-only SELECT queries → DataFrame
def query_to_df(query):
    """
    Run a SELECT query in BigQuery and return a pandas DataFrame.
    """
    job = client.query(query)
    # Download results via the REST API instead of the BigQuery Storage API
    return job.to_dataframe(create_bqstorage_client=False)

In [3]:
# Sanity check of the analysis dataset
q = f"""
SELECT *
FROM `{TABLE_ANALYSIS}`
"""
df = query_to_df(q)

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 460786 entries, 0 to 460785
Data columns (total 24 columns):
 #   Column                Non-Null Count   Dtype         
---  ------                --------------   -----         
 0   subject_id            460786 non-null  Int64         
 1   hadm_id               460786 non-null  Int64         
 2   admittime             460786 non-null  datetime64[us]
 3   dischtime             460786 non-null  datetime64[us]
 4   deathtime             2324 non-null    datetime64[us]
 5   hospital_expire_flag  460786 non-null  Int64         
 6   admission_type        460786 non-null  object        
 7   admission_location    460785 non-null  object        
 8   discharge_location    311810 non-null  object        
 9   insurance             452862 non-null  object        
 10  language              460377 non-null  object        
 11  marital_status        454118 non-null  object        
 12  race                  460786 non-null  object        
 13 

### 3.2 Dataset Overview and Descriptive Checks

In [4]:
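# Note: a notebook cell echoes only its last expression, so only the outcome
# counts are shown below; exposure counts are tabulated in Section 3.3.
df["raas_any_early"].value_counts(dropna=False)
df["hospital_expire_flag"].value_counts(dropna=False)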

hospital_expire_flag
0    458460
1      2326
Name: count, dtype: Int64
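
The `df.info()` output in Section 3.1 lists 2,324 non-null `deathtime` values, whereas `hospital_expire_flag` equals 1 for 2,326 admissions. A cross-tabulation of the two fields is one way to locate such mismatches. The sketch below uses synthetic rows only; the helper name `flag_deathtime_crosstab` is hypothetical and not part of the original notebook.

```python
import pandas as pd


def flag_deathtime_crosstab(df: pd.DataFrame) -> pd.DataFrame:
    """Cross-tabulate the expiration flag against presence of a death timestamp."""
    has_deathtime = df["deathtime"].notna().rename("has_deathtime")
    return pd.crosstab(df["hospital_expire_flag"], has_deathtime)


# Synthetic rows for illustration only (not MIMIC-IV records)
toy = pd.DataFrame(
    {
        "hospital_expire_flag": [0, 0, 1, 1],
        "deathtime": pd.to_datetime([None, None, "2150-01-02 03:00", None]),
    }
)

xtab = flag_deathtime_crosstab(toy)
print(xtab)

# Admissions flagged as in-hospital deaths but lacking a recorded deathtime
mismatch = toy[(toy["hospital_expire_flag"] == 1) & toy["deathtime"].isna()]
print(len(mismatch), "flagged admission(s) without a deathtime")
```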

In [5]:
print("N admissions:", df.shape[0])
print(f"Overall mortality rate: {(df['hospital_expire_flag'].mean()*100):.2f} %")
print("Median age:", df["age"].median())

N admissions: 460786
Overall mortality rate: 0.50 %
Median age: 60.0
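
Section 2.1 describes the cohort as adult admissions with valid identifiers, each treated as one observation. A few explicit checks can confirm that the loaded table is consistent with that description. The sketch below runs on synthetic rows; the function name `check_cohort` is hypothetical, and the checks are one plausible reading of the stated criteria, not a reproduction of the upstream SQL logic.

```python
import pandas as pd


def check_cohort(df: pd.DataFrame) -> dict:
    """Return pass/fail flags for basic cohort assumptions (hypothetical helper)."""
    return {
        "hadm_id_unique": bool(df["hadm_id"].is_unique),
        "ids_present": bool(df[["subject_id", "hadm_id"]].notna().all().all()),
        "all_adults": bool((df["age"] >= 18).all()),
        "exposure_binary": bool(df["raas_any_early"].isin([0, 1]).all()),
        "outcome_binary": bool(df["hospital_expire_flag"].isin([0, 1]).all()),
    }


# Synthetic rows for illustration only (not MIMIC-IV records)
toy = pd.DataFrame(
    {
        "subject_id": [1, 1, 2],
        "hadm_id": [101, 102, 103],
        "age": [45, 46, 70],
        "raas_any_early": [0, 1, 0],
        "hospital_expire_flag": [0, 0, 1],
    }
)

checks = check_cohort(toy)
print(checks)

# A row with age below 18 should fail only the adult check
bad_checks = check_cohort(toy.assign(age=[17, 46, 70]))
print(bad_checks)
```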


### 3.3 Exposure Group Summary

In [6]:
expo_summary = (
    df["raas_any_early"]
    .map({1: "Early RAAS inhibitor use", 0: "No early RAAS inhibitor use"})
    .value_counts(dropna=False)
    .rename_axis("Exposure group")
    .reset_index(name="Number of admissions")
)

expo_summary

|   | Exposure group | Number of admissions |
|---|---|---|
| 0 | No early RAAS inhibitor use | 403961 |
| 1 | Early RAAS inhibitor use | 56825 |


### 3.4 Baseline Age Distribution

In [7]:
df.groupby("raas_any_early")["age"].describe()

| raas_any_early | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| 0 | 403961.0 | 56.648924 | 19.554315 | 18.0 | 41.0 | 58.0 | 72.0 | 106.0 |
| 1 | 56825.0 | 68.631236 | 14.007736 | 18.0 | 59.0 | 69.0 | 79.0 | 102.0 |
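
One common way to express a baseline difference on a unit-free scale is the standardized mean difference (SMD). It is not computed in the original notebook. The sketch below derives it from the group means and standard deviations reported above, assuming the pooled-SD form $\sqrt{(s_1^2 + s_0^2)/2}$; the function name is an illustrative assumption.

```python
import math


def standardized_mean_difference(
    mean_1: float, sd_1: float, mean_0: float, sd_0: float
) -> float:
    """SMD using the square root of the average of the two group variances."""
    pooled_sd = math.sqrt((sd_1**2 + sd_0**2) / 2)
    return (mean_1 - mean_0) / pooled_sd


# Age summaries taken from the describe() output above
smd_age = standardized_mean_difference(
    mean_1=68.631236, sd_1=14.007736,  # early RAAS inhibitor use
    mean_0=56.648924, sd_0=19.554315,  # no early RAAS inhibitor use
)
print(f"SMD for age: {smd_age:.2f}")
```

Absolute SMD values above roughly 0.1 are often treated as a sign of meaningful imbalance, although that threshold is a convention rather than a formal test.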


## 4. Results
### 4.1 Unadjusted Outcomes

In [8]:
tbl = (
    df.groupby("raas_any_early")
      .agg(
          n_admissions=("raas_any_early", "count"),  # rows are admissions, not unique patients
          n_events=("hospital_expire_flag", "sum"),
          mortality_rate=("hospital_expire_flag", "mean"),
      )
)
tbl

| raas_any_early | n_admissions | n_events | mortality_rate |
|---|---|---|---|
| 0 | 403961 | 2177 | 0.005389 |
| 1 | 56825 | 149 | 0.002622 |


**Interpretation**

Among admissions without early RAAS inhibitor exposure, the crude in-hospital mortality rate was 0.54% (2,177 deaths among 403,961 admissions), whereas among admissions with early RAAS exposure, the crude mortality rate was 0.26% (149 deaths among 56,825 admissions).

These estimates represent unadjusted group-level summaries and do not account for differences in baseline characteristics between exposure groups.

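Admissions with early RAAS inhibitor exposure were substantially older than those without (mean age 68.6 vs. 56.6 years; median 69 vs. 58 years), which highlights the importance of covariate adjustment in subsequent analyses.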
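
The percentages quoted in this interpretation follow directly from the counts in the table above. The short sketch below reproduces the arithmetic; the dictionary layout and the function name `crude_rate_pct` are illustrative choices, and only the counts come from the notebook output.

```python
# Counts taken from the unadjusted outcomes table above
groups = {
    "No early RAAS inhibitor use": {"deaths": 2177, "admissions": 403961},
    "Early RAAS inhibitor use": {"deaths": 149, "admissions": 56825},
}


def crude_rate_pct(deaths: int, admissions: int) -> float:
    """Crude in-hospital mortality as a percentage of admissions."""
    return 100 * deaths / admissions


rates = {label: round(crude_rate_pct(**c), 2) for label, c in groups.items()}
for label, rate in rates.items():
    print(f"{label}: {rate:.2f}%")

# Overall rate from the pooled counts
total_deaths = sum(c["deaths"] for c in groups.values())
total_admissions = sum(c["admissions"] for c in groups.values())
overall_rate = round(crude_rate_pct(total_deaths, total_admissions), 2)
print(f"Overall: {overall_rate:.2f}%")
```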

### 4.2 Unadjusted Length of Stay by Early RAAS Exposure

In [9]:
import numpy as np
df["expo_group"] = np.where(df["raas_any_early"] == 1, "RAAS early", "No RAAS early")

In [10]:
los_summary = (
    df
    .groupby("expo_group")["hosp_los"]
    .agg(["median", "mean"])
    .round(2)
    .reset_index()
)

los_summary

|   | expo_group | median | mean |
|---|---|---|---|
| 0 | No RAAS early | 2.33 | 3.75 |
| 1 | RAAS early | 2.75 | 3.98 |

**Interpretation**

Admissions with early RAAS inhibitor exposure had a slightly longer median hospital stay (2.75 days) compared with those without early RAAS exposure (2.33 days). Mean length of stay showed a similar pattern (3.98 vs. 3.75 days, respectively).

These crude differences reflect overall group-level patterns and may be influenced by differences in patient characteristics, illness severity, or admission context.

Length of stay was not adjusted for discharge disposition or competing risks and should therefore be interpreted descriptively.
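
In both groups the mean length of stay exceeds the median, which suggests a right-skewed distribution. Quartiles can complement the median and mean in that situation. The sketch below shows one way to compute them per group on synthetic values; the helper name `los_quartiles` is hypothetical, and the column names `expo_group` and `hosp_los` are taken from the cells above.

```python
import pandas as pd


def los_quartiles(
    df: pd.DataFrame, group_col: str = "expo_group", los_col: str = "hosp_los"
) -> pd.DataFrame:
    """First quartile, median, and third quartile of length of stay per group."""
    return (
        df.groupby(group_col)[los_col]
        .quantile([0.25, 0.5, 0.75])
        .unstack()  # quantile levels become columns
        .rename(columns={0.25: "q1", 0.5: "median", 0.75: "q3"})
    )


# Synthetic values in days for illustration only (not MIMIC-IV records)
toy = pd.DataFrame(
    {
        "expo_group": ["No RAAS early"] * 5 + ["RAAS early"] * 5,
        "hosp_los": [1.0, 2.0, 3.0, 4.0, 5.0, 2.0, 4.0, 6.0, 8.0, 10.0],
    }
)

los_q = los_quartiles(toy)
print(los_q)
```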

## 5. Discussion

This section summarizes unadjusted descriptive comparisons between admissions with and without early RAAS inhibitor exposure.

The study population comprised 460,786 hospital admissions. Early RAAS inhibitor exposure occurred in 56,825 admissions (12.3%), while 403,961 admissions (87.7%) had no early RAAS exposure. The overall in-hospital mortality rate in the cohort was approximately 0.5%, reflecting the low event rate in this non-ICU population.

Baseline characteristics differed between exposure groups. Admissions with early RAAS inhibitor exposure were older than those without (median age 69 vs. 58 years; mean 68.6 vs. 56.6 years), and admission context differed across groups. Unadjusted in-hospital mortality rates also differed between exposure categories, with lower crude mortality observed among admissions with early RAAS inhibitor exposure (0.26% vs. 0.54%).

These unadjusted differences reflect raw distributions in the data and do not account for potential confounding by age, admission type, insurance status, or other clinical factors. Because exposure was not randomly assigned and no covariate adjustment was applied in this analysis, the observed differences should not be interpreted as evidence of a causal or protective effect of early RAAS inhibitor use.

Accordingly, the primary role of this unadjusted analysis is to establish baseline descriptive context and to motivate the need for multivariable modeling. Adjusted analyses incorporating covariate control and clinically interpretable effect measures are presented in the subsequent multivariable analysis (notebook 04b).