# Paper Review

Shauna Heron (Laurentian University)

library(tidyverse)

# **Introduction and Motivation**

This review analyzes the study by Garriga et al. (2023), published in *Cell Reports Medicine*, titled *“Combining Clinical Notes with Structured Electronic Health Records Enhances the Prediction of Mental Health Crises.”* The research investigates whether combining unstructured clinical notes with structured data from electronic mental health records (EMHR) improves the prediction of mental health crises. The relevance of the @garriga2023 study is underscored by an alarming rise in mental health-related hospitalizations coinciding with significant resource and workforce challenges. Recent data show that hospitalizations for mental health conditions, particularly among youth in Ontario, have surged since the COVID-19 pandemic. Mental health crises requiring hospitalization (e.g., emotional breakdowns, substance overdoses, and suicide attempts) have increased \_\_-fold. What’s worse, many of these crises might have been prevented with early intervention. However, a lack of predictive tools has made it difficult for healthcare systems to anticipate and address these crises before they peak.

Leveraging Electronic Health Records (EHRs) to bolster clinical decision-making is not new: clinicians and researchers have long used structured data such as diagnosis codes, lab results, and medication records to inform predictive models and treatment strategies. However, due to computational constraints and …, unstructured data such as clinical notes and other free-form text and imagery have, until recently, been left largely untapped. This form of data constitutes a substantial portion of EHRs by volume and contains critical narrative information, such as clinician observations, patient-reported symptoms, and contextual details surrounding patient interactions, that structured data might lack. By incorporating this textual data into predictive models…

## Problem Statement

When a patient is admitted to acute care with a mental health crisis, a mix of information is collected. The structured component of an EHR contains discrete variables such as gender, diagnoses, and school district; continuous variables such as standardized assessment scores; and administrative measures such as number of visits, hours of service, and missed appointments. EHRs also contain temporal data such as intake, discharge, and assessment dates.
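To make the variable types above concrete, the following is a minimal sketch of what a structured extract might look like in R. Every column name and value is hypothetical and invented for illustration; none of it comes from Garriga et al. (2023) or from any real record system.

```r
# Hypothetical structured EHR extract. All column names and values are
# invented for illustration only.
ehr_structured <- data.frame(
  patient_id          = c("P001", "P002", "P003"),
  gender              = factor(c("F", "M", "F")),                # discrete
  diagnosis           = factor(c("anxiety", "depression", NA)),  # discrete, one missing
  assessment_score    = c(42.5, NA, 61.0),                       # continuous
  n_visits            = c(3L, 12L, 1L),                          # administrative count
  missed_appointments = c(0L, 4L, 0L),                           # administrative count
  intake_date         = as.Date(c("2022-01-10", "2021-06-02", "2023-03-15")),
  discharge_date      = as.Date(c("2022-02-01", NA, "2023-03-20")),
  stringsAsFactors    = FALSE
)

# Derive a length-of-stay feature from the two date columns
# (NA when no discharge date is recorded).
ehr_structured$length_of_stay_days <- as.numeric(
  ehr_structured$discharge_date - ehr_structured$intake_date
)

# Share of missing values per column: a quick view of sparseness.
missing_share <- colMeans(is.na(ehr_structured))
print(missing_share)
```

Even in this toy extract, missing values appear in discrete, continuous, and date columns alike, which hints at why structured EHR data can be awkward to model.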

While structured data is often easier to process (though it is itself notoriously unwieldy in terms of sparseness, missing values, nested features, and high multicollinearity), it does not provide the complete clinical context of a patient. Research shows that bolstering structured data with qualitative, unstructured sources can improve cohort identification as well as the accuracy of predicting hospital readmissions and suicide attempts. Furthermore, unlike cardiovascular disorders, where objective measurements such as blood pressure or plasma workups can be stored in structured EHRs, mental health assessments are highly subjective, and the bulk of important information about patient health is often recorded as clinical observations in the clinical notes. For this reason, understanding how these data predict clinical events is essential to fully leverage the breadth and depth of information contained in EHRs.

The main challenge posed by these data is the clinical reality that the information contained in any one EHR varies significantly from patient to patient. Moreover, idiosyncrasies in clinicians’ style and clinical opinion mean that the volume, quality, and availability of clinical notes are highly inconsistent. The volume of notes is also usually directly related to the severity and frequency of a patient’s mental health crises: those with more severe or recurrent episodes have more clinical notes, which makes it harder to predict crises in new patients without a long history.

The primary objective of the study was to predict mental health crises utilizing both structured and unstructured mental health data.

Unstructured data comprises a mix of sources such as clinical notes, discharge reports, or even correspondence with clients (e.g., text messages).

-   **Using an example to illustrate the problem can be particularly effective.**

To this end, the authors

## **Solution**

-   Detail the paper’s **significant contribution to the subject.**

-   Describe the proposed solution in depth.

<figure>
<img src="attachment:images/paste-1.png" alt="The five trained models and the types of data each uses as input. Struct XGB is an XGBoost model; the rest are feed-forward neural networks, with Ensemble DNN combining the results of a neural network trained on structured data only and a neural network trained on both structured and unstructured data." />
<figcaption aria-hidden="true">The five trained models and the types of data each uses as input. Struct XGB is an XGBoost model; the rest are feed-forward neural networks, with Ensemble DNN combining the results of a neural network trained on structured data only and a neural network trained on both structured and unstructured data.</figcaption>
</figure>
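The figure describes an ensemble that combines the outputs of two networks, but this review does not yet state how the combination is done. The sketch below shows one common option, a weighted mean of predicted probabilities. The combination rule, the function name `ensemble_predict`, and the probability values are all assumptions made for illustration, not the authors' method.

```r
# Hypothetical predicted crisis probabilities for five patients from two
# already-trained models (values invented for illustration).
p_struct          <- c(0.10, 0.80, 0.35, 0.60, 0.05)  # structured-only network
p_struct_unstruct <- c(0.20, 0.90, 0.15, 0.70, 0.05)  # structured + notes network

# ASSUMPTION: the ensemble is a weighted mean of the two probability vectors.
# The paper may use a different rule; this is only one simple possibility.
ensemble_predict <- function(p_a, p_b, weight_a = 0.5) {
  stopifnot(length(p_a) == length(p_b), weight_a >= 0, weight_a <= 1)
  weight_a * p_a + (1 - weight_a) * p_b
}

# Unweighted average of the two models' predictions.
p_ensemble <- ensemble_predict(p_struct, p_struct_unstruct)
print(p_ensemble)
```

Averaging keeps each combined score between the two individual predictions, so a patient flagged strongly by only one model still receives an intermediate risk score rather than being dropped.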

-   Contrast the presented solution with other existing methods, focusing on its advantages and disadvantages. Reflect on what we’ve studied in class, making connections where relevant.

## **Experimental Results**

-   **Identify the platform or medium employed (e.g., simulation, robot, or experimental setup).**

-   **Provide an in-depth analysis of the performance results, and feel free to include diagrams or data directly from the paper.**

-   **For papers focused on theory, clearly explain the fundamental theoretical concepts. Address improvements in metrics such as accuracy, efficiency, and other relevant benchmarks.**

## **Conclusions and Further Work**

-   **Summarize the paper’s key takeaways and suggest potential areas for future research or exploration.**