![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare/SOCIAL_DETERMINANT_NER.ipynb)

# **Social Determinants of Health-NER**

📌 To run this notebook yourself, you need to upload your license keys. Run the cell below to do so. Alternatively, open the file explorer on the left side of the screen and upload `license_keys.json` to the folder that opens.
Otherwise, you can look at the example outputs at the bottom of the notebook.

# **Colab Setup**

In [None]:
import json, os
from google.colab import files

if 'spark_jsl.json' not in os.listdir():
  license_keys = files.upload()
  os.rename(list(license_keys.keys())[0], 'spark_jsl.json')

with open('spark_jsl.json') as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)
os.environ.update(license_keys)
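
The setup cells read three entries from the license file: `SECRET`, `PUBLIC_VERSION` and `JSL_VERSION`. A minimal sketch of a check that reports missing entries before the install commands run is shown below. The helper name `missing_license_keys` and the sample dictionary are hypothetical and are not part of the Spark NLP libraries.

```python
import json

# Keys that the install and session-start cells in this notebook read.
REQUIRED_KEYS = ("SECRET", "PUBLIC_VERSION", "JSL_VERSION")


def missing_license_keys(license_keys, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty (hypothetical helper)."""
    return [key for key in required if not license_keys.get(key)]


# Hypothetical license content, for illustration only.
example = json.loads('{"SECRET": "xxx", "PUBLIC_VERSION": "4.3.0"}')
print(missing_license_keys(example))  # → ['JSL_VERSION']
```

Calling `missing_license_keys(license_keys)` right after loading the JSON file would surface an incomplete license before `pip install` fails with a less obvious message.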

In [None]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
! pip install -q spark-nlp-display

In [None]:
import sparknlp
import sparknlp_jsl

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml import Pipeline,PipelineModel
from pyspark.sql.types import StringType, IntegerType

import pandas as pd
pd.set_option('display.max_colwidth', 200)

import warnings
warnings.filterwarnings('ignore')

params = {"spark.driver.memory":"16G", 
          "spark.kryoserializer.buffer.max":"2000M", 
          "spark.driver.maxResultSize":"2000M"} 

spark = sparknlp_jsl.start(license_keys['SECRET'],params=params)

print("Spark NLP Version :", sparknlp.version())
print("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark

Spark NLP Version : 4.3.0
Spark NLP_JSL Version : 4.3.0


# 🔎 MODELS

### Named Entity Recognition:
> * ### *`ner_sdoh_wip`*
> * ### *`ner_sdoh_mentions`*
> * ### *`ner_sdoh_slim_wip`*
> * ### *`ner_sdoh_income_social_status_wip`*
> * ### *`ner_sdoh_demographics_wip`*
> * ### *`ner_sdoh_social_environment_wip`*

**🔎 You can find all these models and more on the [NLP Models Hub](https://nlp.johnsnowlabs.com/models?task=Named+Entity+Recognition&edition=Spark+NLP+for+Healthcare)**

## **`ner_sdoh_wip`**

In [None]:
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentenceDetector = SentenceDetector()\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")
    
word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentence", "token"])\
    .setOutputCol("embeddings")

ner = MedicalNerModel.pretrained("ner_sdoh_wip", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_converter = NerConverterInternal() \
    .setInputCols(["sentence", "token", "ner"]) \
    .setOutputCol("ner_chunk")

ner_pipeline = Pipeline(stages=[
                                documentAssembler, 
                                sentenceDetector,
                                tokenizer,
                                word_embeddings,
                                ner,
                                ner_converter])



embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_sdoh_wip download started this may take some time.
[OK!]
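
Each stage in the pipeline above consumes columns produced by earlier stages, so the order of `stages` matters. The sketch below is a library-free illustration of that wiring. The column names are copied from the cell above, while the `unresolved_inputs` helper is hypothetical and is not part of Spark NLP.

```python
# (stage name, input columns, output column), copied from the pipeline cell.
STAGES = [
    ("documentAssembler", ["text"], "document"),
    ("sentenceDetector", ["document"], "sentence"),
    ("tokenizer", ["sentence"], "token"),
    ("word_embeddings", ["sentence", "token"], "embeddings"),
    ("ner", ["sentence", "token", "embeddings"], "ner"),
    ("ner_converter", ["sentence", "token", "ner"], "ner_chunk"),
]


def unresolved_inputs(stages, source_columns=("text",)):
    """Return (stage, column) pairs whose input is not produced earlier."""
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        problems.extend((name, col) for col in inputs if col not in available)
        available.add(output)
    return problems


print(unresolved_inputs(STAGES))  # → []
```

Because only the `ner` stage changes between the sections of this notebook, the remaining stages can be reused as long as the new model keeps the same input and output column names.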


In [None]:
text_list = ["Smith is a 55 years old, divorced Mexcian American woman with financial problems. She speaks spanish. She lives in an apartment. She has been struggling with diabetes for the past 10 years and has recently been experiencing frequent hospitalizations due to uncontrolled blood sugar levels. Smith works as a cleaning assistant and does not have access to health insurance or paid sick leave. She has a son student at college. Pt with likely long-standing depression. She is aware she needs rehab. Pt reprots having her catholic faith as a means of support as well.  She has long history of etoh abuse, beginning in her teens. She reports she has been drinker for 30 years, most recently drinking beer daily. She smokes a pack of cigarettes a day. She had DUI back in April and was due to be in court this week.",
                "The patient is a 42-year-old female who presented to the healthcare institution with complaints of hypertension and hyperlipidemia. The patient reported experiencing childhood trauma related to domestic violence and has struggled with mental health issues as a result. The patient also disclosed a history of substance use, smoking, and alcohol consumption, which have contributed to her health issues. The patient is currently unemployed and facing financial difficulties, which have impacted her access to care and quality of life. Additionally, the patient reported feeling socially excluded due to her sexual orientation and has limited social support. The patient's housing situation is unstable, and she expressed concerns about being able to maintain adequate housing in the future. The patient's healthcare provider recommended an exercise regimen and dietary changes to help manage her health issues, but the patient expressed difficulty in accessing healthy food options due to limited transportation and financial resources.",
                "The patient reported a history of substance use, specifically alcohol and marijuana, which began during their college years. They also disclosed a history of childhood trauma related to emotional abuse by a family member. The patient is currently experiencing financial difficulties and is unemployed, which has caused significant stress and impacted their access to healthcare services. Additionally, the patient has been diagnosed with hypertension and hyperlipidemia, and struggles with maintaining a healthy diet due to limited access to healthy food options and a lack of social support. The patient has no current legal issues and identifies as bisexual. They report limited transportation options and reside in a geographic area with limited access to healthcare institutions. The patient speaks Spanish and has some difficulty communicating with healthcare providers due to language barriers.",
                "During a routine check-up, a patient disclosed that they had experienced childhood trauma, including  physical abuse and emotional abuse by a family member. They also reported having financial difficulties and limited access to healthcare due to their low income status. Additionally, the patient disclosed that they were a member of a minority population group and had faced discrimination and social exclusion as a result. They expressed concerns about their mental health, specifically feeling depressed and anxious, and reported using alcohol as a coping mechanism. The patient expressed interest in seeking support and resources to improve their mental health and overall well-being",
                "The patient reported experiencing symptoms of anxiety and depression, which have been affecting their quality of life. The patient disclosed that they had recently lost their job and were facing financial difficulties.The patient reported a history of childhood trauma related to violence and abuse in their household, which has contributed to their current mental health struggles. The patient's family history is significant for a first-degree relative with a history of alcohol abuse. The patient also reported a history of smoking, but had recently quit and was interested in receiving resources for smoking cessation. The patient's medical history is notable for hypertension, which is currently well-controlled with medication. The patient denied any recent substance use or sexual activity, and reported being monogamous in their relationship with their partner. The patient is an immigrant and speaks English as a second language. They reported difficulty accessing healthcare due to lack of transportation and insurance status."]

In [None]:
df = spark.createDataFrame(text_list, StringType()).toDF("text")

In [None]:
result = ner_pipeline.fit(df).transform(df)

In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, 
                                     result.ner_chunk.metadata)).alias("cols"))\
      .select(F.expr("cols['0']").alias("chunk"),
              F.expr("cols['1']['entity']").alias("ner_label")).show(30, truncate=False)

+------------------+-----------------+
|chunk             |ner_label        |
+------------------+-----------------+
|55 years old      |Age              |
|divorced          |Marital_Status   |
|Mexcian American  |Race_Ethnicity   |
|woman             |Gender           |
|financial problems|Financial_Status |
|She               |Gender           |
|spanish           |Language         |
|She               |Gender           |
|apartment         |Housing          |
|She               |Gender           |
|diabetes          |Other_Disease    |
|cleaning assistant|Employment       |
|health insurance  |Insurance_Status |
|She               |Gender           |
|son               |Family_Member    |
|student           |Education        |
|college           |Education        |
|depression        |Mental_Health    |
|She               |Gender           |
|she               |Gender           |
|rehab             |Access_To_Care   |
|her               |Gender           |
|catholic faith    |Spiri

In [None]:
from sparknlp_display import NerVisualizer

# Collect the rows once instead of re-running the job on every iteration
rows = result.collect()
visualizer = NerVisualizer()

for row in rows:
    visualizer.display(
        result = row,
        label_col = 'ner_chunk',
        document_col = 'document')
    print("\n"*2)
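
The visualizer renders each chunk in place within the document text. As a rough, library-free illustration of that idea, the sketch below wraps labelled spans in brackets. It assumes non-overlapping spans given as `(begin, end, label)` with an inclusive `end` offset; the `highlight` function is hypothetical and does not reproduce how `NerVisualizer` is implemented.

```python
def highlight(text, spans):
    """Wrap each (begin, end, label) span in brackets; end is inclusive."""
    pieces, cursor = [], 0
    for begin, end, label in sorted(spans):
        pieces.append(text[cursor:begin])                   # text before the chunk
        pieces.append(f"[{text[begin:end + 1]}|{label}]")   # the labelled chunk
        cursor = end + 1
    pieces.append(text[cursor:])                            # trailing text
    return "".join(pieces)


sample = "She speaks spanish."
print(highlight(sample, [(0, 2, "Gender"), (11, 17, "Language")]))
# → [She|Gender] speaks [spanish|Language].
```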

## **`ner_sdoh_mentions`**

In [None]:
ner = MedicalNerModel.pretrained("ner_sdoh_mentions", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_pipeline = Pipeline(stages=[
                                documentAssembler, 
                                sentenceDetector,
                                tokenizer,
                                word_embeddings,
                                ner,
                                ner_converter])



In [None]:
text_list = [
       """The patient is a pleasant, cooperative gentleman with a long standing history (20 years) diverticulitis. He is married and has 3 children. He works in a bank. He denies any alcohol or intravenous drug use. He has been smoking for many years.""",
       """Cooperative gentleman with a long standing history (20 years) diverticulitis. He has been having flares of diverticulitis. The pain is nonradiating, has no provoking factors but is alleviated with narcotics. Social History: He is history teacher. He is divorced and lives at home with his girlfriend. He does not currently and never has used tobacco or illicit drugs. Until 3 weeks ago, he was having 1-19 drinks per day. Currently he uses no alcohol at all. Family History: noncontributory, no history of colon cancers or IBD."""]

In [None]:
df = spark.createDataFrame(text_list, StringType()).toDF("text")

In [None]:
result = ner_pipeline.fit(df).transform(df)

In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, 
                                     result.ner_chunk.metadata)).alias("cols"))\
      .select(F.expr("cols['0']").alias("chunk"),
              F.expr("cols['1']['entity']").alias("ner_label")).show(30, truncate=False)

+----------------+----------------+
|chunk           |ner_label       |
+----------------+----------------+
|married         |sdoh_community  |
|children        |sdoh_community  |
|works           |sdoh_economics  |
|alcohol         |behavior_alcohol|
|intravenous drug|behavior_drug   |
|smoking         |behavior_tobacco|
|narcotics       |behavior_drug   |
|teacher         |sdoh_economics  |
|divorced        |sdoh_community  |
|home            |sdoh_environment|
|girlfriend      |sdoh_community  |
|tobacco         |behavior_tobacco|
|illicit drugs   |behavior_drug   |
|drinks          |behavior_alcohol|
|alcohol         |behavior_alcohol|
+----------------+----------------+
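
The query above pairs each chunk with the `entity` field of its metadata. The same pairing can be sketched in plain Python, which also makes it easy to count labels. The two lists below are hypothetical stand-ins for one row's `ner_chunk.result` and `ner_chunk.metadata`, filled with the first rows of the table above.

```python
from collections import Counter

# Hypothetical stand-ins for one row's ner_chunk.result and ner_chunk.metadata.
chunks = ["married", "children", "works", "alcohol"]
metadata = [
    {"entity": "sdoh_community"},
    {"entity": "sdoh_community"},
    {"entity": "sdoh_economics"},
    {"entity": "behavior_alcohol"},
]

# Pair every chunk with its label, as arrays_zip + explode does per row.
pairs = [(chunk, meta["entity"]) for chunk, meta in zip(chunks, metadata)]

# Count how often each label occurs.
label_counts = Counter(label for _, label in pairs)

print(pairs[0])                     # → ('married', 'sdoh_community')
print(label_counts.most_common(1))  # → [('sdoh_community', 2)]
```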



In [None]:
from sparknlp_display import NerVisualizer

# Collect the rows once instead of re-running the job on every iteration
rows = result.collect()
visualizer = NerVisualizer()

for row in rows:
    visualizer.display(
        result = row,
        label_col = 'ner_chunk',
        document_col = 'document')
    print("\n"*2)

## **`ner_sdoh_slim_wip`**

In [None]:
text_list = [
""" Mother states that he does smoke, there is a family hx of alcohol on both maternal and paternal sides of the family, maternal grandfather who died of alcohol related complications and paternal grandmother with alcoholism. Pts own drinking began at age 16, living in LA, had a DUI at 17yo after totaling a new car that his mother bought for him, he was married. """,
""" Husband presented as anxious , while friend took notes about pt s condition and names of providers, etc.  Husb reports both he and pt had been drinking on Saturday night, and he left her sitting up in a chair.  In the morning he found her bleeding from the mouth, and it became apparent she had overdosed, and left a suicide note.  Husband and friend report pt has hx of suicide attempts, most recently in of this year.  She also has hx of EtOH abuse, has been to detox and treatment programs several times over recent years, and resided in sober homes until recently.  Husband reports pt was a pedestrian struck by motor vehicle at 12yo , sustained head injury. He reports pt had been diagnosed bipolar disorder. """
]

In [None]:
df = spark.createDataFrame(text_list, StringType()).toDF("text")

In [None]:
ner = MedicalNerModel.pretrained("ner_sdoh_slim_wip", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_pipeline = Pipeline(stages=[
                                documentAssembler, 
                                sentenceDetector,
                                tokenizer,
                                word_embeddings,
                                ner,
                                ner_converter])



ner_sdoh_slim_wip download started this may take some time.
[OK!]


In [None]:
result = ner_pipeline.fit(df).transform(df)

In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, 
                                     result.ner_chunk.metadata)).alias("cols"))\
      .select(F.expr("cols['0']").alias("chunk"),
              F.expr("cols['1']['entity']").alias("ner_label")).show(30, truncate=False)

+-----------+-----------------+
|chunk      |ner_label        |
+-----------+-----------------+
|Mother     |Family_Member    |
|he         |Gender           |
|smoke      |Smoking          |
|alcohol    |Alcohol          |
|maternal   |Family_Member    |
|paternal   |Family_Member    |
|maternal   |Family_Member    |
|grandfather|Family_Member    |
|alcohol    |Alcohol          |
|paternal   |Family_Member    |
|grandmother|Family_Member    |
|alcoholism |Alcohol          |
|drinking   |Alcohol          |
|age 16     |Age              |
|LA         |Geographic_Entity|
|17yo       |Age              |
|his        |Gender           |
|mother     |Family_Member    |
|him        |Gender           |
|he         |Gender           |
|married    |Marital_Status   |
|Husband    |Family_Member    |
|anxious    |Mental_Health    |
|providers  |Employment       |
|Husb       |Family_Member    |
|he         |Gender           |
|drinking   |Alcohol          |
|he         |Gender           |
|her    

In [None]:
from sparknlp_display import NerVisualizer

# Collect the rows once instead of re-running the job on every iteration
rows = result.collect()
visualizer = NerVisualizer()

for row in rows:
    visualizer.display(
        result = row,
        label_col = 'ner_chunk',
        document_col = 'document')
    print("\n"*2)

## **`ner_sdoh_income_social_status_wip`**

In [None]:
text_list = ["Mr. Chen is a 35-year-old immigrant who presents to the emergency department with complaints of abdominal pain and nausea. He reports a history of gastritis and diverticulitis, which have been managed with medication in the past. However, he recently lost his job as a plumber and has been experiencing financial difficulties, which have made it difficult for him to afford his medication and maintain his health.During his visit, Mr. Chen disclosed that he has been divorced for several years and has been struggling to support himself while pursuing a college degree. He reports that the stress of his financial situation and educational demands has taken a toll on his mental health, and he has been experiencing anxiety and depression.The healthcare team conducted a comprehensive assessment of Mr. Chen's social determinants of health and identified several potential barriers to his healthcare access and management of his chronic conditions. They found that his financial difficulties and lack of stable employment have made it difficult for him to afford and access healthcare services.",
                "The patient reported experiencing significant financial difficulties, which have been linked to increased risk for mental health issues such as anxiety and depression. Additionally, the patient disclosed that they were divorced and working as a plumber. The patient expressed concern about being able to provide for their children as a single parent with limited income. As an immigrant and college student, the patient faces additional challenges in terms of finding stable employment and accessing resources for financial assistance",
                "The patient is a 50-year-old female who identifies as African American and primarily speaks Spanish. She comes from a low-income family and resides in a densely populated urban area. She reports feeling socially isolated due to language barriers and struggles to find employment despite having a college degree. Additionally, the patient reports experiencing childhood trauma related to her parent's divorce and subsequent financial difficulties. She notes that her spiritual beliefs and family support have been instrumental in her coping with these stressors. The patient is currently uninsured and expresses concerns about accessing affordable healthcare. She denies any current substance use or smoking history. The patient is alert and oriented with no acute distress on examination.",
                "Mrs. Smith, a 45-year-old immigrant woman, presented to the healthcare institution with symptoms of hypertension and hyperlipidemia. She reported experiencing childhood trauma and emotional abuse from her primary caregiver. Her financial status is precarious, and she struggles to make ends meet as a divorced single mother of two young children. She works as a plumber and does not have health insurance. Mrs. Smith lives in a crowded apartment complex in a high-crime neighborhood, which has led to social exclusion and a lack of social support. She reports smoking and drinking alcohol frequently as a way of coping with stress, and also struggles with obesity and a poor diet due to limited access to healthy foods in her neighborhood. Mrs. Smith's healthcare provider discussed the importance of regular exercise and a healthy diet, as well as the potential benefits of therapy to address her childhood trauma and emotional distress. The provider also referred Mrs. Smith to resources for financial assistance, transportation, and language services, as well as programs for substance abuse and smoking cessation.",
                "A 45-year-old divorced male with financial difficulties presented to the healthcare institution complaining of hypertension and hyperlipidemia. He reported experiencing childhood trauma, including emotional abuse from his primary caregiver. The patient is an immigrant and speaks English as a second language. He works as a plumber and has a history of smoking and alcohol use. He reported experiencing social exclusion due to his race and ethnic background. The patient's quality of life has been impacted by his chronic conditions and financial stress."]

In [None]:
df = spark.createDataFrame(text_list, StringType()).toDF("text")

In [None]:
ner = MedicalNerModel.pretrained("ner_sdoh_income_social_status_wip", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_pipeline = Pipeline(stages=[
                                documentAssembler, 
                                sentenceDetector,
                                tokenizer,
                                word_embeddings,
                                ner,
                                ner_converter])


result = ner_pipeline.fit(df).transform(df)

ner_sdoh_income_social_status_wip download started this may take some time.
[OK!]


In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, 
                                     result.ner_chunk.metadata)).alias("cols"))\
      .select(F.expr("cols['0']").alias("chunk"),
              F.expr("cols['1']['entity']").alias("ner_label")).show(30, truncate=False)

+----------------------------------------------+----------------+
|chunk                                         |ner_label       |
+----------------------------------------------+----------------+
|immigrant                                     |Population_Group|
|plumber                                       |Employment      |
|financial difficulties                        |Financial_Status|
|divorced                                      |Marital_Status  |
|college degree                                |Education       |
|financial situation                           |Financial_Status|
|financial difficulties                        |Financial_Status|
|stable employment                             |Employment      |
|financial difficulties                        |Financial_Status|
|divorced                                      |Marital_Status  |
|plumber                                       |Employment      |
|single                                        |Marital_Status  |
|limited i

In [None]:
from sparknlp_display import NerVisualizer

# Collect the rows once instead of re-running the job on every iteration
rows = result.collect()
visualizer = NerVisualizer()

for row in rows:
    visualizer.display(
        result = row,
        label_col = 'ner_chunk',
        document_col = 'document')
    print("\n"*2)

## **`ner_sdoh_demographics_wip`**

In [None]:
text_list = ["During a medical evaluation, a healthcare provider asked a 40 year old Hispanic woman about her spiritual beliefs and language preference. The patient indicated that she is Catholic and finds comfort in her faith during times of stress and illness, and she primarily speaks English but also speaks Spanish at home with her family. Understanding these aspects of the patient's background and culture can help the provider deliver culturally sensitive and patient-centered care.",
                "A 61 year old Caucasian man was admitted to a hospital in Korea with respiratory distress. He was accompanied by his adult children, who expressed concern about their father's condition. The patient, a devout Catholic, spoke English as his primary Language.",
                "A pathology report for a patient with breast cancer indicated that she was a 45 year old African American woman with a family history of breast cancer. When the healthcare provider discussed the patient's medical history with her, she mentioned that her mother and aunt had both been diagnosed with breast cancer in their 50s. The provider also inquired about the patient's spiritual beliefs and language preference. The patient shared that she was raised Catholic but no longer practices the faith, and that English is her primary language. The patient expressed concern about the cost of treatment, as she lived in a low-income neighborhood and struggled to afford healthcare.",
                "A radiology report for a patient with suspected lung cancer indicated that he was a 60 year old male who had worked in a factory for over 30 years. The patient is Caucasian and has a history of smoking. The healthcare provider also inquired about the patient's family history of cancer, and the patient indicated that his father had died from lung cancer. The provider recommended that the patient speak with his family members about their medical histories. Additionally, the provider asked about the patient's language preference, and the patient indicated that he primarily speaks English but also speaks Spanish with his wife, who is of Hispanic descent.",
                "A 38 year old woman from a Hispanic background presented to the emergency department with chest pain and shortness of breath. She spoke English as her primary Language and identified as Catholic. The patient reported no significant past medical history or medication use."]

In [None]:
df = spark.createDataFrame(text_list, StringType()).toDF("text")

In [None]:
ner = MedicalNerModel.pretrained("ner_sdoh_demographics_wip", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_pipeline = Pipeline(stages=[
                                documentAssembler, 
                                sentenceDetector,
                                tokenizer,
                                word_embeddings,
                                ner,
                                ner_converter])


result = ner_pipeline.fit(df).transform(df)

ner_sdoh_demographics_wip download started this may take some time.
[OK!]


In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, 
                                     result.ner_chunk.metadata)).alias("cols"))\
      .select(F.expr("cols['0']").alias("chunk"),
              F.expr("cols['1']['entity']").alias("ner_label")).show(30, truncate=False)

+-----------------+-----------------+
|chunk            |ner_label        |
+-----------------+-----------------+
|40 year old      |Age              |
|Hispanic         |Race_Ethnicity   |
|woman            |Gender           |
|her              |Gender           |
|spiritual beliefs|Spiritual_Beliefs|
|she              |Gender           |
|Catholic         |Spiritual_Beliefs|
|her              |Gender           |
|faith            |Spiritual_Beliefs|
|she              |Gender           |
|English          |Language         |
|Spanish          |Language         |
|her              |Gender           |
|61 year old      |Age              |
|Caucasian        |Race_Ethnicity   |
|man              |Gender           |
|Korea            |Geographic_Entity|
|He               |Gender           |
|his              |Gender           |
|adult            |Age              |
|children         |Family_Member    |
|father's         |Family_Member    |
|Catholic         |Spiritual_Beliefs|
|English    

In [None]:
from sparknlp_display import NerVisualizer

# Collect the rows once instead of re-running the job on every iteration
rows = result.collect()
visualizer = NerVisualizer()

for row in rows:
    visualizer.display(
        result = row,
        label_col = 'ner_chunk',
        document_col = 'document')
    print("\n"*2)

## **`ner_sdoh_social_environment_wip`**

In [None]:
text_list = ["Medical history: Jane was born in a low - income household and experienced significant trauma during her childhood, including emotional abuse. Events such as social exclusion, violence, and abuse can have significant and lasting impacts on a person's physical and mental health. Additionally, patients who have experienced childhood trauma may struggle to access or maintain social support systems, which can exacerbate the negative effects of these events.",
             "During the patient's intake interview, it was noted that she had experienced childhood trauma, which can have long-term effects on a person's mental and physical health. The patient also disclosed that she is currently in an abusive relationship and that her partner is her primary caregiver. Additionally, the patient reported feeling ostracized from her community and the provider referred her to resources for social inclusion. Finally, the provider screened the patient for a history of incarceration and domestic violence, both of which can impact a patient's health, and made appropriate referrals.",
             "Mrs. Smith is a 60-year-old woman who presents with symptoms of anxiety and depression. During her intake interview, she disclosed a history of childhood trauma, including experiences of social exclusion from her primary caregiver. Mrs. Smith reported feeling unsupported by her family and community during this difficult time, leading to long-term psychological distress. Additionally, she shared that she had recently experienced physical violence at the hands of her partner and was concerned for her safety.The healthcare team worked to address these social determinants of health by connecting Mrs. Smith with resources for domestic violence support. They also provided education on the importance of social support networks and connected Mrs. Smith with local community organizations that offer support groups for survivors of domestic violence. By addressing these social factors and providing targeted support, the healthcare team was able to help Mrs. Smith improve her mental health and safety.",
             "Individuals who have been incarcerated, as the patient has, face unique health challenges related to social exclusion and a lack of social support systems. These can include higher rates of infectious disease, mental health disorders, and chronic medical conditions. It is essential that healthcare providers offer comprehensive care to address the physical and psychological impacts of incarceration, including any history of violence or abuse.",
             "Mr. Johnson is a 35-year-old man.During his evaluation, he disclosed a history of childhood trauma, including experiences of emotional abuse from his primary caregiver. Mr. Johnson reported feeling unsupported.He also shared that he had been imprisoned for a non-violent offense and was struggling with the social exclusion and stigma that often come with a criminal record.The healthcare team worked to address these social determinants of health by connecting Mr. Johnson with resources for trauma-informed therapy and counseling for re-entry into society after incarceration. They provided education on the importance of social support networks and connected Mr. Johnson with local organizations that offer support groups."]

In [None]:
df = spark.createDataFrame(text_list, StringType()).toDF("text")

In [None]:
ner = MedicalNerModel.pretrained("ner_sdoh_social_environment_wip", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

ner_pipeline = Pipeline(stages=[
                                documentAssembler, 
                                sentenceDetector,
                                tokenizer,
                                word_embeddings,
                                ner,
                                ner_converter])


result = ner_pipeline.fit(df).transform(df)

ner_sdoh_social_environment_wip download started this may take some time.
[OK!]


In [None]:
result.select(F.explode(F.arrays_zip(result.ner_chunk.result, 
                                     result.ner_chunk.metadata)).alias("cols"))\
      .select(F.expr("cols['0']").alias("chunk"),
              F.expr("cols['1']['entity']").alias("ner_label")).show(30, truncate=False)

+---------------------------+--------------------+
|chunk                      |ner_label           |
+---------------------------+--------------------+
|trauma during her childhood|Chidhood_Event      |
|emotional abuse            |Violence_Abuse_Legal|
|social exclusion           |Social_Exclusion    |
|violence                   |Violence_Abuse_Legal|
|abuse                      |Violence_Abuse_Legal|
|childhood trauma           |Chidhood_Event      |
|social support             |Social_Support      |
|childhood trauma           |Chidhood_Event      |
|abusive                    |Violence_Abuse_Legal|
|primary caregiver          |Social_Support      |
|ostracized                 |Social_Exclusion    |
|social inclusion           |Social_Exclusion    |
|incarceration              |Violence_Abuse_Legal|
|domestic violence          |Violence_Abuse_Legal|
|childhood trauma           |Chidhood_Event      |
|social exclusion           |Social_Exclusion    |
|caregiver                  |So

In [None]:
from sparknlp_display import NerVisualizer

# Collect the rows once instead of re-running the job on every iteration
rows = result.collect()
visualizer = NerVisualizer()

for row in rows:
    visualizer.display(
        result = row,
        label_col = 'ner_chunk',
        document_col = 'document')
    print("\n"*2)

