![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/healthcare/PUBLIC_HEALTH_MB4SC.ipynb)

# `Medical Bert For Sequence Classification` for **Public Health Models**

# **Colab Setup**

In [None]:
import json
import os

from google.colab import files

license_keys = files.upload()

with open(list(license_keys.keys())[0]) as f:
    license_keys = json.load(f)

# Defining license key-value pairs as local variables
locals().update(license_keys)

# Adding license key-value pairs to environment variables
os.environ.update(license_keys)
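
The install and session cells that follow read `PUBLIC_VERSION`, `JSL_VERSION` and `SECRET` from the uploaded license file. Below is a minimal sketch of a pre-flight check, assuming those three keys are the only ones this notebook needs. `missing_license_keys` and `REQUIRED_KEYS` are hypothetical names, not part of Spark NLP.

```python
# Hypothetical helper (not part of Spark NLP): report which of the keys
# used later in this notebook are absent from the loaded license dict.
REQUIRED_KEYS = ("SECRET", "JSL_VERSION", "PUBLIC_VERSION")

def missing_license_keys(keys, required=REQUIRED_KEYS):
    """Return the required key names that are missing or empty in `keys`."""
    return [name for name in required if not keys.get(name)]

# Made-up dictionary for illustration; real values come from the license JSON.
example = {"SECRET": "xxx", "JSL_VERSION": "0.0.0"}
print(missing_license_keys(example))  # ['PUBLIC_VERSION']
```

Running such a check on `license_keys` before the `pip install` cell can surface an incomplete license file earlier than a failed install would.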

# **Install dependencies**

In [None]:
# Installing pyspark and spark-nlp
! pip install --upgrade -q pyspark==3.1.2 spark-nlp==$PUBLIC_VERSION

# Installing Spark NLP Healthcare
! pip install --upgrade -q spark-nlp-jsl==$JSL_VERSION  --extra-index-url https://pypi.johnsnowlabs.com/$SECRET

# Installing Spark NLP Display Library for visualization
! pip install -q spark-nlp-display

# **Import dependencies into Python and start the Spark session**

In [None]:
import json
import os

import sparknlp
import sparknlp_jsl

from sparknlp.base import *
from sparknlp.util import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from sparknlp.pretrained import ResourceDownloader

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, IntegerType
from pyspark.ml import Pipeline, PipelineModel

import pandas as pd
pd.set_option('display.max_columns', None)  
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', None)

import string
import numpy as np

params = {"spark.driver.memory":"16G",
          "spark.kryoserializer.buffer.max":"2000M",
          "spark.driver.maxResultSize":"2000M"}

spark = sparknlp_jsl.start(secret = SECRET, params=params)

print ("Spark NLP Version :", sparknlp.version())
print ("Spark NLP_JSL Version :", sparknlp_jsl.version())

spark

# **General Function for MedicalBertForSequenceClassification Pipeline**





In [4]:
def run_pipeline(model, text, lang = 'en'):  
  document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

  tokenizer = Tokenizer() \
    .setInputCols("document") \
    .setOutputCol("token")

  sequenceClassifier = MedicalBertForSequenceClassification.pretrained(model, lang, "clinical/models")\
    .setInputCols(["document","token"])\
    .setOutputCol("class")

  pipeline = Pipeline(stages=[
    document_assembler, 
    tokenizer,
    sequenceClassifier
    ])

  df = spark.createDataFrame(text, StringType()).toDF("text")
  results = pipeline.fit(df).transform(df)
   
  print("\n")
  print("<----------------- MODEL NAME:","\033[1m" + model + "\033[0m"," ----------------- >")
  
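  # arrays_zip pairs each document with its predicted label and metadata map:
  # col['0'] = sentence text, col['1'] = label, col['2'] = metadata (label scores)
  res = results.select(F.explode(F.arrays_zip("document.result", "class.result","class.metadata")).alias("col"))\
               .select(F.expr("col['1']").alias("prediction"),
                       F.expr("col['2']").alias("confidence"),
                       F.expr("col['0']").alias("sentence"))

  # Show results when at least one text was classified
  if res.count() > 0:
    # The metadata map keys each label's score as "Some(<label>)"; keep only the predicted label's score
    udf_func = F.udf(lambda x, y: x["Some(" + str(y) + ")"])
    print("\n",model,"\n") 
    res.withColumn('confidence', udf_func(res.confidence, res.prediction)).show(truncate=False)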
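
In `run_pipeline`, the UDF picks the confidence of the predicted label out of the `class.metadata` map, where each score is stored under a key of the form `Some(<label>)`. Here is a plain-Python sketch of that lookup. The metadata dictionary is made up for illustration, its layout is an assumption inferred from the UDF above and may differ between Spark NLP versions, and `confidence_for` is a hypothetical name.

```python
def confidence_for(metadata, label):
    """Mirror the notebook's UDF: return the score stored under 'Some(<label>)'."""
    return metadata["Some(" + str(label) + ")"]

# Made-up metadata for illustration only; real values come from class.metadata.
metadata = {"sentence": "0", "Some(ADE)": "0.9", "Some(noADE)": "0.1"}
print(confidence_for(metadata, "ADE"))  # 0.9
```

Because `F.udf` defaults to a string return type when none is given, the `confidence` column produced by `run_pipeline` holds strings; cast it to a numeric type before sorting or filtering on it.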

# **MODELS**

## **bert_sequence_classifier_ade_augmented**

In [None]:
model = "bert_sequence_classifier_ade_augmented"

In [None]:
sample_texts = [
"""I'm so fine today. increasing zyprexa,my condisition is became so good. it has a side effect that increase my weight. i must care about it.""",
"""Actually, also loving it because it is a medicine for bipolar disorder and they named it Latuda.""",
"""Yeah,it can be caused by swelling from around a nerve from ra,but the effexor causes shaking like ur cold(shivering)""",
"""Day three of #nonsmoking - 90% of my thoughts revolve around cigs. The nicotine lozenges I have taste like cherry infused with ashtray.""",
"""I just had a look buddy, and my medication (Seroquel) does affect tolerance to the sun.""",
"""Many new physicians have been identified and added to the Buprenorphine Certified Physicians and Treatment Providers directory!""",
"""I started out with lyrica but i could no longer afford it. it made me bloated. tried cymbalta , my heart was beating wicked fast."""
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_ade_augmented download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_ade_augmented  ----------------- >

 bert_sequence_classifier_ade_augmented 

+----------+----------+-------------------------------------------------------------------------------------------------------------------------------------------+
|prediction|confidence|sentence                                                                                                                                   |
+----------+----------+-------------------------------------------------------------------------------------------------------------------------------------------+
|ADE       |0.99947673|I'm so fine today. increasing zyprexa,my condisition is became so good. it has a side effect that increase my weight. i must care about it.|
|noADE     |0.99999017|Actually, also loving it because it is a medicine for bipolar disorder and they named it L

## **bert_sequence_classifier_self_reported_age_tweet**

In [None]:
model = "bert_sequence_classifier_self_reported_age_tweet"

In [None]:
sample_texts = [
"""Who knew I would spend my Saturday mornings at 21 still watching Disney channel""",
"""My girl, Fancy, just turned 17. She’s getting up there, but she still has the energy of a puppy""",
"""You are my hero! I am 18 years old and have an 8 month old daughter! You and mckayla are so awesome! You and mckayla are such an inspiration to me! I have been watching mckayla since her first video announcing she was pregnant! I love you guys so much!""",
"""Karla, from Flushing visits the office of RepGraceMeng and stresses for a change in the age of entry requirement from 16 years old to 18 years old — "At 16 we are considered a minor and should still be protected" """,
"""Happy new year!  May you continue to be blessed and I was going to tell you on December 24 shout me out for my birthday but I party so hard and didn't realize turning 46 you have to workout before partying. Was to sore to text.""",
"""His name is Kostata age 68. He's the son of Chief Eneas, grandson of Chief Bapsiste, who signed the treaty of Hell Gate, 1855, btwn U.S. and the Allied Tribes of the Flashead Reservation.""",
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_self_reported_age_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_self_reported_age_tweet  ----------------- >

 bert_sequence_classifier_self_reported_age_tweet 

+---------------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction     |confidence|sentence                                                                                                                                                                                                                                                    |
+---------------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

## **bert_sequence_classifier_exact_age_reddit**

In [None]:
model = "bert_sequence_classifier_exact_age_reddit"

In [None]:
sample_texts = [
"""I recently learned that Soerens is a progressive systemic disease. I now walk with a cane and occasionally rent a wheelchair for trips to places like Disney. I'm 41 and have slowly been getting worse due to autonomic nervous system issues because of Soerens. I had to fire my original rheumatologist and find someone more familiar with the disease to learn more about it.""",
"""Well we know autoimmunes are a bit of a sliding scale and everyone with Sjogrens has a slightly different set of symptoms. I was diagnosed over 10 years ago and apart from slow saliva production I'm still healthy and relatively symptom free. So yes, it seems odd but maybe it's possible for someone to technically be diagnosed but have no noticeable symptoms?""",
"""Man. That's so scary and must've been so incredibly difficult not getting answers for so long. I'm happy that you finally got your results and began treatment. I'm only 18 and the aches and constant irritation is almost unbearable. I'm genuinely terrified to wait years like this.""",
"""I had usable vision immediately after the transplant. Before leaving the hospital, my opthalmogist came to check my eye pressure. The pressure increased during surgery. When he removed the gauze covering, I noticed a HUGE improvement in my vision in the right eye. Even with 18 stitches in my eye. """,
"""You need to go to a real dry eye specialist who has a Lipiview machine, who does an inflammadry test, and who offers lipiflow/IPL. These things are how you can identify a specialist. Take it from a 22 year old that has seen over 10 doctors in two years.""",
"""I'm from Canada and have worked in optical for 10 years before Optometry school. I know the laws and I can assure you, PD is still not a part of the Rx. She gave it to you for free even though it is a chargeable service.""",
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_exact_age_reddit download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_exact_age_reddit  ----------------- >

 bert_sequence_classifier_exact_age_reddit 

+---------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction     |confidence|sentence                                                                                                                                                                                                                                                                                                                                   

## **bert_sequence_classifier_self_reported_symptoms_tweet**


In [None]:
model = "bert_sequence_classifier_self_reported_symptoms_tweet" 

In [None]:
sample_texts = [
"""Las vacunas 3 y hablamos inminidad vivo  Son bichito vivo dentro de líquido de la vacuna suelen tener reacciones alérgicas si que sepan.""",
"""Yo pense que me estaba dando el  coronavirus porque cuando me levante  casi no podia respirar pero que si era que tenia la nariz topada de mocos.""",
"""Tos, dolor de garganta y fiebre, los síntomas más reportados por los porteños con coronavirus.""",
"""Los pacientes y contactos asintomáticos pueden hacerse lavados nasales con hipoclorito de sodio o gárgaras de sal, de acuerdo a los galenos.""",
"""Enseguida empiezo a meterle por la cabeza con un ladrillo al que me diga que tengo coronavirus por estar con mocos""",
"""Las Jordan de Aliexpress no producen efectos secundarios, no son hipotéticamente capaces de dejarme estéril, causarme muerte súbita, parálisis, mielitis transversa irreversible o daños neurológicos. Creo que existe una gran diferencia. Digo yo, no sé"""
]

In [None]:
run_pipeline(model, sample_texts, lang = 'es')

bert_sequence_classifier_self_reported_symptoms_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_self_reported_symptoms_tweet  ----------------- >

 bert_sequence_classifier_self_reported_symptoms_tweet 

+--------------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction          |confidence|sentence                                                                                                                                                                                                                                                  |
+--------------------+----------+-------------------------------------------------------------------------------------------------------------------------

## **bert_sequence_classifier_health_mandates_stance_tweet**

In [None]:
model = "bert_sequence_classifier_health_mandates_stance_tweet"

In [None]:
sample_texts = [
"""It's too dangerous to hold the RNC, but let's send students and teachers back to school.""",
"""So is the flu and pneumonia what are their s stop the Media Manipulation covid has treatments Youre Speaker Pelosi nephew so stop the agenda LIES.""",
"""Just a quick update to my U.S. followers, I'll be making a stop in all 50 states this spring!  No tickets needed, just don't wash your hands, cough on each other.""",
"""Go to a restaurant no mask Do a food shop wear a mask INCONSISTENT No Masks No Masks.""",
"""But if schools close who is gonna occupy those graves Cause politiciansprotected smokers protected drunkardsprotected school kids amp teachers""",
"""New title Maskhole I think Im going to use this very soon coronavirus."""
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_health_mandates_stance_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_health_mandates_stance_tweet  ----------------- >

 bert_sequence_classifier_health_mandates_stance_tweet 

+----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction|confidence|sentence                                                                                                                                                          |
+----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|FAVOR     |0.99996823|It's too dangerous to hold the RNC, but let's send students and teachers back to school.                                                    

## **bert_sequence_classifier_health_mandates_premise_tweet**


In [None]:
model = "bert_sequence_classifier_health_mandates_premise_tweet"

In [None]:
sample_texts = [
"""It's too dangerous to hold the RNC, but let's send students and teachers back to school.""",
"""So is the flu and pneumonia what are their s stop the Media Manipulation covid has treatments Youre Speaker Pelosi nephew so stop the agenda LIES.""",
"""Just a quick update to my U.S. followers, I'll be making a stop in all 50 states this spring!  No tickets needed, just don't wash your hands, cough on each other.""",
"""Go to a restaurant no mask Do a food shop wear a mask INCONSISTENT No Masks No Masks.""",
"""But if schools close who is gonna occupy those graves Cause politiciansprotected smokers protected drunkardsprotected school kids amp teachers""",
"""New title Maskhole I think Im going to use this very soon coronavirus."""
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_health_mandates_premise_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_health_mandates_premise_tweet  ----------------- >

 bert_sequence_classifier_health_mandates_premise_tweet 

+--------------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction    |confidence|sentence                                                                                                                                                          |
+--------------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|has_premise   |0.9987241 |It's too dangerous to hold the RNC, but let's send students and teachers back to school.                                 

## **bert_sequence_classifier_self_reported_stress_tweet**


In [None]:
model = "bert_sequence_classifier_self_reported_stress_tweet"

In [None]:
sample_texts = [
"""I need a constant reminder that the world is only temporary. I shouldn't be so stressed abt it. Things in this world are meant to hurt me.""",
"""I lost a lot of weight and hair. I am constantly stressed, I hate this. I am battling on many fronts and I hate this.""",
"""The last years of his life were troubled by a new period of storm and stress which called for his highest powers of calculation and self-control.""",
"""My back hurts, I'm breaking out, I'm constantly crying.. I am so stressed out and there's no help to help me out 😞""",
"""I’m going to have to start turning my phone off when I’m trying to relax. I’m not trying to be constantly stressed."""
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_self_reported_stress_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_self_reported_stress_tweet  ----------------- >

 bert_sequence_classifier_self_reported_stress_tweet 

+------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction  |confidence|sentence                                                                                                                                         |
+------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------+
|not-stressed|0.9329752 |I need a constant reminder that the world is only temporary. I shouldn't be so stressed abt it. Things in this world are meant to hurt me.       |
|stressed    |0.5380772 |I lost a lot of w

## **bert_sequence_classifier_stress**


In [None]:
model = "bert_sequence_classifier_stress"

In [None]:
sample_texts = [
"""No place in my city has shelter space for us, and I won't put my baby on the literal street. What cities have good shelter programs for homeless mothers and children?""",
"""Sometimes I hate being such an over achiever because I’m in constant stress to do good and be perfect ALL the time""",
"""The last years of his life were troubled by a new period of storm and stress which called for his highest powers of calculation and self-control.""",
"""My back hurts, I'm breaking out, I'm constantly crying.. I am so stressed out and there's no help to help me out 😞""",
"""I lost a lot of weight and hair. I am constantly stressed, I hate this. I am battling on many fronts and I hate this."""
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_stress download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_stress  ----------------- >

 bert_sequence_classifier_stress 

+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction|confidence|sentence                                                                                                                                                              |
+----------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|stress    |0.9984272 |No place in my city has shelter space for us, and I won't put my baby on the literal street. What cities have good shelter programs for homeless mothers and children?|
|no stress |0.6518241 |Som

## **bert_sequence_classifier_stressor**


In [None]:
model = "bert_sequence_classifier_stressor"

In [None]:
sample_texts = [
"""No place in my city has shelter space for us, and I won't put my baby on the literal street. What cities have good shelter programs for homeless mothers and children?""",
"""I just started working a new job after 2 years and I’m worried about messing up. I am not sure about the company will understand my position.""",
"""The last years of his life were troubled by a new period of storm and stress which called for his highest powers of calculation and self-control.""",
"""My back hurts, I'm breaking out, I'm constantly crying.. I am so stressed out and there's no help to help me out 😞""",
"""this advanced stats class really got me dying less than 2 weeks in I’m constantly sweating at the stress of not knowing wtf is happening 😰"""
]

In [None]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_stressor download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_stressor  ----------------- >

 bert_sequence_classifier_stressor 

+---------------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction                       |confidence|sentence                                                                                                                                                              |
+---------------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|Family Issues                    |0.61599123|No place in my city has shelter space for us, and I won't put my baby on the literal street. Wha

## **bert_sequence_classifier_self_reported_partner_violence_tweet**

In [5]:
model = "bert_sequence_classifier_self_reported_partner_violence_tweet"

In [6]:
sample_texts = [
    """ It s tempting to think of my abusive relationship as this discrete period of time, blocked off from the rest of my life.   But in reality  I think it s far more that in the 18 years leading up to it  I was being primed for it, and in the 17 years since  I ve been reinsured repeatedly.""",
    """ After reading through that the end   People will say anything to please the wome world.   Anything at all.  I hear the police invited him over for domestic violence.   If that child goes without a father they d have accomplished their goal right """,
    """ That twigs article is really hard to read and even harder when  I can relate to so many of her struggles about being in an abusive relationship. but the most heartbreaking part for me is the overwhelming amount of women the story is resonating with.""",
    """ I absolutely never said that. I did mention it s all about a balanced outlook on the situation.   You need to think about those suffering mental illness, domestic violence, potential job loss etc...but simultaneously it would be heartless to discredit the vulnerable   elderly """,
    """ Seriously    You are part of the problem.   Remember when  I broke my silence regarding the abusive   Relationship w    Keith   Ellison    You were supportive of abuse, lying to the public, corruption   smears.   This isn t justice.   You are part of the problem """,
    """ How could  I possibly tell that a male character is a piece of shit if there isn t an incredibly graphic domestic violence sequence   I m clearly the asshole here """ 
               ]

In [7]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_self_reported_partner_violence_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_self_reported_partner_violence_tweet  ----------------- >

 bert_sequence_classifier_self_reported_partner_violence_tweet 

+-----------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction                   |confidence|sentence                                                                                                                                                                                                                                                                                      |
+-----------------------------+---------

## **bert_sequence_classifier_self_reported_vaccine_status_tweet**

In [8]:
model = "bert_sequence_classifier_self_reported_vaccine_status_tweet"

In [9]:
sample_texts = [
    """ Perhaps when there s a  COVID 19 vaccine in the coming months  or years , it can help countries like   Pakistan increase national immunization stats if routine immunization is coupled with the virus jab drops.   Should the state, however, make vaccination a mandatory citizenship duty. """,
    """ Today  I received my second dose of the  COVID 19 vaccine. When it becomes available to you, don t wait   get vaccinated.  It s safe, easy, and it saves lives. """,
    """ I got my mom scheduled for the   Covid 19 vaccine.  A great relief to me, to be honest. """,
    """ It feels really exciting to have a personal connection to the province s vaccine numbers.   My step dad s 92 year old mother got her first dose yesterday. """,
    """ The current oxford vaccine is based off the work they did on the non mild coronavirus forms of  SARS and  MERS.   But since they were contained, the urgency to continue the work was reduced until  COVID 19. """,
    """ Got the covid vaccine tonight.... so far side effects for me. Super weak  exhausted. Injection site and arm hurts  AF Feel like  I smoked a fat. Other than that... feeling like a million bucks for doing my part. """
]


In [10]:
run_pipeline(model, sample_texts)

bert_sequence_classifier_self_reported_vaccine_status_tweet download started this may take some time.
[OK!]


<----------------- MODEL NAME: bert_sequence_classifier_self_reported_vaccine_status_tweet  ----------------- >

 bert_sequence_classifier_self_reported_vaccine_status_tweet 

+---------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|prediction     |confidence|sentence                                                                                                                                                                                                                                                                                       |
+---------------+----------+--------------------------------------------