In this Analysis we are going to find the hyperparameters for training the LDA model. 

We are going to use topic coherence for this

#### What is topic coherence?

Topic Coherence measures score a single topic by measuring the degree of semantic similarity between high scoring words in the topic. These measurements help distinguish between topics that are semantically interpretable topics and topics that are artifacts of statistical inference. But,

#### What is coherence?

A set of statements or facts is said to be coherent, if they support each other. Thus, a coherent fact set can be interpreted in a context that covers all or most of the facts. An example of a coherent fact set is "the game is a team sport", "the game is played with a ball", "the game demands great physical efforts"


### Coherence Measures

1. `C_v` measure is based on a sliding window, one-set segmentation of the top words and an indirect confirmation measure that uses normalized pointwise mutual information (NPMI) and the cosine similarity
2. `C_p` is based on a sliding window, one-preceding segmentation of the top words and the confirmation measure of Fitelson's coherence
3. `C_uci` measure is based on a sliding window and the pointwise mutual information (PMI) of all word pairs of the given top words
4. `C_umass` is based on document cooccurrence counts, a one-preceding segmentation and a logarithmic conditional probability as confirmation measure
5. `C_npmi` is an enhanced version of the C_uci coherence using the normalized pointwise mutual information (NPMI)
6. `C_a` is baseed on a context window, a pairwise comparison of the top words and an indirect confirmation measure that uses normalized pointwise mutual information (NPMI) and the cosine similarity

Pre processing the text

In [1]:
### 1. Import Libraries

# for text preprocessing
import re
import spacy

from nltk.corpus import stopwords 
from nltk.stem.wordnet import WordNetLemmatizer
import string

# import numpy for matrix operation
import numpy as np

# Importing Gensim
import gensim
from gensim import corpora

# to suppress warnings
from warnings import filterwarnings
filterwarnings('ignore')

nlp = spacy.load('en_core_web_sm')

In [2]:
# Importing modules
import pandas as pd
import os
import sys

# Read data into papers
#papers = pd.read_csv(os.path.join(sys.path[0],'../Dataset/ClinicalSTS2018/ClinicalSTS/clinicalSTS.train.txt'))
ClinicalTexts = pd.read_csv(os.path.join(sys.path[0],'data/clinicalSTS.train.txt'), sep='\t', header=None)

# combining all the documents into a list:

corpus = ClinicalTexts[0].tolist()

In [3]:
import gensim
from gensim.utils import simple_preprocess

def sent_to_words(sentences):
    for sentence in sentences:
        yield(gensim.utils.simple_preprocess(str(sentence), deacc=True))  # deacc=True removes punctuations

data_words = list(sent_to_words(corpus))

print(data_words[:1][0][:30])

['insulin', 'nph', 'human', 'novolin', 'unit', 'ml', 'suspension', 'subcutaneous', 'as', 'directed', 'by', 'prescriber']


In [4]:
# Build the bigram and trigram models
bigram = gensim.models.Phrases(data_words, min_count=5, threshold=100) # higher threshold fewer phrases.
trigram = gensim.models.Phrases(bigram[data_words], threshold=100)  

# Faster way to get a sentence clubbed as a trigram/bigram
bigram_mod = gensim.models.phrases.Phraser(bigram)
trigram_mod = gensim.models.phrases.Phraser(trigram)

In [5]:
def make_bigrams(texts):
    return [bigram_mod[doc] for doc in texts]

def make_trigrams(texts):
    return [trigram_mod[bigram_mod[doc]] for doc in texts]

def lemmatization(sent, allowed_postags=['NOUN', 'ADJ', 'VERB', 'ADV']):
    """https://spacy.io/api/annotation"""
    texts_out = []
    #print(sent)
    doc = nlp(" ".join(sent.split()))
    print(doc)
    return [token.lemma_ for token in doc if token.pos_ in allowed_postags]


In [6]:
# Apply Preprocessing on the Corpus

# stop loss words 
stop = set(stopwords.words('english'))
stop.update(['from', 'subject', 're', 'edu', 'use'])

# punctuation 
exclude = set(string.punctuation) 

# lemmatization
lemma = WordNetLemmatizer() 

# One function for all the steps:
def clean(doc):
    
    # convert text into lower case + split into words
    stop_free = " ".join([i for i in doc.lower().split() if i not in stop])
    
    # remove any stop words present
    punc_free = ''.join(ch for ch in stop_free if ch not in exclude)  
    
    # Form Bigrams
    data_words_bigrams = ''.join(bigram_mod[doc])
    print(data_words_bigrams)
    lem = lemmatization(data_words_bigrams, allowed_postags=['NOUN', 'ADJ', 'VERB', 'ADV'])
    print(lem)
    
    data_lemmatized = " ".join(lem)

    # remove punctuations + normalize the text
    #normalized = " ".join(lemma.lemmatize(word) for word in punc_free.split())  
    return data_lemmatized

# clean data stored in a new list
clean_corpus = [clean(doc).split() for doc in corpus] 

Insulin NPH Human [NOVOLIN N] 100 unit/mL suspension subcutaneous as directed by prescriber.
Insulin NPH Human [NOVOLIN N] 100 unit/mL suspension subcutaneous as directed by prescriber.
['unit', 'suspension', 'subcutaneous', 'direct']
 Patient arrives ambulatory, Gait steady, History obtained from patient, Patient appears comfortable, Patient cooperative, alert, Skin warm.
Patient arrives ambulatory, Gait steady, History obtained from patient, Patient appears comfortable, Patient cooperative, alert, Skin warm.
['arrive', 'ambulatory', 'gait', 'steady', 'history', 'obtain', 'patient', 'appear', 'comfortable', 'patient', 'cooperative', 'alert', 'skin', 'warm']
 Peripheral IV site, established in the right forearm, using an 18 gauge catheter, in one attempt.
Peripheral IV site, established in the right forearm, using an 18 gauge catheter, in one attempt.
['site', 'establish', 'right', 'forearm', 'use', 'gauge', 'catheter', 'attempt']
 No: new confusion or inability to stay alert and awake

Patient Education: Ready to learn, no apparent learning barriers were identified; learning preferences include listening.
['ready', 'learn', 'apparent', 'learning', 'barrier', 'identify', 'learning', 'preference', 'include', 'listen']
 Negative cardiovascular review of systems, Historian denies chest pain, dyspnea on exertion.
Negative cardiovascular review of systems, Historian denies chest pain, dyspnea on exertion.
['negative', 'cardiovascular', 'review', 'system', 'deny', 'chest', 'pain', 'dyspnea', 'exertion']
 No: joint swelling; pain of a type other than joint pain; a limp without known injury; discomfort in legs that improves with movement usually occurring at night; painful cramping in your hip, thigh or calf muscles after walking or climbing stairs; muscle cramps; lower leg swelling (without injury); varicose veins or 'spider veins'; ulcers or sores on the lower leg; red or hot area of the skin; pale, cold, or blue leg(s)or toes; nail problem; cracking, peeling, burning or it

When ambulating with a cane, place the cane in the hand on the side opposite the surgical limb.
['when', 'ambulate', 'cane', 'place', 'cane', 'hand', 'side', 'surgical', 'limb']
 Discussed goals, risk, alternatives, advance directives, and the necessity of the other members of the surgical team participating in the procedure with the patient (and whoever else is present).
Discussed goals, risk, alternatives, advance directives, and the necessity of the other members of the surgical team participating in the procedure with the patient (and whoever else is present).
['discuss', 'goal', 'risk', 'alternative', 'advance', 'directive', 'necessity', 'other', 'member', 'surgical', 'team', 'participate', 'procedure', 'patient', 'else', 'present']
No: fever present (greater than or equal to 100.4 F or 38 C) or suspected fever; shaking chills or nausea or vomiting present
No: fever present (greater than or equal to 100.4 F or 38 C) or suspected fever; shaking chills or nausea or vomiting present


Discussed risks, goals, alternatives, advance directives, and the necessity of other members of the healthcare team participating in the procedure with (patient) (legal representative and others present during the discussion).
['discuss', 'risk', 'goal', 'alternative', 'advance', 'directive', 'necessity', 'other', 'member', 'healthcare', 'team', 'participate', 'procedure', 'patient', 'legal', 'representative', 'other', 'present', 'discussion']
 Patient needs assistance with the following instrumental activities of daily  living: meal preparation, medication administration, housekeeping, shopping, managing finances, transportation use (drive car/use taxi/bus).
Patient needs assistance with the following instrumental activities of daily living: meal preparation, medication administration, housekeeping, shopping, managing finances, transportation use (drive car/use taxi/bus).
['patient', 'need', 'assistance', 'follow', 'instrumental', 'activity', 'daily', 'living', 'meal', 'preparation', 

No: new confusion or inability to stay alert and awake; noisy, wheezy or raspy breathing that does not clear with coughing; newly stiff or painful neck; purple or red rash/blotches that stay when pressed by a glass (purpural rash); feeling like you are going to pass out EVERY time you stand (or sit) up or muffled voice or inability to open mouth fully
['new', 'confusion', 'inability', 'stay', 'alert', 'awake', 'noisy', 'wheezy', 'breathing', 'clear', 'cough', 'newly', 'stiff', 'painful', 'neck', 'purple', 'red', 'rash', 'blotch', 'stay', 'when', 'press', 'glass', 'feel', 'go', 'pass', 'time', 'stand', 'sit', 'up', 'muffled', 'voice', 'inability', 'open', 'mouth', 'fully']
 Patient discharged to home, ambulating without assistance, family driving, accompanied by husband/wife/partner, Discharge instructions given to patient, Above person(s) verbalized understanding of discharge instructions and follow-up care.
Patient discharged to home, ambulating without assistance, family driving, acc

Identified Illness as a learning need, Identified Follow-up care as a learning need, Identified Medications as a learning need, Patients primary language is English, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instructions.
['learning', 'need', 'care', 'learning', 'need', 'learning', 'need', 'patient', 'primary', 'language', 'barrier', 'learn', 'identify', 'intervention', 'use', 'address', 'barrier', 'teaching', 'method', 'use', 'include']
 I instructed the patient on the importance of maintaining a diet constant in vitamin K, reporting any change in over-the-counter or prescription medicines reporting any nausea with vomiting/diarrhea, reporting viral/bacterial febrile illnesses.
I instructed the patient on the importance of maintaining a diet constant in vitamin K, reporting any change in over-the-counter or prescription medicines reporting any nausea with vomiting/diarrhea, report

The patient has not experienced TIA symptoms, facial drooping, left leg/calf pain, right leg/calf pain, dyspnea, prolonged bleeding, blood in stool or urine, nose bleeds, easy bruising, changes in prescription medicines, changes in over-the-counter medicines, recent injury/fall(s), blood pressure elevations above 140/90.
['patient', 'experience', 'symptom', 'facial', 'drooping', 'leave', 'leg', 'calf', 'pain', 'right', 'leg', 'calf', 'pain', 'dyspnea', 'prolonged', 'bleeding', 'blood', 'stool', 'urine', 'nose', 'bleed', 'easy', 'bruising', 'change', 'prescription', 'medicine', 'change', 'counter', 'medicine', 'recent', 'injury', 'fall(s', 'blood', 'pressure', 'elevation']
 During or after exercise, patient denies:  rash or hives, vertigo, syncope episode, chest pain, shortness of breath, wheeze, cough, palpitations, heat-related illness, excessive fatigue, headaches.
During or after exercise, patient denies: rash or hives, vertigo, syncope episode, chest pain, shortness of breath, whee

Identified Illness as a learning need, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instructions, Patient verbalized an understanding of discharge teaching.
['learning', 'need', 'barrier', 'learn', 'identify', 'intervention', 'use', 'address', 'barrier', 'teaching', 'method', 'use', 'include', 'verbalize', 'understanding', 'discharge', 'teaching']
 I explained the diagnosis and treatment plan in detail, and the patient clearly expressed understanding of the content reviewed.
I explained the diagnosis and treatment plan in detail, and the patient clearly expressed understanding of the content reviewed.
['explain', 'diagnosis', 'treatment', 'plan', 'detail', 'patient', 'clearly', 'express', 'understanding', 'content', 'review']
Explained diagnosis and treatment plan; patient appears to understand the content of our discussion.
Explained diagnosis and treatment plan; patient appears to u

Identified Illness as a learning need, Patients primary language is English, No Barriers to learning were identified, Involved Family Member or Primary Caregiver to address Barriers to Learning, Teaching methods used included Printed Patient Instructions, Verbal Instructions, Patient verbalized an understanding of discharge teaching.
['learning', 'need', 'patient', 'primary', 'language', 'barrier', 'learn', 'identify', 'address', 'barrier', 'teaching', 'method', 'use', 'include', 'verbalize', 'understanding', 'discharge', 'teaching']
 Medication(s) were reviewed by discussing verbally with patient and/or caregiver, reviewed by checking medication bottles brought in by patient and/or caregiver.
Medication(s) were reviewed by discussing verbally with patient and/or caregiver, reviewed by checking medication bottles brought in by patient and/or caregiver.
['review', 'discuss', 'verbally', 'patient', 'caregiver', 'review', 'check', 'medication', 'bottle', 'bring', 'patient', 'caregiver']
 

The patient had a chance to have any questions about this procedure answered, understand(s) and wish(es) to proceed.
['patient', 'chance', 'question', 'procedure', 'answer', 'proceed']
 F or 38 C) or suspected fever; vomiting; pain, redness, discharge, tearing or swelling of the eye; diarrhea or mouth ulcers
F or 38 C) or suspected fever; vomiting; pain, redness, discharge, tearing or swelling of the eye; diarrhea or mouth ulcers
['suspect', 'fever', 'vomiting', 'pain', 'redness', 'discharge', 'tear', 'swelling', 'eye', 'diarrhea', 'mouth', 'ulcer']
The patient understands the information and questions answered; the patient wishes to proceed with the biopsy.
The patient understands the information and questions answered; the patient wishes to proceed with the biopsy.
['patient', 'understand', 'information', 'question', 'answer', 'patient', 'wish', 'proceed', 'biopsy']
The diagnosis and treatment plans were explained and the patient expressed understanding of the content.
The diagnosis 

Negative neurologic review of systems, Historian denies confusion, dizziness, focal weakness, gait changes, headache.
['negative', 'neurologic', 'review', 'system', 'deny', 'confusion', 'dizziness', 'focal', 'weakness', 'gait', 'change']
Explained diagnosis and treatment plan; patient/child/care giver expressed understanding of the content.
Explained diagnosis and treatment plan; patient/child/care giver expressed understanding of the content.
['explain', 'diagnosis', 'treatment', 'plan', 'patient', 'child', 'care', 'giver', 'express', 'understanding', 'content']
 Discussed with the patient the necessity of other members of the healthcare team, both male and female, participating in the procedure if needed.
Discussed with the patient the necessity of other members of the healthcare team, both male and female, participating in the procedure if needed.
['discuss', 'patient', 'necessity', 'other', 'member', 'healthcare', 'team', 'male', 'female', 'participate', 'procedure', 'need']
 Educa

Patient's age is 9 years of age to less than 18 years old: Administer Inactivated Influenza Virus Vaccine (Fluzone) 0.5 mL, intramuscular.
['age', 'year', 'age', 'less', 'year', 'old', 'ml', 'intramuscular']
Albuterol 90 mcg/Act HFA Aerosol 1-2 puffs by inhalation as directed by prescriber as needed.
Albuterol 90 mcg/Act HFA Aerosol 1-2 puffs by inhalation as directed by prescriber as needed.
['mcg', 'puff', 'inhalation', 'direct', 'need']
Goals:  Patient will verbalize and demonstrate understanding of home exercise program following this therapy session.
Goals: Patient will verbalize and demonstrate understanding of home exercise program following this therapy session.
['goal', 'will', 'verbalize', 'demonstrate', 'understanding', 'home', 'exercise', 'program', 'follow', 'therapy', 'session']
 Identified Illness as a learning need, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instruct

Explained diagnosis and treatment plan; patient/parents expressed understanding of the content.
['explain', 'diagnosis', 'treatment', 'plan', 'patient', 'parent', 'express', 'understanding', 'content']
 Patient's age is 9 years of age to less than 18 years old: Administer Inactivated Influenza Virus Vaccine (Fluzone) 0.5 mL, intramuscular.
Patient's age is 9 years of age to less than 18 years old: Administer Inactivated Influenza Virus Vaccine (Fluzone) 0.5 mL, intramuscular.
['age', 'year', 'age', 'less', 'year', 'old', 'ml', 'intramuscular']
 No: inability to speak or make normal sounds; new confusion or inability to stay alert and awake; sudden swelling of the lips, tongue or mouth; struggling to breathe even while inactive or resting; pain, pressure or tightness in the chest, arm or shoulder, jaw, or neck or currently feeling like you are going to collapse every time you stand or sit up
No: inability to speak or make normal sounds; new confusion or inability to stay alert and awake

Instructions: 1 drop 4 times per day for 1 week, then, 1 drop 3 times per day for 1 week, then, 1 drop 2 times per day for 1 week, then, 1 drop 1 time per day for 1 week, then stop.
['instruction', 'drop', 'time', 'day', 'week', 'then', 'drop', 'time', 'day', 'week', 'then', 'drop', 'time', 'day', 'week', 'then', 'drop', 'time', 'day', 'week', 'then', 'stop']
 I explained the diagnosis and treatment plan in detail, and the patient clearly expressed understanding of the content reviewed.
I explained the diagnosis and treatment plan in detail, and the patient clearly expressed understanding of the content reviewed.
['explain', 'diagnosis', 'treatment', 'plan', 'detail', 'patient', 'clearly', 'express', 'understanding', 'content', 'review']
 Patient Educational Needs:  Patient assessed:  Ready to learn, no apparent learning barriers.
Patient Educational Needs: Patient assessed: Ready to learn, no apparent learning barriers.
['patient', 'assess', 'ready', 'learn', 'apparent', 'learning', '

We discussed the procedure itself and the fact that he would need someone else to drive him home following the procedure.
['discuss', 'procedure', 'fact', 'would', 'need', 'else', 'drive', 'home', 'follow', 'procedure']
 Discussed the risks, goals, alternatives, and advanced directives and the necessity of other members of the healthcare team participating in the procedure with the patient (or legal representative and other present during the discussion).
Discussed the risks, goals, alternatives, and advanced directives and the necessity of other members of the healthcare team participating in the procedure with the patient (or legal representative and other present during the discussion).
['discuss', 'risk', 'goal', 'alternative', 'advanced', 'directive', 'necessity', 'other', 'member', 'healthcare', 'team', 'participate', 'procedure', 'patient', 'legal', 'representative', 'other', 'present', 'discussion']
Additional follow up:  advised to recheck with primary provider in one week or 

Identified Illness as a learning need, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instructions.
['learning', 'need', 'barrier', 'learn', 'identify', 'intervention', 'use', 'address', 'barrier', 'teaching', 'method', 'use', 'include']
 No: joint swelling; pain of a type other than joint pain; a limp without a known injury; discomfort in the legs that improves with movement, usually occurring at night; painful cramping in your hip, thigh or calf muscles after walking or climbing stairs; muscle cramps; lower leg swelling without a known injury; varicose veins or spider veins; ulcers or sores on the lower leg; red or hot area of the skin; pale, cold or blue leg or toes; nail problem; cracking, peeling, burning or itching between the toes or on the soles of the feet; lump or swollen gland under the skin or unable to remove a ring or other piece of jewelry
No: joint swelling; pain of a ty

Parent ready to learn, no apparent learning barriers were identified; learning preferences include delineated.
['parent', 'ready', 'learn', 'apparent', 'learning', 'barrier', 'identify', 'learning', 'preference', 'include', 'delineate']
 After discussion of the risks, benefits, and alternatives to treatment with cryotherapy, informed consent was obtained.
After discussion of the risks, benefits, and alternatives to treatment with cryotherapy, informed consent was obtained.
['discussion', 'risk', 'benefit', 'alternative', 'treatment', 'cryotherapy', 'informed', 'consent', 'obtain']
 I have reviewed the physician assistant and nursing documentation, studies, and consultations and agree with the findings documented on the written ED Record.
I have reviewed the physician assistant and nursing documentation, studies, and consultations and agree with the findings documented on the written ED Record.
['review', 'physician', 'assistant', 'nursing', 'documentation', 'study', 'consultation', 'ag

Discussed risks, goals, alternatives, and advanced directives, the necessity of other members of the healthcare team participating in the procedure with the patient (or legal representative and others present during the discussion).
['discuss', 'risk', 'goal', 'alternative', 'advanced', 'directive', 'necessity', 'other', 'member', 'healthcare', 'team', 'participate', 'procedure', 'patient', 'legal', 'representative', 'other', 'present', 'discussion']
 No: new confusion or inability to stay alert and awake; currently feeling like you are going to collapse every time you stand (sit); newly stiff or painful neck; purple or red rash/blotches that stay when pressed by a glass (purpural rash) or pain, pressure or tightness in the chest, jaw or arm
No: new confusion or inability to stay alert and awake; currently feeling like you are going to collapse every time you stand (sit); newly stiff or painful neck; purple or red rash/blotches that stay when pressed by a glass (purpural rash) or pain,

Negative respiratory review of systems, Historian denies cough, shortness of breath, sputum, wheezing.
['negative', 'respiratory', 'review', 'system', 'deny', 'cough', 'shortness', 'breath', 'sputum', 'wheeze']
 No: new confusion or inability to stay alert and awake; feeling like you are going to pass out EVERY time you stand (or sit) up; blue or dusky lips, skin or nail beds; purple or red rash/blotches that stay when pressed by a glass (purpural rash); newly stiff or painful neck or puffiness around the eyes
No: new confusion or inability to stay alert and awake; feeling like you are going to pass out EVERY time you stand (or sit) up; blue or dusky lips, skin or nail beds; purple or red rash/blotches that stay when pressed by a glass (purpural rash); newly stiff or painful neck or puffiness around the eyes
['new', 'confusion', 'inability', 'stay', 'alert', 'awake', 'feel', 'go', 'pass', 'time', 'stand', 'sit', 'up', 'blue', 'dusky', 'lip', 'skin', 'nail', 'bed', 'purple', 'red', 'ras

Discussed with the patient the necessity of other members of the healthcare team, both male and female, participating in the procedure if needed.
['discuss', 'patient', 'necessity', 'other', 'member', 'healthcare', 'team', 'male', 'female', 'participate', 'procedure', 'need']
 Identified Illness as a learning need, No Barriers to learning were identified, Teaching methods used included Printed Patient Instructions, Verbal Instructions, Patient verbalized an understanding of discharge teaching.
Identified Illness as a learning need, No Barriers to learning were identified, Teaching methods used included Printed Patient Instructions, Verbal Instructions, Patient verbalized an understanding of discharge teaching.
['learning', 'need', 'barrier', 'learn', 'identify', 'teaching', 'method', 'use', 'include', 'verbalize', 'understanding', 'discharge', 'teaching']
Fluticasone Propionate [FLOVENT HFA] 110 mcg/Act Aerosol 2 puffs by inhalation two times a day.
Fluticasone Propionate [FLOVENT HFA]

Reviewed NCEP III guidelines with the patient including goals of total cholesterol less than 170, triglycerides less than 150, HDL cholesterol greater than 45 for men/greater than 55 for women, and LDL cholesterol of less than 70.
['guideline', 'patient', 'include', 'goal', 'total', 'cholesterol', 'less', 'triglyceride', 'less', 'cholesterol', 'great', 'man', 'great', 'woman', 'ldl', 'cholesterol', 'less']
 Patient discharged to home, ambulating without assistance, family driving, accompanied by parent, Discharge instructions given to patient, Discharge instructions given to mother, Above person(s) verbalized understanding of discharge instructions and follow-up care.
Patient discharged to home, ambulating without assistance, family driving, accompanied by parent, Discharge instructions given to patient, Discharge instructions given to mother, Above person(s) verbalized understanding of discharge instructions and follow-up care.
['patient', 'discharge', 'home', 'ambulate', 'assistance'

Patient discharged to home, ambulating without assistance, family driving, accompanied by parent, Discharge instructions given to patient, Above person(s) verbalized understanding of discharge instructions and follow-up care.
['patient', 'discharge', 'home', 'ambulate', 'assistance', 'family', 'driving', 'accompany', 'parent', 'discharge', 'instruction', 'give', 'patient', 'verbalize', 'understanding', 'discharge', 'instruction', 'follow', 'care']
 Discussed the risks, benefits, alternatives, Advance Directives, and the necessity of other members of the healthcare team participating in the procedure.
Discussed the risks, benefits, alternatives, Advance Directives, and the necessity of other members of the healthcare team participating in the procedure.
['discuss', 'risk', 'benefit', 'alternative', 'necessity', 'other', 'member', 'healthcare', 'team', 'participate', 'procedure']
The client verbalized understanding and consented to the plan of care and the goals established.
The client v

The general care of a sprain includes the use of a medication to reduce pain, the use of a splint to reduce movement and Resting, Icing, Compressing and Elevating the injured area.
['general', 'care', 'include', 'use', 'medication', 'reduce', 'pain', 'use', 'splint', 'reduce', 'movement', 'elevate', 'injure', 'area']
 Discussed the risks, goals, alternatives, and the necessity of other members of the healthcare team participating in the procedure.
Discussed the risks, goals, alternatives, and the necessity of other members of the healthcare team participating in the procedure.
['discuss', 'risk', 'goal', 'alternative', 'necessity', 'other', 'member', 'healthcare', 'team', 'participate', 'procedure']
 All questions were answered, and the patient was provided with the following printed materials:
All questions were answered, and the patient was provided with the following printed materials:
['question', 'answer', 'patient', 'provide', 'follow', 'print', 'material']
Education provided on 

Patient Education: Cognitive and/or language difficulties or the patient's age prevented the patient from understanding the diagnosis and/or treatment plan.
['language', 'difficulty', 'patient', 'age', 'prevent', 'patient', 'understand', 'diagnosis', 'treatment', 'plan']
 I reviewed the history, examined the patient, reviewed the available imaging studies, and agree with the findings as presented.
I reviewed the history, examined the patient, reviewed the available imaging studies, and agree with the findings as presented.
['review', 'history', 'examine', 'patient', 'review', 'available', 'imaging', 'study', 'agree', 'finding', 'present']
 No: new confusion or inability to stay alert and awake; currently feeling like you are going to collapse every time you stand or sit up; complete inability to swallow; new neck pain and difficulty bending the neck; purple or red rash/blotches that stay when pressed by a glass (purpuric rash); pain, pressure or tightness in the chest, arm or shoulder,

I explained the diagnosis and treatment plan for the patient who expressed understanding of the content.
['explain', 'diagnosis', 'treatment', 'plan', 'patient', 'express', 'understanding', 'content']
Triamcinolone Acetonide [KENALOG] 0.1 % cream 1 apply topically as directed by prescriber as needed.
Triamcinolone Acetonide [KENALOG] 0.1 % cream 1 apply topically as directed by prescriber as needed.
['%', 'cream', 'apply', 'topically', 'direct', 'need']
 No: inability to speak or make normal sounds; new confusion or inability to stay alert and awake; complete inability to swallow; sudden onset of cough, choking or gagging due to inhaling something into your airway; pain, pressure or tightness in the chest, arm or shoulder, jaw, or neck; large amounts of pink or white frothy sputum or noisy, wheezy or raspy breathing that does not clear with coughing
No: inability to speak or make normal sounds; new confusion or inability to stay alert and awake; complete inability to swallow; sudden on

No: new confusion or inability to stay alert and awake; currently struggling to breathe, even while inactive or resting; currently feeling like you are going to collapse every time you stand (sit); any chest pain or discomfort; vomit that looks like ground coffee; vomiting blood; uncontrollable or continuous rectal bleeding; black, sticky, tar-like stools or purple or red rash/blotches that stay when pressed by a glass (purpural rash)
['new', 'confusion', 'inability', 'stay', 'alert', 'awake', 'currently', 'struggle', 'breathe', 'even', 'inactive', 'rest', 'currently', 'feel', 'go', 'collapse', 'time', 'stand', 'sit', 'chest', 'pain', 'discomfort', 'vomit', 'look', 'ground', 'coffee', 'vomit', 'blood', 'uncontrollable', 'continuous', 'rectal', 'bleeding', 'black', 'sticky', 'tar', 'like', 'stool', 'purple', 'red', 'rash', 'blotch', 'stay', 'when', 'press', 'glass']
 No: new confusion or inability to stay alert and awake; noisy, wheezy or raspy breathing that does not clear with coughin

Green or yellow discharge in the eye or on eyelashes continuing throughout the day, bloodshot or pink eyes
['green', 'yellow', 'discharge', 'eye', 'eyelash', 'continue', 'day', 'bloodshot', 'pink', 'eye']
Patient is here for the following immunization(s):  Live Intranasal Influenza Virus Vaccination
Patient is here for the following immunization(s): Live Intranasal Influenza Virus Vaccination
['patient', 'here', 'follow', 'live']
 The above was discussed with the patient, and she voiced understanding of the content and plan.
The above was discussed with the patient, and she voiced understanding of the content and plan.
['above', 'discuss', 'patient', 'voice', 'understanding', 'content', 'plan']
 Side rails up, Cart/Stretcher in lowest position, Call light within reach, Hospital ID band on, Patient safety, dignity, sense of well being and individual rights are respected.
Side rails up, Cart/Stretcher in lowest position, Call light within reach, Hospital ID band on, Patient safety, digni

In [7]:
clean_corpus

[['unit', 'suspension', 'subcutaneous', 'direct'],
 ['arrive',
  'ambulatory',
  'gait',
  'steady',
  'history',
  'obtain',
  'patient',
  'appear',
  'comfortable',
  'patient',
  'cooperative',
  'alert',
  'skin',
  'warm'],
 ['site',
  'establish',
  'right',
  'forearm',
  'use',
  'gauge',
  'catheter',
  'attempt'],
 ['new',
  'confusion',
  'inability',
  'stay',
  'alert',
  'awake',
  'currently',
  'struggle',
  'breathe',
  'even',
  'inactive',
  'rest',
  'currently',
  'feel',
  'go',
  'collapse',
  'time',
  'stand',
  'sit',
  'vomit',
  'look',
  'ground',
  'coffee',
  'vomit',
  'blood',
  'uncontrollable',
  'continuous',
  'rectal',
  'bleeding',
  'black',
  'sticky',
  'tar',
  'like',
  'stool',
  'heavy',
  'vaginal',
  'bleeding',
  'purple',
  'red',
  'rash',
  'blotch',
  'stay',
  'when',
  'press',
  'glass'],
 ['spend',
  'minute',
  'patient',
  'great',
  '%',
  'time',
  'spend',
  'counsel',
  'patient',
  'regard',
  'diagnosis',
  'available',


Creating Document term Matrix

In [8]:
dict_ = corpora.Dictionary(clean_corpus)

print(dict_)

# Converting list of documents (corpus) into Document Term Matrix using the dictionary 
doc_term_matrix = [dict_.doc2bow(i) for i in clean_corpus]
doc_term_matrix[0:2]

Dictionary(1349 unique tokens: ['direct', 'subcutaneous', 'suspension', 'unit', 'alert']...)


[[(0, 1), (1, 1), (2, 1), (3, 1)],
 [(4, 1),
  (5, 1),
  (6, 1),
  (7, 1),
  (8, 1),
  (9, 1),
  (10, 1),
  (11, 1),
  (12, 1),
  (13, 2),
  (14, 1),
  (15, 1),
  (16, 1)]]

Building Base model

In [9]:
# Build LDA model
lda_model = gensim.models.LdaMulticore(corpus=doc_term_matrix,
                                       id2word=dict_,
                                       num_topics=6, 
                                       random_state=100,
                                       chunksize=100,
                                       passes=10,
                                       per_word_topics=True)

In [10]:
from pprint import pprint

# Print the Keyword in the 10 topics
pprint(lda_model.print_topics())


[(0,
  '0.062*"patient" + 0.044*"use" + 0.036*"barrier" + 0.034*"teaching" + '
  '0.033*"discharge" + 0.029*"need" + 0.029*"understanding" + 0.027*"learning" '
  '+ 0.022*"verbalize" + 0.021*"include"'),
 (1,
  '0.052*"other" + 0.045*"procedure" + 0.045*"discuss" + 0.039*"member" + '
  '0.039*"participate" + 0.039*"team" + 0.037*"necessity" + 0.033*"risk" + '
  '0.030*"healthcare" + 0.028*"alternative"'),
 (2,
  '0.042*"patient" + 0.039*"treatment" + 0.038*"plan" + 0.035*"learning" + '
  '0.029*"diagnosis" + 0.026*"explain" + 0.026*"understanding" + '
  '0.025*"express" + 0.024*"learn" + 0.024*"barrier"'),
 (3,
  '0.037*"pain" + 0.018*"patient" + 0.012*"right" + 0.012*"change" + '
  '0.011*"leg" + 0.011*"time" + 0.010*"eye" + 0.008*"leave" + 0.007*"history" '
  '+ 0.007*"mcg"'),
 (4,
  '0.026*"stay" + 0.024*"inability" + 0.023*"new" + 0.016*"awake" + '
  '0.015*"alert" + 0.015*"confusion" + 0.014*"neck" + 0.013*"when" + '
  '0.013*"fever" + 0.012*"pain"'),
 (5,
  '0.017*"stay" + 0.017*

In [11]:
num_topics=6

import pyLDAvis.gensim
import pickle 
import pyLDAvis

# Visualize the topics
pyLDAvis.enable_notebook()

LDAvis_data_filepath = os.path.join(sys.path[0],'results/ldavis_med_'+str(num_topics))

# # this is a bit time consuming - make the if statement True
# # if you want to execute visualization prep yourself
if 1 == 1:
    LDAvis_prepared = pyLDAvis.gensim.prepare(lda_model, doc_term_matrix, dict_)
    with open(LDAvis_data_filepath, 'wb') as f:
        pickle.dump(LDAvis_prepared, f)

# load the pre-prepared pyLDAvis data from disk
with open(LDAvis_data_filepath, 'rb') as f:
    LDAvis_prepared = pickle.load(f)

#pyLDAvis.save_html(LDAvis_prepared, './results/ldavis_prepared_'+ str(num_topics) +'.html')
pyLDAvis.save_html(LDAvis_prepared, os.path.join(sys.path[0],'results/ldavis_med_'+str(num_topics)+'.html'))

LDAvis_prepared

#### Compute Model Perplexity and Coherence Score

Let's calculate the baseline coherence score

In [12]:
from gensim.models import CoherenceModel

# Compute Coherence Score
coherence_model_lda = CoherenceModel(model=lda_model, texts=clean_corpus, dictionary=dict_, coherence='c_v')
coherence_lda = coherence_model_lda.get_coherence()
print('Coherence Score: ', coherence_lda)

  and should_run_async(code)


Coherence Score:  0.5597837127472782


** **
#### Step 6: Hyperparameter tuning
** **
First, let's differentiate between model hyperparameters and model parameters :

- `Model hyperparameters` can be thought of as settings for a machine learning algorithm that are tuned by the data scientist before training. Examples would be the number of trees in the random forest, or in our case, number of topics K

- `Model parameters` can be thought of as what the model learns during training, such as the weights for each word in a given topic.

Now that we have the baseline coherence score for the default LDA model, let's perform a series of sensitivity tests to help determine the following model hyperparameters: 
- Number of Topics (K)
- Dirichlet hyperparameter alpha: Document-Topic Density
- Dirichlet hyperparameter beta: Word-Topic Density

We'll perform these tests in sequence, one parameter at a time by keeping others constant and run them over the two difference validation corpus sets. We'll use `C_v` as our choice of metric for performance comparison 

In [13]:
# supporting function
def compute_coherence_values(corpus, dictionary, k, a, b):
    
    lda_model = gensim.models.LdaMulticore(corpus=doc_term_matrix,
                                           id2word=dictionary,
                                           num_topics=k, 
                                           random_state=100,
                                           chunksize=100,
                                           passes=10,
                                           alpha=a,
                                           eta=b)
    
    coherence_model_lda = CoherenceModel(model=lda_model, texts=clean_corpus, dictionary=dictionary, coherence='c_v')
    
    return coherence_model_lda.get_coherence()

  and should_run_async(code)


In [14]:
compute_coherence_values(clean_corpus, dict_, k=6, a=0.1, b=0.1)


  and should_run_async(code)


0.5451247971357184

In [15]:
import numpy as np
import tqdm

grid = {}
grid['Validation_Set'] = {}

# Topics range
min_topics = 2
max_topics = 11
step_size = 1
topics_range = range(min_topics, max_topics, step_size)

# Alpha parameter
alpha = list(np.arange(0.01, 1, 0.3))
alpha.append('symmetric')
alpha.append('asymmetric')

# Beta parameter
beta = list(np.arange(0.01, 1, 0.3))
beta.append('symmetric')

# Validation sets
num_of_docs = len(clean_corpus)
print(num_of_docs)
corpus_sets = [gensim.utils.ClippedCorpus(clean_corpus, int(num_of_docs*0.75)), 
               clean_corpus]
#print(corpus_sets)

corpus_title = ['75% Corpus', '100% Corpus']

model_results = {'Validation_Set': [],
                 'Topics': [],
                 'Alpha': [],
                 'Beta': [],
                 'Coherence': []
                }

# Can take a long time to run
if 1 == 1:
    pbar = tqdm.tqdm(total=(len(beta)*len(alpha)*len(topics_range)*len(corpus_title)))
    
    # iterate through validation corpuses
    for i in range(len(corpus_sets)):
        # iterate through number of topics
        for k in topics_range:
            # iterate through alpha values
            for a in alpha:
                # iterare through beta values
                for b in beta:
                    # get the coherence score for the given parameters
                    cv = compute_coherence_values(corpus=corpus_sets[i], dictionary=dict_, 
                                                  k=k, a=a, b=b)
                    
                    print(cv)
                    # Save the model results
                    model_results['Validation_Set'].append(corpus_title[i])
                    model_results['Topics'].append(k)
                    model_results['Alpha'].append(a)
                    model_results['Beta'].append(b)
                    model_results['Coherence'].append(cv)
                    
                    pbar.update(1)
    pd.DataFrame(model_results).to_csv(os.path.join(sys.path[0],'results/lda_cliSTS.csv'), index=False)
    pbar.close()

  and should_run_async(code)
  0%|          | 0/540 [00:00<?, ?it/s]

750


  0%|          | 1/540 [00:01<12:00,  1.34s/it]

0.33440397034412667


  0%|          | 2/540 [00:02<13:36,  1.52s/it]

0.3344039703441267


  1%|          | 3/540 [00:05<17:41,  1.98s/it]

0.34063577285420177


  1%|          | 4/540 [00:08<19:29,  2.18s/it]

0.3406357728542018


  1%|          | 5/540 [00:09<18:05,  2.03s/it]

0.34063577285420177


  1%|          | 6/540 [00:12<19:37,  2.21s/it]

0.30386032346952707


  1%|▏         | 7/540 [00:14<19:21,  2.18s/it]

0.30386032346952707


  1%|▏         | 8/540 [00:16<18:38,  2.10s/it]

0.33440397034412667


  2%|▏         | 9/540 [00:19<21:18,  2.41s/it]

0.33440397034412667


  2%|▏         | 10/540 [00:21<19:56,  2.26s/it]

0.33440397034412667


  2%|▏         | 11/540 [00:23<18:23,  2.09s/it]

0.30386032346952707


  2%|▏         | 12/540 [00:24<17:48,  2.02s/it]

0.30386032346952707


  2%|▏         | 13/540 [00:27<18:57,  2.16s/it]

0.30386032346952707


  3%|▎         | 14/540 [00:30<20:07,  2.30s/it]

0.30386032346952707


  3%|▎         | 15/540 [00:32<21:16,  2.43s/it]

0.30386032346952707


  3%|▎         | 16/540 [00:34<20:07,  2.30s/it]

0.30386032346952707


  3%|▎         | 17/540 [00:36<17:52,  2.05s/it]

0.30386032346952707


  3%|▎         | 18/540 [00:38<19:19,  2.22s/it]

0.30386032346952707


  4%|▎         | 19/540 [00:40<18:52,  2.17s/it]

0.30386032346952707


  4%|▎         | 20/540 [00:42<17:28,  2.02s/it]

0.30386032346952707


  4%|▍         | 21/540 [00:44<18:04,  2.09s/it]

0.30386032346952707


  4%|▍         | 22/540 [00:46<17:55,  2.08s/it]

0.30386032346952707


  4%|▍         | 23/540 [00:48<17:44,  2.06s/it]

0.30386032346952707


  4%|▍         | 24/540 [00:50<17:47,  2.07s/it]

0.30386032346952707


  5%|▍         | 25/540 [00:55<23:59,  2.79s/it]

0.30386032346952707


  5%|▍         | 26/540 [00:58<24:32,  2.86s/it]

0.2726553662373091


  5%|▌         | 27/540 [01:00<22:50,  2.67s/it]

0.3193429110690413


  5%|▌         | 28/540 [01:02<19:48,  2.32s/it]

0.5967311479119969


  5%|▌         | 29/540 [01:04<19:35,  2.30s/it]

0.5967311479119969


  6%|▌         | 30/540 [01:05<17:20,  2.04s/it]

0.5967311479119969


  6%|▌         | 31/540 [01:07<15:52,  1.87s/it]

0.5887779023010141


  6%|▌         | 32/540 [01:09<15:55,  1.88s/it]

0.7047901302945764


  6%|▌         | 33/540 [01:10<15:23,  1.82s/it]

0.679401223981816


  6%|▋         | 34/540 [01:13<16:14,  1.93s/it]

0.6708016728935183


  6%|▋         | 35/540 [01:14<15:36,  1.85s/it]

0.679401223981816


  7%|▋         | 36/540 [01:16<15:07,  1.80s/it]

0.664161714249328


  7%|▋         | 37/540 [01:18<14:30,  1.73s/it]

0.6686529644202251


  7%|▋         | 38/540 [01:20<15:48,  1.89s/it]

0.6682038898655186


  7%|▋         | 39/540 [01:21<14:53,  1.78s/it]

0.6682038898655186


  7%|▋         | 40/540 [01:23<14:50,  1.78s/it]

0.6686529644202251


  8%|▊         | 41/540 [01:25<15:48,  1.90s/it]

0.6813479344854354


  8%|▊         | 42/540 [01:27<14:10,  1.71s/it]

0.6708493717944214


  8%|▊         | 43/540 [01:30<17:27,  2.11s/it]

0.6708493717944214


  8%|▊         | 44/540 [01:32<17:07,  2.07s/it]

0.6708493717944212


  8%|▊         | 45/540 [01:33<16:26,  1.99s/it]

0.6708493717944214


  9%|▊         | 46/540 [01:35<15:13,  1.85s/it]

0.7053012058661957


  9%|▊         | 47/540 [01:36<14:10,  1.72s/it]

0.697747793879724


  9%|▉         | 48/540 [01:38<14:13,  1.73s/it]

0.6835656953743094


  9%|▉         | 49/540 [01:42<18:19,  2.24s/it]

0.6974903732884467


  9%|▉         | 50/540 [01:44<18:54,  2.31s/it]

0.697747793879724


  9%|▉         | 51/540 [01:47<21:10,  2.60s/it]

0.664161714249328


 10%|▉         | 52/540 [01:50<20:41,  2.54s/it]

0.6686529644202253


 10%|▉         | 53/540 [01:52<19:40,  2.42s/it]

0.6682038898655186


 10%|█         | 54/540 [01:54<19:07,  2.36s/it]

0.6682038898655186


 10%|█         | 55/540 [01:56<18:00,  2.23s/it]

0.6686529644202251


 10%|█         | 56/540 [01:59<19:32,  2.42s/it]

0.6851500873598259


 11%|█         | 57/540 [02:01<19:26,  2.41s/it]

0.6857527378529013


 11%|█         | 58/540 [02:03<17:14,  2.15s/it]

0.6814233124039847


 11%|█         | 59/540 [02:05<18:25,  2.30s/it]

0.5914868036002169


 11%|█         | 60/540 [02:07<17:08,  2.14s/it]

0.6857527378529013


 11%|█▏        | 61/540 [02:09<17:12,  2.16s/it]

0.6242084737946515


 11%|█▏        | 62/540 [02:11<16:35,  2.08s/it]

0.6434184197017379


 12%|█▏        | 63/540 [02:13<15:25,  1.94s/it]

0.6530946518603606


 12%|█▏        | 64/540 [02:15<15:26,  1.95s/it]

0.6678892815852557


 12%|█▏        | 65/540 [02:17<15:40,  1.98s/it]

0.6434184197017379


 12%|█▏        | 66/540 [02:20<17:09,  2.17s/it]

0.5979287085354732


 12%|█▏        | 67/540 [02:21<16:14,  2.06s/it]

0.620061808192947


 13%|█▎        | 68/540 [02:23<16:13,  2.06s/it]

0.6460978594570839


 13%|█▎        | 69/540 [02:26<16:50,  2.14s/it]

0.6403871936031056


 13%|█▎        | 70/540 [02:28<16:35,  2.12s/it]

0.6198292327140777


 13%|█▎        | 71/540 [02:30<16:20,  2.09s/it]

0.6483679426803092


 13%|█▎        | 72/540 [02:32<15:21,  1.97s/it]

0.6465863937452808


 14%|█▎        | 73/540 [02:34<16:41,  2.14s/it]

0.6478208258053957


 14%|█▎        | 74/540 [02:36<16:58,  2.19s/it]

0.6443662274045527


 14%|█▍        | 75/540 [02:38<16:30,  2.13s/it]

0.6465863937452808


 14%|█▍        | 76/540 [02:40<15:44,  2.04s/it]

0.6651167220061042


 14%|█▍        | 77/540 [02:42<15:50,  2.05s/it]

0.6596858524986483


 14%|█▍        | 78/540 [02:44<16:05,  2.09s/it]

0.6484274949356545


 15%|█▍        | 79/540 [02:47<16:17,  2.12s/it]

0.6528702719718101


 15%|█▍        | 80/540 [02:48<15:00,  1.96s/it]

0.6611187521089659


 15%|█▌        | 81/540 [02:50<15:20,  2.01s/it]

0.6246776841179752


 15%|█▌        | 82/540 [02:52<15:13,  1.99s/it]

0.620061808192947


 15%|█▌        | 83/540 [02:55<15:57,  2.09s/it]

0.6454049372930827


 16%|█▌        | 84/540 [02:56<15:08,  1.99s/it]

0.6396994499460098


 16%|█▌        | 85/540 [02:58<14:40,  1.94s/it]

0.620061808192947


 16%|█▌        | 86/540 [03:00<14:23,  1.90s/it]

0.6311062582340323


 16%|█▌        | 87/540 [03:02<13:48,  1.83s/it]

0.5947549519424125


 16%|█▋        | 88/540 [03:05<16:04,  2.13s/it]

0.6209192320876674


 16%|█▋        | 89/540 [03:07<16:48,  2.24s/it]

0.5306486999432262


 17%|█▋        | 90/540 [03:09<16:40,  2.22s/it]

0.6224926126714229


 17%|█▋        | 91/540 [03:12<16:56,  2.26s/it]

0.5760110570516798


 17%|█▋        | 92/540 [03:13<16:00,  2.14s/it]

0.5342050960770408


 17%|█▋        | 93/540 [03:16<15:54,  2.13s/it]

0.39050546823854837


 17%|█▋        | 94/540 [03:17<14:37,  1.97s/it]

0.4251303136620542


 18%|█▊        | 95/540 [03:18<13:04,  1.76s/it]

0.5515914705307129


 18%|█▊        | 96/540 [03:20<13:18,  1.80s/it]

0.5981221498962916


 18%|█▊        | 97/540 [03:22<12:42,  1.72s/it]

0.591408952426595


 18%|█▊        | 98/540 [03:24<13:24,  1.82s/it]

0.3877717748149986


 18%|█▊        | 99/540 [03:25<12:40,  1.72s/it]

0.4059907412430233


 19%|█▊        | 100/540 [03:27<12:15,  1.67s/it]

0.6096898328429725


 19%|█▊        | 101/540 [03:29<13:30,  1.85s/it]

0.5784813747753317


 19%|█▉        | 102/540 [03:31<12:53,  1.77s/it]

0.5748012661823161


 19%|█▉        | 103/540 [03:32<12:42,  1.74s/it]

0.4921739266807691


 19%|█▉        | 104/540 [03:35<13:31,  1.86s/it]

0.40405956567857393


 19%|█▉        | 105/540 [03:37<13:49,  1.91s/it]

0.5802484378759061


 20%|█▉        | 106/540 [03:39<14:33,  2.01s/it]

0.595171749757835


 20%|█▉        | 107/540 [03:41<15:22,  2.13s/it]

0.602010655825791


 20%|██        | 108/540 [03:44<16:05,  2.24s/it]

0.6080383368316444


 20%|██        | 109/540 [03:45<14:43,  2.05s/it]

0.5908589018995934


 20%|██        | 110/540 [03:48<16:01,  2.24s/it]

0.5736380588213956


 21%|██        | 111/540 [03:50<14:53,  2.08s/it]

0.5810796130839002


 21%|██        | 112/540 [03:52<14:13,  1.99s/it]

0.5804005884866086


 21%|██        | 113/540 [03:54<14:40,  2.06s/it]

0.40109405510377966


 21%|██        | 114/540 [03:55<13:21,  1.88s/it]

0.3741036309579517


 21%|██▏       | 115/540 [03:57<13:21,  1.88s/it]

0.5775980220226854


 21%|██▏       | 116/540 [03:59<13:48,  1.95s/it]

0.5369315548018426


 22%|██▏       | 117/540 [04:01<14:18,  2.03s/it]

0.6892780674006713


 22%|██▏       | 118/540 [04:03<13:56,  1.98s/it]

0.6797413299557162


 22%|██▏       | 119/540 [04:05<13:44,  1.96s/it]

0.5584653223214479


 22%|██▏       | 120/540 [04:07<12:31,  1.79s/it]

0.5865468730823871


 22%|██▏       | 121/540 [04:08<12:00,  1.72s/it]

0.5132487866643677


 23%|██▎       | 122/540 [04:10<11:44,  1.68s/it]

0.5886555492636464


 23%|██▎       | 123/540 [04:11<11:22,  1.64s/it]

0.5500426734249969


 23%|██▎       | 124/540 [04:14<12:35,  1.82s/it]

0.5623747793558972


 23%|██▎       | 125/540 [04:15<11:57,  1.73s/it]

0.49899736104869413


 23%|██▎       | 126/540 [04:17<11:43,  1.70s/it]

0.624687334035754


 24%|██▎       | 127/540 [04:18<11:44,  1.71s/it]

0.4746302303709066


 24%|██▎       | 128/540 [04:20<11:54,  1.74s/it]

0.46214710255617847


 24%|██▍       | 129/540 [04:22<11:59,  1.75s/it]

0.44874469129397226


 24%|██▍       | 130/540 [04:25<13:37,  1.99s/it]

0.5933626709191275


 24%|██▍       | 131/540 [04:26<13:27,  1.97s/it]

0.48397354547842425


 24%|██▍       | 132/540 [04:28<13:19,  1.96s/it]

0.5056402825458107


 25%|██▍       | 133/540 [04:30<12:56,  1.91s/it]

0.37342563743653456


 25%|██▍       | 134/540 [04:33<14:43,  2.18s/it]

0.40752744950843817


 25%|██▌       | 135/540 [04:35<14:01,  2.08s/it]

0.4685418635735838


 25%|██▌       | 136/540 [04:37<13:56,  2.07s/it]

0.5436339170195834


 25%|██▌       | 137/540 [04:39<13:02,  1.94s/it]

0.581277639483648


 26%|██▌       | 138/540 [04:40<12:22,  1.85s/it]

0.5541076875594925


 26%|██▌       | 139/540 [04:42<11:54,  1.78s/it]

0.5520327020501866


 26%|██▌       | 140/540 [04:44<11:56,  1.79s/it]

0.5409367911521433


 26%|██▌       | 141/540 [04:45<11:56,  1.80s/it]

0.5850829531619001


 26%|██▋       | 142/540 [04:47<12:19,  1.86s/it]

0.5203659528806872


 26%|██▋       | 143/540 [04:50<13:14,  2.00s/it]

0.46768868045254375


 27%|██▋       | 144/540 [04:51<12:31,  1.90s/it]

0.5525252980871481


 27%|██▋       | 145/540 [04:54<13:03,  1.98s/it]

0.5597837127472782


 27%|██▋       | 146/540 [04:55<12:43,  1.94s/it]

0.6409203516796791


 27%|██▋       | 147/540 [04:57<11:54,  1.82s/it]

0.6454195642001445


 27%|██▋       | 148/540 [04:59<11:45,  1.80s/it]

0.6309594062680658


 28%|██▊       | 149/540 [05:01<11:46,  1.81s/it]

0.5332393468173908


 28%|██▊       | 150/540 [05:02<10:56,  1.68s/it]

0.6119363843836907


 28%|██▊       | 151/540 [05:04<10:44,  1.66s/it]

0.5770583322439115


 28%|██▊       | 152/540 [05:05<10:33,  1.63s/it]

0.5857296950120426


 28%|██▊       | 153/540 [05:07<10:54,  1.69s/it]

0.49467479802366715


 29%|██▊       | 154/540 [05:08<10:29,  1.63s/it]

0.48997757176913126


 29%|██▊       | 155/540 [05:10<10:12,  1.59s/it]

0.5984302235993514


 29%|██▉       | 156/540 [05:12<10:32,  1.65s/it]

0.5619733552551806


 29%|██▉       | 157/540 [05:13<10:05,  1.58s/it]

0.5749597314362765


 29%|██▉       | 158/540 [05:15<10:33,  1.66s/it]

0.4831179374704376


 29%|██▉       | 159/540 [05:17<11:16,  1.78s/it]

0.49967486870489697


 30%|██▉       | 160/540 [05:19<11:11,  1.77s/it]

0.5656217160058711


 30%|██▉       | 161/540 [05:20<11:07,  1.76s/it]

0.5778113509410744


 30%|███       | 162/540 [05:22<11:06,  1.76s/it]

0.5809477778931696


 30%|███       | 163/540 [05:24<11:24,  1.82s/it]

0.479613479810262


 30%|███       | 164/540 [05:27<12:55,  2.06s/it]

0.4758704103138466


 31%|███       | 165/540 [05:29<12:20,  1.98s/it]

0.5770888533884139


 31%|███       | 166/540 [05:30<12:04,  1.94s/it]

0.5807189204556676


 31%|███       | 167/540 [05:32<11:16,  1.81s/it]

0.5816150396942934


 31%|███       | 168/540 [05:34<11:24,  1.84s/it]

0.5509756937562896


 31%|███▏      | 169/540 [05:36<11:42,  1.89s/it]

0.5236257196560743


 31%|███▏      | 170/540 [05:38<11:09,  1.81s/it]

0.5687601585964613


 32%|███▏      | 171/540 [05:39<10:46,  1.75s/it]

0.609310742386582


 32%|███▏      | 172/540 [05:41<10:55,  1.78s/it]

0.6255919970632516


 32%|███▏      | 173/540 [05:43<11:22,  1.86s/it]

0.4743540090835241


 32%|███▏      | 174/540 [05:45<11:07,  1.82s/it]

0.4889238425128962


 32%|███▏      | 175/540 [05:46<10:55,  1.80s/it]

0.6071645234760122


 33%|███▎      | 176/540 [05:48<10:53,  1.80s/it]

0.5858312825651514


 33%|███▎      | 177/540 [05:50<11:12,  1.85s/it]

0.6022325705968525


 33%|███▎      | 178/540 [05:52<10:29,  1.74s/it]

0.4497292190885391


 33%|███▎      | 179/540 [05:53<10:24,  1.73s/it]

0.5082990921954139


 33%|███▎      | 180/540 [05:55<09:58,  1.66s/it]

0.6047325195401655


 34%|███▎      | 181/540 [05:57<10:12,  1.71s/it]

0.5917465030279938


 34%|███▎      | 182/540 [05:58<09:38,  1.62s/it]

0.5737986824782215


 34%|███▍      | 183/540 [06:00<10:09,  1.71s/it]

0.6837542719563005


 34%|███▍      | 184/540 [06:02<09:58,  1.68s/it]

0.6435705065856357


 34%|███▍      | 185/540 [06:04<10:17,  1.74s/it]

0.5778339185069823


 34%|███▍      | 186/540 [06:05<10:18,  1.75s/it]

0.5923426892153587


 35%|███▍      | 187/540 [06:07<10:23,  1.77s/it]

0.6018173052987592


 35%|███▍      | 188/540 [06:09<10:26,  1.78s/it]

0.5917325504878835


 35%|███▌      | 189/540 [06:11<10:35,  1.81s/it]

0.5722355319485541


 35%|███▌      | 190/540 [06:13<10:45,  1.85s/it]

0.5988024060854131


 35%|███▌      | 191/540 [06:15<10:31,  1.81s/it]

0.5925461893332212


 36%|███▌      | 192/540 [06:16<09:57,  1.72s/it]

0.5855726523232065


 36%|███▌      | 193/540 [06:18<09:58,  1.73s/it]

0.5559214054249793


 36%|███▌      | 194/540 [06:20<10:15,  1.78s/it]

0.43817036389750263


 36%|███▌      | 195/540 [06:22<10:25,  1.81s/it]

0.6000859884138727


 36%|███▋      | 196/540 [06:24<10:38,  1.86s/it]

0.657540848913465


 36%|███▋      | 197/540 [06:26<10:55,  1.91s/it]

0.6102812920865242


 37%|███▋      | 198/540 [06:27<10:14,  1.80s/it]

0.5334172452096485


 37%|███▋      | 199/540 [06:29<10:36,  1.87s/it]

0.5115014768485038


 37%|███▋      | 200/540 [06:31<10:53,  1.92s/it]

0.6000772258454166


 37%|███▋      | 201/540 [06:33<10:29,  1.86s/it]

0.5747084434273799


 37%|███▋      | 202/540 [06:35<10:28,  1.86s/it]

0.5806436657732845


 38%|███▊      | 203/540 [06:36<09:52,  1.76s/it]

0.5724527088888464


 38%|███▊      | 204/540 [06:38<09:45,  1.74s/it]

0.6213076218716214


 38%|███▊      | 205/540 [06:40<09:37,  1.72s/it]

0.5758184434123026


 38%|███▊      | 206/540 [06:41<09:08,  1.64s/it]

0.5733362135597195


 38%|███▊      | 207/540 [06:43<08:44,  1.58s/it]

0.5550459589218315


 39%|███▊      | 208/540 [06:44<08:42,  1.57s/it]

0.5379420585851772


 39%|███▊      | 209/540 [06:46<08:35,  1.56s/it]

0.5406890346882987


 39%|███▉      | 210/540 [06:47<08:37,  1.57s/it]

0.5846603592181826


 39%|███▉      | 211/540 [06:48<08:05,  1.48s/it]

0.571290524583376


 39%|███▉      | 212/540 [06:50<08:43,  1.60s/it]

0.5469842415125015


 39%|███▉      | 213/540 [06:53<10:22,  1.90s/it]

0.5760196308297112


 40%|███▉      | 214/540 [06:55<09:58,  1.84s/it]

0.563179680189994


 40%|███▉      | 215/540 [06:56<09:38,  1.78s/it]

0.5562000174846514


 40%|████      | 216/540 [06:58<10:11,  1.89s/it]

0.5659721950521469


 40%|████      | 217/540 [07:00<09:51,  1.83s/it]

0.5205787015347396


 40%|████      | 218/540 [07:02<09:47,  1.82s/it]

0.4293627428221074


 41%|████      | 219/540 [07:04<09:34,  1.79s/it]

0.45236690782405525


 41%|████      | 220/540 [07:06<09:48,  1.84s/it]

0.5513425994936806


 41%|████      | 221/540 [07:08<10:05,  1.90s/it]

0.5632322346006182


 41%|████      | 222/540 [07:09<09:53,  1.87s/it]

0.5675651736715364


 41%|████▏     | 223/540 [07:11<09:33,  1.81s/it]

0.43826735813783363


 41%|████▏     | 224/540 [07:13<10:04,  1.91s/it]

0.42129313139106384


 42%|████▏     | 225/540 [07:15<10:15,  1.95s/it]

0.5574012161902759


 42%|████▏     | 226/540 [07:17<10:31,  2.01s/it]

0.6309658149150006


 42%|████▏     | 227/540 [07:19<09:53,  1.90s/it]

0.559055283849756


 42%|████▏     | 228/540 [07:21<10:05,  1.94s/it]

0.499568008083273


 42%|████▏     | 229/540 [07:23<09:57,  1.92s/it]

0.45329573232264625


 43%|████▎     | 230/540 [07:25<09:35,  1.86s/it]

0.5997003396403877


 43%|████▎     | 231/540 [07:26<09:01,  1.75s/it]

0.5606776614826656


 43%|████▎     | 232/540 [07:28<08:45,  1.71s/it]

0.4536091179837437


 43%|████▎     | 233/540 [07:29<08:38,  1.69s/it]

0.5307924030447843


 43%|████▎     | 234/540 [07:31<08:35,  1.68s/it]

0.567537478551206


 44%|████▎     | 235/540 [07:33<09:12,  1.81s/it]

0.537364491320701


 44%|████▎     | 236/540 [07:35<08:49,  1.74s/it]

0.5772688281511474


 44%|████▍     | 237/540 [07:36<08:38,  1.71s/it]

0.5362635543327716


 44%|████▍     | 238/540 [07:39<09:09,  1.82s/it]

0.5518597661986766


 44%|████▍     | 239/540 [07:40<09:01,  1.80s/it]

0.48862076015577977


 44%|████▍     | 240/540 [07:42<09:26,  1.89s/it]

0.5658081108691143


 45%|████▍     | 241/540 [07:44<09:14,  1.86s/it]

0.5702802266285816


 45%|████▍     | 242/540 [07:46<09:14,  1.86s/it]

0.5609646161345063


 45%|████▌     | 243/540 [07:48<09:08,  1.85s/it]

0.49853367420456907


 45%|████▌     | 244/540 [07:50<10:05,  2.05s/it]

0.40862301851446736


 45%|████▌     | 245/540 [07:53<10:12,  2.08s/it]

0.5684047785313063


 46%|████▌     | 246/540 [07:55<10:06,  2.06s/it]

0.5519752050669886


 46%|████▌     | 247/540 [07:57<09:59,  2.05s/it]

0.5579529976227005


 46%|████▌     | 248/540 [07:59<10:05,  2.07s/it]

0.47715415034051356


 46%|████▌     | 249/540 [08:01<10:19,  2.13s/it]

0.4769258001718897


 46%|████▋     | 250/540 [08:03<10:19,  2.14s/it]

0.5626221873014721


 46%|████▋     | 251/540 [08:05<10:10,  2.11s/it]

0.5685589407841249


 47%|████▋     | 252/540 [08:07<10:27,  2.18s/it]

0.5314500576909869


 47%|████▋     | 253/540 [08:09<10:02,  2.10s/it]

0.521279132951358


 47%|████▋     | 254/540 [08:11<09:56,  2.08s/it]

0.45408142051245975


 47%|████▋     | 255/540 [08:13<09:27,  1.99s/it]

0.5563320735900465


 47%|████▋     | 256/540 [08:15<08:56,  1.89s/it]

0.6059889896439439


 48%|████▊     | 257/540 [08:17<09:14,  1.96s/it]

0.5169447595580559


 48%|████▊     | 258/540 [08:19<09:04,  1.93s/it]

0.4757991925170971


 48%|████▊     | 259/540 [08:21<08:46,  1.87s/it]

0.45910762055608395


 48%|████▊     | 260/540 [08:23<09:04,  1.94s/it]

0.5720012932765977


 48%|████▊     | 261/540 [08:25<09:05,  1.96s/it]

0.5123901698478546


 49%|████▊     | 262/540 [08:26<08:49,  1.91s/it]

0.5699193774529556


 49%|████▊     | 263/540 [08:28<08:37,  1.87s/it]

0.4689465339638873


 49%|████▉     | 264/540 [08:31<09:13,  2.01s/it]

0.49322040161672287


 49%|████▉     | 265/540 [08:32<08:31,  1.86s/it]

0.5319977888697602


 49%|████▉     | 266/540 [08:34<08:53,  1.95s/it]

0.5384813554160194


 49%|████▉     | 267/540 [08:36<08:05,  1.78s/it]

0.5171340496415057


 50%|████▉     | 268/540 [08:37<08:02,  1.77s/it]

0.5646913800160058


 50%|████▉     | 269/540 [08:39<08:14,  1.83s/it]

0.5434336615219386


 50%|█████     | 270/540 [08:42<09:02,  2.01s/it]

0.535413327699327


 50%|█████     | 271/540 [08:44<09:30,  2.12s/it]

0.33440397034412667


 50%|█████     | 272/540 [08:47<09:50,  2.20s/it]

0.3344039703441267


 51%|█████     | 273/540 [08:49<09:59,  2.25s/it]

0.34063577285420177


 51%|█████     | 274/540 [08:51<09:10,  2.07s/it]

0.3406357728542018


 51%|█████     | 275/540 [08:53<09:45,  2.21s/it]

0.34063577285420177


 51%|█████     | 276/540 [08:55<08:51,  2.01s/it]

0.30386032346952707


 51%|█████▏    | 277/540 [08:57<09:26,  2.15s/it]

0.30386032346952707


 51%|█████▏    | 278/540 [09:00<09:40,  2.22s/it]

0.33440397034412667


 52%|█████▏    | 279/540 [09:02<09:47,  2.25s/it]

0.33440397034412667


 52%|█████▏    | 280/540 [09:04<09:27,  2.18s/it]

0.33440397034412667


 52%|█████▏    | 281/540 [09:06<09:51,  2.28s/it]

0.30386032346952707


 52%|█████▏    | 282/540 [09:08<09:05,  2.11s/it]

0.30386032346952707


 52%|█████▏    | 283/540 [09:10<08:23,  1.96s/it]

0.30386032346952707


 53%|█████▎    | 284/540 [09:12<09:04,  2.13s/it]

0.30386032346952707


 53%|█████▎    | 285/540 [09:15<09:47,  2.30s/it]

0.30386032346952707


 53%|█████▎    | 286/540 [09:17<09:35,  2.27s/it]

0.30386032346952707


 53%|█████▎    | 287/540 [09:20<09:42,  2.30s/it]

0.30386032346952707


 53%|█████▎    | 288/540 [09:22<10:00,  2.38s/it]

0.30386032346952707


 54%|█████▎    | 289/540 [09:24<09:04,  2.17s/it]

0.30386032346952707


 54%|█████▎    | 290/540 [09:26<09:23,  2.25s/it]

0.30386032346952707


 54%|█████▍    | 291/540 [09:28<08:40,  2.09s/it]

0.30386032346952707


 54%|█████▍    | 292/540 [09:30<08:09,  1.97s/it]

0.30386032346952707


 54%|█████▍    | 293/540 [09:32<08:02,  1.95s/it]

0.30386032346952707


 54%|█████▍    | 294/540 [09:33<07:40,  1.87s/it]

0.30386032346952707


 55%|█████▍    | 295/540 [09:36<08:15,  2.02s/it]

0.30386032346952707


 55%|█████▍    | 296/540 [09:39<09:46,  2.41s/it]

0.2726553662373091


 55%|█████▌    | 297/540 [09:40<08:45,  2.16s/it]

0.3193429110690413


 55%|█████▌    | 298/540 [09:42<07:55,  1.96s/it]

0.5967311479119969


 55%|█████▌    | 299/540 [09:45<08:53,  2.21s/it]

0.5967311479119969


 56%|█████▌    | 300/540 [09:46<08:12,  2.05s/it]

0.5967311479119969


 56%|█████▌    | 301/540 [09:48<07:56,  1.99s/it]

0.5887779023010141


 56%|█████▌    | 302/540 [09:50<07:18,  1.84s/it]

0.679401223981816


 56%|█████▌    | 303/540 [09:53<08:42,  2.20s/it]

0.679401223981816


 56%|█████▋    | 304/540 [09:55<08:02,  2.04s/it]

0.6708016728935183


 56%|█████▋    | 305/540 [09:56<07:21,  1.88s/it]

0.679401223981816


 57%|█████▋    | 306/540 [09:59<08:05,  2.07s/it]

0.664161714249328


 57%|█████▋    | 307/540 [10:01<08:13,  2.12s/it]

0.6686529644202251


 57%|█████▋    | 308/540 [10:03<08:15,  2.14s/it]

0.6682038898655186


 57%|█████▋    | 309/540 [10:05<08:34,  2.23s/it]

0.6682038898655186


 57%|█████▋    | 310/540 [10:07<07:47,  2.03s/it]

0.6686529644202251


 58%|█████▊    | 311/540 [10:09<08:05,  2.12s/it]

0.6813479344854354


 58%|█████▊    | 312/540 [10:11<07:58,  2.10s/it]

0.6708493717944214


 58%|█████▊    | 313/540 [10:13<07:56,  2.10s/it]

0.6708493717944214


 58%|█████▊    | 314/540 [10:16<08:04,  2.14s/it]

0.6708493717944212


 58%|█████▊    | 315/540 [10:18<08:29,  2.26s/it]

0.6708493717944214


 59%|█████▊    | 316/540 [10:21<08:47,  2.35s/it]

0.7053012058661957


 59%|█████▊    | 317/540 [10:24<09:56,  2.68s/it]

0.697747793879724


 59%|█████▉    | 318/540 [10:26<09:04,  2.45s/it]

0.6835656953743094


 59%|█████▉    | 319/540 [10:28<08:31,  2.32s/it]

0.6974903732884467


 59%|█████▉    | 320/540 [10:30<08:26,  2.30s/it]

0.697747793879724


 59%|█████▉    | 321/540 [10:33<08:33,  2.35s/it]

0.664161714249328


 60%|█████▉    | 322/540 [10:36<09:01,  2.48s/it]

0.6686529644202253


 60%|█████▉    | 323/540 [10:37<08:17,  2.29s/it]

0.6682038898655186


 60%|██████    | 324/540 [10:40<07:58,  2.22s/it]

0.6682038898655186


 60%|██████    | 325/540 [10:41<07:03,  1.97s/it]

0.6686529644202251


 60%|██████    | 326/540 [10:42<06:25,  1.80s/it]

0.6851500873598259


 61%|██████    | 327/540 [10:45<07:02,  1.98s/it]

0.6857527378529013


 61%|██████    | 328/540 [10:47<07:33,  2.14s/it]

0.6814233124039847


 61%|██████    | 329/540 [10:49<07:15,  2.06s/it]

0.5914868036002169


 61%|██████    | 330/540 [10:52<07:42,  2.20s/it]

0.6857527378529013


 61%|██████▏   | 331/540 [10:53<07:10,  2.06s/it]

0.6242084737946515


 61%|██████▏   | 332/540 [10:54<06:06,  1.76s/it]

0.6434184197017379


 62%|██████▏   | 333/540 [10:56<06:16,  1.82s/it]

0.6530946518603606


 62%|██████▏   | 334/540 [10:59<06:36,  1.93s/it]

0.6678892815852557


 62%|██████▏   | 335/540 [11:00<06:16,  1.84s/it]

0.6434184197017379


 62%|██████▏   | 336/540 [11:02<06:34,  1.93s/it]

0.5979287085354732


 62%|██████▏   | 337/540 [11:05<06:56,  2.05s/it]

0.620061808192947


 63%|██████▎   | 338/540 [11:07<06:55,  2.06s/it]

0.6460978594570839


 63%|██████▎   | 339/540 [11:08<06:23,  1.91s/it]

0.6403871936031056


 63%|██████▎   | 340/540 [11:10<06:14,  1.87s/it]

0.6198292327140777


 63%|██████▎   | 341/540 [11:13<07:19,  2.21s/it]

0.6483679426803092


 63%|██████▎   | 342/540 [11:15<07:09,  2.17s/it]

0.6465863937452808


 64%|██████▎   | 343/540 [11:17<07:00,  2.14s/it]

0.6478208258053957


 64%|██████▎   | 344/540 [11:19<07:00,  2.15s/it]

0.6443662274045527


 64%|██████▍   | 345/540 [11:22<07:13,  2.22s/it]

0.6465863937452808


 64%|██████▍   | 346/540 [11:24<06:56,  2.15s/it]

0.6651167220061042


 64%|██████▍   | 347/540 [11:26<07:26,  2.32s/it]

0.6596858524986483


 64%|██████▍   | 348/540 [11:29<07:19,  2.29s/it]

0.6484274949356545


 65%|██████▍   | 349/540 [11:31<07:36,  2.39s/it]

0.6528702719718101


 65%|██████▍   | 350/540 [11:34<07:34,  2.39s/it]

0.6611187521089659


 65%|██████▌   | 351/540 [11:36<06:59,  2.22s/it]

0.6258902827152918


 65%|██████▌   | 352/540 [11:38<06:41,  2.14s/it]

0.620061808192947


 65%|██████▌   | 353/540 [11:40<06:53,  2.21s/it]

0.6454049372930827


 66%|██████▌   | 354/540 [11:42<06:30,  2.10s/it]

0.6396994499460098


 66%|██████▌   | 355/540 [11:44<06:46,  2.20s/it]

0.620061808192947


 66%|██████▌   | 356/540 [11:46<06:15,  2.04s/it]

0.6311062582340323


 66%|██████▌   | 357/540 [11:48<05:56,  1.95s/it]

0.5947549519424125


 66%|██████▋   | 358/540 [11:50<05:56,  1.96s/it]

0.6209192320876674


 66%|██████▋   | 359/540 [11:51<05:46,  1.91s/it]

0.5306486999432262


 67%|██████▋   | 360/540 [11:54<06:23,  2.13s/it]

0.6224926126714229


 67%|██████▋   | 361/540 [11:56<06:03,  2.03s/it]

0.5760110570516798


 67%|██████▋   | 362/540 [11:57<05:28,  1.85s/it]

0.5342050960770408


 67%|██████▋   | 363/540 [11:59<05:14,  1.78s/it]

0.39050546823854837


 67%|██████▋   | 364/540 [12:00<04:59,  1.70s/it]

0.4251303136620542


 68%|██████▊   | 365/540 [12:02<04:54,  1.68s/it]

0.5419762432502113


 68%|██████▊   | 366/540 [12:04<05:08,  1.77s/it]

0.5981221498962916


 68%|██████▊   | 367/540 [12:06<05:04,  1.76s/it]

0.591408952426595


 68%|██████▊   | 368/540 [12:08<05:35,  1.95s/it]

0.3877717748149986


 68%|██████▊   | 369/540 [12:10<05:39,  1.99s/it]

0.4059907412430233


 69%|██████▊   | 370/540 [12:12<05:27,  1.92s/it]

0.6096898328429725


 69%|██████▊   | 371/540 [12:14<05:15,  1.87s/it]

0.5784813747753317


 69%|██████▉   | 372/540 [12:15<04:54,  1.75s/it]

0.5748012661823161


 69%|██████▉   | 373/540 [12:17<04:51,  1.75s/it]

0.5115416041320393


 69%|██████▉   | 374/540 [12:19<05:07,  1.85s/it]

0.40405956567857393


 69%|██████▉   | 375/540 [12:21<05:13,  1.90s/it]

0.5802484378759061


 70%|██████▉   | 376/540 [12:23<05:32,  2.02s/it]

0.595171749757835


 70%|██████▉   | 377/540 [12:26<05:41,  2.10s/it]

0.6020106558257909


 70%|███████   | 378/540 [12:27<05:30,  2.04s/it]

0.6080383368316444


 70%|███████   | 379/540 [12:29<05:22,  2.00s/it]

0.5908589018995934


 70%|███████   | 380/540 [12:31<05:20,  2.01s/it]

0.5736380588213956


 71%|███████   | 381/540 [12:33<04:57,  1.87s/it]

0.5810796130839002


 71%|███████   | 382/540 [12:35<04:50,  1.84s/it]

0.5804005884866086


 71%|███████   | 383/540 [12:36<04:43,  1.80s/it]

0.40109405510377966


 71%|███████   | 384/540 [12:38<04:23,  1.69s/it]

0.3741036309579517


 71%|███████▏  | 385/540 [12:39<04:14,  1.64s/it]

0.5775980220226854


 71%|███████▏  | 386/540 [12:41<04:13,  1.65s/it]

0.5369315548018426


 72%|███████▏  | 387/540 [12:44<04:59,  1.96s/it]

0.6892780674006713


 72%|███████▏  | 388/540 [12:45<04:45,  1.88s/it]

0.6797413299557162


 72%|███████▏  | 389/540 [12:48<04:55,  1.96s/it]

0.5584653223214479


 72%|███████▏  | 390/540 [12:49<04:37,  1.85s/it]

0.5865468730823871


 72%|███████▏  | 391/540 [12:51<04:26,  1.79s/it]

0.5132487866643677


 73%|███████▎  | 392/540 [12:52<04:02,  1.64s/it]

0.5886555492636464


 73%|███████▎  | 393/540 [12:54<04:00,  1.64s/it]

0.5500426734249969


 73%|███████▎  | 394/540 [12:55<03:50,  1.58s/it]

0.5593610969460235


 73%|███████▎  | 395/540 [12:56<03:34,  1.48s/it]

0.49899736104869413


 73%|███████▎  | 396/540 [12:59<04:04,  1.70s/it]

0.624687334035754


 74%|███████▎  | 397/540 [13:01<04:14,  1.78s/it]

0.4746302303709067


 74%|███████▎  | 398/540 [13:02<04:07,  1.74s/it]

0.46214710255617847


 74%|███████▍  | 399/540 [13:04<04:05,  1.74s/it]

0.44643265010344707


 74%|███████▍  | 400/540 [13:06<04:05,  1.76s/it]

0.5933626709191275


 74%|███████▍  | 401/540 [13:07<03:49,  1.65s/it]

0.48397354547842425


 74%|███████▍  | 402/540 [13:09<03:54,  1.70s/it]

0.5056402825458107


 75%|███████▍  | 403/540 [13:11<04:03,  1.78s/it]

0.37342563743653456


 75%|███████▍  | 404/540 [13:13<04:03,  1.79s/it]

0.40752744950843817


 75%|███████▌  | 405/540 [13:15<04:13,  1.88s/it]

0.4685418635735838


 75%|███████▌  | 406/540 [13:17<04:15,  1.91s/it]

0.5436339170195834


 75%|███████▌  | 407/540 [13:20<04:44,  2.14s/it]

0.581277639483648


 76%|███████▌  | 408/540 [13:21<04:18,  1.96s/it]

0.5541076875594925


 76%|███████▌  | 409/540 [13:23<04:05,  1.88s/it]

0.554057334896541


 76%|███████▌  | 410/540 [13:24<03:47,  1.75s/it]

0.5409367911521433


 76%|███████▌  | 411/540 [13:26<03:28,  1.61s/it]

0.5850829531619001


 76%|███████▋  | 412/540 [13:27<03:27,  1.62s/it]

0.5203659528806872


 76%|███████▋  | 413/540 [13:29<03:26,  1.63s/it]

0.4677328775219716


 77%|███████▋  | 414/540 [13:31<03:33,  1.69s/it]

0.5525252980871481


 77%|███████▋  | 415/540 [13:32<03:17,  1.58s/it]

0.5597837127472782


 77%|███████▋  | 416/540 [13:34<03:14,  1.57s/it]

0.6409203516796791


 77%|███████▋  | 417/540 [13:35<03:22,  1.65s/it]

0.6454195642001445


 77%|███████▋  | 418/540 [13:37<03:31,  1.74s/it]

0.6309594062680658


 78%|███████▊  | 419/540 [13:40<03:53,  1.93s/it]

0.5332393468173908


 78%|███████▊  | 420/540 [13:42<03:49,  1.91s/it]

0.6119363843836907


 78%|███████▊  | 421/540 [13:43<03:43,  1.88s/it]

0.5770583322439115


 78%|███████▊  | 422/540 [13:45<03:31,  1.79s/it]

0.5857296950120426


 78%|███████▊  | 423/540 [13:47<03:29,  1.79s/it]

0.48527327526329433


 79%|███████▊  | 424/540 [13:49<03:34,  1.85s/it]

0.4917262168255217


 79%|███████▊  | 425/540 [13:51<03:30,  1.83s/it]

0.5962903016480691


 79%|███████▉  | 426/540 [13:52<03:28,  1.83s/it]

0.5619733552551806


 79%|███████▉  | 427/540 [13:54<03:32,  1.88s/it]

0.5749597314362765


 79%|███████▉  | 428/540 [13:56<03:37,  1.94s/it]

0.4831179374704376


 79%|███████▉  | 429/540 [13:59<03:49,  2.07s/it]

0.49967486870489697


 80%|███████▉  | 430/540 [14:00<03:31,  1.92s/it]

0.5798603080639296


 80%|███████▉  | 431/540 [14:02<03:19,  1.83s/it]

0.5778113509410744


 80%|████████  | 432/540 [14:04<03:26,  1.91s/it]

0.5809477778931696


 80%|████████  | 433/540 [14:06<03:25,  1.92s/it]

0.479613479810262


 80%|████████  | 434/540 [14:08<03:28,  1.97s/it]

0.4758704103138466


 81%|████████  | 435/540 [14:10<03:23,  1.94s/it]

0.5770888533884139


 81%|████████  | 436/540 [14:12<03:20,  1.93s/it]

0.5879598137580745


 81%|████████  | 437/540 [14:13<03:08,  1.83s/it]

0.5816150396942934


 81%|████████  | 438/540 [14:15<03:02,  1.79s/it]

0.5509756937562896


 81%|████████▏ | 439/540 [14:17<03:03,  1.82s/it]

0.5317928861152877


 81%|████████▏ | 440/540 [14:19<02:59,  1.79s/it]

0.5687601585964613


 82%|████████▏ | 441/540 [14:21<02:59,  1.82s/it]

0.5863919001180398


 82%|████████▏ | 442/540 [14:22<02:57,  1.81s/it]

0.6255919970632516


 82%|████████▏ | 443/540 [14:25<03:09,  1.96s/it]

0.4743540090835241


 82%|████████▏ | 444/540 [14:27<03:01,  1.89s/it]

0.4919115696434556


 82%|████████▏ | 445/540 [14:28<02:57,  1.87s/it]

0.6071645234760122


 83%|████████▎ | 446/540 [14:30<02:54,  1.85s/it]

0.5834411216938195


 83%|████████▎ | 447/540 [14:32<02:59,  1.93s/it]

0.6038641723040057


 83%|████████▎ | 448/540 [14:34<02:40,  1.75s/it]

0.4497292190885391


 83%|████████▎ | 449/540 [14:36<02:45,  1.82s/it]

0.5082990921954139


 83%|████████▎ | 450/540 [14:37<02:43,  1.82s/it]

0.6047325195401655


 84%|████████▎ | 451/540 [14:39<02:37,  1.77s/it]

0.5917465030279938


 84%|████████▎ | 452/540 [14:41<02:29,  1.70s/it]

0.5737986824782215


 84%|████████▍ | 453/540 [14:43<02:44,  1.89s/it]

0.6746668966060422


 84%|████████▍ | 454/540 [14:45<02:41,  1.88s/it]

0.6435705065856357


 84%|████████▍ | 455/540 [14:47<02:37,  1.85s/it]

0.5778339185069822


 84%|████████▍ | 456/540 [14:48<02:32,  1.82s/it]

0.5923426892153587


 85%|████████▍ | 457/540 [14:50<02:31,  1.83s/it]

0.6018173052987592


 85%|████████▍ | 458/540 [14:52<02:34,  1.88s/it]

0.5917325504878835


 85%|████████▌ | 459/540 [14:54<02:28,  1.83s/it]

0.5722355319485541


 85%|████████▌ | 460/540 [14:56<02:37,  1.97s/it]

0.5988024060854131


 85%|████████▌ | 461/540 [14:58<02:36,  1.98s/it]

0.5925461893332212


 86%|████████▌ | 462/540 [15:00<02:25,  1.86s/it]

0.5855726523232065


 86%|████████▌ | 463/540 [15:02<02:27,  1.92s/it]

0.5542572769155925


 86%|████████▌ | 464/540 [15:04<02:27,  1.94s/it]

0.43817036389750263


 86%|████████▌ | 465/540 [15:06<02:23,  1.91s/it]

0.6000859884138727


 86%|████████▋ | 466/540 [15:07<02:15,  1.83s/it]

0.657540848913465


 86%|████████▋ | 467/540 [15:09<02:09,  1.78s/it]

0.6102812920865242


 87%|████████▋ | 468/540 [15:11<02:11,  1.83s/it]

0.5334172452096486


 87%|████████▋ | 469/540 [15:12<02:04,  1.76s/it]

0.5115014768485038


 87%|████████▋ | 470/540 [15:14<02:07,  1.83s/it]

0.6000772258454166


 87%|████████▋ | 471/540 [15:16<02:07,  1.84s/it]

0.5747084434273799


 87%|████████▋ | 472/540 [15:18<02:00,  1.78s/it]

0.5806436657732845


 88%|████████▊ | 473/540 [15:20<02:00,  1.80s/it]

0.5724527088888464


 88%|████████▊ | 474/540 [15:22<01:57,  1.78s/it]

0.5992304377199941


 88%|████████▊ | 475/540 [15:23<01:51,  1.72s/it]

0.5758184434123026


 88%|████████▊ | 476/540 [15:24<01:42,  1.61s/it]

0.5733362135597195


 88%|████████▊ | 477/540 [15:27<01:52,  1.78s/it]

0.5550459589218315


 89%|████████▊ | 478/540 [15:29<02:01,  1.95s/it]

0.5506179318472693


 89%|████████▊ | 479/540 [15:31<01:54,  1.88s/it]

0.5406890346882987


 89%|████████▉ | 480/540 [15:32<01:50,  1.83s/it]

0.5846603592181826


 89%|████████▉ | 481/540 [15:34<01:43,  1.76s/it]

0.571290524583376


 89%|████████▉ | 482/540 [15:36<01:42,  1.76s/it]

0.5469842415125015


 89%|████████▉ | 483/540 [15:38<01:40,  1.75s/it]

0.5743149675481327


 90%|████████▉ | 484/540 [15:39<01:38,  1.75s/it]

0.5584965349320856


 90%|████████▉ | 485/540 [15:41<01:31,  1.66s/it]

0.5562000174846514


 90%|█████████ | 486/540 [15:43<01:36,  1.78s/it]

0.5659721950521469


 90%|█████████ | 487/540 [15:45<01:38,  1.87s/it]

0.5203085757458388


 90%|█████████ | 488/540 [15:47<01:47,  2.07s/it]

0.4293627428221074


 91%|█████████ | 489/540 [15:50<01:46,  2.09s/it]

0.45236690782405525


 91%|█████████ | 490/540 [15:52<01:44,  2.10s/it]

0.5513425994936806


 91%|█████████ | 491/540 [15:53<01:32,  1.89s/it]

0.5632322346006182


 91%|█████████ | 492/540 [15:55<01:27,  1.81s/it]

0.5675651736715364


 91%|█████████▏| 493/540 [15:57<01:28,  1.88s/it]

0.43826735813783363


 91%|█████████▏| 494/540 [15:58<01:23,  1.81s/it]

0.42129313139106384


 92%|█████████▏| 495/540 [16:00<01:23,  1.85s/it]

0.5574012161902759


 92%|█████████▏| 496/540 [16:02<01:20,  1.82s/it]

0.6309658149150006


 92%|█████████▏| 497/540 [16:04<01:19,  1.84s/it]

0.559055283849756


 92%|█████████▏| 498/540 [16:06<01:14,  1.78s/it]

0.499568008083273


 92%|█████████▏| 499/540 [16:07<01:13,  1.79s/it]

0.45329573232264625


 93%|█████████▎| 500/540 [16:10<01:15,  1.90s/it]

0.5997003396403877


 93%|█████████▎| 501/540 [16:11<01:13,  1.88s/it]

0.5606776614826656


 93%|█████████▎| 502/540 [16:13<01:08,  1.81s/it]

0.4536091179837437


 93%|█████████▎| 503/540 [16:15<01:03,  1.72s/it]

0.5373769581769797


 93%|█████████▎| 504/540 [16:17<01:05,  1.83s/it]

0.567537478551206


 94%|█████████▎| 505/540 [16:18<01:02,  1.80s/it]

0.5373644913207011


 94%|█████████▎| 506/540 [16:20<00:59,  1.75s/it]

0.5772688281511474


 94%|█████████▍| 507/540 [16:22<00:56,  1.71s/it]

0.5362635543327716


 94%|█████████▍| 508/540 [16:23<00:52,  1.65s/it]

0.5518597661986766


 94%|█████████▍| 509/540 [16:25<00:53,  1.72s/it]

0.48862076015577977


 94%|█████████▍| 510/540 [16:27<00:51,  1.73s/it]

0.566422121027727


 95%|█████████▍| 511/540 [16:28<00:49,  1.69s/it]

0.5702802266285816


 95%|█████████▍| 512/540 [16:31<00:51,  1.84s/it]

0.5682588483701561


 95%|█████████▌| 513/540 [16:33<00:50,  1.88s/it]

0.49853367420456907


 95%|█████████▌| 514/540 [16:34<00:46,  1.79s/it]

0.40233744236119745


 95%|█████████▌| 515/540 [16:36<00:47,  1.90s/it]

0.5684047785313063


 96%|█████████▌| 516/540 [16:38<00:46,  1.92s/it]

0.5519752050669886


 96%|█████████▌| 517/540 [16:40<00:43,  1.91s/it]

0.5579529976227005


 96%|█████████▌| 518/540 [16:42<00:43,  1.96s/it]

0.47715415034051356


 96%|█████████▌| 519/540 [16:44<00:41,  1.96s/it]

0.4769258001718897


 96%|█████████▋| 520/540 [16:46<00:40,  2.04s/it]

0.5626221873014721


 96%|█████████▋| 521/540 [16:48<00:38,  2.02s/it]

0.5685589407841249


 97%|█████████▋| 522/540 [16:50<00:35,  1.95s/it]

0.5314500576909869


 97%|█████████▋| 523/540 [16:52<00:32,  1.94s/it]

0.521279132951358


 97%|█████████▋| 524/540 [16:54<00:31,  1.98s/it]

0.45408142051245975


 97%|█████████▋| 525/540 [16:56<00:30,  2.01s/it]

0.5563320735900465


 97%|█████████▋| 526/540 [16:58<00:28,  2.03s/it]

0.6059889896439439


 98%|█████████▊| 527/540 [17:00<00:25,  1.95s/it]

0.5169447595580559


 98%|█████████▊| 528/540 [17:02<00:23,  1.95s/it]

0.4757991925170971


 98%|█████████▊| 529/540 [17:04<00:21,  1.98s/it]

0.45910762055608395


 98%|█████████▊| 530/540 [17:06<00:18,  1.89s/it]

0.5710570105841994


 98%|█████████▊| 531/540 [17:07<00:16,  1.85s/it]

0.5123901698478546


 99%|█████████▊| 532/540 [17:09<00:14,  1.81s/it]

0.5699193774529556


 99%|█████████▊| 533/540 [17:11<00:12,  1.79s/it]

0.4689465339638873


 99%|█████████▉| 534/540 [17:13<00:11,  1.85s/it]

0.49433601607698463


 99%|█████████▉| 535/540 [17:15<00:09,  1.95s/it]

0.5241676751526284


 99%|█████████▉| 536/540 [17:17<00:07,  1.95s/it]

0.5376940096150454


 99%|█████████▉| 537/540 [17:19<00:05,  1.94s/it]

0.5171340496415057


100%|█████████▉| 538/540 [17:21<00:03,  1.87s/it]

0.5667476693916741


100%|█████████▉| 539/540 [17:23<00:01,  1.86s/it]

0.541978471327607


100%|██████████| 540/540 [17:24<00:00,  1.93s/it]

0.5354133276993271





**Building a finetuned LDA model**

In [41]:
# Build LDA model
lda_model = gensim.models.LdaMulticore(corpus=doc_term_matrix,
                                       id2word=dict_,
                                       num_topics=3, 
                                       random_state=100,
                                       chunksize=100,
                                       passes=10,
                                        alpha=0.91,
                                           eta=0.01,
                                       per_word_topics=True)


  and should_run_async(code)


In [42]:
# Compute Coherence Score
coherence_model_lda = CoherenceModel(model=lda_model, texts=clean_corpus, dictionary=dict_, coherence='c_v')
coherence_lda = coherence_model_lda.get_coherence()
print('Coherence Score: ', coherence_lda)

  and should_run_async(code)


Coherence Score:  0.7053012058661957


In [43]:
num_topics=3

import pyLDAvis.gensim
import pickle 
import pyLDAvis

# Visualize the topics
pyLDAvis.enable_notebook()

LDAvis_data_filepath = os.path.join(sys.path[0],'results/ldavis_med_'+str(num_topics))

# # this is a bit time consuming - make the if statement True
# # if you want to execute visualization prep yourself
if 1 == 1:
    LDAvis_prepared = pyLDAvis.gensim.prepare(lda_model, doc_term_matrix, dict_)
    with open(LDAvis_data_filepath, 'wb') as f:
        pickle.dump(LDAvis_prepared, f)

# load the pre-prepared pyLDAvis data from disk
with open(LDAvis_data_filepath, 'rb') as f:
    LDAvis_prepared = pickle.load(f)

#pyLDAvis.save_html(LDAvis_prepared, './results/ldavis_prepared_'+ str(num_topics) +'.html')
pyLDAvis.save_html(LDAvis_prepared, os.path.join(sys.path[0],'results/ldavis_med_'+str(num_topics)+'.html'))

LDAvis_prepared

  and should_run_async(code)


In [44]:
from pprint import pprint

# Print the Keyword in the 10 topics
pprint(lda_model.print_topics())


[(0,
  '0.067*"patient" + 0.033*"use" + 0.033*"barrier" + 0.031*"understanding" + '
  '0.030*"learning" + 0.024*"discharge" + 0.024*"teaching" + 0.022*"include" + '
  '0.022*"need" + 0.021*"learn"'),
 (1,
  '0.039*"other" + 0.032*"discuss" + 0.032*"procedure" + 0.026*"member" + '
  '0.026*"team" + 0.025*"participate" + 0.025*"necessity" + 0.023*"risk" + '
  '0.020*"healthcare" + 0.019*"alternative"'),
 (2,
  '0.025*"stay" + 0.023*"new" + 0.022*"inability" + 0.021*"pain" + '
  '0.016*"alert" + 0.015*"confusion" + 0.015*"awake" + 0.014*"currently" + '
  '0.011*"when" + 0.011*"feel"')]


  and should_run_async(code)


In [45]:
doc_term_matrix[:2]

  and should_run_async(code)


[[(0, 1), (1, 1), (2, 1), (3, 1)],
 [(4, 1),
  (5, 1),
  (6, 1),
  (7, 1),
  (8, 1),
  (9, 1),
  (10, 1),
  (11, 1),
  (12, 1),
  (13, 2),
  (14, 1),
  (15, 1),
  (16, 1)]]

In [46]:
lda_model[doc_term_matrix]

  and should_run_async(code)


<gensim.interfaces.TransformedCorpus at 0x7f696b918d30>

In [49]:
# printing the topic associations with the documents
count = 0
for i in lda_model[doc_term_matrix]:
    print("doc : ",count,i[0])
    count += 1

  and should_run_async(code)


doc :  0 [(0, 0.13541345), (1, 0.7292853), (2, 0.13530125)]
doc :  1 [(0, 0.86751235), (1, 0.06511459), (2, 0.06737312)]
doc :  2 [(0, 0.13948801), (1, 0.77361965), (2, 0.08689232)]
doc :  3 [(0, 0.019438937), (1, 0.02002216), (2, 0.9605389)]
doc :  4 [(0, 0.834053), (1, 0.093739), (2, 0.07220796)]
doc :  5 [(0, 0.04698486), (1, 0.057746787), (2, 0.8952684)]
doc :  6 [(0, 0.019861491), (1, 0.020468049), (2, 0.9596705)]
doc :  7 [(0, 0.074879974), (1, 0.058431193), (2, 0.8666888)]
doc :  8 [(0, 0.07229211), (1, 0.8555613), (2, 0.07214657)]
doc :  9 [(0, 0.25130394), (1, 0.091371834), (2, 0.6573242)]
doc :  10 [(0, 0.25886497), (1, 0.668568), (2, 0.07256701)]
doc :  11 [(0, 0.123919316), (1, 0.7782868), (2, 0.09779382)]
doc :  12 [(0, 0.024151996), (1, 0.025235193), (2, 0.95061284)]
doc :  13 [(0, 0.06377462), (1, 0.8804511), (2, 0.055774212)]
doc :  14 [(0, 0.85502905), (1, 0.06916985), (2, 0.075801104)]
doc :  15 [(0, 0.89995164), (1, 0.048461422), (2, 0.051587)]
doc :  16 [(0, 0.72427

doc :  481 [(0, 0.89962083), (1, 0.04695894), (2, 0.053420234)]
doc :  482 [(0, 0.8385539), (1, 0.08063582), (2, 0.08081024)]
doc :  483 [(0, 0.4109949), (1, 0.07976704), (2, 0.50923806)]
doc :  484 [(0, 0.12049876), (1, 0.60750526), (2, 0.27199593)]
doc :  485 [(0, 0.47422886), (1, 0.16945182), (2, 0.35631934)]
doc :  486 [(0, 0.82347786), (1, 0.087693974), (2, 0.08882819)]
doc :  487 [(0, 0.025118291), (1, 0.9247246), (2, 0.050157137)]
doc :  488 [(0, 0.13541345), (1, 0.7292853), (2, 0.13530126)]
doc :  489 [(0, 0.8872844), (1, 0.0564316), (2, 0.056283988)]
doc :  490 [(0, 0.5793665), (1, 0.2500644), (2, 0.17056914)]
doc :  491 [(0, 0.085065424), (1, 0.08026722), (2, 0.8346674)]
doc :  492 [(0, 0.026339987), (1, 0.028253572), (2, 0.9454065)]
doc :  493 [(0, 0.5136964), (1, 0.37879047), (2, 0.10751314)]
doc :  494 [(0, 0.15798639), (1, 0.7236074), (2, 0.11840624)]
doc :  495 [(0, 0.85043466), (1, 0.09035423), (2, 0.059211068)]
doc :  496 [(0, 0.014299135), (1, 0.015099704), (2, 0.9706

In [50]:
# printing the topic associations with the documents
import operator
lstTopicsCorpus =[]

count = 0
for i in lda_model[doc_term_matrix]:
    print("doc : ",count, i[0])
    maxTopic = max(i[0],key=operator.itemgetter(1))[0]
    lstTopicsCorpus.append(maxTopic)
    count += 1

  and should_run_async(code)


doc :  0 [(0, 0.13541345), (1, 0.7292853), (2, 0.13530125)]
doc :  1 [(0, 0.86751527), (1, 0.06511471), (2, 0.067369975)]
doc :  2 [(0, 0.1394901), (1, 0.77361757), (2, 0.086892314)]
doc :  3 [(0, 0.019438935), (1, 0.0200222), (2, 0.96053886)]
doc :  4 [(0, 0.8340282), (1, 0.09376368), (2, 0.072208144)]
doc :  5 [(0, 0.046984926), (1, 0.057766918), (2, 0.8952482)]
doc :  6 [(0, 0.019861491), (1, 0.020468207), (2, 0.9596703)]
doc :  7 [(0, 0.07488253), (1, 0.058431175), (2, 0.8666863)]
doc :  8 [(0, 0.07229209), (1, 0.8555614), (2, 0.072146565)]
doc :  9 [(0, 0.25130403), (1, 0.09137217), (2, 0.6573238)]
doc :  10 [(0, 0.25887954), (1, 0.668554), (2, 0.07256638)]
doc :  11 [(0, 0.12391972), (1, 0.7782865), (2, 0.097793825)]
doc :  12 [(0, 0.024151988), (1, 0.025234332), (2, 0.95061374)]
doc :  13 [(0, 0.06377683), (1, 0.880449), (2, 0.05577419)]
doc :  14 [(0, 0.85502875), (1, 0.06916985), (2, 0.0758014)]
doc :  15 [(0, 0.8999522), (1, 0.048461504), (2, 0.051586293)]
doc :  16 [(0, 0.72

doc :  500 [(0, 0.73098505), (1, 0.09087687), (2, 0.17813808)]
doc :  501 [(0, 0.09197015), (1, 0.069229096), (2, 0.8388008)]
doc :  502 [(0, 0.8998655), (1, 0.05066353), (2, 0.049470995)]
doc :  503 [(0, 0.862012), (1, 0.06912384), (2, 0.068864085)]
doc :  504 [(0, 0.5792644), (1, 0.1607444), (2, 0.25999117)]
doc :  505 [(0, 0.8482265), (1, 0.09225434), (2, 0.059519142)]
doc :  506 [(0, 0.07763433), (1, 0.08148837), (2, 0.84087735)]
doc :  507 [(0, 0.6699769), (1, 0.2321339), (2, 0.09788923)]
doc :  508 [(0, 0.87910634), (1, 0.05059708), (2, 0.07029654)]
doc :  509 [(0, 0.058509424), (1, 0.883137), (2, 0.058353618)]
doc :  510 [(0, 0.862093), (1, 0.067064725), (2, 0.070842355)]
doc :  511 [(0, 0.05365984), (1, 0.041129082), (2, 0.90521103)]
doc :  512 [(0, 0.84854954), (1, 0.082618184), (2, 0.06883226)]
doc :  513 [(0, 0.09199571), (1, 0.77499336), (2, 0.13301086)]
doc :  514 [(0, 0.22798136), (1, 0.11003817), (2, 0.66198045)]
doc :  515 [(0, 0.6924143), (1, 0.20201471), (2, 0.1055709

In [51]:
lstTopicsCorpus[0:5]

  and should_run_async(code)


[1, 0, 1, 2, 0]

**In this section we are merging the text corpus with the topic. So all the text rows topic will be tied up in one dataset**

In [52]:
df_Topic = pd.DataFrame(
    {'Text': corpus,
     'Cleaned_Text': clean_corpus,
     'Topic': lstTopicsCorpus
    })

  and should_run_async(code)


In [53]:
df_Topic[:5]

  and should_run_async(code)


Unnamed: 0,Text,Cleaned_Text,Topic
0,Insulin NPH Human [NOVOLIN N] 100 unit/mL susp...,"[unit, suspension, subcutaneous, direct]",1
1,"Patient arrives ambulatory, Gait steady, Hist...","[arrive, ambulatory, gait, steady, history, ob...",0
2,"Peripheral IV site, established in the right ...","[site, establish, right, forearm, use, gauge, ...",1
3,No: new confusion or inability to stay alert ...,"[new, confusion, inability, stay, alert, awake...",2
4,Spent 15 minutes with the patient and greater ...,"[spend, minute, patient, great, %, time, spend...",0


## Sentence Similarity Check

We will pick one sentence from one topic and check similarity among all the sentences

Try to find out where the test sentence is more similar

In [58]:
from sentence_transformers import SentenceTransformer

model_p = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

  and should_run_async(code)


In [59]:
# Getting the embedding
q_emb = model_p.encode(df_Topic['Text'].tolist()[0])
p_emb = model_p.encode(df_Topic['Text'].tolist())

  and should_run_async(code)


#### Create a function to display the statistics

In [61]:
from sentence_transformers import  util

def getSimiTopic(ind,p_emb):
    # Getting the embedding
    q_emb = model_p.encode(df_Topic['Text'].tolist()[ind])
    #p_emb = model_p.encode(df_Topic['Text'].tolist())
    
    print(df_Topic['Text'].tolist()[ind])
    
    # Printing the similarity score 
    sim_score = util.pytorch_cos_sim(q_emb, p_emb).cpu().detach().numpy()[0]
    
    # Data frame create
    df_Score = pd.DataFrame(
    {'Sim': sim_score,
     'Topic': lstTopicsCorpus
    })
    
    #display(df_Score.groupby(['Topic']).agg({"Sim": ['mean','sum','min','max','count']}))
    return df_Score.groupby(['Topic']).agg({"Sim": ['mean','sum','min','max','count']})

  and should_run_async(code)


In [64]:
def getTopicScore(topic, p_emb):
    # We need a list of dataframes for each topic 
    lst_df  = []
    
    # Get the list of index in the dataframe for the selected topic
    lst_ind = df_Topic.loc[df_Topic['Topic'] == topic].index.tolist()
    
    for ind in lst_ind:
        print("added", ind)
        df = getSimiTopic(ind,p_emb)
        lst_df.append(df)
    return pd.concat(lst_df).groupby(level=0).mean()


  and should_run_async(code)


In [65]:
# topic 0
getTopicScore(0,p_emb)

  and should_run_async(code)


added 1
 Patient arrives ambulatory, Gait steady, History obtained from patient, Patient appears comfortable, Patient cooperative, alert, Skin warm.
added 4
Spent 15 minutes with the patient and greater than 50% of this time was spent counseling the patient regarding diagnosis and available treatment options.
added 14
 Explained diagnosis and treatment plan; patient expressed understanding of the content, Patient was given a copy of this note
added 15
 Patient arrives, via stretcher, via Emergency Medical Services, Unsteady gait, Lift to cart, History obtained from patient, Patient appears comfortable, Patient cooperative, alert, Oriented to person, place and time.
added 16
 Dimension 3 Emotional, Behavioral or Cognitive Conditions and Complications: The patient received a Risk Score of 1. Interference with Addiction Recovery Emotional concerns related to negative consequences and effects of addiction; patient is able to view these as part of addiction and recovery.
added 17
 Identifie

 Identified Illness as a learning need, Patients primary language is English, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Verbal Instructions, Patient verbalized an understanding of discharge teaching.
added 137
The patient has no contraindications to the vaccine as documented in the CDC Vaccine Information Statement.
added 139
 I independently interviewed and examined the patient and agree with the history, exam, assessment and plan of care.
added 141
Plan is to continue working towards work hardening goals of 50 lb occasional weight handling.
added 151
 History obtained from patient, Patient cooperative, alert, Oriented to person, place and time.
added 153
Patient was identified using two identifiers and permission was obtained from the patient to perform a fingerstick INR.
added 157
Explained diagnosis and treatment plan; patient/child/caregiver expressed understanding of the content.Caller agree

 Identified Injury as a learning need, Patients primary language is English, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instructions, Patient verbalized an understanding of discharge teaching.
added 230
 Complex assessment performed, Patient arrives ambulatory, Gait steady, History obtained from patient, Patient appears, in distress due to pain, Patient cooperative, alert, Oriented to person, place and time.
added 231
 The patient had a chance to have any questions about this procedure answered, understand(s) and wish(es) to proceed.
added 233
The patient understands the information and questions answered; the patient wishes to proceed with the biopsy.
added 234
The diagnosis and treatment plans were explained and the patient expressed understanding of the content.
added 237
 Patient discharged to home, ambulating without assistance, Discharge instructions given to patient, Above pe

 Identified Illness as a learning need, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Verbal Instructions, Patient verbalized an understanding of discharge teaching.
added 314
Patient is here for the following immunization(s):  Inactivated Influenza Virus Vaccination; Tetanus and Diphtheria and Acellular Pertussis (Tdap) Vaccine
added 317
 Patient discharged to home, ambulating without assistance, driving self, unaccompanied, Above person(s) verbalized understanding of discharge instructions and follow-up care.
added 319
Explained diagnosis and treatment plan; patient/parents expressed understanding of the content.
added 322
 PATIENT EDUCATION:  Ready to learn, no apparent learning barriers were identified; learning preferences include listening.
added 324
 Disp : 100 - TAB, Directions : ONE EVERY BEDTIME, Refill : 3 Time(s), Expire Rx on : 1 YR, Route : ORAL
added 326
 Patient arrives, via stretcher,

added 461
 Patient arrives, via Emergency Medical Services, History obtained from patient, Patient appears comfortable, Patient cooperative, alert, Oriented to person, place and time.
added 465
 Patient currently uses tobacco, smokes cigarettes, smokes 1/2 packs per day, Patient denies alcohol use.
added 472
Patient will demonstrate and/or verbalize understanding of home exercise program in 6 sessions.
added 474
 Family ready to learn, no apparent learning barriers were identified; learning preferences include listening.
added 475
Description: The patient was seen for an individual session today to review progress in treatment and any changes made to the treatment plan.
added 477
Pain Scale-The Numeric pain scale was used to identify the patient's level of pain (0= no pain and 10 = worst possible pain).
added 479
I have personally interviewed and examined the patient and reviewed the medical history with the patient.
added 480
 Patient arrives, via hospital wheelchair, Gait steady, His

 Identified Illness as a learning need, Patients primary language is English, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instructions, Verbal Instructions.
added 577
 Client was provided with verbal and written (  Weekend Planning,   &  ) instructions.
added 578
 Patient discharged to home, ambulating without assistance, family driving, accompanied by parent, Discharge instructions given to patient, Above person(s) verbalized understanding of discharge instructions and follow-up care.
added 580
The client verbalized understanding and consented to the plan of care and the goals established.
added 582
Patient will demonstrate and/or verbalize understanding of home exercise program in 1 sessions for improved self management of the condition.
added 585
Patient is here for the following immunization(s):  Live Intranasal Influenza Virus Vaccination
added 587
 Identified Illness as a learn

 Ready to learn, no apparent learning barriers were identified; learning preferences included listening.
added 704
 Identified Injury as a learning need, Identified Follow-up care as a learning need, No Barriers to learning were identified, No interventions were used to address Barriers to Learning, Teaching methods used included Printed Patient Instructions, Verbal Instructions, Patient verbalized an understanding of discharge teaching.
added 705
 Identified Illness as a learning need, Identified Follow-up care as a learning need, No Barriers to learning were identified, Involved Family Member or Primary Caregiver to address Barriers to Learning, Teaching methods used included Printed Patient Instructions, Verbal Instructions, Patient verbalized an understanding of discharge teaching.
added 707
The rationale for hand therapy intervention was discussed and patient agrees to above plan of care.
added 708
No: new deformity other than swelling; a grating or crackling sound or sensation on

Unnamed: 0_level_0,Sim,Sim,Sim,Sim,Sim
Unnamed: 0_level_1,mean,sum,min,max,count
Topic,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2
0,0.444369,163.972214,0.00356,1.0,369
1,0.299089,62.808781,0.036853,0.544586,210
2,0.242859,41.528893,0.026311,0.570498,171


In [66]:
getTopicScore(1,p_emb)

  and should_run_async(code)


added 0
Insulin NPH Human [NOVOLIN N] 100 unit/mL suspension subcutaneous as directed by prescriber.
added 2
 Peripheral IV site, established in the right forearm, using an 18 gauge catheter, in one attempt.
added 8
 Discussed the necessity of other members of the healthcare team, both male and female, participating in the procedure.
added 10
 Given current medication regimen, the following parameters should be monitored by outpatient providers: None
added 11
 The risks and benefits of the procedure were discussed, and the patient consented to this procedure.
added 13
 Discussed goals, risks, alternatives, advanced directives, and the necessity of other members of the surgical team participating in the procedure with the patient.
added 19
 Discussed risks, goals, and necessity of other members of the team participating in the procedure with patient.
added 24
 No: joint swelling; pain of a type other than joint pain; a limp without known injury; discomfort in legs that improves with mov

No: new confusion or inability to stay alert and awake; severe lethargy or floppiness; refusing to move the neck; current or recent seizure; high-pitched cry (like a cat's cry) OR a weak whimper or moaning cry that is not consolable; purple or red rash/blotches that stay when pressed by a glass (purpuric rash) or bulging or tense fontanel (soft spot on the head) when not crying
added 219
Albuterol Aerosol 90 mcg/Actuation 2 puffs by inhalation as directed by prescriber as needed.
added 221
 Discussed acne and treatment in detail along with consent with risks, alternatives, and benefits.
added 222
Albuterol [PROVENTIL/VENTOLIN] 90 mcg/Act HFA Aerosol 1-2 puffs by inhalation every 4 hours as needed.
added 228
 Discussed advance directives and the necessity of other members of the healthcare team, both male and female, participating in the procedure.
added 229
 Cyanocobalamin [VITAMIN B12] 1,000 mcg/mL solution 1,000 micrograms intramuscular as directed by prescriber.
added 236
The indivi

 Discussed risks, goals, alternatives, and advanced directives, the necessity of other members of the healthcare team participating in the procedure with the patient (or legal representative and others present during the discussion).
added 433
 After discussion of the risks, benefits, and alternatives to treatment with cryotherapy, informed consent was obtained.
added 436
Return here or go to the nearest Emergency Department if you notice any of the problems listed below or you have other concerns.
added 445
 Discussed the risks, benefits, alternatives, and the necessity of other members of the healthcare team, both male and female, participating in the procedure.
added 462
 Discussed risks, goals, alternatives, and advanced directives, the necessity of other members of the healthcare team participating in the procedure with the patient (or legal representative and others present during the discussion).
added 464
 Patient's age is 9 years of age or older: Administer Inactivated Influen

Ipratropium-Albuterol [COMBIVENT] 18-103 mcg/Actuation Aerosol 2 puffs by inhalation four times a day.
added 702
 Discussed the risks, benefits, alternatives, and the necessity of other members of the healthcare team participating in the procedure.
added 703
 Discussed with the patient the necessity of other members of the healthcare team, both male and female, participating in the procedure if needed.
added 706
 Discussed risks, goals, alternatives, advanced directives and the necessity of other members of the healthcare team participating in the procedure.
added 713
 Mother is blood type B+, is Hepatitis B negative, HIV negative, and was found to be GBS negative.
added 715
Fluticasone Propionate [FLONASE] 50 mcg/Act Aerosol 2 sprays nasally one time daily as needed.
added 717
Albuterol [PROVENTIL] 0.083 % Neb solution 3 mL by inhalation as directed by prescriber as needed.
added 718
 Disp : 150 ml - Suspension, Sig : Give 5 ml by mouth 3 times a day for 10 days, Refill : None.
added 

Unnamed: 0_level_0,Sim,Sim,Sim,Sim,Sim
Unnamed: 0_level_1,mean,sum,min,max,count
Topic,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2
0,0.299089,110.363998,-0.01934,0.577959,369
1,0.301583,63.332531,-0.008422,1.0,210
2,0.208212,35.604259,-0.028494,0.510417,171


In [67]:
getTopicScore(2,p_emb)

  and should_run_async(code)


added 3
 No: new confusion or inability to stay alert and awake; currently struggling to breathe, even while inactive or resting; currently feeling like you are going to collapse every time you stand (sit); vomit that looks like ground coffee; vomiting blood; uncontrollable or continuous rectal bleeding; black, sticky, tar-like stools; heavy vaginal bleeding or purple or red rash/blotches that stay when pressed by a glass (purpural rash)
added 5
No: new confusion or inability to stay alert and awake; newly painful neck with difficulty bending the neck; paralyzed face muscles or severe dizziness
added 6
 No: new confusion or inability to stay alert and awake; struggling to breathe even while inactive or resting; currently feeling like you are going to collapse every time you stand or sit up; vomit that looks like ground coffee; vomiting blood; uncontrollable or continuous rectal bleeding; black, sticky, tar-like stools; heavy vaginal bleeding or purple or red rash/blotches that stay whe

added 201
 Patient requires extensive assistance in the following activities:  bathing, dressing, toileting, transfer to/from bed/chair, mobility.
added 213
 No: new confusion or inability to stay alert and awake; noisy, wheezy or raspy breathing that does not clear with coughing; newly stiff or painful neck; purple or red rash/blotches that stay when pressed by a glass (purpural rash); currently feeling like you are going to collapse every time you stand (sit); muffled voice or inability to open mouth fully or complete inability to swallow
added 220
 No: new confusion or inability to stay alert and awake; noisy, wheezy or raspy breathing that does not clear with coughing; newly stiff or painful neck; purple or red rash/blotches that stay when pressed by a glass (purpural rash); currently feeling like you are going to collapse every time you stand (sit); muffled voice or inability to open mouth fully or complete inability to swallow
added 225
No: new confusion or inability to stay aler

 No: new confusion or inability to stay alert and awake; new shortness of breath; new wheezing or chest tightness; sudden swelling of the lips, tongue, or mouth; sudden onset of trouble swallowing or purple or red rash/blotches that stay when pressed by a glass (purpural rash)
added 423
No: new confusion or inability to stay alert and awake or purple or red rash/blotches that stay when pressed by a glass (purpural rash)
added 426
 D. Ongoing assessment and treatment by the multidisciplinary treatment team, including disposition planning, is underway.
added 428
 No: inability to speak or make normal sounds; new confusion or inability to stay alert and awake; sudden swelling of the lips, tongue, or mouth; currently struggling to breathe, even while inactive or resting; abrupt onset of breathing problems that came over the course of a minute or two; sudden onset of cough, choking or gagging after inhaling something into your airway; pain, pressure or tightness in the chest, jaw or arm; ch

added 613
Dimension 2 Biomedical Conditions and Complications: The patient received a Risk Score of 1. Patient demonstrates adequate ability to tolerate and cope with physical discomfort.
added 622
 She denies any signs or symptoms of bruising, bleeding complications, or thromboembolic events.
added 628
 The following should be done yearly:height, weight, visual screening and blood pressure: done this year  Injury prevention is as follows: seatbelts: yes Smoke detectors: yes  Bicycle or motorcycle helmet no (PLEASE WEAR HELMETS) Guns and violence: yes and there are no guns in the home  Other risk factors reviewed:  passive smoke: not at risk Exposure to tuberculosis: no exposure   Sunscreens: yes Brushing teeth and dental care: yes, you see a dentist regularly.
added 629
No: currently inconsolable crying; vomiting; diarrhea; not feeding normally; fever (rectal temperature) of more than 100.4
added 634
 No: localized area of red, swollen, tender and warm skin that is expanding; red stre

Unnamed: 0_level_0,Sim,Sim,Sim,Sim,Sim
Unnamed: 0_level_1,mean,sum,min,max,count
Topic,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2
0,0.242859,89.614975,-0.032125,0.548266,369
1,0.208212,43.724529,-0.027127,0.563039,210
2,0.323876,55.382877,-0.017503,1.0,171
