Plan of attack:

- Subset the demographics dataset by:
  - Gender (Male/Female)
  - Race

- Merge the demographics data and the START FEIS data by patient ID number
- Clean the data so only relevant columns are left (demographic data + family input)

First, we plan to look at the responses on available services and client mental health. Because these answers are on a scale, we can convert them to numerical data in order to quantify the quality of care each subset received.
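A minimal sketch of the scale-to-number step, assuming the answers use the four labels that appear in the FEIS output in this notebook. The numeric codes (0 to 3), the `SCALE` dictionary, and the `scale_to_numeric` helper are illustrative assumptions rather than the survey's official coding; answers such as "Did not know/answer" are left as missing instead of being scored.

```python
import pandas as pd

# Assumed ordering of the response scale, lowest to highest.
# The labels are copied from survey responses shown in this notebook;
# the numeric codes are an illustrative choice.
SCALE = {
    "None at all": 0,
    "Very little": 1,
    "Some, but not as much as was needed/wanted": 2,
    "All that was wanted/needed": 3,
}

def scale_to_numeric(series):
    """Map ordinal answers to numbers; answers not in SCALE become NaN."""
    return series.map(SCALE)

example = pd.Series([
    "Very little",
    "None at all",
    "Did not know/answer",
    "All that was wanted/needed",
])
scores = scale_to_numeric(example)
print(scores)
```

Keeping unlisted answers as NaN means a later `groupby(...).mean()` skips them rather than treating "did not know" as a low score.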

We then plan to conduct topic modeling on the column in which families describe where care is lacking, in order to find the most frequently mentioned and most desired forms of care that START did not provide.

We also plan to conduct topic modeling and sentiment analysis on the column in which families offer advice to service planners, in order to form a rough idea of the quality of care and how it may vary across demographic groups. We are also interested in whether these responses’ sentiment scores trend in a specific direction, which could indicate bias in who actually responded to the survey.
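The sentiment comparison could be sketched as follows. The helper names `mean_score_by_group` and `make_vader_scorer` and the toy dataframe are hypothetical; the scorer is passed in as a function so the grouping logic can be checked without the sentiment package installed. The VADER scorer assumes the `vaderSentiment` package imported in this notebook and uses its `compound` score.

```python
import pandas as pd

def mean_score_by_group(df, text_col, group_cols, score_fn):
    """Score each non-missing text with score_fn, then average within groups."""
    scored = df.dropna(subset=[text_col]).copy()
    scored["sentiment"] = scored[text_col].apply(score_fn)
    return scored.groupby(group_cols)["sentiment"].agg(["mean", "count"])

def make_vader_scorer():
    """Return a function mapping text to VADER's compound score.

    Requires the vaderSentiment package; imported lazily so the rest
    of this sketch runs without it.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]

# Toy data standing in for the merged survey responses
toy = pd.DataFrame({
    "Gender": ["Female", "Female", "Male"],
    "Advice": ["Listen to the parents", None, "More respite options"],
})
```

With the package installed, a call such as `mean_score_by_group(toy, "Advice", "Gender", make_vader_scorer())` would give one mean score and one response count per group. The count matters here because groups with few responses can show a strong trend by chance.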


In [97]:
# Importing modules
## helpful packages
import pandas as pd
import numpy as np
import random
import re

## nltk imports
import nltk
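### uncomment and run these lines if you haven't downloaded relevant nltk add-ons yet
#nltk.download('punkt')  # tokenizer data used by word_tokenize ('punkt_tab' on recent NLTK releases)
#nltk.download('averaged_perceptron_tagger')
#nltk.download('stopwords')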
from nltk import pos_tag
from nltk.tokenize import word_tokenize, wordpunct_tokenize
from nltk.stem.snowball import SnowballStemmer
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

## spacy imports
import spacy
### uncomment and run the below line if you haven't loaded the en_core_web_sm library yet
#! python -m spacy download en_core_web_sm
import en_core_web_sm
nlp = en_core_web_sm.load()

## vectorizer
from sklearn.feature_extraction.text import CountVectorizer

## sentiment
#!pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

## lda
from gensim import corpora
import gensim

## repeated printouts and wide-format text
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
pd.set_option('display.max_colwidth', None)

In [74]:
demo_df = pd.read_excel(r"../files/Dartmouth_Data_Set.xlsx")
FEIS_df = pd.read_excel(r"../files/START_FEIS_Data.xlsx")
time_df = pd.read_excel(r"../files/Dartmouth_Time_Data.xlsx")
dict_df = pd.read_excel(r"../files/Final SIRS_Data_Dictionary_V13.1 October 2020.xlsx")

In [182]:
# Cleaning the demographics dataset
demographics = demo_df[['Local ID', 'Region', 'Date Enrolled in START', 'Gender', 'Race', 'Date of birth', 'Ethnicity',
                              'Level of Intellectual Disability', 'Psychiatric diagnoses', 'Medical diagnoses', 'Other Disabilities',
                              'Funding']]

Unnamed: 0,Respondent ID # (SIRS Local ID),Start Date,End Date,What services does your family member currently receive? Check all that apply,"If other, please describe.",Where does your family member receive mental health services?,"If other, please describe",Your relationship,"If other, please describe.1",Does (name of individual) continue to live with you?,...,"In\nthe past year, did your family member use in-patient psychiatric services?","If\nyes, were the inpatient services that your family member received helpful to\nhim/her in your opinion? ?",How\nmuch help was available to you at night or on weekends if your family member\nhad a crisis?,Are\nthere options outside of the hospital for individuals experiencing a crisis to\ngo for help (i.e. crisis/hospital diversion beds)?,Who\nwas the primary source of information about your family member’s mental health\nservices?,"If other, please describe..2","During the past year, how much involvement\ndid you want to have in your family member’s treatment plan?",Was there any particular service that your\nfamily member needed that was not available?,"If yes, please describe the service.",What\nadvice would you give to service planners regarding the mental health service\nneeds of persons with IDD and their families?
0,43.0962,2019-03-11 12:34:46,2019-03-11 12:43:53,"IDD Services,Mental Health Services,Special Education",,Private Clinic,,Parent,,,...,Yes,"Some, but not as much as was needed/wanted",Very little,Very little,Other,parents,A lot,Yes,Crisis respite,None provided
1,1903,2015-11-24 13:46:51,2015-11-24 13:51:57,"HOME, COMMUNITY",,Community Mental Health Center,,Step-Parent,,,...,No,,None at all,None at all,Other,HCS PROVIDER,A lot,Yes,EMERGENCY RESPITE SERVICES & CRISIS INTERVENTION,PROVDE REFERRALS FOR EMERGENCY RESPITE SERVICES; JOB SUPPORT FOR ID CONSUMERS
2,2425,2018-11-29 16:07:48,2018-11-29 16:16:49,IDD Services,,Home/Group Home,,Parent,,,...,No,Did not know/answer,"Some, but not as much as was needed/wanted","Some, but not as much as was needed/wanted",His/her service facilitator,,Some,No,,
3,4068,2015-05-12 17:49:00,2015-05-12 18:36:14,IDD Services,,Community Mental Health Center,,Other,Staff at Group Home,,...,No,Did not know/answer,None at all,None at all,His/her service facilitator,,A lot,No,,
4,4298,2020-06-02 14:16:48,2020-06-02 14:23:03,Mental Health Services,,Private Clinic,,Parent,,,...,No,Did not know/answer,Very little,Very little,Other,START,Some,Yes,Crisis Response,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1935,unknown,2017-02-13 10:27:09,2017-02-13 10:35:39,Mental Health Services,,Other,Room and board,Other,Owner of room and board,,...,,"Some, but not as much as was needed/wanted","Some, but not as much as was needed/wanted","Some, but not as much as was needed/wanted",Your family member him/herself,,Some,No,,
1936,unknown,2015-08-07 14:18:49,2015-08-07 14:28:55,,,Community Mental Health Center,,Parent,,,...,No,,Very little,Very little,His/her psychiatrist,,Some,Yes,respite,"Medications may be helpful, keeping people busy helps a lot!"
1937,unknown,2021-02-06 10:49:20,2021-02-06 11:06:05,"IDD Services,Mental Health Services,Special Education",,Community Mental Health Center,,Parent,,Yes,...,Yes,All that was wanted/needed,Very little,None at all,Other,"Respondant: ""Me, as Raziel's mother""",A lot,Yes,"Respite, after school support, psychiatric support, and in-home supports",Take the time to get to know and give him a chance
1938,unknown,2019-10-10 14:32:55,2019-10-16 16:25:18,,,,,Parent,,,...,Yes,"Some, but not as much as was needed/wanted",Very little,Very little,Your family member him/herself,,A lot,No,,


In [185]:
# Merging datasets
merged = pd.merge(demographics, FEIS_df, how = 'inner', left_on = ['Local ID'], right_on = ['Respondent ID #  (SIRS Local ID)'])
merged_short_answer = merged[['Gender', 'Race', 'Local ID','What\nadvice would you give to service planners regarding the mental health service\nneeds of persons with IDD and their families?', "Was there any particular service that your\nfamily member needed that was not available?", "If yes, please describe the service."]]

merged_short_answer.columns = ['Gender', 'Race', 'ID', 'Advice', 'Missing Service', 'Service Needed']
merged_short_answer

# Look at type of join (changed)
# female_white.head()
# female_white_subset = female_white[['Local ID','What\nadvice would you give to service planners regarding the mental health service\nneeds of persons with IDD and their families?', "Was there any particular service that your\nfamily member needed that was not available?", "If yes, please describe the service."]]
# female_white_subset.columns = ['ID', 'Advice', 'Missing Service', 'Service Needed']
# merged_subset = 

# merged = 

# merged['Local ID'].unique
# merged.shape
# merged

Unnamed: 0,Gender,Race,ID,Advice,Missing Service,Service Needed
0,Male,Other: Mexican,8008815,,No,
1,Female,"Unknown, not collected",6570649,"“Please be aware of her conditions and diagnosis, so many professionals are unfamiliar with the medical history of Citlalli. It is discouraging when professionals do not know Citlalli, but make recommendations for her. Also, it is discouraging when the professionals do not take the opinions of the family seriously.”",Yes,A counselor was not and has not been made available for the last six months.
2,Female,White,434021,,Yes,In-home behavior support
3,Male,White,6580618,Declined to answer/did not know.,Yes,"""After Trevor’s psychiatrist left the office, the office also stopped taking his insurance and as a result, Trevor went without a psychiatrist for a while. Trevor’s family tried their best to get him in with other psychiatrists, but struggled to find one that would treat Trevor. Through SARC, Trevor was referred to Hope Services and will begin seeing a psychiatrist there on 1.27.21."""
4,Male,"Unknown, not collected",354280,"Listen to the parents, take what parents report seriously, and provide tips, not just call the cops, have options/walk parent through it.",Yes,"At home off hour support on phone or in person/respite, have removed for the night for safety reasons."
...,...,...,...,...,...,...
1092,Male,Black or African American,1013197,,No,
1093,Female,White,1100502,,No,
1094,Female,Black or African American,1132230,,Yes,Wraparound services and continuily of care
1095,Male,White,11128011,,No,
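The ID columns shown in the outputs above mix numbers and strings (for example `1903`, `907533C`, and `unknown`), and an inner merge silently drops rows whose keys differ only in type or formatting. The sketch below is one way to check how many rows match; `normalize_id` and `merge_report` are hypothetical helpers and the toy frames are made up for illustration.

```python
import pandas as pd

def normalize_id(value):
    """Convert an ID to a comparable string, e.g. 1903, 1903.0 and ' 1903' -> '1903'."""
    if pd.isna(value):
        return None
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value).strip()

def merge_report(left, right, left_key, right_key):
    """Outer-merge on normalized keys and count rows found in one or both tables."""
    left = left.assign(_key=left[left_key].map(normalize_id))
    right = right.assign(_key=right[right_key].map(normalize_id))
    both = pd.merge(left, right, how="outer", on="_key", indicator=True)
    return both["_merge"].value_counts()

# Toy frames standing in for the demographics and FEIS tables
left = pd.DataFrame({"Local ID": [1903, 2425, 4068]})
right = pd.DataFrame({"Respondent ID": ["1903", 2425.0, "unknown"]})
counts = merge_report(left, right, "Local ID", "Respondent ID")
print(counts)
```

Applied to the real tables, the `left_only` and `right_only` counts would show how many patients and survey responses the inner merge leaves out.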


In [196]:
# Subsetting by gender
demographics_male = merged_short_answer.loc[merged_short_answer['Gender']=='Male']
demographics_female = merged_short_answer.loc[merged_short_answer['Gender']=='Female']

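# Subsetting by race
# Note: the non-White subsets also include any rows where Race is missing or
# recorded as "Unknown, not collected", since those values are != "White"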
male_white = demographics_male[demographics_male['Race'] == "White"]
male_nonwhite = demographics_male[demographics_male['Race'] != "White"]

female_white = demographics_female[demographics_female['Race'] == "White"]
female_nonwhite = demographics_female[demographics_female['Race'] != "White"]


# male_white.shape
# male_nonwhite.shape

female_white.head()
# female_nonwhite.shape


Unnamed: 0,Gender,Race,ID,Advice,Missing Service,Service Needed
2,Female,White,434021,,Yes,In-home behavior support
9,Female,White,21347,,No,
10,Female,White,8146562,,Yes,"Good psychiatry, crisis help that was more hands on, caregivers/ respite workers, etc."
21,Female,White,7697408,Harlee can create fabricated stories based off information she received from mental health providers.,Yes,"Mental Health Services, individual therapy"
33,Female,White,359313,Family had no information form their assigned social worker from a community agency and felf uninformed.,No,
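Because `!= "White"` also captures missing and "Unknown, not collected" values, one alternative is to label each row with an explicit race group first and then split or group on that label. This is only a sketch: the three-way grouping, the `race_group` helper, and the toy rows are assumptions for illustration, not the grouping used in this analysis.

```python
import pandas as pd

def race_group(race):
    """Collapse the Race column into White / Non-White / Unknown."""
    if pd.isna(race) or str(race).strip().lower().startswith("unknown"):
        return "Unknown"
    return "White" if str(race).strip() == "White" else "Non-White"

# Toy rows using Race values that appear in the merged output
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Race": ["Other: Mexican", "Unknown, not collected", "White", None],
})
toy["Race group"] = toy["Race"].map(race_group)

# Size of each Gender x Race group cell
sizes = toy.groupby(["Gender", "Race group"]).size()
print(sizes)
```

Checking the cell sizes before modeling also shows whether any subset is too small for a three-topic model to be meaningful.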


In [278]:
# Big function
stop_words = set(stopwords.words('english'))

snowball = SnowballStemmer(language="english")

def process(string):
    # Lowercase, then split into word tokens
    tokens = word_tokenize(string.lower())
    # Drop English stopwords (tokens are already lowercase)
    tokenize_string = [s for s in tokens if s not in stop_words]
    # Strip non-letter characters and discard tokens left empty (e.g. punctuation)
    alpha_string = [re.sub('[^A-Za-z]+', '', s) for s in tokenize_string]
    alpha_string = [s for s in alpha_string if s]
    # Stem each remaining token and rejoin into one string
    stem_string = [snowball.stem(s) for s in alpha_string]
    return " ".join(stem_string)

def create_dtm(list_of_strings, metadata):
    vectorizer = CountVectorizer(lowercase = True)
    dtm_sparse = vectorizer.fit_transform(list_of_strings)
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    dtm_dense_named = pd.DataFrame(dtm_sparse.toarray(),
                columns=vectorizer.get_feature_names_out())
    # add_prefix returns a new dataframe, so the caller's metadata is not modified
    metadata = metadata.add_prefix("metadata_")
    dtm_dense_named_withid = pd.concat([metadata.reset_index(), 
                                        dtm_dense_named], axis = 1)
    return dtm_dense_named_withid


def top_words_topic_model(df, col):
    df = df[df[col].apply(type)==str]
    # Subset only to examined column
    subset_col = df[["ID", 'Gender', 'Race', col]]
    # Drop missing values
    subset_col = subset_col.dropna()
    
    subset_col['processed_text'] = [process(string) for string in subset_col[col]]
    
    dtm = create_dtm(list_of_strings= subset_col['processed_text'],
                metadata = 
                subset_col[["ID", 'Gender', 'Race']])
    
    topdtm = dtm[[col for col in dtm.columns
               if 'metadata' not in col and col != 'index']].sum(axis=0)


    text_raw_tokens = [wordpunct_tokenize(s) 
                    for s in 
                    subset_col['processed_text']]

    text_raw_dict = corpora.Dictionary(text_raw_tokens)

    corpus_fromdict = [text_raw_dict.doc2bow(s) 
                       for s in text_raw_tokens]

    ldamod = gensim.models.ldamodel.LdaModel(corpus_fromdict, 
                                    num_topics = 3, id2word=text_raw_dict, 
                                    passes=6, alpha = 'auto',
                                    per_word_topics = True, random_state = 2)

    topics = ldamod.print_topics(num_words = 30)
    # Return the LDA topics and the 30 most frequent stemmed words
    return topics, topdtm.sort_values(ascending=False).head(30)

    

In [279]:
test = top_words_topic_model(female_white, 'Advice')
test2 = top_words_topic_model(male_white, 'Advice')
test3 = top_words_topic_model(female_nonwhite, 'Advice')
test4 = top_words_topic_model(male_nonwhite, 'Advice')
test
print("_____________")
test2
print("_____________")
test3
print("_____________")
test4

([(0,
   '0.027*"servic" + 0.019*"provid" + 0.019*"famili" + 0.014*"work" + 0.014*"team" + 0.014*"need" + 0.014*"help" + 0.010*"idd" + 0.010*"peopl" + 0.010*"support" + 0.010*"better" + 0.010*"baselin" + 0.010*"communic" + 0.010*"child" + 0.010*"pay" + 0.010*"listen" + 0.010*"hard" + 0.010*"advoc" + 0.010*"like" + 0.010*"therapist" + 0.010*"find" + 0.006*"system" + 0.006*"none" + 0.006*"choos" + 0.006*"challeng" + 0.006*"time" + 0.006*"nt" + 0.006*"individu" + 0.006*"lot" + 0.006*"understand"'),
  (1,
   '0.033*"famili" + 0.033*"need" + 0.029*"servic" + 0.029*"provid" + 0.025*"help" + 0.021*"program" + 0.017*"find" + 0.017*"client" + 0.013*"health" + 0.013*"better" + 0.013*"assist" + 0.013*"understand" + 0.013*"disabl" + 0.009*"look" + 0.009*"s" + 0.009*"support" + 0.009*"get" + 0.009*"way" + 0.009*"staff" + 0.009*"go" + 0.009*"person" + 0.009*"take" + 0.005*"member" + 0.005*"inform" + 0.005*"enough" + 0.005*"mental" + 0.005*"avail" + 0.005*"medic" + 0.005*"nt" + 0.005*"fund"'),
  (2,


_____________


([(0,
   '0.028*"need" + 0.025*"servic" + 0.016*"provid" + 0.016*"none" + 0.016*"get" + 0.015*"support" + 0.015*"parent" + 0.014*"help" + 0.010*"would" + 0.010*"train" + 0.010*"famili" + 0.009*"crisi" + 0.009*"individu" + 0.009*"like" + 0.008*"issu" + 0.008*"understand" + 0.007*"care" + 0.007*"child" + 0.007*"coordin" + 0.007*"time" + 0.007*"manag" + 0.006*"better" + 0.006*"inform" + 0.005*"ask" + 0.005*"dont" + 0.005*"earli" + 0.005*"communiti" + 0.005*"case" + 0.005*"client" + 0.005*"see"'),
  (1,
   '0.053*"need" + 0.036*"servic" + 0.029*"famili" + 0.026*"provid" + 0.019*"support" + 0.017*"avail" + 0.015*"help" + 0.015*"answer" + 0.014*"access" + 0.013*"option" + 0.011*"resourc" + 0.009*"area" + 0.009*"time" + 0.008*"children" + 0.008*"care" + 0.007*"peopl" + 0.007*"work" + 0.007*"crisi" + 0.007*"respit" + 0.007*"make" + 0.006*"one" + 0.006*"lot" + 0.006*"medic" + 0.005*"listen" + 0.005*"individu" + 0.005*"opportun" + 0.005*"home" + 0.005*"take" + 0.005*"day" + 0.005*"popul"'),
  (2

_____________


([(0,
   '0.042*"need" + 0.029*"answer" + 0.023*"famili" + 0.020*"make" + 0.016*"inform" + 0.016*"servic" + 0.015*"behavior" + 0.015*"listen" + 0.014*"access" + 0.012*"support" + 0.011*"get" + 0.011*"parent" + 0.011*"know" + 0.011*"peopl" + 0.011*"help" + 0.011*"sever" + 0.011*"hab" + 0.011*"day" + 0.011*"give" + 0.011*"time" + 0.011*"group" + 0.011*"work" + 0.011*"take" + 0.010*"talk" + 0.010*"mh" + 0.010*"sure" + 0.008*"resourc" + 0.006*"program" + 0.006*"member" + 0.006*"home"'),
  (1,
   '0.059*"none" + 0.028*"provid" + 0.025*"support" + 0.021*"profession" + 0.015*"daughter" + 0.015*"citlal" + 0.015*"discourag" + 0.015*"hear" + 0.015*"call" + 0.015*"regard" + 0.015*"also" + 0.015*"school" + 0.014*"option" + 0.012*"inform" + 0.011*"famili" + 0.009*"know" + 0.009*"mani" + 0.009*"make" + 0.009*"take" + 0.009*"mom" + 0.008*"patient" + 0.008*"serious" + 0.008*"awar" + 0.008*"opinion" + 0.008*"diagnosi" + 0.008*"pleas" + 0.008*"histori" + 0.008*"condit" + 0.008*"recommend" + 0.008*"medic

_____________


([(0,
   '0.035*"none" + 0.028*"need" + 0.024*"get" + 0.022*"time" + 0.020*"help" + 0.019*"servic" + 0.017*"support" + 0.010*"advoc" + 0.010*"famili" + 0.010*"provid" + 0.009*"live" + 0.009*"keep" + 0.009*"health" + 0.009*"one" + 0.009*"advic" + 0.009*"aba" + 0.008*"learn" + 0.008*"resourc" + 0.006*"avail" + 0.006*"work" + 0.006*"make" + 0.006*"cover" + 0.006*"includ" + 0.006*"especi" + 0.006*"mani" + 0.006*"medic" + 0.006*"communic" + 0.006*"nt" + 0.006*"better" + 0.006*"educ"'),
  (1,
   '0.057*"servic" + 0.040*"need" + 0.029*"famili" + 0.027*"provid" + 0.024*"individu" + 0.017*"inform" + 0.016*"understand" + 0.015*"access" + 0.015*"listen" + 0.014*"parent" + 0.014*"make" + 0.012*"avail" + 0.011*"take" + 0.011*"help" + 0.010*"go" + 0.010*"respit" + 0.008*"support" + 0.008*"resourc" + 0.007*"crisi" + 0.007*"sure" + 0.006*"care" + 0.005*"peopl" + 0.005*"plan" + 0.005*"everyth" + 0.005*"clear" + 0.005*"home" + 0.005*"say" + 0.005*"activ" + 0.005*"experienc" + 0.005*"better"'),
  (2,
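The topic strings printed above are hard to compare by eye across the four subsets. A small parser, sketched here with only the standard library, turns each string into a word-to-weight dictionary so topics can be compared programmatically. `parse_topic` and `shared_terms` are hypothetical helpers, and the sample strings are shortened from the output above.

```python
import re

# Matches one term of a topic string, e.g. 0.027*"servic"
TERM = re.compile(r'([0-9.]+)\*"([^"]+)"')

def parse_topic(topic_string):
    """Turn '0.027*"servic" + 0.019*"provid"' into {'servic': 0.027, 'provid': 0.019}."""
    return {word: float(weight) for weight, word in TERM.findall(topic_string)}

def shared_terms(topic_a, topic_b):
    """Return the stemmed words that appear in both topic strings, sorted."""
    return sorted(set(parse_topic(topic_a)) & set(parse_topic(topic_b)))

female_white_topic = '0.027*"servic" + 0.019*"provid" + 0.019*"famili"'
male_white_topic = '0.028*"need" + 0.025*"servic" + 0.016*"provid"'
print(shared_terms(female_white_topic, male_white_topic))
```

gensim can also return the pairs directly through `show_topics(formatted=False)`, which avoids parsing strings when the model object is still available.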
   

In [159]:
# The column subsetting and renaming is now done in the merge cell,
# so female_white already has the 'ID' and 'Advice' columns.
advice = female_white[["ID", "Advice"]]
# Copy after dropping missing values so new columns can be added safely
advice = advice.dropna().copy()



stop_words = set(stopwords.words('english'))

snowball = SnowballStemmer(language="english")

def process(string):
    # Lowercase, then split into word tokens
    tokens = word_tokenize(string.lower())
    # Drop English stopwords (tokens are already lowercase)
    tokenize_string = [s for s in tokens if s not in stop_words]
    # Strip non-letter characters and discard tokens left empty (e.g. punctuation)
    alpha_string = [re.sub('[^A-Za-z]+', '', s) for s in tokenize_string]
    alpha_string = [s for s in alpha_string if s]
    # Stem each remaining token and rejoin into one string
    stem_string = [snowball.stem(s) for s in alpha_string]
    return " ".join(stem_string)

advice['processed_text'] = [process(string) for string in advice["Advice"]]
advice


# female_white_subset['Advice']

# female_white["What\nadvice would you give to service planners regarding the mental health service\nneeds of persons with IDD and their families?"]
# female_white["Was there any particular service that your\nfamily member needed that was not available?"]
# female_white["If yes, please describe the service."]
# # what advice would you give and 

# if services are not easy to access, 

Unnamed: 0,ID,Advice,processed_text
21,7697408,Harlee can create fabricated stories based off information she received from mental health providers.,harle creat fabric stori base inform receiv mental health provid
33,359313,Family had no information form their assigned social worker from a community agency and felf uninformed.,famili inform form assign social worker communiti agenc felf uninform
41,907533C,"""Think outside the box""",think outsid box
44,240792,"Remember how overwhelming it is for the family, it never ends.",rememb overwhelm famili never end
45,136986,"none, I feel supported by all the team",none feel support team
...,...,...,...
1079,1056248,HAVE LITERATURE TO HELP FAMILIES UNDERSTAND WHAT THE DISABILITY IS.,literatur help famili understand disabl
1080,39089,,none
1083,11144011,Therapist should encourage and foster input from the team members,therapist encourag foster input team member
1088,704085W,"People with IDD/MH behavioral dysregulation being confusing to systems. Parents/caregivers request medication to address the behavioral dysregulation displayed by the person supported; however, most times the behavioral dysregulation is related to the inability to communicate/frustration and IDD. There is not a medication to treat IDD; conveying this to families and systems can be challenging. There are only medications to treat symptoms.",peopl iddmh behavior dysregul confus system parentscaregiv request medic address behavior dysregul display person support howev time behavior dysregul relat inabl communicatefrustr idd medic treat idd convey famili system challeng medic treat symptom


In [None]:
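# Visualizing the function

## Visualize - may not work on jhub yet
## Assumes ldamod, corpus_fromdict, and text_raw_dict exist at the top level
## (for example from the standalone LDA cell), not only inside top_words_topic_model
import pyLDAvis
try:
    # newer pyLDAvis releases renamed the gensim helper module
    import pyLDAvis.gensim_models as gensimvis
except ImportError:
    import pyLDAvis.gensim as gensimvis
pyLDAvis.enable_notebook()
lda_display = gensimvis.prepare(ldamod, corpus_fromdict, text_raw_dict)
pyLDAvis.display(lda_display)

### visualize
### (ldamod_proc, corpus_fromdict_proc, and text_proc_dict are not defined in this notebook)
# lda_display_proc = gensimvis.prepare(ldamod_proc, corpus_fromdict_proc, text_proc_dict)
# pyLDAvis.display(lda_display_proc)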


In [160]:
# Creating the document-term matrix 

def create_dtm(list_of_strings, metadata):
    vectorizer = CountVectorizer(lowercase = True)
    dtm_sparse = vectorizer.fit_transform(list_of_strings)
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    dtm_dense_named = pd.DataFrame(dtm_sparse.toarray(),
                columns=vectorizer.get_feature_names_out())
    # add_prefix returns a new dataframe, so the caller's metadata is not modified
    metadata = metadata.add_prefix("metadata_")
    dtm_dense_named_withid = pd.concat([metadata.reset_index(), 
                                        dtm_dense_named], axis = 1)
    return dtm_dense_named_withid


In [165]:
dtm_nopre = create_dtm(list_of_strings= advice['processed_text'],
                metadata = 
                advice[["ID"]])

dtm_nopre.head()

Unnamed: 0,index,metadata_ID,abil,abl,access,address,advoc,agenc,also,alway,...,way,weekend,well,whole,will,work,worker,workshop,would,written
0,21,7697408,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,33,359313,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,1,0,0,0
2,41,907533C,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,44,240792,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,45,136986,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [172]:

def get_topwords(dtm): 
    # Sum each word column, skipping the ID/metadata and index columns
    topdtm = dtm[[col for col in dtm.columns
               if 'metadata' not in col and col != 'index']].sum(axis=0)
    return topdtm.sort_values(ascending=False).head(30)


print("Top words for Advice")
get_topwords(dtm_nopre)

# Justifying not dropping named entities - none really came up

Top words for Advice


famili        20
servic        19
need          14
provid        14
help           9
inform         8
find           7
medic          6
none           6
program        6
support        6
health         6
better         6
member         5
make           5
peopl          5
understand     4
idd            4
system         4
client         4
get            4
mental         4
avail          4
assist         4
answer         4
sure           3
time           3
nt             3
look           3
advoc          3
dtype: int64

In [178]:
text_raw_tokens = [wordpunct_tokenize(s) 
                for s in 
                advice['processed_text']]

text_raw_dict = corpora.Dictionary(text_raw_tokens)

corpus_fromdict = [text_raw_dict.doc2bow(s) 
                   for s in text_raw_tokens]

ldamod = gensim.models.ldamodel.LdaModel(corpus_fromdict, 
                                num_topics = 3, id2word=text_raw_dict, 
                                passes=6, alpha = 'auto',
                                per_word_topics = True, random_state = 2)

topics = ldamod.print_topics(num_words = 30)
for topic in topics:
    print(topic)


(0, '0.027*"servic" + 0.019*"provid" + 0.019*"famili" + 0.014*"work" + 0.014*"team" + 0.014*"need" + 0.014*"help" + 0.010*"idd" + 0.010*"peopl" + 0.010*"support" + 0.010*"better" + 0.010*"baselin" + 0.010*"communic" + 0.010*"child" + 0.010*"pay" + 0.010*"listen" + 0.010*"hard" + 0.010*"advoc" + 0.010*"like" + 0.010*"therapist" + 0.010*"find" + 0.006*"system" + 0.006*"none" + 0.006*"choos" + 0.006*"challeng" + 0.006*"time" + 0.006*"nt" + 0.006*"individu" + 0.006*"lot" + 0.006*"understand"')
(1, '0.033*"famili" + 0.033*"need" + 0.029*"servic" + 0.029*"provid" + 0.025*"help" + 0.021*"program" + 0.017*"find" + 0.017*"client" + 0.013*"health" + 0.013*"better" + 0.013*"assist" + 0.013*"understand" + 0.013*"disabl" + 0.009*"look" + 0.009*"s" + 0.009*"support" + 0.009*"get" + 0.009*"way" + 0.009*"staff" + 0.009*"go" + 0.009*"person" + 0.009*"take" + 0.005*"member" + 0.005*"inform" + 0.005*"enough" + 0.005*"mental" + 0.005*"avail" + 0.005*"medic" + 0.005*"nt" + 0.005*"fund"')
(2, '0.031*"famili

In [180]:
# ## Visualize - may not work on jhub yet
# import pyLDAvis.gensim as gensimvis
# # alternate: import pyLDAvis.gensim_models as gensimvis 
# import pyLDAvis
# #pyLDAvis.enable_notebook()
# lda_display = gensimvis.prepare(ldamod, corpus_fromdict, text_raw_dict)
# pyLDAvis.display(lda_display)

# ### visualize
# pyLDAvis.enable_notebook()
# lda_display_proc = gensimvis.prepare(ldamod_proc, corpus_fromdict_proc, text_proc_dict)
# pyLDAvis.display(lda_display_proc)

ERROR: Could not find a version that satisfies the requirement pyLDAvis.gensim (from versions: none)
ERROR: No matching distribution found for pyLDAvis.gensim


ModuleNotFoundError: No module named 'pyLDAvis.gensim'