# OSMI Mental Health In Tech Survey 2016: Inference on Cluster Analysis

_By [Michael Rosenberg](mailto:mmrosenb@andrew.cmu.edu)._

In [22]:
#imports
library(poLCA)

#constants
sigLev = 3
percentMul = 100
options(warn=-1)

In [23]:
#load in data
inferenceFrame = read.csv("../data/processed/clusterData_inference.csv")
finalMod.lcm = readRDS("../models/finalClusterModel.rds")

# Recap

# Model Study

In [24]:
priorFrame = data.frame(class = c("Class 1","Class 2","Class 3"),
                        prior = signif(finalMod.lcm$P,sigLev))
priorFrame

class,prior
Class 1,0.335
Class 2,0.267
Class 3,0.398


_Table 1: Our prior distribution over our classes._

We see that Class $3$ is the most frequent and Class $2$ the least frequent of the three in our estimates. That being said, the priors range only from about $0.27$ to $0.40$, which suggests that we have reasonably balanced classes.
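One rough way to quantify how balanced the priors are is to compare their entropy with that of a uniform distribution over three classes. The sketch below is an illustration and not part of the original analysis: it hard-codes the rounded values from Table 1, and the names `classPriors`, `priorPercent`, and `normEntropy` are assumptions introduced here.

```r
# Rounded class priors copied by hand from Table 1 (illustrative only)
classPriors <- c("Class 1" = 0.335, "Class 2" = 0.267, "Class 3" = 0.398)

# Express the priors as percentages
priorPercent <- 100 * classPriors

# Shannon entropy divided by its maximum, log(K); equals 1 for a uniform prior
normEntropy <- -sum(classPriors * log(classPriors)) / log(length(classPriors))

print(priorPercent)
print(round(normEntropy, 3))
```

A normalized entropy close to $1$ would support reading the classes as roughly balanced.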

In [25]:
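displayTable <- function(givenMod,varName){
    #helper that returns the class-conditional probability table for varName,
    #with columns labeled by the original answer levels
    #first, read the discrete encoding that maps answer codes to levels
    dEncodeFilename = paste0("../data/preprocessed/discreteEncodings/",
                             varName,".csv")
    dEncodeFrame = read.csv(dEncodeFilename)
    #get the answer levels (assumed to be listed in encoding order)
    dEncodeLevels = dEncodeFrame$level
    #pull the conditional probabilities for this variable from the model
    givenModTable = givenMod$probs[[varName]]
    colnames(givenModTable) = dEncodeLevels
    #then export
    return(givenModTable)
}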

In [26]:
finalMod.lcm$probs

Unnamed: 0,Pr(1),Pr(2)
class 1:,0.8430343,0.1569657
class 2:,0.733088,0.266912
class 3:,0.7499901,0.2500099

Unnamed: 0,Pr(1),Pr(2),Pr(3),Pr(4)
class 1:,0.1642364,0.2167321,0.142465,0.47656658
class 2:,8.320116e-16,2.159016e-43,0.9137403,0.08625966
class 3:,0.08546565,0.26481,0.4682599,0.18146453

Unnamed: 0,Pr(1),Pr(2),Pr(3),Pr(4)
class 1:,0.2126764,0.0601222,0.408716,0.3184853
class 2:,1.926283e-157,0.6037233,0.278854,0.1174227
class 3:,0.1061076,0.2370277,0.2539455,0.4029192

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.7781698,0.1498108,0.0720194
class 2:,0.4656429,0.4257247,0.10863243
class 3:,0.8041357,0.1101271,0.08573719

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.4935734,0.09156644,0.4148602
class 2:,0.2654382,0.53149626,0.2030655
class 3:,0.572823,0.21521471,0.2119623

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.7952877,0.1624957,0.0422166
class 2:,0.3628419,0.6049427,0.03221538
class 3:,0.6615075,0.1896335,0.14885896

Unnamed: 0,Pr(1),Pr(2),Pr(3),Pr(4),Pr(5),Pr(6)
class 1:,0.26897298,0.3065587,0.1981722,0.008404437,0.08608925,0.13180238
class 2:,0.36731656,0.3134537,0.1519005,0.012890826,0.07668416,0.07775424
class 3:,0.04916784,0.1237994,0.1259986,0.220889999,0.32316312,0.15698108

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.61298667,0.3680594,0.01895389
class 2:,0.71300178,0.2668044,0.02019384
class 3:,0.01035916,0.5283918,0.46124909

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.8871132,0.11288685,0.0
class 2:,0.9188544,0.08114562,3.9633350000000004e-247
class 3:,0.4876558,0.41595463,0.09638953

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.4222275,0.36947491,0.20829757
class 2:,0.5242416,0.40811732,0.06764111
class 3:,0.3146879,0.04691369,0.63839837

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.50966453,0.3374349,0.15290062
class 2:,0.70200861,0.2777027,0.02028869
class 3:,0.06948366,0.3496547,0.58086162

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.4941462,0.38370897,0.1221448
class 2:,0.2777335,0.56702113,0.1552454
class 3:,0.4852353,0.09492299,0.4198418

Unnamed: 0,Pr(1),Pr(2)
class 1:,1.0,1.628924e-104
class 2:,0.934654,0.06534599
class 3:,0.8378824,0.1621176

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.4684743,0.3291969,0.2023287
class 2:,0.4538477,0.2797621,0.2663902
class 3:,0.3624552,0.1799415,0.4576032

Unnamed: 0,Pr(1),Pr(2),Pr(3)
class 1:,0.4141951,0.4443144,0.141490471
class 2:,0.3784929,0.5188057,0.102701387
class 3:,0.1080653,0.8868756,0.005059122

Unnamed: 0,Pr(1),Pr(2),Pr(3),Pr(4),Pr(5)
class 1:,0.6400979,0.159331,0.1786443,0.01256227,0.009364537
class 2:,0.4571984,0.2120691,0.2089758,0.08229287,0.039463789
class 3:,0.1678252,2.15857e-31,0.6912437,3.440738e-252,0.140931092

Unnamed: 0,Pr(1),Pr(2),Pr(3),Pr(4),Pr(5)
class 1:,0.4091701,0.5059225,2.746781e-41,0.06559947,0.01930795
class 2:,0.36634273,0.4244439,0.02001484,0.06960929,0.1195892
class 3:,0.04594928,0.2806029,0.0566843,0.6167635,6.0685339999999995e-226

Unnamed: 0,Pr(1),Pr(2),Pr(3),Pr(4),Pr(5)
class 1:,0.6438238,0.1456524,0.04150696,0.1072585,0.06175832
class 2:,0.4658428,0.2119307,0.14336701,0.1522821,0.02657739
class 3:,0.2421454,0.3226602,0.17133292,0.2196491,0.04421233


We see interesting results from the following tables:

* ```empProvideMHB```

* ```knowMHB```

* ```empDiscMH```

* ```anonProtected```

* ```askLeaveDiff```

* ```negConsDiscMH```

* ```negConsDiscPH```

* ```coworkComfMHD```

* ```superComfMHD```

* ```discInterviewMH```

* ```hurtCareerMH```

* ```teamNegMH```

* ```observeBadResponseMH```

In [27]:
displayTable(finalMod.lcm,"empProvideMHB")

Unnamed: 0,Not eligible for coverage / N/A,No,Yes,I don't know
class 1:,0.1642364,0.2167321,0.142465,0.47656658
class 2:,8.320116e-16,2.159016e-43,0.9137403,0.08625966
class 3:,0.08546565,0.26481,0.4682599,0.18146453


_Table 2: Conditional Probabilities on Answers to the question "Does your employer provide mental health benefits as part of healthcare coverage?" ._

We see a stark effect beginning to emerge in these classes. Class $2$ has a much higher chance of reporting mental health coverage than Classes $1$ and $3$. For Class $1$, the low "Yes" rate is driven largely by uncertainty about available coverage ("I don't know"), while Class $3$ is spread more evenly across outcomes. This suggests, to some extent, that we have one class with good coverage and two classes that either largely lack coverage or are uncertain about it.
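The reading above can be checked by extracting the most probable answer within each class. This is a minimal sketch that rebuilds Table 2 by hand, with values rounded to four decimals and the near-zero entries set to $0$; the names `answers`, `empProvideTab`, and `modalAnswer` are hypothetical and not part of the original notebook.

```r
# Conditional probabilities copied by hand from Table 2, rounded for readability
answers <- c("Not eligible for coverage / N/A", "No", "Yes", "I don't know")
empProvideTab <- matrix(
    c(0.1642, 0.2167, 0.1425, 0.4766,
      0.0000, 0.0000, 0.9137, 0.0863,
      0.0855, 0.2648, 0.4683, 0.1815),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste("class", 1:3), answers))

# Most probable answer within each class (row-wise argmax)
modalAnswer <- answers[apply(empProvideTab, 1, which.max)]
names(modalAnswer) <- rownames(empProvideTab)
print(modalAnswer)
```

The modal answer alone hides how concentrated each row is, so it is best read alongside the full table.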

In [28]:
displayTable(finalMod.lcm,"knowMHB")

Unnamed: 0,N/A,Yes,I am not sure,No
class 1:,0.2126764,0.0601222,0.408716,0.3184853
class 2:,1.926283e-157,0.6037233,0.278854,0.1174227
class 3:,0.1061076,0.2370277,0.2539455,0.4029192


_Table 3: Conditional Probabilities on Answers to the question "Do you know the options for mental health care available under your employer-provided coverage?" ._

We see an emphasis on "I am not sure" and "No" for Class $1$, an emphasis on "Yes" and "I am not sure" for Class $2$, and a fairly even spread across answers for Class $3$. This suggests that Class $1$ leans no on this question, Class $2$ leans yes, and Class $3$ shows no strong lean (its largest probability, about $0.40$, is on "No").

In [29]:
displayTable(finalMod.lcm,"empDiscMH")

Unnamed: 0,No,Yes,I don't know
class 1:,0.7781698,0.1498108,0.0720194
class 2:,0.4656429,0.4257247,0.10863243
class 3:,0.8041357,0.1101271,0.08573719


_Table 4: Conditional Probabilities on Answers to the question "Has your employer ever formally discussed mental health (for example, as part of a wellness campaign or other official communication)?" ._

We see that Classes $1$ and $3$ are concentrated on "No", while Class $2$ is nearly balanced between "No" and "Yes". Thus, Classes $1$ and $3$ represent employers who likely have not formally discussed mental health, while Class $2$ represents a toss-up between the two possibilities.

In [30]:
displayTable(finalMod.lcm,"anonProtected")

Unnamed: 0,I don't know,Yes,No
class 1:,0.7952877,0.1624957,0.0422166
class 2:,0.3628419,0.6049427,0.03221538
class 3:,0.6615075,0.1896335,0.14885896


_Table 5: Conditional Probabilities on answers to the question "Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources provided by your employer?" ._

We see that Classes $1$ and $3$ show high uncertainty about whether their anonymity is protected, while Class $2$ leans yes. Thus, Class $2$ represents individuals who are more sure that their anonymity will be protected if they take advantage of mental health and substance abuse treatment resources.

In [31]:
displayTable(finalMod.lcm,"askLeaveDiff")

Unnamed: 0,Very easy,Somewhat easy,Neither easy nor difficult,Very difficult,Somewhat difficult,I don't know
class 1:,0.26897298,0.3065587,0.1981722,0.008404437,0.08608925,0.13180238
class 2:,0.36731656,0.3134537,0.1519005,0.012890826,0.07668416,0.07775424
class 3:,0.04916784,0.1237994,0.1259986,0.220889999,0.32316312,0.15698108


_Table 6: "If a mental health issue prompted you to request a medical leave from work, asking for that leave would be:"_

We see that Classes $1$ and $2$ generally feature individuals who would find asking for leave for mental health reasons easy, while Class $3$ represents individuals who would find that process more difficult.

In [32]:
displayTable(finalMod.lcm,"negConsDiscMH")

Unnamed: 0,No,Maybe,Yes
class 1:,0.61298667,0.3680594,0.01895389
class 2:,0.71300178,0.2668044,0.02019384
class 3:,0.01035916,0.5283918,0.46124909


_Table 7: "Do you think that discussing a mental health disorder with your employer would have negative consequences?" ._

Classes $1$ and $2$ generally feature individuals who do not expect negative consequences from discussing mental health with their employer, while Class $3$ suggests that there might be negative consequences to this process.
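To illustrate how strongly this single item separates the classes, Bayes' rule gives the probability of each class for a respondent who answers "Yes". The sketch below is an illustration under stated assumptions: it uses only this one item together with the rounded priors from Table 1, whereas the fitted model would combine all items, and the variable names are introduced here for the example.

```r
# Class priors from Table 1 and P("Yes" | class) from Table 7, copied by hand
classPriors <- c("Class 1" = 0.335, "Class 2" = 0.267, "Class 3" = 0.398)
probYesGivenClass <- c("Class 1" = 0.01895389,
                       "Class 2" = 0.02019384,
                       "Class 3" = 0.46124909)

# Bayes' rule: the posterior is proportional to prior times likelihood
unnormalized <- classPriors * probYesGivenClass
posterior <- unnormalized / sum(unnormalized)
print(round(posterior, 3))
```

A posterior that places most of its mass on one class indicates that this answer alone is highly informative about class membership.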

In [33]:
displayTable(finalMod.lcm,"negConsDiscPH")

Unnamed: 0,No,Maybe,Yes
class 1:,0.8871132,0.11288685,0.0
class 2:,0.9188544,0.08114562,3.9633350000000004e-247
class 3:,0.4876558,0.41595463,0.09638953


_Table 8: "Do you think that discussing a physical health issue with your employer would have negative consequences?" ._

Interestingly, Classes $1$ and $2$ also expect that discussing physical health will not have negative consequences, while Class $3$ leans maybe-to-no in this context. Thus, Classes $1$ and $2$ seem fairly confident that they will not face negative consequences from discussing either mental or physical health, while Class $3$ leans maybe-to-yes for mental health and leans no for physical health. What is particularly interesting is that Classes $1$ and $2$ seem much more certain that they will not face negative consequences in the physical health realm than in the mental health realm.
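The gap between the physical and mental health items can be made explicit by subtracting the "No" probabilities class by class. This is a small sketch using values copied by hand from Tables 7 and 8; the names `noMental`, `noPhysical`, and `gap` are hypothetical.

```r
classes <- paste("Class", 1:3)

# P("No" | class) for negative consequences of discussing each kind of issue
noMental   <- setNames(c(0.61298667, 0.71300178, 0.01035916), classes)  # Table 7
noPhysical <- setNames(c(0.8871132, 0.9188544, 0.4876558), classes)     # Table 8

# Positive values mean more confidence of "no consequences" for physical health
gap <- noPhysical - noMental
print(round(gap, 3))
```

A positive gap in every class would indicate that all three groups feel safer discussing physical health than mental health.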

In [34]:
displayTable(finalMod.lcm,"coworkComfMHD")

Unnamed: 0,Maybe,Yes,No
class 1:,0.4222275,0.36947491,0.20829757
class 2:,0.5242416,0.40811732,0.06764111
class 3:,0.3146879,0.04691369,0.63839837


_Table 9: "Would you feel comfortable discussing a mental health disorder with your coworkers?"_

What stands out is the effect on the "No" answer. The emphasis on "No" increases as we move from Class $2$ to Class $1$ to Class $3$. This may suggest that Class $2$ features individuals who are unlikely to be uncomfortable discussing a mental health disorder with their coworkers, Class $1$ features individuals who are somewhat likely to be uncomfortable, and Class $3$ features individuals who are very likely to be uncomfortable.
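The ordering of the classes by their "No" probability can be read directly from Table 9. A minimal sketch, with values copied by hand and the names `noCoworker` and `noSorted` introduced only for this example:

```r
# P("No" | class) for comfort discussing with coworkers, from Table 9
noCoworker <- c("Class 1" = 0.20829757,
                "Class 2" = 0.06764111,
                "Class 3" = 0.63839837)

# Sort from least to most likely to answer "No"
noSorted <- sort(noCoworker)
print(round(noSorted, 3))
```

Because a higher "No" probability means more discomfort, the last class in the sorted vector is the one least comfortable with such a discussion.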

In [35]:
displayTable(finalMod.lcm,"superComfMHD")

Unnamed: 0,Yes,Maybe,No
class 1:,0.50966453,0.3374349,0.15290062
class 2:,0.70200861,0.2777027,0.02028869
class 3:,0.06948366,0.3496547,0.58086162


_Table 10: "Would you feel comfortable discussing a mental health disorder with your direct supervisor(s)?" ._

We see a progression similar to the one in our previous table. Thus, it is likely that those who are comfortable discussing mental health with their coworkers are also comfortable discussing it with their direct supervisors.
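Under the local independence assumption of a latent class model, answers are independent within a class, so any association between the coworker and supervisor items in the overall population comes from mixing over classes. The sketch below compares the model-implied joint probability of answering "Yes" to both items with the value expected if the items were unrelated. It uses rounded values copied by hand from Tables 1, 9, and 10, and all variable names are assumptions made for illustration.

```r
classPriors <- c(0.335, 0.267, 0.398)                 # Table 1
yesCoworker <- c(0.36947491, 0.40811732, 0.04691369)  # Table 9, "Yes" column
yesSuper    <- c(0.50966453, 0.70200861, 0.06948366)  # Table 10, "Yes" column

# Marginal probability of "Yes" on each item, averaging over classes
margCoworker <- sum(classPriors * yesCoworker)
margSuper    <- sum(classPriors * yesSuper)

# Joint probability of "Yes" on both items: within a class the two answers
# are treated as independent, so the conditionals multiply before mixing
jointYes <- sum(classPriors * yesCoworker * yesSuper)

# Joint probability expected if the two items were unrelated overall
indepYes <- margCoworker * margSuper

print(round(c(joint = jointYes, independent = indepYes), 3))
```

If the joint probability exceeds the independence benchmark, the model implies a positive association between the two answers.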

In [36]:
displayTable(finalMod.lcm,"discInterviewMH")

Unnamed: 0,Maybe,No,Yes
class 1:,0.4141951,0.4443144,0.141490471
class 2:,0.3784929,0.5188057,0.102701387
class 3:,0.1080653,0.8868756,0.005059122


_Table 11: "Would you bring up a mental health issue with a potential employer in an interview?" ._

We see another pattern here: Class $3$ is very unlikely to discuss a mental health issue with a potential employer, while Classes $1$ and $2$ show somewhat more comfort. In general, few respondents say yes to this question, so all classes lean no.

In [37]:
displayTable(finalMod.lcm,"hurtCareerMH")

Unnamed: 0,Maybe,"No, I don't think it would","Yes, I think it would","No, it has not","Yes, it has"
class 1:,0.6400979,0.159331,0.1786443,0.01256227,0.009364537
class 2:,0.4571984,0.2120691,0.2089758,0.08229287,0.039463789
class 3:,0.1678252,2.15857e-31,0.6912437,3.440738e-252,0.140931092


_Table 12: "Do you feel that being identified as a person with a mental health issue would hurt your career?"._

We see that the key contrast is between the "Yes, I think it would" column and the "Maybe" column. Class $3$ tends to believe with some certainty that being identified as a person with a mental health issue will hurt their career, while Classes $1$ and $2$ believe that being identified will only maybe hurt their career. This again points to a contrast between certainty and uncertainty among the three classes.
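Since Table 12 splits both "Yes" and "No" into two columns, collapsing them can make the contrast easier to see. This is a rough sketch that rebuilds the table by hand; the grouping into "Maybe", "No", and "Yes" and the names `hurtCareerTab` and `collapsed` are choices made here for illustration, not part of the original analysis.

```r
# Conditional probabilities copied by hand from Table 12
hurtCareerTab <- matrix(
    c(0.6400979, 0.159331,    0.1786443, 0.01256227,    0.009364537,
      0.4571984, 0.2120691,   0.2089758, 0.08229287,    0.039463789,
      0.1678252, 2.15857e-31, 0.6912437, 3.440738e-252, 0.140931092),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste("class", 1:3),
                    c("Maybe", "No, I don't think it would",
                      "Yes, I think it would", "No, it has not", "Yes, it has")))

# Merge the two "No" columns and the two "Yes" columns
collapsed <- cbind(
    Maybe = hurtCareerTab[, "Maybe"],
    No    = rowSums(hurtCareerTab[, c("No, I don't think it would", "No, it has not")]),
    Yes   = rowSums(hurtCareerTab[, c("Yes, I think it would", "Yes, it has")]))
print(round(collapsed, 3))
```

Rounding in the final step also suppresses the near-zero estimates, which otherwise print in scientific notation.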

In [38]:
displayTable(finalMod.lcm,"teamNegMH")

Unnamed: 0,"No, I don't think they would",Maybe,"Yes, they do","Yes, I think they would","No, they do not"
class 1:,0.4091701,0.5059225,2.746781e-41,0.06559947,0.01930795
class 2:,0.36634273,0.4244439,0.02001484,0.06960929,0.1195892
class 3:,0.04594928,0.2806029,0.0566843,0.6167635,6.0685339999999995e-226


_Table 13: "Do you think that team members/co-workers would view you more negatively if they knew you suffered from a mental health issue?" ._

We see that Class $3$ leans yes in this situation, while Classes $1$ and $2$ lean maybe-to-no. This is very similar to our previous question, although there is a deeper "lean no" in this context.

In [39]:
displayTable(finalMod.lcm,"observeBadResponseMH")

Unnamed: 0,No,Maybe/Not sure,"Yes, I experienced","Yes, I observed",N/A
class 1:,0.6438238,0.1456524,0.04150696,0.1072585,0.06175832
class 2:,0.4658428,0.2119307,0.14336701,0.1522821,0.02657739
class 3:,0.2421454,0.3226602,0.17133292,0.2196491,0.04421233


_Table 14: "Have you observed or experienced an unsupportive or badly handled response to a mental health issue in your current or previous workplace?" ._

The key changes we see are in the "No" and "Maybe/Not sure" columns. Class $1$ leans clearly no, Class $2$ leans no more weakly, and Class $3$ leans toward "Maybe/Not sure".