Exploring the Opioid Epidemic
Data Science Spring 2019
The opioid epidemic has been ravaging the United States and Canada in recent years, and drug overdoses are now the leading cause of death among Americans under 50 [according to the New York Times](https://www.nytimes.com/interactive/2017/06/05/upshot/opioid-epidemic-drug-overdose-deaths-are-rising-faster-than-ever.html). I was curious about some of the more nuanced statistics underlying the epidemic and used the 2017 National Survey on Drug Use and Health (NSDUH) to investigate further. NSDUH collects a wide range of information on drug and alcohol use, mental health, and physical health. To pare down this information, I looked through the Substance Abuse and Mental Health Data Archive (SAMHDA) site, which allows users to search within the NSDUH data. I selected a number of opioid-related attributes to clean and possibly use in my analysis. Opioid-related, for my use, includes heroin, pain relievers, and oxycontin. NSDUH defines a list of pain relievers within its data collection, which I cross-referenced to confirm that they were all some type of opioid.
Age of First (Mis)Use
The variable that most piqued my interest was age of first (mis)use within the three drug categories (heroin, pain relievers, and oxycontin). I was curious about the age of first use because of the access that people have to pain relievers at early ages. Most people have some kind of surgery or medical procedure that necessitates taking (or at least being prescribed) pain relievers before they turn 18, and if they are not prescribed pain relievers themselves, it is likely that someone in their family will have some available. In addition, because most people form their lifelong habits early in life, it seemed likely that people who misuse opiates begin doing so at a young age. I wanted to see whether that was reflected in the survey data, and how it breaks down between heroin, pain relievers, and oxycontin.
To look into this, I computed the CDF and hazard function of the age of first use among survey respondents, split between heroin, pain relievers, and oxycontin. The hazard at a given age is the share of people who start at that age among those who had not started before it. It's important to note that only respondents who have at some point used or misused opiates answer this question. This means that the hazard curve does not show how hazardous heroin is to the general population at a given age, but rather how hazardous it is to people who at some point misused opiates.
From these curves it does appear that people who misuse pain relievers and oxycontin do so at an earlier age than those who use heroin. There could be a lot of reasons for this: there is much less stigma associated with pain reliever misuse than with heroin use, and people have access to pain relievers at an earlier age than heroin. Another interesting feature of these plots is that heroin becomes more hazardous between 18 and 40. I imagine that this is also due to the difference in the public perception of the different drugs. Heroin is something that most people are unlikely to try if they've made it all the way to their late 30s. At that point they probably fully understand the ramifications of the drug and are fully aware of its side effects. By comparison, not many people are aware of the dangers of prescribed opiates and their high rate of addiction, so it's possible that someone will misuse opiates later in life without knowing their effect.
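The CDF and hazard values discussed in this section can be computed along the following lines. This is a minimal sketch, not the code used for the analysis: the function name `cdf_and_hazard` and the toy ages are assumptions made for illustration, and the ages are made up rather than taken from NSDUH.

```python
from collections import Counter

def cdf_and_hazard(ages):
    """Return {age: (cdf, hazard)} for a list of integer ages of first use."""
    counts = Counter(ages)
    n = len(ages)
    at_risk = n        # people who had not started before the current age
    cumulative = 0     # people who have started at or before the current age
    result = {}
    for age in sorted(counts):
        started = counts[age]
        cumulative += started
        # CDF: share who started by this age.
        # Hazard: share who start at this age among those still "at risk".
        result[age] = (cumulative / n, started / at_risk)
        at_risk -= started
    return result

# Made-up ages, for illustration only (not NSDUH data).
toy_ages = [14, 16, 16, 18, 18, 18, 21, 25]
for age, (cdf, hazard) in cdf_and_hazard(toy_ages).items():
    print(f"age {age}: CDF={cdf:.3f}, hazard={hazard:.3f}")
```

Because everyone in the input eventually starts, the hazard at the oldest observed age is always 1. This is the same conditioning caveat described above: the curve describes people who used at some point, not the general population.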
Modeling the Distribution of Heroin Age of First Use
The CDF of heroin first use age looked to me like a normal distribution curve when graphed against the other drugs' first use ages, so I chose to check it against the CDF of a normal model. The curves looked somewhat similar, but showed definite differences, particularly in the earlier part of the curve. I then tried a log-normal distribution, which fit very well. A log-normal distribution typically arises when an outcome is the product of many independent factors rather than their sum. I found this interesting because the heroin problem in our country seems so uncontrolled that I did not expect a simple distribution to describe it.
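One way to put a number on "fits better" is to measure the largest gap between the empirical CDF and each model CDF. The sketch below does this with Python's standard library only. It is an illustration under assumptions, not the original analysis: `max_cdf_gap` and `compare_fits` are hypothetical helper names, and the sample is synthetic, drawn from a log-normal distribution on purpose so the expected outcome is known.

```python
import math
import random
from statistics import NormalDist

def max_cdf_gap(values, model_cdf):
    """Largest gap between the empirical CDF and a model CDF (KS-style statistic)."""
    xs = sorted(values)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        p = model_cdf(x)
        # The empirical CDF steps from i/n up to (i+1)/n at x; check both sides.
        gap = max(gap, abs(p - i / n), abs((i + 1) / n - p))
    return gap

def compare_fits(values):
    """Fit a normal and a log-normal model by moments and report each max gap."""
    normal = NormalDist.from_samples(values)
    # A log-normal fit is a normal fit to the logarithms of the values.
    log_model = NormalDist.from_samples([math.log(v) for v in values])
    return {
        "normal": max_cdf_gap(values, normal.cdf),
        "lognormal": max_cdf_gap(values, lambda x: log_model.cdf(math.log(x))),
    }

# Synthetic, right-skewed "ages" for illustration only (not NSDUH data).
rng = random.Random(0)
toy_ages = [rng.lognormvariate(3.0, 0.35) for _ in range(2000)]
print(compare_fits(toy_ages))
```

A smaller gap means the model CDF tracks the data more closely. On real ages of first use, the same comparison would show whether the log-normal model's advantage is concentrated at the younger ages, where the normal model visibly diverged.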
Education Level of Users
Another variable of interest was how much education people who use heroin or misuse pain relievers receive. Going into this analysis, I believed that those who used would likely get less education than those who did not. My thought was that people who use drugs and become addicted while still in school would probably not be able to keep up with both school and their addiction. People who have used might also have other destructive habits that would lead to leaving school earlier, or they could be from areas that have lower graduation rates (where these drugs are available is certainly driven by many demographic factors, which also drive educational paths).
Interestingly, people who have used heroin or pain relievers at some point are more likely to "survive" in school longer than those who have not, with the largest difference occurring around the point of getting a high school degree (diploma or GED). After this tipping point, the likelihood of survival drops dramatically for heroin users and less so for non-users and pain reliever misusers. This could be due to the period of elevated heroin hazard that begins at 18, when most people graduate from high school.
Education and Age Relationship
Given the above, I wanted to better understand whether there was actually a relationship between age of first use and education level. To investigate, I first made a scatter plot of education level against age of first use for both heroin users and pain reliever misusers. The age of first heroin use does not seem to have a particularly strong relationship with the education level reached. This suggests that the steeper drop-out rate among people who have used heroin may be due to factors other than the heroin use itself. However, age of first pain reliever misuse seems to be very strongly tied to education level, particularly among people whose education level is at or below a high school degree.
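A scatter plot can be backed up with a correlation coefficient. Since education level is an ordered category rather than a true number, a rank-based (Spearman) correlation is a reasonable companion to Pearson's. The sketch below is an assumption about how one might check this, not the method used above. The helper names, the education codes, and the data are all made up for illustration.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up (age of first misuse, education code) pairs, not NSDUH data.
toy_age = [12, 13, 15, 15, 17, 19, 22, 30]
toy_edu = [1, 1, 2, 2, 2, 3, 3, 3]
print(round(pearson(toy_age, toy_edu), 3), round(spearman(toy_age, toy_edu), 3))
```

Running this separately for heroin and for pain relievers would give one number per drug to compare against the visual impression from the scatter plots.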
I was unsure what caused the strong relationship for pain relievers, so to investigate further I did another survival analysis, this time splitting people into three groups: did not enter high school, did not enter college, and others (some college, associate's degree, bachelor's or higher, etc.). Among people who use heroin, there was no strong difference between the groups, except that those who did not enter high school started using at a younger age than those who didn't start college and those who did at least some college. This makes sense even though no trend was seen in the scatter plot: so few people both used heroin and did not enter high school that they shape their own survival curve but barely register in a scatter plot. The survival curve of people who have misused pain relievers is very different though, reflecting the strong relationship seen in the scatter plot. Those who have misused pain relievers and did not reach high school were unlikely to "survive" past a fairly young age (about 15) without misusing pain relievers. The curves for those who reached high school and college were much more similar, both to each other and to those of heroin users (though they were shifted to an earlier age). A possible reason for the strange plot of those who misuse pain relievers and don't enter high school is that the group is so small that the plot looks very extreme, while accounting for only a small portion of all people who misuse pain relievers.
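The grouped survival curves described above can be sketched as follows. Each curve gives, for every age, the share of a group that had not yet started using by that age. The group size is reported alongside, since a very small group can produce an extreme-looking curve. The function name `survival_by_group`, the group labels, and the records are assumptions for illustration, not NSDUH values.

```python
from collections import defaultdict

def survival_by_group(records):
    """Map each group to (n, [(age, share who first used after that age)])."""
    grouped = defaultdict(list)
    for group, age in records:
        grouped[group].append(age)
    curves = {}
    for group, ages in grouped.items():
        n = len(ages)
        # S(age): fraction of the group whose first use came after this age.
        curve = [(age, sum(a > age for a in ages) / n) for age in sorted(set(ages))]
        curves[group] = (n, curve)
    return curves

# Made-up records for illustration only (not NSDUH data).
toy_records = [
    ("no high school", 12), ("no high school", 13),
    ("no high school", 13), ("no high school", 15),
    ("no college", 14), ("no college", 16), ("no college", 17),
    ("no college", 18), ("no college", 21), ("no college", 24),
    ("some college or more", 16), ("some college or more", 18),
    ("some college or more", 19), ("some college or more", 22),
    ("some college or more", 26), ("some college or more", 31),
]
for group, (n, curve) in survival_by_group(toy_records).items():
    print(f"{group} (n={n}):", [(age, round(s, 2)) for age, s in curve])
```

Printing `n` next to each curve makes it easier to judge whether a steep early drop reflects a real pattern or just a handful of respondents.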