# The Opioid Epidemic In The United States

DATS 6103 - Individual Project 2 - Tristin Johnson

# I. Introduction

In this project, I analyze the opioid epidemic in the United States from 1999 to 2018. The analysis covers total deaths from opioid overdoses, deaths broken down by specific drug, deaths by gender in absolute and relative figures, deaths among the youth, the rate of deaths by demographic, and finally a comparison of overdose deaths by race against the average household income by race.

Listed below are the major drug groups analyzed in this project. Each group is identified by codes from the International Classification of Diseases, 10th Revision (ICD-10), which is published by the World Health Organization and used by the CDC to code causes of death. The ICD-10 code recorded from a person's death certificate identifies the cause of death. There are thousands of ICD-10 codes, but only a few correspond to the drugs analyzed here. The list below gives each drug group, the ICD-10 codes that fall under it, and the drugs that make up the group.

The ICD-10 Codes Corresponding to Opioids:

1. Heroin -> T40.1 (heroin)
2. Prescription Opioids -> T40.2-T40.3 (oxycodone, hydrocodone, morphine, codeine, hydromorphone, methadone)
3. Synthetic Narcotics -> T40.4 (fentanyl, tramadol)
4. Cocaine -> T40.5 (cocaine)
5. Psychostimulants -> T43.6 (methamphetamine, amphetamine, dextroamphetamine, hydrochloride, methylphenidate)
6. Benzodiazepines -> T42.4 (alprazolam, lorazepam, Valium, Librium, Klonopin)
7. Antidepressants -> T43.0-T43.2 (sertraline, citalopram, fluoxetine, paroxetine)
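
The list above can be expressed as a small lookup table. The sketch below is only an illustration of how the codes map to drug groups; the names `DRUG_GROUP_CODES` and `drug_group_for` are hypothetical helpers and are not used elsewhere in this notebook.

```python
# Illustrative lookup table built from the list above.
# DRUG_GROUP_CODES and drug_group_for are hypothetical names, not part of the analysis code.
DRUG_GROUP_CODES = {
    "Heroin": ["T40.1"],
    "Prescription Opioids": ["T40.2", "T40.3"],
    "Synthetic Narcotics": ["T40.4"],
    "Cocaine": ["T40.5"],
    "Psychostimulants": ["T43.6"],
    "Benzodiazepines": ["T42.4"],
    "Antidepressants": ["T43.0", "T43.1", "T43.2"],
}


def drug_group_for(code):
    """Return the drug group for an ICD-10 code, or None if the code is not in the table."""
    for group, codes in DRUG_GROUP_CODES.items():
        if code in codes:
            return group
    return None


print(drug_group_for("T40.4"))  # Synthetic Narcotics
print(drug_group_for("X42"))    # None
```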


In [None]:
#import pandas for data frame, matplotlib for plotting, and various plotly libraries for different
#types of plots/customization
import pandas as pd
import matplotlib.pyplot as plt
import plotly
import chart_studio.plotly as py
import plotly.graph_objects as go
import plotly.express as px

# II. Gathering/Cleaning the Data

I gathered the opioid overdose data from the Centers for Disease Control and Prevention (CDC), the average household income by demographic from the U.S. Census, and the U.S. population by demographic from the Kaiser Family Foundation (KFF, an American non-profit organization covering health policy analysis). The links to all of this information are below:

CDC Opioid Overdose Data (1999-2018): https://www.cdc.gov/nchs/products/databriefs/db356.htm

Household Income by Race: https://www.census.gov/data/tables/time-series/demo/income-poverty/historical-income-households.html

Population by Demographic in the U.S.: https://www.kff.org/other/state-indicator/distribution-by-raceethnicity/?dataView=1&currentTimeframe=1&selectedRows=%7B%22wrapups%22:%7B%22united-states%22:%7B%7D%7D%7D&sortModel=%7B%22colId%22:%22Location%22,%22sort%22:%22asc%22%7D

The rest of this section covers signing in to Plotly, reading in the data, and renaming the columns to match the years of data. One thing to mention is that the original Excel workbook was heavily formatted and customized, which made it nearly impossible to read directly into a data frame. To work around this, I manually copied the cell values into another sheet of the same workbook with no formatting, so that I had just the values to work with.
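
As a side note on the column renaming, the year labels do not have to be typed by hand. The following is a minimal sketch, assuming the columns run consecutively from 1999 to 2018; `FIRST_YEAR`, `LAST_YEAR`, and `year_labels` are illustrative names.

```python
# Build the 20 year labels programmatically instead of typing them out.
# FIRST_YEAR, LAST_YEAR and year_labels are illustrative names; the range matches the 1999-2018 data.
FIRST_YEAR, LAST_YEAR = 1999, 2018
year_labels = [str(year) for year in range(FIRST_YEAR, LAST_YEAR + 1)]

print(len(year_labels))                 # 20
print(year_labels[0], year_labels[-1])  # 1999 2018
```

A list built this way could be assigned to a data frame's `columns` attribute in the same way as a hand-written list.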

In [None]:
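#sign in to plotly to get various graphs
#credentials are read from environment variables rather than hard-coded in the notebook;
#the variable names PLOTLY_USERNAME and PLOTLY_API_KEY are assumptions, set them before running
import os

py.sign_in(os.environ['PLOTLY_USERNAME'], os.environ['PLOTLY_API_KEY'])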

In [None]:
#define a list for the column names in the data frame
columnNames = ['1999', '2000', '2001', '2002', '2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', 
          '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018']

#read in the excel file
excel = pd.ExcelFile('Overdose_data_1999-2018.xlsx')

#get the specific sheet names from the excel file and convert to a pandas data frame
totalOD = pd.read_excel(excel, 'Number Overdose Deaths', index_col=0)
youthOD = pd.read_excel(excel, 'Number Drug Overdose (15-24)', index_col=0)
demographicOD = pd.read_excel(excel, 'Overdose rate by Demographic', index_col=0)

#rename the column names
totalOD.columns = columnNames
youthOD.columns = columnNames
demographicOD.columns = columnNames

# III. Opioid Overdose Deaths

Part 1:

First, I wanted to take a quick look at the total number of deaths due to opioids over the last 20 years and see whether there was a consistent trend across the years.

In [None]:
#plot the total deaths by an opioid overdose in the U.S.
fig = px.line(x=columnNames, y=totalOD.iloc[0], title='Total Deaths By An Opioid Overdose In The U.S.')
fig.update_layout(xaxis_title='Years', yaxis_title='Number of Deaths')
fig.show()

Part 2:

Next, I wanted to look at overdose deaths broken down by individual opioid. This shows which specific opioids affect the most people, and how each one has contributed to the increase or decrease in the number of overdoses.

In [None]:
#make a new data frame that consists of only the total opioid deaths by opioid and transpose for ease of plotting
majorDrugs = totalOD.iloc[[6, 15, 18, 27, 42, 57, 72]]
majorDrugs = majorDrugs.transpose()

In [None]:
#plot the individual opioid deaths to see how they compare against each other over time
fig1 = px.line(majorDrugs, x=majorDrugs.index, y=majorDrugs.columns, 
               title='Different Opioid Overdoses In The U.S.')

#style the layout and show the plot
fig1.update_layout(autosize=False,
                   width=1100,
                   height=600, 
                   xaxis_title='Years',
                   yaxis_title='Number of Deaths')

#show the range slider for a fixed year interval if needed
fig1.update_xaxes(rangeslider_visible=True)
fig1.show()

Part 3:

Next, I wanted to create an interactive experience in which the user can look deeper at individual opioids along with the other drugs involved in an overdose. This allows the user to see whether an overdose involved just one opioid or a combination of opioids, and which opioids appear most often across the different types. In each selection, the chosen opioid is the primary cause of the overdose. For example, if a user selects Heroin, all of the figures are deaths in which Heroin was the primary cause, although other opioids may have been found in the person at the time of death.
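
The menu-driven selection described above can also be written with a dispatch dictionary instead of a chain of `if`/`elif` branches. The sketch below is a simplified illustration: the three `plot_*` functions are hypothetical stand-ins that return a label rather than drawing a figure, so the example runs without the data set.

```python
# Stand-in plotting functions: each returns a label instead of drawing a figure.
# These names are hypothetical and only illustrate the dispatch pattern.
def plot_prescription():
    return "prescription graph"


def plot_cocaine():
    return "cocaine graph"


def plot_heroin():
    return "heroin graph"


# Map each menu choice to the function that handles it.
GRAPHS = {
    "1": plot_prescription,
    "2": plot_cocaine,
    "3": plot_heroin,
}


def find_graph(choice):
    """Run the function for a menu choice; return None for unknown input."""
    handler = GRAPHS.get(choice)
    if handler is None:
        return None
    return handler()


print(find_graph("2"))  # cocaine graph
print(find_graph("9"))  # None
```

With this layout, adding another drug to the menu means adding one dictionary entry, and unknown input can be distinguished from an explicit request to quit.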

In [None]:
#create a copy of the totalOD data frame for a plotting function over time
totalTranspose = totalOD.copy()
totalTranspose = totalTranspose.transpose()

In [None]:
#define a function to plot the various factors of cocaine overdose deaths over time in a bar chart
#this includes cocaine overdose deaths with/without any other opioid and with/without synthetic narcotics
def cocaineGraph():
    fig5 = go.Figure(data=[go.Bar(name='Cocaine AND Any Opioid', 
                              x=totalTranspose.index, 
                              y=totalTranspose['Cocaine AND Any Opioid']),
                      go.Bar(name='Cocaine WITHOUT Any Opioid', 
                              x=totalTranspose.index,
                              y=totalTranspose['Cocaine WITHOUT Any Opioid']),
                      go.Bar(name='Cocaine AND Other Synthetic Narcotics',
                              x=totalTranspose.index,
                              y=totalTranspose['Cocaine AND Other Synthetic Narcotics']),
                      go.Bar(name='Cocaine WITHOUT Other Synthetic Narcotics',
                              x=totalTranspose.index,
                              y=totalTranspose['Cocaine WITHOUT Other Synthetic Narcotics'])])

    fig5.update_layout(barmode='stack', 
                       title='Cocaine With and Without Other Opioids/Synthetic Narcotics',
                       xaxis_title='Years', yaxis_title='Number of Deaths', width=1100, height=650)
    fig5.show()
    
#define a function to plot the two factors of prescription drug overdose deaths over time in a bar chart
#this includes prescription drugs with and without other synthetic narcotics
def prescriptionGraph():
    fig6 = go.Figure(data=[go.Bar(name='Prescription Opioids AND Other Synthetic Narcotics', 
                              x=totalTranspose.index, 
                              y=totalTranspose['Prescription Opioids AND Other Synthetic Narcotics']),
                      go.Bar(name='Prescription Opioids WITHOUT Other Synthetic Narcotics', 
                              x=totalTranspose.index,
                              y=totalTranspose['Prescription Opioids WITHOUT Other Synthetic Narcotics '])])

    fig6.update_layout(barmode='stack', 
                       title='Prescription Opioids With and Without Other Synthetic Narcotics',
                       xaxis_title='Years', yaxis_title='Number of Deaths', width=1100, height=650)
    fig6.show()
    
#define a function to plot the two factors of heroin overdose deaths over time in a bar chart
#this includes heroin with and without other synthetic narcotics
def heroinGraph():
    fig7 = go.Figure(data=[go.Bar(name='Heroin AND Other Synthetic Narcotics',
                                 x=totalTranspose.index,
                                 y=totalTranspose['Heroin AND Other Synthetic Narcotics']),
                          go.Bar(name='Heroin WITHOUT Other Synthetic Narcotics',
                                 x=totalTranspose.index,
                                 y=totalTranspose['Heroin WITHOUT Other Synthetic Narcotics'])])
    
    fig7.update_layout(barmode='stack',
                      title='Heroin With and Without Other Synthetic Narcotics',
                      xaxis_title='Years', yaxis_title='Number of Deaths', width=1100, height=650)
    fig7.show()
    
#define a function to plot synthetic narcotic overdose deaths over time in a bar chart
#this includes only deaths due to synthetic narcotics
def syntheticGraph():
    fig8 = go.Figure(data=[go.Bar(name='Synthetic Narcotics',
                             x=totalTranspose.index,
                             y=totalTranspose['Other Synthetic Narcotics'])])
    
    fig8.update_layout(barmode='stack',
                      title='Other Synthetic Narcotics',
                      xaxis_title='Years', yaxis_title='Number of Deaths', width=1000, height=600)
    fig8.show()
    
#define a function to plot psychostimulant overdose deaths over time in a bar chart
#this includes various factors such as psychostimulant overdose deaths with/without any other opioid
#and with/without synthetic narcotics
def psychostimGraph():
    fig9 = go.Figure(data=[go.Bar(name='Psychostimulants AND Any Opioid',
                                 x=totalTranspose.index,
                                 y=totalTranspose['Psychostimulants With Abuse Potential AND Any Opioid']),
                          go.Bar(name='Psychostimulants WITHOUT Any Opioid',
                                x=totalTranspose.index,
                                y=totalTranspose['Psychostimulants With Abuse Potential WITHOUT Any Opioid']),
                          go.Bar(name='Psychostimulants AND Other Synthetic Narcotics',
                                x=totalTranspose.index,
                                y=totalTranspose['Psychostimulants With Abuse Potential AND Other Synthetic Narcotics']),
                          go.Bar(name='Psychostimulants WITHOUT Other Synthetic Narcotics',
                                x=totalTranspose.index,
                                y=totalTranspose['Psychostimulants With Abuse Potential WITHOUT Other Synthetic Narcotics'])])
    
    fig9.update_layout(barmode='stack',
                      title='PsychoStimulants With/Without Other Synthetic Narcotics/Opioids',
                      xaxis_title='Years', yaxis_title='Number of Deaths', width=1000, height=600)
    fig9.show()
    
#define a function to plot benzodiazepine overdose deaths over time in a bar chart
#this includes factors such as benzos with/without any other opioid and 
#with/without synthetic narcotics
def benzoGraph():
    fig10 = go.Figure(data=[go.Bar(name='Benzodiazepines AND Any Opioid',
                                 x=totalTranspose.index,
                                 y=totalTranspose['Benzodiazepines AND Any Opioid']),
                          go.Bar(name='Benzodiazepines WITHOUT Any Opioid',
                                x=totalTranspose.index,
                                y=totalTranspose['Benzodiazepines WITHOUT Any Opioid']),
                          go.Bar(name='Benzodiazepines AND Other Synthetic Narcotics',
                                x=totalTranspose.index,
                                y=totalTranspose['Benzodiazepines AND Other Synthetic Narcotics']),
                          go.Bar(name='Benzodiazepines WITHOUT Other Synthetic Narcotics',
                                x=totalTranspose.index,
                                y=totalTranspose['Benzodiazepines WITHOUT Other Synthetic Narcotics'])])
    
    fig10.update_layout(barmode='stack',
                       title='Benzodiazepines With/Without Other Synthetic Narcotics/Opioids',
                       xaxis_title='Years', yaxis_title='Number of Deaths', width=1000, height=600)
    fig10.show()
    
#define a function to plot antidepressant overdose deaths over time in a bar chart
#this includes factors such as antidepressants with/without any other opioid
#and with/without synthetic narcotics
def antidepressGraph():
    fig11 = go.Figure(data=[go.Bar(name='Antidepressants AND Any Opioid',
                                 x=totalTranspose.index,
                                 y=totalTranspose['Antidepressants AND Any Opioid']),
                          go.Bar(name='Antidepressants WITHOUT Any Opioid',
                                x=totalTranspose.index,
                                y=totalTranspose['Antidepressants WITHOUT Any Opioid']),
                          go.Bar(name='Antidepressants AND Other Synthetic Narcotics',
                                x=totalTranspose.index,
                                y=totalTranspose['Antidepressants AND Other Synthetic Narcotics']),
                          go.Bar(name='Antidepressants WITHOUT Other Synthetic Narcotics',
                                x=totalTranspose.index,
                                y=totalTranspose['Antidepressants WITHOUT Other Synthetic Narcotics'])])
    
    fig11.update_layout(barmode='stack',
                       title='Antidepressants With/Without Other Synthetic Narcotics/Opioids',
                       xaxis_title='Years', yaxis_title='Number of Deaths', width=1000, height=600)
    fig11.show()
    
#function to plot the graph corresponding to a users selection
def findGraph(user):
    if user == '1':
        prescriptionGraph()
    elif user == '2':
        cocaineGraph()
    elif user == '3':
        heroinGraph()
    elif user == '4':
        syntheticGraph()
    elif user == '5':
        psychostimGraph()
    elif user == '6':
        benzoGraph()
    elif user == '7':
        antidepressGraph()
    else:
        return 'q'


#function to define a menu for the user to analyze as many graphs as one would like
def menu():
    print('Please enter the number corresponding to the opioid that you would like to take a closer look at:')
    print()
    print('1. Prescription Opioids')
    print('2. Cocaine')
    print('3. Heroin')
    print('4. Synthetic Narcotics')
    print('5. Psychostimulants')
    print('6. Benzodiazepines')
    print('7. Antidepressants')
    print('q. Quit')
    
#main function to keep the user in a loop to look at as many graphs, and the loop will not break until
#the user enters 'q' to quit the programs
def main():
    print('Welcome To An Analytical Drug Overdose Experience')
    print()
    
    userInput = ''
    while userInput != 'q':
        menu()
        userInput = input('Enter a number (or q to Quit): ')
        response = findGraph(userInput)
        if response == 'q':
            break

In [None]:
#call main to start the program and then follow the steps for analysis
main()

# IV. Opioid Overdose by Gender

Part 1:

First, I wanted to look at the total opioid deaths for males versus females. This gives a good idea of which gender is overdosing the most on opioids, along with the proportion of male to female deaths.

In [None]:
#import numpy for arrangement of labels
import numpy as np

#define a width of each bar and arrange the columns based on year
width = 0.35
x = np.arange(len(totalOD.columns))

#use a grouped bar chart to plot the total male vs. female overdose deaths due to opioids
fig12, ax = plt.subplots(figsize=(14, 10))
rects1 = ax.bar(x-width/2, totalOD.iloc[1], width, label='Women', color='hotpink')
rects2 = ax.bar(x+width/2, totalOD.iloc[2], width, label='Men', color='dodgerblue')

#customize the graph for visualization
ax.set_ylabel('Number of Deaths', fontsize=12)
ax.set_title('Total Number of Opioid Overdoses by Gender', fontsize=16)
ax.set_xticks(x)
ax.set_xticklabels(totalOD.columns)
ax.set_xlabel('Year', fontsize=12)
ax.legend()
plt.show()

Part 2:

Next, I wanted to create another interactive experience in which a user can select any two opioids and any year, and analyze the differences between the gender figures for those opioids. The user can see the relative share of overdose deaths for males and females, along with the absolute number of deaths for each.
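
To make the difference between relative and absolute figures concrete, here is a minimal sketch of how two counts turn into percentage shares. The function name `gender_shares` and the counts are made up for illustration and are not values from the CDC data.

```python
def gender_shares(female_deaths, male_deaths):
    """Return (female %, male %) of the combined total, rounded to one decimal place."""
    total = female_deaths + male_deaths
    if total == 0:
        raise ValueError("no deaths recorded, shares are undefined")
    female_pct = round(100 * female_deaths / total, 1)
    male_pct = round(100 * male_deaths / total, 1)
    return female_pct, male_pct


# Made-up counts for illustration only, not values from the CDC data.
print(gender_shares(300, 700))  # (30.0, 70.0)
```

The pie charts show these shares, while the bar chart shows the underlying counts (300 and 700 in this made-up case).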

In [None]:
#import subplots from plotly
from plotly.subplots import make_subplots

#function to compare any two opioids with any year for male vs. female comparison
#user receives two pie charts with male vs. female in percentages and a bar chart for total comparison
#must type in two opioids and a year for analytics
def genderAnalysis(opioid1, opioid2, year):
    
    #define a dictionary for the year the user selects that corresponds to the columns in the data frame
    yearDict = {1999:0, 2000:1, 2001:2, 2002:3, 2003:4, 2004:5, 2005:6, 2006:7, 2007:8, 2008:9, 
                2009:10, 2010:11, 2011:12, 2012:13, 2013:14, 2014:15, 2015:16, 2016:17, 2017:18, 2018:19}
    selection = yearDict[year]
    
    #define a dictionary for the two opioids the user selects that correspond to the rows of the opioids in the data frame
    drugDict = {'Prescription Opioids': totalOD.iloc[7:9, selection], 'Synthetic Narcotics': totalOD.iloc[16:18, selection],
                'Heroin': totalOD.iloc[19:21, selection], 'Cocaine': totalOD.iloc[28:30, selection], 
                'Psychostimulants': totalOD.iloc[43:45, selection], 'Benzodiazepines': totalOD.iloc[58:60, selection],
                'Antidepressants': totalOD.iloc[73:75, selection]}
    drugSelection1 = drugDict[opioid1]
    drugSelection2 = drugDict[opioid2]
    
    #define subplots for plotting pie charts next to each other 
    fig12 = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type': 'domain'}]], 
                          subplot_titles=[opioid1, opioid2])
    
    #add trace for both pie subplots on the opioids that the user selects, then customize
    fig12.add_trace(go.Pie(labels=totalOD.iloc[1:3].index, values=drugSelection1, name=opioid1,
                          marker_colors=['rgb(255,102,255)', 'rgb(51,153,255)']), 1, 1)
    fig12.add_trace(go.Pie(labels=totalOD.iloc[1:3].index, values=drugSelection2, name=opioid2), 1, 2)
    fig12.update_layout(title_text = opioid1 + ' Vs. ' + opioid2 + ' By Gender In ' + str(year) + ' In Relative Figures')
    fig12.show()
    
    #plot a grouped bar chart for the total male vs. female figures for ease of comparison
    fig13 = go.Figure(data=[go.Bar(name=opioid1, x=drugSelection1.index, y=drugSelection1, text=drugSelection1, 
                                   marker_color='rgb(255,102,255)'),
                           go.Bar(name=opioid2, x=drugSelection2.index, y=drugSelection2, text=drugSelection2, 
                                  marker_color='rgb(51,153,255)')])
    
    #add total death values at top of each bar and customize for visualization
    fig13.update_traces(texttemplate='%{text:.2s}', textposition='outside')
    fig13.update_layout(barmode='group',
                       title_text = opioid1 + ' Vs. ' + opioid2 + ' By Gender In ' + str(year) + ' In Absolute Figures',
                       yaxis_title='Total Number of Deaths')
    fig13.show()
    

In [None]:
#For analysis, enter two of the following opioid names with any year between 1999-2018:
#'Prescription Opioids', 'Synthetic Narcotics', 'Heroin', 'Cocaine'
#'Psychostimulants', 'Benzodiazepines', 'Antidepressants'

genderAnalysis('Benzodiazepines', 'Psychostimulants', 2010)

# V. Opioid Overdose Among The Youth

Part 1:

I then wanted to compare the youth figures (ages 15-24) with the adult figures (ages 25+) to get a clear idea of how many young people are overdosing on opioids relative to adults.

In [None]:
#fill the NaNs with 0's so that the plotting isn't affected
#transpose the data frame for an over time analysis
youthOD = youthOD.fillna(0)
youthODTrans = youthOD.transpose()

In [None]:
#plot the youth deaths (15-24) due to opioid overdose versus adults (25+) in a stacked barchart
fig14 = go.Figure(data=[go.Bar(name='Ages 25+ Total Overdose Deaths',
                                 x=totalTranspose.index,
                                 y=totalTranspose['Total Overdose Deaths'] - youthODTrans['Total Overdose Deaths'],
                              text=100 - (youthODTrans['Total Overdose Deaths'] / totalTranspose['Total Overdose Deaths']) * 100,
                              marker_color='rgb(34,94,168)'),
                          go.Bar(name='Ages 15-24 Total Overdose Deaths',
                                 x=youthODTrans.index,
                                 y=youthODTrans['Total Overdose Deaths'],
                                text=(youthODTrans['Total Overdose Deaths'] / totalTranspose['Total Overdose Deaths']) * 100,
                                marker_color='rgb(127,205,187)')])

#show percentages of each value on the stacked bar (percent sign placed after the number)
fig14.update_traces(texttemplate='%{text:.1f}%', textposition='outside')

#customize bar chart for better visualization
fig14.update_layout(barmode='stack', width=1150, height=600,
                   title='Total Opioid Overdose Deaths Ages 15-24 Vs. Ages 25+',
                   xaxis_title='Years',
                   yaxis_title='Number of Deaths')
fig14.show()

# VI. Opioid Overdose By Demographic

Part 1:

First, I wanted to analyze the rate (per 100,000 population of each demographic) at which each race is overdosing, and see which race has the highest number of deaths relative to its population. This gives a good idea of which race is most affected by opioids.
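
For reference, a crude rate per 100,000 population is computed as deaths divided by population, multiplied by 100,000. The sketch below uses made-up numbers and a hypothetical function name `rate_per_100k`; the published CDC rates may be age-adjusted, which involves additional weighting that this sketch does not attempt.

```python
def rate_per_100k(deaths, population):
    """Crude death rate per 100,000 people (no age adjustment)."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole-number inputs.
    return deaths * 100_000 / population


# Made-up numbers for illustration only.
print(rate_per_100k(50, 1_000_000))  # 5.0
```

Expressing deaths as a rate makes groups with very different population sizes comparable on one chart.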

In [None]:
#copy of demographic data frame, removed male and female figures, fill the NaNs with 0's
#and transpose for an over time analysis
demoTotals = demographicOD.copy()
demoTotals = demoTotals[demoTotals.index != 'Male']
demoTotals = demoTotals[demoTotals.index != 'Female']
demoTotals = demoTotals.fillna(0)
demoTotals = demoTotals.transpose()

In [None]:
#variable to get the index of the data frame (years)
x = demoTotals.index

#plot the rate per 100,000 population of each demographic in a scatter plot with lines and markers
fig15 = go.Figure()
fig15.add_trace(go.Scatter(x=x, y=demoTotals.iloc[:, 1], mode='lines+markers', name='White (Non-Hispanic)'))
fig15.add_trace(go.Scatter(x=x, y=demoTotals.iloc[:, 2], mode='lines+markers', name='Black (Non-Hispanic)'))
fig15.add_trace(go.Scatter(x=x, y=demoTotals.iloc[:, 3], mode='lines+markers', name='Asian or Pacific Islander (Non-Hispanic)'))
fig15.add_trace(go.Scatter(x=x, y=demoTotals.iloc[:, 4], mode='lines+markers', name='Hispanic'))
fig15.add_trace(go.Scatter(x=x, y=demoTotals.iloc[:, 5], mode='lines+markers', name='American Indian or Alaska Native (Non-Hispanic)'))

#customize the plot for better visualization
fig15.update_layout(title='Rate of Total Opioid Deaths By Demographic', width=1150, height=650,
                   xaxis_title='Years', yaxis_title='Rate (per 100,000 population)')
fig15.show()

Part 2:

Next, I wanted to compare the total overdose deaths by demographic against the average household income by demographic over the most recent 11 years (2008-2018), to see whether there is any correlation between average income and the number of overdose deaths for each race. The household income data for each demographic was separated into fifths, meaning the income of the lowest 20%, the income between 20% and 40%, and so on. I therefore manually averaged the income across the different fifths, which gave me the overall average for each demographic in each year.
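
The averaging step described above can be sketched as follows, assuming only the five fifths are averaged (a top 5 percent group, if present in the source table, overlaps the highest fifth and would be left out). The income values are made up for illustration.

```python
from statistics import mean

# Made-up mean incomes for the five fifths of one group in one year (USD).
quintile_means = [15_000, 40_000, 65_000, 100_000, 230_000]

# Each fifth covers the same number of households, so the unweighted mean of
# the five group means equals the overall mean household income.
overall_mean = mean(quintile_means)
print(overall_mean)  # 90000
```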

In [None]:
#create a copy of the demographic data frame, and only include the most recent 11 years of data
demo = demoTotals.copy()
demo = demo.iloc[9:]

#get only the columns needed for plotting of demographic figures
newDemo = demo.copy()
newDemo = newDemo.iloc[:, 42:]
newDemo = newDemo.rename_axis('Year', axis='columns')

#read in excel file including the mean income received by each race
excel1 = pd.ExcelFile('Mean Income Recieved by Each Fifth and Top 5 Percent.xlsx')
income = pd.read_excel(excel1, 'average totals', index_col=0)

#merge the demographic opioid overdose data frame with the mean income data frame
demoIncome = pd.merge(newDemo, income, on=[newDemo.index, income.index])

In [None]:
#create dict for the size of each marker on plot and the color for consistency throughout plotting
prescriptionColor = dict(size=12, color='rgb(0,152,255)')
syntheticColor = dict(size=12, color='rgb(239,59,44)')
psychoColor = dict(size=12, color='rgb(65,171,93)')
cocaineColor = dict(size=12, color='rgb(128,125,186)')
heroinColor = dict(size=12, color='rgb(241,105,19)')

#set the mode to lines and markers to better see the trend
mode='lines+markers'

#store the average income of each race as a variable for ease of plotting
avgWhite = demoIncome['Average White Income']
avgBlack = demoIncome['Average Black Income']
avgHispanic = demoIncome['Average Hispanic Income']
avgAsian = demoIncome['Average Asian Income']
avgIndian = demoIncome['Average Indian Income']

#create a function to plot the opioid overdose of each race against the average household income of each
#race. This will plot any race and plot each opioid against the average house income from 2008-2018. 
#The user has the ability to view multiple races or just one and must enter
#the race the user wants to analyze
def raceVsIncome(race):
    
    #create dictionaries based on the race that the user selects in order for the plot 
    #to grab the correct column data from the data frame. Whichever race the user enters, that selected
    #dictionary will be used to plot based on the given race
    if race == 'White':
        columnDict = {'x':avgWhite, 'y1':demoIncome['Total-White-Prescription'], 
                      'y2':demoIncome['Total-White-synthetic'],'y3':demoIncome['Total-White-psycho'], 
                      'y4':demoIncome['Total-White-cocaine'], 'y5':demoIncome['Total-White-heroin']}
        
    elif race == 'Black':
        columnDict = {'x':avgBlack, 'y1':demoIncome['Total-Black-Prescritpion'], 
                      'y2':demoIncome['Total-Black-synthetic'],'y3':demoIncome['Total-Black-psycho'], 
                      'y4':demoIncome['Total-Black-cocaine'], 'y5':demoIncome['Total-Black-heroin']}
        
    elif race == 'Hispanic':
        columnDict = {'x':avgHispanic, 'y1':demoIncome['Total-Hispanic-Prescription'], 
                      'y2':demoIncome['Total-Hispanic-synthetic'],'y3':demoIncome['Total-Hispanic-psycho'], 
                      'y4':demoIncome['Total-Hispanic-cocaine'], 'y5':demoIncome['Total-Hispanic-heroin']}
        
    elif race == 'Asian or Pacific Islander':
        columnDict = {'x':avgAsian, 'y1':demoIncome['Total-Asian-Prescription'], 
                      'y2':demoIncome['Total-Asian-synthetic'],'y3':demoIncome['Total-Asian-psycho'], 
                      'y4':demoIncome['Total-Asian-cocaine'], 'y5':demoIncome['Total-Asian-heroin']}
        
    elif race == 'American Indian or Alaska Native':
        columnDict = {'x':avgIndian, 'y1':demoIncome['Total-Indian-Prescription'], 
                      'y2':demoIncome['Total-Indian-synthetic'],'y3':demoIncome['Total-Indian-psycho'], 
                      'y4':demoIncome['Total-Indian-cocaine'], 'y5':demoIncome['Total-Indian-heroin']}
    else:
        #fail early with a clear message instead of a NameError further down
        raise ValueError('Unknown race: ' + str(race))
    
    #plot the selected demographic from the above dictionary that the user selected with the 5 different
    #opioids against the average household income
    fig16 = go.Figure(data=[go.Scatter(name='Prescription Drugs', x=columnDict['x'], y=columnDict['y1'], 
                                       mode=mode, marker=prescriptionColor),
                           go.Scatter(name='Synthetic Narcotics', x=columnDict['x'], y=columnDict['y2'], 
                                      mode=mode, marker=syntheticColor),
                           go.Scatter(name='Psychostimulants', x=columnDict['x'], y=columnDict['y3'], 
                                      mode=mode, marker=psychoColor),
                           go.Scatter(name='Cocaine', x=columnDict['x'], y=columnDict['y4'], 
                                      mode=mode, marker=cocaineColor),
                           go.Scatter(name='Heroin', x=columnDict['x'], y=columnDict['y5'], 
                                      mode=mode, marker=heroinColor)])
    
    #customize the graph for better visualization
    #the x-axis shows the average (mean) income computed above, so it is labeled as average
    fig16.update_layout(title='Opioid Overdose Vs. Average Household Income Among ' + race + ' Ethnicity (2008 - 2018)',
                        xaxis_title='Average Household Income ($USD)', yaxis_title='Number of Deaths', 
                        height=575, width=1000)
    
    fig16.show()

In [None]:
#call the function to view an individual race, multiple races, or all races against each race's average 
#household income. The user can comment out any of the function calls below, or call the function with any of:
#'White', 'Black', 'Hispanic', 'Asian or Pacific Islander', 'American Indian or Alaska Native'

raceVsIncome('White')
raceVsIncome('Black')
raceVsIncome('Hispanic')
raceVsIncome('Asian or Pacific Islander')
raceVsIncome('American Indian or Alaska Native')

# VII. Future Predictions

While most of the data over the 20-year period shows the number of opioid overdose deaths increasing, there is a decline from 2017 to 2018. This could be because awareness of opioids and their negative effects has become more widespread as the growing number of overdoses has become more visible. Hopefully the death count will continue to decline in the upcoming years.

Among the youth, I believe that the figures will decrease over the next couple of years. Many young celebrities have overdosed on opioids in the last two years, which may have put young people on alert and made them more afraid to try these opioids because of the high risk of overdosing.

Regarding the gender figures, I believe that the gap between male and female overdoses will continue to grow as there are no signs of that gap decreasing. 

Furthermore, the demographics tell a slightly more difficult story. There are very few declines in the rate of overdoses by race, which means there could be more deaths to come, especially given the sharp rise among the Black and American Indian groups. Also, comparing the average income against the total opioid overdoses by demographic, there is a clear correlation between the increase in income and the increase in opioid deaths. One possible explanation is that, because opioids are relatively expensive, rising incomes give people more money to buy them, which then leads to more overdoses.
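
One way to put a number on the relationship between income and deaths is the Pearson correlation coefficient. The sketch below is a plain-Python illustration with made-up yearly values; `pearson_r` is a hypothetical helper and was not used to produce the charts above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up yearly values for illustration only, not taken from the data sets.
income = [60_000, 62_000, 65_000, 69_000]
deaths = [1_000, 1_150, 1_400, 1_900]
r = pearson_r(income, deaths)
print(round(r, 3))
```

A value close to 1 indicates that the two series rise together. Two series that both trend upward over time will tend to score high on this measure, so the coefficient describes the strength of the association rather than its cause.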

# VIII. Conclusion

With the functions and interactive experiences above, there are many different comparisons and analyses that can be conducted to begin drawing conclusions about the opioid epidemic. The group that stood out the most was synthetic narcotics (fentanyl and tramadol), which have been on the rise since 2014. This could be because many other drugs have been laced with synthetic narcotics, and it takes only a small amount for someone to overdose.

Overall, I learned a lot about the opioid epidemic here in the United States, and it has made me more aware of the various factors that make up this epidemic. Looking at the total figures, the gender figures, the youth figures, and the demographic figures has opened my eyes to the scale of the problem. It is sad to see that most of these figures have been increasing year after year. The only decline in overdose deaths was from 2017 to 2018 (the last year of data), which gives me some hope that this opioid epidemic can be brought to an end.

# IX. Learning Processes

Throughout this project, I learned a lot about the techniques and libraries available in Python for analyzing data, for cleaning it, and for managing data sets so that it is easier to understand what the data means. I also used Plotly extensively for the plots, and it was fun to experiment with its different methods and modules.

In the future, I would like to apply statistical analysis to these figures to get a better idea of how each of these measures might change in the upcoming years.