# Research Question 2: What do opioid-tagged projects look like over the past *10* fiscal years?

## RQ 2.1: How many opioid projects were funded in the past ten fiscal years?
* Answer: 8,384 out of 966,317 (about 0.868%); the remaining 957,933 projects are not opioid-tagged
* 7,965 of these are NIH projects

## RQ 2.2: How many project dollars went to opioid projects in the past ten fiscal years?
* Answer: $3.3 billion out of $395 billion (or 0.847%)


In [3]:
import pandas as pd
import numpy as np

# cd to the directory with data
# %cd '/path/to/your/data'

# download csv(s) of project data from https://federalreporter.nih.gov/FileDownload

In [19]:
fiscal_years = ['2010','2011','2012','2013','2014','2015','2016','2017','2018']
prefix = 'FedRePORTER_PRJ_C_FY'
suffix = '.csv'

# start with the FY2009 file; the loop below adds FY2010 through FY2018
file = 'FedRePORTER_PRJ_C_FY2009.csv'
frames = [pd.read_csv(file, skipinitialspace=True, encoding='utf-8')]

for year in fiscal_years:
    file = prefix + year + suffix
    print('Reading in ' + file)
    frames.append(pd.read_csv(file, skipinitialspace=True, encoding='utf-8'))

# concatenate once at the end (DataFrame.append was removed in pandas 2.0)
df = pd.concat(frames, ignore_index=True)

df.shape

Reading in FedRePORTER_PRJ_C_FY2010.csv
Reading in FedRePORTER_PRJ_C_FY2011.csv


Reading in FedRePORTER_PRJ_C_FY2012.csv
Reading in FedRePORTER_PRJ_C_FY2013.csv
Reading in FedRePORTER_PRJ_C_FY2014.csv
Reading in FedRePORTER_PRJ_C_FY2015.csv
Reading in FedRePORTER_PRJ_C_FY2016.csv
Reading in FedRePORTER_PRJ_C_FY2017.csv
Reading in FedRePORTER_PRJ_C_FY2018.csv


(966317, 24)
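
The loading step above can also be wrapped in a small helper that fails early when a yearly file is missing. This is only a sketch: the function name `load_fiscal_years` and its `data_dir` argument are hypothetical, and the file naming pattern is assumed to match the `FedRePORTER_PRJ_C_FY<year>.csv` files used in this notebook.

```python
from pathlib import Path

import pandas as pd


def load_fiscal_years(data_dir, years, prefix='FedRePORTER_PRJ_C_FY', suffix='.csv'):
    """Read one CSV per fiscal year and stack them into a single DataFrame.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    data_dir = Path(data_dir)
    paths = [data_dir / f'{prefix}{year}{suffix}' for year in years]

    # report every missing file at once instead of failing partway through
    missing = [p.name for p in paths if not p.exists()]
    if missing:
        raise FileNotFoundError('Missing files: ' + ', '.join(missing))

    frames = [pd.read_csv(p, skipinitialspace=True, encoding='utf-8') for p in paths]
    return pd.concat(frames, ignore_index=True)
```

With the ten downloaded files in the working directory, `load_fiscal_years('.', range(2009, 2019))` should return the same stacked table as the loop above.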

In [21]:
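# flag rows whose PROJECT_TERMS mention "opioid" (case-insensitive substring match;
# rows with missing terms count as no match)
# the flag is stored as the string '1' or '', which the later cells rely on
df['opioid'] = np.where(
    df['PROJECT_TERMS'].str.contains("opioid", case=False, na=False), '1', '')

# numeric version of the flag: 1.0 for opioid rows, NaN otherwise
df['opioid_num'] = pd.to_numeric(df['opioid'])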
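
To see what the flag picks up, here is a minimal sketch on a toy table. The `PROJECT_TERMS` values below are made up for illustration, and the flag is built as a plain 0/1 integer rather than with the two-column approach used above.

```python
import pandas as pd

# toy stand-in for the PROJECT_TERMS column (hypothetical values)
toy = pd.DataFrame({
    'PROJECT_TERMS': [
        'Analgesics; Opioid; Pain',
        'opioid receptor; Mouse',
        'Asthma; Lung',
        None,
    ]
})

# case=False matches "Opioid" and "opioid"; na=False treats missing terms as no match
is_opioid = toy['PROJECT_TERMS'].str.contains('opioid', case=False, na=False)
toy['opioid_num'] = is_opioid.astype(int)

print(toy['opioid_num'].tolist())  # [1, 1, 0, 0]
```

Because this is a substring match, any term containing "opioid" (for example "opioids") is flagged as well.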

In [22]:
# RQ 2.1
n_opioid = int(df['opioid_num'].sum())  # opioid_num is 1.0 for flagged rows, NaN otherwise
n_total = len(df)
print('# of opioid projects = ' + str(n_opioid) + '\n')
# share of all projects (dividing by the non-opioid count instead would overstate it)
print('% opioid projects of total = ' + str(round(100 * n_opioid / n_total, 3)) + '%')

# opioid project counts by agency
grouped = df.groupby(['AGENCY'])
grouped['opioid_num'].sum()

# of opioid projects = 8384

% opioid projects of total = 0.868%


AGENCY
ACF           2.0
AHRQ         28.0
ALLCDC       83.0
ARS           0.0
CCCRP         0.0
CDMRP        61.0
CNRM          0.0
DVBIC         0.0
EPA           0.0
FDA          20.0
FS            0.0
IES           0.0
NASA          0.0
NIDILRR       4.0
NIFA          0.0
NIH        7965.0
NSF          36.0
VA          185.0
Name: opioid_num, dtype: float64
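
The per-agency counts above can be turned into shares of all opioid projects. The sketch below retypes the nonzero counts from the printed output rather than recomputing them from `df`; the variable names are illustrative.

```python
import pandas as pd

# opioid project counts copied from the per-agency output above
# (agencies with zero opioid projects are omitted)
agency_counts = pd.Series({
    'ACF': 2, 'AHRQ': 28, 'ALLCDC': 83, 'CDMRP': 61, 'FDA': 20,
    'NIDILRR': 4, 'NIH': 7965, 'NSF': 36, 'VA': 185,
})

total = agency_counts.sum()  # matches the 8,384 opioid projects reported for RQ 2.1
share_pct = (100 * agency_counts / total).round(2).sort_values(ascending=False)

print(total)
print(share_pct.head(3))
```

NIH accounts for roughly 95% of the opioid-tagged projects, with VA a distant second at about 2%.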

In [24]:
# RQ 2.2
opioid_cost = df[df.opioid == '1']['FY_TOTAL_COST'].sum()
total_cost = df['FY_TOTAL_COST'].sum()
print('Opioid project costs = $' + str(opioid_cost))
print('\n' + 'Total project costs = $' + str(total_cost))
print('\n' + 'Pct opioid costs over total = %' + str(100 * (opioid_cost/total_cost)))

Opioid project costs = $3346638620.0

Total project costs = $395102282639.0

Pct opioid costs over total = %0.8470309504786592
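
For reporting, the raw totals can be rounded into the short form used in the RQ 2.2 answer. This is a small sketch using the two totals printed above; the `billions` helper is a hypothetical name.

```python
# totals copied from the printed output above
opioid_cost = 3346638620.0
total_cost = 395102282639.0


def billions(amount):
    """Format a dollar amount in billions with one decimal place (hypothetical helper)."""
    return '${:.1f}b'.format(amount / 1e9)


pct = 100 * opioid_cost / total_cost
summary = '{} out of {} ({:.3f}%)'.format(billions(opioid_cost), billions(total_cost), pct)
print(summary)  # $3.3b out of $395.1b (0.847%)
```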
