# Calculating opioid-related payments and opioid prescribing costs of local doctors

## Getting Payments Data
1. Access yearly payment data from [OpenPaymentsData.cms.gov](https://openpaymentsdata.cms.gov/browse).
2. For each year (2013-2016), View Data -> Filter -> Recipient_State = 'NY'
3. Rename file to 'General_Payment_NY_`year`.csv'.

## Getting Prescription Data
1. Access Medicare Provider Utilization and Payment Data: Part D Prescriber Summary Table CY`year` from [data.cms.gov/](https://data.cms.gov/).
2. For each year (2013-2016), View Data -> Filter -> nppes_provider_state = 'NY'
3. Rename file to 'Medicare_Perscriber_Summary_`year`.csv' (the spelling matches the filename the code below reads).


In [1]:
import agate
import csv
from collections import OrderedDict
from pprint import pprint
import simplejson as json
import cpi

Medicare categorizes a physician's opiate-related drugs according to [this methodology](https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Downloads/Opioid_Methodology.pdf). I used the following [drug names](https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Downloads/OpioidDrugList.zip) to decide whether a drug company's payment involved an opiate-related drug. However, the [NYS Health Foundation](https://nyshealthfoundation.org/wp-content/uploads/2018/06/following-the-money-pharmaceutical-payments-opioid-prescribing-june-2018.pdf) study excluded some drugs from this categorization. The "Excluded" column in `Opioid_Drug_List_13_16.csv` marks the drugs that I think should be filtered out under the study's methodology: "Namely, all buprenorphine drugs, including buprenorphine/naloxone formulations are excluded, as they are used for opioid addiction treatment. In addition, nonsteroidal anti-inflammatory drugs (NSAIDs) are excluded."

In [2]:
opiates = {}
with open('Opioid_Drug_List_13_16.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        if row['Excluded'] != 'x':
            if row['Year'] in opiates:
                opiates[row['Year']].append(row['Drug Name'])
            else:
                opiates[row['Year']] = [row['Drug Name']]
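The same per-year grouping can also be written with `collections.defaultdict`. This is a minimal sketch on a three-row sample, not the real CMS list: the `SAMPLE_CSV` string, its rows and the `load_drug_names` helper are placeholders, and only the column names (`Year`, `Drug Name`, `Excluded`) come from the cell above.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; the notebook reads Opioid_Drug_List_13_16.csv.
SAMPLE_CSV = """Year,Drug Name,Excluded
2013,OXYCODONE HCL,
2013,BUPRENORPHINE HCL,x
2014,OXYCODONE HCL,
"""

def load_drug_names(handle):
    """Group drug names by year, skipping rows marked 'x' in the Excluded column."""
    names_by_year = defaultdict(set)
    for row in csv.DictReader(handle):
        if row['Excluded'] != 'x':
            # Upper-case once here so later comparisons can use .upper() too.
            names_by_year[row['Year']].add(row['Drug Name'].upper())
    return dict(names_by_year)

sample_opiates = load_drug_names(io.StringIO(SAMPLE_CSV))
print(sample_opiates)
```

Storing each year's names in a set rather than a list keeps the later `in` checks cheap, at the cost of losing the original row order.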

From the state-wide data, I used the [Erie and Niagara counties ZIP codes](https://data.ny.gov/Government-Finance/New-York-State-ZIP-Codes-County-FIPS-Cross-Referen/juva-r6g2/data) to filter non-local results out of each dataset.

In [3]:
needed_zips = []
with open('New_York_State_ZIP_Codes-County_FIPS_Cross-Reference.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        needed_zips.append(row['ZIP Code'])
len(needed_zips)



92
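Some ZIP codes in the payments data arrive in ZIP+4 form, so a membership test needs to compare only the five-digit part. A minimal sketch of that check follows, assuming the local ZIP codes are held in a set; `zip5`, `is_local` and the sample values are hypothetical names and placeholders, not part of the notebook.

```python
def zip5(zip_code):
    """Return the five-digit part of a ZIP or ZIP+4 string."""
    return zip_code.split('-')[0].strip()

def is_local(zip_code, local_zips):
    """True when the five-digit ZIP is in the set of local ZIP codes."""
    return zip_code is not None and zip5(zip_code) in local_zips

# Placeholder set; in the notebook this could be set(needed_zips).
sample_local_zips = {'14221', '14203'}

print(is_local('14221-5703', sample_local_zips))
print(is_local('10001', sample_local_zips))
```

The `None` guard is an assumption about the data: it only matters if a row has no ZIP code at all.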

In [16]:
payment_2016 = agate.TypeTester(limit=100,force={
        'Name_of_Third_Party_Entity_Receiving_Payment_or_Transfer_of_Value': agate.Text(),
        'Product_Category_or_Therapeutic_Area_1': agate.Text(),
        'Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1': agate.Text(),
        'Associated_Drug_or_Biological_NDC_1': agate.Text(),
        'Covered_or_Noncovered_Indicator_1': agate.Text(),
        'Product_Category_or_Therapeutic_Area_2': agate.Text(),
        'Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_2': agate.Text(),
        'Associated_Drug_or_Biological_NDC_2': agate.Text(),
        'Covered_or_Noncovered_Indicator_2': agate.Text(),
        'Product_Category_or_Therapeutic_Area_3': agate.Text(),
        'Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_3': agate.Text(),
        'Associated_Drug_or_Biological_NDC_3': agate.Text(),
        'Covered_or_Noncovered_Indicator_3': agate.Text(),
        'Product_Category_or_Therapeutic_Area_4': agate.Text(),
        'Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_4': agate.Text(),
        'Associated_Drug_or_Biological_NDC_4': agate.Text(),
        'Covered_or_Noncovered_Indicator_4': agate.Text(),
        'Product_Category_or_Therapeutic_Area_5': agate.Text(),
        'Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_5': agate.Text(),
        'Associated_Drug_or_Biological_NDC_5': agate.Text(),
        'Covered_or_Noncovered_Indicator_5': agate.Text(),
        'Number_of_Payments_Included_in_Total_Amount': agate.Number(),
        'Physician_License_State_code3': agate.Text(),
        'Physician_License_State_code4': agate.Text(),
        'Physician_License_State_code5': agate.Text(),
        'Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_ID': agate.Text(),
        'Physician_Profile_ID': agate.Text()
    })
payment_2015 = agate.TypeTester(limit=100,force={
        'Name_of_Third_Party_Entity_Receiving_Payment_or_Transfer_of_Value': agate.Text(),
        'Name_of_Associated_Covered_Drug_or_Biological1': agate.Text(),
        'Name_of_Associated_Covered_Drug_or_Biological2': agate.Text(),
        'Name_of_Associated_Covered_Drug_or_Biological3': agate.Text(),
        'Name_of_Associated_Covered_Drug_or_Biological4': agate.Text(),
        'Name_of_Associated_Covered_Drug_or_Biological5': agate.Text(),
        'Number_of_Payments_Included_in_Total_Amount': agate.Number(),
        'NDC_of_Associated_Covered_Drug_or_Biological1': agate.Text(),
        'NDC_of_Associated_Covered_Drug_or_Biological2': agate.Text(),
        'NDC_of_Associated_Covered_Drug_or_Biological3': agate.Text(),
        'NDC_of_Associated_Covered_Drug_or_Biological4': agate.Text(),
        'NDC_of_Associated_Covered_Drug_or_Biological5': agate.Text(),
        'Physician_License_State_code3': agate.Text(),
        'Physician_License_State_code4': agate.Text(),
        'Physician_License_State_code5': agate.Text(),
        'Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_ID': agate.Text(),
        'Physician_Profile_ID': agate.Text(),
        'Teaching_Hospital_CCN': agate.Text(),
        'Teaching_Hospital_ID': agate.Text(),
        'Teaching_Hospital_Name': agate.Text(),
        'Name_of_Associated_Covered_Device_or_Medical_Supply1': agate.Text(),
        'Name_of_Associated_Covered_Device_or_Medical_Supply2': agate.Text(),
        'Name_of_Associated_Covered_Device_or_Medical_Supply3': agate.Text(),
        'Name_of_Associated_Covered_Device_or_Medical_Supply4': agate.Text(),
        'Name_of_Associated_Covered_Device_or_Medical_Supply5': agate.Text(),
        'Physician_License_State_code2': agate.Text(),
        'Physician_Name_Suffix': agate.Text()
    })
scripts_tester = agate.TypeTester(limit=100, force={
        'beneficiary_race_nat_ind_count': agate.Number(),
        'nppes_provider_zip5': agate.Text(),
        'nppes_provider_zip4': agate.Text(),
        'npi': agate.Text()
    })

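`docs` holds one entry per doctor, keyed by the payment data's `Physician_Profile_ID`, with the doctor's info and each year's totals.
`fix_dirty` maps the payment keys that, because of slight misspellings, wouldn't connect with an NPI number to the NPI they belong to.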

In [33]:
docs = {}
fix_dirty = {'14713': '1316101934', '285771': '1306869417', '280538': '1174530299', '78134': '1710982731', '248720': '1477512259', '21013': '1457337388'}
waghmarae_specific = []

In [34]:
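def inflate_wrapper(amt, year):
    # Convert a dollar amount from `year` into 2016 dollars.
    # Missing or non-numeric amounts are counted as 0.
    try:
        floated = float(amt)
        return cpi.inflate(floated, int(year), to=2016)
    except (TypeError, ValueError):
        return 0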
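For intuition, a CPI adjustment of this kind scales a dollar amount by the ratio of the target-year index to the source-year index. The sketch below shows only that arithmetic. The index values are round placeholders, not real CPI figures, and `adjust_for_inflation` is a hypothetical helper, not part of the `cpi` package.

```python
def adjust_for_inflation(amount, source_index, target_index):
    """Scale `amount` by the ratio of two price-index values."""
    return amount * (target_index / source_index)

# Placeholder index values chosen for easy arithmetic, not real CPI data.
adjusted = adjust_for_inflation(100.0, source_index=200.0, target_index=210.0)
print(round(adjusted, 2))
```

A payment already dated in the target year has a ratio of 1 and is left unchanged, which is why the 2016 figures in the output are not altered by the adjustment.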

In [35]:
def merge_year_data(year):
    scripts_file = 'Medicare_Perscriber_Summary_{0}.csv'.format(year)
    print('Reading in {0}'.format(scripts_file))
    scripts = agate.Table.from_csv(scripts_file, column_types=scripts_tester)
    payments_file = 'General_Payment_NY_{0}.csv'.format(year)
    print('Reading in {0}'.format(payments_file))
    #Payment column fields changed for 2016
    old_data_format = int(year) < 2016
    if old_data_format:
        payments = agate.Table.from_csv(payments_file, column_types=payment_2015)
    else:
        payments = agate.Table.from_csv(payments_file, column_types=payment_2016)
    def check_zip(zip_code):
        #Use zip_code to filter out whether a payment/script involves a local doc.
        if '-' in zip_code:
            zip_list = zip_code.split('-')
            if zip_list[0] in needed_zips:
                return True
            else:
                return False
        else:
            if zip_code in needed_zips:
                return True
            else:
                return False
    local_payments = payments.where(lambda row: check_zip(row['Recipient_Zip_Code']))
    print('Out of {0} payments, {1} are local: {2} percent'.format(len(payments), len(local_payments), (len(local_payments)/len(payments))))
    local_scripts = scripts.where(lambda row: check_zip(row['nppes_provider_zip5']))
    print('Out of {0} perscribers, {1} are local: {2} percent'.format(len(scripts), len(local_scripts), (len(local_scripts)/len(scripts))))
    def opiate_test(row):
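        # Return the upper-cased drug name if the payment lists an opioid, else None.
        # Caveat: the elif chain stops at the first populated drug slot (the first
        # slot marked 'Drug' in the 2016 layout), so an opioid listed after a
        # non-opioid drug is not flagged.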
        if old_data_format:
            if row['Name_of_Associated_Covered_Drug_or_Biological1'] != None:
                upper1 = row['Name_of_Associated_Covered_Drug_or_Biological1'].upper()
                if upper1 in opiates[year]:
                    return upper1
            elif row['Name_of_Associated_Covered_Drug_or_Biological2'] != None:
                upper2 = row['Name_of_Associated_Covered_Drug_or_Biological2'].upper()
                if upper2 in opiates[year]:
                    return upper2
            elif row['Name_of_Associated_Covered_Drug_or_Biological3'] != None:
                upper3 = row['Name_of_Associated_Covered_Drug_or_Biological3'].upper()
                if upper3 in opiates[year]:
                    return upper3
            elif row['Name_of_Associated_Covered_Drug_or_Biological4'] != None:
                upper4 = row['Name_of_Associated_Covered_Drug_or_Biological4'].upper()
                if upper4 in opiates[year]:
                    return upper4
            elif row['Name_of_Associated_Covered_Drug_or_Biological5'] != None:
                upper5 = row['Name_of_Associated_Covered_Drug_or_Biological5'].upper()
                if upper5 in opiates[year]:
                    return upper5
        else:
            if row['Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_1'] == 'Drug':
                if row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1'] != None:
                    upper1 = row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_1'].upper()
                    if upper1 in opiates[year]:
                        return upper1
            elif row['Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_2'] == 'Drug':
                if row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_2'] != None:
                    upper2 = row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_2'].upper()
                    if upper2 in opiates[year]:
                        return upper2
            elif row['Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_3'] == 'Drug':
                if row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_3'] != None:
                    upper3 = row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_3'].upper()
                    if upper3 in opiates[year]:
                        return upper3
            elif row['Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_4'] == 'Drug':
                if row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_4'] != None:
                    upper4 = row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_4'].upper()
                    if upper4 in opiates[year]:
                        return upper4
            elif row['Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_5'] == 'Drug':
                if row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_5'] != None:
                    upper5 = row['Name_of_Drug_or_Biological_or_Device_or_Medical_Supply_5'].upper()
                    if upper5 in opiates[year]:
                        return upper5
    local_opiate_payments = local_payments.compute([
        ('opiate_flag', agate.Formula(agate.Text(), lambda row: opiate_test(row)))
    ]).where(lambda row: row['opiate_flag'] != None)
    if old_data_format:
        print('Out of {0} local payments and {2} drug related, {1} were opioid-related.'.format(len(local_payments),len(local_opiate_payments), len(local_payments.where(lambda row: row['Name_of_Associated_Covered_Drug_or_Biological1'] != None))))
    else:
        print('Out of {0} local payments and {2} drug related, {1} were opioid-related.'.format(len(local_payments),len(local_opiate_payments), len(local_payments.where(lambda row: row['Indicate_Drug_or_Biological_or_Device_or_Medical_Supply_1'] == 'Drug'))))
    for payment in local_opiate_payments.rows:
        #Form doc key by local_opiate payment receiver
        doc_id = payment['Physician_Profile_ID']
        if doc_id in docs:
            if year in docs[doc_id]:
                old_total = docs[doc_id][year]['total']
                old_count = docs[doc_id][year]['count']
                old_total += inflate_wrapper(payment['Total_Amount_of_Payment_USDollars'], year)
                old_count += payment['Number_of_Payments_Included_in_Total_Amount']
                docs[doc_id][year]['total'] = old_total
                docs[doc_id][year]['count'] = old_count
            else:
                docs[doc_id][year] = {'total': inflate_wrapper(payment['Total_Amount_of_Payment_USDollars'], year), 'count': payment['Number_of_Payments_Included_in_Total_Amount']}
        else:
            docs[doc_id] = {'info': {'last_name': payment['Physician_Last_Name'], 'first_name': payment['Physician_First_Name'], 'middle_name': payment['Physician_Middle_Name'], 'zip_code': payment['Recipient_Zip_Code'], 'city': payment['Recipient_City'] }, year: {'total': inflate_wrapper(payment['Total_Amount_of_Payment_USDollars'], year), 'count': payment['Number_of_Payments_Included_in_Total_Amount'] }}
        if doc_id == '280538':
            grab_specific = {}
            grab_specific['payment_amt'] = inflate_wrapper(payment['Total_Amount_of_Payment_USDollars'], year)
            grab_specific['non_adjusted'] = payment['Total_Amount_of_Payment_USDollars']
            grab_specific['payment_count'] = payment['Number_of_Payments_Included_in_Total_Amount']
            grab_specific['manufacter'] = payment['Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name']
            grab_specific['manufacter_id'] = payment['Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_ID']
            grab_specific['payment_date'] = payment['Date_of_Payment']
            grab_specific['payment_id'] = payment['Record_ID']
            grab_specific['opiate'] = payment['opiate_flag']
            grab_specific['payment_form'] = payment['Form_of_Payment_or_Transfer_of_Value']
            grab_specific['payment_nature'] = payment['Nature_of_Payment_or_Transfer_of_Value']
            grab_specific['travel_city'] = payment['City_of_Travel']
            grab_specific['travel_state'] = payment['State_of_Travel']
            grab_specific['payment_context'] = payment['Contextual_Information']
            grab_specific['program_year'] = payment['Program_Year']
            waghmarae_specific.append(grab_specific)
    matched_full = 0
    matched_part = 0
    potential_matches = 0
    matched_previous = 0
    for doc_id, id_info in docs.items():
        matched = False
        #First try to see if npi is in from a previous year
        if 'npi' in docs[doc_id]['info']:
            matched = local_scripts.where(lambda row: row['npi'] == docs[doc_id]['info']['npi'])
            if len(matched) == 1:
                if year in docs[doc_id]:
                    docs[doc_id][year]['opioid_claim_count'] = matched[0]['opioid_claim_count']
                    docs[doc_id][year]['opioid_drug_cost'] = inflate_wrapper(matched[0]['opioid_drug_cost'],year)
                    docs[doc_id][year]['opioid_day_supply'] = matched[0]['opioid_day_supply']
                    docs[doc_id][year]['opioid_bene_count'] = matched[0]['opioid_bene_count']
                    docs[doc_id][year]['opioid_prescriber_rate'] = matched[0]['opioid_prescriber_rate']
                else:
                    docs[doc_id][year] = {'opioid_claim_count': matched[0]['opioid_claim_count'], 'opioid_drug_cost': inflate_wrapper(matched[0]['opioid_drug_cost'], year), 'opioid_day_supply': matched[0]['opioid_day_supply'],'opioid_bene_count': matched[0]['opioid_bene_count'], 'opioid_prescriber_rate': matched[0]['opioid_prescriber_rate'] }
                matched_previous += 1
                matched = True
        #Nope, check to see if payment_id aka doc_id is in the wonky fixes
        elif doc_id in fix_dirty:
                matched = local_scripts.where(lambda row: row['npi'] == fix_dirty[doc_id])
                if len(matched) == 1:
                    if year in docs[doc_id]:
                        docs[doc_id][year]['opioid_claim_count'] = matched[0]['opioid_claim_count']
                        docs[doc_id][year]['opioid_drug_cost'] = inflate_wrapper(matched[0]['opioid_drug_cost'], year)
                        docs[doc_id][year]['opioid_day_supply'] = matched[0]['opioid_day_supply']
                        docs[doc_id][year]['opioid_bene_count'] = matched[0]['opioid_bene_count']
                        docs[doc_id][year]['opioid_prescriber_rate'] = matched[0]['opioid_prescriber_rate']
                    else:
                        docs[doc_id][year] = {'opioid_claim_count': matched[0]['opioid_claim_count'], 'opioid_drug_cost': inflate_wrapper(matched[0]['opioid_drug_cost'], year), 'opioid_day_supply': matched[0]['opioid_day_supply'],'opioid_bene_count': matched[0]['opioid_bene_count'], 'opioid_prescriber_rate': matched[0]['opioid_prescriber_rate'] }
                    matched_previous += 1
                    matched = True 
                    docs[doc_id]['info']['npi'] = fix_dirty[doc_id]
        #Finally go through local Medicare docs to try and find a match.
        else:
            for doc in local_scripts.rows:
                if '-' in id_info['info']['zip_code']:
                    clean_zip = id_info['info']['zip_code'].split('-')[0]
                else:
                    clean_zip = id_info['info']['zip_code']
                #first check of last_name & first_name & middle_name & zip (w/o) hyphen
                if id_info['info']['last_name'] == doc['nppes_provider_last_org_name'] and id_info['info']['first_name'] == doc['nppes_provider_first_name'] and id_info['info']['middle_name'] == doc['nppes_provider_mi'] and clean_zip == doc['nppes_provider_zip5']:
                    matched_full += 1
                    matched = True
                    docs[doc_id]['info']['npi'] = doc['npi']
                    if year in docs[doc_id]:
                        docs[doc_id][year]['opioid_claim_count'] = doc['opioid_claim_count']
                        docs[doc_id][year]['opioid_drug_cost'] =  inflate_wrapper(doc['opioid_drug_cost'], year)
                        docs[doc_id][year]['opioid_day_supply'] = doc['opioid_day_supply']
                        docs[doc_id][year]['opioid_bene_count'] = doc['opioid_bene_count']
                        docs[doc_id][year]['opioid_prescriber_rate'] = doc['opioid_prescriber_rate']
                    else:
                        docs[doc_id][year] = {'opioid_claim_count': doc['opioid_claim_count'], 'opioid_drug_cost': inflate_wrapper(doc['opioid_drug_cost'], year), 'opioid_day_supply': doc['opioid_day_supply'],'opioid_bene_count': doc['opioid_bene_count'], 'opioid_prescriber_rate': doc['opioid_prescriber_rate'] }
                    break
                #second check of last_name & first_name & zip (w/o) hyphen
                elif id_info['info']['last_name'] == doc['nppes_provider_last_org_name'] and id_info['info']['first_name'] == doc['nppes_provider_first_name'] and clean_zip == doc['nppes_provider_zip5']:
                    matched_part += 1
                    matched = True
                    docs[doc_id]['info']['npi'] = doc['npi']
                    if year in docs[doc_id]:
                        docs[doc_id][year]['opioid_claim_count'] = doc['opioid_claim_count']
                        docs[doc_id][year]['opioid_drug_cost'] = inflate_wrapper(doc['opioid_drug_cost'], year)
                        docs[doc_id][year]['opioid_day_supply'] = doc['opioid_day_supply']
                        docs[doc_id][year]['opioid_bene_count'] = doc['opioid_bene_count']
                        docs[doc_id][year]['opioid_prescriber_rate'] = doc['opioid_prescriber_rate']
                    else:
                        docs[doc_id][year] = {'opioid_claim_count': doc['opioid_claim_count'], 'opioid_drug_cost': inflate_wrapper(doc['opioid_drug_cost'],year),'opioid_day_supply': doc['opioid_day_supply'],'opioid_bene_count': doc['opioid_bene_count'], 'opioid_prescriber_rate': doc['opioid_prescriber_rate'] }
                    break
        #Spit out matches to see if there's others close.
        #Might have moved offices...
        if not matched:
            same_last = local_scripts.where(lambda row: id_info['info']['last_name'] == row['nppes_provider_last_org_name']).select(['nppes_provider_last_org_name', 'nppes_provider_first_name', 'nppes_provider_mi','nppes_provider_zip5', 'nppes_provider_city', 'npi'])
            if len(same_last) != 0:
                print('Finding {0}, {1} {2} {3} {4} {5}'.format(id_info['info']['last_name'], id_info['info']['first_name'], id_info['info']['middle_name'],id_info['info']['zip_code'], id_info['info']['city'], doc_id))
                same_last.print_table()
                potential_matches += 1
    print('****Out of {0} local opiate paymenters found payment info on {1} matches and {4} already matched, {2} percent and potential clean on {3}****'.format(len(docs), (matched_full + matched_part), ((matched_full + matched_part + matched_previous)/len(docs)),potential_matches, matched_previous))    

Reading in Medicare_Perscriber_Summary_2013.csv

Reading in General_Payment_NY_2013.csv

Out of 321385 payments, 20032 are local: 0.062330226986324816 percent

Out of 85843 perscribers, 4436 are local: 0.05167573360670061 percent

Out of 20032 local payments and 15893 drug related, 304 were opioid-related.

****Out of 52 local opiate paymenters found payment info on 2 matches and 30 already matched, 0.6153846153846154 percent and potential clean on 3****
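Because `opiate_test` chains its column checks with `elif`, a payment's later drug slots are examined only when the earlier ones are empty. One alternative, sketched here under the assumption that every listed drug should be tested, loops over all five name columns. `first_opioid_name` is a hypothetical helper, the row is a plain dict standing in for an agate row, and the drug names are placeholders.

```python
# Column names for the pre-2016 payment layout, slots 1 through 5.
OLD_NAME_COLUMNS = [
    'Name_of_Associated_Covered_Drug_or_Biological{0}'.format(i)
    for i in range(1, 6)
]

def first_opioid_name(row, name_columns, opioid_names):
    """Return the first listed drug name found in opioid_names, else None."""
    for column in name_columns:
        name = row.get(column)
        if name is not None and name.upper() in opioid_names:
            return name.upper()
    return None

# Placeholder row: a non-opioid in slot 1 and an opioid in slot 2.
sample_row = {
    'Name_of_Associated_Covered_Drug_or_Biological1': 'Placebo-X',
    'Name_of_Associated_Covered_Drug_or_Biological2': 'Oxycodone HCl',
}
print(first_opioid_name(sample_row, OLD_NAME_COLUMNS, {'OXYCODONE HCL'}))
```

Switching to a check like this would likely raise the opioid-related counts reported in the cell outputs, so the numbers in this notebook should be read as produced by the original `elif` logic.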

In [36]:
merge_year_data('2013')

Reading in Medicare_Perscriber_Summary_2013.csv
Reading in General_Payment_NY_2013.csv
Out of 321385 payments, 20032 are local: 0.062330226986324816 percent
Out of 85843 perscribers, 4436 are local: 0.05167573360670061 percent
Out of 20032 local payments and 15893 drug related, 304 were opioid-related.
Finding ZHOU, XIN None 14221 BUFFALO 341978
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ------------------- | ---------- |
| ZHOU                 | XIN                  |                   | 14051               | EAST AMHERST        | 1538377437 |
Finding RIGA, PETER None 14203 Buffalo 316256
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ----------------
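The name matching inside `merge_year_data` first tries last name, first name, middle name and five-digit ZIP, then falls back to the same test without the middle name. A compact sketch of that two-tier idea follows. `match_tier` is a hypothetical helper, the records are plain dicts with made-up values, and only the field names are taken from the notebook.

```python
def match_tier(payment_info, prescriber):
    """Return 'full', 'partial', or None for a payment/prescriber pair."""
    clean_zip = payment_info['zip_code'].split('-')[0]
    same_core = (
        payment_info['last_name'] == prescriber['nppes_provider_last_org_name']
        and payment_info['first_name'] == prescriber['nppes_provider_first_name']
        and clean_zip == prescriber['nppes_provider_zip5']
    )
    if not same_core:
        return None
    if payment_info['middle_name'] == prescriber['nppes_provider_mi']:
        return 'full'
    return 'partial'

# Made-up records for illustration only.
sample_payment = {'last_name': 'DOE', 'first_name': 'JANE',
                  'middle_name': 'Q', 'zip_code': '14221-0000'}
sample_prescriber = {'nppes_provider_last_org_name': 'DOE',
                     'nppes_provider_first_name': 'JANE',
                     'nppes_provider_mi': None,
                     'nppes_provider_zip5': '14221'}
print(match_tier(sample_payment, sample_prescriber))
```

Exact string equality is sensitive to case and spelling, and a doctor who changed offices will fail the ZIP test, which is presumably why the unmatched names are printed for manual review.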

In [37]:
merge_year_data('2014')

Reading in Medicare_Perscriber_Summary_2014.csv
Reading in General_Payment_NY_2014.csv
Out of 828332 payments, 53993 are local: 0.06518280109907622 percent
Out of 86814 perscribers, 4470 are local: 0.05148939111203262 percent
Out of 53993 local payments and 42871 drug related, 879 were opioid-related.
Finding ZHOU, XIN None 14221 BUFFALO 341978
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ------------------- | ---------- |
| ZHOU                 | XIN                  |                   | 14051               | EAST AMHERST        | 1538377437 |
Finding TEETER, JENNIFER MARIE 14221 WILLIAMSVILLE 339859
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | -----

In [38]:
merge_year_data('2015')

Reading in Medicare_Perscriber_Summary_2015.csv
Reading in General_Payment_NY_2015.csv
Out of 818414 payments, 52247 are local: 0.06383932826173551 percent
Out of 89057 perscribers, 4529 are local: 0.05085507034820396 percent
Out of 52247 local payments and 41470 drug related, 896 were opioid-related.
Finding ZHOU, XIN None 14221 BUFFALO 341978
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ------------------- | ---------- |
| ZHOU                 | XIN                  |                   | 14051               | EAST AMHERST        | 1538377437 |
Finding RAMKUMAR, BHUVANESWARI Guntur 14225 Buffalo 652328
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ----

In [39]:
merge_year_data('2016')

Reading in Medicare_Perscriber_Summary_2016.csv
Reading in General_Payment_NY_2016.csv
Out of 804403 payments, 50596 are local: 0.0628988206160345 percent
Out of 91449 perscribers, 4561 are local: 0.04987479360080482 percent
Out of 50596 local payments and 33982 drug related, 587 were opioid-related.
Finding ZHOU, XIN None 14221 BUFFALO 341978
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ------------------- | ---------- |
| ZHOU                 | XIN                  |                   | 14051               | EAST AMHERST        | 1538377437 |
Finding BARNES, STEVEN None 14070 GOWANDA 12789
| nppes_provider_la... | nppes_provider_fi... | nppes_provider_mi | nppes_provider_zip5 | nppes_provider_city | npi        |
| -------------------- | -------------------- | ----------------- | ------------------- | ----------------

In [40]:
pprint(docs)

{'100827': {'2015': {'count': Decimal('4'),
                     'opioid_bene_count': Decimal('129'),
                     'opioid_claim_count': Decimal('600'),
                     'opioid_day_supply': Decimal('16977'),
                     'opioid_drug_cost': 51280.247787289525,
                     'opioid_prescriber_rate': Decimal('5.19'),
                     'total': 40.87927275258737},
            '2016': {'count': Decimal('2'),
                     'opioid_bene_count': Decimal('160'),
                     'opioid_claim_count': Decimal('687'),
                     'opioid_day_supply': Decimal('19185'),
                     'opioid_drug_cost': 57634.22,
                     'opioid_prescriber_rate': Decimal('5.66'),
                     'total': 30.56},
            'info': {'city': 'ELMA',
                     'first_name': 'DAWN',
                     'last_name': 'GAIS',
                     'middle_name': 'ALEXANDRA',
                     'npi': '1225095003',
                 

In [41]:
with open('export.json', 'w') as outfile:
    json.dump(docs, outfile,use_decimal=True)
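`simplejson` serialises the `Decimal` values directly through `use_decimal=True`. Where only the standard library is available, a roughly equivalent approach is to pass a `default` function to `json.dumps`. The sketch below converts each `Decimal` to `float`, which assumes that float precision is acceptable for these counts; `decimal_default` and the sample dictionary are illustrative, not part of the notebook.

```python
import json
from decimal import Decimal

def decimal_default(value):
    """Fallback encoder: turn Decimal into float, reject anything else."""
    if isinstance(value, Decimal):
        return float(value)
    raise TypeError('Cannot serialise {0!r}'.format(value))

# Illustrative structure shaped like one year of a docs entry.
sample = {'2015': {'count': Decimal('4'), 'total': 40.88}}
encoded = json.dumps(sample, default=decimal_default)
print(encoded)
```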

In [48]:
csv_holder = [['npi', 'pay_id', 'last_name', 'first_name', 'middle_name', 'city', 'zip_code', 'pay1315', 'opioid_drug_cost1315', 'opioid_bene_count_av', 'opioid_claim_count_av', 'opioid_day_supply_av', 'opioid_prescriber_rate_av']]
for doc_id, doc_info in docs.items():
    if 'npi' in doc_info['info']:
        npi = doc_info['info']['npi']
    else:
        npi = 'Unknown'
    row = [npi,doc_id,doc_info['info']['last_name'],doc_info['info']['first_name'],doc_info['info']['middle_name'],doc_info['info']['city'],doc_info['info']['zip_code']]
    years = doc_info.keys()
    total_payment = 0
    total_opiate_cost = 0
    total_opioid_bene_count = 0
    total_opioid_claim_count = 0
    total_opioid_day_supply = 0
    total_opioid_prescriber_rate = 0
    year_count = 0
    for year in years:
        if year != 'info' and year != '2016':
            if 'total' in doc_info[year]:
                total_payment += doc_info[year]['total']
            if 'opioid_drug_cost' in doc_info[year]:
                if doc_info[year]['opioid_drug_cost'] != None:
                    total_opiate_cost += doc_info[year]['opioid_drug_cost']
                    if doc_info[year]['opioid_bene_count'] != None:
                        total_opioid_bene_count += doc_info[year]['opioid_bene_count']
                    if doc_info[year]['opioid_claim_count'] != None:
                        total_opioid_claim_count += doc_info[year]['opioid_claim_count']
                    if doc_info[year]['opioid_day_supply'] != None:
                        total_opioid_day_supply += doc_info[year]['opioid_day_supply']
                    if doc_info[year]['opioid_prescriber_rate'] != None:
                        total_opioid_prescriber_rate += doc_info[year]['opioid_prescriber_rate']
                    year_count += 1
    row.append(total_payment)
    row.append(total_opiate_cost)
    try:
        row.append(total_opioid_bene_count/year_count)
        row.append(total_opioid_claim_count/year_count)
        row.append(total_opioid_day_supply/year_count)
        row.append(total_opioid_prescriber_rate/year_count)
    except ZeroDivisionError:
        row.append('NA')
        row.append('NA')
        row.append('NA')
        row.append('NA')
    csv_holder.append(row)
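The averages above divide by the number of years that have an `opioid_drug_cost` entry, even when an individual metric is missing in one of those years. If a per-metric denominator is preferred, one option is a small helper such as the hypothetical `mean_of_present` below, which averages only the values that are not `None`. The sample values are illustrative.

```python
from decimal import Decimal

def mean_of_present(values):
    """Average the values that are not None; return 'NA' if none are present."""
    present = [v for v in values if v is not None]
    if not present:
        return 'NA'
    return sum(present) / len(present)

# One year has a suppressed (None) value and is left out of the average.
print(mean_of_present([Decimal('149'), None, Decimal('161')]))
```

Returning `'NA'` mirrors the placeholder the export cell writes when there are no years to average.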

In [49]:
with open('export.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for row in csv_holder:
        writer.writerow(row)

In [50]:
with open('waghmarae_specific.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=waghmarae_specific[0].keys())
    writer.writeheader()
    for row in waghmarae_specific:
        writer.writerow(row)

In [30]:
pprint(docs['280538'])

{'2013': {'count': Decimal('139'),
          'opioid_bene_count': Decimal('149'),
          'opioid_claim_count': Decimal('1453'),
          'opioid_day_supply': Decimal('42251'),
          'opioid_drug_cost': 565660.6468745305,
          'opioid_prescriber_rate': Decimal('54.2'),
          'total': 65266.270787784866},
 '2014': {'count': Decimal('277'),
          'opioid_bene_count': Decimal('161'),
          'opioid_claim_count': Decimal('1245'),
          'opioid_day_supply': Decimal('36134'),
          'opioid_drug_cost': 377969.4483615927,
          'opioid_prescriber_rate': Decimal('53.27'),
          'total': 127169.26049565763},
 '2015': {'count': Decimal('270'),
          'opioid_bene_count': Decimal('158'),
          'opioid_claim_count': Decimal('1101'),
          'opioid_day_supply': Decimal('31747'),
          'opioid_drug_cost': 258674.85355915403,
          'opioid_prescriber_rate': Decimal('52.4'),
          'total': 141454.16169941385},
 '2016': {'count': Decimal('107'