# Code for interacting with the openFDA API

## Plan
In this project, we are looking at device adverse events. We are particularly interested in:
- Cause of failure
- Date of failure (i.e. age of the device)

We want to find the most common cause of failure for a given device and the stage of use at which it occurs.
Relevant fields could be (for the full reference see https://open.fda.gov/device/event/reference/):
### Event
- device_date_of_manufacturer
- date_of_event, date_report, date_received
- previous_use_code, remedial_action
- single_use_flag
### Source
- reprocessed_and_reused_flag
### Device
- device.generic_name
- device.expiration_date_of_device, device.device_age_text
- device.implant_flag, device.date_removed_flag
- device.manufacturer_d_name, device.manufacturer_d_state, device.manufacturer_d_country
### Patient
- patient.sequence_number_outcome, patient.sequence_number_treatment
### Report Text
- mdr_text.text, mdr_text.text_type_code
### Reporter Dependent Fields
#### By user facility / importer
- report_date
- event_location
- manufacturer_name, manufacturer_country
- manufacturer_g1_name, manufacturer_g1_state
### OpenFDA fields
- device_class
### Further interesting fields:
- Source: reporter_occupation_code
- Device: device.device_operator

In [1]:
import numpy as np
import pandas as pd
import json
import requests

baseurl = 'https://api.fda.gov/device/event.json?'
apikey = ''
with open('apikey.txt', 'r') as myfile:
    apikey = myfile.read().replace('\n', '')

In [105]:
# Example of querying, for complete guide go to: https://open.fda.gov/api/
# query = 'search=(device.generic_name:"stent"+AND+reprocessed_and_reused_flag:"Y"+AND+date_of_event:' + \
# '["20150324"+TO+"20170324"]+AND+(_exists_:device_date_of_manufacturer+AND+_exists_:date_of_event))&limit=10'
# # query2 = '_exists_:search=()'
# q1 = baseurl + 'api_key=' + apikey + '&' + query

In [2]:
# Example of querying, for complete guide go to: https://open.fda.gov/api/
start_date = '20150324'
end_date = '20170324'
query = 'search='
limit = 20

In [3]:
list_features = ['device_date_of_manufacturer', # features to check for existence
                 'date_of_event',
                 #'date_report',
                 #'date_received',
                 'previous_use_code',
                 #'remedial_action',
                 'single_use_flag',
                 'reprocessed_and_reused_flag',
                 #'reporter_occupation_code',
                 #'device.date_received',
                 #'device.generic_name' # this allows for empty string! 
                ]

list_features_specific = ['device.openfda.device_name:"sensor"',
                  #'device.implant_flag:"Y"',
                  #'previous_use_code:"I"', # I - initial use, R - reuse, U - unknown, * - invalid data
                  #'device.manufacturer_d_country:"US"' # SZ - Switzerland
                 ]

list_device_names = ["pump",
                     "sensor",
                     "prosthesis",
                     "defibrillator",
                     "pacemaker",
                     "catheter",
                     "electrode",
                     #"wearable",
                     "stent",
                     "ray",
                     "ventilator",
                     "bed",
                     "implant",
                     "lens",
                     #"mds" # https://www.cancer.org/cancer/myelodysplastic-syndrome/about/what-is-mds.html
                     "dialysis",
                     "graft",
                    ]
                  
    
# adding date range
query = query+"date_of_event:[\""+start_date+"\""+"+TO+"+"\""+end_date+"\"]"
for x in list_features:
    query = query + "+AND+_exists_:" + x
for y in list_features_specific:
    query = query + "+AND+" + y

q1 = baseurl + 'api_key=' + apikey + '&' + query + '&' + 'limit=' + str(limit)
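
The string concatenation above could also be wrapped in a small helper, which makes it easier to try different field combinations. This is only a sketch: `build_search_query` and its parameter names are hypothetical, and the API key is left out of the example URL.

```python
def build_search_query(start_date, end_date, exists_fields=(), filters=()):
    """Return the value of the openFDA `search` parameter as a string."""
    # Date range on date_of_event, using the same syntax as the cell above.
    terms = ['date_of_event:["{}"+TO+"{}"]'.format(start_date, end_date)]
    # Require each listed field to be present in a record.
    terms += ['_exists_:' + field for field in exists_fields]
    # Append field:"value" filters verbatim.
    terms += list(filters)
    return '+AND+'.join(terms)


search = build_search_query(
    '20150324', '20170324',
    exists_fields=['device_date_of_manufacturer', 'date_of_event'],
    filters=['device.openfda.device_name:"sensor"'],
)
url = 'https://api.fda.gov/device/event.json?search=' + search + '&limit=20'
print(url)
```

Joining a list of terms with `+AND+` avoids having to remember the separator in every loop.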

In [4]:
dq1 = requests.get(q1)
# dq1.json()['results']
data = json.loads(dq1.text)
number = data['meta']['results']['total'] # check number of matching entries
results = data['results']
number

140504
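
The query matches far more records than the 20 returned by one request. One way to fetch more is to page through the results with the API's `skip` and `limit` parameters. The sketch below only builds the page URLs; `page_urls` is a hypothetical helper, and the default `page_size` and `max_records` values are assumptions that should be checked against the limits documented at https://open.fda.gov/api/.

```python
def page_urls(query_url, total, page_size=100, max_records=1000):
    """Yield one URL per page, appending `limit` and `skip` to query_url.

    query_url must already contain the search (and API key) but no limit.
    """
    # Never request more records than the API reports as matching.
    wanted = min(total, max_records)
    for skip in range(0, wanted, page_size):
        limit = min(page_size, wanted - skip)
        yield '{}&limit={}&skip={}'.format(query_url, limit, skip)


example_url = 'https://api.fda.gov/device/event.json?search=_exists_:date_of_event'
urls = list(page_urls(example_url, total=140504, page_size=100, max_records=250))
for u in urls:
    print(u)
```

Each URL can then be passed to `requests.get` and the `results` lists concatenated.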

In [5]:
# Calling pandas' normalization method when loading the data can save some structuring effort
# dftest = pd.json_normalize(results) # pd.io.json.json_normalize in pandas < 1.0
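
As a rough illustration of what normalization does with the nested `device` list, here is a sketch on two made-up records (the values, including `EXAMPLE CORP.`, are invented for the example and are not API output). It assumes pandas 1.0 or later, where `json_normalize` is available as `pd.json_normalize`.

```python
import pandas as pd

# Two hand-written records shaped like entries of data['results'].
sample_results = [
    {'date_of_event': '20150731',
     'device': [{'generic_name': 'MDS', 'manufacturer_d_name': 'DEXCOM, INC.'},
                {'generic_name': 'PUMP', 'manufacturer_d_name': 'MEDTRONIC MINIMED'}]},
    {'date_of_event': '20160102',
     'device': [{'generic_name': 'STENT', 'manufacturer_d_name': 'EXAMPLE CORP.'}]},
]

# One row per device; the event-level field is repeated for each device.
flat = pd.json_normalize(sample_results,
                         record_path='device',
                         meta=['date_of_event'],
                         record_prefix='device.')
print(flat)
```

Unlike taking `x['device'][0]`, this keeps every device of a report, at the cost of one report spanning several rows.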

In [42]:
results[3]

{'adverse_event_flag': 'N',
 'date_manufacturer_received': '20150909',
 'date_of_event': '20150731',
 'date_received': '20140827',
 'date_report': '20140801',
 'device': [{'brand_name': 'G4 PLATINUM CONTINUOUS GLUCOSE MONITORING SYSTEM',
   'catalog_number': '',
   'date_received': '20140827',
   'date_removed_flag': '',
   'device_age_text': 'DA',
   'device_availability': 'No',
   'device_evaluated_by_manufacturer': 'R',
   'device_event_key': '',
   'device_operator': 'LAY USER/PATIENT',
   'device_report_product_code': 'MDS',
   'device_sequence_number': ' 1.0',
   'expiration_date_of_device': '20150702',
   'generic_name': 'MDS',
   'implant_flag': '',
   'lot_number': '5187435',
   'manufacturer_d_address_1': '6340 SEQUENCE DRIVE',
   'manufacturer_d_address_2': '',
   'manufacturer_d_city': 'SAN DIEGO',
   'manufacturer_d_country': 'US',
   'manufacturer_d_name': 'DEXCOM, INC.',
   'manufacturer_d_postal_code': '92121',
   'manufacturer_d_state': 'CA',
   'manufacturer_d_zip_cod

In [43]:
# Fields of Interest
fois_result = ['device_date_of_manufacturer',
               'date_of_event']
fois_device = [#'generic_name', 
               'expiration_date_of_device', 
               #'device_age_text', 
               #'implant_flag', 
               #'date_removed_flag',
               'manufacturer_d_name', 
               #'manufacturer_d_state',
               #'manufacturer_d_country'
              ]
fois_patient = [#'sequence_number_outcome',
                #'sequence_number_treatment'
              ]
fois_mdrText = ['text',
                'text_type_code']
fois_openfda = ['device_name',
                #'device_class',
                'medical_specialty_description']

# device = data['results'][0]['device'][0]
device = [x['device'][0] for x in data['results']]
# patient = data['results'][0]['patient'][0]
patient = [x['patient'][0] for x in data['results']]
# mdrText = data['results'][0]['mdr_text'][0] # there may be more items in the list! 
mdrText = [x['mdr_text'] for x in data['results']]
#mdrText = [y['text'] for y in [x['mdr_text'][0] for x in data['results']]]
# openfda = data['results'][0]['device'][0]['openfda']
openfda = [x['device'][0]['openfda'] for x in data['results']]
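
The list comprehensions above assume that every record has at least one `device` and one `patient` entry and that each device carries an `openfda` block. If that does not hold for some records, indexing with `[0]` raises an error. A defensive variant might look like the sketch below; `first_or_empty` is a hypothetical helper and the sample records are made up.

```python
def first_or_empty(record, key):
    """Return the first item of record[key], or {} if it is missing or empty."""
    items = record.get(key) or []
    return items[0] if items else {}


# Hand-written records: one with an empty patient list, one without the key.
sample_records = [
    {'device': [{'generic_name': 'MDS'}], 'patient': []},
    {'device': [{'generic_name': 'PUMP',
                 'openfda': {'device_name': 'Pump, Infusion'}}]},
]

sample_device = [first_or_empty(x, 'device') for x in sample_records]
sample_patient = [first_or_empty(x, 'patient') for x in sample_records]
sample_openfda = [d.get('openfda', {}) for d in sample_device]
print(sample_patient, sample_openfda)
```

Empty dictionaries turn into rows of missing values when passed to `pd.DataFrame` with a `columns` list, so the row count stays aligned with `results`.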

In [44]:
fillDic = {'mdr_text_key': '', 'patient_sequence_number': '', 'text': np.nan, 'text_type_code': np.nan}
a = [x[0] if len(x) > 0 else fillDic for x in mdrText]
b = [x[1] if len(x) > 1 else fillDic for x in mdrText] # some records have three or more entries; those beyond the second are dropped here
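
Since the order of the narratives is not fixed and some records have more than two, an alternative is to group all texts of a record by their `text_type_code`. The following is a sketch under that assumption; `texts_by_type` is a hypothetical helper and the sample entries are shortened, made-up narratives.

```python
from collections import defaultdict


def texts_by_type(mdr_text):
    """Join all narratives of one record, grouped by text_type_code."""
    grouped = defaultdict(list)
    for entry in mdr_text:
        text = entry.get('text')
        if text:  # skip entries without narrative text
            grouped[entry.get('text_type_code', 'Unknown')].append(text)
    return {code: ' '.join(texts) for code, texts in grouped.items()}


sample_mdr_text = [
    {'text': '(B)(4).', 'text_type_code': 'Additional Manufacturer Narrative'},
    {'text': 'PATIENT CONTACTED SUPPORT.', 'text_type_code': 'Description of Event or Problem'},
    {'text': 'DEVICE WAS RETURNED.', 'text_type_code': 'Additional Manufacturer Narrative'},
]
grouped_sample = texts_by_type(sample_mdr_text)
print(grouped_sample)
```

Applied to every element of `mdrText`, this gives one dictionary per record, which `pd.DataFrame` turns into one column per narrative type.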

In [45]:
df_results = pd.DataFrame(results, index = range(len(results)), columns = fois_result)
df_openfda = pd.DataFrame(openfda, index = range(len(results)),columns = fois_openfda)
df_device = pd.DataFrame(device, index = range(len(results)),columns = fois_device)
df_patient = pd.DataFrame(patient, index = range(len(results)),columns = fois_patient)
# df_mdrText = pd.DataFrame(mdrText, index = range(len(results)),columns = fois_mdrText)

# df = pd.concat([df_device, df_patient, df_mdrText, df_openfda], axis = 1)

In [50]:
a = pd.DataFrame(a, index = range(len(results)),columns = fois_mdrText)
b = pd.DataFrame(b, index = range(len(results)),columns = fois_mdrText)
df_mdrText = pd.concat([a, b], axis = 1)
df = pd.concat([df_results, df_device, df_patient, df_mdrText, df_openfda], axis = 1)

In [51]:
df['age_of_device_days'] = pd.to_datetime(df['date_of_event'], format='%Y%m%d') \
- pd.to_datetime(df['device_date_of_manufacturer'], format='%Y%m%d')
# Display the frame without the raw date columns (drop returns a copy; df itself is unchanged)
df.drop(['date_of_event','device_date_of_manufacturer'], axis = 1)

| | expiration_date_of_device | manufacturer_d_name | text | text_type_code | text.1 | text_type_code.1 | device_name | medical_specialty_description | age_of_device_days |
|---|---|---|---|---|---|---|---|---|---|
| 0 | | TERUMO CARDIOVASCULAR SYSTEMS CORP. | INVESTIGATION IN PROCESS, BUT NOT YET COMPLETED. | Additional Manufacturer Narrative | THE USER FACILITY'S BIOMEDICAL ENGINEER (BIOME... | Description of Event or Problem | Sensor, Blood-Gas, In-Line, Cardiopulmonary By... | Cardiovascular | 5148 days |
| 1 | | DEXCOM, INC. | PATIENT CONTACTED DEXCOM TECHNICAL SUPPORT ON ... | Description of Event or Problem | | | Sensor, Glucose, Invasive | Unknown | 831 days |
| 2 | 20150402.0 | DEXCOM, INC. | PATIENT CONTACTED DEXCOM TECHNICAL SUPPORT ON ... | Description of Event or Problem | | | Sensor, Glucose, Invasive | Unknown | 429 days |
| 3 | 20150702.0 | DEXCOM, INC. | (B)(4). | Additional Manufacturer Narrative | PATIENT'S MOTHER CONTACTED DEXCOM TECHNICAL SU... | Description of Event or Problem | Sensor, Glucose, Invasive | Unknown | 394 days |
| 4 | | DEXCOM, INC. | (B)(4). THE COMPLAINT DEVICE WAS RETURNED FOR ... | Additional Manufacturer Narrative | (B)(4). | Additional Manufacturer Narrative | Sensor, Glucose, Invasive | Unknown | 428 days |
| 5 | | MEDTRONIC MINIMED | CURRENTLY, IT IS UNKNOWN WHETHER OR NOT THE DE... | Additional Manufacturer Narrative | IT WAS REPORTED THAT THE CUSTOMER RECEIVED BUT... | Description of Event or Problem | Pump, Infusion, Insulin, To Be Used With Invas... | Unknown | 921 days |
| 6 | | DEXCOM, INC. | PATIENT CONTACTED DEXCOM TECHNICAL SUPPORT ON ... | Description of Event or Problem | (B)(4). | Additional Manufacturer Narrative | Sensor, Glucose, Invasive | Unknown | 207 days |
| 7 | | MEDTRONIC MINIMED | IT WAS REPORTED THAT CUSTOMER WAS HOSPITALIZED... | Description of Event or Problem | CURRENTLY, IT IS UNKNOWN WHETHER OR NOT THE DE... | Additional Manufacturer Narrative | Pump, Infusion, Insulin, To Be Used With Invas... | Unknown | 971 days |
| 8 | 20151103.0 | DEXCOM, INC. | (B)(4). | Additional Manufacturer Narrative | PATIENT CONTACTED DEXCOM TECHNICAL SUPPORT ON ... | Description of Event or Problem | Sensor, Glucose, Invasive | Unknown | 488 days |
| 9 | | MEDTRONIC MINIMED | FINDINGS: UNIT RECEIVED WITH PARTIAL MISSING S... | Additional Manufacturer Narrative | IT WAS REPORTED THAT INSULIN PUMP HAS CRACKS A... | Description of Event or Problem | Pump, Infusion, Insulin, To Be Used With Invas... | Unknown | 388 days |
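
The age column above holds timedeltas ("394 days") rather than plain numbers. For later counting or binning, an integer number of days may be easier to work with. The sketch below shows the same calculation for a single record using only the standard library; `age_in_days` is a hypothetical helper, and returning `None` for unparsable dates is an assumption about how missing values should be treated.

```python
from datetime import datetime


def age_in_days(manufactured, event, fmt='%Y%m%d'):
    """Return whole days between two YYYYMMDD strings, or None if unparsable."""
    try:
        delta = datetime.strptime(event, fmt) - datetime.strptime(manufactured, fmt)
    except (TypeError, ValueError):
        # Missing (None) or malformed dates cannot be compared.
        return None
    return delta.days


print(age_in_days('20150101', '20150131'))
print(age_in_days('', '20150131'))
```

In pandas, the equivalent for a whole timedelta column is the `.dt.days` accessor, for example `df['age_of_device_days'].dt.days`.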
