<a href="https://colab.research.google.com/github/ufbfung/openfda/blob/main/Query_FDA_for_List_of_Indications_for_T2DM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## openFDA API key usage

## Using your API key
Your API key should be passed to the API as the value of the api_key parameter. Include it before other parameters, such as the search parameter. For example:

https://api.fda.gov/drug/event.json?api_key=yourAPIKeyHere&search=...

## HTTPS requests only
Alternatively your API key may be provided as a basic auth username. For example:

Authorization: Basic eW91ckFQSUtleUhlcmU6

openFDA requires you to use https://api.fda.gov for all queries to ensure secure communication.

# Setup Environment
This section will include setting up the environment that includes
- API key
- Importing relevant libraries
- Defining API endpoints from FDA
- Defining functions to retrieve data from FDA

In [23]:
# Import relevant libraries
import requests # for calling APIs
import pandas as pd # for manipulating data retrieved from APIs

In [54]:
# Set endpoints for FDA APIs
ndc_url = 'https://api.fda.gov/drug/ndc.json?search=' # can add the following for ane example 'dea_schedule:"CIV"&limit=5'
druglabel_url = 'https://api.fda.gov/drug/label.json?search=' # can add drug_interactions:caffeine&limit=5

In [9]:
import requests

# List of NDC codes for which you want to retrieve indications
ndc_codes = ['0169-2911', '63629-5698', '55154-5150-6', '55154-5150-4']

# Loop through the NDC codes and retrieve indications
for ndc in ndc_codes:
    url = f"https://api.fda.gov/drug/ndc.json?search=product_ndc:{ndc}&limit=1&api_key={api_key}"

    # Send a GET request to the FDA API
    response = requests.get(url)

    # Check if the request was successful (status code 200)
    if response.status_code == 200:
        data = response.json()

        # Check if the 'indications_and_usage' key exists in the response
        if 'indications_and_usage' in data['results'][0]:
            indications = data['results'][0]['indications_and_usage']

            # Print the NDC code and its indications
            print(f"NDC: {ndc}")
            print(f"Indications: {indications}")
            print()
        else:
            print(f"No indications found for NDC: {ndc}")
    else:
        print(f"Failed to retrieve data for NDC: {ndc}")



No indications found for NDC: 0169-2911
No indications found for NDC: 63629-5698
Failed to retrieve data for NDC: 55154-5150-6
Failed to retrieve data for NDC: 55154-5150-4


# Define Functions
This section will define the functions that we will plan to reuse throughout notebook. Notable functions will include:
- Retrieving data from an endpoint
- Retrieving data from FDA's drug endpoint
- Retrieving data from FDA's drug label endpoint
- Retrieving data for the indications column

In [39]:
# Create function to work with openFDA APIs

def get_api(url):
  response = requests.get(url)

  if response.status_code == 200:
    data = response.json()
    results = data['results']
    df = pd.json_normalize(results) # Flatten the nested JSON response

    return df

  else:
        print(f"Failed to retrieve data. Status code: {response.status_code}")

In [154]:
# Build API request to FDA's drug endpoint
ndc_url = 'https://api.fda.gov/drug/ndc.json?search=' # can add the following for ane example 'dea_schedule:"CIV"&limit=5'

# Define list of pharmaceutical classes to retrieve
pharmaceutical_classes = ['Biguanide [EPC]',
                          'Sulfonylurea [EPC]',
                          'Antihypoglycemic Agent [EPC]',
                          'GLP-1 Receptor Agonist [EPC]']

# Set a limit of results
limit = 100

# Construct search query from list of pharmaceutical classes
search_query = '(' + '+OR+'.join(['"' + pharm_class + '"' for pharm_class in pharmaceutical_classes]) + ')' + f'&limit={limit}'

# Retrieve data from API and store into dataframe
df = get_api(ndc_url + search_query)

# Set columns of interest for troubleshooting and validation
columns_to_extract = ['generic_name',
                      'brand_name',
                      'product_ndc',
                      'openfda.spl_set_id']

# Filter dataframe for only columns of interest
df = df[columns_to_extract]

# Filter for unique values in 'openfda.spl_set_id' column
df = df[df['openfda.spl_set_id'].duplicated(keep=False)]

# Create input to use in FDA's drug label API
spl_setlabels = df['openfda.spl_set_id']

# Set a limit of results - will use the same limit from previous call
# limit = 20

# Convert list of spl_setlabels to a string
spl_setlabels_str = '+OR+'.join(['"' + str(label) + '"' for label in spl_setlabels])

# Construct search query from the string of spl_setlabels
spl_search_query = '(' + spl_setlabels_str + ')' + f'&limit={limit}'

# Retrieve data from FDA's label API
labels_of_interest = get_api(druglabel_url + spl_search_query)

# Filter for only columns of interest
indications = labels_of_interest['indications_and_usage']
original_ndc = labels_of_interest['openfda.original_packager_product_ndc']
generic_name = labels_of_interest['openfda.generic_name']
brand_name = labels_of_interest['openfda.brand_name']
spl_set_id = labels_of_interest['openfda.spl_set_id']

spl_set_id

# Print the NDC code and its indications
#print('Length of Meds DF',len(df))
#print('Length of SPL indications',len(indications))


0     [07ad4366-4b21-f633-49f3-c2b35f88168d]
1     [0fdd0255-0055-65f3-b2c0-db8fbb87beae]
2     [0ff1ba75-3c86-4baf-a6b1-1bb883839866]
3     [1b18b903-eb52-4d08-b4a5-f984957eb116]
4     [2239742b-74e1-4bc2-b5a8-15c4758d6f7b]
5     [27f15fac-7d98-4114-a2ec-92494a91da98]
6     [28eefc95-d92e-4555-b6c5-2933860b0610]
7     [2aab19ce-1e5e-44ee-85e5-d8f125f9bb1a]
8     [372af566-d6ec-446f-88db-0b4cc83b577f]
9     [6d9179a8-9579-4804-a4c7-633f37aef1ee]
10    [83cb7914-a683-47bb-a713-f2bc6a596bd2]
11    [86acdd9e-8637-43e4-8bf7-b3a123b871b8]
12    [87781ea8-62ec-483c-a4dd-8c965ce59485]
13    [94a7f96e-2ed1-432d-bc6a-5840863816e1]
14    [969c5b7d-cde1-4e31-96c9-5d7f46866682]
15    [a9e7608f-540c-d90c-e053-2a95a90a4f98]
16    [bd2e1c06-424b-4222-b723-90ffdcc3983c]
17    [c9c3fa3e-af0c-42b9-a7c0-181642d2b1ea]
18    [d91d021f-f644-eac7-e053-2995a90a81c4]
19    [dc65065b-cba0-46d7-8fb5-3368f3210b4d]
20    [f4b2b88c-946c-4670-8538-84f858b7af33]
21    [f86bda10-8b25-4125-8bb1-1b5917969ff0]
22    [f8b

# Export results to CSV
The SPL files from FDA are quite lengthy and unreadable. Thus, exporting it into a CSV file and wrapping the texts within an excel is more ideal. Further, this CSV file can be used to review with subject matter exports.

In [145]:
# Export to csv file
indications.to_csv('indications.csv', index=False)