# IQVIA NLP - Adverse Events Coding (MedDRA)

## API Description
The unstructured format of biomedical journals and of abstracts in MEDLINE, PMC, CT.gov and FDA makes adverse events hard to detect and capture.

For data scientists, the MedDRA NLP API solves this problem by offering fast and accurate automated identification of adverse events in a range of publications from MEDLINE, PMC, CT.gov and FDA, and by coding them into standardized MedDRA concepts.

## Accessing the API
To consume this API, first request access to the Adverse Events Coding (MedDRA) API via this link: <LINK PLACEHOLDER>.

Please refer to "API Documentation" to learn more about accessing and using the API.

## Notebook Description
This notebook is designed to extract adverse events and their associated MedDRA codes from MEDLINE, PMC, FDA Drug Labels and/or Clinical Trials. Users can specify records of interest by supplying document identifiers in the 'docIds' parameter of the request.

### Authorization
The instructions for getting your credentials and the API endpoint URL can be found under the sections "Get Started" and "How to use the API" at this link: <LINK PLACEHOLDER>.

In [12]:
import getpass

# Base URL for EU-based customers (used in this demo scenario)
api_base_url = 'https://vt.eu-apim-devtest.solutions.iqvia.com/eu/fetch-meddra/api/v1/meddra/'

mkp_user = input("Marketplace clientId: ")
mkp_password = getpass.getpass("Marketplace clientSecret: ")
mkp_headers = {'clientId': mkp_user, 'clientSecret': mkp_password}

print("Thanks for inputting your user name and password!")

Marketplace clientId: e936390e2dc340fcbeebf29e8f8b14e4
Marketplace clientSecret: ········
Thanks for inputting your user name and password!
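
Prompting works well in an interactive notebook but not in scheduled or scripted runs. The following is a minimal sketch of an alternative, assuming the credentials are stored in environment variables. The variable names `MKP_CLIENT_ID` and `MKP_CLIENT_SECRET` and the helper `load_marketplace_headers` are hypothetical and not part of the API; only the `clientId` and `clientSecret` header names come from the cell above.

```python
import os


def load_marketplace_headers(env=None):
    """Build the clientId/clientSecret headers from environment variables.

    MKP_CLIENT_ID and MKP_CLIENT_SECRET are hypothetical names chosen for
    this sketch; substitute whatever names your environment defines.
    """
    env = os.environ if env is None else env
    required = ("MKP_CLIENT_ID", "MKP_CLIENT_SECRET")
    # Fail early with a clear message instead of sending empty credentials.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {"clientId": env["MKP_CLIENT_ID"], "clientSecret": env["MKP_CLIENT_SECRET"]}
```

Calling `load_marketplace_headers()` with no argument reads from the process environment; passing a plain dictionary is convenient for testing.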


### Example: Make a request for a list of known documents
This example takes document IDs as the input parameter and shows how to make a request to the API on MEDLINE with a list of PMIDs.

Choose the data source of interest. The choices are MEDLINE, PMC, FDA Drug Labels, and Clinical Trials. In this case, we choose MEDLINE as an example by typing "a" at the prompt.

In [13]:
# Define source data
source_dict = {"a": "medline", "b": "pmc", "c": "fdadruglabels", "d": "clinicaltrials"}
source_selected = input("Please choose the alphabetic id for the source data of your interest:\n"
      "a: MEDLINE\nb: PMC\nc: FDA Drug Labels\nd: Clinical Trials\nSource ID:")

api_endpoint_url = api_base_url + source_dict[source_selected]
print(f"URL for the endpoint you choose is {api_endpoint_url}.")

Please choose the alphabetic id for the source data of your interest:
a: MEDLINE
b: PMC
c: FDA Drug Labels
d: Clinical Trials
Source ID:a
URL for the endpoint you choose is https://vt.eu-apim-devtest.solutions.iqvia.com/eu/fetch-meddra/api/v1/meddra/medline.
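
The selection cell above raises a bare `KeyError` if anything other than "a" to "d" is typed. A minimal sketch of a more forgiving lookup is shown here; the helper name `build_endpoint_url` is hypothetical, while the base URL and source mapping are copied from the cells above.

```python
API_BASE_URL = "https://vt.eu-apim-devtest.solutions.iqvia.com/eu/fetch-meddra/api/v1/meddra/"
SOURCE_DICT = {"a": "medline", "b": "pmc", "c": "fdadruglabels", "d": "clinicaltrials"}


def build_endpoint_url(source_id, base_url=API_BASE_URL):
    """Map a one-letter source ID to its endpoint URL (hypothetical helper)."""
    # Tolerate stray whitespace and upper-case input such as " A ".
    key = source_id.strip().lower()
    if key not in SOURCE_DICT:
        valid = ", ".join(sorted(SOURCE_DICT))
        raise ValueError(f"Unknown source ID {source_id!r}; expected one of: {valid}")
    return base_url + SOURCE_DICT[key]


print(build_endpoint_url("a"))
```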


Make a GET request to the URL to fetch MedDRA-coded adverse events from the documents corresponding to the user-specified PMIDs.

Please note that the expected docIds depend on the endpoint, and that multiple IDs can be specified per request:

- MEDLINE: PMIDs.
- PMC: PubMed Central ID numbers, which are 'PMC' followed by 6-9 digits.
- FDA Drug Labels: DailyMed Set IDs.
- Clinical Trials: ClinicalTrials.gov registry numbers, which are 'NCT' followed by 8 digits.
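
The ID formats described above can be checked locally before a request is sent. This is a rough sketch under stated assumptions: the PMC and NCT patterns follow the description above, PMIDs are assumed to be plain digit strings, and DailyMed Set IDs are assumed to be UUID-style strings. The helper `invalid_doc_ids` is hypothetical, and the API itself may accept or reject IDs differently.

```python
import re

# One pattern per endpoint name used in source_dict.
DOC_ID_PATTERNS = {
    "medline": re.compile(r"\d+"),               # assumption: PMIDs are digits only
    "pmc": re.compile(r"PMC\d{6,9}"),            # 'PMC' followed by 6-9 digits
    "clinicaltrials": re.compile(r"NCT\d{8}"),   # 'NCT' followed by 8 digits
    # Assumption: DailyMed Set IDs look like UUIDs (8-4-4-4-12 hex digits).
    "fdadruglabels": re.compile(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
    ),
}


def invalid_doc_ids(source, doc_ids):
    """Return the IDs that do not match the expected format for `source`."""
    pattern = DOC_ID_PATTERNS[source]
    return [doc_id for doc_id in doc_ids if not pattern.fullmatch(doc_id)]


print(invalid_doc_ids("medline", ["33302693", "33302276", "33302636"]))
```

An empty list means every ID matched the expected shape for that endpoint.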

In [15]:
import requests

# Define input IDs
docids = ['33302693', '33302276', '33302636']

# Make a request
print("Posting request to extract adverse events and associated MedDRA codes from specified documents...")


print(f"Processing {docids}...")
response = requests.get(api_endpoint_url, headers=mkp_headers, params={'docIds': docids, "rowLimit": 20})
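# Check the response; on failure, report the status code and the response body
if response.status_code == 200:
    print("Success!")
    results_json = response.json()
else:
    raise RuntimeError(f"Request failed with status {response.status_code}: {response.text}")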

print(results_json)

Posting request to extract adverse events and associated MedDRA codes from specified documents...
Processing ['33302693', '33302276', '33302636']...
Success!
[{'doc_id': '33302693', 'results': [{'Certainty': {'logical_column_id': 0, 'value': ''}, '[PT] MedDRA PT': {'logical_column_id': 1, 'value': 'Cardiac failure congestive', 'indexed_spans_outer': [[1892, 1916]], 'indexed_spans_inner': [[1892, 1916]], 'text_spans_outer': [[184, 208]], 'text_spans_inner': [[184, 208]]}, '[NID] MedDRA PT': {'logical_column_id': 2, 'value': '10007559', 'indexed_spans_outer': [[1892, 1916]], 'indexed_spans_inner': [[1892, 1916]], 'text_spans_outer': [[184, 208]], 'text_spans_inner': [[184, 208]]}, 'Flag': {'logical_column_id': 3, 'value': ''}, 'Section': {'logical_column_id': 4, 'value': 'Standard Abstract', 'indexed_spans_outer': [[1403, 2199]], 'indexed_spans_inner': [[1403, 2199]], 'text_spans_outer': [[0, 150]], 'text_spans_inner': [[0, 150]]}, 'text': {'value': 'The Impella device is used routinely 
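
Because `docIds` is passed to `requests.get` as a list, `requests` repeats the key once per ID in the query string. The sketch below reproduces that encoding with the standard library's `urllib.parse.urlencode(..., doseq=True)`, so the resulting URL can be inspected without sending a request. `build_query_url` is a hypothetical helper; the endpoint URL and parameter names are taken from the cells above.

```python
from urllib.parse import urlencode


def build_query_url(endpoint_url, doc_ids, row_limit=20):
    """Return the URL a GET request with these parameters would target."""
    # doseq=True expands the list into repeated docIds=... pairs.
    query = urlencode({"docIds": doc_ids, "rowLimit": row_limit}, doseq=True)
    return f"{endpoint_url}?{query}"


url = build_query_url(
    "https://vt.eu-apim-devtest.solutions.iqvia.com/eu/fetch-meddra/api/v1/meddra/medline",
    ["33302693", "33302276", "33302636"],
)
print(url)
```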

Now that we have the JSON response from the Adverse Events Coding API, we can convert the values associated with its keys into a pandas DataFrame.

In [16]:
import pandas as pd

# initiate an empty dataframe
df = pd.DataFrame()
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

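# Retrieve the main results from the JSON response.
# Note: this cell fails if the request in the previous step failed (results_json is undefined).
# Each document yields one row; when a document has several results,
# later values overwrite earlier ones for the same column.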
for results in results_json:
    df_dict = {}
    for result_dict in results["results"]:
        for key, value_dict in result_dict.items():
            df_dict[key] = value_dict['value']
    df_dict["Doc Id"] = results["doc_id"]
    df = pd.concat([df, pd.DataFrame.from_records([{**df_dict}])], ignore_index=True)

# Check the dataframe
df

| | Certainty | [PT] MedDRA PT | [NID] MedDRA PT | Flag | Section | text | Doc Id |
|---|---|---|---|---|---|---|---|
| 0 | | Myocardial infarction | 10028596 | Indication | Standard Abstract | The Impella device is used routinely during co... | 33302693 |
| 1 | | Oxidative stress | 10080562 | | Standard Abstract | BACKGROUND Oxidative stress is one of the poss... | 33302636 |
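
The conversion cell above writes every result for a document into the same dictionary, so each document ends up as a single row. If one row per extracted adverse event is preferred, a minimal sketch follows. It assumes the response shape visible in the printed JSON (a list of objects with `doc_id` and `results`, where each result maps column names to dictionaries containing a `value` key). The helper `flatten_results` is hypothetical, and the sample payload uses made-up placeholder values, not real API output.

```python
def flatten_results(results_json):
    """Return one flat dictionary per result, tagged with its document ID."""
    rows = []
    for doc in results_json:
        for result in doc.get("results", []):
            # Keep only the 'value' of each column; span offsets are dropped.
            row = {column: cell.get("value") for column, cell in result.items()}
            row["Doc Id"] = doc["doc_id"]
            rows.append(row)
    return rows


# Made-up placeholder payload that mirrors the assumed response shape.
sample = [
    {"doc_id": "doc-1", "results": [
        {"[PT] MedDRA PT": {"value": "Event A"}, "[NID] MedDRA PT": {"value": "000001"}},
        {"[PT] MedDRA PT": {"value": "Event B"}, "[NID] MedDRA PT": {"value": "000002"}},
    ]},
    {"doc_id": "doc-2", "results": [
        {"[PT] MedDRA PT": {"value": "Event C"}, "[NID] MedDRA PT": {"value": "000003"}},
    ]},
]

rows = flatten_results(sample)
print(len(rows))
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame(rows)`, which also avoids calling `pd.concat` once per document.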


That's it! Hope you find this tutorial useful! Bye!