# IQVIA NLP - Adverse Events Coding (MedDRA)

## API Description
The unstructured format of biomedical journals and abstracts in Medline, PMC, CT.gov and FDA drug labels makes adverse events hard to detect.

For data scientists, the MedDRA NLP API solves this problem by offering fast and accurate automated identification and coding of adverse events from a range of publications in Medline, PMC, CT.gov and FDA into standardized MedDRA concepts.

## Accessing the API
To use this API, you first need to request access to the Adverse Events Coding (MedDRA) API
via this link:
<LINK PLACEHOLDER> .

Please refer to "API Documentation" to learn more about accessing and using the API.

## Notebook Description
This notebook is designed to extract adverse events and associated MedDRA codes from MEDLINE, PMC, FDA Drug Labels and/or Clinical Trials. Users can specify records of interest by supplying document identifiers in the 'docIds' parameter when posting the request.

### Authorization
The instructions for getting your credentials and the API endpoint URL can be found under the sections "Get Started" and "How to use the API" at this link: <LINK PLACEHOLDER> .

In [1]:
import getpass

# Get URL and credentials from customers
api_endpoint_url = input('Please enter the API URL including the source data endpoints: ').rstrip('/')

mkp_user = input("Marketplace clientId: ")
mkp_password = getpass.getpass("Marketplace clientSecret: ")
mkp_headers = {'clientId': mkp_user, 'clientSecret': mkp_password}

print("Thanks for inputting URL, your user name and password!")

Please enter the API URL including the source data endpoints: https://vt.eu-apim-devtest.solutions.iqvia.com/eu/fetch-meddra/api/v2/meddra/clinicaltrials
Marketplace clientId: 555fd2145a8a43ba8c204ea9f846078d
Marketplace clientSecret: ········
Thanks for inputting URL, your user name and password!


### Example: Make a request for a list of known documents

Make a POST request to the URL to extract MedDRA-coded adverse events from the documents corresponding to the user-specified docIds, then poll with GET requests until the results are available.

Please note that the expected docIds depend on the endpoint. For the MEDLINE endpoint, the docIds are PMIDs. For the PMC endpoint, they are PubMed Central ID numbers, which are 'PMC' followed by 6-9 digits, and multiple PMCIDs can be specified per request. For FDA Drug Labels, they are DailyMed Set IDs, and multiple Set IDs can be specified per request. For Clinical Trials, they are ClinicalTrials.gov registry numbers, which are 'NCT' followed by 8 digits, and multiple NCTIDs can be specified per request.
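As a small illustration of the ID formats described above, the sketch below checks a list of docIds before they are sent. The source labels (`medline`, `pmc`, `clinicaltrials`) and the helper name `invalid_doc_ids` are hypothetical and not part of the API. The digits-only PMID pattern is an assumption, since the format is not spelled out here. DailyMed Set IDs are left out for the same reason.

```python
import re

# Patterns follow the description above. The PMID pattern (digits only) is an
# assumption; the source labels used as keys are hypothetical.
DOC_ID_PATTERNS = {
    "medline": re.compile(r"\d+"),              # assumed: numeric PMID
    "pmc": re.compile(r"PMC\d{6,9}"),           # 'PMC' followed by 6-9 digits
    "clinicaltrials": re.compile(r"NCT\d{8}"),  # 'NCT' followed by 8 digits
}


def invalid_doc_ids(source, doc_ids):
    """Return the IDs that do not match the expected format for `source`."""
    pattern = DOC_ID_PATTERNS[source]
    return [doc_id for doc_id in doc_ids if not pattern.fullmatch(doc_id)]


print(invalid_doc_ids("clinicaltrials", ["NCT04875351", "NCT0318008", "PMC123456"]))
# → ['NCT0318008', 'PMC123456']
```

A check like this only catches malformed IDs; it cannot tell whether a well-formed ID actually exists in the source.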

In [2]:
import requests
import time

# Define input IDs, here using clinicaltrials as an example
docids = ["NCT04875351", "NCT03180086"]

# Make a request
print("Posting request to extract adverse events and associated MedDRA codes from specified documents...")

print(f"Processing {docids}...")
response = requests.post(url=api_endpoint_url, headers=mkp_headers, json={'params': {'docIds': docids}, 'row_limit': 10})

# Poll the API until results are available
while response.status_code == 202:
    print('Results are not available yet. Waiting 5 seconds before polling again...')
    time.sleep(5)
    # Use the run id from the Post request to get results
    run_identifier = response.json()['id']    
    response = requests.get(url=f'{api_endpoint_url}/{run_identifier}', headers=mkp_headers)
    
# Check the response
if response.status_code == 200:
    print("Success!")
    results_json = response.json()
else:
    raise Exception(f"Unexpected status code: {response.status_code}")

print(f'Results: \n{results_json}\n')

Posting request to extract adverse events and associated MedDRA codes from specified documents...
Processing ['NCT04875351', 'NCT03180086']...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Results are not available yet. Waiting 5 seconds before polling again...
Success!
Results: 
[{'doc_id': 'NCT04875351', 'results': [{'Certainty': {'logical_column_id': 0, 'value': 'Hypothetical', 'indexed_spans_outer': [[4224, 4241]], 'indexed_spans_inner': [[4233, 4241]], 
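The polling loop above keeps waiting for as long as the API answers with status 202. A minimal sketch of the same loop with an upper time limit is shown below. The helper name `poll_until_ready` and its parameters are hypothetical, and the HTTP call is passed in as a callable so the example runs without network access.

```python
import time


def poll_until_ready(fetch, interval=5.0, timeout=300.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `fetch()` until it stops returning status 202 or `timeout` seconds pass.

    `fetch` is any zero-argument callable returning (status_code, payload).
    """
    deadline = clock() + timeout
    status, payload = fetch()
    while status == 202:
        if clock() >= deadline:
            raise TimeoutError(f"Results not ready after {timeout} seconds")
        sleep(interval)
        status, payload = fetch()
    if status != 200:
        raise RuntimeError(f"Unexpected status code: {status}")
    return payload


# Stand-in for the real GET call: two "not ready" replies, then a result.
replies = iter([(202, None), (202, None),
                (200, [{"doc_id": "NCT04875351", "results": []}])])
print(poll_until_ready(lambda: next(replies), interval=0))
# → [{'doc_id': 'NCT04875351', 'results': []}]
```

With the real API, `fetch` could wrap the `requests.get` call from the cell above and return `(response.status_code, response.json())`, assuming every reply carries a JSON body.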

Now that we have the JSON response from the Adverse Events Coding API, we can convert the values associated with each key into a pandas DataFrame.

In [4]:
import pandas as pd

# initiate an empty dataframe
df = pd.DataFrame()
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Retrieve the main results from the JSON response.
# Note: this cell fails if the request in the previous step failed.
for inner_results in results_json:
    for result_dict in inner_results['results']:
        df_dict = {}
        for key, value_dict in result_dict.items():
            df_dict[key] = value_dict['value']
        df_dict['Doc Id'] = inner_results['doc_id']
        df = pd.concat([df, pd.DataFrame.from_records([{**df_dict}])], ignore_index=True)

# Print the DataFrame
df.head(10)

| | Certainty | [PT] MedDRA PT | [NID] MedDRA PT | Flag | Section | text | Doc Id |
|---|---|---|---|---|---|---|---|
| 0 | Hypothetical | Breast cancer | 10006187 | | Detailed Description | The BCI Registry is designed as a large-scale ... | NCT04875351 |
| 1 | Negated | Human epidermal growth factor receptor negative | 10077481 | | Inclusion Criteria | Inclusion Criteria: - Early stage (I, II or II... | NCT04875351 |
| 2 | Negated | Neoplasm | 10028980 | | Inclusion Criteria | Inclusion Criteria: - Early stage (I, II or II... | NCT04875351 |
| 3 | | Adenoid cystic carcinoma | 10053231 | | Exclusion Criteria | Exclusion Criteria: - Patient has distant meta... | NCT04875351 |
| 4 | | Breast cancer | 10006187 | Adverse Event | Detailed Description | The BCI Registry is designed as a large-scale ... | NCT04875351 |
| 5 | | Breast cancer | 10006187 | Indication | Exclusion Criteria | Exclusion Criteria: - Patient has distant meta... | NCT04875351 |
| 6 | | Breast cancer | 10006187 | Indication | Exclusion Criteria | Exclusion Criteria: - Patient has distant meta... | NCT04875351 |
| 7 | | Breast cancer | 10006187 | Indication | Inclusion Criteria | Inclusion Criteria: - Early stage (I, II or II... | NCT04875351 |
| 8 | | Breast cancer | 10006187 | | Brief Summary | The purpose of the Breast Cancer Index (BCI) R... | NCT04875351 |
| 9 | | Breast cancer | 10006187 | | Brief Summary | The purpose of the Breast Cancer Index (BCI) R... | NCT04875351 |
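The conversion cell above calls `pd.concat` once per row, which can become slow for large responses. One possible alternative, sketched below, is to flatten the response into a plain list of dicts first and build the DataFrame in a single call. The helper name `flatten_results` is hypothetical, and the sample payload is a heavily shortened stand-in in the shape of the response shown earlier, not real API output.

```python
def flatten_results(results_json):
    """Turn the nested response into one flat dict per extracted row."""
    rows = []
    for document in results_json:
        for result in document["results"]:
            # Keep only the 'value' entry of each column, as in the cell above.
            row = {column: cell["value"] for column, cell in result.items()}
            row["Doc Id"] = document["doc_id"]
            rows.append(row)
    return rows


# Hypothetical, heavily shortened payload for illustration only.
sample = [
    {
        "doc_id": "NCT04875351",
        "results": [
            {
                "Certainty": {"value": "Hypothetical"},
                "[PT] MedDRA PT": {"value": "Breast cancer"},
            }
        ],
    }
]
print(flatten_results(sample))
# → [{'Certainty': 'Hypothetical', '[PT] MedDRA PT': 'Breast cancer', 'Doc Id': 'NCT04875351'}]
```

The resulting list can then be passed to pandas in one step, for example `pd.DataFrame(flatten_results(results_json))`.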


That's it! Hope you find this tutorial useful! Bye!