# IQVIA NLP - MedDRA

## API Description
Safety-relevant reports (social media posts, call center notes, literature) are largely unstructured, which makes adverse events hard to detect and capture.

For data scientists, the MedDRA NLP API addresses this problem by automatically identifying adverse events in a range of unstructured documents and coding them, quickly and accurately, to standardized MedDRA concepts.

## Accessing the API
To use this API, first request access to the MedDRA NLP API via this link:
https://api-marketplace.work.iqvia.com/s/communityapi/a085w00000zhFVAAA2/api-marketplaceiqvianlpmeddrapreview

Please refer to "API Documentation" to learn more about accessing and using the API.

## Notebook Description
This notebook shows an example of using the MedDRA NLP API to identify adverse events in medical documents and code them to standardized MedDRA concepts.

### Authorization
Instructions for getting your credentials and the API endpoint URL can be found in the "Get Started" and "How to use the API" sections at this link: https://api-marketplace.work.iqvia.com/s/communityapi/a085w00000zhFVAAA2/api-marketplaceiqvianlpmeddrapreview

In [1]:
import getpass
import requests

# Endpoint URL for US-based customers (uncomment to use)
# api_marketplace_url = 'https://vt.us-rds.solutions.iqvia.com/meddra/api/v1/meddra/'
# Endpoint URL for EU-based customers (used in this demo)
api_marketplace_url = 'https://vt.eu-apim.solutions.iqvia.com/eu/meddra/api/v1/meddra/'

mkp_user = input("Marketplace clientId: ")
mkp_password = getpass.getpass("Marketplace clientSecret: ")
mkp_headers = {'clientId': mkp_user, 'clientSecret': mkp_password}

# Check credentials by making a dummy request
print("Checking your credentials, please wait...")
response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': "test"})

if response.status_code == 200:
    print("Congratulations! Your credentials are accepted!")
else:
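    # Include the status code and body so the cause of the failure is visible
    raise Exception(f"Error: {response.status_code} {response.text}")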

Marketplace clientId: 96244db5b4fd4641a366068cd4e363b6
Marketplace clientSecret: ········
Checking your credentials, please wait...
Congratulations! Your credentials are accepted!
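
For non-interactive runs, the prompts above could be replaced by reading the credentials from environment variables. The sketch below assumes two hypothetical variable names, `MEDDRA_CLIENT_ID` and `MEDDRA_CLIENT_SECRET`; only the `clientId` and `clientSecret` header names come from this notebook.

```python
import os

def build_marketplace_headers(env=os.environ):
    """Build the clientId/clientSecret headers from environment variables.

    MEDDRA_CLIENT_ID and MEDDRA_CLIENT_SECRET are hypothetical names chosen
    for this sketch; they are not defined by the API Marketplace.
    """
    try:
        return {
            "clientId": env["MEDDRA_CLIENT_ID"],
            "clientSecret": env["MEDDRA_CLIENT_SECRET"],
        }
    except KeyError as missing:
        # Fail early with a readable message instead of sending empty headers
        raise RuntimeError(f"Environment variable {missing} is not set") from None
```

With both variables set, `mkp_headers = build_marketplace_headers()` would stand in for the `input` and `getpass` prompts.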


### Example one: Make a request with text string as input
The MedDRA NLP API expects a string as its request data type. This example shows how to send a text string to the API.

In [2]:
import requests

# Define input text
input_text = "Local anesthetic medication was infiltrated around and into the area of interest. There was an obvious skin lesion there and this gentleman has a history of squamous cell carcinoma. A punch biopsy of the worrisome skin lesion was obtained with a portion of the normal tissue included. The predominant portion of the biopsy was of the lesion itself."

# Make a request
print("Posting text strings...")
response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': input_text})

# Check the response
if response.status_code == 200:
    print("Success!")
    body_json = response.json()
    print(f"Raw JSON response from the API is: {body_json}")
else:
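    # Include the status code and body so the cause of the failure is visible
    raise Exception(f"Error: {response.status_code} {response.text}")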


Posting text strings...
Success!
Raw JSON response from the API is: {'doc_id': 'file', 'results': [{'Certainty': {'logical_column_id': 0, 'value': ''}, '[PT] MedDRA PT': {'logical_column_id': 1, 'value': 'Skin lesion', 'original_spans_outer': [[103, 114]], 'original_spans_inner': [[103, 114]], 'indexed_spans_outer': [[4413, 4824]], 'indexed_spans_inner': [[4413, 4824]], 'text_spans_outer': [[21, 32]], 'text_spans_inner': [[21, 32]]}, '[NID] MedDRA PT': {'logical_column_id': 2, 'value': '10040882', 'original_spans_outer': [[103, 114]], 'original_spans_inner': [[103, 114]], 'indexed_spans_outer': [[4413, 4824]], 'indexed_spans_inner': [[4413, 4824]], 'text_spans_outer': [[21, 32]], 'text_spans_inner': [[21, 32]]}, '[PT] MedDRA LLT': {'logical_column_id': 3, 'value': 'Skin lesion', 'original_spans_outer': [[103, 114]], 'original_spans_inner': [[103, 114]], 'indexed_spans_outer': [[4413, 4824]], 'indexed_spans_inner': [[4413, 4824]], 'text_spans_outer': [[21, 32]], 'text_spans_inner': [[21

Now that we have the JSON response from the MedDRA NLP API, we can convert the values associated with each key into a pandas dataframe.

In [3]:
import pandas as pd

# Show all rows and columns when displaying the dataframe
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Collect the 'value' of every column in each result.
# Note: this cell fails if the request in the previous step failed, because body_json is then undefined.
records = []
for result_dict in body_json["results"]:
    records.append({key: value_dict["value"] for key, value_dict in result_dict.items()})

# Build the dataframe once instead of concatenating row by row
df = pd.DataFrame.from_records(records)

# Check the dataframe
df

| | Certainty | [PT] MedDRA PT | [NID] MedDRA PT | [PT] MedDRA LLT | [NID] MedDRA LLT | Flag | text |
|---|---|---|---|---|---|---|---|
| 0 | | Skin lesion | 10040882 | Skin lesion | 10040882 | | There was an obvious skin lesion there and thi... |
| 1 | | Skin lesion | 10040882 | Skin lesion | 10040882 | | ... punch biopsy of the worrisome skin lesion ... |
| 2 | | Squamous cell carcinoma | 10041823 | Squamous cell carcinoma | 10041823 | | ... gentleman has a history of squamous cell c... |
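
The span fields in the response can be used to locate each match in the text that was sent. Judging from the sample output above, `original_spans_inner` appears to hold `[start, end)` character offsets into the input string. That reading is inferred from this one response rather than taken from the API documentation, so treat the sketch below as an assumption. It uses a trimmed, hand-copied excerpt of the response instead of a live call, and `matched_mentions` is a hypothetical helper name.

```python
# Beginning of the input text from Example one (the rest is omitted here)
input_text = (
    "Local anesthetic medication was infiltrated around and into the area "
    "of interest. There was an obvious skin lesion there and this gentleman "
    "has a history of squamous cell carcinoma."
)

# Trimmed excerpt of one result from the response shown above
result = {
    "[PT] MedDRA PT": {
        "value": "Skin lesion",
        "original_spans_inner": [[103, 114]],
    },
}

def matched_mentions(text, column):
    """Slice the text with each [start, end) pair in original_spans_inner."""
    return [text[start:end] for start, end in column.get("original_spans_inner", [])]

mentions = matched_mentions(input_text, result["[PT] MedDRA PT"])
print(mentions)  # ['skin lesion']
```

Checking the sliced mention against the coded `value` is a cheap way to spot-check a coding decision during review.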


### Example two: Make a request with a zip file as input

In [4]:
import os
import shutil
import zipfile

# Define input zip location
input_zip = os.path.join(os.path.dirname(os.getcwd()), "demo_docs/MedDRA_Codes/MedDRA_Codes_demo.zip")

# Define a directory to extract the input zip file into
input_folder = os.path.join(os.path.dirname(os.getcwd()), "demo_docs/MedDRA_Codes/MedDRA_Codes_demo")
if os.path.isdir(input_folder):
    shutil.rmtree(input_folder)
os.mkdir(input_folder)

# Extract files from the input zip into the folder
with zipfile.ZipFile(input_zip, "r") as zip_ref:
    zip_ref.extractall(input_folder)
print(f"Documents extracted to: {input_folder}")

# Make a request with all extracted files
print("Posting text files from the zip file...")
responses = []
for filename in os.listdir(input_folder):
    file_path = os.path.join(input_folder, filename)
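    # Open in binary mode, as the requests documentation recommends for file uploads
    with open(file_path, "rb") as file:
        print(f"Posting {filename}...")
        response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': file})
        if response.status_code == 200:
            print("Success! Adding response to the full results!")
            responses.append(response.json())
        else:
            # Raise instead of calling exit(), which would shut down the notebook kernel
            raise Exception(f"Error posting {filename}: {response.status_code} {response.text}")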
print("All done!")
print(f"JSON responses are: {responses}")

Documents extracted to: C:\Users\Hui.Feng\Documents\Git\api-marketplace-demo\demo_docs/MedDRA_Codes/MedDRA_Codes_demo
Posting text files from the zip file...
Posting M1.txt...
Success! Adding response to the full results!
Posting M2.txt...
Success! Adding response to the full results!
All done!
JSON responses are: [{'doc_id': 'M1.txt', 'results': [{'Certainty': {'logical_column_id': 0, 'value': ''}, '[PT] MedDRA PT': {'logical_column_id': 1, 'value': 'Hypothyroidism', 'original_spans_outer': [[1170, 1184]], 'original_spans_inner': [[1170, 1184]], 'indexed_spans_outer': [[48025, 48572]], 'indexed_spans_inner': [[48025, 48572]], 'text_spans_outer': [[56, 70]], 'text_spans_inner': [[56, 70]]}, '[NID] MedDRA PT': {'logical_column_id': 2, 'value': '10021114', 'original_spans_outer': [[1170, 1184]], 'original_spans_inner': [[1170, 1184]], 'indexed_spans_outer': [[48025, 48572]], 'indexed_spans_inner': [[48025, 48572]], 'text_spans_outer': [[56, 70]], 'text_spans_inner': [[56, 70]]}, '[PT] Me
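
Extracting the archive to disk is not strictly required. As an alternative sketch, the documents could be read straight from the zip file and posted from memory. `iter_zip_documents` is a hypothetical helper name; the commented-out request reuses `api_marketplace_url`, `mkp_headers`, and `input_zip` from the cells above and relies on the `(filename, content)` tuple form that `requests` accepts in `files`.

```python
import zipfile

def iter_zip_documents(zip_path):
    """Yield (filename, raw bytes) for every file in the archive, skipping folders."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        for info in zip_ref.infolist():
            if info.is_dir():
                continue
            yield info.filename, zip_ref.read(info)

# Possible usage with the names defined earlier in this notebook:
# for filename, content in iter_zip_documents(input_zip):
#     response = requests.post(api_marketplace_url, headers=mkp_headers,
#                              files={'file': (filename, content)})
```

This avoids creating and cleaning up the extraction folder, at the cost of holding each document in memory while it is posted.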

As in Example one, you can convert the JSON output into a pandas dataframe.

In [5]:
import pandas as pd

# Show all rows and columns when displaying the dataframe
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Collect the 'value' of every column in each result and tag it with its source document.
# Note: this cell fails if the requests in the previous step failed.
records = []
for body_json in responses:
    for result_dict in body_json["results"]:
        record = {key: value_dict["value"] for key, value_dict in result_dict.items()}
        record["Doc ID"] = body_json["doc_id"]
        records.append(record)

# Build the dataframe once instead of concatenating row by row
df = pd.DataFrame.from_records(records)

# Check the dataframe
df

| | Certainty | [PT] MedDRA PT | [NID] MedDRA PT | [PT] MedDRA LLT | [NID] MedDRA LLT | Flag | text | Doc ID |
|---|---|---|---|---|---|---|---|---|
| 0 | | Hypothyroidism | 10021114 | Hypothyroidism | 10021114.0 | Indication | ... is a consultation for the patient in regar... | M1.txt |
| 1 | | Hypothyroidism | 10021114 | Hypothyroidism | 10021114.0 | Indication | ... the patient in regards to her hypothyroidi... | M1.txt |
| 2 | | Skin lesion | 10040882 | Skin lesion | 10040882.0 | | ... 2017 PREOPERATIVE DIAGNOSIS : Worrisome sk... | M1.txt |
| 3 | | Skin lesion | 10040882 | Skin lesion | 10040882.0 | | POSTPROCEDURE DIAGNOSIS : Worrisome skin lesio... | M1.txt |
| 4 | | Skin lesion | 10040882 | Skin lesion | 10040882.0 | | There was an obvious skin lesion there and thi... | M1.txt |
| 5 | | Skin lesion | 10040882 | Skin lesion | 10040882.0 | | ... punch biopsy of the worrisome skin lesion ... | M1.txt |
| 6 | | Squamous cell carcinoma | 10041823 | Squamous cell carcinoma | 10041823.0 | | ... gentleman has a history of squamous cell c... | M1.txt |
| 7 | Family | Thyroiditis | 10043778 | Thyroiditis | 10043778.0 | | ... Graves disease , as well a sister with Has... | M2.txt |
| 8 | Hypothetical | Weight decreased | 10047895 | | | | If she wanted to lose significant weight , I s... | M2.txt |
| 9 | Negated | Abdominal pain | 10000081 | Abdominal pain | 10000081.0 | | ... some loosening stools , but denies abdomin... | M2.txt |
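
The `Certainty` column distinguishes how a mention was expressed: in the sample output it takes the values `Family`, `Hypothetical`, and `Negated`, and is empty otherwise. Assuming an empty value means the event is asserted for the patient (the notebook does not state this explicitly), a downstream filter might look like the sketch below. It works on plain record dictionaries copied from the table above, and `asserted_events` is a hypothetical helper name.

```python
# A few rows copied from the table above, as plain dictionaries
records = [
    {"Certainty": "", "[PT] MedDRA PT": "Hypothyroidism", "Doc ID": "M1.txt"},
    {"Certainty": "Family", "[PT] MedDRA PT": "Thyroiditis", "Doc ID": "M2.txt"},
    {"Certainty": "Hypothetical", "[PT] MedDRA PT": "Weight decreased", "Doc ID": "M2.txt"},
    {"Certainty": "Negated", "[PT] MedDRA PT": "Abdominal pain", "Doc ID": "M2.txt"},
]

def asserted_events(rows):
    """Keep rows whose Certainty is empty (assumed here to mean 'asserted')."""
    return [row for row in rows if not row.get("Certainty")]

for row in asserted_events(records):
    print(row["Doc ID"], row["[PT] MedDRA PT"])
```

On the dataframe itself, the equivalent selection would be `df[df["Certainty"] == ""]`.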


That's it! Hope you find this tutorial useful! Bye!