# IQVIA NLP - Healthcare Concepts

## API Description
This API applies natural language processing (NLP) to recognize key healthcare concepts relevant to healthcare organizations, such as drugs, diseases, and smoking categorization, while taking context and patterns in the data into account.

## Accessing the API
To consume this API, first request access to the Healthcare Concepts API via this link:
https://api-marketplace.work.iqvia.com/s/communityapi/a085w00000zhFVKAA2/api-marketplaceiqvianlphealthcareconceptspreview

Please refer to "API Documentation" to learn more about accessing and using the API.

## Notebook Description
This notebook shows an example of using the Healthcare Concepts NLP API to extract healthcare concepts, such as populations and medications, from medical records.

### Authorization
Instructions for getting your credentials and the API endpoint URL can be found in the sections "Get Started" and "How to use the API" at this link: https://api-marketplace.work.iqvia.com/s/communityapi/a085w00000zhFVKAA2/api-marketplaceiqvianlphealthcareconceptspreview

In [3]:
import getpass
import requests

# Endpoint URL for US-based customers (not used in this demo)
# api_marketplace_url = 'https://vt.us-rds.solutions.iqvia.com/multi/api/v1/multi/'
# Endpoint URL for EU-based customers (used in this demo)
api_marketplace_url = 'https://vt.eu-apim.solutions.iqvia.com/eu/multi/api/v1/multi/'

mkp_user = input("Marketplace clientId: ")
mkp_password = getpass.getpass("Marketplace clientSecret: ")
mkp_headers = {'clientId': mkp_user, 'clientSecret': mkp_password}

# Check credentials by making a dummy request
print("Checking your credentials, please wait...")
response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': "test"})

if response.status_code == 200:
    print("Congratulations! Your credentials are accepted!")
else:
    raise Exception(f"Error: {response}")

Marketplace clientId: 195e71bb99f0477eb16bb01ae36e314a
Marketplace clientSecret: ········
Checking your credentials, please wait...
Congratulations! Your credentials are accepted!
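
The cell above switches regions by commenting one URL in or out. As a minimal sketch, the two endpoint URLs from that cell could instead be kept in a lookup table. The `ENDPOINTS` dictionary and the `endpoint_for_region` helper are hypothetical names introduced here for illustration; they are not part of the API or any client library.

```python
# Endpoint URLs copied from the authorization cell; the region keys
# "us" and "eu" are labels chosen for this sketch (assumption).
ENDPOINTS = {
    "us": "https://vt.us-rds.solutions.iqvia.com/multi/api/v1/multi/",
    "eu": "https://vt.eu-apim.solutions.iqvia.com/eu/multi/api/v1/multi/",
}


def endpoint_for_region(region: str) -> str:
    """Return the endpoint URL for a region label, ignoring case and spaces."""
    key = region.strip().lower()
    if key not in ENDPOINTS:
        raise ValueError(f"Unknown region {region!r}; expected one of {sorted(ENDPOINTS)}")
    return ENDPOINTS[key]


api_marketplace_url = endpoint_for_region("EU")
```

An unknown label raises a `ValueError` immediately, so a typo in the region is caught before any request is sent.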


### Example one: Make a request with text string as input
The Healthcare Concepts NLP API expects a string as its request data type. This example shows how to make a request to the API with a text string as input.

In [4]:
import requests

# Define input text
input_text = "HISTORY OF PRESENT ILLNESS:  This 60-year-old white male is referred to us by his medical physician with a complaint of recent finding of a both pancreatic lesion and lesions with left adrenal gland.  The patient's history dates back to at the end of the January of this past year when he began experiencing symptoms consistent with difficulty almost like a suffocating feeling whenever he would lie flat on his back.  He noticed whenever he would recline backwards, he would begin this feeling and it is so bad now that he can barely recline, very little before he has this feeling. He does have a history of frequent urination.  Has been followed by urologist for this.  There is no family history of pancreatic cancer.  There is a history of gallstone pancreatitis in the patient's sister. MEDICATIONS:  Include glipizide 5 mg b.i.d., metformin 500 mg b.i.d., Atacand 16 mg daily, metoprolol 25 mg b.i.d., Lipitor 10 mg daily, pantoprazole 40 mg daily, Flomax 0.4 mg daily, Detrol 4 mg daily, Zyrtec 10 mg daily, Advair Diskus 100/50 mcg one puff b.i.d., and fluticasone spray 50 mcg two sprays daily. PAST SURGICAL HISTORY:  He has not had any previous surgery."

# Make a request
print("Posting text strings...")
response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': input_text})

# Check the response
if response.status_code == 200:
    print("Success!")
    body_json = response.json()
    print(f"Raw JSON response from the API is: {body_json}")
else:
    raise Exception(f"Error: {response}")


Posting text strings...
Success!
Raw JSON response from the API is: {'doc_id': 'file', 'results': [{'concept': {'logical_column_id': 0, 'value': '0.4', 'original_spans_outer': [[963, 966]], 'original_spans_inner': [[963, 966]], 'indexed_spans_outer': [[39683, 39766]], 'indexed_spans_inner': [[39683, 39766]], 'text_spans_outer': [[25, 28]], 'text_spans_inner': [[25, 28]]}, 'preferred_term': {'logical_column_id': 0, 'value': '0.4'}, 'ontology': {'logical_column_id': 0, 'value': 'numerics'}, 'node_id': {'logical_column_id': 0, 'value': 'decimals:0.4'}, 'category': {'logical_column_id': 1, 'value': 'MeasurementValue'}, 'certainty': {'logical_column_id': 2, 'value': ''}, 'text': {'value': '... 40 mg daily , Flomax 0.4 mg daily , Detrol 4 ...'}}, {'concept': {'logical_column_id': 0, 'value': '0.4 mg daily', 'original_spans_outer': [[963, 975]], 'original_spans_inner': [[963, 975]], 'indexed_spans_outer': [[39683, 40135]], 'indexed_spans_inner': [[39683, 40135]], 'text_spans_outer': [[25, 37]
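
Each entry in `results` is a dictionary whose keys (`concept`, `preferred_term`, `category`, and so on) each map to a small dictionary with a `value` field; `concept` additionally carries several span lists. The sketch below pulls the `original_spans_outer` pairs out of one result. The `sample_result` literal is the first result from the output above, trimmed to the fields used here, and `concept_spans` is a hypothetical helper name. Treating the spans as end-exclusive character offsets is an assumption: it is consistent with the span lengths in the output but is not confirmed by the source.

```python
# First result from the raw output above, trimmed to the fields used here.
sample_result = {
    "concept": {
        "logical_column_id": 0,
        "value": "0.4",
        "original_spans_outer": [[963, 966]],
        "original_spans_inner": [[963, 966]],
    },
    "category": {"logical_column_id": 1, "value": "MeasurementValue"},
}


def concept_spans(result: dict) -> list:
    """Return the (start, end) pairs from a result's outer original spans."""
    spans = result["concept"].get("original_spans_outer", [])
    return [(start, end) for start, end in spans]


for start, end in concept_spans(sample_result):
    # Assumption: offsets are end-exclusive, so end - start is the span length.
    print(sample_result["concept"]["value"], start, end, end - start)
```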

Now that we have the JSON response from the Healthcare Concepts NLP API, we can convert the values associated with each key into a pandas DataFrame.

In [5]:
import pandas as pd

# Show all rows and columns when displaying the dataframe
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Retrieve the main results from the JSON response.
# Note: this cell fails if the request in the previous step failed.
results = body_json["results"]

# Keep only the 'value' field of each key, one record per result
records = [
    {key: value_dict["value"] for key, value_dict in result_dict.items()}
    for result_dict in results
]
df = pd.DataFrame.from_records(records)

# Check the dataframe
df

| | concept | preferred_term | ontology | node_id | category | certainty | text |
|---|---|---|---|---|---|---|---|
| 0 | 0.4 | 0.4 | numerics | decimals:0.4 | MeasurementValue | | ... 40 mg daily , Flomax 0.4 mg daily , Detrol... |
| 1 | 0.4 mg daily | 0.4 mg every 1d | measurement | dosage-g:0.4 mg every 1d | Dosage | | ... 40 mg daily , Flomax 0.4 mg daily , Detrol... |
| 2 | 4 | 4 | numerics | integers:4 | MeasurementValue | | ... 0.4 mg daily , Detrol 4 mg daily , Zyrtec ... |
| 3 | 4 mg daily | 4 mg every 1d | measurement | dosage-g:4 mg every 1d | Dosage | | ... 0.4 mg daily , Detrol 4 mg daily , Zyrtec ... |
| 4 | 5 | 5 | numerics | integers:5 | MeasurementValue | | MEDICATIONS : Include glipizide 5 mg b.i.d. , ... |
| 5 | 5 mg b.i.d. | 5 mg every 12h | measurement | dosage-g:5 mg every 12h | Dosage | | MEDICATIONS : Include glipizide 5 mg b.i.d. , ... |
| 6 | 10 | 10 | numerics | integers:10 | MeasurementValue | | ... 25 mg b.i.d. , Lipitor 10 mg daily , panto... |
| 7 | 10 | 10 | numerics | integers:10 | MeasurementValue | | ... 4 mg daily , Zyrtec 10 mg daily , Advair D... |
| 8 | 10 mg daily | 10 mg every 1d | measurement | dosage-g:10 mg every 1d | Dosage | | ... 25 mg b.i.d. , Lipitor 10 mg daily , panto... |
| 9 | 10 mg daily | 10 mg every 1d | measurement | dosage-g:10 mg every 1d | Dosage | | ... 4 mg daily , Zyrtec 10 mg daily , Advair D... |
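
The table shows that `preferred_term` normalizes the raw `concept` text; for example, `5 mg b.i.d.` becomes `5 mg every 12h`. As a rough sketch, the same flattened records can be filtered by `category` with plain Python, without pandas. The `rows` list copies three rows from the table above, and `terms_for_category` is a hypothetical helper name.

```python
# Three rows copied from the table above, reduced to the columns used here.
rows = [
    {"concept": "0.4", "preferred_term": "0.4", "category": "MeasurementValue"},
    {"concept": "0.4 mg daily", "preferred_term": "0.4 mg every 1d", "category": "Dosage"},
    {"concept": "5 mg b.i.d.", "preferred_term": "5 mg every 12h", "category": "Dosage"},
]


def terms_for_category(records: list, category: str) -> dict:
    """Map each concept string to its preferred term for one category."""
    return {
        row["concept"]: row["preferred_term"]
        for row in records
        if row["category"] == category
    }


dosages = terms_for_category(rows, "Dosage")
print(dosages)
```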


### Example two: Make a request with a zip file as input

In [6]:
import os
import shutil
import zipfile

# Define input zip location
input_zip = os.path.join(os.getcwd(), "demo_docs/HealthcareConcepts/HealthcareConcepts_demo.zip")

# Define a directory to extract the input zip file into
input_folder = os.path.join(os.getcwd(), "demo_docs/HealthcareConcepts/HealthcareConcepts_demo")
if os.path.isdir(input_folder):
    shutil.rmtree(input_folder)
os.mkdir(input_folder)

# Extract files from the input zip into the folder
with zipfile.ZipFile(input_zip, "r") as zip_ref:
    zip_ref.extractall(input_folder)
print(f"Documents extracted to: {input_folder}")

# Make a request with all extracted files
print("Posting text files from the zip file...")
responses = []
for filename in os.listdir(input_folder):
    file_path = os.path.join(input_folder, filename)
    # Open in binary mode so the file is uploaded byte-for-byte
    with open(file_path, "rb") as file:
        print(f"Posting {filename}...")
        response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': file})
    if response.status_code == 200:
        print("Success! Adding response to the full results!")
        responses.append(response.json())
    else:
        # Raise instead of calling exit(), which can stop the notebook kernel
        raise Exception(f"Error: {response}")
print("All done!")
print(f"JSON responses are: {responses}")

Documents extracted to: C:\Users\Hui.Feng\Documents\Git\api-marketplace-demo\demo_docs/HealthcareConcepts/HealthcareConcepts_demo
Posting text files from the zip file...
Posting M1.txt...
Success! Adding response to the full results!
Posting M2.txt...
Success! Adding response to the full results!
All done!
JSON responses are: [{'doc_id': 'M1.txt', 'results': [{'concept': {'logical_column_id': 0, 'value': '5-0 nylon suture', 'original_spans_outer': [[887, 903]], 'original_spans_inner': [[897, 903]], 'indexed_spans_outer': [[36300, 36916]], 'indexed_spans_inner': [[36710, 36916]], 'text_spans_outer': [[30, 46]], 'text_spans_inner': [[40, 46]]}, 'preferred_term': {'logical_column_id': 0, 'value': 'Sutures'}, 'ontology': {'logical_column_id': 0, 'value': 'nlm'}, 'node_id': {'logical_column_id': 0, 'value': 'D013537'}, 'category': {'logical_column_id': 1, 'value': 'MedicalEquipment'}, 'certainty': {'logical_column_id': 2, 'value': ''}, 'text': {'value': '... the area were closed with 5-0 ny
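
The cell above extracts the archive to disk before posting each file. As an alternative sketch, the standard-library `zipfile` module can read members directly from the archive, which avoids creating and cleaning up the extraction folder. `iter_zip_members` is a hypothetical helper, and the in-memory archive with `M1.txt` and `M2.txt` holds placeholder contents built only for this demonstration. The posting step is left as a comment because it needs live credentials.

```python
import io
import zipfile


def iter_zip_members(zip_source):
    """Yield (filename, content_bytes) for each regular file in a zip archive.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source, "r") as zip_ref:
        for info in zip_ref.infolist():
            if info.is_dir():
                continue  # skip directory entries
            yield info.filename, zip_ref.read(info)


# Build a small in-memory archive with placeholder contents for the demo.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zip_out:
    zip_out.writestr("M1.txt", "placeholder text one")
    zip_out.writestr("M2.txt", "placeholder text two")
buffer.seek(0)

members = list(iter_zip_members(buffer))
for name, content in members:
    # With live credentials, each member could be posted as in the cell above, e.g.
    # requests.post(api_marketplace_url, headers=mkp_headers, files={'file': (name, content)})
    print(name, len(content))
```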

As in Example one, you can convert the JSON output into a pandas DataFrame.

In [None]:
import pandas as pd

# Show all rows and columns when displaying the dataframe
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Retrieve the main results from each JSON response.
# Note: this cell fails if the requests in the previous step failed.
records = []
for body_json in responses:
    for result_dict in body_json["results"]:
        # Keep only the 'value' field of each key
        record = {key: value_dict["value"] for key, value_dict in result_dict.items()}
        # Record which document the result came from
        record["Doc ID"] = body_json["doc_id"]
        records.append(record)
df = pd.DataFrame.from_records(records)

# Check the dataframe
df.head(10)
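
Once every result carries its document ID, per-document summaries are straightforward. The sketch below counts results per category for each document using only the standard library. The `sample_responses` list is hypothetical placeholder data shaped like the API output shown earlier, with only the `doc_id` and `category` fields filled in, and `category_counts` is a helper name introduced for illustration.

```python
from collections import Counter

# Hypothetical placeholder responses shaped like the API output;
# only the fields used below are included.
sample_responses = [
    {"doc_id": "M1.txt", "results": [
        {"category": {"value": "MedicalEquipment"}},
        {"category": {"value": "Dosage"}},
        {"category": {"value": "Dosage"}},
    ]},
    {"doc_id": "M2.txt", "results": [
        {"category": {"value": "MeasurementValue"}},
    ]},
]


def category_counts(responses: list) -> dict:
    """Return {doc_id: Counter of category values} for a list of responses."""
    counts = {}
    for body_json in responses:
        values = (result["category"]["value"] for result in body_json["results"])
        counts[body_json["doc_id"]] = Counter(values)
    return counts


summary = category_counts(sample_responses)
for doc_id, counter in summary.items():
    print(doc_id, dict(counter))
```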

That's it! Hope you find this tutorial useful! Bye!