# IQVIA NLP - Social Determinants of Health (SDoH)

## API Description
An increasing focus on health equity and awareness of the role of social determinants of health has created a growing need for population health analysis and predictive analytics. Social Determinants of Health (SDoH) are often documented in patients' unstructured medical records but are poorly represented in structured data, so systematically surfacing this information is challenging.

This API transforms unstructured medical records into structured, normalized SDoH information, organized into classes and groups.

## Accessing the API
To consume this API, first request access to the SDoH API via this link:
https://api-marketplace.work.iqvia.com/s/communityapi/a085w00000ytJqIAAU/api-marketplaceiqvianlpsocialdeterminantsofhealthpreview

Please refer to "API Documentation" to learn more about accessing and using the API.

## Notebook Description
This notebook shows an example of using the Social Determinants of Health NLP API to extract information such as Healthcare Systems, Education, Environment, Social Context, Food Insecurity, and Economic Stability. Each SDoH has several sub-classifications that are identified via this endpoint.

### Authorization
Instructions for getting your credentials and the API endpoint URL can be found in the sections "Get Started" and "How to use the API" at this link: https://api-marketplace.work.iqvia.com/s/communityapi/a085w00000ytJqIAAU/api-marketplaceiqvianlpsocialdeterminantsofhealthpreview

In [3]:
import getpass
import requests

# URL for US-based customers (demo scenario)
# api_marketplace_url = 'https://vt.us-rds.solutions.iqvia.com/sdoh/api/v1/sdoh/'
# URL for EU-based customers (demo scenario)
api_marketplace_url = 'https://vt.eu-apim.solutions.iqvia.com/eu/sdoh/api/v1/sdoh/'

mkp_user = input("Marketplace clientId: ")
mkp_password = getpass.getpass("Marketplace clientSecret: ")
mkp_headers = {'clientId': mkp_user, 'clientSecret': mkp_password}

# Check credentials by making a dummy request
print("Checking your credentials, please wait...")
response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': "test"})

if response.status_code == 200:
    print("Congratulations! Your credentials are accepted!")
else:
    raise Exception(f"Error: {response}")

Marketplace clientId: ee0c518489ae44ca82599a5632295cf4
Marketplace clientSecret: ········
Checking your credentials, please wait...
Congratulations! Your credentials are accepted!


### Example one: Make a request with text string as input
The SDoH NLP API expects a string as its request data type. This example shows how to make a request to the API with a text string as input.

In [4]:
import requests

# Define input text
input_text = "She said she was living with her husband. She reported a chronic history of mild sadness or depression, which was relatively stable. When asked about her current psychological experience, she said that she was somewhat sad, but not dwelling on things. She denied any history of suicidal ideation or homicidal ideation. She denied alcohol or illicit drug use."

# Make a request
print("Posting text strings...")
response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': input_text})

# Check the response
if response.status_code == 200:
    print("Success!")
    body_json = response.json()
    print(f"Raw JSON response from the API is: {body_json}")
else:
    raise Exception(f"Error: {response}")


Posting text strings...
Success!
Raw JSON response from the API is: {'doc_id': 'file', 'results': [{'topic_class': {'logical_column_id': 0, 'value': 'Depression'}, 'topic': {'logical_column_id': 1, 'value': 'Depression NOS', 'original_spans_outer': [[92, 102]], 'original_spans_inner': [[92, 102]], 'indexed_spans_outer': [[3969, 4332]], 'indexed_spans_inner': [[3969, 4332]], 'text_spans_outer': [[31, 41]], 'text_spans_inner': [[31, 41]]}, 'polarity': {'logical_column_id': 2, 'value': 'TRUE', 'original_spans_outer': [[92, 102]], 'original_spans_inner': [[92, 102]], 'indexed_spans_outer': [[3969, 4332]], 'indexed_spans_inner': [[3969, 4332]], 'text_spans_outer': [[31, 41]], 'text_spans_inner': [[31, 41]]}, 'text': {'value': '... history of mild sadness or depression , which was relatively stable ...'}, 'suggested_codes': {'logical_column_id': 3, 'value': [{'ontology': 'snomed', 'code': '35489007', 'description': 'Depressive disorder (disorder)', 'url': 'https://snomedbrowser.com/Codes/Det
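
The span fields in the response can be used to locate each finding in the submitted text. The sketch below is not part of the API documentation: it assumes that `original_spans_inner` holds `[start, end]` character offsets into the original input with an exclusive end, which is consistent with the sample response above. The helper name `span_texts` is hypothetical, and the `topic` dictionary is trimmed from the output shown.

```python
def span_texts(source_text, spans):
    """Return the substring for each [start, end] pair (end treated as exclusive)."""
    return [source_text[start:end] for start, end in spans]


# First two sentences of the input text used in the request above
input_text = (
    "She said she was living with her husband. "
    "She reported a chronic history of mild sadness or depression, "
    "which was relatively stable."
)

# 'topic' entry trimmed from the sample response above
topic = {
    "logical_column_id": 1,
    "value": "Depression NOS",
    "original_spans_inner": [[92, 102]],
}

matches = span_texts(input_text, topic["original_spans_inner"])
print(f"{topic['value']}: {matches}")
```

If the offsets in your own responses do not line up with the input, the assumption about how the spans are indexed does not hold for that field.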

Now that we have the JSON response from the SDoH NLP API, we can convert the useful information associated with its keys into a pandas dataframe.

In [5]:
import pandas as pd

# Show all rows and columns when displaying the dataframe
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Retrieve the main results from the JSON response.
# Note: this cell fails if the request in the previous step failed.
results = body_json["results"]

# Keep only the 'value' of each field, one row per result,
# and build the dataframe once instead of concatenating inside a loop
rows = [
    {key: value_dict['value'] for key, value_dict in result_dict.items()}
    for result_dict in results
]
df = pd.DataFrame.from_records(rows)

# Check the dataframe
df

| | topic_class | topic | polarity | text | suggested_codes |
|---|---|---|---|---|---|
| 0 | Depression | Depression NOS | True | ... history of mild sadness or depression , wh... | [{'ontology': 'snomed', 'code': '35489007', 'd... |
| 1 | Depression | Signs and symptoms of depression | False | She denied any history of suicidal ideation or... | [] |
| 2 | Depression | Signs and symptoms of depression | True | ... a chronic history of mild sadness or depre... | [{'ontology': 'snomed', 'code': '394924000', '... |
| 3 | Living Condition | Live alone | False | She said she was living with her husband . | [{'ontology': 'snomed', 'code': '365481000', '... |
| 4 | Substance Abuse | Alcohol abuse | False | She denied alcohol or illicit drug use . | [{'ontology': 'snomed', 'code': '228273003', '... |
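
The `suggested_codes` column holds a list of code dictionaries per finding, which is awkward to filter on. As a rough sketch, the list can be expanded into one row per suggested code. The helper name `explode_codes` is hypothetical, and the sample rows below are trimmed from the response shown earlier; only the `ontology`, `code`, and `description` keys visible in that output are used.

```python
def explode_codes(rows):
    """Return one flat record per suggested code; findings without codes are skipped."""
    exploded = []
    for row in rows:
        for code in row.get("suggested_codes") or []:
            exploded.append({
                "topic_class": row["topic_class"],
                "topic": row["topic"],
                "ontology": code.get("ontology"),
                "code": code.get("code"),
                "description": code.get("description"),
            })
    return exploded


# Sample rows trimmed from the response above
sample_rows = [
    {
        "topic_class": "Depression",
        "topic": "Depression NOS",
        "suggested_codes": [
            {
                "ontology": "snomed",
                "code": "35489007",
                "description": "Depressive disorder (disorder)",
            }
        ],
    },
    {
        "topic_class": "Depression",
        "topic": "Signs and symptoms of depression",
        "suggested_codes": [],
    },
]

code_rows = explode_codes(sample_rows)
print(code_rows)
```

With the dataframe from the previous cell, `explode_codes(df.to_dict("records"))` would produce the same kind of records, which can then be passed to `pd.DataFrame`.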


### Example two: Make a request with a zip file as input

In [None]:
import os
import shutil
import zipfile

# Define input zip location
input_zip = os.path.join(os.path.dirname(os.getcwd()), "demo_docs/SDoH/SDoH_demo.zip")

# Define a directory to extract the input zip file into
input_folder = os.path.join(os.path.dirname(os.getcwd()), "demo_docs/SDoH/SDoH_demo")
if os.path.isdir(input_folder):
    shutil.rmtree(input_folder)
os.mkdir(input_folder)

# Extract files from the input zip into the folder
with zipfile.ZipFile(input_zip, "r") as zip_ref:
    zip_ref.extractall(input_folder)
print(f"Documents extracted to: {input_folder}")

# Make a request with all extracted files
print("Posting text files from the zip file...")
responses = []
for filename in os.listdir(input_folder):
    file_path = os.path.join(input_folder, filename)
    # Open in binary mode so the file content is uploaded unchanged
    with open(file_path, "rb") as file:
        print(f"Posting {filename}...")
        response = requests.post(api_marketplace_url, headers=mkp_headers, files={'file': file})
        if response.status_code == 200:
            print("Success! Adding response to the full results!")
            responses.append(response.json())
        else:
            # Raise instead of calling exit(), which can shut down the notebook kernel
            raise Exception(f"Error posting {filename}: {response}")
print("All done!")
print(f"JSON responses are: {responses}")

Documents extracted to: C:\Users\Hui.Feng\Documents\Git\api-marketplace-demo\demo_docs/SDoH/SDoH_demo
Posting text files from the zip file...
Posting M1.txt...
Success! Adding response to the full results!
Posting M10.txt...
Success! Adding response to the full results!
Posting M11.txt...
Success! Adding response to the full results!
Posting M12.txt...
Success! Adding response to the full results!
Posting M13.txt...
Success! Adding response to the full results!
Posting M14.txt...
Success! Adding response to the full results!
Posting M15.txt...
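
Extracting the archive to disk is not strictly required. As an alternative sketch, the documents can be read directly from the zip file with the standard `zipfile` module. The helper name `iter_zip_documents` is hypothetical, and the tiny archive below is built in memory purely for illustration; it is not the demo archive.

```python
import io
import zipfile


def iter_zip_documents(zip_source):
    """Yield (filename, raw_bytes) for every regular file in the archive."""
    with zipfile.ZipFile(zip_source, "r") as zip_ref:
        for info in zip_ref.infolist():
            if info.is_dir():
                continue  # skip directory entries
            yield info.filename, zip_ref.read(info)


# Build a small in-memory archive to demonstrate the helper
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("M1.txt", "She lives with her husband.")
    zf.writestr("M2.txt", "He is a non-smoker.")
buffer.seek(0)

documents = dict(iter_zip_documents(buffer))
print(sorted(documents))
```

`zip_source` can also be a path such as `input_zip`. Each pair could then be posted with `files={'file': (filename, raw_bytes)}`, a tuple form that `requests` supports; whether the API treats the supplied filename the same way as an opened file is an assumption to verify.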


As in Example one, you can convert the JSON output into a pandas dataframe.

In [19]:
import pandas as pd

# Show all rows and columns when displaying the dataframe
pd.set_option("display.max_rows", None, "display.max_columns", None, "display.width", 1000)

# Retrieve the main results from each JSON response.
# Note: this cell fails if a request in the previous step failed.
rows = []
for body_json in responses:
    for result_dict in body_json["results"]:
        # Keep only the 'value' of each field and record the source document
        row = {key: value_dict['value'] for key, value_dict in result_dict.items()}
        row["Doc ID"] = body_json["doc_id"]
        rows.append(row)

# Build the dataframe once instead of concatenating inside the loop
df = pd.DataFrame.from_records(rows)

# Check the dataframe
df.head(10)

| | topic_class | topic | polarity | text | suggested_codes |
|---|---|---|---|---|---|
| 0 | Employment Status | Not employed | False | He works as a payroll representative and previ... | [{'ontology': 'snomed', 'code': '224363007', '... |
| 1 | Substance Abuse | Tobacco abuse | False | ... HISTORY : He is a non-cigarette smoker and... | [{'ontology': 'snomed', 'code': '8392000', 'de... |
| 2 | Weight Range Category | Obesity NOS | True | ... GENERAL : Presents as an obese 60-year-old... | [{'ontology': 'snomed', 'code': '414915002', '... |
| 3 | Weight Range Category | Obesity NOS | True | ABDOMEN : Obese . | [{'ontology': 'snomed', 'code': '414915002', '... |
| 4 | Depression | Signs and symptoms of depression | True | ... , reflux in 2000 , insomnia , but no snori... | [{'ontology': 'snomed', 'code': '394924000', '... |
| 5 | Limited English | Limited English NOS | False | ... many many years and speaks fluent English ... | [{'ontology': 'snomed', 'code': '161147007', '... |
| 6 | Living Condition | Live alone | False | SOCIAL HISTORY : She lives with her husband . | [{'ontology': 'snomed', 'code': '365481000', '... |
| 7 | Substance Abuse | Alcohol abuse | False | She is a nonsmoker and no history of drug or a... | [{'ontology': 'snomed', 'code': '228273003', '... |
| 8 | Substance Abuse | Illicit drug abuse | False | She is a nonsmoker and no history of drug or a... | [{'ontology': 'snomed', 'code': '707848009', '... |
| 9 | Substance Abuse | Tobacco abuse | False | She is a nonsmoker and no history of drug ... | [{'ontology': 'snomed', 'code': '8392000', 'de... |
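
For population-level analysis, the extracted rows can be summarized by class. The sketch below counts findings per `topic_class`, keeping only rows whose polarity is true. It assumes that a true polarity marks a finding as present and a false polarity marks it as negated, which is how the sample rows read (for example, "Live alone" is False for a patient living with her husband). The helper names are hypothetical, and the sample rows are copied from the table above.

```python
from collections import Counter


def is_affirmed(polarity):
    """Treat True, 'True' and 'TRUE' alike, since polarity may be a bool or a string."""
    return str(polarity).strip().upper() == "TRUE"


def count_affirmed_by_class(rows):
    """Count rows per topic_class, ignoring findings with a false polarity."""
    return Counter(row["topic_class"] for row in rows if is_affirmed(row["polarity"]))


# (topic_class, polarity) pairs copied from the first ten rows shown above
sample = [
    ("Employment Status", "False"),
    ("Substance Abuse", "False"),
    ("Weight Range Category", "True"),
    ("Weight Range Category", "True"),
    ("Depression", "True"),
    ("Limited English", "False"),
    ("Living Condition", "False"),
    ("Substance Abuse", "False"),
    ("Substance Abuse", "False"),
    ("Substance Abuse", "False"),
]
rows = [{"topic_class": topic_class, "polarity": polarity} for topic_class, polarity in sample]

counts = count_affirmed_by_class(rows)
print(counts.most_common())
```

With the dataframe from the previous cell, `count_affirmed_by_class(df.to_dict("records"))` would apply the same count to all documents.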


That's it! Hope you find this tutorial useful! Bye!