# FHIR LOINC to RadLex and RadLex to LOINC API JSON data extractions

### Introduction
* Notebook by Adam Lang
* This is a demo of a project I completed at Nuance/Microsoft in 2023.


In this notebook I wrote Python code to extract JSON data from the HL7 FHIR LOINC-RadLex mappings that I had downloaded from the LOINC website. I needed to extract the tokens so that I could map them to EHR user interface strings as well as real-world data strings.

LOINC Terminology Service using HL7® FHIR®
- LOINC provides a FHIR-based terminology service with APIs for programmatic access to LOINC terminology sets, concepts, and concept mappings.
- Link: https://loinc.org/fhir/


We will use the LOINC to RadLex and RadLex to LOINC Concept Mappings and extract the value of each "display" key, along with the associated codes, to obtain the standard LOINC/RadLex terms.

### Step 1 - Install libraries

In [1]:
!pip install requests
!pip install pandas





[notice] A new release of pip available: 22.3.1 -> 23.3.1
[notice] To update, run: python.exe -m pip install --upgrade pip







### Step 2 - Import the required libraries and fetch the JSON data

In [2]:
import requests
import pandas as pd
import json
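
The cells in this notebook read the mappings from a local JSON file, and the download step itself is not shown. Below is a minimal sketch of how such a file could be fetched with `requests`. The helper names `fetch_fhir_json` and `save_json`, the example URL, and the use of HTTP basic authentication with LOINC account credentials are assumptions for illustration rather than part of the original project.

```python
import json


def fetch_fhir_json(url, username, password, get=None, timeout=30):
    """Fetch a FHIR resource as JSON (hypothetical helper, not from the original notebook).

    `get` defaults to requests.get; it is injectable so the function can be
    exercised without network access.
    """
    if get is None:
        import requests  # imported lazily so the helper can be tested offline
        get = requests.get
    response = get(
        url,
        auth=(username, password),  # assumption: basic auth with LOINC credentials
        headers={"Accept": "application/fhir+json"},
        timeout=timeout,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def save_json(data, path):
    """Write the parsed JSON to disk so later cells can load it with json.load."""
    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=2)


# Example usage (the URL is an assumed endpoint; check https://loinc.org/fhir/ first):
# bundle = fetch_fhir_json("https://fhir.loinc.org/ConceptMap", "my_user", "my_password")
# save_json(bundle, "FHIR_loincParts_to_radlex.json")
```

Passing the request function in as a parameter keeps the sketch testable offline; in a notebook the default `requests.get` would normally be used.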




# Extraction for LOINC Parts and RadLex ID codes

We will do the following:
- Import libraries.
- Read in the JSON data from a file.
- Check that the JSON data loaded.
- Loop through the nested JSON structures and extract the sub-elements we need for our dataset into a list.
- Export the result to a CSV file.

In [4]:
#import libraries
import json
import pandas as pd

In [5]:
# Read JSON data from file
with open("FHIR_loincParts_to_radlex.json", "r") as file:
    data = json.load(file)

In [18]:
# Get first 5 key-value pairs from the dictionary
first_five_items = list(data.items())[:5]

# Convert the data back to a dictionary
first_five_dict = dict(first_five_items)

# Convert the data back to a JSON string with indentation
json_string = json.dumps(first_five_dict, indent=4)

# Print the JSON string
print(json_string)

{
    "resourceType": "Bundle",
    "id": "81ff758a-8a23-467b-a516-0798d8615723",
    "meta": {
        "lastUpdated": "2023-08-22T17:49:43.890+00:00"
    },
    "type": "searchset",
    "total": 6
}
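
The preview above shows only the top-level Bundle keys. The mappings themselves are nested under `entry` → `resource` → `group` → `element` → `target`. A quick way to confirm that nesting before extracting anything is to count the items at each level. The sketch below assumes that layout; `summarize_bundle` is a hypothetical helper, and the small sample Bundle is hand-built from two rows of this notebook's output rather than taken from the real file.

```python
def summarize_bundle(bundle):
    """Count entries, groups, elements, and targets in a FHIR Bundle of ConceptMaps."""
    counts = {"entries": 0, "groups": 0, "elements": 0, "targets": 0}
    for entry in bundle.get("entry", []):
        counts["entries"] += 1
        for group in entry.get("resource", {}).get("group", []):
            counts["groups"] += 1
            for element in group.get("element", []):
                counts["elements"] += 1
                counts["targets"] += len(element.get("target", []))
    return counts


# Hand-built sample for illustration only
sample_bundle = {
    "resourceType": "Bundle",
    "entry": [
        {
            "resource": {
                "resourceType": "ConceptMap",
                "group": [
                    {
                        "element": [
                            {"code": "LP207608-3", "display": "US",
                             "target": [{"code": "RID10326"}]},
                            {"code": "LP207609-1", "display": "Doppler",
                             "target": [{"code": "RID10375"}]},
                        ]
                    }
                ],
            }
        }
    ],
}

print(summarize_bundle(sample_bundle))
# {'entries': 1, 'groups': 1, 'elements': 2, 'targets': 2}
```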


In [19]:
# Create a list to store the extracted rows
display_values = []

# Loop through each Bundle entry (each entry wraps one ConceptMap resource)
for entry in data.get("entry", []):
    # Loop through each mapping group in the ConceptMap
    for group in entry.get("resource", {}).get("group", []):
        for element in group.get("element", []):
            # Source side of the mapping: the LOINC Part code (LP...) and its display string
            code = element.get("code")
            display = element.get("display")
            # Loop through each target in the element
            for target in element.get("target", []):
                # Target side of the mapping: the RadLex ID (RID...)
                target_code = target.get("code")
                # Append the extracted data to the list
                display_values.append({"display": display, "LP code": code, "RID code": target_code})


In [20]:
# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(display_values)


In [21]:
print(df[0:10])

         display     LP code  RID code
0             US  LP207608-3  RID10326
1        Doppler  LP207609-1  RID10375
2             IV  LP200078-6  RID11160
3         Pelvis  LP199998-8   RID2507
4  Penis vessels  LP208051-5  RID49825
5              W  LP200088-5  RID49853
6    Vasodilator  LP432424-2  RID50652
7             NM  LP208891-4  RID10330
8         Tl-201  LP208673-6  RID11753
9          Views  LP221404-9  RID49567


In [22]:
# Group by "display" and join the "LP code" and "RID code" values for each display string
agg_df = df.groupby("display").agg({"LP code": ", ".join, "RID code": ", ".join}).reset_index()

# Display the resulting DataFrame
print(agg_df)


                       display   
0                      1 level  \
1                       1 or 2   
2                    1.5H post   
3       10 degree caudal angle   
4     10 degree cephalic angle   
...                        ...   
1093              true lateral   
1094                    tunnel   
1095                   upright   
1096                   vessels   
1097        {Imaging modality}   

                                                LP code   
0     LP264232-2, LP264232-2, LP264232-2, LP264232-2...  \
1     LP263793-4, LP263793-4, LP263793-4, LP263793-4...   
2     LP248916-1, LP248916-1, LP248916-1, LP248916-1...   
3     LP220544-3, LP220544-3, LP220544-3, LP220544-3...   
4     LP220545-0, LP220545-0, LP220545-0, LP220545-0...   
...                                                 ...   
1093  LP220609-4, LP220609-4, LP220609-4, LP220609-4...   
1094  LP220610-2, LP220610-2, LP220610-2, LP220610-2...   
1095  LP220611-0, LP220611-0, LP220611-0, LP220611-0...   
[output truncated]

In [11]:
# Print the first few rows
print(agg_df[0:10])

                    display   
0                   1 level  \
1                    1 or 2   
2                 1.5H post   
3    10 degree caudal angle   
4  10 degree cephalic angle   
5                  10M post   
6                  15M post   
7                   18F-FDG   
8                   18F-NaF   
9                   1H post   

                                             LP code   
0  LP264232-2, LP264232-2, LP264232-2, LP264232-2...  \
1  LP263793-4, LP263793-4, LP263793-4, LP263793-4...   
2  LP248916-1, LP248916-1, LP248916-1, LP248916-1...   
3  LP220544-3, LP220544-3, LP220544-3, LP220544-3...   
4  LP220545-0, LP220545-0, LP220545-0, LP220545-0...   
5  LP248205-9, LP248205-9, LP248205-9, LP248205-9...   
6  LP248917-9, LP248917-9, LP248917-9, LP248917-9...   
7  LP231799-0, LP231799-0, LP231799-0, LP231799-0...   
8  LP212086-5, LP212086-5, LP212086-5, LP212086-5...   
9  LP248919-5, LP248919-5, LP248919-5, LP248919-5...   

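The joined code strings in the output above repeat the same code many times, because each display string appears in several mapping rows. If one distinct code per display string is preferred, the rows can be de-duplicated while grouping. Below is a minimal sketch in plain Python, assuming rows shaped like the dictionaries collected in `display_values`. `aggregate_unique` is a hypothetical helper, and the sample rows reuse values printed earlier in this notebook.

```python
from collections import defaultdict


def aggregate_unique(rows, key="display", fields=("LP code", "RID code")):
    """Group rows by `key` and join each field's distinct values in first-seen order."""
    grouped = defaultdict(lambda: {field: {} for field in fields})
    for row in rows:
        name = row.get(key)
        if name is None:
            continue  # skip rows that have no display string
        for field in fields:
            value = row.get(field)
            if value is not None:
                grouped[name][field][value] = None  # dict keys keep insertion order
    return [
        {key: name, **{field: ", ".join(values[field]) for field in fields}}
        for name, values in sorted(grouped.items())
    ]


rows = [
    {"display": "US", "LP code": "LP207608-3", "RID code": "RID10326"},
    {"display": "US", "LP code": "LP207608-3", "RID code": "RID10326"},  # duplicate row
    {"display": "Doppler", "LP code": "LP207609-1", "RID code": "RID10375"},
]

for item in aggregate_unique(rows):
    print(item)
```

The resulting list of dictionaries can be passed to `pd.DataFrame` in the same way as `display_values`.
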
### To CSV

In [23]:
# Export the extracted mappings to a CSV file
df.to_csv('FHIR_loincParts_to_radlex_codes_extracted.csv', index=False)
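
Since the goal was to map these terms to user interface and real-world data strings, the exported CSV can be loaded into a lookup keyed by display text. Below is a minimal sketch using only the standard library, assuming the CSV has `display`, `LP code`, and `RID code` columns. `build_display_lookup` and the case-insensitive matching are illustrative choices, not part of the original project.

```python
import csv
import io


def build_display_lookup(csv_file):
    """Map each case-folded display string to the set of (LP code, RID code) pairs."""
    lookup = {}
    for row in csv.DictReader(csv_file):
        key = row["display"].strip().casefold()
        lookup.setdefault(key, set()).add((row["LP code"], row["RID code"]))
    return lookup


# In-memory sample standing in for the exported file
sample_csv = io.StringIO(
    "display,LP code,RID code\n"
    "US,LP207608-3,RID10326\n"
    "Pelvis,LP199998-8,RID2507\n"
)

lookup = build_display_lookup(sample_csv)
print(lookup.get("pelvis"))
# {('LP199998-8', 'RID2507')}

# With the real export, open the file instead of using the sample:
# with open("FHIR_loincParts_to_radlex_codes_extracted.csv", newline="", encoding="utf-8") as f:
#     lookup = build_display_lookup(f)
```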