# Finding FHIR CodeSystem Versions

This notebook searches the local FHIR package cache (`~/.fhir/packages/<package>#<version>/package/`) for a `CodeSystem` resource that matches a given canonical URL and version, then compares the concepts of two versions. Paths are handled with `pathlib`.

## Prerequisites
- Python 3.x with `pandas`, run in Jupyter (the last cell uses `display`)
- FHIR packages unzipped in `~/.fhir/packages/` (e.g., `hl7.terminology.r5#6.2.0/package/CodeSystem-v2-0004.json`)
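To check what is actually present in the cache before running the search, the `<package>#<version>` folder names can be split on `#`. The following is a minimal sketch; the helper name `list_cached_packages` is hypothetical and not part of this notebook.

```python
from pathlib import Path
from typing import List, Tuple


def list_cached_packages(root_folder: str = "~/.fhir/packages") -> List[Tuple[str, str]]:
    """Return (package name, package version) pairs found under root_folder.

    Hypothetical helper: assumes each package folder is named '<package>#<version>'.
    """
    root = Path(root_folder).expanduser()
    if not root.is_dir():
        return []
    packages = []
    for entry in sorted(root.iterdir()):
        # Skip plain files and folders that do not follow the naming scheme
        if entry.is_dir() and "#" in entry.name:
            name, version = entry.name.split("#", 1)
            packages.append((name, version))
    return packages
```

Calling `list_cached_packages()` with no arguments inspects the default cache location used throughout this notebook.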

In [36]:
# Import required libraries
from pathlib import Path
import json
from typing import Optional
import pandas as pd

In [37]:

# Define the function
def find_codesystem_version(root_folder: str, canonical_url: str, target_version: str) -> Optional[dict]:
    """
    Search through FHIR package files in '~/.fhir/packages/<package>#<version>/package/'
    to find a CodeSystem matching the given canonical URL and version.

    Args:
        root_folder (str): Path to the root folder (e.g., '~/.fhir/packages').
        canonical_url (str): The canonical URL of the CodeSystem (e.g., 'http://terminology.hl7.org/CodeSystem/v2-0004').
        target_version (str): The desired version of the CodeSystem (e.g., '6.2.0').

    Returns:
        Optional[dict]: The matching CodeSystem resource as a dictionary, or None if not found.
    """
    # Convert root_folder to a Path object and expand the user directory (~)
    root_path = Path(root_folder).expanduser()
    print(root_path)
    
    # Check if root folder exists
    if not root_path.is_dir():
        print(f"Root folder {root_path} does not exist.")
        return None
    
    # Iterate over package directories; the glob only matches hl7.terminology.r4 packages (e.g., hl7.terminology.r4#6.0.2)
    for package_dir in root_path.glob('hl7.terminology.r4*'):
        if package_dir.is_dir():
            # Target the 'package' subfolder
            target_path = package_dir / 'package'
            print(f'target_path = {target_path}')
            if target_path.is_dir():
                # Iterate through JSON files in the 'package' folder
                for file_path in target_path.iterdir():
                    if file_path.name.startswith('CodeSystem-') and file_path.suffix == '.json':
                        try:
                            # Read the JSON file
                            with file_path.open('r', encoding='utf-8') as f:
                                resource = json.load(f)
                                
                                # Check if it's a CodeSystem resource
                                if resource.get('resourceType') == 'CodeSystem':
                                    resource_url = resource.get('url', '')
                                    resource_version = resource.get('version', '')
                                    
                                    # Handle versioned canonical URL (e.g., 'http://...|6.2.0')
                                    if '|' in canonical_url:
                                        expected_url, expected_version = canonical_url.split('|', 1)
                                        if resource_url == expected_url and resource_version == expected_version:
                                            return resource
                                    # Match unversioned URL and explicit version
                                    elif resource_url == canonical_url and resource_version == target_version:
                                        return resource
                        except (json.JSONDecodeError, IOError) as e:
                            print(f"Error reading {file_path}: {e}")
                            continue
    
    # Return None if no matching CodeSystem is found
    print(f"No CodeSystem found for {canonical_url} version {target_version}")
    return None
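The function accepts either an unversioned canonical URL plus `target_version`, or a versioned canonical of the form `url|version`. That parsing step can be isolated as shown in the sketch below; `split_canonical` is a hypothetical helper name, not something the notebook defines.

```python
from typing import Optional, Tuple


def split_canonical(canonical: str, default_version: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Split 'url|version' into (url, version).

    Hypothetical helper: when no '|' is present, default_version is returned
    as the version, mirroring how target_version is used above.
    """
    url, sep, version = canonical.partition("|")
    return url, (version if sep else default_version)


# Both call styles resolve to the same (url, version) pair
print(split_canonical("http://terminology.hl7.org/CodeSystem/v2-0004|6.2.0"))
print(split_canonical("http://terminology.hl7.org/CodeSystem/v2-0004", "6.2.0"))
```

Normalizing the input once, before the file loop, would let the matching logic compare a single `(url, version)` pair instead of branching per file.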

In [38]:
# Function to compare concepts between two versions of a CodeSystem
def compare_concepts(cs1: dict, cs2: dict) -> pd.DataFrame:
    """
    Compare concepts between two versions of a CodeSystem and return a DataFrame of differences.

    Args:
        cs1 (dict): First (older) CodeSystem resource.
        cs2 (dict): Second (newer) CodeSystem resource.

    Returns:
        pd.DataFrame: Table showing added, removed, or changed concepts.
    """
    # Extract concepts, defaulting to empty list if not present
    concepts1 = {c['code']: c for c in cs1.get('concept', [])}
    concepts2 = {c['code']: c for c in cs2.get('concept', [])}

    # List to store differences
    differences = []

    # Check for removed or changed concepts (in cs1 but not cs2 or different)
    for code in concepts1:
        if code not in concepts2:
            differences.append({
                'Code': code,
                'Status': 'Removed',
                'Display (Old)': concepts1[code].get('display', ''),
                'Display (New)': '',
                'Definition (Old)': concepts1[code].get('definition', ''),
                'Definition (New)': ''
            })
        elif concepts1[code] != concepts2[code]:
            differences.append({
                'Code': code,
                'Status': 'Changed',
                'Display (Old)': concepts1[code].get('display', ''),
                'Display (New)': concepts2[code].get('display', ''),
                'Definition (Old)': concepts1[code].get('definition', ''),
                'Definition (New)': concepts2[code].get('definition', '')
            })

    # Check for added concepts (in cs2 but not cs1)
    for code in concepts2:
        if code not in concepts1:
            differences.append({
                'Code': code,
                'Status': 'Added',
                'Display (Old)': '',
                'Display (New)': concepts2[code].get('display', ''),
                'Definition (Old)': '',
                'Definition (New)': concepts2[code].get('definition', '')
            })

    # Create DataFrame
    df = pd.DataFrame(differences)
    return df if not df.empty else pd.DataFrame(columns=['Code', 'Status', 'Display (Old)', 'Display (New)', 'Definition (Old)', 'Definition (New)'])
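`compare_concepts` indexes only the top-level `concept` entries. FHIR allows a concept to carry child concepts in its own nested `concept` list, so hierarchical CodeSystems would need flattening before their child codes can be compared one by one. The sketch below shows one way to do that; `iter_concepts` and `index_concepts` are hypothetical names introduced for illustration.

```python
from typing import Dict, Iterator, List


def iter_concepts(concepts: List[dict]) -> Iterator[dict]:
    """Yield every concept depth-first, descending into nested 'concept' lists."""
    for concept in concepts:
        yield concept
        yield from iter_concepts(concept.get("concept", []))


def index_concepts(code_system: dict) -> Dict[str, dict]:
    """Map code -> concept for all concepts in a CodeSystem, including children."""
    return {c["code"]: c for c in iter_concepts(code_system.get("concept", []))}
```

Under this approach, `index_concepts(cs1)` and `index_concepts(cs2)` could stand in for the two dictionary comprehensions at the top of `compare_concepts`. Note that a parent concept's dictionary still contains its children, so a change to a child would also mark the parent as changed.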

## Compare Two Versions of a CodeSystem

Specify the root folder, canonical URL, and two versions to compare. This example uses `http://terminology.hl7.org/CodeSystem/v2-0201` with versions `3.0.0` and `4.0.0`.

In [45]:
# Define parameters
folder = "~/.fhir/packages"
cs_code = 'v2-0201'
url = f"http://terminology.hl7.org/CodeSystem/{cs_code}"
version1 = "3.0.0"  # Older version
version2 = "4.0.0"  # Newer version

print(f'old codesystem = {url}|{version1}')
print(f'new codesystem = {url}|{version2}')
# Find the two CodeSystem versions
cs1 = find_codesystem_version(folder, url, version1)
cs2 = find_codesystem_version(folder, url, version2)

# Compare concepts if both versions are found
if cs1 and cs2:
    print(f"Comparing {cs1['url']} v{cs1['version']} with v{cs2['version']}")
    diff_table = compare_concepts(cs1, cs2)
    display(diff_table)  # Nicer table output in Jupyter
else:
    if not cs1:
        print(f"Failed to find {url} version {version1}")
    if not cs2:
        print(f"Failed to find {url} version {version2}")

old codesystem = http://terminology.hl7.org/CodeSystem/v2-0201|3.0.0
new codesystem = http://terminology.hl7.org/CodeSystem/v2-0201|4.0.0
/Users/ehaas/.fhir/packages
target_path = /Users/ehaas/.fhir/packages/hl7.terminology.r4#5.0.0/package
/Users/ehaas/.fhir/packages
target_path = /Users/ehaas/.fhir/packages/hl7.terminology.r4#5.0.0/package
target_path = /Users/ehaas/.fhir/packages/hl7.terminology.r4#5.3.0/package
target_path = /Users/ehaas/.fhir/packages/hl7.terminology.r4#6.0.2/package
Comparing http://terminology.hl7.org/CodeSystem/v2-0201 v3.0.0 with v4.0.0


| Code | Status | Display (Old) | Display (New) | Definition (Old) | Definition (New) |
|------|--------|---------------|---------------|------------------|------------------|


## Notes
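- **Folder Structure**: Assumes a structure like `~/.fhir/packages/hl7.terminology.r4#6.0.2/package/CodeSystem-v2-0201.json`. The function only scans packages matching the glob `hl7.terminology.r4*`; change that pattern to search other packages such as `hl7.terminology.r5`.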
- **Error Handling**: The function reports JSON or I/O errors for specific files.
- **Customization**: Modify `folder`, `cs_code`, `version1`, and `version2` in the cell above to match your needs.

Run the cells in sequence to test the function with your local FHIR packages!
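When choosing values for `version1` and `version2`, it can help to see which CodeSystem versions the cached packages actually contain. The sketch below assumes the same folder layout and the same `hl7.terminology.r4*` glob as `find_codesystem_version`; the function name `available_versions` and its `package_glob` parameter are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def available_versions(root_folder: str, canonical_url: str,
                       package_glob: str = "hl7.terminology.r4*") -> Dict[str, str]:
    """Map each CodeSystem version found for canonical_url to the package folder holding it."""
    root = Path(root_folder).expanduser()
    found: Dict[str, str] = {}
    if not root.is_dir():
        return found
    for file_path in sorted(root.glob(f"{package_glob}/package/CodeSystem-*.json")):
        try:
            resource = json.loads(file_path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            # ValueError covers both JSON and text-decoding errors; skip unreadable files
            continue
        if resource.get("resourceType") == "CodeSystem" and resource.get("url") == canonical_url:
            # Keep the first package (in sorted order) that provides each version
            found.setdefault(resource.get("version", ""), file_path.parent.parent.name)
    return found
```

For example, `available_versions("~/.fhir/packages", url)` returns a dictionary whose keys are candidate values for `version1` and `version2`.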