In [None]:
import json
import re
import time

import pandas as pd
from shared_functions import get_access_token, call_gpt_with_prompt

## Prompt Decomposition Level: None

Each dataframe row in none_pd_df represents information for one patient.

<u>__Recommended dataframe columns__</u>

__Patient MRN__: Patient identifier

__Appended Eligible Rad Report Texts__: Complete eligible radiology report texts appended together (Index metastasis report text + text from subsequent eligible radiology reports within 30 days of index report)

__Volume__: Volume classification of metastatic disease (High or Low); This is the output returned by GPT-4


| Patient MRN | Appended Eligible Rad Report Texts | Volume |
|------|------|------|
| XXXXXXX | "Appended text here" | H or L |
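
One way the `Appended Eligible Rad Report Texts` column could be assembled from a per-report table is sketched below. The per-report column names (`Report Date`, `Report Text`), the choice of the earliest report as the index report, and the newline-wrapped separator are all assumptions made for illustration; the actual eligibility criteria are not specified here.

```python
import pandas as pd

SEPARATOR = "----------"  # separator named in the prompts

# Hypothetical per-report table; these column names are assumptions.
reports = pd.DataFrame({
    "Patient MRN": ["A", "A", "A", "B"],
    "Report Date": pd.to_datetime(
        ["2020-01-01", "2020-01-20", "2020-03-15", "2021-05-05"]
    ),
    "Report Text": ["index report A", "follow-up A", "late report A", "index report B"],
})


def append_eligible_reports(reports, window_days=30):
    """Join each patient's reports dated within window_days of the index report.

    Assumption: the index report is the patient's earliest report.
    """
    reports = reports.sort_values(["Patient MRN", "Report Date"])
    index_date = reports.groupby("Patient MRN")["Report Date"].transform("min")
    in_window = (reports["Report Date"] - index_date) <= pd.Timedelta(days=window_days)
    return (
        reports[in_window]
        .groupby("Patient MRN")["Report Text"]
        .agg(lambda texts: f"\n{SEPARATOR}\n".join(texts))
        .rename("Appended Eligible Rad Report Texts")
        .reset_index()
    )


none_pd_df = append_eligible_reports(reports)
```

In this toy table, patient A's third report falls outside the 30-day window and is left out of the appended text.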

The run_none_pd function uses GPT-4 to directly infer volume classification from all eligible radiology report texts.

In [None]:
def run_none_pd(none_pd_df):
    token = get_access_token()
    for index, row in none_pd_df.iterrows():
    
        report_string = row['Appended Eligible Rad Report Texts']

        prompt_string = """
        Based on the provided radiology report(s), 
        TASK: Identify if the report(s) indicate(s) evidence of high volume metastatic disease from prostate cancer. 

        Definition of high volume metastatic disease is the presence of distant visceral metastasis 
        or the presence of four or more bone metastatic lesions total with at least one of these metastatic lesions being outside of the vertebral column and pelvis.  

        Disregard lymph node metastasis. 
        [Do not consider lymph adenopathy, lymph node enlargement, lymph node metastasis when defining high volume metastatic disease]. 

        Disregard sites of regional visceral metastasis (such as the prostate, bladder, urethra, ureter, seminal vesicles, rectum).

        Output format: 'Yes' or 'No'.
        Do not include any other information, even the question string, in the response. Strictly follow given output format.
        """
        
        # System content specifies the role that the LLM should take on.
        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "classify the volume of disease for a patient with high or low volume metastatic prostate cancer." 
        
        # User content specifies details of specific task instructions.
        user_content = "Here is a compilation of one or more radiology reports from a patient with metastatic prostate cancer. " \
        f"Each individual report is separated by '----------': {report_string}. " \
        f"{prompt_string}"

        output = call_gpt_with_prompt(system_content, user_content, token)

        # Re-fetch access token in case it expires before the for loop completes.
        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)       

        cleaned_output = output.strip().strip("'").strip('"')
        if cleaned_output == 'Yes':
            none_pd_df.loc[index, 'Volume'] = 'H'
        elif cleaned_output == 'No':
            none_pd_df.loc[index, 'Volume'] = 'L'
        else:
            none_pd_df.loc[index, 'Volume'] = f"Output Format Error, Expected 'Yes' or 'No', Got: {output}"            
            
    return none_pd_df
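

The response handling at the end of `run_none_pd` can be tried in isolation. The sketch below mirrors that cleaning and mapping logic in a standalone helper; the name `map_yes_no_to_volume` is hypothetical and is not part of the notebook.

```python
def map_yes_no_to_volume(output):
    """Map a raw 'Yes'/'No' model response to 'H'/'L', or to an error string."""
    # Remove surrounding whitespace and any single or double quotes.
    cleaned = output.strip().strip("'").strip('"')
    if cleaned == "Yes":
        return "H"
    if cleaned == "No":
        return "L"
    return f"Output Format Error, Expected 'Yes' or 'No', Got: {output}"


quoted_yes = map_yes_no_to_volume(" 'Yes' ")
quoted_no = map_yes_no_to_volume('"No"\n')
unexpected = map_yes_no_to_volume("Yes.")
```

A response with trailing punctuation such as `Yes.` does not match either branch, so it is stored as a format error and can be reviewed manually.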

<u>__Collaborative LLM Experiments__</u>

The collaborative LLM experiments used the "PD Level: None" framework. In addition to GPT-4, we ran the same prompt with Gemini-2.0-flash and with GPT-4o.

For the GPT-4 + Gemini-2.0-flash experiment, we took the volume classifications on which the two LLMs agreed (consensus) and evaluated the accuracy of those classifications.

The same was done for the GPT-4 + GPT-4o experiment.
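
A minimal sketch of the consensus step is shown below, assuming each model's results are held in a dataframe with `Patient MRN` and `Volume` columns. The reference labels (`Reference Volume`), the toy values, and the function name `consensus_accuracy` are assumptions for illustration; how non-consensus cases were handled is not described here.

```python
import pandas as pd

# Toy outputs from two models and hypothetical reference labels.
gpt4_df = pd.DataFrame({"Patient MRN": ["1", "2", "3"], "Volume": ["H", "L", "H"]})
gemini_df = pd.DataFrame({"Patient MRN": ["1", "2", "3"], "Volume": ["H", "H", "H"]})
truth_df = pd.DataFrame({"Patient MRN": ["1", "2", "3"], "Reference Volume": ["H", "L", "L"]})


def consensus_accuracy(df_a, df_b, truth_df):
    """Return (number of consensus patients, accuracy on those patients)."""
    merged = df_a.merge(df_b, on="Patient MRN", suffixes=("_a", "_b"))
    merged = merged.merge(truth_df, on="Patient MRN")
    # Keep only patients on whom both models gave the same classification.
    consensus = merged[merged["Volume_a"] == merged["Volume_b"]]
    if consensus.empty:
        return 0, float("nan")
    accuracy = (consensus["Volume_a"] == consensus["Reference Volume"]).mean()
    return len(consensus), float(accuracy)


n_consensus, accuracy = consensus_accuracy(gpt4_df, gemini_df, truth_df)
```

Here the two models agree on patients 1 and 3, and only patient 1 matches the reference label, so the accuracy on consensus cases is 0.5.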

## Prompt Decomposition Level: Low

Each dataframe row in low_pd_df represents information for one patient.

<u>__Recommended dataframe columns__</u>

__Patient MRN__: Patient identifier

__Appended Eligible Rad Report Texts__: Complete eligible radiology report texts appended together (Index metastasis report text + text from subsequent eligible radiology reports within 30 days of index report)

__Sites of Bone and Visceral Metastasis__: JSON formatted list of each bone metastatic site, with respective number of lesions at each site, and each visceral metastatic site across all eligible reports. This is the first output returned by GPT-4 in this framework.

__Volume__: Volume classification of metastatic disease (High or Low); This is the final output returned by GPT-4


| Patient MRN | Appended Eligible Rad Report Texts | Sites of Bone and Visceral Metastasis | Volume |
|------|------|------|------|
| XXXXXXX | "Appended text here" | {JSON formatted list of sites} | H or L |

### Prompt L1 - Identify sites of bone metastasis and sites of visceral metastasis.

The run_low_pd_1 function uses GPT-4 to extract sites of bone metastasis and sites of visceral metastasis from all eligible radiology reports.

In [None]:
def run_low_pd_1(low_pd_df):
    token = get_access_token()    
    for index, row in low_pd_df.iterrows():

        report_string = row['Appended Eligible Rad Report Texts']

        prompt_string = """
        Based on the provided radiology report(s), 
        TASK: Firstly, identify the sites of bone metastasis and sites of distant visceral metastasis from prostate cancer. 
        Distinguish between bone metastasis within the vertebral column + pelvis and bone metastasis outside of the vertebral column + pelvis.
        Secondly, identify an integer number of metastatic lesions at each bone metastatic site. 
        If the number of metastatic lesions is unclear but it is implied that there are multiple, then say "multiple".
        Do not include any lymph nodes as sites of visceral metastasis.
        Do not include regional sites of visceral metastasis (such as the prostate, bladder, urethra, ureter, seminal vesicles, rectum).
        NOTE: Only include a site of metastasis if the report(s) indicate(s) a high or definite likelihood of metastasis to that site.

        Output format: 

        Bone metastatic site within the vertebral column and pelvis #1: ___
        Number of metastatic lesions at this site: ___

        Bone metastatic site outside of the vertebral column and pelvis #1: ___
        Number of metastatic lesions at this site: ___

        Visceral metastatic site #1: ___

        Example output #1: 

        Bone metastatic site within the vertebral column and pelvis #1: Left iliac crest
        Number of metastatic lesions at this site: Multiple

        Bone metastatic site within the vertebral column and pelvis #2: T12 vertebral body
        Number of metastatic lesions at this site: 2

        Bone metastatic site outside of the vertebral column and pelvis #1: Right humerus
        Number of metastatic lesions at this site: 1

        Visceral metastatic site #1: Upper lobe of right lung


        If there is general mention of "diffuse skeletal metastases", then -

        Example output #2: 

        Bone metastatic site within the vertebral column and pelvis #1: Diffuse skeletal metastases
        Number of metastatic lesions at this site: multiple

        Bone metastatic site outside of the vertebral column and pelvis #1: Diffuse skeletal metastases
        Number of metastatic lesions at this site: multiple

        Visceral metastatic site #1: none


        If there are no bone or visceral metastatic sites, then -

        Example output #3: "There are no bone or visceral metastatic sites."


        Do not include any other information, even the question string, in the response. Strictly follow the given output formats.
        """

        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "identify sites of bone metastasis and visceral metastasis from prostate cancer."

        user_content = "Here is a compilation of one or more radiology reports from a patient with metastatic prostate cancer. " \
        f"Each individual report is separated by '----------': {report_string}. " \
        f"{prompt_string}"

        output = call_gpt_with_prompt(system_content, user_content, token)

        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)          

        low_pd_df.loc[index, 'Sites of Bone and Visceral Metastasis'] = output       

    return low_pd_df
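

The token-refresh pattern repeated in each function could be factored into one helper, sketched below. The helper name `call_with_token_refresh` is hypothetical, and the call and token functions are passed in as arguments so the example runs without `shared_functions`; the stubs only imitate the `"invalid response"` behaviour checked in the loops above.

```python
import time


def call_with_token_refresh(call_fn, get_token_fn, system_content, user_content,
                            token, wait_seconds=2):
    """Call the model once; on "invalid response", refresh the token and retry once.

    Returns (output, token) so the caller can keep using the refreshed token.
    """
    output = call_fn(system_content, user_content, token)
    if output == "invalid response":
        time.sleep(wait_seconds)
        token = get_token_fn()
        output = call_fn(system_content, user_content, token)
    return output, token


# Stand-in stubs (assumptions): a stale token yields "invalid response".
def fake_get_token():
    return "fresh"


def fake_call(system_content, user_content, token):
    return "Yes" if token == "fresh" else "invalid response"


output, token = call_with_token_refresh(
    fake_call, fake_get_token, "system text", "user text", "stale", wait_seconds=0
)
```

Returning the refreshed token is a design choice of this sketch: it lets later rows reuse the new token rather than failing once per row after the original token expires.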

### Prompt L2 - Using metastatic sites outputted from L1, classify as high or low volume.

The run_low_pd_2 function takes the metastatic sites and lesion counts returned by the first prompt (L1) of the "PD: Low" experimental framework and uses GPT-4 to make the final volume classification.

In [None]:
def run_low_pd_2(low_pd_df):
    token = get_access_token()
    for index, row in low_pd_df.iterrows():

        sites_string = row['Sites of Bone and Visceral Metastasis']

        prompt_string = """
        TASK: Identify if the provided list of bone metastatic sites and visceral metastatic sites 
        indicate(s) evidence of high volume metastatic disease from prostate cancer. Go step by step.

        Step 1: If there is a visceral metastatic site in the list, output is 'Yes'. Skip steps 2 and 3.

        Step 2: If there is the presence of four or more bone metastatic lesions total with 
        at least one of these metastatic lesions being outside of the vertebral bodies and pelvis, output is 'Yes'. Skip step 3.

        Step 3: If step 1 and 2 are 'No', then the output is 'No'.

        Output format: 'Yes' or 'No'.
        Do not include any other information, even the question string, in the response. Strictly follow given output format.
        """
        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "classify the volume of disease for a patient with high or low volume metastatic prostate cancer." 
        
        user_content = "Here is a list of bone metastatic sites and visceral metastatic sites, " \
        f"alongside the number of metastatic lesions at each bone metastatic site: {sites_string}. " \
        f"{prompt_string}"
        
        output = call_gpt_with_prompt(system_content, user_content, token)

        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)        

        cleaned_output = output.strip().strip("'").strip('"')
        if cleaned_output == 'Yes':
            low_pd_df.loc[index, 'Volume'] = 'H'
        elif cleaned_output == 'No':
            low_pd_df.loc[index, 'Volume'] = 'L'
        else:
            low_pd_df.loc[index, 'Volume'] = f"Output Format Error, Expected 'Yes' or 'No', Got: {output}"       

    return low_pd_df
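

Because unexpected responses are written into the `Volume` column as error strings, it can help to count them before computing any metrics. The sketch below is one possible check; the function name `summarize_volume` and the toy dataframe are assumptions.

```python
import pandas as pd


def summarize_volume(df):
    """Count 'H', 'L', and format-error entries in the Volume column."""
    volume = df["Volume"].astype(str)
    is_error = volume.str.startswith("Output Format Error")
    return {
        "H": int((volume == "H").sum()),
        "L": int((volume == "L").sum()),
        "format_errors": int(is_error.sum()),
    }


# Toy dataframe standing in for the output of run_low_pd_2.
toy_df = pd.DataFrame({
    "Volume": [
        "H",
        "L",
        "Output Format Error, Expected 'Yes' or 'No', Got: Maybe",
        "H",
    ]
})
summary = summarize_volume(toy_df)
```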

## Prompt Decomposition Level: High

Each dataframe row in high_pd_df represents information for one patient.

<u>__Recommended dataframe columns__</u>

__Patient MRN__: Patient identifier

__Appended Eligible Rad Report Texts__: Complete eligible radiology report texts appended together (Index metastasis report text + text from subsequent eligible radiology reports within 30 days of index report)

__Metastatic Sites: V&P__: JSON formatted list of each bone metastatic site within the vertebral column and pelvis, with respective number of lesions at each site. This is the first output returned by GPT-4 in this framework

__Metastatic Sites: Outside V&P__: JSON formatted list of each bone metastatic site outside of the vertebral column and pelvis, with respective number of lesions at each site. This is the second output returned by GPT-4 in this framework

__Metastatic Sites: Visceral__: JSON formatted list of each visceral metastatic site. This is the third output returned by GPT-4 in this framework

__Volume__: Volume classification of metastatic disease (High or Low); This is the final output returned by GPT-4


| Patient MRN | Appended Eligible Rad Report Texts | Metastatic Sites: V&P | Metastatic Sites: Outside V&P | Metastatic Sites: Visceral | Volume |
|------|------|------|------|------|------|
| XXXXXXX | "Appended text here" | {JSON formatted list of sites} | {JSON formatted list of sites} | {JSON formatted list of sites} | H or L |

### Prompt H1a - Identify sites of bone metastasis within the vertebral column and pelvis.

The run_high_pd_1a function uses GPT-4 to extract sites of bone metastasis within the vertebral column and pelvis, and the number of lesions at each site, from all eligible radiology reports.

In [None]:
def run_high_pd_1a(high_pd_df):
    token = get_access_token()    
    for index, row in high_pd_df.iterrows():

        report_string = row['Appended Eligible Rad Report Texts']

        prompt_string = """
        Based on the provided radiology report(s), 
        TASK: 
        Firstly, identify only the sites of bone metastasis within the vertebral column and pelvis (v&p).
        If there is general mention of diffuse skeletal metastases or diffuse osseous metastases throughout, then say "diffuse skeletal metastasis".
        Important note: Do not include any bones within the skull (head), the ribs, the femoral head, and the femur.

        Secondly, identify an integer number of metastatic lesions at each of these metastatic sites. 
        If the number of metastatic lesions is unclear but it is implied that there are multiple, then use 4.

        Important note: Only include a site of metastasis if the report(s) indicate(s) a high or definite likelihood of metastasis to that site. 
        Do not include indeterminate sites of metastasis.

        Output in JSON format: 

        {
        "v&p #1": ___,
        "number of lesions in v&p #1": ___
        }

        Example #1 of output: 

        {
        "v&p #1": "left iliac crest",
        "number of lesions in v&p #1": 4,
        "v&p #2": "T12 pedicle",
        "number of lesions in v&p #2": 2
        }

        Example #2 of output: 

        {
        "v&p #1": "diffuse skeletal metastasis",
        "number of lesions in v&p #1": 4
        }

        Example #3 of output: 

        {
        "v&p #1": "none",
        "number of lesions in v&p #1": 0
        }

        Do not include any other information, even the question string, in the response. Strictly follow the given output format in JSON.
        """

        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "identify sites of bone metastasis within the vertebral column and pelvis." 

        user_content = "Here is a compilation of one or more radiology reports from a patient with metastatic prostate cancer. " \
        f"Each individual report is separated by '----------': {report_string}. " \
        f"{prompt_string}" 

        output = call_gpt_with_prompt(system_content, user_content, token)

        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)

        high_pd_df.loc[index, 'Metastatic Sites: V&P'] = output  

    return high_pd_df
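

Since later steps parse the H1a response as JSON, a light structural check can flag malformed rows early. The sketch below tests only the key layout requested in prompt H1a; the function name `check_h1a_output` and the wording of the problem messages are assumptions.

```python
import json
import re

# Site keys in the H1a output format look like "v&p #1", "v&p #2", ...
SITE_KEY = re.compile(r"^v&p #(\d+)$")


def check_h1a_output(raw_output):
    """Return a list of problems in an H1a response (empty list: looks well-formed)."""
    try:
        data = json.loads(raw_output)
    except (TypeError, json.JSONDecodeError):
        return ["response is not valid JSON"]
    if not isinstance(data, dict):
        return ["response is not a JSON object"]

    problems = []
    site_numbers = [m.group(1) for m in map(SITE_KEY.match, data) if m]
    if not site_numbers:
        problems.append('no "v&p #N" keys found')
    for n in site_numbers:
        count_key = f"number of lesions in v&p #{n}"
        count = data.get(count_key)
        if count_key not in data:
            problems.append(f'missing "{count_key}"')
        elif isinstance(count, bool) or not isinstance(count, int):
            problems.append(f'"{count_key}" is not an integer: {count!r}')
    return problems


good = ('{"v&p #1": "left iliac crest", "number of lesions in v&p #1": 4, '
        '"v&p #2": "T12 pedicle", "number of lesions in v&p #2": 2}')
bad = '{"v&p #1": "sacrum", "number of lesions in v&p #1": "multiple"}'

good_problems = check_h1a_output(good)
bad_problems = check_h1a_output(bad)
```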

### Prompt H1b - Identify sites of bone metastasis outside of the vertebral column and pelvis.

The run_high_pd_1b function uses GPT-4 to extract sites of bone metastasis outside of the vertebral column and pelvis, and the number of lesions at each site, from all eligible radiology reports.

In [None]:
def run_high_pd_1b(high_pd_df):
    token = get_access_token()
    for index, row in high_pd_df.iterrows():

        report_string = row['Appended Eligible Rad Report Texts']

        prompt_string = """
        Based on the provided radiology report(s), 

        Go step by step and eventually use the following output format in JSON:

        {
        "outside v&p #1": ___,
        "number of lesions outside v&p #1": ___
        }

        Step 1: If the report(s) indicates extensive metastasis throughout most of the skeleton, then simply output 'diffuse skeletal metastasis' (use output example #1).
        [Examples of phrases indicating extensive metastasis throughout most of the skeleton include: 
        'diffuse osseous metastases throughout the skeleton', 'innumerable foci of uptake in the skeleton', 'superscan appearance',
        'scattered sclerotic lesions throughout the skeleton', 'widespread osseous metastatic disease', 'uptake throughout the axial and appendicular skeleton', 
        'diffuse osteoblastic disease', 'extensive sclerotic osseous lesions', 'sclerosis throughout the visualized skeleton', 'widespread bone metastatic lesions']
        Note: Even if the exact locations of the metastases are not specified, if the report(s) indicates extensive metastasis throughout most of the skeleton, 
        output "diffuse skeletal metastasis" (use output example #1).

        Output example #1:

        {
        "outside v&p #1": "diffuse skeletal metastasis",
        "number of lesions outside v&p #1": 4
        }

        Step 2: Otherwise, identify only the sites of bone metastasis outside of the vertebral column and pelvis (outside v&p).
        Also, identify an integer number of metastatic lesions at each of these metastatic sites. 
        If the number of metastatic lesions is unclear but it is implied that there are multiple, then use 4 as the integer number of metastatic lesions.

        Output example #2: 

        {
        "outside v&p #1": "left 6th rib",
        "number of lesions outside v&p #1": "1",
        "outside v&p #2": "humerus",
        "number of lesions outside v&p #2": 1
        }

        Important rules to follow:
        Inclusion criteria
        - Only include a site of bone metastasis outside of the vertebral column and pelvis if the report(s) strongly suggest metastasis to that site.

        Exclusion criteria
        - Do not include sites that are 'nonspecific' or 'indeterminate' for metastasis.
        - Do not include sites with 'radiotracer uptake', 'focal uptake', 'focal activity', 'focal area of sclerosis' unless the report(s) strongly suggest metastasis to that site.
        - Do not include sites that are fractures not from metastasis.
        - Do not include any sites within the vertebral column and pelvis, such as C1-C7, T1-T12, L1-L5 or any pelvic bones.

        If there is no bone metastasis outside the vertebral column and pelvis, follow output example #3.

        Output example #3: 

        {
        "outside v&p #1": "none",
        "number of lesions outside v&p #1": 0
        }

        Do not include any other information, even the question string, in the response. Strictly follow the given output format in JSON.
        """

        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "identify sites of bone metastasis outside of the vertebral column and pelvis." 

        user_content = "Here is a compilation of one or more radiology reports from a patient with metastatic prostate cancer. " \
        f"Each individual report is separated by '----------': {report_string}. " \
        f"{prompt_string}" 

        output = call_gpt_with_prompt(system_content, user_content, token)

        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)

        high_pd_df.loc[index, 'Metastatic Sites: Outside V&P'] = output  

    return high_pd_df

### Prompt H1c - Identify sites of distant visceral metastasis.

The run_high_pd_1c function uses GPT-4 to extract sites of visceral metastasis from all eligible radiology reports.

In [None]:
def run_high_pd_1c(high_pd_df):
    token = get_access_token()    
    for index, row in high_pd_df.iterrows():

        report_string = row['Appended Eligible Rad Report Texts']

        prompt_string = """
        Based on the provided radiology report(s), 
        TASK: 
        Identify only the sites of distant visceral metastasis from prostate cancer.

        Important notes:
        - Only include a site of metastasis if the report(s) indicate(s) a high or definite likelihood of metastasis to that site. Do not include indeterminate sites of metastasis.
        - Do not include lymph node metastasis.
        - Disregard sites of regional visceral metastasis (such as the prostate, bladder, urethra, ureter, seminal vesicles, rectum).

        Output in JSON format: 

        {
        "visceral #1": ___
        }

        Example #1 of output: 

        {
        "visceral #1": "upper left lung"
        }

        Example #2 of output: 

        {
        "visceral #1": "none"
        }

        Example #3 of output: 

        {
        "visceral #1": "lower left lobe of lung",
        "visceral #2": "right hepatic lobe"
        }

        Do not include any other information, even the question string, in the response. Strictly follow the given output format in JSON.
        """

        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "identify sites of visceral metastasis." 

        user_content = "Here is a compilation of one or more radiology reports from a patient with metastatic prostate cancer. " \
        f"Each individual report is separated by '----------': {report_string}. " \
        f"{prompt_string}" 

        output = call_gpt_with_prompt(system_content, user_content, token)

        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)

        high_pd_df.loc[index, 'Metastatic Sites: Visceral'] = output  

    return high_pd_df

### Prompt H2 - Using metastatic sites outputted from H1a, H1b, and H1c, classify as high or low volume.

The run_high_pd_2 function uses GPT-4 to take in metastatic sites and number of lesions at each site outputted from prior prompts H1a, H1b, and H1c to make the final volume classification.

In [None]:
def run_high_pd_2(high_pd_df):
    token = get_access_token()
    for index, row in high_pd_df.iterrows():

        list1 = row['Metastatic Sites: Visceral']
        list2 = row['Metastatic Sites: V&P']
        list3 = row['Metastatic Sites: Outside V&P']

        prompt_string = """
        TASK: Identify if the provided JSON-formatted lists of visceral metastatic sites and bone metastatic sites indicate evidence of high-volume metastatic disease from prostate cancer.
        Go step by step, but only output 'Yes' or 'No' at the end without explaining the individual steps.

        Step 1: Examine List 1 for "visceral #1": "value". If the value of "visceral #1" is a visceral site (such as lung, liver, spleen, adrenal gland, brain, peritoneum), output 'Yes' 
        and end the process. Skip remaining steps. 
        Otherwise, if the value of "visceral #1" is "none", proceed to Step 2.

        Step 2: Add together all of the "number of lesions in v&p #X" values for List #2.

        Step 3: Add together all of the "number of lesions outside v&p #X" values for List #3.

        Step 4: If BOTH Condition A and Condition B are met [Condition A: the combined total number of lesions in List 2 + List 3 is at least 4, Condition B: the total number of lesions in List 3 is at least 1], output 'Yes' and end the process. Skip Step 5. 
        Otherwise, proceed to Step 5.
        IMPORTANT: Carefully check for both Condition A and Condition B.

        Step 5: Output 'No'.

        Output format: 'Yes' or 'No'.
        Do not include any other information, even the question string, in the response. Strictly follow the given output format.
        """
        system_content = "Being an oncologist with specific expertise in prostate cancer, " \
        "classify the volume of disease for a patient with high or low volume metastatic prostate cancer." 
        
        user_content = "Here are three JSON formatted lists of sites of metastasis - " \
        f"List 1 (visceral metastasis): {list1} " \
        f"List 2 (bone metastasis within vertebral column and pelvis (v&p)): {list2} " \
        f"List 3 (bone metastasis outside of the vertebral column and pelvis (outside v&p)): {list3} " \
        "from radiology reports of a patient with metastatic prostate cancer: " \
        f"{prompt_string}"

        output = call_gpt_with_prompt(system_content, user_content, token)

        if output == "invalid response":
            time.sleep(2)
            token1 = get_access_token()
            output = call_gpt_with_prompt(system_content, user_content, token1)   

        cleaned_output = output.strip().strip("'").strip('"')
        if cleaned_output == 'Yes':
            high_pd_df.loc[index, 'Volume'] = 'H'
        elif cleaned_output == 'No':
            high_pd_df.loc[index, 'Volume'] = 'L'
        else:
            high_pd_df.loc[index, 'Volume'] = f"Output Format Error, Expected 'Yes' or 'No', Got: {output}"    

    return high_pd_df

## Prompt Decomposition Level: High with Rule-Based Programming (RBP)

Each dataframe row in high_pd_rbp_df represents information for one patient.

<u>__Recommended dataframe columns (same as high PD)__</u>

__Patient MRN__: Patient identifier

__Appended Eligible Rad Report Texts__: Complete eligible radiology report texts appended together (Index metastasis report text + text from subsequent eligible radiology reports within 30 days of index report)

__Metastatic Sites: V&P__: JSON formatted list of each bone metastatic site within the vertebral column and pelvis, with respective number of lesions at each site. This is the first output returned by GPT-4 in this framework

__Metastatic Sites: Outside V&P__: JSON formatted list of each bone metastatic site outside of the vertebral column and pelvis, with respective number of lesions at each site. This is the second output returned by GPT-4 in this framework

__Metastatic Sites: Visceral__: JSON formatted list of each visceral metastatic site. This is the third output returned by GPT-4 in this framework

__Volume__: Volume classification of metastatic disease (High or Low); This is the final output, produced by the RBP algorithm rather than by GPT-4


| Patient MRN | Appended Eligible Rad Report Texts | Metastatic Sites: V&P | Metastatic Sites: Outside V&P | Metastatic Sites: Visceral | Volume |
|------|------|------|------|------|------|
| XXXXXXX | "Appended text here" | {JSON formatted list of sites} | {JSON formatted list of sites} | {JSON formatted list of sites} | H or L |

Note: In the "High PD with RBP" experimental framework, prompts H1a, H1b, and H1c from the "High PD" experimental framework remain the same. The only change is that prompt H2 is replaced by an RBP algorithm.

In [None]:
# is_none_value() checks if "visceral #1" has value of "none"
def is_none_value(json_string):
    data = json.loads(json_string) 
    return data.get("visceral #1") == "none"

In [None]:
# sum_lesions() finds the sum of number of metastatic lesions in a category
def sum_lesions(json_string):
    
    # Parse the JSON string into a dictionary
    data = json.loads(json_string)
    
    # Calculate the total lesions
    total_lesions = sum(
        4 if value == "multiple" else int(value)  # Map "multiple" to 4, otherwise convert to int
        for key, value in data.items()
        if key.startswith("number of lesions")
    )
    
    return total_lesions
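

As a worked example, the sketch below restates `sum_lesions` so that it runs on its own and applies it to values taken from the example outputs in prompts H1a and H1b. The third input is a hypothetical response that uses the word "multiple" instead of an integer.

```python
import json


def sum_lesions(json_string):
    """Sum all "number of lesions ..." values, counting "multiple" as 4."""
    data = json.loads(json_string)
    return sum(
        4 if value == "multiple" else int(value)
        for key, value in data.items()
        if key.startswith("number of lesions")
    )


# Example #1 from prompt H1a: 4 + 2 lesions.
vp_total = sum_lesions(
    '{"v&p #1": "left iliac crest", "number of lesions in v&p #1": 4, '
    '"v&p #2": "T12 pedicle", "number of lesions in v&p #2": 2}'
)

# Example #2 from prompt H1b: one count is the string "1", which int() accepts.
outside_total = sum_lesions(
    '{"outside v&p #1": "left 6th rib", "number of lesions outside v&p #1": "1", '
    '"outside v&p #2": "humerus", "number of lesions outside v&p #2": 1}'
)

# Hypothetical response using "multiple" rather than an integer.
multiple_total = sum_lesions(
    '{"v&p #1": "sacrum", "number of lesions in v&p #1": "multiple"}'
)
```

The totals are 6, 2, and 4 respectively. Any other non-numeric value, such as "several", would raise a `ValueError` from `int()`.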


In [None]:
# extract_json() extracts a json formatted string within a string that may contain text outside of the json string
def extract_json(text):
    try:
        # Regular expression to match JSON objects
        json_pattern = r'\{.*?\}'
        match = re.search(json_pattern, text, re.DOTALL)
        if match:
            # Extract the JSON string
            json_string = match.group()
            # Parse it to verify valid JSON
            json_data = json.loads(json_string)
            return json.dumps(json_data)
        else:
            return None
    except json.JSONDecodeError:
        return None
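

The behaviour of `extract_json` can be illustrated on a few inputs. The sketch below restates the function so that it runs on its own; the input strings are made up. Note that the non-greedy pattern stops at the first closing brace, so it suits the flat objects requested in prompts H1a to H1c but not nested JSON.

```python
import json
import re


def extract_json(text):
    """Return the first flat JSON object found in text as a string, else None."""
    match = re.search(r'\{.*?\}', text, re.DOTALL)
    if not match:
        return None
    try:
        return json.dumps(json.loads(match.group()))
    except json.JSONDecodeError:
        return None


# Extra text around the JSON object is discarded.
wrapped = extract_json('Here is the result:\n{\n"visceral #1": "none"\n}\nLet me know.')

# No JSON object at all.
missing = extract_json("No metastatic sites were identified.")

# Nested object: the match ends at the first "}", which is not valid JSON.
nested = extract_json('{"a": {"b": 1}}')
```

Here `wrapped` is the string `{"visceral #1": "none"}`, while `missing` and `nested` are both `None`.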

The run_rbp function uses a rule-based programming algorithm to take in metastatic sites and number of lesions at each site outputted from prior prompts H1a, H1b, and H1c to make the final volume classification.

In [None]:
def run_rbp(high_pd_df):
    for index, row in high_pd_df.iterrows():   
        patient_mrn = row['Patient MRN']
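        visceral = extract_json(row['Metastatic Sites: Visceral'])
        vp = extract_json(row['Metastatic Sites: V&P'])
        outside_vp = extract_json(row['Metastatic Sites: Outside V&P'])

        # extract_json() returns None when no valid JSON is found; flag the row instead of raising an error.
        if visceral is None or vp is None or outside_vp is None:
            high_pd_df.loc[index, 'Volume'] = "Output Format Error, Expected valid JSON in all three site columns"
            continue
        
        # First, check if there is a visceral metastatic site. If so, there is no need for further checks, and volume of disease is high.
        if not is_none_value(visceral):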
            output = 'H'
        
        # Otherwise, if there are at least 4 total bone metastatic lesions with at least 1 of these being outside of the vertebral column and pelvis,
        # then volume of disease is still high.
        # Else, volume of disease is low.
        else:
            sum_lesions_vp = sum_lesions(vp)
            sum_lesions_outside_vp = sum_lesions(outside_vp)
            sum_lesions_bone = sum_lesions_vp + sum_lesions_outside_vp
            if (sum_lesions_bone >= 4 and sum_lesions_outside_vp > 0):
                output = 'H'
            else:
                output = 'L'
        high_pd_df.loc[index, 'Volume'] = output
    return high_pd_df
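

The rule applied by `run_rbp` can be checked on a few hand-made cases. The sketch below condenses the same rule into a single standalone function that takes the three JSON strings directly; the name `classify_volume` and the toy inputs are assumptions for illustration.

```python
import json


def classify_volume(visceral_json, vp_json, outside_vp_json):
    """Return 'H' or 'L' using the same rule as run_rbp."""
    # Rule 1: any visceral metastatic site means high volume.
    if json.loads(visceral_json).get("visceral #1") != "none":
        return "H"

    def total(json_string):
        data = json.loads(json_string)
        return sum(
            4 if value == "multiple" else int(value)
            for key, value in data.items()
            if key.startswith("number of lesions")
        )

    # Rule 2: at least 4 bone lesions in total, with at least 1 outside the
    # vertebral column and pelvis, means high volume; otherwise low volume.
    outside = total(outside_vp_json)
    bone_total = total(vp_json) + outside
    return "H" if bone_total >= 4 and outside > 0 else "L"


NO_VISCERAL = '{"visceral #1": "none"}'
NO_OUTSIDE = '{"outside v&p #1": "none", "number of lesions outside v&p #1": 0}'

# Visceral site present: high volume regardless of bone lesions.
case_visceral = classify_volume(
    '{"visceral #1": "right hepatic lobe"}',
    '{"v&p #1": "none", "number of lesions in v&p #1": 0}',
    NO_OUTSIDE,
)
# Four lesions, all within the vertebral column and pelvis: low volume.
case_vp_only = classify_volume(
    NO_VISCERAL, '{"v&p #1": "sacrum", "number of lesions in v&p #1": 4}', NO_OUTSIDE
)
# Three lesions within plus one outside: high volume.
case_mixed = classify_volume(
    NO_VISCERAL,
    '{"v&p #1": "L3", "number of lesions in v&p #1": 3}',
    '{"outside v&p #1": "left 6th rib", "number of lesions outside v&p #1": 1}',
)
```

The three cases evaluate to `'H'`, `'L'`, and `'H'`, showing that a lesion count of four alone is not sufficient without at least one lesion outside the vertebral column and pelvis.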