In [1]:
import json
import argparse
from dotenv import load_dotenv
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableSerializable
from langchain_openai import ChatOpenAI

# Load environment variables
load_dotenv("../.env")

False

In [91]:
PROMPT_TEXT = """
You are an intelligent assistant tasked with identifying the line number where the first round of conversation ends in a transcript between a Doctor and a Patient.

It is very important that you only output a number:
- **Case 1: First conversation does not end**: If there is no end to the first conversation, output 999.
- **Case 2: First conversation ends**: If the first conversation ends, output the line number of the last line of the first conversation.

Follow these steps:
<steps>
1. Identify the first conversation's end, if applicable:
   - Look for a closing statement by the Doctor followed by a confirmation from the Patient indicating the conversation has reached a conclusion.
   - The last line of the first conversation is always spoken by the Patient.
   - The last line of the first conversation is always an odd line number.
   - The last line of the first conversation will not contain a greeting or welcoming phrase typically used to start conversations.
2. Identify the second conversation's start, if applicable:
   - Look for a Doctor greeting the Patient again, e.g., "Hello Javier Rivas, I'm here to follow up on your health since our last conversation. How have you been feeling recently? Any new or worsening symptoms you'd like to discuss today?"
   - The first line of the second conversation is always spoken by the Doctor.
   - The second conversation always starts with an even line number.
3. If **Case 1 (first conversation does not end)** applies:
   - Output 999.
4. If **Case 2 (first conversation ends)** applies:
   - Confirm that the end of the first conversation appears immediately before the start of the second conversation and output the line number of the last line of the first conversation.
   - Otherwise, output 999.
</steps>
    
The transcript is provided below, surrounded by triple quotes:
'''
{transcript}
'''

Use this example to understand **Case 1: First conversation does not end**:
<example 1>
**Transcript:**
0: Doctor: Hello Michelle Graham, I'm here to check on how you're feeling today. Let's go over how you've been doing since your discharge.
1: Patient: I'm not feeling too bad, just trying to take it easy. It's been a bit of a struggle adjusting after being in the hospital.
2: Doctor: I'm glad to hear you're trying to take it easy, Michelle. Can you tell me how you've been feeling today? Have you noticed any new or worsening symptoms?
3: Patient: I don't remember exactly how I've been feeling today. It's hard to say if there have been any new symptoms. Maybe it's just the usual fatigue and shortness of breath.
4: Doctor: Thank you for sharing, Michelle. Let's delve a bit deeper into your symptoms. Have you experienced any shortness of breath? If yes, does it occur at rest, when walking, or when climbing stairs?
5: Patient: I can't quite describe this. I'm not sure what you're asking. Let me think... I guess I feel a bit out of breath sometimes when I'm walking around. It's hard to say exactly when it happens.
6: Doctor: Thank you for sharing, Michelle. Have you had sudden shortness of breath that wakes you up at night recently?
7: Patient: I'd rather not say if I've had shortness of breath at night. Can we talk about something else?
8: Doctor: I understand, Michelle. Let's focus on another aspect. Do you need to prop yourself up with pillows to breathe comfortably while lying down?
9: Patient: I'm not sure if I need pillows to prop myself up. I don't think that's important right now. Is this really necessary to discuss?
10: Doctor: I apologize for any discomfort, Michelle. Let's move on to another topic. Could you please provide your latest vital signs, starting with your temperature?
11: Patient: I can't quite remember my exact temperature, but I think it was around 98.2. I'm not sure if that's normal or not.
12: Doctor: Thank you for sharing your temperature, Michelle. Next, could you please provide your heart rate?
13: Patient: I'm not sure about my heart rate, but I think it was around 89. I'm unfamiliar with what's considered normal for that.
14: Doctor: Thank you for sharing your heart rate, Michelle. Lastly, could you please provide your blood pressure readings?

**Expected Output:**
999
</example 1>

Use this example to understand **Case 2: First conversation ends**:
<example 2>
**Transcript:**
0: Doctor: Hello Kevin Morris, I'm here to check on how you're feeling today. Let's go over how you've been doing since your discharge.
1: Patient: Um... I've been feeling a bit tired lately, Doctor. And my legs are still swollen.
2: Doctor: I'm sorry to hear you're feeling tired, Kevin, and that your legs are still swollen. Let's address these symptoms further. Have you experienced any shortness of breath recently, especially when at rest, walking, or climbing stairs?
3: Patient: Um... I haven't had shortness of breath at rest, but I do feel a bit winded when I walk around the house. It's been a bit harder to move about.
4: Doctor: Thank you for sharing, Kevin. It's important to monitor these symptoms closely. Have you experienced sudden shortness of breath that wakes you up at night, known as paroxysmal nocturnal dyspnea?
5: Patient: No, Doctor, I haven't been waking up at night feeling short of breath. It's mostly during the day when I notice it more.
6: Doctor: Based on your responses, Kevin, it seems you're experiencing some shortness of breath with exertion during the day but not at night. Are you needing to prop yourself up with pillows to breathe comfortably while lying down?
7: Patient: No, Doctor, I don't need pillows to prop myself up at night. I usually sleep fine once I find a good position.
8: Doctor: Thank you for sharing that, Kevin. Have you noticed any swelling in your ankles or legs recently?
9: Patient: Yes, Doctor, my legs have been quite swollen, especially in the evenings. It's uncomfortable to move around when they're like that.
10: Doctor: Based on our conversation, Kevin, it seems you are mainly experiencing tiredness and leg swelling. Is there anything else you would like to share?
11: Patient: No, Doctor, I think that covers everything for now. Thank you for checking in on me.
12: Doctor: Hello, Kevin. I'm here to check on how you're feeling today. Can you tell me how you've been feeling since your discharge? Have you noticed any new or worsening symptoms?
13: Patient: Um... I haven't had any new symptoms, Doctor. Just the usual tiredness and leg swelling.
14: Doctor: I'm sorry to hear you're still experiencing tiredness and leg swelling, Kevin. Let's address these symptoms further.

**First Conversation End:**
11: Patient: No, Doctor, I think that covers everything for now. Thank you for checking in on me.

**Second Conversation Start:**
12: Doctor: Hello, Kevin. I'm here to check on how you're feeling today. Can you tell me how you've been feeling since your discharge? Have you noticed any new or worsening symptoms?

**Expected Output:**
11
</example 2>
"""
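
The prompt states three structural rules for a valid boundary: the last line of the first conversation has an odd number, it is spoken by the Patient, and the next line is spoken by the Doctor. A minimal sketch of checking a returned number against those rules is shown below. The helper name `is_plausible_end` and the short sample transcript are made up for illustration and are not part of the notebook. The check is necessary but not sufficient: it cannot tell whether the Doctor line really is a new greeting.

```python
SENTINEL = 999  # value the prompt asks for when the first conversation has not ended


def is_plausible_end(lines, end_line):
    """Return True if end_line satisfies the structural rules stated in the prompt."""
    if end_line == SENTINEL:
        return True
    # A boundary needs a following line on which the second conversation starts.
    if not 0 <= end_line < len(lines) - 1:
        return False
    return (
        end_line % 2 == 1                                # odd line number
        and lines[end_line].startswith("Patient:")       # spoken by the Patient
        and lines[end_line + 1].startswith("Doctor:")    # followed by the Doctor
    )


# Hypothetical transcript, without the "N: " prefixes added by enumerate_transcript.
sample_lines = [
    "Doctor: Hello, how are you feeling today?",
    "Patient: A bit tired.",
    "Doctor: Is there anything else you would like to share?",
    "Patient: No, thank you for checking in.",
    "Doctor: Hello, I'm here to follow up since our last conversation.",
    "Patient: Still a bit tired.",
]

print(is_plausible_end(sample_lines, 3))
print(is_plausible_end(sample_lines, 4))
```

A failed check could be treated the same way as the 999 sentinel, so that the full transcript is kept rather than cut at a suspicious line.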

In [92]:
def create_prompt_template() -> PromptTemplate:
    """
    Create a prompt template for the language model.

    Returns:
        PromptTemplate: A LangChain PromptTemplate object.
    """
    return PromptTemplate(
        input_variables=["transcript"],  # must match the {transcript} placeholder in PROMPT_TEXT
        template=PROMPT_TEXT,
    )

def create_chain() -> RunnableSerializable:
    """
    Create a LangChain chain for the transcript extractor.

    Returns:
        RunnableSerializable: A LangChain chain for the transcript extractor.
    """
    prompt = create_prompt_template()
    model = ChatOpenAI(model_name="gpt-4o", temperature=0)
    output_parser = StrOutputParser()
    chain = prompt | model | output_parser
    return chain

def detect_first_conversation_end(chain, transcript):
    """
    Detect the line number where the first conversation ends.

    Args:
        chain (RunnableSerializable): The LangChain chain for detecting the conversation end.
        transcript (str): The transcript as a single string.

    Returns:
        int: The line number where the first conversation ends.
    """
    result = chain.invoke({"transcript": transcript})
    return int(result.strip())

def enumerate_transcript(transcript_lines):
    """
    Prefix each transcript line with its zero-based index.

    Args:
        transcript_lines (list[str]): The transcript, one utterance per entry.

    Returns:
        str: The numbered lines joined with newlines.
    """
    enumerated_lines = [f"{i}: {line}" for i, line in enumerate(transcript_lines)]
    transcript_text = "\n".join(enumerated_lines)
    return transcript_text

def process_transcript(transcript, debug=False):
    """
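    Process a single transcript to extract the first round of conversation.

    Args:
        transcript (list[str]): The transcript lines to process.
        debug (bool): If True, print the detected end line.

    Returns:
        list[str]: The lines of the first conversation, or the full transcript if no end was detected (999).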
    """
    chain = create_chain()
    enumerated_text = enumerate_transcript(transcript)
    end_line_number = detect_first_conversation_end(chain, enumerated_text)
    if debug:
        print(f"DEBUG INFO: End of first conversation at line {end_line_number}")
        print("-" * 150)
        if end_line_number != 999:
            print(transcript[end_line_number])
        print("-" * 150)
    if end_line_number == 999:
        return transcript
    else:
        return transcript[:end_line_number + 1]

def process_transcripts(patients_dict, transcript_field='chat_transcript', debug=False):
    """
    Process the dictionary of patients to extract the first round of conversation for each.

    Args:
        patients_dict (dict): Dictionary of patients. Each value must contain the following keys:
            - id: The ID of the patient.
            - name: The name of the patient.
            - prompt: The prompt for the patient.
            - the key named by transcript_field: The chat transcript as a list of lines.
        transcript_field (str): The key in each patient entry containing the chat transcript.
        debug (bool): If True, print each transcript and the detected end line.

    Returns:
        dict: Dictionary with shortened conversations.
    """
    processed_transcripts = {}

    for key, value in patients_dict.items():
        transcript_lines = value[transcript_field]
        if debug:
            print("\n")
            print(f"Processing transcript for {value['name']}...")
            print("=" * 150)
            for i, line in enumerate(transcript_lines):
                print(f"{i}: {line}")
            print("\n")
        first_conversation = process_transcript(transcript_lines, debug=debug)
        processed_transcripts[key] = {
            "id": value['id'],
            "name": value['name'],
            "prompt": value['prompt'],
            "chat_transcript": first_conversation
        }

    return processed_transcripts

def load_transcripts(file_path: str) -> dict:
    """
    Load the transcripts from the JSON file.

    Args:
        file_path (str): Path to the JSON file containing the transcripts.

    Returns:
        dict: Dictionary of transcripts.
    """
    with open(file_path, "r") as file:
        transcripts = json.load(file)
    return transcripts

def save_transcripts(file_path: str, transcripts: dict):
    """
    Save the processed transcripts to a JSON file.

    Args:
        file_path (str): Path to the JSON file where the transcripts will be saved.
        transcripts (dict): Dictionary of processed transcripts.
    """
    with open(file_path, "w") as file:
        json.dump(transcripts, file, indent=4)
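
`detect_first_conversation_end` calls `int(result.strip())`, which raises `ValueError` if the model replies with anything other than a bare number. The sketch below shows one possible, more forgiving parse. The helper `parse_line_number` is hypothetical and not part of the notebook. It assumes that the first run of digits in the reply is the intended answer, and that an out-of-range number should be treated like the 999 sentinel.

```python
import re

SENTINEL = 999  # same "no boundary" value the prompt uses


def parse_line_number(reply, num_lines):
    """Extract a line number from the model's reply, falling back to the sentinel."""
    match = re.search(r"\d+", reply)
    if match is None:
        # No digits at all: treat as "first conversation does not end".
        return SENTINEL
    value = int(match.group())
    # Anything at or beyond the transcript length cannot be a valid line index.
    return value if value < num_lines else SENTINEL


print(parse_line_number("11", 15))
print(parse_line_number("The first conversation ends at line 11.", 15))
print(parse_line_number("I could not find an end.", 15))
```

Falling back to the sentinel means `process_transcript` would return the whole transcript instead of failing, which may or may not be the desired behavior for a given dataset.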

In [64]:
def filter_transcripts(data, num_keys):
    """
    Filters the input JSON data to only include the first n keys.
    
    Args:
        data (dict): The original JSON data as a dictionary.
        num_keys (int): The number of keys to include in the filtered data.
    
    Returns:
        dict: A dictionary containing only the first n keys from the input data.
    """
    # Convert the keys to a list and slice the first n
    first_keys = list(data.keys())[:num_keys]
    
    # Create a new dictionary with only the first n keys
    filtered_data = {key: data[key] for key in first_keys}
    
    return filtered_data
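
`filter_transcripts` relies on dictionaries preserving insertion order, which Python guarantees from version 3.7 onward. As a sketch, the same result can be obtained with `itertools.islice`, which avoids building the full key list. The name `first_n_items` and the sample data are placeholders, not part of the notebook.

```python
from itertools import islice


def first_n_items(data, num_keys):
    """Return a new dict holding the first num_keys entries in insertion order."""
    return dict(islice(data.items(), num_keys))


# Placeholder data shaped loosely like the patients dictionary.
sample_patients = {
    "patient_a": {"name": "A"},
    "patient_b": {"name": "B"},
    "patient_c": {"name": "C"},
}

subset = first_n_items(sample_patients, 2)
print(list(subset))
```

Limiting the input this way is a cheap way to try the pipeline on a few patients before running it on a whole file, since every transcript costs one model call.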

In [93]:
# Test for transcripts_1.5 baseline
processed_transcripts_2_baseline = process_transcripts(
    patients_dict=load_transcripts('../../data/patients/patients_1.5_baseline.json'),
    transcript_field='chat_transcript_full',
    debug=True
)



Processing transcript for Patricia Reynolds...
0: Doctor: Hello Patricia Reynolds, I'm here to check on how you're feeling today. Let's go over how you've been doing since your discharge.
1: Patient: Um... I've been feeling okay since I got home. But, ah... I've been having some chest pain, coughing, and feeling short of breath at times.
2: Doctor: I'm sorry to hear that, Patricia. Let's address your symptoms one by one. Have you had any sudden shortness of breath that wakes you up at night?


3: Patient: No, I haven't had any sudden shortness of breath at night. It's mainly during the day when I'm up and moving around that I notice it more.
4: Doctor: Thank you for clarifying that, Patricia. Do you need to prop yourself up with pillows to breathe comfortably while lying down?
5: Patient: Yes, sometimes I do need to prop myself up with extra pillows to feel more comfortable when lying down. It seems to help a bit with the breathing.
6: Doctor: It sounds like you are experiencing orth

In [90]:
# Test for transcripts_1.5 reluctant
processed_transcripts_2_reluctant = process_transcripts(
    patients_dict=load_transcripts('../../data/patients/patients_1.5_reluctant.json'),
    transcript_field='chat_transcript_full',
    debug=True
)



Processing transcript for Jennifer Nelson...
0: Doctor: Hello Jennifer Nelson, I'm here to check on how you're feeling today. Let's go over how you've been doing since your discharge.
1: Patient: Hi, Doctor. I'm feeling okay, I guess. Just taking it easy at home.
2: Doctor: I'm glad to hear that you're taking it easy at home, Jennifer. Can you tell me if you've noticed any new or worsening symptoms since you were discharged from the hospital?
3: Patient: I can't quite describe this. I've been having some pain in my right hip area, but it's not too bad.
4: Doctor: I'm sorry to hear about the pain in your right hip area, Jennifer. Let's make a note of that. Have you experienced any shortness of breath recently?
5: Patient: I'm not sure what you mean. Sometimes I feel like I'm a bit out of breath, but it's hard to say if it's getting worse.
6: Doctor: Thank you for sharing that, Jennifer. Have you experienced any shortness of breath at rest, when walking, or when climbing stairs?
7: Pat