# # Generative AI in Healthcare: A Tutorial with OpenAI and Synthetic Data
#
# This tutorial demonstrates how to apply generative AI (using the OpenAI API) to healthcare tasks with synthetic data. We cover:
# - Summarizing medical information and evaluating it
# - Converting tabular data to text with prompt engineering
# - Disease prediction using LLMs
# - Simplifying medical notes for patients
#
# We'll use the Synthea dataset (100 patients) and explore in-context learning, few-shot learning, chain-of-thought, and tree-of-thought reasoning.
#
# **Prerequisites**:
# - Python, pandas, nltk, rouge-score, openai
# - OpenAI API key (environment variable OPENAI_API_KEY)
# - Synthea data: https://synthea.mitre.org/downloads (7 MB, 100 patients)

# ## Step 0: Setup
# Install and import required packages. Configure the OpenAI API client.

In [1]:
# +
import pandas as pd
import nltk
from rouge_score import rouge_scorer
import os
import openai
from openai import OpenAI
import time

# The API key is read from the OPENAI_API_KEY environment variable (see Prerequisites).
# Set it in your shell rather than hardcoding it here, so a real key is never overwritten or committed.
client = OpenAI()  # Picks up OPENAI_API_KEY from the environment

# Download NLTK tokenizer data
nltk.download('punkt')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\harsh\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [2]:
# Function to call OpenAI API with error handling and rate limit management
def call_openai(prompt, model="gpt-4o", max_tokens=500):
    """
    Call OpenAI API with retry logic for rate limits
    
    Args:
        prompt: The text prompt to send to the API
        model: The OpenAI model to use (default: gpt-4o)
        max_tokens: Maximum number of tokens in the response
        
    Returns:
        The text response from the model
    """
    max_retries = 3
    retry_count = 0
    
    while retry_count < max_retries:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=max_tokens,
                temperature=0.7
            )
            return response.choices[0].message.content.strip()
            
        except openai.RateLimitError as e:
            retry_count += 1
            if retry_count < max_retries:
                wait_time = 2 ** retry_count  # Exponential backoff
                print(f"Rate limit hit. Retrying in {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                print("Max retries reached. Rate limit error persists.")
                raise
                
        except Exception as e:
            print(f"Error calling OpenAI API: {e}")
            raise
            
    return "Failed to get response after multiple retries"
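
# The retry loop in `call_openai` can also be written as a reusable helper. The sketch below is one possible shape, not part of the original notebook: `retry_with_backoff` and `flaky_call` are hypothetical names, the random jitter is an added assumption, and the `sleep` parameter exists only so the waiting can be replaced in tests.

```python
import random
import time


def retry_with_backoff(func, retryable=(Exception,), max_retries=3,
                       base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retryable error, wait base_delay * 2**attempt plus jitter and retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except retryable:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error to the caller
            # Jitter (an assumption of this sketch) spreads out simultaneous retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


# Stand-in for an API call that fails twice, then succeeds
attempts = {"count": 0}
delays = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "ok"


# Record the delays instead of actually sleeping
result = retry_with_backoff(flaky_call, retryable=(TimeoutError,), sleep=delays.append)
print(result, len(delays))
```

# With the OpenAI client, `retryable` would presumably be `(openai.RateLimitError,)`, matching the exception handled above.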
    

# ## Step 1: Load and Preprocess Synthetic Data
# We use the Synthea dataset, which includes CSV files like `patients.csv`, `conditions.csv`, and `encounters.csv`. We'll load and merge relevant data.


In [3]:
# +
# Load Synthea data (adjust paths as needed)
data_path = 'C:\\Users\\harsh\\Desktop\\HBK008\\Holland Bloorview\\projects\\AI4H_UT\\synthea_data'
patients = pd.read_csv(os.path.join(data_path, 'patients.csv'))
conditions = pd.read_csv(os.path.join(data_path, 'conditions.csv'))
encounters = pd.read_csv(os.path.join(data_path, 'encounters.csv'))

# Merge conditions with encounters
# Use left_on and right_on to properly match columns
data = conditions.merge(
    encounters, 
    left_on=['PATIENT', 'ENCOUNTER'],
    right_on=['PATIENT', 'Id'], 
    how='left', 
    suffixes=('_condition', '_encounter')
)

# Merge with patients data
data = data.merge(
    patients, 
    left_on='PATIENT', 
    right_on='Id', 
    how='left', 
    suffixes=('', '_patient')
)

# Approximate age as the difference in calendar years, relative to 2025.
# This ignores month and day, so the result can be off by one year.
data['AGE'] = 2025 - pd.to_datetime(data['BIRTHDATE']).dt.year

# Display sample patient record with relevant columns
sample_patient = data.iloc[0]
print("Sample patient record:")
print(sample_patient[['DESCRIPTION_condition', 'REASONDESCRIPTION', 'AGE', 'GENDER']])

# Optional: Save a small subset for faster processing during development
sample_data = data.head(50)
print(f"Sample data shape: {sample_data.shape}")

Sample patient record:
DESCRIPTION_condition    Housing unsatisfactory (finding)
REASONDESCRIPTION                                     NaN
AGE                                                    31
GENDER                                                  M
Name: 0, dtype: object
Sample data shape: (50, 50)
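
# The age above is a calendar-year difference. A more precise version counts completed years and checks whether the birthday has passed. This is a minimal sketch: `age_on` is a hypothetical helper, the birthdate is made up rather than taken from the dataset, and it assumes `BIRTHDATE` values are ISO-formatted strings (`YYYY-MM-DD`).

```python
from datetime import date


def age_on(birthdate, today):
    """Return completed years between birthdate and today."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


birth = date.fromisoformat("1994-06-15")  # Hypothetical birthdate, not from the dataset
print(age_on(birth, date(2025, 6, 14)))  # Day before the birthday: 30
print(age_on(birth, date(2025, 6, 15)))  # On the birthday: 31
```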


# ## Step 2: Summarize Medical Information
# Summarize a patient's medical history and evaluate using ROUGE scores.
# **Prompt Engineering**:
# - **In-context learning (one-shot)**: 1 example summary in the prompt.
# - **Few-shot learning**: 2 example summaries in the prompt.
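
# Both prompt styles differ only in how many worked examples precede the new input, so they can be generated from a list of examples. The helper below is a sketch under that assumption; `build_few_shot_prompt` is a hypothetical name, and the layout mirrors the prompts used in this step.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble an instruction, numbered worked examples, and the new input into one prompt."""
    parts = [instruction, ""]
    for i, (example_input, example_summary) in enumerate(examples, start=1):
        parts += [f"Example {i}:", f"Input: {example_input}", f"Summary: {example_summary}", ""]
    parts += [f"Input: {new_input}", "Summary:"]
    return "\n".join(parts)


examples = [
    ("Hypertension: Controlled with medication. Diabetes: Managed with insulin.",
     "The patient has hypertension, which is controlled with medication, and diabetes, managed with insulin."),
    ("Asthma: Uses inhaler. Allergies: Seasonal, treated with antihistamines.",
     "The patient has asthma managed with an inhaler and seasonal allergies treated with antihistamines."),
]

instruction = "Summarize the following medical history in 2-3 sentences."
one_shot = build_few_shot_prompt(instruction, examples[:1], "Asthma: Uses inhaler.")
two_shot = build_few_shot_prompt(instruction, examples, "Asthma: Uses inhaler.")
print(two_shot)
```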

In [4]:
# +
# Prepare data for a single patient
patient_id = data['PATIENT'].iloc[0]
patient_data = data[data['PATIENT'] == patient_id][['DESCRIPTION_condition', 'REASONDESCRIPTION']].dropna().head(5)
medical_text = '. '.join(patient_data['DESCRIPTION_condition'].astype(str) + ': ' + patient_data['REASONDESCRIPTION'].astype(str))

# In-context prompt (1 example)
in_context_prompt = f"""
Summarize the following medical history in 2-3 sentences.

Example:
Input: Hypertension: Controlled with medication. Diabetes: Managed with insulin.
Summary: The patient has hypertension, which is controlled with medication, and diabetes, managed with insulin.

Input: {medical_text}
Summary:
"""

# Few-shot prompt (2 examples)
few_shot_prompt = f"""
Summarize the following medical history in 2-3 sentences.

Example 1:
Input: Hypertension: Controlled with medication. Diabetes: Managed with insulin.
Summary: The patient has hypertension, which is controlled with medication, and diabetes, managed with insulin.

Example 2:
Input: Asthma: Uses inhaler. Allergies: Seasonal, treated with antihistamines.
Summary: The patient has asthma managed with an inhaler and seasonal allergies treated with antihistamines.

Input: {medical_text}
Summary:
"""
# Call OpenAI API
print("Generating summaries using OpenAI API...")
in_context_summary = call_openai(in_context_prompt)
few_shot_summary = call_openai(few_shot_prompt)

print("\nIn-context summary (1 example):")
print(in_context_summary)

print("\nFew-shot summary (2 examples):")
print(few_shot_summary)

# Define a reference summary for evaluation.
# This placeholder is generic, so the ROUGE scores below are illustrative only;
# in practice, a medical professional would write the reference for this patient.
reference_summary = "The patient has multiple conditions including various diagnoses managed with appropriate treatments."

# Evaluate with ROUGE
scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)
in_context_scores = scorer.score(reference_summary, in_context_summary)
few_shot_scores = scorer.score(reference_summary, few_shot_summary)

print("\nROUGE Evaluation:")
print(f"In-context ROUGE-1: {in_context_scores['rouge1'].fmeasure:.4f}, ROUGE-L: {in_context_scores['rougeL'].fmeasure:.4f}")
print(f"Few-shot ROUGE-1: {few_shot_scores['rouge1'].fmeasure:.4f}, ROUGE-L: {few_shot_scores['rougeL'].fmeasure:.4f}")
# -


Generating summaries using OpenAI API...

In-context summary (1 example):
The patient has a fracture of the bone, specifically a closed fracture of the hip, and has experienced a sprain, particularly of the ankle. Additionally, the patient has had acute viral pharyngitis.

Few-shot summary (2 examples):
The patient has a closed fracture of the hip, an ankle sprain, and acute viral pharyngitis.

ROUGE Evaluation:
In-context ROUGE-1: 0.1364, ROUGE-L: 0.1364
Few-shot ROUGE-1: 0.2143, ROUGE-L: 0.2143
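
# To see where these numbers come from, ROUGE-1 can be computed by hand as unigram overlap between reference and candidate. The sketch below is a simplified re-implementation, not the `rouge_score` library: `rouge1_f` is a hypothetical helper that lowercases and splits on non-alphanumeric characters, with no stemming. On the texts above, only "the", "patient", and "has" overlap with the reference, which explains the low scores.

```python
import re
from collections import Counter


def rouge1_f(reference, candidate):
    """ROUGE-1 F-measure: harmonic mean of unigram precision and recall."""
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    overlap = sum((ref & cand).values())  # Clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = ("The patient has multiple conditions including various diagnoses "
             "managed with appropriate treatments.")
in_context = ("The patient has a fracture of the bone, specifically a closed fracture "
              "of the hip, and has experienced a sprain, particularly of the ankle. "
              "Additionally, the patient has had acute viral pharyngitis.")
few_shot = ("The patient has a closed fracture of the hip, an ankle sprain, "
            "and acute viral pharyngitis.")

print(f"{rouge1_f(reference, in_context):.4f}")  # 0.1364
print(f"{rouge1_f(reference, few_shot):.4f}")    # 0.2143
```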


# ## Step 3: Convert Tabular Data to Text
# Convert patient data (e.g., conditions, age, gender) into narrative text.
# **Prompt Engineering**:
# - **Chain-of-thought**: Step-by-step reasoning.
# - **Tree-of-thought**: Multiple narrative styles explored as separate branches within one prompt.
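
# Before prompting, each table row has to be serialized into a line of text. The sketch below shows one way to do that while skipping missing cells, which pandas represents as float NaN. `row_to_text` and the `labels` mapping are hypothetical; the record values are the ones printed for the sample patient in Step 1.

```python
import math


def row_to_text(row, labels):
    """Render selected fields of a record as 'Label: value' pairs, skipping missing values."""
    parts = []
    for key, label in labels.items():
        value = row.get(key)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue  # Missing cell: leave it out rather than writing "nan" into the prompt
        parts.append(f"{label}: {value}")
    return ", ".join(parts)


record = {
    "DESCRIPTION_condition": "Housing unsatisfactory (finding)",
    "REASONDESCRIPTION": float("nan"),
    "AGE": 31,
    "GENDER": "M",
}
labels = {
    "DESCRIPTION_condition": "Condition",
    "REASONDESCRIPTION": "Reason",
    "AGE": "Age",
    "GENDER": "Gender",
}
text = row_to_text(record, labels)
print(text)  # Condition: Housing unsatisfactory (finding), Age: 31, Gender: M
```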

In [5]:
# +
# Sample tabular data for one patient
tabular_data = data[['PATIENT', 'DESCRIPTION_condition', 'AGE', 'GENDER']].head(1).to_dict(orient='records')[0]
tabular_input = f"Patient ID: {tabular_data['PATIENT']}, Condition: {tabular_data['DESCRIPTION_condition']}, Age: {tabular_data['AGE']}, Gender: {tabular_data['GENDER']}"

# Chain-of-thought prompt
cot_prompt = f"""
Convert the following tabular data into a narrative. First, identify the key elements (condition, age, gender). Then, construct a sentence summarizing the patient's status.

Input: {tabular_input}

Step 1: Identify key elements.
Step 2: Formulate a clinical narrative.
Step 3: Ensure all important information is included.

Complete narrative:
"""

# Tree-of-thought prompt
tot_prompt = f"""
Convert the following tabular data into a narrative. Explore two styles: (1) clinical tone, (2) patient-friendly tone. For each, list key elements before writing the narrative.

Input: {tabular_input}

Narratives:
1. Clinical tone:
   - Key elements to include: [list elements]
   - Clinical narrative: [write narrative]

2. Patient-friendly tone:
   - Key elements to include: [list elements]
   - Patient-friendly narrative: [write narrative]
"""

# Call OpenAI API
print("Converting tabular data to narratives...")
cot_narrative = call_openai(cot_prompt)
tot_narrative = call_openai(tot_prompt)

print("\nChain-of-thought narrative:")
print(cot_narrative)

print("\nTree-of-thought narratives:")
print(tot_narrative)
# -

Converting tabular data to narratives...

Chain-of-thought narrative:
Step 1: Identify key elements.
- Condition: Housing unsatisfactory (finding)
- Age: 31
- Gender: Male

Step 2: Formulate a clinical narrative.
A 31-year-old male is experiencing unsatisfactory housing conditions.

Step 3: Ensure all important information is included.
Complete narrative:
A 31-year-old male is experiencing unsatisfactory housing conditions, which may impact his overall well-being and health status.

Tree-of-thought narratives:
1. Clinical tone:

   - Key elements to include:
     - Patient ID
     - Age
     - Gender
     - Condition
     - Objective language
     - Formal structure

   - Clinical narrative:
     Patient ID: 30a6452c-4297-a1ac-977a-6a23237c7b46 is a 31-year-old male presenting with a documented condition of unsatisfactory housing. The patient's living conditions have been assessed and found to be inadequate, which may impact his overall health and well-being. Further evaluation and pot

# ## Step 4: Disease Prediction with LLMs
# Predict potential diseases based on patient data.
# **Prompt Engineering**:
# - **Few-shot learning**: Examples of predictions.
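
# The few-shot examples in this step phrase predictions as "High/Moderate/Low risk of ...". If the model follows that pattern, the bullets can be turned into structured pairs for later analysis. This is a sketch only: `parse_risks` and `RISK_LINE` are hypothetical, and lines that do not match the pattern are returned separately rather than guessed at.

```python
import re

# Matches bullets such as "- High risk of X." or "- **Moderate risk of X:** ..."
RISK_LINE = re.compile(r"^-\s*\**(high|moderate|low) risk of ([^.:*]+)", re.IGNORECASE)


def parse_risks(prediction_text):
    """Split prediction bullets into (level, description) pairs and unparsed lines."""
    parsed, unparsed = [], []
    for line in prediction_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = RISK_LINE.match(line)
        if match:
            parsed.append((match.group(1).lower(), match.group(2).strip()))
        else:
            unparsed.append(line)
    return parsed, unparsed


sample_prediction = """
- High risk of coronary artery disease due to hypertension and age.
- **Moderate risk of respiratory issues or infections:** Poor ventilation may contribute.
- Consider screenings for diabetes as it often co-occurs with hypertension.
"""
parsed, unparsed = parse_risks(sample_prediction)
print(parsed)
print(unparsed)
```

# Free-form bullets (such as "Increased risk of ...") end up in `unparsed`, which is a reminder that model output does not always follow the example format.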

In [6]:
# +
# Prepare data for prediction
prediction_input = f"Age: {tabular_data['AGE']}, Gender: {tabular_data['GENDER']}, Conditions: {tabular_data['DESCRIPTION_condition']}"

# Few-shot prompt for disease prediction
prediction_prompt = f"""
Predict potential diseases or complications based on patient data. Include likelihood and rationale.

Example 1:
Input: Age: 50, Gender: Male, Conditions: Hypertension
Prediction: 
- High risk of coronary artery disease due to hypertension and age.
- Moderate risk of stroke, especially if hypertension is poorly controlled.
- Consider screenings for diabetes as it often co-occurs with hypertension.

Example 2:
Input: Age: 30, Gender: Female, Conditions: Asthma
Prediction:
- Moderate risk of allergic rhinitis, as it commonly co-occurs with asthma.
- Low risk of COPD, but consider monitoring pulmonary function.
- Environmental factors may exacerbate asthma symptoms; recommend evaluation.

Input: {prediction_input}
Prediction:
"""

# Call OpenAI API
print("Generating disease predictions...")
prediction = call_openai(prediction_prompt, max_tokens=600)

print("\nDisease prediction:")
print(prediction)

Generating disease predictions...

Disease prediction:
- **Moderate risk of respiratory issues or infections:** Unsatisfactory housing conditions may include poor ventilation, dampness, or mold, which can lead to or exacerbate respiratory problems such as asthma or bronchitis.

- **Increased risk of mental health disorders:** Poor housing conditions can contribute to stress, anxiety, or depression. It is important to assess mental health status and provide support if needed.

- **Potential for infectious diseases:** Crowded or unsanitary living conditions can increase the risk of transmission of infectious diseases. It is advisable to ensure vaccinations are up-to-date and consider screenings for diseases like tuberculosis if relevant.

- **Consider nutritional deficiencies or related conditions:** Housing issues might correlate with limited access to healthy food, potentially leading to nutritional deficiencies. Monitoring for signs of malnutrition or related conditions, such as anemi

# ## Step 5: Simplify Medical Notes
# Simplify a medical note for a 10-year-old child.
# **Prompt Engineering**:
# - **In-context learning**: Example simplification.

In [7]:
# +
# Sample medical note
medical_note = f"Patient presents with {tabular_data['DESCRIPTION_condition']}, managed with standard protocol."

# In-context prompt for simplification
simplify_prompt = f"""
Simplify the following medical note for a 10-year-old child. The explanation should be:
- Easy to understand without medical jargon
- Reassuring and age-appropriate
- Accurate but simplified

Example:
Input: Patient has hypertension, treated with medication.
Simplified: Your heart is working a bit hard, so we give you medicine to help it not work too hard. This helps keep you healthy.

Input: {medical_note}
Simplified:
"""
# Call OpenAI API
print("Simplifying medical notes...")
simplified_note = call_openai(simplify_prompt)

print("\nOriginal medical note:")
print(medical_note)

print("\nSimplified note for a child:")
print(simplified_note)
# -

Simplifying medical notes...

Original medical note:
Patient presents with Housing unsatisfactory (finding), managed with standard protocol.

Simplified note for a child:
It looks like your home isn't the best place for you right now, but we're following a plan to make sure you're safe and comfortable.


# ## Step 6: Evaluation and Comparison
# Let's automatically evaluate the different approaches where possible and discuss evaluation strategies.

In [8]:


# +
# Collect all generated outputs for comparison
outputs = {
    "In-context summary": in_context_summary,
    "Few-shot summary": few_shot_summary,
    "Chain-of-thought narrative": cot_narrative,
    "Tree-of-thought narratives": tot_narrative,
    "Disease prediction": prediction,
    "Simplified note": simplified_note
}

# Calculate token counts as a simple metric
import re

def count_tokens(text):
    """Rough size estimate: counts word-like units, not the model's subword tokens."""
    return len(re.findall(r'\b\w+\b', text))

print("Output token counts:")
for name, output in outputs.items():
    token_count = count_tokens(output)
    print(f"- {name}: {token_count} tokens")

print("\nEvaluation summary:")
print("1. Summarization: Evaluated using ROUGE scores against reference")
print("2. Narratives: Qualitative assessment needed - clarity and completeness")
print("3. Disease prediction: Accuracy requires clinical validation")
print("4. Note simplification: Readability metrics and age-appropriateness assessment")

# Example additional evaluation: readability of simplified note
# This is a simple metric - more sophisticated NLP metrics could be used
words = simplified_note.split()
avg_word_length = sum(len(word) for word in words) / len(words)
print(f"\nSimplified note - average word length: {avg_word_length:.2f} characters")
print(f"Simplified note - word count: {len(words)} words")
# -


Output token counts:
- In-context summary: 32 tokens
- Few-shot summary: 16 tokens
- Chain-of-thought narrative: 59 tokens
- Tree-of-thought narratives: 157 tokens
- Disease prediction: 184 tokens
- Simplified note: 28 tokens

Evaluation summary:
1. Summarization: Evaluated using ROUGE scores against reference
2. Narratives: Qualitative assessment needed - clarity and completeness
3. Disease prediction: Accuracy requires clinical validation
4. Note simplification: Readability metrics and age-appropriateness assessment

Simplified note - average word length: 4.32 characters
Simplified note - word count: 25 words
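
# Average word length is a coarse signal. A standard readability measure such as Flesch Reading Ease combines sentence length and syllables per word (higher scores read more easily). The sketch below uses the published formula, but `count_syllables` is a rough vowel-group heuristic and an assumption of this example, so absolute scores are approximate. The two notes are the ones shown in Step 5.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of vowels; a heuristic, not a dictionary lookup."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015 * words/sentence - 84.6 * syllables/word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


original_note = ("Patient presents with Housing unsatisfactory (finding), "
                 "managed with standard protocol.")
simple_note = ("It looks like your home isn't the best place for you right now, but we're "
               "following a plan to make sure you're safe and comfortable.")

print(f"Original:   {flesch_reading_ease(original_note):.1f}")
print(f"Simplified: {flesch_reading_ease(simple_note):.1f}")
```

# The simplified note should score noticeably higher than the original, which is the direction expected for text aimed at a child.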


# ## Step 7: Advanced Applications and Extensions
# Let's explore one more advanced application: medical question answering with reasoning.

In [9]:
# +
# Medical QA with reasoning
patient_question = "Why might someone with my condition need ongoing monitoring?"

qa_prompt = f"""
Answer a medical question from a patient with the condition: {tabular_data['DESCRIPTION_condition']}. 
Use chain-of-thought reasoning to explain medical concepts clearly.

Patient question: {patient_question}

Think through your answer step-by-step:
1. Consider what the patient already knows
2. Identify key medical concepts needed to understand the answer
3. Explain those concepts in plain language
4. Directly answer the question
5. Provide relevant context or next steps

Complete answer:
"""
# Call OpenAI API
print("Generating medical QA response...")
qa_response = call_openai(qa_prompt, max_tokens=700)

print("\nPatient question:")
print(patient_question)

print("\nAI response with reasoning:")
print(qa_response)
# -

Generating medical QA response...

Patient question:
Why might someone with my condition need ongoing monitoring?

AI response with reasoning:
To address why someone with the condition "Housing unsatisfactory (finding)" might need ongoing monitoring, let's think through this step-by-step.

1. **Consider what the patient already knows**: The patient knows their housing situation is unsatisfactory, which could mean various things like poor living conditions, overcrowding, or environmental hazards. They might already understand some immediate effects of these conditions on their health, such as stress or exposure to allergens.

2. **Identify key medical concepts needed to understand the answer**:
   - **Impact of environment on health**: Poor housing can lead to or exacerbate various health issues, including respiratory problems, mental health issues, and infectious diseases.
   - **Chronic exposure and health deterioration**: Continuous exposure to poor living conditions can lead to chro

# ## Step 8: Advanced Healthcare Applications with Synthea Patient Records

In [10]:
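# Get a complete patient profile from our dataset
# Note: iloc[5] picks row 5 of the merged table. Rows are per condition, so this can be
# the same patient as row 0; data['PATIENT'].unique()[5] would select a distinct patient.
patient_id = data['PATIENT'].iloc[5]
patient_records = data[data['PATIENT'] == patient_id]

# Create a comprehensive patient history from the (synthetic) dataset records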
patient_age = patient_records['AGE'].iloc[0]
patient_gender = patient_records['GENDER'].iloc[0]
patient_conditions = patient_records['DESCRIPTION_condition'].unique()
conditions_text = ", ".join(patient_conditions[:5])  # Limit to first 5 conditions for brevity

# Construct a realistic patient history from our actual data
patient_history = f"""
Patient is a {patient_age}-year-old {patient_gender} with history of {conditions_text}.
Recently experiencing symptoms related to their conditions. 
Current medications include standard treatments for their conditions.
Physical examination reveals typical findings associated with these conditions.
"""

print("Patient history constructed from real data:")
print(patient_history)

# ### 8.1 Diagnostic Reasoning with In-Context Learning using Synthea patient data

diagnostic_prompt = f"""
Analyze this patient case, considering differential diagnoses, using medical reasoning.

Example 1:
Patient case: 58-year-old male with sudden onset chest pain radiating to left arm, shortness of breath, and diaphoresis. History of smoking and hypertension.
Medical reasoning: 
1. Primary concern: Acute coronary syndrome given presentation and risk factors
2. Differential diagnoses: Myocardial infarction (most likely), unstable angina, aortic dissection, pulmonary embolism
3. Key indicators: Pain radiation, associated symptoms, and risk profile suggest cardiac origin
4. Immediate actions: ECG, cardiac enzymes, aspirin, and urgent cardiology evaluation

Current patient case from our dataset:
{patient_history}

Medical reasoning:
"""

print("Generating diagnostic reasoning for real patient...")
diagnostic_reasoning = call_openai(diagnostic_prompt, max_tokens=800)
print("\nDiagnostic reasoning analysis:")
print(diagnostic_reasoning)

# ### 8.2 Treatment Plan Generation based on recorded conditions

# Get more specific data for treatment planning
if 'REASONDESCRIPTION' in patient_records.columns:
    reasons = patient_records['REASONDESCRIPTION'].dropna().unique()
    reason_text = ". ".join(reasons[:3])  # First 3 reasons
else:
    reason_text = "Standard indications for the diagnosed conditions."

treatment_prompt = f"""
Generate a treatment plan for this patient, using tree-of-thought reasoning.

Patient information:
{patient_history}
Reason for encounter: {reason_text}
Diagnostic assessment: {diagnostic_reasoning[:200]}... (truncated)

Develop a treatment plan considering:

1. Pharmacological interventions:
   - Recommended medications based on the specific conditions: {conditions_text}
   - Potential medication interactions to consider

2. Non-pharmacological interventions:
   - Lifestyle modifications specific to these conditions
   - Monitoring recommendations

3. Follow-up plan:
   - Suggested timeline based on condition severity
   - Key parameters to monitor

Complete treatment plan:
"""

print("\nGenerating treatment plan for real patient conditions...")
treatment_plan = call_openai(treatment_prompt, max_tokens=800)
print("\nTreatment plan:")
print(treatment_plan)

# ### 8.3 Disease Classification for Actual Patient Conditions

# Extract real conditions from dataset for classification
real_conditions = list(patient_records['DESCRIPTION_condition'].unique())[:4]  # Take up to 4 real conditions

# Create a classification prompt using real conditions
classification_prompt = f"""
For each condition from our patient data, classify the appropriate clinical approach:
A. Medication management primarily
B. Specialist referral required
C. Additional testing needed
D. Surgical intervention may be necessary
E. Lifestyle modification as primary treatment

For each classification, provide your reasoning.

Current conditions to classify from patient data:
"""

for i, condition in enumerate(real_conditions):
    condition_prompt = classification_prompt + f"\nCondition {i+1}: {condition}"
    print(f"\nClassifying condition {i+1}: {condition}")
    classification = call_openai(condition_prompt, max_tokens=300)
    print(f"Classification and reasoning:\n{classification}")

# ### 8.4 Predictive Analysis Based on Patient's Actual Conditions

# Create prediction parameters based on real patient data
prediction_parameters = ", ".join([f"'{cond}'" for cond in real_conditions[:3]])

predictive_prompt = f"""
Based on this patient's profile from our dataset, predict the 6-month prognosis.

Patient profile:
- Age: {patient_age}
- Gender: {patient_gender}
- Diagnosed conditions: {prediction_parameters}

Consider multiple potential trajectories for the following:
1. Disease progression likelihood
2. Complication risks
3. Treatment response prediction
4. Quality of life impact

Format:
- Outcome area: [predicted outcome]
- Likelihood: [low/medium/high]
- Reasoning: [clinical reasoning based on the specific conditions]
"""

print("\nGenerating predictive analysis using real conditions...")
prediction_analysis = call_openai(predictive_prompt, max_tokens=800)
print("\nPredictive analysis:")
print(prediction_analysis)

# ### 8.5 Comprehensive Care Plan Combining All Insights

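# Pass the earlier outputs into the prompt so the model can actually integrate them
combined_prompt = f"""
Create a comprehensive care plan that integrates all analyses:
1. Patient profile: {patient_age}-year-old {patient_gender} with {conditions_text}
2. Diagnostic assessment summary: {diagnostic_reasoning}
3. Treatment recommendations: {treatment_plan}
4. Risk stratification: {prediction_analysis}
5. Monitoring plan: derive from the analyses above

Format this as a concise clinical care plan that could be used in practice.
"""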

print("\nGenerating comprehensive care plan...")
care_plan = call_openai(combined_prompt, max_tokens=800)
print("\nComprehensive clinical care plan:")
print(care_plan)

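# Save outputs under a pseudonymous ID derived from the patient ID.
# hashlib gives a stable digest; the built-in hash() is salted per process for strings.
import hashlib
healthcare_llm_outputs = {
    "Patient_ID": f"Patient_{hashlib.sha256(str(patient_id).encode()).hexdigest()[:8]}",  # Truncated SHA-256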
    "Patient_demographics": f"{patient_age}-year-old {patient_gender}",
    "Conditions": list(patient_conditions),
    "Diagnostic_reasoning": diagnostic_reasoning,
    "Treatment_plan": treatment_plan,
    "Risk_prediction": prediction_analysis,
    "Care_plan": care_plan
}

print("\nAll healthcare LLM analyses completed using real patient data.")

Patient history constructed from real data:

Patient is a 31-year-old M with history of Housing unsatisfactory (finding), Received higher education (finding), Transport problem (finding), Social isolation (finding), Stress (finding).
Recently experiencing symptoms related to their conditions. 
Current medications include standard treatments for their conditions.
Physical examination reveals typical findings associated with these conditions.

Generating diagnostic reasoning for real patient...

Diagnostic reasoning analysis:
Patient case: A 31-year-old male with a history of housing instability, higher education, transport problems, social isolation, and stress. Recently experiencing symptoms related to these conditions. Currently on standard treatments for these conditions. Physical examination reveals typical findings associated with these conditions.

Medical reasoning:

1. **Primary concern**: The main issue appears to be related to psychosocial stressors impacting mental health and
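
# In section 8.2 the diagnostic text is shortened with `diagnostic_reasoning[:200]`, which can cut a word in half. The standard library's `textwrap.shorten` collapses whitespace and truncates at a word boundary instead. A minimal sketch follows; `shorten_for_prompt` is a hypothetical wrapper and the sample text is a stand-in, not real model output.

```python
import textwrap


def shorten_for_prompt(text, max_chars=200):
    """Collapse whitespace and cut at a word boundary, ending with '...' if truncated."""
    return textwrap.shorten(text, width=max_chars, placeholder="...")


# Hypothetical stand-in for a model response
sample = "Primary concern: psychosocial stressors impacting mental health and daily functioning."
short = shorten_for_prompt(sample, max_chars=30)
print(short)
print(shorten_for_prompt("short text"))  # Text within the limit is returned unchanged
```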

# ## Conclusion
#
# This tutorial applied generative AI to healthcare using synthetic data and OpenAI's models. We explored prompt engineering techniques like in-context learning, few-shot learning, chain-of-thought, and tree-of-thought reasoning for various healthcare tasks:
#
# 1. **Summarization** of patient medical history
# 2. **Converting tabular data** to narrative text
# 3. **Disease prediction** based on patient characteristics
# 4. **Simplification** of medical notes for patients
# 5. **Medical question answering** with reasoning
#
# ### Future Extensions
#
# - Scale to larger datasets
# - Compare different OpenAI models (GPT-4o vs GPT-3.5-turbo)
# - Implement more sophisticated evaluation metrics
# - Fine-tune models on specific medical tasks
# - Add guardrails and safety checks for medical content
# - Explore multimodal applications (e.g., with medical imaging)
#
# ### Important Considerations
#
# - Always ensure patient privacy with synthetic or properly de-identified data
# - Verify model outputs with healthcare professionals before clinical use
# - Be cautious about potential biases in model training data
# - Consider regulatory requirements for healthcare AI applications