
# Anonymization of mental health medical records

### Introduction to Natural Language Processing - Final Project

to:

Alexander(Sasha) Apartsin, PhD

by:  
   
Uriel Atzmon, id 209307172  
Victoria Chuykina, id 321544512  


### Project Background
Mental health clinical notes often contain personally identifiable information (PII) and extended PII (ePII) such as places of work, family details, or personal events, which can compromise patient privacy. Traditional anonymization methods struggle to remove such details while preserving the clinical value of the text. This project aims to develop an LLM-based anonymization system that can identify and rephrase or remove ePII without losing medically relevant information. The system will be trained and validated on synthetic psychiatric records using a question-answering framework to assess both privacy protection and clinical utility.

| **Term**                                          | **Definition**                                                                                                             |
| ------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| **PII**<br/>(Personally Identifiable Information) | Any information that can directly identify an individual (e.g., full name, ID number, address).                            |
| **ePII**<br/>(Extended PII)                       | Indirect personal details that can still reveal identity, such as workplace, family relations, or personal life events.    |
| **Clinical Value**                                | The medically relevant content in a record, such as symptoms, diagnoses, and treatment information.                        |
| **LLM**<br/>(Large Language Model)                | A neural network trained on vast textual data to understand and generate human-like language (e.g., GPT, Claude, Mistral). |
| **Anonymization**                                 | The process of removing or altering identifying information to protect individual privacy.                                 |
| **Rephrasing**                                    | Changing the structure or wording of a sentence to obscure personal details while retaining meaning.                       |
| **NER**<br/>(Named Entity Recognition)            | A technique used to identify names, locations, and other entities in text; commonly used for de-identification.            |
| **BERTScore**                                     | A semantic similarity metric that compares model-generated answers to reference answers based on contextual embeddings.    |
| **CQs**<br/>(Clinical Questions)                  | Questions that test whether the anonymized text still conveys medical insights.                                            |
| **PQs**<br/>(Personal Questions)                  | Questions that test whether identifying details remain after anonymization.                                                |


### Clinical Question Bank – Introduction
This section presents a curated set of clinical questions derived from DSM-5 diagnostic criteria for various mental health disorders.
Each question is designed to:

Represent a key symptom or diagnostic element of a specific disorder.

Be used for validating the clinical content of anonymized medical notes.

Ensure that after anonymization, clinically relevant information remains answerable.

These questions are especially useful in the context of:

Synthetic data generation – as prompts to include clinically meaningful content.

Evaluation of LLM anonymization quality – to test if answers to clinical questions are still preserved.

The dataset below includes:

10 major psychiatric diagnoses (e.g., MDD, PTSD, OCD).

In [2]:
import pandas as pd

# Load your CSV file
df_cq = pd.read_csv("clinical_questions_dsm5.csv")

# Set display options to show full column width
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', 100)  

# Display the full table
display(df_cq)


Unnamed: 0,Disorder,Clinical Question
0,Major Depressive Disorder (MDD),Does the patient describe a persistent depressed mood?
1,Major Depressive Disorder (MDD),Has the patient lost interest or pleasure in usual activities?
2,Major Depressive Disorder (MDD),Is there mention of significant weight loss or gain?
3,Major Depressive Disorder (MDD),Does the patient report insomnia or hypersomnia?
4,Major Depressive Disorder (MDD),Is there observable psychomotor agitation or retardation?
5,Major Depressive Disorder (MDD),Does the patient express fatigue or loss of energy?
6,Major Depressive Disorder (MDD),Are there feelings of worthlessness or excessive guilt?
7,Major Depressive Disorder (MDD),Is there diminished ability to concentrate or make decisions?
8,Major Depressive Disorder (MDD),Does the patient mention recurrent thoughts of death or suicide?
9,Major Depressive Disorder (MDD),Has the episode lasted at least two weeks?


### Introduction – Personal Questions (ePII)
This section presents a structured bank of personal questions aimed at identifying whether a medical note still contains any personally identifiable information (PII) or extended PII (ePII) after anonymization.

The questions are based on common ePII categories, including indirect identifiers such as place of work, family members, residential location, personal life events, or uncommon activities.
Even though these details may not fall under strict PII definitions, they can still pose a re-identification risk.

This question set is used for:

Synthetic text generation – by prompting LLMs to include personal facts.

Anonymization validation – by checking whether answers to personal questions are still extractable.

The dataset includes:

10 categories of extended personal information.

5 example questions per category, designed to support both data generation and privacy evaluation tasks.

In [3]:
import pandas as pd

# Load your CSV file
df_pq = pd.read_csv("personal_questions_epii.csv")

# Set display options to show full column width
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', 100)  

# Display the full table
display(df_pq)


Unnamed: 0,category,prompt
0,Patient's Name,First Name:
1,Patient's Name,Last Name:
2,Patient's Name,Full Residential Address:
3,Patient's Name,City of Residence:
4,Patient's Name,Date of Birth:
5,Place of Residence,Gender (Male/Female):
6,Place of Residence,Country of Birth:
7,Place of Residence,Parents' Country of Origin:
8,Place of Residence,Marital Status:
9,Place of Residence,Do you have children? How many:


In [None]:
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

NameError: name 'sk' is not defined

The code generates a synthetic short psychiatric intake paragraph by combining answers to one randomly selected clinical question and one personal question, using OpenAI's GPT model.


In [None]:
import openai
import random

openai.api_key =   # Replace with your actual API key

# Function to generate a synthetic clinical note using one clinical and one personal question
def generate_synthetic_note(df_cq, df_pq, model="gpt-4"):
    # Randomly sample one clinical question
    cq = df_cq.sample(1).iloc[0]['Clinical Question']
    
    # Randomly sample one personal question
    pq = df_pq.sample(1).iloc[0]['prompt']

    # Build the prompt to guide the language model
    prompt = f"""
Create a realistic short clinical note (5–7 sentences) based on the following constraints:

- Clinical detail: The note should include information that answers the question: "{cq}"
- Personal detail: The note should include information that answers the question: "{pq}"

Structure it as a psychiatric intake paragraph, suitable for use in anonymization testing.
    """.strip()

    # Call the OpenAI GPT model with the constructed prompt
    response = openai.ChatCompletion.create(
        model=model,  # You can change to 'gpt-3.5-turbo' if needed
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,  # Controls randomness. 0 = more deterministic
    )

    # Return a dictionary containing the input and output
    return {
        "prompt": prompt,
        "clinical_question": cq,
        "personal_question": pq,
        "generated_note": response.choices[0].message['content'].strip()
    }

# Example usage: generate one synthetic clinical note
result = generate_synthetic_note(df_cq, df_pq)
print(result['generated_note'])


Patient is a 45-year-old female presenting with significant impulsivity in at least two self-damaging areas. She has a history of substance misuse, including alcohol and prescription medication, and also exhibits impulsive spending that has led to substantial financial debt. Moreover, she has a record of reckless driving, which has resulted in several traffic violations and accidents. The patient provided her contact information for further consultation and follow-up; her phone number is 555-123-4567. She has been advised to seek immediate help in case of any emergency or exacerbation of her impulsive behaviors.


This script generates synthetic records for _NUM_ fictional psychiatric patients using OpenAI's language model. The resulting dataset combines personal and clinical information and is intended for use in anonymization model training and evaluation.


Load Input Files

clinical_questions_dsm5.csv: Contains clinical questions grouped by disorder.

personal_questions_epii_old_ver.csv: Contains categorized personal questions.

Sampling Questions

For each patient, the code randomly selects:

One disorder and 10 related clinical questions.

10 personal questions, one from each of 10 unique categories.

Generating Answers

Each clinical question is answered with a realistic, sentence-length clinical fact using GPT.

Each personal question is answered with a short, specific identifying detail (e.g., name, phone, address).

Output Structure

For each patient:

PQ_1–PQ_10: Personal questions

PA_1–PA_10: Personal answers

CQ_1–CQ_10: Clinical questions

CA_1–CA_10: Clinical answers

All results are saved into a single structured CSV file: generated_patient_data.csv.

In [None]:
import openai
import pandas as pd
import random
from tqdm import tqdm

# Load the questions
df_cq = pd.read_csv("clinical_questions_dsm5.csv")
df_pq = pd.read_csv("personal_questions_epii.csv")  # using updated file format

openai.api_key =  # Replace with your actual API key

# ========== Helper Functions ==========

def answer_clinical_question(question, model="gpt-3.5-turbo"):
    prompt = f"""
You are helping to build a realistic clinical case. Answer the following **clinical** question with a short, concrete, realistic fact about a hypothetical patient.

Question:
{question}

Answer:"""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.6,
    )
    return response.choices[0].message.content.strip()

def answer_personal_question(prompt_text, model="gpt-3.5-turbo"):
    prompt = f"""
You are constructing structured demographic data for a synthetic patient. Give a very short and specific answer to the personal field below. Respond with only one fact — such as a name, place, email, phone number, or other identifiable detail — with explanation = the {prompt_text.lower()} is ___.

Prompt:
{prompt_text}

Answer:"""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.5,
    )
    return response.choices[0].message.content.strip()

# ========== Main Execution ==========

num_patients = 5
qa_records = []

for pid in tqdm(range(1, num_patients + 1)):
    try:
        # Select random disorder and clinical questions
        disorder = random.choice(df_cq["Disorder"].unique())
        cqs = random.sample(
            df_cq[df_cq["Disorder"] == disorder]["Clinical Question"].tolist(), 10
        )

        # Select 10 random personal prompts (any category)
        pqs = df_pq.sample(10)["prompt"].tolist()

        # Generate answers using OpenAI
        cas = [answer_clinical_question(q) for q in cqs]
        pas = [answer_personal_question(q) for q in pqs]

        qa_records.append({
            "PatientID": pid,
            "Disorder": disorder,
            "PQs": pqs,
            "PAs": pas,
            "CQs": cqs,
            "CAs": cas
        })

    except Exception as e:
        print(f"❌ Error for patient {pid}: {e}")

# ========== Save Output ==========

df_output = pd.DataFrame(qa_records)

# Expand nested lists: personal questions/answers first, then clinical
for i in range(10):
    df_output[f"PQ_{i+1}"] = df_output["PQs"].apply(lambda x: x[i] if i < len(x) else "")
    df_output[f"PA_{i+1}"] = df_output["PAs"].apply(lambda x: x[i] if i < len(x) else "")
    df_output[f"CQ_{i+1}"] = df_output["CQs"].apply(lambda x: x[i] if i < len(x) else "")
    df_output[f"CA_{i+1}"] = df_output["CAs"].apply(lambda x: x[i] if i < len(x) else "")

# Drop the raw lists to flatten the CSV
df_output.drop(columns=["PQs", "PAs", "CQs", "CAs"], inplace=True)

# Save the final CSV
df_output.to_csv("generated_patient_data.csv", index=False)
print("✅ File saved: generated_patient_data.csv")


100%|██████████| 5/5 [01:04<00:00, 12.94s/it]

✅ File saved: generated_patient_data.csv





Large scale

In [None]:
import openai
import pandas as pd
import random
import time
from tqdm import tqdm

# ========== Load Input Data ==========

# Load clinical questions (DSM-5-based)
df_cq = pd.read_csv("clinical_questions_dsm5.csv")

# Load personal questions (ePII-based)
df_pq = pd.read_csv("personal_questions_epii.csv")

# Set OpenAI API key
openai.api_key =  # Replace with your actual API key

# ========== Helper Functions ==========

def answer_clinical_question(question, model="gpt-3.5-turbo"):
    """
    Generate a short, realistic clinical answer for a given question using OpenAI.
    """
    prompt = f"""
You are helping to build a realistic clinical case. Answer the following **clinical** question with a short, concrete, realistic fact about a hypothetical patient.

Question:
{question}

Answer:"""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.6,
    )
    time.sleep(0.2)  # Prevent rate limiting
    return response.choices[0].message.content.strip()


def answer_personal_question(prompt_text, model="gpt-3.5-turbo"):
    """
    Generate a short, specific answer for a personal question (PII) using OpenAI.
    """
    prompt = f"""
You are constructing structured demographic data for a synthetic patient. Give a very short and specific answer to the personal field below. Respond with only one fact — such as a name, place, email, phone number, or other identifiable detail — with explanation = the {prompt_text.lower()} is ___.

Prompt:
{prompt_text}

Answer:"""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.5,
    )
    time.sleep(0.2)
    return response.choices[0].message.content.strip()

# ========== Main Data Generation Loop ==========

num_patients = 500  # Number of synthetic patients to generate
qa_records = []

for pid in tqdm(range(1, num_patients + 1), desc="Generating Patients"):
    try:
        # Select a random disorder and 10 related clinical questions
        disorder = random.choice(df_cq["Disorder"].unique())
        cqs = random.sample(
            df_cq[df_cq["Disorder"] == disorder]["Clinical Question"].tolist(), 10
        )

        # Select 10 random personal questions
        pqs = df_pq.sample(10)["prompt"].tolist()

        # Generate answers via OpenAI
        cas = [answer_clinical_question(q) for q in cqs]
        pas = [answer_personal_question(q) for q in pqs]

        # Append patient record
        qa_records.append({
            "PatientID": pid,
            "Disorder": disorder,
            "PQs": pqs,
            "PAs": pas,
            "CQs": cqs,
            "CAs": cas
        })

        # Save partial backup every 100 patients
        if pid % 100 == 0:
            df_partial = pd.DataFrame(qa_records)
            df_partial.to_csv(f"backup_until_patient_{pid}.csv", index=False)
            print(f"✅ Backup saved after {pid} patients")

    except Exception as e:
        print(f"❌ Error for patient {pid}: {e}")

# ========== Post-Processing and Save Output ==========

# Convert list-based records to flat DataFrame columns
df_output = pd.DataFrame(qa_records)

for i in range(10):
    df_output[f"PQ_{i+1}"] = df_output["PQs"].apply(lambda x: x[i] if i < len(x) else "")
    df_output[f"PA_{i+1}"] = df_output["PAs"].apply(lambda x: x[i] if i < len(x) else "")
    df_output[f"CQ_{i+1}"] = df_output["CQs"].apply(lambda x: x[i] if i < len(x) else "")
    df_output[f"CA_{i+1}"] = df_output["CAs"].apply(lambda x: x[i] if i < len(x) else "")

# Drop raw list columns
df_output.drop(columns=["PQs", "PAs", "CQs", "CAs"], inplace=True)

# Save final flat CSV
df_output.to_csv("generated_patient_data_500.csv", index=False)
print("✅ Final CSV saved: generated_patient_data_500.csv")


Generating Patients:  15%|█▌        | 75/500 [19:58<1:30:07, 12.72s/it]

❌ Error for patient 75: The server is overloaded or not ready yet.


Generating Patients:  16%|█▌        | 78/500 [25:37<11:57:19, 101.99s/it]

❌ Error for patient 78: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))


Generating Patients:  20%|█▉        | 99/500 [36:18<11:32:53, 103.67s/it]

❌ Error for patient 99: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))


Generating Patients:  20%|██        | 100/500 [36:35<8:38:18, 77.75s/it] 

✅ Backup saved after 100 patients


Generating Patients:  22%|██▏       | 108/500 [38:39<2:00:47, 18.49s/it]

❌ Error for patient 108: The server is overloaded or not ready yet.


Generating Patients:  34%|███▍      | 170/500 [1:00:51<10:29:00, 114.37s/it]

❌ Error for patient 170: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))


Generating Patients:  40%|████      | 200/500 [1:09:36<2:04:12, 24.84s/it]  

✅ Backup saved after 200 patients


Generating Patients:  46%|████▌     | 231/500 [1:22:59<7:48:41, 104.54s/it]

❌ Error for patient 231: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))


Generating Patients:  60%|██████    | 300/500 [1:41:58<54:22, 16.31s/it]   

✅ Backup saved after 300 patients


Generating Patients:  76%|███████▌  | 378/500 [2:08:00<3:32:40, 104.60s/it]

❌ Error for patient 378: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))


Generating Patients:  78%|███████▊  | 391/500 [2:11:16<22:12, 12.22s/it]   

❌ Error for patient 390: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 391: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  78%|███████▊  | 392/500 [2:11:16<15:29,  8.61s/it]

❌ Error for patient 392: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  79%|███████▊  | 393/500 [2:11:17<11:28,  6.44s/it]

❌ Error for patient 393: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  79%|███████▉  | 394/500 [2:11:18<08:31,  4.82s/it]

❌ Error for patient 394: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  79%|███████▉  | 395/500 [2:11:19<06:00,  3.43s/it]

❌ Error for patient 395: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  79%|███████▉  | 397/500 [2:11:19<03:03,  1.79s/it]

❌ Error for patient 396: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 397: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  80%|███████▉  | 399/500 [2:11:19<01:38,  1.02it/s]

❌ Error for patient 398: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 399: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  80%|████████  | 401/500 [2:11:20<00:57,  1.73it/s]

❌ Error for patient 400: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 401: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  81%|████████  | 403/500 [2:11:20<00:37,  2.60it/s]

❌ Error for patient 402: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 403: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  81%|████████  | 405/500 [2:11:21<00:27,  3.42it/s]

❌ Error for patient 404: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 405: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  81%|████████▏ | 407/500 [2:11:21<00:22,  4.05it/s]

❌ Error for patient 406: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 407: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  82%|████████▏ | 409/500 [2:11:21<00:19,  4.60it/s]

❌ Error for patient 408: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 409: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  82%|████████▏ | 411/500 [2:11:22<00:18,  4.91it/s]

❌ Error for patient 410: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 411: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  83%|████████▎ | 413/500 [2:11:22<00:17,  5.10it/s]

❌ Error for patient 412: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 413: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  83%|████████▎ | 415/500 [2:11:23<00:17,  4.98it/s]

❌ Error for patient 414: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 415: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  83%|████████▎ | 416/500 [2:11:23<00:16,  5.03it/s]

❌ Error for patient 416: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  84%|████████▎ | 418/500 [2:11:23<00:16,  4.89it/s]

❌ Error for patient 417: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 418: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  84%|████████▍ | 420/500 [2:11:24<00:16,  4.97it/s]

❌ Error for patient 419: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 420: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  84%|████████▍ | 422/500 [2:11:24<00:15,  4.99it/s]

❌ Error for patient 421: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 422: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  85%|████████▍ | 423/500 [2:11:24<00:14,  5.15it/s]

❌ Error for patient 423: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  85%|████████▌ | 425/500 [2:11:25<00:14,  5.03it/s]

❌ Error for patient 424: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 425: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  85%|████████▌ | 427/500 [2:11:25<00:14,  5.06it/s]

❌ Error for patient 426: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 427: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  86%|████████▌ | 428/500 [2:11:25<00:14,  5.00it/s]

❌ Error for patient 428: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  86%|████████▌ | 429/500 [2:11:25<00:15,  4.49it/s]

❌ Error for patient 429: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  86%|████████▌ | 431/500 [2:11:26<00:14,  4.71it/s]

❌ Error for patient 430: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 431: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  87%|████████▋ | 433/500 [2:11:26<00:13,  4.83it/s]

❌ Error for patient 432: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 433: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  87%|████████▋ | 435/500 [2:11:27<00:13,  4.97it/s]

❌ Error for patient 434: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 435: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  87%|████████▋ | 437/500 [2:11:27<00:12,  5.15it/s]

❌ Error for patient 436: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 437: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  88%|████████▊ | 439/500 [2:11:27<00:11,  5.18it/s]

❌ Error for patient 438: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 439: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  88%|████████▊ | 441/500 [2:11:28<00:11,  5.33it/s]

❌ Error for patient 440: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 441: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  88%|████████▊ | 442/500 [2:11:28<00:11,  5.15it/s]

❌ Error for patient 442: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  89%|████████▉ | 444/500 [2:11:29<00:19,  2.90it/s]

❌ Error for patient 443: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 444: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  89%|████████▉ | 446/500 [2:11:30<00:14,  3.67it/s]

❌ Error for patient 445: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 446: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  90%|████████▉ | 448/500 [2:11:30<00:11,  4.37it/s]

❌ Error for patient 447: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 448: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  90%|█████████ | 450/500 [2:11:30<00:10,  4.84it/s]

❌ Error for patient 449: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 450: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  90%|█████████ | 451/500 [2:11:31<00:10,  4.89it/s]

❌ Error for patient 451: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  91%|█████████ | 453/500 [2:11:31<00:09,  5.02it/s]

❌ Error for patient 452: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 453: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  91%|█████████ | 455/500 [2:11:31<00:08,  5.18it/s]

❌ Error for patient 454: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 455: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  91%|█████████ | 456/500 [2:11:31<00:08,  5.25it/s]

❌ Error for patient 456: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  92%|█████████▏| 458/500 [2:11:32<00:08,  5.03it/s]

❌ Error for patient 457: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 458: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  92%|█████████▏| 460/500 [2:11:32<00:07,  5.13it/s]

❌ Error for patient 459: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 460: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  92%|█████████▏| 462/500 [2:11:33<00:07,  5.16it/s]

❌ Error for patient 461: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 462: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  93%|█████████▎| 464/500 [2:11:33<00:06,  5.31it/s]

❌ Error for patient 463: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 464: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  93%|█████████▎| 466/500 [2:11:33<00:06,  5.21it/s]

❌ Error for patient 465: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 466: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  93%|█████████▎| 467/500 [2:11:34<00:06,  5.23it/s]

❌ Error for patient 467: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  94%|█████████▍| 469/500 [2:11:34<00:06,  5.09it/s]

❌ Error for patient 468: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 469: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  94%|█████████▍| 471/500 [2:11:34<00:05,  5.13it/s]

❌ Error for patient 470: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 471: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  95%|█████████▍| 473/500 [2:11:35<00:05,  5.12it/s]

❌ Error for patient 472: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 473: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  95%|█████████▌| 475/500 [2:11:35<00:04,  5.17it/s]

❌ Error for patient 474: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 475: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  95%|█████████▌| 477/500 [2:11:36<00:04,  5.11it/s]

❌ Error for patient 476: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 477: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  96%|█████████▌| 479/500 [2:11:36<00:04,  5.13it/s]

❌ Error for patient 478: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 479: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  96%|█████████▌| 481/500 [2:11:36<00:03,  5.29it/s]

❌ Error for patient 480: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 481: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  97%|█████████▋| 483/500 [2:11:37<00:06,  2.79it/s]

❌ Error for patient 482: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 483: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  97%|█████████▋| 485/500 [2:11:38<00:04,  3.70it/s]

❌ Error for patient 484: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 485: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  97%|█████████▋| 487/500 [2:11:38<00:03,  4.23it/s]

❌ Error for patient 486: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 487: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  98%|█████████▊| 489/500 [2:11:39<00:02,  4.63it/s]

❌ Error for patient 488: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 489: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  98%|█████████▊| 491/500 [2:11:39<00:01,  4.93it/s]

❌ Error for patient 490: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 491: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  99%|█████████▊| 493/500 [2:11:39<00:01,  5.11it/s]

❌ Error for patient 492: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 493: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  99%|█████████▉| 494/500 [2:11:40<00:01,  5.05it/s]

❌ Error for patient 494: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients:  99%|█████████▉| 496/500 [2:11:40<00:00,  5.06it/s]

❌ Error for patient 495: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 496: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients: 100%|█████████▉| 498/500 [2:11:40<00:00,  5.13it/s]

❌ Error for patient 497: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 498: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients: 100%|██████████| 500/500 [2:11:41<00:00,  5.25it/s]

❌ Error for patient 499: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.
❌ Error for patient 500: Rate limit reached for gpt-3.5-turbo in organization org-jPgrScBOBpmyGlofNuouEiK4 on requests per day (RPD): Limit 10000, Used 10000, Requested 1. Please try again in 8.64s. Visit https://platform.openai.com/account/rate-limits to learn more.


Generating Patients: 100%|██████████| 500/500 [2:11:41<00:00, 15.80s/it]

✅ Final CSV saved: generated_patient_data_500.csv





##  Synthetic Therapist Note Generation

This notebook generates realistic psychiatric intake notes using GPT-3.5.

- Loads `generated_patient_data.csv` with clinical and personal questions.
- For each patient, prompts GPT to write an 6–8 sentence therapist-style summary.
- Notes include clinical symptoms (based on disorder) and personal details (name, address, ID, etc.).
- Output is saved to `generated_patient_data_with_notes.csv` in the `Therapist_Note` column.


In [None]:
import openai
import pandas as pd
from tqdm import tqdm

# Load data
df = pd.read_csv("generated_patient_data_final.csv")

openai.api_key =   # Replace with your actual API key

# Function to generate summary
def generate_synthetic_note(clinical_questions, personal_questions, disorder, model="gpt-3.5-turbo"):
    clinical_part = "\n".join([f"- {q}" for q in clinical_questions])
    personal_part = "\n".join([f"- {q}" for q in personal_questions])

    prompt = f"""
Write a psychiatric intake summary (6–8 sentences) written from the therapist's perspective, documenting the patient's narrative and observed symptoms.

- The paragraph should include a realistic blend of clinical symptoms and identifiable personal background.
- Address all of the following clinical observations related to: {disorder}
{clinical_part}

- Also reflect all of the following personal details shared by the patient:
{personal_part}

- Present the patient's story in a natural, flowing clinical note format — do not list questions or answers directly.
- Additionally, include clearly identifiable personal information as typically documented in an intake: the patient’s full name, exact home address (street, number, city), national ID or license number, phone number, and personal email address — woven seamlessly into the narrative.
- The result will be used to evaluate anonymization models, so realism and detail are essential.
""".strip()

    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )

    return response.choices[0].message.content.strip()


# Apply to all rows
summaries = []

for _, row in tqdm(df.iterrows(), total=len(df)):
    disorder = row["Disorder"]
    clinical_questions = [row[f"CQ_{i+1}"] for i in range(10)]
    personal_questions = [row[f"PQ_{i+1}"] for i in range(10)]
    
    try:
        summary = generate_synthetic_note(clinical_questions, personal_questions, disorder)
    except Exception as e:
        summary = f"Error generating note: {e}"

    summaries.append(summary)

# Add summaries to DataFrame
df["Therapist_Note"] = summaries

# Save new CSV
df.to_csv("generated_patient_data_with_notes.csv", index=False)
print("✅ File saved: generated_patient_data_with_notes.csv")


100%|██████████| 878/878 [40:07<00:00,  2.74s/it]

✅ File saved: generated_patient_data_with_notes.csv





This code loads tagged clinical notes and counts the number of embedded tags (in the format [[TAG:TEXT]]) per patient.  
It also calculates the frequency of each tag type (e.g., NAME, DIAGNOSIS, SYMPTOM) across the dataset for analysis.


In [None]:
import openai
import pandas as pd
from tqdm import tqdm

# Load data
df = pd.read_csv("generated_patient_data_with_notes.csv")  

openai.api_key =  # Replace with your actual API key

# Function to annotate therapist notes
def tag_therapist_note(note, model="gpt-3.5-turbo"):
    prompt = f"""
You are a clinical language model. Annotate the following psychiatric intake note by embedding tags **within the text** using the format [[TAG:TEXT]].

Use these tag types:
- NAME
- AGE
- LOCATION
- ORG (Organization or Workplace)
- CONTACT (Email, Phone)
- DATE
- NATIONALITY
- LANGUAGE
- RELIGION
- FAMILY
- TRAUMA
- IMMIGRATION
- OCCUPATION
- SYMPTOM (e.g., insomnia, guilt)
- DIAGNOSIS
- RISK (e.g., suicidal ideation)

Return only the annotated note, without explanation.

### Example input:
The patient John Doe, aged 28, reported insomnia and suicidal ideation after moving from Syria. His father died in an accident.

### Example output:
The patient [[NAME:John Doe]], aged [[AGE:28]], reported [[SYMPTOM:insomnia]] and [[RISK:suicidal ideation]] after [[IMMIGRATION:moving from Syria]]. His [[FAMILY:father]] died in [[TRAUMA:an accident]].

Now annotate this note:
\"\"\"
{note}
\"\"\"
""".strip()

    try:
        response = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"[ERROR] {str(e)}"

# Apply annotation to all notes
tagged_notes = []

for i, row in tqdm(df.iterrows(), total=len(df), desc="Annotating notes"):
    note = row.get("Therapist_Note", "")
    if not isinstance(note, str) or note.strip() == "":
        tagged_notes.append("[ERROR] Empty or invalid note")
        continue

    tagged = tag_therapist_note(note)
    tagged_notes.append(tagged)

# Add the tagged notes to the DataFrame
df["Tagged_Note"] = tagged_notes

# Save to CSV
output_file = "generated_patient_data_with_tagged_notes.csv"
df.to_csv(output_file, index=False)
print(f"✅ File saved: {output_file}")


Annotating notes: 100%|██████████| 878/878 [1:00:11<00:00,  4.11s/it]

✅ File saved: generated_patient_data_with_tagged_notes.csv



