In [1]:
from tqdm import tqdm
# Alex can I have $50 in AWS Bedrock credits to test some LLM stuff

##  Extracting Clinical Outcomes w/ Outlines
Utilizing the *outlines* package to support a Pydantic model and enforce structure in generated text. By providing a Pydantic Model of what a therapeutic outcome object should be, can we prompt for outcomes from a provided clinical trials section of an FDA label (faux-RAG).   

### Test Run using Outlines
Load FDA label data as search space. 



In [2]:
import pandas as pd

search_space = pd.read_excel('../20240424_trial_searchspace.xlsx').reset_index(drop=True).drop('Unnamed: 0', axis=1)


Load model. Using Orca this time:

>Developed by some of the researchers behind Llama, the Mistral large language models are the gold standard for accessible and performant open source models.  
>Mistral AI offers 7B and a mixture-of-experts 8x7B open source models competitive or better than commercial models of similar size. Available under the Apache 2.0 license, the Mistral models are now also available via most cloud vendors. After releasing their open source models, Mistral AI has also begun offering Small, Large, and Embed models via their business API.
  
>Orca is a descendant of LLaMA developed by Microsoft with finetuning on explanation traces obtained from GPT-4.  
>Orca-13B is a LLM developed by Microsoft. It is based on LLaMA with finetuning on complex explanation traces obtained from GPT-4. By using rich signals, Orca surpasses the performance of models such as Vicuna-13B on complex tasks. However, given its model backbone and the data used for its finetuning, Orca is under noncommercial use.  

Orca is a direct fine-tune of Mistral AI
https://news.ycombinator.com/item?id=37743781, https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca 
>We have used our own OpenOrca dataset to fine-tune on top of Mistral 7B. This dataset is our attempt to reproduce the dataset generated for Microsoft Research's Orca Paper. We use OpenChat packing, trained with Axolotl.
>
>This release is trained on a curated filtered subset of most of our GPT-4 augmented data. It is the same subset of our data as was used in our OpenOrcaxOpenChat-Preview2-13B model.




In [3]:
import outlines

model = outlines.models.transformers("Open-Orca/Mistral-7B-OpenOrca")
# TODO: instruct vs OpenOrca

  from .autonotebook import tqdm as notebook_tqdm
Loading checkpoint shards: 100%|██████████| 2/2 [00:17<00:00,  8.85s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Create a function to generate prompts. This will make it just a little bit easier to keep track of the prompts being used (they are huge) and directly insert each label into them outside of the main loop.

In [4]:
@outlines.prompt #TODO: add value to description (break out point 1)
def identify_outcomes(clinical_trial):
    """You are a professional medical practicioner with a medical degree. Other doctors \
    send you clinical trial reports from which you need to extract:

    1. The primary outcome measure and measured value
    2. The therapeutic treatment regiment used to achieve the primary outcome
    
    # EXAMPLE
    CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
    RESULT: {"outcome": "Median overall survival", "value": "9.0 months", "regiment":"triplet-therapy group"}

    # OUTPUT INSTRUCTIONS    
    
    Answer in valid JSON. Here are different objects relevant for the output:

    ClinicalOutcome:
        outcome (str): name of the primary outcome measure
        value (str): the value that was measured from the outcome
        regiment (str): the therapeutic treatment strategy used to obtain the outcome

    # OUTPUT
    
    CLINICAL_TRIAL: {{ clinical_trial }}
    RESULT: """

Create a pydantic model that can be representative of a therapeutic outcome object. The model below is pretty basic and just contains three fields expected to be strings. The intent is that this model will serve as the schema that the LLM needs to 'fill in' given the context, search space, and one-shot example in the prompt.

In [5]:
from pydantic import BaseModel

# Notes: Outcome could be a giant Enum according to buckets? Value str to capture lots of values, but could enforce strict int
class ClinicalOutcome(BaseModel):
    outcome: str 
    value: str
    regiment: str

Create list of prompts using the prompt function we made.

In [34]:
prompts = [identify_outcomes(trial) for trial in list(search_space['clinical_studies'][68:100])]
prompts[0:2]

['You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:\n\n1. The primary outcome measure and measured value\n2. The therapeutic treatment regiment used to achieve the primary outcome\n\n# EXAMPLE\nCLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.\nRESULT: {"outcome": "Median overall survival", "va

Combine the model and the pydantic schema to create a generator object

In [35]:
generator = outlines.generate.json(model, ClinicalOutcome)

Run through all prompts

In [36]:
results = []
for prompt in tqdm(prompts):
    result = generator(prompt)
    print(prompt)
    print(result)
    results.append(result)

  3%|▎         | 1/32 [53:45<27:46:24, 3225.32s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

  6%|▋         | 2/32 [1:46:13<26:29:56, 3179.89s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

  9%|▉         | 3/32 [2:38:31<25:27:36, 3160.58s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 12%|█▎        | 4/32 [3:41:04<26:24:11, 3394.70s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 16%|█▌        | 5/32 [4:40:01<25:50:43, 3446.05s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 19%|█▉        | 6/32 [5:46:27<26:12:45, 3629.44s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 22%|██▏       | 7/32 [7:08:37<28:09:32, 4054.89s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 25%|██▌       | 8/32 [7:59:55<24:57:26, 3743.59s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 28%|██▊       | 9/32 [9:04:01<24:07:23, 3775.82s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 31%|███▏      | 10/32 [9:51:37<21:20:17, 3491.71s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 34%|███▍      | 11/32 [10:46:07<19:58:19, 3423.79s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 38%|███▊      | 12/32 [11:57:43<20:29:45, 3689.26s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 41%|████      | 13/32 [12:54:18<18:59:59, 3599.99s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 44%|████▍     | 14/32 [14:07:47<19:13:21, 3844.54s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 47%|████▋     | 15/32 [14:57:15<16:54:26, 3580.37s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 50%|█████     | 16/32 [16:34:33<18:55:55, 4259.73s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 53%|█████▎    | 17/32 [17:54:13<18:24:03, 4416.22s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 56%|█████▋    | 18/32 [18:58:28<16:31:06, 4247.62s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 59%|█████▉    | 19/32 [19:53:32<14:18:56, 3964.36s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 62%|██████▎   | 20/32 [20:59:19<13:11:47, 3958.94s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 66%|██████▌   | 21/32 [21:43:15<10:52:59, 3561.77s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 69%|██████▉   | 22/32 [23:09:40<11:14:51, 4049.14s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 72%|███████▏  | 23/32 [24:16:17<10:05:00, 4033.38s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 75%|███████▌  | 24/32 [27:39:27<14:24:06, 6480.86s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 78%|███████▊  | 25/32 [28:53:21<11:24:26, 5866.60s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 81%|████████▏ | 26/32 [29:45:48<8:25:05, 5050.88s/it] 

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 84%|████████▍ | 27/32 [30:42:31<6:19:42, 4556.41s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 88%|████████▊ | 28/32 [31:54:00<4:58:24, 4476.11s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 91%|█████████ | 29/32 [33:17:27<3:51:46, 4635.35s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 94%|█████████▍| 30/32 [34:07:44<2:18:19, 4149.89s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

 97%|█████████▋| 31/32 [35:34:08<1:14:20, 4460.26s/it]

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.

100%|██████████| 32/32 [36:24:09<00:00, 4095.31s/it]  

You are a professional medical practicioner with a medical degree. Other doctors send you clinical trial reports from which you need to extract:

1. The primary outcome measure and measured value
2. The therapeutic treatment regiment used to achieve the primary outcome

# EXAMPLE
CLINICAL_TRIAL: The median overall survival was 9.0 months in the triplet-therapy group and 5.4 months in the control group (hazard ratio for death, 0.52; 95% confidence interval [CI], 0.39 to 0.70; P<0.001). The confirmed response rate was 26% (95% CI, 18 to 35) in the triplet-therapy group and 2% (95% CI, 0 to 7) in the control group (P<0.001). The median overall survival in the doublet-therapy group was 8.4 months (hazard ratio for death vs. control, 0.60; 95% CI, 0.45 to 0.79; P<0.001). Adverse events of grade 3 or higher occurred in 58% of patients in the triplet-therapy group, in 50% in the doublet-therapy group, and in 61% in the control group.
RESULT: {"outcome": "Median overall survival", "value": "9.




In [30]:
results[0]

ClinicalOutcome(outcome='primary', value='100%', regiment='duloxetine delayed-release capsules} 60-120 mg daily }')

### Group and Save
Break apart each dictionary response and save as a new dataframe with explicit columns for each data point to make downstream evaluation just a little bit easier.

In [37]:
outcomes = []
values = []
regiments = []

for clinical_outcome in results:
    outcomes.append(clinical_outcome.outcome)
    values.append(clinical_outcome.value)
    regiments.append(clinical_outcome.regiment)

clinical_outcomes_df = pd.DataFrame().assign(outcome=outcomes,value=values,regiment=regiments)
clinical_outcomes_df

Unnamed: 0,outcome,value,regiment
0,Blood Pressure,decreased blood pressure,Felodipine treatment
1,Improvement of veins,4.5,Polidocanol
2,Median overall survival,9.0 months,triplet-therapy group
3,ADHD symptoms,statistically significantly improved on atomox...,atomoxetine
4,Responder rates,57%,Sandimmune ® and cyclosporine
5,completion of all weeks of study treatment or ...,42%,lamotrigine
6,primary_efficacy,better_than_Comparator,atorvastatin_calcium_10_mg_daily
7,VT/VF,double-blind,intravenous amiodarone
8,Median hemoglobin concentration,13.5 g/dL,VPRIV every other week
9,Symptom scores,6.2 units,combination therapy


In [None]:
list(search_space['brand_name'][5:35])

In [38]:
# search_space[5:35] #['brand_name']  ['application_number']  ['clinical_studies']
clinical_outcomes_df['brand_name'] = list(search_space['brand_name'][68:100])
clinical_outcomes_df['application_number'] = list(search_space['application_number'][68:100])
clinical_outcomes_df['clinical_studies'] = list(search_space['clinical_studies'][68:100])


clinical_outcomes_df[0:5]

Unnamed: 0,outcome,value,regiment,brand_name,application_number,clinical_studies
0,Blood Pressure,decreased blood pressure,Felodipine treatment,Felodipine,ANDA204800,Clinical Studies Felodipine produces dose rela...
1,Improvement of veins,4.5,Polidocanol,Asclera,NDA021201,14 CLINICAL STUDIES Asclera was evaluated in a...
2,Median overall survival,9.0 months,triplet-therapy group,Clofarabine,NDA021673,14 CLINICAL STUDIES Seventy-eight (78) pediatr...
3,ADHD symptoms,statistically significantly improved on atomox...,atomoxetine,Atomoxetine,ANDA202682,14 CLINICAL STUDIES 14.1 ADHD Studies in Child...
4,Responder rates,57%,Sandimmune ® and cyclosporine,Gengraf,ANDA065003,CLINICAL TRIALS Graph Rheumatoid Arthritis The...


In [39]:
# clinical_outcomes_df.to_excel('20240611_llm-orca-ner-set3.xlsx')