In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
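bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store the model weights in 4-bit precision
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_quant_type="nf4",              # use the 4-bit NormalFloat data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for the matrix multiplications
)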

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config
)

`low_cpu_mem_usage` was None, now set to True since model is quantized.


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]
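
To get a feel for what the 4-bit NF4 configuration above saves, the weight memory can be estimated with back-of-the-envelope arithmetic. The sketch below is an approximation, not a measurement: the parameter count (about 8.03 billion), the quantization block size of 64, and the double-quantization group size of 256 are assumptions based on the QLoRA scheme, and the helper name `weight_memory_gib` is made up for illustration. It also treats every parameter as quantized, whereas some layers (for example embeddings and norms) typically stay in higher precision, so the real footprint will be somewhat larger.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Memory needed to store n_params weights at the given bit width, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 8.03e9  # assumption: approximate parameter count of Llama 3 8B
BLOCK_SIZE = 64    # assumption: weights per 4-bit quantization block
DQ_GROUP = 256     # assumption: block constants per double-quantization group

bits_bf16 = 16.0
# Plain NF4: 4 bits per weight plus one 32-bit constant per block.
bits_nf4 = 4 + 32 / BLOCK_SIZE
# Double quantization: block constants stored in 8 bits, plus one 32-bit
# constant shared by each group of DQ_GROUP blocks.
bits_nf4_dq = 4 + 8 / BLOCK_SIZE + 32 / (BLOCK_SIZE * DQ_GROUP)

for label, bits in [
    ("bfloat16", bits_bf16),
    ("NF4", bits_nf4),
    ("NF4 + double quantization", bits_nf4_dq),
]:
    gib = weight_memory_gib(N_PARAMS, bits)
    print(f"{label:<28} {bits:6.3f} bits/param  {gib:6.2f} GiB")
```

On a loaded model, `model.get_memory_footprint()` reports the actual size in bytes and is the number to trust over this estimate.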

In [3]:
messages = [
    {"role": "system", "content": "You are a trained medical echocardiography practitioner"},
    {"role": "user", "content": "Quantify Severe Mitral regurgitation using EROA, RgV and RgF"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# The terminators list holds the token IDs that stop generation.
# tokenizer.eos_token_id is the tokenizer's end-of-sequence token ID.
# tokenizer.convert_tokens_to_ids("<|eot_id|>") looks up the ID of "<|eot_id|>",
# the end-of-turn token in the Llama 3 chat format, so generation also stops when the assistant's turn ends.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# do_sample=True: sample the next token from the predicted distribution instead of using greedy decoding (which always picks the highest-probability token).
# temperature=0.6: rescales the logits before sampling. Values below 1 sharpen the distribution (more conservative output); values above 1 flatten it (more varied output).
# top_p=0.9: nucleus sampling. Each step samples only from the smallest set of most-probable tokens whose cumulative probability reaches 0.9, which cuts off the low-probability tail.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


As a trained medical echocardiography practitioner, I'd be happy to help quantify severe mitral regurgitation using EROA, RgV, and RgF.

EROA (Effective Regurgitant Orifice Area) is a measure of the size of the mitral regurgitant orifice, which is the area where blood flows backward into the left atrium during systole. A larger EROA indicates more severe mitral regurgitation.

RgV (Regurgitant Volume) is a measure of the volume of blood that regurgitates into the left atrium during systole. A higher RgV indicates more severe mitral regurgitation.

RgF (Regurgitant Fraction) is a measure of the proportion of left ventricular stroke volume that regurgitates into the left atrium during systole. A higher RgF indicates more severe mitral regurgitation.

Here are the normal and abnormal ranges for these parameters:

* EROA: < 0.2 cm² (normal), ≥ 0.4 cm² (severe)
* RgV: < 20 mL (normal), ≥ 40 mL (severe)
* RgF: < 20% (normal), ≥ 30% (severe)

In a patient with severe mitral regurgitation, the
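
The `temperature` and `top_p` settings used in the cell above can be illustrated on a toy distribution. The following is a conceptual sketch in plain Python, not the code that `model.generate` runs internally; the function names `nucleus` and `sample_token` and the four example logits are hypothetical.

```python
import math
import random


def nucleus(logits, temperature=0.6, top_p=0.9):
    """Return [(token_index, renormalized_probability), ...] for the top-p nucleus."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk the tokens from most to least probable and keep the smallest
    # prefix whose cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    return [(i, probs[i] / cumulative) for i in kept]


def sample_token(logits, temperature=0.6, top_p=0.9, rng=random):
    """Draw one token index from the nucleus."""
    candidates = nucleus(logits, temperature, top_p)
    indices = [i for i, _ in candidates]
    weights = [p for _, p in candidates]
    return rng.choices(indices, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.0, -3.0]  # hypothetical scores for a 4-token vocabulary
for t in (0.3, 1.0, 2.0):
    kept = [i for i, _ in nucleus(toy_logits, temperature=t, top_p=0.9)]
    print(f"temperature={t}: nucleus holds token indices {kept}")
```

With these toy logits, raising the temperature flattens the distribution, so more tokens are needed to reach the 0.9 cumulative probability and the nucleus grows.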

In [4]:
messages = [
    {"role": "system", "content": "You are a trained medical echocardiography practitioner"},
    {"role": "user", "content": "Quantify Severe Mitral regurgitation using EROA, RgV and RgF"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# The terminators list holds the token IDs that stop generation.
# tokenizer.eos_token_id is the tokenizer's end-of-sequence token ID.
# tokenizer.convert_tokens_to_ids("<|eot_id|>") looks up the ID of "<|eot_id|>",
# the end-of-turn token in the Llama 3 chat format, so generation also stops when the assistant's turn ends.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# do_sample=True: sample the next token from the predicted distribution instead of using greedy decoding (which always picks the highest-probability token).
# temperature=0.6: rescales the logits before sampling. Values below 1 sharpen the distribution (more conservative output); values above 1 flatten it (more varied output).
# top_p=0.9: nucleus sampling. Each step samples only from the smallest set of most-probable tokens whose cumulative probability reaches 0.9, which cuts off the low-probability tail.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


As a trained medical echocardiography practitioner, I'd be happy to help quantify severe mitral regurgitation using the EROA, RgV, and RgF indices.

**EROA (Effective Regurgitant Orifice Area)**:

EROA is a measure of the effective area of the mitral valve that allows regurgitant flow to occur. In severe mitral regurgitation, EROA is typically ≥ 0.4 cm².

**RgV (Regurgitant Volume)**:

RgV is the volume of blood that regurgitates through the mitral valve per beat. In severe mitral regurgitation, RgV is typically ≥ 47 mL/beat.

**RgF (Regurgitant Fraction)**:

RgF is the proportion of left ventricular stroke volume that is regurgitated through the mitral valve. In severe mitral regurgitation, RgF is typically ≥ 50%.

Using these indices, we can categorize the severity of mitral regurgitation as follows:

* Mild: EROA < 0.2 cm², RgV < 20 mL/beat, RgF < 30%
* Moderate: EROA 0.2-0.4 cm², RgV 20-47 mL/beat, RgF 30-50%
* Severe: EROA ≥ 0.4 cm², RgV ≥ 47 mL/beat, RgF ≥ 50%

In severe mitral r
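
The two runs above use the same prompt yet report different thresholds (for example RgV ≥ 40 mL in the first and ≥ 47 mL in the second), which is expected with `do_sample=True`. Both runs also print the attention-mask warning. One possible way to avoid repeating the cell and to pass the mask explicitly is a small wrapper like the one below. This is a sketch under assumptions: `generate_reply` is a hypothetical helper name, and it relies on `apply_chat_template` accepting `return_dict=True` so that the attention mask is returned alongside `input_ids`, which may depend on the installed `transformers` version.

```python
def generate_reply(model, tokenizer, messages, max_new_tokens=1024,
                   temperature=0.6, top_p=0.9):
    """Generate one sampled assistant reply for a list of chat messages."""
    # return_dict=True yields both input_ids and attention_mask.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    # Stop on either the end-of-sequence or the Llama 3 end-of-turn token.
    terminators = [
        tokenizer.eos_token_id,
        tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]

    outputs = model.generate(
        **inputs,  # passes input_ids and attention_mask together
        max_new_tokens=max_new_tokens,
        eos_token_id=terminators,
        pad_token_id=tokenizer.eos_token_id,  # set explicitly instead of relying on the fallback
        do_sample=True,
        temperature=temperature,
        top_p=top_p,
    )

    # Drop the prompt tokens and decode only the newly generated ones.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
```

With the `model` and `tokenizer` loaded earlier, a cell then reduces to `print(generate_reply(model, tokenizer, messages))`. If repeatable answers matter, calling `transformers.set_seed(...)` before generating, or switching to greedy decoding with `do_sample=False`, are the usual options.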

In [5]:
messages = [
    {"role": "system", "content": "You are a trained medical echocardiography practitioner"},
    {"role": "user", "content": "Criteria for MR jet for severe MR"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# The terminators list holds the token IDs that stop generation.
# tokenizer.eos_token_id is the tokenizer's end-of-sequence token ID.
# tokenizer.convert_tokens_to_ids("<|eot_id|>") looks up the ID of "<|eot_id|>",
# the end-of-turn token in the Llama 3 chat format, so generation also stops when the assistant's turn ends.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# do_sample=True: sample the next token from the predicted distribution instead of using greedy decoding (which always picks the highest-probability token).
# temperature=0.6: rescales the logits before sampling. Values below 1 sharpen the distribution (more conservative output); values above 1 flatten it (more varied output).
# top_p=0.9: nucleus sampling. Each step samples only from the smallest set of most-probable tokens whose cumulative probability reaches 0.9, which cuts off the low-probability tail.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


As a medical echocardiography practitioner, I can outline the criteria for a moderate or severe mitral regurgitation (MR) jet based on the American Society of Echocardiography (ASE) guidelines:

**Moderate MR (Grade 3-4):**

1. **Jet width**: The jet width is ≥ 4 mm in the left atrium (LA) or ≥ 2 mm in the pulmonary vein (PV) in the apical four-chamber view.
2. **Jet length**: The jet length is ≥ 2 cm in the LA or ≥ 1 cm in the PV in the apical four-chamber view.
3. **Flow convergence**: The flow convergence zone is ≥ 1 cm in the LA or ≥ 0.5 cm in the PV in the apical four-chamber view.
4. **Color Doppler signal**: The color Doppler signal is moderate to severe, with a high-velocity jet (> 2.5 m/s) in the LA or PV in the apical four-chamber view.

**Severe MR (Grade 4-5):**

1. **Jet width**: The jet width is ≥ 6 mm in the LA or ≥ 3 mm in the PV in the apical four-chamber view.
2. **Jet length**: The jet length is ≥ 3 cm in the LA or ≥ 1.5 cm in the PV in the apical four-chamber view.
