# **Medical LLM Fine-Tuning Notebook**

## **1. Setup & Installation**
```python
!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
!pip install trl==0.14.0 peft==0.14.0 xformers==0.0.28.post3
!pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
!pip install --upgrade datasets huggingface_hub evaluate wandb
```
**Why?**  
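- **Unsloth**: Patches the model for faster LoRA fine-tuning (the library advertises roughly 2x)  
- **TRL/PEFT**: Provide the SFT trainer and parameter-efficient fine-tuning (LoRA)  
- **Torch 2.5.1**: Pinned together with xformers 0.0.28.post3, which is built against this Torch release  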
- **WandB**: Tracks training metrics  
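
Because the second `pip install` force-reinstalls Unsloth from GitHub, it can be worth confirming that the pinned versions survived. The sketch below uses only the standard library; the `PINNED` dictionary and the `check_pins` helper are illustrative names, not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cell above (assumed to be the intended versions)
PINNED = {
    "trl": "0.14.0",
    "peft": "0.14.0",
    "xformers": "0.0.28.post3",
    "torch": "2.5.1",
}

def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        # torch reports a local suffix such as "2.5.1+cu124"; compare the public part
        ok = found is not None and found.split("+")[0] == wanted
        report[name] = (found, ok)
    return report

for pkg, (found, ok) in check_pins(PINNED).items():
    print(f"{pkg}: installed={found} matches_pin={ok}")
```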

---

## **2. Initialize Model**
```python
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,
    load_in_4bit=True,  # Quantization for memory efficiency
    token=hf_token
)
```
**Key Choices:**  
- **4-bit Loading**: Cuts weight memory by roughly 75% compared with 16-bit weights  
- **2048 seq_len**: Accommodates long medical reasoning chains  
- **DeepSeek-R1**: Strong base for clinical reasoning  
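
The ~75% figure follows from simple arithmetic on the weights alone. The sketch below assumes a round 8 billion parameters and ignores activations, the KV cache, quantization constants, and any layers kept in higher precision, so real usage will be somewhat higher.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Memory needed to store the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 8e9  # assumed round figure for an 8B model

fp16_gb = weight_memory_gb(N_PARAMS, 16)
int4_gb = weight_memory_gb(N_PARAMS, 4)
saving = 1 - int4_gb / fp16_gb

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
print(f"Reduction:      {saving:.0%}")
```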

---

## **3. Pre-Fine-Tuning Test**
```python
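def generate_response(question):
    # Build the prompt in the same layout used for training (see section 4)
    prompt = f"### Clinical Case: {question}\n### Analysis: <think>"
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)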

test_case = "A 61-year-old woman with urinary incontinence..."
pre_tuning_response = generate_response(test_case)
```
**Purpose:**  
Establishes baseline performance before fine-tuning  

---

## **4. Dataset Preparation**
```python
from datasets import load_dataset

# "en" selects the English configuration of the dataset
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train[:500]")
dataset = dataset.train_test_split(test_size=0.1)

def format_prompt(example):
    return f"""
### Clinical Case: {example['Question']}
### Analysis: <think>{example['Complex_CoT']}</think>
### Diagnosis: {example['Response']}
"""
```
**Why This Format?**  
- **Structured prompts**: Give the model one consistent layout for case, reasoning, and answer  
- **Chain-of-Thought (CoT)**: Trains the model to write its step-by-step analysis inside `<think>` tags before answering  
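
To make the layout concrete, here is a minimal sketch that applies the same three-part template to a made-up record. The `sample` values are hypothetical, and the optional `eos_token` argument is an addition: appending the tokenizer's end-of-sequence token is a common way to teach the model where an answer stops, but the original cell does not do it.

```python
def format_prompt(example, eos_token=""):
    """Render one record as case / reasoning / answer, optionally ending with EOS."""
    return (
        f"### Clinical Case: {example['Question']}\n"
        f"### Analysis: <think>{example['Complex_CoT']}</think>\n"
        f"### Diagnosis: {example['Response']}{eos_token}"
    )

# Hypothetical record with the same column names as the dataset
sample = {
    "Question": "Example question text",
    "Complex_CoT": "Example reasoning text",
    "Response": "Example answer text",
}

text = format_prompt(sample, eos_token="</s>")  # "</s>" stands in for tokenizer.eos_token
print(text)
```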

---

## **5. LoRA Configuration**
```python
model_lora = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # Attention and MLP projections
    lora_alpha=16,
    lora_dropout=0  # Disabled for Unsloth optimization
)
```
**LoRA Parameters Explained:**  
- **r=16**: Balance between adaptability and overfitting  
- **Target modules**: Where LoRA adapters are injected  
- **alpha=16**: Scales the LoRA update by alpha/r, which is 1.0 here  
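
The effect of `r` and the target modules on adapter size can be checked by hand: each adapted linear layer adds `r * (in_features + out_features)` parameters. The sketch below assumes the standard Llama-3.1-8B shapes (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 layers); these numbers are not stated in the notebook, so verify them against the model's `config.json`. The count for all seven projections comes out to the 41,943,040 trainable parameters reported in the training log.

```python
# Assumed Llama-3.1-8B shapes (check config.json before relying on them)
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 1024  # 8 key/value heads x 128 head_dim
N_LAYERS = 32

SHAPES = {  # module name -> (in_features, out_features)
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_param_count(target_modules, r):
    """Each adapter adds A (r x in) and B (out x r), i.e. r * (in + out) weights."""
    per_layer = sum(r * (SHAPES[m][0] + SHAPES[m][1]) for m in target_modules)
    return per_layer * N_LAYERS

def lora_scale(lora_alpha, r):
    """Factor applied to the low-rank update before it is added to the frozen weight."""
    return lora_alpha / r

attention_only = ["q_proj", "k_proj", "v_proj", "o_proj"]
all_linear = attention_only + ["gate_proj", "up_proj", "down_proj"]

print(lora_param_count(attention_only, r=16))  # 13631488
print(lora_param_count(all_linear, r=16))      # 41943040
print(lora_scale(lora_alpha=16, r=16))         # 1.0
```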

---

## **6. Training Setup**
```python
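from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="./medical-lora",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,  # Effective batch size = 2 x 4 = 8
    learning_rate=2e-4,  # Common starting point for LoRA
    eval_strategy="steps",
    eval_steps=50  # Validate every 50 steps
)

trainer = SFTTrainer(
    model=model_lora,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],  # Needed because eval_strategy="steps"
    formatting_func=format_prompt,
    args=training_args
)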
```
**Training Strategy:**  
- Small batch size → Fits on Colab's free Tesla T4 (about 15 GB of VRAM)  
- Gradient accumulation → Simulates larger batches  
- Frequent eval → Catch overfitting early  
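
These settings also determine how many optimizer steps the run takes and how often evaluation fires. The sketch below assumes 450 training examples (500 minus the 10% validation split), one epoch, and one GPU. It uses floor division for the accumulation step count, which reproduces the 56 total steps shown in the training log; other library versions may round differently.

```python
import math

def optimizer_steps(n_examples, per_device_batch, grad_accum, epochs=1, n_devices=1):
    """Approximate number of optimizer updates for a run."""
    batches_per_epoch = math.ceil(n_examples / (per_device_batch * n_devices))
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return steps_per_epoch * epochs

effective_batch = 2 * 4 * 1             # per-device batch x accumulation x GPUs
steps = optimizer_steps(450, per_device_batch=2, grad_accum=4)
evals = steps // 50                     # evaluations triggered by eval_steps=50

print(effective_batch, steps, evals)    # 8 56 1
```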

---

## **7. Post-Training Evaluation**
```python
post_tuning_response = generate_response(test_case)

def compare_responses(question, pre, post):
    print(f"PRE: {pre.split('Diagnosis:')[-1]}")
    print(f"POST: {post.split('Diagnosis:')[-1]}")
```
**Expected Improvements:**  
1. More accurate diagnoses  
2. Better explanation of pathophysiology  
3. Reduced hallucinations  

---

## **8. Gradio Deployment**
```python
import gradio as gr

demo = gr.Interface(
    fn=generate_response,
    inputs=gr.Textbox(lines=5),
    outputs="text",
    examples=[test_case]
)
demo.launch()
```
**Deployment Notes:**  
- Runs inside the Colab runtime  
- Add `share=True` for temporary public link  

---

## **Key Takeaways**
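1. **LoRA Efficiency**: Fine-tunes on only 500 examples (450 train, 50 validation) while updating about 0.52% of the weights  
2. **Unsloth Benefits**: Advertised as roughly 2x faster than standard PEFT training (not benchmarked in this notebook)  
3. **Medical Specialization**: After tuning, the model names stress urinary incontinence for the test case, which the base model did not  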
4. **Quantization**: Enables fine-tuning on consumer GPUs  


In [1]:
#install required dependencies
!pip install unsloth # install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
!pip install trl==0.14.0 peft==0.14.0 xformers==0.0.28.post3
!pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124


In [1]:
!pip install --upgrade datasets huggingface_hub
!pip install -qU evaluate rouge_score




In [2]:
from unsloth import FastLanguageModel
import torch
from trl import SFTTrainer
from unsloth import is_bfloat16_supported
from huggingface_hub import login
from transformers import TrainingArguments
from datasets import load_dataset
import wandb

Unsloth: Patching Xformers to fix some performance issues.
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


In [3]:
from google.colab import userdata
hf_token = userdata.get('HF_TOKEN')
login(hf_token)

In [4]:
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Configuration
model_name = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
max_seq_length = 2048  # For long medical reasoning chains

# Load 4-bit quantized model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=None,
    load_in_4bit=True,
    token=hf_token  # Replace with your HF token
)

# Set pad token if missing
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

==((====))==  Unsloth 2025.4.8: Fast Llama patching. Transformers: 4.51.3.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!



In [7]:
def generate_response(question):
    prompt = f"""
### Clinical Case:
{question}

### Step-by-Step Analysis:
<think>
"""
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True).to("cuda")
    streamer = TextStreamer(tokenizer)

    outputs = model.generate(
        **inputs,
        max_new_tokens=1200,
        do_sample=True,
        temperature=0.7,
        streamer=streamer,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

In [8]:
### Test Case 1: Urinary Incontinence
# *A classic clinical reasoning challenge*

# %%
question = """A 61-year-old woman presents with involuntary urine leakage when coughing or sneezing,
but no nighttime symptoms. Gynecological exam shows a hypermobile urethra.
What is the most likely diagnosis and what would cystometry show?"""

print("=== Pre-Fine-Tuning Response ===")
pre_tuning_response = generate_response(question)


=== Pre-Fine-Tuning Response ===
<｜begin▁of▁sentence｜>
### Clinical Case:
A 61-year-old woman presents with involuntary urine leakage when coughing or sneezing, 
but no nighttime symptoms. Gynecological exam shows a hypermobile urethra. 
What is the most likely diagnosis and what would cystometry show?

### Step-by-Step Analysis:
<think>
Okay, so I'm trying to figure out this clinical case. Let's break it down step by step. 

The patient is a 61-year-old woman who has involuntary urine leakage when she coughs or sneezes, but she doesn't have any nighttime symptoms. During her gynecological exam, they noticed that her urethra is hypermobile. I need to figure out what the most likely diagnosis is and what the cystometry would show.

First, I'll think about possible conditions that cause involuntary urine leakage during coughing or sneezing. I remember that when you cough or sneeze, you can sometimes feel a small urethral nick, which is a common, usually minor issue. But since this is hap

In [12]:
from datasets import load_dataset

# Load dataset
dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT", "en",
    split="train[:500]",  # First 500 examples
    trust_remote_code=True
)

# Split into train/validation
dataset = dataset.train_test_split(test_size=0.1, seed=42)

# Formatting function for SINGLE example
def format_prompt(example):
    return f"""
### Clinical Case:
{example['Question']}

### Step-by-Step Analysis:
<think>
{example['Complex_CoT']}
</think>

### Final Assessment:
{example['Response']}
"""

# Apply formatting to ENTIRE dataset
def preprocess_function(batch):
    return {"text": [format_prompt({
        "Question": q,
        "Complex_CoT": cot,
        "Response": r
    }) for q, cot, r in zip(
        batch["Question"],
        batch["Complex_CoT"],
        batch["Response"]
    )]}

# Map the formatting across all splits
dataset = dataset.map(
    preprocess_function,
    batched=True,
    remove_columns=dataset["train"].column_names  # Remove original columns
)

# Preview
print(dataset["train"][0]["text"])  # Now shows fully formatted example



### Clinical Case:
A 4-year-old child presents with symptoms including hoarseness, a croupy cough, aphonia, dyspnea, and wheezing. Considering these symptoms, what is the most probable diagnosis?

### Step-by-Step Analysis:
<think>
Alright, let's see what we have here. A 4-year-old kid showing up with that rough, bark-like cough and hoarseness? That definitely rings a bell for croup, doesn't it? Ah, croup is so common at this age, especially if it's fall or winter when those viruses love to spread.

But wait, there's more going on here. Aphonia? That's unusual for croup. You typically get the barking cough, maybe a little hoarseness, but losing the voice? That needs a closer look.

Now, there's also dyspnea and wheezing thrown into the mix. Hmm, wheezing makes me think lower in the airway. This could be a sign of something else entirely. Croup usually brings stridor more so than wheezing.

Other conditions come to mind. Could it be epiglottitis? But that usually hits with a high fever

In [17]:
# Attach LoRA adapters to the 4-bit base model. If this cell is re-run after
# adapters were already attached, Unsloth skips the step (see the log below).
model_lora = FastLanguageModel.get_peft_model(
    model=model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3047,
    use_rslora=False,
    loftq_config=None
)

# Training arguments
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./medical-lora",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    warmup_steps=10,
    num_train_epochs=1,
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=10,
    eval_strategy="steps",  # Changed from evaluation_strategy
    eval_steps=50,
    save_strategy="steps",
    save_steps=100,
    report_to="wandb"
)

# Initialize trainer
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model_lora,  # Train the adapter-wrapped model returned above
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    args=training_args
)

Unsloth: Already have LoRA adapters! We shall skip this step.



In [18]:
import wandb
wandb.init(project="medical-llm-finetuning")

# Start training
trainer.train()

# Save model
model_lora.save_pretrained("medical-deepseek-lora")  # Saves the LoRA adapter weights
tokenizer.save_pretrained("medical-deepseek-lora")

wandb.finish()



==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 450 | Num Epochs = 1 | Total steps = 56
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040/8,000,000,000 (0.52% trained)


Unsloth: Will smartly offload gradients to save VRAM!


| Step | Training Loss | Validation Loss |
|---|---|---|
| 50 | 1.5292 | 1.499543 |


Unsloth: Not an error, but LlamaForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient


W&B run summary:

| Metric | Value |
|---|---|
| eval/loss | 1.49954 |
| eval/runtime | 41.6807 |
| eval/samples_per_second | 1.2 |
| eval/steps_per_second | 0.6 |
| total_flos | 1.4210510401929216e+16 |
| train/epoch | 0.99556 |
| train/global_step | 56.0 |
| train/grad_norm | 0.27069 |
| train/learning_rate | 3e-05 |
| train/loss | 1.5292 |


In [19]:
# Reload model for inference
model, tokenizer = FastLanguageModel.from_pretrained(
    "medical-deepseek-lora",
    max_seq_length=max_seq_length,
    dtype=None,
    load_in_4bit=True,
)




Unsloth 2025.4.8 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


In [20]:
### Test Case 1 Revisited: Urinary Incontinence
# Let's compare responses to the same question

print("=== Post-Fine-Tuning Response ===")
post_tuning_response = generate_response(question)


=== Post-Fine-Tuning Response ===
<｜begin▁of▁sentence｜>
### Clinical Case:
A 61-year-old woman presents with involuntary urine leakage when coughing or sneezing, 
but no nighttime symptoms. Gynecological exam shows a hypermobile urethra. 
What is the most likely diagnosis and what would cystometry show?

### Step-by-Step Analysis:
<think>
Okay, so we've got this 61-year-old woman who's been having some problems with her bladder. Specifically, whenever she coughs or sneezes, she ends up peeing involuntarily. That sounds like a pretty specific symptom pattern. It's not just about going to the bathroom; it's more about an unexpected urge to pee when she does something like coughing or sneezing.

Now, let's think about what this could mean. Involuntary urine leakage when coughing or sneezing is pretty classic for something called stress urinary incontinence. This is a common issue, especially among older women, and it usually involves the urethra not being able to hold up against the press

In [21]:
question2 = """A 42-year-old IV drug user presents with fever, shortness of breath,
and a new murmur. Blood cultures grow gram-positive cocci in clusters.
Echocardiography shows vegetations on the tricuspid valve.
What is the most likely organism and why is this valve involved?"""

print("=== Post-Fine-Tuning Response (New Case) ===")
post_tuning_response2 = generate_response(question2)


=== Post-Fine-Tuning Response (New Case) ===
<｜begin▁of▁sentence｜>
### Clinical Case:
A 42-year-old IV drug user presents with fever, shortness of breath, 
and a new murmur. Blood cultures grow gram-positive cocci in clusters. 
Echocardiography shows vegetations on the tricuspid valve. 
What is the most likely organism and why is this valve involved?

### Step-by-Step Analysis:
<think>
Alright, let's think through this. We've got a 42-year-old IV drug user who's got a fever, shortness of breath, and a new murmur. Sounds like a lot going on. The blood cultures showed gram-positive cocci in clusters. Hmm, gram-positive cocci—could be a few different types, but one thing's for sure, they're likely to be the culprit.

Now, the echocardiography showing vegetations on the tricuspid valve is interesting. That means the infection has likely taken hold there. I remember that tricuspid valve endocarditis is a common thing in IV drug users. These infections often start in the tricuspid valve and 

In [22]:
from IPython.display import display, Markdown

def compare_responses(question, pre, post):
    display(Markdown(f"""
### **Question**:
{question}

#### **Pre-Fine-Tuning**:
```text
{pre.split('### Final Assessment:')[-1].strip()}
```

#### **Post-Fine-Tuning**:
```text
{post.split('### Final Assessment:')[-1].strip()}
```
"""))

compare_responses(question, pre_tuning_response, post_tuning_response)


### **Question**:  
A 61-year-old woman presents with involuntary urine leakage when coughing or sneezing, 
but no nighttime symptoms. Gynecological exam shows a hypermobile urethra. 
What is the most likely diagnosis and what would cystometry show?

#### **Pre-Fine-Tuning**:  
```text
### Clinical Case:
A 61-year-old woman presents with involuntary urine leakage when coughing or sneezing, 
but no nighttime symptoms. Gynecological exam shows a hypermobile urethra. 
What is the most likely diagnosis and what would cystometry show?

### Step-by-Step Analysis:
<think>
Okay, so I'm trying to figure out this clinical case. Let's break it down step by step. 

The patient is a 61-year-old woman who has involuntary urine leakage when she coughs or sneezes, but she doesn't have any nighttime symptoms. During her gynecological exam, they noticed that her urethra is hypermobile. I need to figure out what the most likely diagnosis is and what the cystometry would show.

First, I'll think about possible conditions that cause involuntary urine leakage during coughing or sneezing. I remember that when you cough or sneeze, you can sometimes feel a small urethral nick, which is a common, usually minor issue. But since this is happening involuntarily, maybe it's something more serious.

I know that the urethra is a muscle sphincter, so if there's a problem with its function, it might lead to incontinence. The hypermobile urethra mentioned in the exam could mean that the urethral sphincter is not closing properly. That makes me think of a condition like urethral sphincter insufficiency or urethral incompetence.

I also recall that in women, another possible cause could be a congenital issue, maybe something that wasn't fully repaired after childhood, but at 61, that's less likely. Alternatively, it could be related to aging, like atrophy, but atrophy usually affects the bladder and possibly the urethra as well. However, the specific issue here is more about the sphincter function.

Cystometry, which is a test to measure bladder pressure and sphincter resistance, is key here. If the urethral sphincter is weak, the cystometry would show low resistance. There might also be a urethral leak during filling, which would indicate that the sphincter isn't properly closing to prevent urination.

I think the most likely diagnosis is urethral sphincter insufficiency, which is a type of urinary incontinence where the urethral sphincter doesn't properly close, leading to involuntary leakage. This condition is more common in women and can be caused by childbirth, aging, or other trauma.

So, putting it all together, the likely diagnosis is urethral sphincter insufficiency, and the cystometry would show low sphincter resistance and possibly a leak during filling, confirming the insufficiency.
</think>

**Clinical Case Analysis:**

The 61-year-old woman presents with involuntary urine leakage upon coughing or sneezing, without nighttime symptoms. During the gynecological exam, a hypermobile urethra was noted. 

**Diagnosis:**
The most likely diagnosis is **urethral sphincter insufficiency**, a condition characterized by impaired closure of the urethral sphincter, leading to urinary incontinence, particularly during activities that increase intra-abdominal pressure, such as coughing or sneezing.

**Cystometry Findings:**
Cystometry would reveal:
- **Low sphincter resistance** indicating impaired sphincter function.
- Possible evidence of a **urethral leak** during filling, further supporting the diagnosis of sphincter insufficiency.

**Conclusion:**
The condition is consistent with urethral sphincter insufficiency, confirmed by the cystometric findings of low sphincter resistance and urethral leakage.
```

#### **Post-Fine-Tuning**:  
```text
The most likely diagnosis for a 61-year-old woman presenting with involuntary urine leakage when coughing or sneezing, but no nighttime symptoms, is stress urinary incontinence. This condition is characterized by the inability of the urethra to withstand increased pressure from activities like coughing or sneezing, leading to accidental urine leakage.

In a cystometric examination, if the diagnosis is confirmed, the test would typically show a low compliance pressure. This indicates that the bladder cannot withstand pressure changes and thus easily leaks when pressure is applied, aligning with the symptoms and physical findings described. The test would also show that the urethra is not functioning well, supporting the diagnosis of stress urinary incontinence.
```
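
In the comparison above, the pre-fine-tuning output never contains the `### Final Assessment:` marker, so `split(...)[-1]` returns the whole generation, prompt and reasoning included. One possible fix is to fall back to the text after `</think>` when the marker is missing. The helper name `extract_final_answer` is hypothetical and not part of the original notebook.

```python
def extract_final_answer(text, marker="### Final Assessment:"):
    """Return the answer portion of a generation.

    Prefers the text after `marker`; if the marker is absent, falls back to
    the text after the closing </think> tag; otherwise returns the input.
    """
    if marker in text:
        return text.split(marker, 1)[1].strip()
    if "</think>" in text:
        return text.split("</think>", 1)[1].strip()
    return text.strip()

tuned = "<think>reasoning</think>\n\n### Final Assessment:\nStress urinary incontinence."
base = "<think>reasoning</think>\n\n**Diagnosis:** urethral sphincter insufficiency"

print(extract_final_answer(tuned))  # Stress urinary incontinence.
print(extract_final_answer(base))   # **Diagnosis:** urethral sphincter insufficiency
```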
