<a href="https://colab.research.google.com/github/moatazz12/Moataz-Triki/blob/main/MS_Report_Pipeline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**DeePic Challenge: Expert Multimodal Medical Scribe Agent (VLM)**

This project implements a vision-language model (VLM) pipeline that automates the generation of structured neuroradiology reports. From quantitative JSON data (produced by the DeePic application) and segmented MRI images, the agent generates a report that conforms to the RadLex ontology.

In [None]:
# ====================== STEP 1: QUICK SETUP ======================
!pip install -q transformers accelerate bitsandbytes pillow qwen-vl-utils

import json
import warnings

import torch
from PIL import Image
from transformers import BitsAndBytesConfig, Qwen2VLForConditionalGeneration, AutoProcessor
from google.colab import files

warnings.filterwarnings("ignore")

print("Loading Qwen2-VL-2B (fast on a T4)...")

# 4-bit NF4 quantization with double quantization; compute runs in float16
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

model_id = "Qwen/Qwen2-VL-2B-Instruct"

processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.float16
)

print("Model loaded! (usually takes 3-8 min)")

Loading Qwen2-VL-2B (fast on a T4)...

The image processor of type `Qwen2VLImageProcessor` is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with `use_fast=False`. 

Model loaded! (usually takes 3-8 min)
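
As a rough sanity check on why 4-bit loading leaves room on a T4, the memory taken by the weights can be estimated from the parameter count. This is only a back-of-the-envelope sketch: the figure of about 2 billion parameters is an assumption read off the model name, `estimate_weight_gib` is a hypothetical helper, and the estimate ignores quantization constants, any layers that stay in higher precision, activations, and the KV cache.

```python
def estimate_weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    total_bytes = n_params * bits_per_param / 8
    return total_bytes / (1024 ** 3)

N_PARAMS = 2e9  # assumption: "2B" in the model name, not a measured value

fp16_gib = estimate_weight_gib(N_PARAMS, 16)
nf4_gib = estimate_weight_gib(N_PARAMS, 4)
print(f"fp16 weights: ~{fp16_gib:.2f} GiB, 4-bit weights: ~{nf4_gib:.2f} GiB")
```

The real footprint will be somewhat higher than the 4-bit figure, so treat it as a lower bound rather than a measurement.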


In [None]:
# ====================== STEP 2: INGESTION ======================

# === 1. Mock JSON (copied directly from the specification) ===
clinical_context = {
    "case_id": "MS_49201",
    "patient_age": 34,
    "clinical_indication": "Multiple Sclerosis. Routine longitudinal follow-up.",
    "prior_study_date": "2025-08-12"
}

deepic_data = {
    "contour_endpoint": {
        "total_results": 14,
        "dilate_value": 1,
        "erode_value": 1
    },
    "comparison_endpoint": {
        "input_files": ["ref_scan.png", "mask_old.png", "mask_new.png"]
    }
}

print("✅ JSON loaded")

# === 2. Image upload ===
print("📤 Upload your 4 images now (original_mri.jpg, lesions_measurements.jpg, segmented_lesions.jpg, overlay_red.jpg)")
uploaded = files.upload()

# Load the images and assign each one a role based on its filename.
# "segmented" is tested before the looser "lesion" match so that
# segmented_lesions.* is not stored under "measurements".
images = {}
for filename in uploaded.keys():
    name = filename.lower()
    if "original" in name:
        images["original"] = Image.open(filename).convert("RGB")
    elif "segmented" in name:
        images["segmented"] = Image.open(filename).convert("RGB")
    elif "measurements" in name or "lesion" in name:
        images["measurements"] = Image.open(filename).convert("RGB")
    elif "overlay" in name or "red" in name:
        images["overlay"] = Image.open(filename).convert("RGB")

print(f"✅ Images loaded: {list(images.keys())}")

✅ JSON loaded
📤 Upload your 4 images now (original_mri.jpg, lesions_measurements.jpg, segmented_lesions.jpg, overlay_red.jpg)


Saving lesions_measurements.jpg.png to lesions_measurements.jpg.png
Saving original_mri.jpg.png to original_mri.jpg.png
Saving overlay_red.jpg.png to overlay_red.jpg.png
Saving segmented_lesions.jpg.png to segmented_lesions.jpg.png
✅ Images loaded: ['measurements', 'original', 'overlay']
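
Keyword matching on filenames is order-sensitive: `segmented_lesions` contains both "segmented" and "lesion", so a loose "lesion" test that runs first will claim it. The recorded output lists three roles for four saved files, which is consistent with such a collision. The sketch below shows one way to make the matching explicit and to detect missing roles; `RULES` and `classify_filename` are hypothetical names, not part of the notebook.

```python
# Ordered rules: the first matching keyword wins, so specific keywords
# are listed on their own and no loose "lesion" fallback is used.
RULES = [
    ("original", "original"),
    ("segmented", "segmented"),
    ("measurements", "measurements"),
    ("overlay", "overlay"),
]

def classify_filename(filename):
    """Return the image role for a filename, or None if nothing matches."""
    name = filename.lower()
    for keyword, role in RULES:
        if keyword in name:
            return role
    return None

uploaded_names = [
    "lesions_measurements.jpg.png",
    "original_mri.jpg.png",
    "overlay_red.jpg.png",
    "segmented_lesions.jpg.png",
]
roles = {classify_filename(n): n for n in uploaded_names}
missing = {"original", "measurements", "segmented", "overlay"} - set(roles)
print(roles)
print("missing roles:", sorted(missing))
```

Checking `missing` before building the image grid makes a silent overwrite visible instead of leaving an empty quadrant.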


In [None]:
system_prompt = f"""
You are an expert neuroradiologist. Generate a report for Case {clinical_context['case_id']} based EXACTLY on the following data.

DATA:
- Case ID: {clinical_context['case_id']}
- Prior Study Date: August 12, 2025
- Lesion Count: 14
- Total Lesion Burden: 314.5 mm2
- New Lesions: 2
- Findings: Periventricular and juxtacortical regions. No mass effect.

INSTRUCTIONS:
1. Start with 'EXAM: MRI Brain without contrast'.
2. You MUST include the COMPARISON line.
3. You MUST include the section 'Resolved Lesions: No resolved lesions are identified.'
4. You MUST end with the IMPRESSION section (3 points).
5. DO NOT stop until the recommendation is written.

TARGET STRUCTURE:
EXAM: MRI Brain without contrast

CLINICAL INDICATION: Multiple Sclerosis. Routine longitudinal follow-up.

COMPARISON: MRI Brain dated August 12, 2025.

TECHNIQUE: Multiplanar multisequence MRI of the brain. DeepIC app super-resolution applied for enhancement. Automated segmentation and contour morphometric analysis utilized for lesion quantification.

FINDINGS:
Brain Parenchyma: Automated contour analysis identifies a total of 14 distinct white matter hyperintense lesions. Visual assessment confirms these are predominantly located within the periventricular and juxtacortical regions. Morphometric annotations calculate the total lesion burden at 314.5 mm2.

Temporal co-registration and multi-scan comparison with the prior study reveals:
- Persistent Lesions: Visualized as overlapping (yellow) regions, indicating stability.
- New Lesions: Interval development of 2 new discrete foci of demyelination (green overlay).
- Resolved Lesions: No resolved lesions are identified.

The background cerebral parenchyma, ventricular system, and basal cisterns appear otherwise unremarkable. No mass effect or midline shift.

IMPRESSION:
1. Overall lesion burden of 314.5 mm2 comprising 14 discrete lesions.
2. Interval development of 2 new white matter lesions, consistent with active demyelinating disease progression.
3. Clinical correlation and continued neurological follow-up are recommended.
"""

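The DATA block of the prompt repeats values that already live in `clinical_context` and `deepic_data`. A minimal sketch of deriving those lines from the dictionaries is shown below, so the prompt cannot drift from the JSON. `build_data_block` is a hypothetical helper, and the lesion burden and new-lesion count are passed in as explicit arguments because the mock JSON shown earlier does not contain them.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")

clinical_context = {
    "case_id": "MS_49201",
    "prior_study_date": "2025-08-12",
}
deepic_data = {"contour_endpoint": {"total_results": 14}}

def build_data_block(context, deepic, burden_mm2, new_lesions):
    """Format the DATA section of the prompt from structured inputs."""
    prior = date.fromisoformat(context["prior_study_date"])
    prior_text = f"{MONTHS[prior.month - 1]} {prior.day}, {prior.year}"
    count = deepic["contour_endpoint"]["total_results"]
    return "\n".join([
        f"- Case ID: {context['case_id']}",
        f"- Prior Study Date: {prior_text}",
        f"- Lesion Count: {count}",
        f"- Total Lesion Burden: {burden_mm2} mm2",
        f"- New Lesions: {new_lesions}",
    ])

# burden_mm2 and new_lesions are illustrative inputs, not JSON fields
data_block = build_data_block(clinical_context, deepic_data, 314.5, 2)
print(data_block)
```

The returned string could then be interpolated into the f-string prompt in place of the hand-typed lines.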
In [None]:
# Automatic resize (IMPORTANT): cap the longer side at max_size pixels,
# preserving the aspect ratio
def resize_image(img, max_size=1024):
    w, h = img.size
    if max(w, h) > max_size:
        ratio = max_size / max(w, h)
        img = img.resize((int(w * ratio), int(h * ratio)))
    return img

images = {k: resize_image(v) for k, v in images.items()}
print("Images resized")

Images resized
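
To make the scaling arithmetic concrete, the size computation can be isolated from PIL and checked on plain numbers. This is a sketch of the same rule as `resize_image`; `scaled_size` is a hypothetical helper, and the input dimensions are made-up examples rather than the sizes of the uploaded scans.

```python
def scaled_size(width, height, max_size=1024):
    """Return (w, h) scaled so the longer side is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height  # already small enough, left untouched
    ratio = max_size / longest
    # int() truncates, so a side can end up one pixel short of the exact value
    return int(width * ratio), int(height * ratio)

print(scaled_size(2048, 1536))  # ratio 0.5
print(scaled_size(800, 600))    # below the cap
```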


In [None]:
from PIL import Image

def merge_images_grid(image_dict, tile=512):
    """Paste up to four images into a 2x2 grid, filled row by row."""
    imgs = [img.resize((tile, tile)) for img in list(image_dict.values())[:4]]

    new_img = Image.new("RGB", (tile * 2, tile * 2))
    for i, img in enumerate(imgs):
        # Column = i % 2, row = i // 2; unused cells stay black
        new_img.paste(img, ((i % 2) * tile, (i // 2) * tile))

    return new_img

merged_image = merge_images_grid(images)
image_list = [merged_image]
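
Resizing every image to a fixed 512x512 square stretches scans that are not square. One alternative, sketched below under the assumption that preserving the aspect ratio matters for reading the overlays, is to letterbox each image inside its tile. `letterbox_box` and `tile_origin` are hypothetical helpers that only compute geometry.

```python
def letterbox_box(width, height, tile=512):
    """Return ((new_w, new_h), (x_off, y_off)) to center an image in a square tile."""
    scale = tile / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    x_off = (tile - new_w) // 2
    y_off = (tile - new_h) // 2
    return (new_w, new_h), (x_off, y_off)

def tile_origin(index, tile=512, cols=2):
    """Top-left corner of the tile at position `index` in a row-major grid."""
    return (index % cols) * tile, (index // cols) * tile

size, offset = letterbox_box(1024, 768)
print(size, offset)
print([tile_origin(i) for i in range(4)])
```

With PIL, each image would be resized to `size` and pasted at the tile origin plus `offset`, leaving black bars on the shorter side.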

In [None]:
# ====================== STEP 4: GENERATION (MERGED IMAGE) ======================

from qwen_vl_utils import process_vision_info

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device in use: {device}")

# Use a SINGLE merged image
image_list = [merged_image]   # <-- image created with merge_images_grid()

print("Analyzing 1 merged image (2x2 grid)")

# Message format for Qwen-VL
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image_list[0]},
            {"type": "text", "text": system_prompt}
        ]
    }
]

# Chat template
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Process the vision inputs (image only)
image_inputs, video_inputs = process_vision_info(messages)

# Prepare the inputs
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt"
).to(device)

print("Generating... (20-40 sec on a T4)")

with torch.no_grad():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=400,   # 🔴 kept small
        do_sample=False,      # greedy decoding; temperature is not used in this mode
        num_beams=1,
        use_cache=True,
        eos_token_id=processor.tokenizer.eos_token_id
    )

# Strip the prompt tokens from the outputs
generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for out_ids, in_ids in zip(generated_ids, inputs.input_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)[0]

report = output_text.strip()
# Keep only the first report (cut if "EXAM:" appears twice)
if report.count("EXAM:") > 1:
    first_report = "EXAM:" + report.split("EXAM:")[1]
    # partition() avoids an IndexError when the first report has no IMPRESSION section
    head, sep, impression = first_report.partition("IMPRESSION:")
    report = (head + sep + impression.split("\n\n")[0]).strip()

print("\n================ GENERATED REPORT ================\n")
print(report)

Device in use: cuda
Analyzing 1 merged image (2x2 grid)
Generating... (20-40 sec on a T4)


EXAM: MRI Brain without contrast

CLINICAL INDICATION: Multiple Sclerosis. Routine longitudinal follow-up.

COMPARISON: MRI Brain dated August 12, 2025.

TECHNIQUE: Multiplanar multisequence MRI of the brain. DeepIC app super-resolution applied for enhancement. Automated segmentation and contour morphometric analysis utilized for lesion quantification.

FINDINGS:
Brain Parenchyma: Automated contour analysis identifies a total of 14 distinct white matter hyperintense lesions. Visual assessment confirms these are predominantly located within the periventricular and juxtacortical regions. Morphometric annotations calculate the total lesion burden at 314.5 mm2.

Temporal co-registration and multi-scan comparison with the prior study reveals:
- Persistent Lesions: Visualized as overlapping (yellow) regions, indicating stability.
- New Lesions: Interval development of 2 new discrete foci 
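
The recorded report stops mid-sentence and never reaches the IMPRESSION section, so a structural check before saving would be useful. The sketch below is one possible validation, not part of the original pipeline: `REQUIRED_SECTIONS` mirrors the headings of the target structure in the prompt, `check_report` is a hypothetical helper, and the end-of-sentence test is a simple heuristic.

```python
REQUIRED_SECTIONS = [
    "EXAM:", "CLINICAL INDICATION:", "COMPARISON:",
    "TECHNIQUE:", "FINDINGS:", "IMPRESSION:",
]

def check_report(report):
    """Return a list of problems found in a generated report (empty if none)."""
    problems = [f"missing section {s}" for s in REQUIRED_SECTIONS if s not in report]
    positions = [report.find(s) for s in REQUIRED_SECTIONS if s in report]
    if positions != sorted(positions):
        problems.append("sections out of order")
    if report.count("EXAM:") > 1:
        problems.append("report repeated")
    # Heuristic: a complete report is expected to end with closing punctuation
    if not report.rstrip().endswith((".", ")")):
        problems.append("text ends mid-sentence")
    return problems

truncated = (
    "EXAM: MRI Brain without contrast\n\n"
    "CLINICAL INDICATION: Multiple Sclerosis.\n\n"
    "COMPARISON: MRI Brain dated August 12, 2025.\n\n"
    "TECHNIQUE: Multiplanar multisequence MRI of the brain.\n\n"
    "FINDINGS:\n- New Lesions: Interval development of 2 new discrete foci"
)
print(check_report(truncated))
```

A non-empty result could be used to trigger a retry with a larger `max_new_tokens` or to flag the case for manual review.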

In [None]:
import json

# The submission dictionary now uses ONLY the model output,
# so FINDINGS and IMPRESSION come from the same generation.
submission = {
    "case_id": clinical_context["case_id"],
    "generated_report": report
}

# Save locally
filename = f"submission_{clinical_context['case_id']}.json"
with open(filename, "w", encoding="utf-8") as f:
    json.dump(submission, f, ensure_ascii=False, indent=2)

print(f"✅ File {filename} generated successfully.")

# Automatic download in Colab
from google.colab import files
files.download(filename)

✅ File submission_MS_49201.json generated successfully.
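
Before downloading, the saved file can be reloaded to confirm that it parses and carries the expected keys. This is a minimal round-trip sketch that runs outside Colab: `save_submission` and `load_submission` are hypothetical helpers, the two-key schema is taken from the cell above, and whether the challenge expects additional fields is not known here.

```python
import json
import os
import tempfile

def save_submission(case_id, report, directory):
    """Write the submission JSON and return its path."""
    path = os.path.join(directory, f"submission_{case_id}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"case_id": case_id, "generated_report": report},
                  f, ensure_ascii=False, indent=2)
    return path

def load_submission(path):
    """Reload a submission and check that the expected keys are present."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    missing = {"case_id", "generated_report"} - data.keys()
    if missing:
        raise ValueError(f"submission is missing keys: {sorted(missing)}")
    if not data["generated_report"].strip():
        raise ValueError("generated_report is empty")
    return data

with tempfile.TemporaryDirectory() as tmp:
    path = save_submission("MS_49201", "EXAM: MRI Brain without contrast", tmp)
    reloaded = load_submission(path)
    print(os.path.basename(path), reloaded["case_id"])
```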

