# **Medical Prescription Reader — Handwritten Prescription to Structured Excel Report**

This notebook demonstrates how to build an end-to-end pipeline for reading handwritten doctor's prescriptions using Sarvam AI's suite of models.

### **Use Case**
Automate prescription digitisation for pharmacies, hospitals, and health-tech applications.
1. **Extract:** Use **Sarvam Vision Document Intelligence** to OCR the prescription image or PDF.
2. **Parse:** Use **Sarvam-M** to convert raw OCR text into clean, structured JSON (patient info, medications, diagnosis).
3. **Export:** Use **openpyxl** to write the prescription data into a formatted two-sheet Excel report.

### **Supported Formats**
- Images: `.jpg`, `.jpeg`, `.png`
- Documents: `.pdf`
- Languages: 22 Indian languages + English (Hindi, Tamil, Telugu, Kannada, Bengali, Gujarati, Marathi, and more)

> ⚠️ **Disclaimer:** This notebook is for **demo and educational purposes only**. Do **not** use it for real medical decisions, dispensing medications, or clinical workflows. Always consult a qualified healthcare professional.

In [None]:
# Minimum versions; quoted so the shell does not treat ">=" as an output redirect
!pip install -Uqq "sarvamai>=0.1.24" "openpyxl>=3.1.0" "python-dotenv>=1.0.0" "Pillow>=12.1.1"

### **1. Setup & API Key**

Obtain your API key from the [Sarvam AI Dashboard](https://dashboard.sarvam.ai).
Create a `.env` file in this directory with `SARVAM_API_KEY=your_key_here`, or set the environment variable directly.

In [None]:
from __future__ import annotations

import os
import json
import re
import zipfile
import tempfile
from pathlib import Path

from dotenv import load_dotenv
from sarvamai import SarvamAI

load_dotenv()

SARVAM_API_KEY = os.environ.get("SARVAM_API_KEY", "")
if not SARVAM_API_KEY or SARVAM_API_KEY == "YOUR_SARVAM_API_KEY":
    raise RuntimeError(
        "SARVAM_API_KEY is not set. Add it to your .env file or set the environment variable."
    )

client = SarvamAI(api_subscription_key=SARVAM_API_KEY)

print("Client initialised.")

### **2. Step 1 — EXTRACT: Document Intelligence Helper**

`extract_prescription_text` sends the prescription file to Sarvam Vision Document Intelligence and returns the extracted text as a Markdown string.

The API uses an async job workflow: create → upload → start → wait → download (ZIP) → unzip.

> **Note:** The API accepts `.pdf` or `.zip` only. The helper below wraps PNG/JPG images in a ZIP before upload.

In [None]:
_IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}


def extract_prescription_text(file_path: str) -> str:
    """Extract text from a prescription image or PDF using Sarvam Document Intelligence.

    Images (.jpg, .png) are automatically wrapped in a ZIP archive before upload,
    as the API only accepts PDF or ZIP files directly.
    """
    path = Path(file_path)
    upload_path = file_path
    tmp_zip: str | None = None

    if path.suffix.lower() in _IMAGE_EXTENSIONS:
        # Wrap image in a flat ZIP — required by the Document Intelligence API
        with tempfile.NamedTemporaryFile(suffix='.zip', delete=False) as tmp:
            tmp_zip = tmp.name
        with zipfile.ZipFile(tmp_zip, 'w', zipfile.ZIP_DEFLATED) as zf:
            zf.write(file_path, arcname=path.name)
        upload_path = tmp_zip

    try:
        job = client.document_intelligence.create_job(
            language="en-IN",
            output_format="md"
        )
        job.upload_file(upload_path)
        job.start()

        status = job.wait_until_complete()
        if status.job_state != "Completed":
            raise RuntimeError(
                f"Document Intelligence job ended with state: {status.job_state}. "
                f"Details: {status}"
            )

        # Download ZIP output to a temp file, then extract markdown in-memory
        with tempfile.NamedTemporaryFile(suffix='.zip', delete=False) as tmp:
            out_zip = tmp.name

        try:
            job.download_output(out_zip)
            with zipfile.ZipFile(out_zip, 'r') as zf:
                md_files = [f for f in zf.namelist() if f.endswith('.md')]
                if not md_files:
                    raise RuntimeError(
                        "No markdown output found in Document Intelligence result. "
                        f"ZIP contents: {zf.namelist()}"
                    )
                with zf.open(md_files[0]) as f:
                    return f.read().decode('utf-8')
        finally:
            os.unlink(out_zip)

    finally:
        if tmp_zip:
            os.unlink(tmp_zip)


print("extract_prescription_text defined.")

### **3. Step 2 — PARSE: Structured JSON Extraction**

`parse_prescription` sends the raw OCR markdown to **Sarvam-M** with a structured prompt and returns the model's JSON reply as a Python dict.

A `confidence` score below **0.85** triggers a warning — flag those prescriptions for manual review before acting on the extracted data.
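
The review rule above can be sketched as a small standalone helper. This is an illustrative assumption, not part of the Sarvam SDK: the name `needs_manual_review` is hypothetical, and treating a missing or out-of-range score as "needs review" is a conservative choice made for this sketch.

```python
REVIEW_THRESHOLD = 0.85  # same cut-off as the warning described above


def needs_manual_review(parsed: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when a parsed prescription should be checked by a person."""
    confidence = parsed.get("confidence")
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return True  # missing or non-numeric score: be conservative
    if not 0.0 <= confidence <= 1.0:
        return True  # an out-of-range score cannot be trusted
    return confidence < threshold


print(needs_manual_review({"confidence": 0.92}))  # False
print(needs_manual_review({"confidence": 0.60}))  # True
print(needs_manual_review({"confidence": None}))  # True
```

Keeping the rule in one function makes it easy to route flagged prescriptions to a separate folder or sheet later.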

In [None]:
PARSE_SYSTEM_PROMPT = """You are a precise medical prescription data extractor. Extract the following fields from the prescription text and return ONLY valid JSON with no other text, no markdown fences, no explanation.

Required JSON schema:
{
  "patient_name": "<string or null>",
  "patient_age": "<string or null>",
  "doctor_name": "<string or null>",
  "doctor_registration_no": "<string or null>",
  "date": "<DD-MM-YYYY format or null>",
  "medications": [
    {
      "drug_name": "<string>",
      "dosage": "<string>",
      "frequency": "<string>",
      "duration": "<string>",
      "instructions": "<string or null>"
    }
  ],
  "diagnosis": "<string or null>",
  "follow_up": "<string or null>",
  "language_detected": "<primary language of the prescription, e.g. English, Hindi, Tamil>",
  "confidence": <float between 0.0 and 1.0>
}

Rules:
- Use null (not "null") for fields not present in the prescription
- medications must always be an array, even if empty
- confidence reflects how completely all fields could be read (1.0 = perfect, 0.0 = unreadable)
- Return ONLY the JSON object"""


def parse_prescription(raw_text: str) -> dict:
    """Parse raw prescription text into structured JSON using Sarvam-M."""
    response = client.chat.completions(
        messages=[
            {"role": "system", "content": PARSE_SYSTEM_PROMPT},
            {"role": "user", "content": f"Extract data from this prescription:\n\n{raw_text}"}
        ]
    )

    if not response or not response.choices:
        raise ValueError("Sarvam-M returned no response. Check your API quota.")

    content = response.choices[0].message.content
    if content is None:
        raise ValueError("Sarvam-M returned an empty message content.")

    raw_json = content.strip()
    # Strip markdown code fences if the model wraps output anyway
    raw_json = re.sub(r'^```(?:json)?\s*|\s*```$', '', raw_json, flags=re.DOTALL).strip()

    try:
        parsed = json.loads(raw_json)
    except json.JSONDecodeError:
        print(f"ERROR: Could not parse JSON from model response:\n{raw_json}")
        raise

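    if not isinstance(parsed, dict):
        raise ValueError(f"Expected a JSON object, got {type(parsed).__name__}.")

    # The model may omit confidence or return null; treat that as low confidence
    confidence = parsed.get("confidence")
    if not isinstance(confidence, (int, float)) or confidence < 0.85:
        shown = f"{confidence:.2f}" if isinstance(confidence, (int, float)) else "missing"
        print(
            f"WARNING: Low confidence ({shown}). Review this prescription manually "
            "before acting on the extracted data."
        )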

    return parsed


print("parse_prescription defined.")
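
Because the model's reply is only checked for being parseable JSON, a shape check against the prompt's schema can catch missing keys before export. The following is a minimal sketch under stated assumptions: `validate_prescription` is a hypothetical helper, the key lists are copied from the schema in the system prompt, and the date check uses `datetime.strptime`, which also accepts non-zero-padded days and months.

```python
from datetime import datetime

REQUIRED_KEYS = {
    "patient_name", "patient_age", "doctor_name", "doctor_registration_no",
    "date", "medications", "diagnosis", "follow_up", "language_detected",
    "confidence",
}
MEDICATION_KEYS = {"drug_name", "dosage", "frequency", "duration", "instructions"}


def validate_prescription(parsed) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(parsed, dict):
        return [f"expected a JSON object, got {type(parsed).__name__}"]

    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(parsed))]

    medications = parsed.get("medications")
    if not isinstance(medications, list):
        problems.append("medications must be a list")
    else:
        for index, med in enumerate(medications):
            if not isinstance(med, dict):
                problems.append(f"medications[{index}] is not an object")
                continue
            for key in sorted(MEDICATION_KEYS - set(med)):
                problems.append(f"medications[{index}] missing key: {key}")

    date_value = parsed.get("date")
    if date_value is not None:
        try:
            datetime.strptime(str(date_value), "%d-%m-%Y")
        except ValueError:
            problems.append(f"date is not DD-MM-YYYY: {date_value!r}")

    return problems
```

One possible use is to call `validate_prescription(parsed)` right after `parse_prescription` and send anything with a non-empty problem list to manual review.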

### **4. Step 3 — EXPORT: Excel Prescription Report Writer**

`write_to_excel` takes a parsed prescription dict and writes it into a formatted `.xlsx` file with two sheets:

- **Summary** — patient name, age, doctor details, date, diagnosis, follow-up, language, confidence
- **Medications** — one row per drug with dosage, frequency, duration, and instructions

In [None]:
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment


def write_to_excel(prescription: dict, output_path: str) -> None:
    """Write a parsed prescription dict to a two-sheet Excel workbook.

    Sheet 1 — Summary: patient/doctor info, diagnosis, follow-up.
    Sheet 2 — Medications: tabular medication list.
    """
    wb = Workbook()
    header_fill = PatternFill(start_color="1F4E79", end_color="1F4E79", fill_type="solid")
    header_font = Font(color="FFFFFF", bold=True)
    label_font  = Font(bold=True)

    # ── Sheet 1: Summary ─────────────────────────────────────────────
    ws_summary = wb.active
    ws_summary.title = "Summary"

    title_cell = ws_summary.cell(row=1, column=1, value="Prescription Summary")
    ws_summary.merge_cells("A1:B1")
    title_cell.font      = Font(color="FFFFFF", bold=True, size=14)
    title_cell.fill      = PatternFill(start_color="1F4E79", end_color="1F4E79", fill_type="solid")
    title_cell.alignment = Alignment(horizontal="center")

    summary_fields = [
        ("Patient Name",       prescription.get("patient_name")),
        ("Patient Age",        prescription.get("patient_age")),
        ("Doctor Name",        prescription.get("doctor_name")),
        ("Doctor Reg. No.",    prescription.get("doctor_registration_no")),
        ("Date",               prescription.get("date")),
        ("Diagnosis",          prescription.get("diagnosis")),
        ("Follow Up",          prescription.get("follow_up")),
        ("Language Detected",  prescription.get("language_detected")),
        ("Confidence",         prescription.get("confidence")),
    ]

    for row_idx, (label, value) in enumerate(summary_fields, start=2):
        label_cell = ws_summary.cell(row=row_idx, column=1, value=label)
        label_cell.font = label_font
        ws_summary.cell(row=row_idx, column=2, value=value)

    ws_summary.column_dimensions["A"].width = 22
    ws_summary.column_dimensions["B"].width = 50

    # ── Sheet 2: Medications ──────────────────────────────────────────
    ws_meds = wb.create_sheet(title="Medications")

    med_headers = ["Drug Name", "Dosage", "Frequency", "Duration", "Instructions"]
    for col_idx, label in enumerate(med_headers, start=1):
        cell = ws_meds.cell(row=1, column=col_idx, value=label)
        cell.font      = header_font
        cell.fill      = header_fill
        cell.alignment = Alignment(horizontal="center")

    ws_meds.freeze_panes = "A2"

    # "or []" also covers an explicit null from the model, not just a missing key
    for med in prescription.get("medications") or []:
        ws_meds.append([
            med.get("drug_name"),
            med.get("dosage"),
            med.get("frequency"),
            med.get("duration"),
            med.get("instructions"),
        ])

    for col in ws_meds.columns:
        max_len = max((len(str(cell.value or "")) for cell in col), default=10)
        ws_meds.column_dimensions[col[0].column_letter].width = min(max_len + 4, 50)

    wb.save(output_path)
    print(f"Prescription report saved to: {output_path}")


print("write_to_excel defined.")
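
The "one row per drug" layout of the Medications sheet can be illustrated without openpyxl. This is a sketch, assuming a hypothetical helper `medication_rows` that tolerates a null `medications` value and malformed entries; it is not how `write_to_excel` is implemented above.

```python
MED_COLUMNS = ("drug_name", "dosage", "frequency", "duration", "instructions")


def medication_rows(prescription: dict) -> list:
    """Flatten the medications list into one row (list of cell values) per drug."""
    medications = prescription.get("medications") or []  # tolerate null or missing
    rows = []
    for med in medications:
        if not isinstance(med, dict):
            continue  # skip malformed entries rather than crash mid-export
        # Empty string instead of None keeps blank cells explicit
        rows.append([med.get(column) or "" for column in MED_COLUMNS])
    return rows


sample = {
    "medications": [
        {"drug_name": "Tab. Cetirizine 10mg", "dosage": "10mg",
         "frequency": "0-0-1", "duration": "7 days", "instructions": None},
        "garbled entry",
    ]
}
for row in medication_rows(sample):
    print(row)  # ['Tab. Cetirizine 10mg', '10mg', '0-0-1', '7 days', '']
```

Each returned row could be passed to a worksheet's `append` call in the same column order as the sheet headers.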

### **5. End-to-End Pipeline**

`process_prescription` ties all three steps together. Pass a single file path (image or PDF) and an output Excel path.

In [None]:
def process_prescription(file_path: str, output_path: str = "prescription_report.xlsx") -> dict | None:
    """
    End-to-end pipeline: extract -> parse -> export for a single prescription file.

    Args:
        file_path:   Path to a prescription image (.jpg, .png) or PDF (.pdf).
        output_path: Path for the output Excel file.

    Returns:
        Parsed prescription dict, or None if processing failed.
    """
    print(f"Processing: {file_path}")
    try:
        print("  Step 1/3 — Extracting text via Document Intelligence...")
        raw_text = extract_prescription_text(file_path)

        print("  Step 2/3 — Parsing structured data with Sarvam-M...")
        parsed = parse_prescription(raw_text)

        print("  Step 3/3 — Writing prescription report to Excel...")
        write_to_excel(parsed, output_path)

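        # Guard against null medications/confidence so the summary line cannot fail
        medications = parsed.get("medications") or []
        confidence = parsed.get("confidence")
        confidence_text = f"{confidence:.2f}" if isinstance(confidence, (int, float)) else "n/a"
        print(
            f"\nPatient: {parsed.get('patient_name')} | "
            f"Doctor: {parsed.get('doctor_name')} | "
            f"Medications: {len(medications)} | "
            f"Confidence: {confidence_text}"
        )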
        return parsed

    except Exception as e:
        print(f"ERROR: Failed to process {file_path}: {e}")
        return None


print("process_prescription defined.")
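
Since the pipeline handles one file per call, a folder of scans needs a small loop around it. The sketch below is an assumption-laden illustration: `collect_prescription_files` and `run_batch` are hypothetical names, the extension set mirrors the supported formats listed at the top, and the processing function is passed in as a parameter so the block stays self-contained.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}


def collect_prescription_files(folder: str) -> list:
    """Return supported files directly inside `folder`, sorted by path."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def run_batch(folder: str, process, output_dir: str = "reports") -> dict:
    """Call process(input_path, output_path) per file; map file name to its result."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    results = {}
    for path in collect_prescription_files(folder):
        # Note: two inputs sharing a stem (a.png, a.pdf) would overwrite each other
        output_path = Path(output_dir) / f"{path.stem}_report.xlsx"
        results[path.name] = process(str(path), str(output_path))
    return results
```

In this notebook, `run_batch("scans", process_prescription)` would be one way to use it, where `"scans"` is a placeholder folder name; files that fail show up with a `None` result.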

### **6. Demo — Run the Pipeline**

The cell below creates a synthetic handwritten-style prescription using Pillow (no real prescription required), then runs the full pipeline. The image mimics a doctor's notepad with a printed letterhead, ruled lines, and blue-ink text drawn with a slight positional jitter and rotation.

In [None]:
import random
from PIL import Image, ImageDraw, ImageFont


def _load_font(size: int) -> ImageFont.FreeTypeFont:
    """Load a TrueType font with cross-platform fallbacks."""
    candidates = [
        "/System/Library/Fonts/Helvetica.ttc",          # macOS
        "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",  # Linux (Debian/Ubuntu)
        "/usr/share/fonts/dejavu/DejaVuSans.ttf",           # Linux (Fedora/RHEL)
        "C:/Windows/Fonts/Arial.ttf",                        # Windows
    ]
    for path in candidates:
        try:
            return ImageFont.truetype(path, size)
        except (IOError, OSError):
            continue
    return ImageFont.load_default()


def _create_sample_prescription(output_path: str = "sample_data/sample_prescription.png") -> str:
    """Create a synthetic handwritten-style doctor's prescription image for demo purposes."""
    Path(output_path).parent.mkdir(parents=True, exist_ok=True)

    # Off-white background to simulate a doctor's notepad
    img  = Image.new("RGB", (700, 900), color=(255, 252, 240))
    draw = ImageDraw.Draw(img)

    font_title = _load_font(22)
    font_body  = _load_font(16)
    font_rx    = _load_font(38)
    font_med   = _load_font(15)
    font_small = _load_font(13)

    ink = (0, 0, 140)  # dark-blue pen colour

    # Ruled lines to simulate prescription pad paper
    for y_line in range(120, 880, 30):
        draw.line([(30, y_line), (670, y_line)], fill=(200, 200, 220), width=1)

    # Printed letterhead
    draw.rectangle([(0, 0), (700, 110)], fill=(20, 60, 120))
    draw.text((350, 22), "Dr. Rajesh Kumar",                              font=font_title, fill="white",         anchor="mm")
    draw.text((350, 50), "MBBS, MD (General Medicine)",                   font=font_small, fill=(200, 220, 255), anchor="mm")
    draw.text((350, 70), "Reg. No: MH-12345  |  Consulting: 9AM - 2PM",  font=font_small, fill=(200, 220, 255), anchor="mm")
    draw.text((350, 90), "City Hospital, Mumbai  |  Ph: +91-22-98765432", font=font_small, fill=(200, 220, 255), anchor="mm")

    # Fixed seed for reproducible jitter
    rng = random.Random(42)

    def jx(x: int, y: int, mx: int = 2) -> tuple:
        """Apply small random offset to simulate handwriting."""
        return x + rng.randint(-mx, mx), y + rng.randint(-mx, mx)

    # Handwritten body
    y = 130
    draw.text(jx(420, y), "Date: 10-02-2025",                                   font=font_body, fill=ink)
    y += 38
    draw.text(jx(40,  y), "Patient: Priya Sharma",                               font=font_body, fill=ink)
    draw.text(jx(420, y), "Age: 34 yrs",                                         font=font_body, fill=ink)
    y += 38
    draw.text(jx(40,  y), "Diagnosis: Viral fever with mild allergic rhinitis",  font=font_body, fill=ink)

    y += 55
    draw.text((40, y), "Rx", font=font_rx, fill=ink)
    y += 55

    medications = [
        ("1.", "Tab. Paracetamol 500mg", "1-0-1  (after meals)", "x 5 days"),
        ("2.", "Tab. Cetirizine 10mg",   "0-0-1  (at night)",    "x 7 days"),
        ("3.", "Tab. Vitamin C 500mg",   "1-0-0",                "x 30 days"),
    ]

    for num, drug, sig, dur in medications:
        draw.text(jx(40, y), num,  font=font_med, fill=ink)
        draw.text(jx(70, y), drug, font=font_med, fill=ink)
        y += 30
        draw.text(jx(70, y), f"Sig: {sig}   {dur}", font=font_small, fill=ink)
        y += 38

    y += 20
    draw.text(jx(40,  y), "Follow up: After 1 week",  font=font_body,  fill=ink)
    y += 65
    draw.text(jx(430, y), "Dr. Rajesh Kumar",          font=font_body,  fill=ink)
    y += 28
    draw.text(jx(430, y), "(Signature & Stamp)",       font=font_small, fill=(100, 100, 160))

    # Slight rotation to simulate a scanned/photographed document
    img = img.rotate(angle=-1.5, fillcolor=(255, 252, 240), expand=False)

    img.save(output_path)
    print(f"Sample prescription created: {output_path}")
    return output_path


# --- Run the demo ---
sample_path = _create_sample_prescription()
result = process_prescription(sample_path, output_path="prescription_report.xlsx")

### **7. Results**

View the parsed JSON and download the generated Excel report.

In [None]:
from IPython.display import FileLink, display

if result:
    print("=== Parsed Prescription Data ===\n")
    print(json.dumps(result, indent=2, ensure_ascii=False))

    print("\n=== Download Prescription Report ===")
    display(FileLink("prescription_report.xlsx", result_html_prefix="Click to download: "))
else:
    print("No result to display. Check the error messages above.")

### **8. Error Reference**

| Error Code | HTTP Status | Cause | Solution |
| :--- | :--- | :--- | :--- |
| `invalid_api_key_error` | 403 | Invalid API key | Verify your key at [dashboard.sarvam.ai](https://dashboard.sarvam.ai). |
| `insufficient_quota_error` | 429 | Quota exceeded | Check your usage limits. |
| `internal_server_error` | 500 | Server-side issue | Wait and retry the request. |
| Job state not `Completed` | — | Doc Intelligence failure | Check file format; supported: `.pdf`, `.zip` (images auto-wrapped). |
| `JSONDecodeError` | — | Sarvam-M returned non-JSON | Usually transient; re-run the cell. |
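
For the transient rows in the table, re-running can be automated with a small retry wrapper. This is a generic sketch, not an SDK feature: `call_with_retries` is a hypothetical helper, it retries only on `json.JSONDecodeError` by default, and the exception class to use for HTTP 500 responses depends on the SDK and is left for the caller to supply via `retry_on`.

```python
import json
import time


def call_with_retries(func, attempts=3, base_delay=1.0,
                      retry_on=(json.JSONDecodeError,), sleep=time.sleep):
    """Call func() up to `attempts` times, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ... by default
```

With the functions defined earlier, `call_with_retries(lambda: parse_prescription(raw_text))` would retry the parsing step when the model returns non-JSON output. Quota and authentication errors are better left out of `retry_on`, since retrying does not fix them.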

### **9. Conclusion & Resources**

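This recipe demonstrates how to chain **Sarvam Vision** and **Sarvam-M** into a practical prescription digitisation workflow: a single `process_prescription` call turns a prescription image or PDF into a structured Excel report. Note that `extract_prescription_text` requests `language="en-IN"`, so prescriptions written in other supported languages may need a different language code there.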

* [Sarvam AI Docs](https://docs.sarvam.ai)
* [Document Intelligence API](https://docs.sarvam.ai/api-reference-docs/document-intelligence)
* [Sarvam-M Chat API](https://docs.sarvam.ai/api-reference-docs/chat)
* [Indic Language Support](https://docs.sarvam.ai/language-support)

> ⚠️ **Reminder:** This notebook is for demo and educational purposes only. Do not use it for real medical decisions.

**Keep Building!**