# 3.03 All Province Alberta Crosswalk

**Consolidated notebook for mapping Alberta billing codes to BC, MB, ON, and SK equivalents.**

## Workflow
1. Upload all files (province PDFs, section reference CSVs, extraction taxonomy)
2. Configure the Alberta code to map
3. Run each province individually
4. Combine results

## Province-Specific Features
| Province | Chunking | Special Features |
|----------|----------|------------------|
| BC | Level 1 | Code prefixes (P, G, PG) |
| MB | Level 1 | Specialty-based fees |
| ON | Level 2 | H/P settings, Surg/Asst/Anae fees |
| SK | Level 1 | Referred/Not Referred dual-fees, Age premiums |

---
# STEP 1: Setup
---

## Cell 1: Install Dependencies

In [None]:
!pip install openai pandas pdfplumber openpyxl tqdm PyMuPDF -q

import pandas as pd
import pdfplumber
import fitz  # PyMuPDF
import json
import re
from tqdm.notebook import tqdm
from google.colab import files

print("All dependencies loaded.")
print("Ready to proceed.")

---
# STEP 2: Upload Files
---

## Cell 2a: Upload Province PDFs

Upload all 4 province schedule PDFs. Files will be auto-detected by name.

In [None]:
print("="*70)
print("STEP 2a: Upload Province Schedule PDFs")
print("="*70)
print("\nExpected files:")
print("  - BC Payment Schedule - March 31, 2024.pdf")
print("  - MB Payment Schedule - April 1, 2024.pdf")
print("  - ON - February 20, 2024 (effective April 1, 2024).pdf")
print("  - SK Payment Schedule - April 1, 2024.pdf")
print()

uploaded_pdfs = files.upload()

# Auto-detect province from filename
PDF_FILES = {'BC': None, 'MB': None, 'ON': None, 'SK': None}

for filename in uploaded_pdfs.keys():
    # Compare whole tokens rather than substrings, so a province code that
    # appears inside another word (e.g. "ON" in "SECTION") cannot match.
    tokens = set(re.split(r'[^A-Z]+', filename.upper()))
    if 'BC' in tokens:
        PDF_FILES['BC'] = filename
    elif 'MB' in tokens:
        PDF_FILES['MB'] = filename
    elif 'ON' in tokens:
        PDF_FILES['ON'] = filename
    elif 'SK' in tokens:
        PDF_FILES['SK'] = filename

print("\n" + "="*70)
print("Detected PDFs:")
print("="*70)
for prov, f in PDF_FILES.items():
    status = "✓" if f else "✗ MISSING"
    print(f"  {prov}: {status} {f if f else ''}")

# Warn if any missing
missing = [p for p, f in PDF_FILES.items() if f is None]
if missing:
    print(f"\n⚠️  WARNING: Missing PDFs for: {', '.join(missing)}")
    print("    You can still run the provinces that have PDFs.")
else:
    print("\n✓ All 4 province PDFs loaded successfully.")

## Cell 2b: Upload Section Reference CSVs

Upload all 4 section reference CSVs. Files will be auto-detected by name.

In [None]:
print("="*70)
print("STEP 2b: Upload Section Reference CSVs")
print("="*70)
print("\nExpected files:")
print("  - bc_section_reference_simple.csv")
print("  - manitoba_section_reference_final.csv")
print("  - on_section_reference_full.csv")
print("  - sk_section_reference_simple.csv")
print()

uploaded_refs = files.upload()

# Auto-detect province from filename
REF_FILES = {'BC': None, 'MB': None, 'ON': None, 'SK': None}

for filename in uploaded_refs.keys():
    # Compare whole tokens, not substrings: every expected filename contains
    # "section", so a plain 'on' substring test would file the SK CSV under ON.
    tokens = set(re.split(r'[^a-z]+', filename.lower()))
    if 'bc' in tokens:
        REF_FILES['BC'] = filename
    elif 'mb' in tokens or 'manitoba' in tokens:
        REF_FILES['MB'] = filename
    elif 'on' in tokens:
        REF_FILES['ON'] = filename
    elif 'sk' in tokens:
        REF_FILES['SK'] = filename

print("\n" + "="*70)
print("Detected Reference CSVs:")
print("="*70)
for prov, f in REF_FILES.items():
    status = "✓" if f else "✗ MISSING"
    print(f"  {prov}: {status} {f if f else ''}")

# Warn if any missing
missing = [p for p, f in REF_FILES.items() if f is None]
if missing:
    print(f"\n⚠️  WARNING: Missing reference CSVs for: {', '.join(missing)}")
    print("    You can still run the provinces that have reference files.")
else:
    print("\n✓ All 4 province reference CSVs loaded successfully.")
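
Cells 2a and 2b repeat the same detection logic, so one option is to move it into a shared helper. The sketch below is a minimal version of that idea, not part of the notebook: `detect_province` and `PROVINCE_ALIASES` are hypothetical names, and the `ONTARIO` and `SASKATCHEWAN` aliases are assumptions added for symmetry with `manitoba`.

```python
import re

# Hypothetical alias table: whole-word tokens that identify each province.
PROVINCE_ALIASES = {
    'BC': {'BC'},
    'MB': {'MB', 'MANITOBA'},
    'ON': {'ON', 'ONTARIO'},
    'SK': {'SK', 'SASKATCHEWAN'},
}

def detect_province(filename):
    """Return the province whose alias appears as a whole token, else None."""
    # Split on anything that is not a letter: "sk_section" -> {"SK", "SECTION"}.
    tokens = set(re.split(r'[^A-Z]+', filename.upper()))
    for province, aliases in PROVINCE_ALIASES.items():
        if tokens & aliases:
            return province
    return None

print(detect_province("ON - February 20, 2024 (effective April 1, 2024).pdf"))
print(detect_province("sk_section_reference_simple.csv"))
```

Matching on whole tokens keeps `sk_section_reference_simple.csv` from being filed under ON because of the letters "on" in "section".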

## Cell 2c: Upload Extraction Taxonomy

Upload the extraction taxonomy Excel file for Phase 2 attribute extraction.

In [None]:
print("="*70)
print("STEP 2c: Upload Extraction Taxonomy")
print("="*70)
print("\nExpected file:")
print("  - extraction_taxonomy.xlsx")
print()

uploaded_tax = files.upload()

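# Fail with a clear message if the upload dialog was cancelled.
if not uploaded_tax:
    raise ValueError("No taxonomy file uploaded. Re-run this cell and select extraction_taxonomy.xlsx.")

TAXONOMY_FILE = next(iter(uploaded_tax))
df_taxonomy = pd.read_excel(TAXONOMY_FILE)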

print("\n" + "="*70)
print(f"Loaded Taxonomy: {TAXONOMY_FILE}")
print("="*70)
print(f"\n{len(df_taxonomy)} attributes:")
for _, row in df_taxonomy.iterrows():
    print(f"  - {row['attribute']}: {row['data_type']}")

# Build taxonomy reference string for prompts
taxonomy_reference = "\n".join([
    f"- {row['attribute']} ({row['data_type']}): {row['definition']} Taxonomy: {row['taxonomy']}"
    for _, row in df_taxonomy.iterrows()
])

print("\n✓ Taxonomy loaded and ready for Phase 2.")
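
To make the prompt reference format concrete, the sketch below builds the same kind of string from two made-up rows. Plain dicts stand in for DataFrame rows, and the attribute names, definitions, and taxonomy values are invented for illustration; the real ones come from `extraction_taxonomy.xlsx`.

```python
# Illustrative rows only: the keys match the columns read in Cell 2c,
# but every value here is invented for this example.
sample_rows = [
    {'attribute': 'modality', 'data_type': 'categorical',
     'definition': 'How the service is delivered.',
     'taxonomy': 'telephone | video | either'},
    {'attribute': 'min_minutes', 'data_type': 'integer',
     'definition': 'Minimum physician time required.',
     'taxonomy': 'whole minutes'},
]

def format_taxonomy(rows):
    """Render one '- attribute (type): definition Taxonomy: values' line per row."""
    return "\n".join(
        f"- {r['attribute']} ({r['data_type']}): {r['definition']} Taxonomy: {r['taxonomy']}"
        for r in rows
    )

sample_reference = format_taxonomy(sample_rows)
print(sample_reference)
```

With the real data, `df_taxonomy.to_dict('records')` yields rows in this shape.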

## Cell 3: API Key

Enter your OpenAI API key.

In [None]:
print("="*70)
print("STEP 2d: API Key")
print("="*70)

OPENAI_API_KEY = ""  # Leave blank to be prompted via getpass; a pasted key is saved with the notebook

if not OPENAI_API_KEY:
    from getpass import getpass
    OPENAI_API_KEY = getpass("Enter OpenAI API Key: ")

from openai import OpenAI
client = OpenAI(api_key=OPENAI_API_KEY)

print("\n✓ API client initialized.")
print("\n" + "="*70)
print("SETUP COMPLETE - Ready to configure Alberta code and run provinces")
print("="*70)
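
A key pasted into the cell can end up saved or shared with the notebook. One alternative, sketched here under assumptions, is to check an environment variable before prompting. `resolve_api_key` is a hypothetical helper, and it assumes the key is stored under `OPENAI_API_KEY`, the variable name the OpenAI Python client also reads by default.

```python
import os
from getpass import getpass

def resolve_api_key(env=None, prompt=getpass):
    """Return a key from the environment if set, otherwise prompt for one."""
    env = os.environ if env is None else env
    key = env.get('OPENAI_API_KEY', '').strip()
    if not key:
        # Fall back to an interactive prompt that does not echo the input.
        key = prompt("Enter OpenAI API Key: ").strip()
    if not key:
        raise ValueError("No API key provided.")
    return key
```

In the cell above, this would replace the hard-coded variable: `client = OpenAI(api_key=resolve_api_key())`.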

---
# STEP 3: Alberta Code Configuration
---

**Edit this cell to change the Alberta code being mapped.**

## Cell 4: Alberta Code Config

⚠️ **EDIT THIS CELL** to map a different Alberta code.

In [None]:
# ============================================================================
# ALBERTA CODE CONFIGURATION
# ============================================================================
# Edit this section to map a different Alberta billing code.
# The province configs (Cell 5) should NOT need to change.
# ============================================================================

ALBERTA_CODE_CONFIG = {
    # Basic code info
    'code': '03.03CV',
    'description': 'Telehealth consultation',
    'fee': 25.09,
    
    # Clinical definition - describes the service in detail
    'clinical_definition': """Assessment of a patient's condition via telephone or secure videoconference.

NOTE:
- At minimum: limited assessment requiring history related to presenting problems, appropriate records review, and advice to the patient
- Total physician time spent providing patient care must be MINIMUM 10 MINUTES
- If less than 10 minutes same day, must use HSC 03.01AD instead
- May only be claimed if service was initiated by the patient or their agent
- May only be claimed if service is personally rendered by the physician
- Benefit includes ordering appropriate diagnostic tests and discussion with patient
- Patient record must include detailed summary of all services including start/stop times
- Time spent on administrative tasks cannot be claimed
- May NOT be claimed same day as: 03.01AD, 03.01S, 03.01T, 03.03FV, 03.05JR, 03.08CV, 08.19CV, 08.19CW, or 08.19CX by same physician for same patient
- May NOT be claimed same day as in-person visit or consultation by same physician for same patient

Category: V Visit (Virtual)
Base rate: $25.09""",
    
    # Service type context - helps LLM understand what we're looking for
    'service_context': """This is a BASIC PATIENT-FACING virtual visit by any physician (not specialist-specific, not physician-to-physician).""",
    
    # What to search for (specific to this AB code type)
    'search_criteria': """
WHAT TO LOOK FOR:
- Virtual visits / virtual care
- Telephone consultations / assessments
- Video consultations / assessments
- Telehealth codes
- Any code that can be billed for a patient-facing virtual encounter
""",
    
    # What to exclude (specific to this AB code type)
    'exclusion_criteria': """
DO NOT INCLUDE:
- Physician-to-physician consultations (e-consults between doctors)
- E-assessments / e-consults (specialist-to-PCP) - not patient-facing
- In-person only codes
- Diagnostic procedures (ECG, imaging, labs)
- Codes you cannot find literally in the text
""",
}

# Display config
print("="*70)
print("ALBERTA CODE CONFIGURATION")
print("="*70)
print(f"\nCode: {ALBERTA_CODE_CONFIG['code']}")
print(f"Description: {ALBERTA_CODE_CONFIG['description']}")
print(f"Fee: ${ALBERTA_CODE_CONFIG['fee']:.2f}")
print("\n✓ Alberta code configured.")
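
Because this cell is edited by hand for each new code, a quick sanity check can catch a missing field before any API calls are made. The sketch below is one possible check, not part of the notebook: `validate_code_config` is a hypothetical helper, and treating all seven keys shown above as required is an assumption.

```python
# Assumed to be required: every key that appears in ALBERTA_CODE_CONFIG above.
REQUIRED_KEYS = ('code', 'description', 'fee', 'clinical_definition',
                 'service_context', 'search_criteria', 'exclusion_criteria')

def validate_code_config(config):
    """Return a list of problems found in an Alberta code config (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing or empty: {key}")
    fee = config.get('fee')
    if fee is not None and (isinstance(fee, bool)
                            or not isinstance(fee, (int, float))
                            or fee < 0):
        problems.append("fee must be a non-negative number")
    return problems

# A deliberately incomplete config to show what the check reports.
example = {'code': '03.03CV', 'description': 'Telehealth consultation', 'fee': 25.09}
for problem in validate_code_config(example):
    print(problem)
```

Calling `validate_code_config(ALBERTA_CODE_CONFIG)` right after editing the cell should return an empty list when every field is filled in.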

---
# STEP 4: Province Configurations
---

**DO NOT EDIT** unless province schedule structure changes.

## Cell 5: Province Configs

Static configurations for each province's schedule structure.

In [None]:
# Placeholder - Will be built in Step 4
print("Province configs will be added in Step 4")
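
Until this cell is filled in, the Province-Specific Features table at the top of the notebook indicates what each entry needs to record. The following is a hypothetical shape for that data, not the notebook's actual configuration: the `ProvinceConfig` class, its field names, and `PROVINCE_CONFIGS_SKETCH` are assumptions, and only the values are taken from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProvinceConfig:
    """Hypothetical container for one province's schedule settings."""
    name: str
    chunk_level: int          # "Chunking" column of the features table
    special_features: tuple   # notes copied from the "Special Features" column

PROVINCE_CONFIGS_SKETCH = {
    'BC': ProvinceConfig('British Columbia', 1, ('Code prefixes (P, G, PG)',)),
    'MB': ProvinceConfig('Manitoba', 1, ('Specialty-based fees',)),
    'ON': ProvinceConfig('Ontario', 2, ('H/P settings', 'Surg/Asst/Anae fees')),
    'SK': ProvinceConfig('Saskatchewan', 1,
                         ('Referred/Not Referred dual-fees', 'Age premiums')),
}

for code, cfg in PROVINCE_CONFIGS_SKETCH.items():
    print(f"{code}: level {cfg.chunk_level}, {len(cfg.special_features)} feature note(s)")
```

A frozen dataclass fits the "do not edit" intent of this step, since accidental reassignment of a field raises an error.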

---
# STEP 5: Shared Functions
---

## Cell 6: Shared Functions

Core processing functions used by all provinces.

In [None]:
# Placeholder - Will be built in Step 5
print("Shared functions will be added in Step 5")

---
# STEP 6: Run Provinces
---

Run each province individually. Results are saved and downloaded after each province completes.

## Cell 7a: Run British Columbia

In [None]:
# Placeholder - Will be built in Step 6
print("BC processing will be added in Step 6")

## Cell 7b: Run Manitoba

In [None]:
# Placeholder - Will be built in Step 6
print("MB processing will be added in Step 6")

## Cell 7c: Run Ontario

In [None]:
# Placeholder - Will be built in Step 6
print("ON processing will be added in Step 6")

## Cell 7d: Run Saskatchewan

In [None]:
# Placeholder - Will be built in Step 6
print("SK processing will be added in Step 6")

---
# STEP 7: Combine & Summary
---

## Cell 8: Combine All Provinces & Final Summary

In [None]:
# Placeholder - Will be built in Step 7
print("Combine logic will be added in Step 7")