# Crosswalk Telehealth - All Provinces

Finds equivalent billing codes for Alberta **03.03CV - Telehealth consultation** across:
- Ontario (ON)
- Saskatchewan (SK)
- Manitoba (MB)
- British Columbia (BC)

**Each province runs in its own cell** - if one fails, re-run just that cell.

**Output:** Single unified Excel with all provinces

## 1. Setup

In [None]:
!pip install openai pandas pdfplumber openpyxl tqdm -q
print('Ready')

## 2. Upload All Provincial PDFs

In [None]:
from google.colab import files

print("Upload all 4 provincial schedule PDFs (Ctrl+click to select multiple):")
print("  - Ontario (ON) - Schedule of Benefits")
print("  - Saskatchewan (SK) - Payment Schedule")
print("  - Manitoba (MB) - Physician Manual")
print("  - British Columbia (BC) - Payment Schedule")
print()
uploaded = files.upload()

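# Detect which file is which province.
# Two-letter codes must stand alone (e.g. "on_schedule.pdf"); a plain substring test
# for 'on_' would also match names such as "bc_section_2.pdf".
import re

def has_code(name, code):
    """True if `code` appears in `name` with no letter directly before or after it."""
    return re.search(rf'(?<![a-z]){code}(?![a-z])', name) is not None

ON_PDF = SK_PDF = MB_PDF = BC_PDF = None

for f in uploaded.keys():
    f_lower = f.lower()
    if has_code(f_lower, 'on') or 'ontario' in f_lower or 'moh' in f_lower:
        ON_PDF = f
    elif has_code(f_lower, 'sk') or 'sask' in f_lower:
        SK_PDF = f
    elif has_code(f_lower, 'mb') or 'manit' in f_lower or 'winnipeg' in f_lower:
        MB_PDF = f
    elif has_code(f_lower, 'bc') or 'british' in f_lower:
        BC_PDF = f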

print(f"\nDetected files:")
print(f"  Ontario:      {ON_PDF}")
print(f"  Saskatchewan: {SK_PDF}")
print(f"  Manitoba:     {MB_PDF}")
print(f"  BC:           {BC_PDF}")

# Verify all found
missing = []
if not ON_PDF: missing.append('Ontario')
if not SK_PDF: missing.append('Saskatchewan')
if not MB_PDF: missing.append('Manitoba')
if not BC_PDF: missing.append('BC')

if missing:
    print(f"\nWARNING: Could not detect files for: {', '.join(missing)}")
    print("Please rename files to include province name/code and re-upload")
else:
    print("\nAll 4 provinces detected!")

## 3. API Key

In [None]:
OPENAI_API_KEY = ""  # <-- Paste your key here

if not OPENAI_API_KEY:
    from getpass import getpass
    OPENAI_API_KEY = getpass("API Key: ")

from openai import OpenAI
client = OpenAI(api_key=OPENAI_API_KEY)
print("API ready")
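
A key pasted into the cell above is saved with the notebook. As a hedged alternative, the sketch below looks for the key in an environment variable before falling back to anything typed in. `resolve_api_key` is a hypothetical helper, not part of the notebook, and `DEMO_KEY_VAR` with its value is a placeholder used only so the example runs without a real key.

```python
import os

def resolve_api_key(pasted_key="", env_var="OPENAI_API_KEY"):
    """Prefer a pasted key, then the environment variable; return None if neither is set."""
    return pasted_key or os.environ.get(env_var) or None

# Placeholder variable and value, for illustration only.
os.environ["DEMO_KEY_VAR"] = "sk-demo"
print(resolve_api_key(env_var="DEMO_KEY_VAR"))
```

If the helper returns `None`, the `getpass` prompt in the cell above remains the fallback.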

## 4. Alberta Code + Core Functions

In [None]:
import pandas as pd
import pdfplumber
import json
import re
from tqdm.notebook import tqdm

# Alberta code definition
AB_CODE = "03.03CV"
AB_DESC = "Telehealth consultation"
AB_FEE = 25.09

AB_CLINICAL_DEFINITION = """Assessment of a patient's condition via telephone or secure videoconference.

NOTE:
- At minimum: limited assessment requiring history related to presenting problems, appropriate records review, and advice to the patient
- Total physician time spent providing patient care must be MINIMUM 10 MINUTES
- If less than 10 minutes same day, must use HSC 03.01AD instead
- May only be claimed if service was initiated by the patient or their agent
- May only be claimed if service is personally rendered by the physician
- Benefit includes ordering appropriate diagnostic tests and discussion with patient
- Patient record must include detailed summary of all services including start/stop times
- Time spent on administrative tasks cannot be claimed
- May NOT be claimed same day as: 03.01AD, 03.01S, 03.01T, 03.03FV, 03.05JR, 03.08CV, 08.19CV, 08.19CW, or 08.19CX by same physician for same patient
- May NOT be claimed same day as in-person visit or consultation by same physician for same patient

Category: V Visit (Virtual)
Base rate: $25.09"""

# Tracking
PAGES_PER_CALL = 10
total_cost = 0.0
total_calls = 0

# Store results from all provinces
all_results = []

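def track_cost(inp, out):
    """Add one call's estimated cost, using assumed rates of $3 (input) and $15 (output) per 1M tokens."""
    global total_cost, total_calls
    total_cost += (inp/1e6)*3.0 + (out/1e6)*15.0
    total_calls += 1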

def is_valid_level2_header(line):
    """Check if line is a valid Level 2 subsection header."""
    line = line.strip()
    if not re.match(r'^[A-Z][a-z]', line):
        return False
    if len(line) < 5 or len(line) > 60:
        return False
    if re.search(r'[.!?]\s*$', line):
        return False
    if re.search(r'[.!?]\s+[A-Z]', line):
        return False
    if re.search(r'[\[\]\(\)]', line):
        return False
    non_header_starts = ['The ', 'This ', 'A ', 'An ', 'If ', 'For ', 'When ', 'Where ', 
                         'Note', 'See ', 'Refer', 'Include', 'Exclude', 'Payment']
    for start in non_header_starts:
        if line.startswith(start):
            return False
    if any(c.isdigit() for c in line[:4]):
        return False
    if '....' in line or '. . .' in line:
        return False
    return True

def build_prompt(province_name, batch_pages, context, section_info):
    """Build the search prompt - EXACT same logic as working Ontario version."""
    section_text = section_info['level1']
    if section_info['level2']:
        section_text += f" > {section_info['level2']}"
    
    return f"""You are a senior physician billing specialist mapping Alberta fee codes to {province_name} equivalents.

ALBERTA CODE TO MATCH:
- Code: {AB_CODE}
- Description: {AB_DESC}
- Fee: ${AB_FEE}

CLINICAL SERVICE DEFINITION:
{AB_CLINICAL_DEFINITION}

This is a BASIC PATIENT-FACING virtual visit by any physician (not specialist-specific, not physician-to-physician).

{province_name.upper()} SCHEDULE EXCERPT (pages {batch_pages[0]}-{batch_pages[-1]}):
Current Section: {section_text}

{context}

TASK:
Find {province_name} codes that bill for THIS SAME CLINICAL ENCOUNTER - a basic virtual care assessment between a physician and patient.

STEP 1 - FIND PRIMARY CODE(S):
What {province_name} code(s) would a physician bill for this same 10+ minute patient-facing virtual assessment?
- Look for: Limited/basic virtual care visits, telephone assessments, video assessments
- Separate codes if {province_name} splits by modality (phone vs video)

STEP 2 - FIND ADD-ON CODES:
What {province_name} codes can be billed IN ADDITION TO the primary code for this type of visit?
- Each add-on must link to specific primary code(s)
- Only include add-ons specifically eligible for virtual care visits
- IMPORTANT: If an add-on has DIFFERENT FEES by modality (telephone vs video), return it as SEPARATE entries

DO NOT INCLUDE:
- Physician-to-physician consultations - wrong service type
- E-assessments / e-consults (specialist-to-PCP) - not patient-facing
- Specialist-only consultations - wrong provider scope
- Ambulance/transport/detention codes - completely different services
- Diagnostic procedure codes (ECG, imaging, etc.) - not consultations
- Psychiatry/psychology specific codes - different specialty
- In-person visit codes (unless no virtual equivalent exists)
- Appendix reference codes that are just claim submission references

JSON only:
{{
  "found": true/false,
  "primary_codes": [
    {{
      "code": "...",
      "description": "full description from schedule",
      "fee": 00.00,
      "modality": "telephone|video|both",
      "reasoning": "why this matches"
    }}
  ],
  "add_on_codes": [
    {{
      "code": "...",
      "description": "...",
      "fee": 00.00,
      "modality": "telephone|video|both",
      "links_to": ["primary_code1", "primary_code2"],
      "condition": "when this add-on applies"
    }}
  ]
}}

IMPORTANT: If a code has different fees for telephone vs video, create SEPARATE entries for each modality with the specific fee.

If no relevant codes on these pages: {{"found": false, "primary_codes": [], "add_on_codes": []}}"""

print(f"Alberta Code: {AB_CODE} - {AB_DESC} (${AB_FEE})")
print("Core functions ready")
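
Before running a large schedule, it can help to estimate how many API calls and roughly how much spend a PDF implies. The sketch below reuses the batch size and the per-token rates hard-coded in `track_cost`. The function `estimate_run` and the page and token figures in the example are assumptions for illustration, and the estimate ignores the fixed prompt text sent with every batch.

```python
import math

PAGES_PER_CALL = 10       # same batch size as the notebook
INPUT_RATE_PER_M = 3.0    # USD per 1M input tokens, as assumed in track_cost
OUTPUT_RATE_PER_M = 15.0  # USD per 1M output tokens, as assumed in track_cost

def estimate_run(n_pages, tokens_per_page, output_tokens_per_call):
    """Return (number of API calls, estimated cost in USD) for one PDF."""
    calls = math.ceil(n_pages / PAGES_PER_CALL)
    input_tokens = n_pages * tokens_per_page
    output_tokens = calls * output_tokens_per_call
    cost = (input_tokens / 1e6) * INPUT_RATE_PER_M + (output_tokens / 1e6) * OUTPUT_RATE_PER_M
    return calls, cost

# Hypothetical 950-page schedule, ~800 tokens per page, ~300 output tokens per call.
calls, cost = estimate_run(950, 800, 300)
print(calls, round(cost, 2))
```

Actual token counts depend on the PDF text and the model's tokenizer, so treat the result as an order-of-magnitude check only.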

## 5. Province Processing Function

In [None]:
def process_province(prov_code, prov_name, pdf_file):
    """Process a single province - EXACT same logic as working Ontario version."""
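    global all_results, total_cost, total_calls
    
    if not pdf_file:
        print(f"No PDF detected for {prov_name}; re-upload it in section 2, then re-run this cell.")
        return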
    
    print(f"{'='*70}")
    print(f"PROCESSING: {prov_name} ({prov_code})")
    print(f"File: {pdf_file}")
    print("="*70)
    
    # ===== LOAD PDF WITH SECTION DETECTION =====
    # EXACT same logic as Crosswalk_Telehealth_Final.ipynb
    print(f"\nLoading {prov_name} PDF...")
    pdf_pages = {}
    page_sections = {}
    
    current_level1 = "UNKNOWN SECTION"
    current_level2 = ""
    
    with pdfplumber.open(pdf_file) as pdf:
        for i, page in enumerate(tqdm(pdf.pages, desc="Loading pages")):
            page_num = i + 1
            try:
                text = page.extract_text()
                if not text:
                    continue
                
                pdf_pages[page_num] = text
                
                # Detect section headers from first 10 lines
                lines = text.split('\n')[:10]
                for line in lines:
                    line = line.strip()
                    
                    # Level 1: ALL CAPS headers
                    if len(line) > 10 and line.isupper():
                        if not any(c.isdigit() for c in line[:4]):
                            if not line.startswith('PAGE') and '....' not in line:
                                current_level1 = line
                                current_level2 = ""
                    
                    # Level 2: Title case headers
                    elif is_valid_level2_header(line):
                        current_level2 = line
                
                page_sections[page_num] = {"level1": current_level1, "level2": current_level2}
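            except Exception as e:
                # Skip unreadable pages, but report them instead of failing silently
                print(f"  Skipped page {page_num}: {e}")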
    
    print(f"Loaded {len(pdf_pages)} pages")
    
    # ===== SEARCH ALL PAGES =====
    print(f"\nSearching for matches...")
    
    all_primary = []
    all_addons = []
    
    page_nums = sorted(pdf_pages.keys())
    batches = [page_nums[i:i+PAGES_PER_CALL] for i in range(0, len(page_nums), PAGES_PER_CALL)]
    
    print(f"Searching {len(page_nums)} pages in {len(batches)} batches...")
    
    for batch_pages in tqdm(batches, desc=f"Searching {prov_code}"):
        context = "\n".join([f"=== PAGE {p} ===\n{pdf_pages[p]}" for p in batch_pages if p in pdf_pages])
        
        # Get section info
        section_info = {"level1": "Unknown", "level2": ""}
        for pg in batch_pages:
            if pg in page_sections:
                section_info = page_sections[pg]
                break
        
        prompt = build_prompt(prov_name, batch_pages, context, section_info)
        
        try:
            resp = client.chat.completions.create(
                model="gpt-5.1-2025-11-13",
                messages=[{"role": "user", "content": prompt}],
                temperature=0.1,
                max_completion_tokens=1500
            )
            track_cost(resp.usage.prompt_tokens, resp.usage.completion_tokens)
            
            content = resp.choices[0].message.content
            match = re.search(r'\{[\s\S]*\}', content)
            if match:
                result = json.loads(match.group())
                
                if result.get('found'):
                    n_primary = len(result.get('primary_codes', []))
                    n_addon = len(result.get('add_on_codes', []))
                    print(f"  Pages {batch_pages[0]}-{batch_pages[-1]}: {n_primary} primary, {n_addon} add-ons")
                    
                    for p in result.get('primary_codes', []):
                        p['pages'] = f"{batch_pages[0]}-{batch_pages[-1]}"
                        p['level1'] = section_info['level1']
                        p['level2'] = section_info['level2']
                        all_primary.append(p)
                    
                    for a in result.get('add_on_codes', []):
                        a['pages'] = f"{batch_pages[0]}-{batch_pages[-1]}"
                        a['level1'] = section_info['level1']
                        a['level2'] = section_info['level2']
                        all_addons.append(a)
                        
        except Exception as e:
            print(f"Error on pages {batch_pages[0]}-{batch_pages[-1]}: {e}")
    
    # ===== DEDUPLICATE =====
    # Key on (code, modality); entries with an empty code are dropped.
    seen_primary = {}
    for p in all_primary:
        key = (p.get('code', ''), p.get('modality', ''))
        if key[0] and key not in seen_primary:
            seen_primary[key] = p
    
    seen_addon = {}
    for a in all_addons:
        key = (a.get('code', ''), a.get('modality', ''))
        if key[0] and key not in seen_addon:
            seen_addon[key] = a
    
    primary_codes = list(seen_primary.values())
    addon_codes = list(seen_addon.values())
    
    # ===== DISPLAY RESULTS =====
    print(f"\n--- {prov_code} PRIMARY CODES ({len(primary_codes)}) ---")
    for p in primary_codes:
        # str()/or-fallbacks keep the format specs from failing on null values in the model's JSON
        code, fee = str(p.get('code') or ''), str(p.get('fee', '?'))
        print(f"  {code:8} | ${fee:>7} | {str(p.get('modality') or '?'):10} | {str(p.get('description') or '')[:40]}")
    
    print(f"\n--- {prov_code} ADD-ON CODES ({len(addon_codes)}) ---")
    for a in addon_codes:
        links = ', '.join(map(str, a.get('links_to') or [])) or 'unspecified'
        print(f"  {str(a.get('code') or ''):8} | ${str(a.get('fee', '?')):>7} | Links to: {links}")
    
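    # ===== ADD TO COMBINED RESULTS =====
    # Replace any rows from an earlier run of this province so re-running the cell adds no duplicates
    all_results[:] = [r for r in all_results if r['Target_Province'] != prov_code]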
    for p in primary_codes:
        all_results.append({
            'AB_Code': AB_CODE,
            'AB_Description': AB_DESC,
            'AB_Fee': AB_FEE,
            'Target_Province': prov_code,
            'Code': p.get('code', ''),
            'Description': p.get('description', ''),
            'Fee': p.get('fee', ''),
            'Type': 'PRIMARY',
            'Modality': p.get('modality', ''),
            'Links_To': '',
            'Condition': '',
            'Reasoning': p.get('reasoning', ''),
            'Level_1_Section': p.get('level1', ''),
            'Level_2_Subsection': p.get('level2', ''),
            'Pages': p.get('pages', '')
        })
    
    for a in addon_codes:
        all_results.append({
            'AB_Code': AB_CODE,
            'AB_Description': AB_DESC,
            'AB_Fee': AB_FEE,
            'Target_Province': prov_code,
            'Code': a.get('code', ''),
            'Description': a.get('description', ''),
            'Fee': a.get('fee', ''),
            'Type': 'ADD-ON',
            'Modality': a.get('modality', ''),
            'Links_To': ', '.join(a.get('links_to', [])) if a.get('links_to') else '',
            'Condition': a.get('condition', ''),
            'Reasoning': '',
            'Level_1_Section': a.get('level1', ''),
            'Level_2_Subsection': a.get('level2', ''),
            'Pages': a.get('pages', '')
        })
    
    print(f"\n{prov_name} complete: {len(primary_codes)} primary + {len(addon_codes)} add-ons")
    print(f"Running total: {len(all_results)} results | ${total_cost:.2f} spent")

print("Province processing function ready")
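
The search loop pulls JSON out of the model reply with a greedy `\{[\s\S]*\}` match, which spans from the first `{` to the last `}`. If the reply contains extra braces after the JSON object, `json.loads` fails on that span and the batch is reported as an error. A minimal sketch of a more tolerant extraction follows, assuming the reply holds one JSON object somewhere in the text. `extract_first_json_object` is a hypothetical helper and the sample reply is made up.

```python
import json

def extract_first_json_object(text):
    """Return the first decodable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows
            obj, _ = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = text.find("{", start + 1)
    return None

reply = 'Here is the result:\n{"found": false, "primary_codes": [], "add_on_codes": []}\nNote: {see above}'
print(extract_first_json_object(reply))
```

The helper returns `None` when nothing parses, so the caller can log the batch and move on rather than relying on an exception.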

---
## 6. Run Ontario

In [None]:
process_province("ON", "Ontario", ON_PDF)

---
## 7. Run Saskatchewan

In [None]:
process_province("SK", "Saskatchewan", SK_PDF)

---
## 8. Run Manitoba

In [None]:
process_province("MB", "Manitoba", MB_PDF)

---
## 9. Run British Columbia

In [None]:
process_province("BC", "British Columbia", BC_PDF)

---
## 10. Save Unified Results

In [None]:
print("="*70)
print("ALL PROVINCES COMPLETE")
print("="*70)
print(f"Total results: {len(all_results)}")
print(f"Total API calls: {total_calls}")
print(f"Total cost: ${total_cost:.2f}")

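# Create DataFrame; explicit columns keep the summary cell working even if no codes were found
RESULT_COLUMNS = ['AB_Code', 'AB_Description', 'AB_Fee', 'Target_Province', 'Code',
                  'Description', 'Fee', 'Type', 'Modality', 'Links_To', 'Condition',
                  'Reasoning', 'Level_1_Section', 'Level_2_Subsection', 'Pages']
df = pd.DataFrame(all_results, columns=RESULT_COLUMNS)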

# Save to Excel
output_file = 'crosswalk_telehealth_all_provinces.xlsx'
df.to_excel(output_file, index=False)
print(f"\nSaved {len(df)} results to {output_file}")

# Download
from google.colab import files
files.download(output_file)

## 11. Summary by Province

In [None]:
print("SUMMARY BY PROVINCE")
print("="*70)

for prov in ['ON', 'SK', 'MB', 'BC']:
    prov_df = df[df['Target_Province'] == prov]
    primary = prov_df[prov_df['Type'] == 'PRIMARY']
    addon = prov_df[prov_df['Type'] == 'ADD-ON']
    
    print(f"\n{prov}:")
    print(f"  PRIMARY codes: {len(primary)}")
    for _, row in primary.iterrows():
        # str() guards against missing (NaN/None) values, which cannot be sliced or padded
        print(f"    {str(row['Code']):8} | ${str(row['Fee']):>7} | {str(row['Modality']):10} | {str(row['Description'])[:35]}")
    
    print(f"  ADD-ON codes: {len(addon)}")
    for _, row in addon.iterrows():
        print(f"    {str(row['Code']):8} | ${str(row['Fee']):>7} | Links: {str(row['Links_To'])[:20]}")
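
A province that returns zero primary codes usually signals a failed run or a missed PDF rather than a true absence of telehealth codes. The sketch below is one way to flag such provinces from rows shaped like `all_results`. `count_by_province` is a hypothetical helper and the rows and codes are made-up placeholders.

```python
from collections import Counter

def count_by_province(rows, provinces):
    """Return {province: (primary_count, addon_count)}, including provinces with no rows."""
    counts = Counter((r["Target_Province"], r["Type"]) for r in rows)
    return {p: (counts[(p, "PRIMARY")], counts[(p, "ADD-ON")]) for p in provinces}

# Made-up rows in the same shape as all_results; the codes are placeholders.
rows = [
    {"Target_Province": "ON", "Type": "PRIMARY", "Code": "P1"},
    {"Target_Province": "ON", "Type": "ADD-ON", "Code": "A1"},
    {"Target_Province": "BC", "Type": "PRIMARY", "Code": "P2"},
]
summary = count_by_province(rows, ["ON", "SK", "MB", "BC"])
needs_rerun = [p for p, (n_primary, _) in summary.items() if n_primary == 0]
print(summary)
print(needs_rerun)
```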

## 12. Full Results Table

In [None]:
df