# 🚀 Operational Flow

**Runs the entire flow from a notebook interface:**
- Initialize the DB
- Load the CSV into the DB
- Load (and validate) the TAK repository
- Apply the TAKs to the DB
- View the output

In [None]:
# !pip install -r requirements-py37.txt
# !pip install -e .

In [1]:
from pathlib import Path
from backend.dataaccess import DataAccess
from core.mediator import Mediator, setup_logging
import logging
import pandas as pd

In [2]:
# Paths
KB_PATH = Path("core/knowledge-base")
DB_PATH = Path("backend/data/mediator.db")
CSV_PATH = Path("backend/data/input_data.csv")

In [3]:
# # 1. Connect to existing DB
# da = DataAccess(db_path=str(DB_PATH))

# 2. Or auto-create (and optionally drop existing)
da = DataAccess(db_path=str(DB_PATH), auto_create=True)

# Check stats
stats = da.get_table_stats()
for table, info in stats.items():
    print(f"{table}: {info['rows']} rows, {info['n_patients']} patients")

[Info] Keeping existing database structure.
InputPatientData: 1790 rows, 12 patients
OutputPatientData: 95 rows, 1 patients
PatientQAScores: 6 rows, 1 patients


In [4]:
# Load CSV into InputPatientData
total_rows = da.load_csv_to_input(
    csv_path=str(CSV_PATH),
    if_exists='append',           # 'append' or 'replace'
    clear_output_and_qa=False,    # Set True to clear outputs
    yes=True                      # Auto-confirm
)
print(f"Loaded {total_rows} rows")

[Info] Validating+inserting CSV in chunks (pandas)...


CSV chunks: 0chunk [00:00, ?chunk/s]

CSV chunks: 1chunk [00:00, 11.50chunk/s]

[Info] Finished loading. Inserted 1790 rows.
[Info]: DB initiated successfully!
[Info]: Total tables created: 4
[Info]: Table 'sqlite_sequence' - Rows: 1
[Info]: Table 'InputPatientData' - Rows: 1790
[Info]: Table 'OutputPatientData' - Rows: 0
[Info]: Table 'PatientQAScores' - Rows: 0
Loaded 1790 rows
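
The log above mentions validating and inserting the CSV in chunks with pandas. The following is a minimal sketch of that general technique, not the actual `load_csv_to_input` implementation: the function name `load_csv_in_chunks`, the required-column list (borrowed from the query columns used later in this notebook), and the drop-incomplete-rows rule are all assumptions made for illustration.

```python
import io
import sqlite3

import pandas as pd

# Assumed column set, taken from the SELECT statements used in this notebook.
REQUIRED_COLUMNS = ["PatientId", "ConceptName", "StartDateTime", "EndDateTime", "Value"]


def load_csv_in_chunks(csv_source, conn, table="InputPatientData", chunksize=1000):
    """Validate each chunk, append it to `table`, and return the inserted row count."""
    total = 0
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        missing = set(REQUIRED_COLUMNS) - set(chunk.columns)
        if missing:
            raise ValueError(f"CSV is missing columns: {sorted(missing)}")
        # Hypothetical validation rule: skip rows without a patient or concept.
        chunk = chunk.dropna(subset=["PatientId", "ConceptName"])
        if chunk.empty:
            continue
        chunk[REQUIRED_COLUMNS].to_sql(table, conn, if_exists="append", index=False)
        total += len(chunk)
    return total


# Usage with an in-memory database and a tiny inline CSV (third row lacks a concept).
demo_csv = io.StringIO(
    "PatientId,ConceptName,StartDateTime,EndDateTime,Value\n"
    "1000,ADMISSION,2025-01-21 09:00:00,2025-01-21 09:00:01,True\n"
    "1000,GLUCOSE_MEASURE,2025-01-21 12:00:00,2025-01-21 12:00:01,172.4\n"
    "1000,,2025-01-21 13:00:00,2025-01-21 13:00:01,Lunch\n"
)
demo_conn = sqlite3.connect(":memory:")
inserted = load_csv_in_chunks(demo_csv, demo_conn, chunksize=2)
print(f"Inserted {inserted} rows")
```

Reading with `chunksize` keeps memory use bounded for large files, since only one chunk is held at a time.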





In [10]:
# Setup logging ONCE (before creating Mediator)
setup_logging(
    log_file=Path("logs/mediator.log"),
    console_level=logging.WARNING  # Only warnings/errors in console
)

# Initialize mediator
mediator = Mediator(knowledge_base_path=KB_PATH, data_access=da)

# Build TAK repository
repo = mediator.build_repository()

# List all TAK names
print("TAK Names:")
for tak_name in sorted(repo.taks.keys()):
    print(f"  - {tak_name}")


PHASE 1: Building TAK Repository


Loading TAKs: 100%|██████████| 148/148 [00:00<00:00, 186.09file/s, Patterns: INSULIN_ON_HIGH_GLUCOSE_PATTERN]


[Validation] Running business-logic checks on TAK repository...

✅ TAK Repository Built Successfully
  Raw Concepts: 65
  Events:       25
  States:       26
  Trends:       23
  Contexts:     4
  Patterns:     5
  TOTAL TAKs:   148

TAK Names:
  - ACUTE_RESPIRATORY_DISORDER
  - ACUTE_RESPIRATORY_DISORDER_EVENT
  - ADMISSION
  - ADMISSION_EVENT
  - ALANINE-AMINOTRANSFERASE_MEASURE
  - ALANINE-AMINOTRANSFERASE_MEASURE_STATE
  - ALANINE-AMINOTRANSFERASE_MEASURE_TREND
  - ALBUMIN_MEASURE
  - ALBUMIN_MEASURE_STATE
  - ALBUMIN_MEASURE_TREND
  - ANTIBIOTIC_BITZUA_EVENT
  - ANTIBIOTIC_IV_BITZUA
  - ANTIBIOTIC_PO_BITZUA
  - ASPARATE-AMINOTRANSFERASE_MEASURE
  - ASPARATE-AMINOTRANSFERASE_MEASURE_STATE
  - ASPARATE-AMINOTRANSFERASE_MEASURE_TREND
  - ATHEROSCLEROSIS
  - ATHEROSCLEROSIS_EVENT
  - BASAL_BITZUA
  - BASAL_BITZUA_STATE
  - BASE_GLUCOSE_MEASURE
  - BICARBONATE_BITZUA
  - BICARBONATE_BITZUA_EVENT
  - BICARBONATE_MEASURE
  - BICARBONATE_MEASURE_STATE
  - BICARBONATE_MEASURE_TREND
  - 




In [6]:
# Process specific patients (Jupyter-compatible)
patient_ids = [1000]
patient_stats = await mediator.run_async(
    max_concurrent=4,
    patient_subset=patient_ids
)

# Print results with TAK usage breakdown
print("\n" + "="*80)
print("📊 Patient Processing Results")
print("="*80)

for pid, stats in patient_stats.items():
    if "error" in stats:
        print(f"\n❌ Patient {pid}: {stats['error']}")
        continue

    # Separate used vs unused TAKs
    used_taks = {k: v for k, v in stats.items() if isinstance(v, int) and v > 0}
    unused_taks = {k: v for k, v in stats.items() if isinstance(v, int) and v == 0}
    error_taks = {k: v for k, v in stats.items() if isinstance(v, str) and v.startswith("ERROR")}

    total_rows = sum(used_taks.values())

    print(f"\n✅ Patient {pid}: {total_rows} total output rows")

    # Show used TAKs (with row counts)
    if used_taks:
        print(f"\n   📈 Used TAKs ({len(used_taks)}):")
        for tak_name, count in sorted(used_taks.items(), key=lambda x: -x[1]):
            print(f"      • {tak_name}: {count} rows")

    # Show unused TAKs (compact list)
    if unused_taks:
        print(f"\n   ⚪ Unused TAKs ({len(unused_taks)}):")
        unused_names = sorted(unused_taks.keys())
        # Print in columns (3 per line)
        for i in range(0, len(unused_names), 3):
            row = unused_names[i:i+3]
            print(f"      {', '.join(row)}")

    # Show errors
    if error_taks:
        print(f"\n   ❌ TAKs with Errors ({len(error_taks)}):")
        for tak_name, error in error_taks.items():
            print(f"      • {tak_name}: {error}")

print("\n" + "="*80)


PHASE 1: Building TAK Repository


Loading TAKs:   5%|▌         | 7/140 [00:00<00:00, 160.39file/s, Raw Concepts: BICARBONATE_BITZUA]

Loading TAKs: 100%|██████████| 140/140 [00:00<00:00, 234.89file/s, Patterns: INSULIN_ON_HIGH_GLUCOSE_PATTERN]


[Validation] Running business-logic checks on TAK repository...

✅ TAK Repository Built Successfully
  Raw Concepts: 61
  Events:       23
  States:       25
  Trends:       21
  Contexts:     5
  Patterns:     5
  TOTAL TAKs:   140


PHASE 2: Processing 1 Patients (max_concurrent=4)
         Patient Subset: [1000]



Processing patients: 100%|██████████| 1/1 [00:01<00:00,  1.02s/patient]


✅ Patient Processing Complete
  Patients processed: 1
  Total rows written: 187
  Errors:             0


📊 Patient Processing Results

✅ Patient 1000: 187 total output rows

   📈 Used TAKs (29):
      • GLUCOSE_MEASURE: 22 rows
      • M-SHR_MEASURE: 22 rows
      • NORMAL_GLUCOSE_IND: 19 rows
      • GLUCOSE_MEASURE_TREND: 15 rows
      • M-SHR_MEASURE_TREND: 15 rows
      • BOLUS_BITZUA: 10 rows
      • MEAL: 10 rows
      • GLUCOSE_MEASURE_STATE: 10 rows
      • BOLUS_BITZUA_CONTEXT: 10 rows
      • MEAL_CONTEXT: 10 rows
      • BOLUS_BITZUA_STATE: 9 rows
      • M-SHR_MEASURE_STATE: 7 rows
      • DISGLYCEMIA_EVENT: 5 rows
      • BASAL_BITZUA: 3 rows
      • BASAL_BITZUA_STATE: 3 rows
      • BASAL_BITZUA_CONTEXT: 3 rows
      • HIGH_GLUCOSE_CONTEXT: 2 rows
      • ADMISSION: 1 rows
      • BMI_MEASURE: 1 rows
      • DIABETES_DIAGNOSIS: 1 rows
      • FIRST_GLUCOSE_MEASURE: 1 rows
      • RELEASE: 1 rows
      • WEIGHT_MEA
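
The `max_concurrent=4` argument above suggests that patients are processed concurrently with an upper bound on parallel work. How `run_async` enforces that bound is not shown in this notebook. One common way to do it is an `asyncio.Semaphore`, sketched here with a hypothetical `process_patient` stand-in and a hypothetical `run_bounded` helper.

```python
import asyncio


async def process_patient(patient_id):
    """Hypothetical stand-in for the real per-patient abstraction work."""
    await asyncio.sleep(0.01)
    return {"rows": patient_id % 7}


async def run_bounded(patient_ids, max_concurrent=4):
    """Process patients concurrently, never more than `max_concurrent` at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    results = {}

    async def worker(pid):
        async with semaphore:
            try:
                results[pid] = await process_patient(pid)
            except Exception as exc:
                # Record the failure and let the other patients continue.
                results[pid] = {"error": str(exc)}

    await asyncio.gather(*(worker(pid) for pid in patient_ids))
    return results


# In a notebook cell, use: results = await run_bounded([1000, 1001, 1002])
# In a plain script there is no running event loop, so asyncio.run() is needed:
results = asyncio.run(run_bounded([1000, 1001, 1002], max_concurrent=2))
print(results)
```

Storing an `"error"` entry per patient mirrors the `if "error" in stats` check used in the reporting loop above.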




In [7]:
# Query OutputPatientData
query = """
SELECT PatientId, ConceptName, StartDateTime, EndDateTime, Value
FROM OutputPatientData
WHERE PatientId = 1000
"""
df_results = pd.read_sql_query(query, da.conn)
df_results = df_results.sort_values(by=["StartDateTime"], ignore_index=True)
df_results

Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value
0,1000,ADMISSION_EVENT,2025-01-21 09:00:00,2025-01-21 09:00:01,True
1,1000,INSULIN_ON_ADMISSION_PATTERN,2025-01-21 09:00:01,2025-01-21 13:07:01,Partial
2,1000,GLUCOSE_MEASURE_ON_ADMISSION_PATTERN,2025-01-21 09:00:01,2025-01-21 12:00:01,True
3,1000,BMI_MEASURE_ON_ADMISSION,2025-01-21 09:00:01,2025-01-21 15:00:01,True
4,1000,DIABETES_DIAGNOSIS_CONTEXT,2025-01-21 09:00:01,2025-01-24 13:59:59,True
...,...,...,...,...,...
90,1000,MEAL_CONTEXT,2025-01-24 11:00:00,2025-01-24 13:59:59,Lunch
91,1000,GLUCOSE_MEASURE_STATE,2025-01-24 12:00:00,2025-01-24 13:59:59,Low Glucose
92,1000,BOLUS_BITZUA_CONTEXT,2025-01-24 12:51:00,2025-01-24 13:59:59,True
93,1000,BOLUS_BITZUA_STATE,2025-01-24 12:51:00,2025-01-24 13:59:59,IntraVenous Medium


In [9]:
# Query PatientQAScores
query = """
SELECT PatientId, PatternName, StartDateTime, EndDateTime, ComplianceType, ComplianceScore
FROM PatientQAScores
WHERE PatientId = 1000
"""
df_results = pd.read_sql_query(query, da.conn)
df_results = df_results.sort_values(by=["PatternName"], ignore_index=True)
df_results

Unnamed: 0,PatientId,PatternName,StartDateTime,EndDateTime,ComplianceType,ComplianceScore
0,1000,BMI_MEASURE_ON_ADMISSION,2025-01-21 09:00:00,2025-01-21 15:00:01,TimeConstraint,1.0
1,1000,CREATININE_MEASURE_ON_ADMISSION,2262-04-11 23:47:16.854775807,2262-04-11 23:47:16.854775807,TimeConstraint,0.0
2,1000,GLUCOSE_MEASURE_ON_ADMISSION_PATTERN,2025-01-21 09:00:00,2025-01-21 12:00:01,TimeConstraint,1.0
3,1000,INSULIN_ON_ADMISSION_PATTERN,2025-01-21 09:00:00,2025-01-21 13:07:01,TimeConstraint,1.0
4,1000,INSULIN_ON_ADMISSION_PATTERN,2025-01-21 09:00:00,2025-01-21 13:07:01,ValueConstraint,0.870083
5,1000,INSULIN_ON_HIGH_GLUCOSE_PATTERN,2262-04-11 23:47:16.854775807,2262-04-11 23:47:16.854775807,TimeConstraint,0.0
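
The `2262-04-11 23:47:16.854775807` values in rows 1 and 5 equal `pd.Timestamp.max`, the largest timestamp pandas can represent at nanosecond resolution. That they mark patterns with no matching instance is an assumption here, suggested by the accompanying `0.0` scores. Under that assumption, a small sketch for separating matched from unmatched patterns (the sample frame is illustrative, built from two rows of the table above):

```python
import pandas as pd

# Assumed sentinel: pandas' maximum nanosecond timestamp.
SENTINEL = pd.Timestamp.max

qa = pd.DataFrame(
    {
        "PatternName": ["BMI_MEASURE_ON_ADMISSION", "CREATININE_MEASURE_ON_ADMISSION"],
        "StartDateTime": [pd.Timestamp("2025-01-21 09:00:00"), pd.Timestamp.max],
        "ComplianceScore": [1.0, 0.0],
    }
)

# Boolean mask of rows whose start time is the sentinel value.
unmatched = qa["StartDateTime"] == SENTINEL
matched_patterns = qa.loc[~unmatched, "PatternName"].tolist()
unmatched_patterns = qa.loc[unmatched, "PatternName"].tolist()
print("Matched:  ", matched_patterns)
print("Unmatched:", unmatched_patterns)
```

Filtering these rows out before plotting or computing durations avoids intervals that start in the year 2262.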


# 🔍 TAK Debugging Toolkit

Interactive debugging for a single patient's abstraction flow.

**Features:**
- Load minimal data (only concepts relevant to selected TAKs)
- Use isolated debug DB (no production contamination)
- Full TAK execution trace with cache inspection
- Compare input → intermediate → output at each layer

In [12]:
# ============================================================
# Configuration: Select Patient & TAKs to Debug
# ============================================================

DEBUG_PATIENT_ID = 1000

# Select TAKs to trace (in dependency order)
DEBUG_TAKS = [
    # Raw concepts
    "DIABETES_DIAGNOSIS",
    "ADMISSION",
    "GLUCOSE_MEASURE",
    "BASAL_BITZUA",
    "DEATH",
    "RELEASE",
    
    # Events
    "ADMISSION_EVENT",
    "DEATH_EVENT",
    "RELEASE_EVENT",
    
    # Contexts
    "DIABETES_DIAGNOSIS_CONTEXT",
    
    # Patterns
    "GLUCOSE_MEASURE_ON_ADMISSION_PATTERN"
]

# Debug DB path (separate from production)
DEBUG_DB_PATH = Path("backend/data/debug_mediator.db")

In [13]:
# ============================================================
# Step 1: Extract Minimal Input Data for Debug Patient
# ============================================================

# Query production DB for this patient's input data
query = f"""
SELECT DISTINCT PatientId, ConceptName, StartDateTime, EndDateTime, Value
FROM InputPatientData
WHERE PatientId = {DEBUG_PATIENT_ID}
ORDER BY StartDateTime, ConceptName
"""

df_patient_input = pd.read_sql_query(query, da.conn)

print(f"üì• Input data for patient {DEBUG_PATIENT_ID}:")
print(f"   Total rows: {len(df_patient_input)}")
print(f"\nüìä Unique concepts in input:")
for concept in sorted(df_patient_input['ConceptName'].unique()):
    count = (df_patient_input['ConceptName'] == concept).sum()
    print(f"   - {concept}: {count} rows")

# Show sample
display(df_patient_input.head(10))

📥 Input data for patient 1000:
   Total rows: 63

📊 Unique concepts in input:
   - ADMISSION: 1 rows
   - BASAL_DOSAGE: 3 rows
   - BASAL_ROUTE: 3 rows
   - BMI_MEASURE: 1 rows
   - BOLUS_DOSAGE: 10 rows
   - BOLUS_ROUTE: 10 rows
   - DIABETES_DIAGNOSIS: 1 rows
   - GLUCOSE_MEASURE: 22 rows
   - MEAL: 10 rows
   - RELEASE: 1 rows
   - WEIGHT_MEASURE: 1 rows


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value
0,1000,ADMISSION,2025-01-21 09:00:00,2025-01-21 09:00:01,True
1,1000,DIABETES_DIAGNOSIS,2025-01-21 09:00:00,2025-01-21 09:00:01,True
2,1000,GLUCOSE_MEASURE,2025-01-21 12:00:00,2025-01-21 12:00:01,172.4
3,1000,MEAL,2025-01-21 13:00:00,2025-01-21 13:00:01,Lunch
4,1000,BOLUS_DOSAGE,2025-01-21 13:07:00,2025-01-21 13:07:01,14.6
5,1000,BOLUS_ROUTE,2025-01-21 13:07:00,2025-01-21 13:07:01,IntraVenous
6,1000,BMI_MEASURE,2025-01-21 15:00:00,2025-01-21 15:00:01,25.3
7,1000,WEIGHT_MEASURE,2025-01-21 15:00:00,2025-01-21 15:00:01,83.9
8,1000,GLUCOSE_MEASURE,2025-01-21 16:00:00,2025-01-21 16:00:01,216.2
9,1000,MEAL,2025-01-21 19:00:00,2025-01-21 19:00:01,Dinner


In [14]:
# ============================================================
# Step 2: Create Debug DB and Load Minimal Data
# ============================================================

# Close any existing debug DB connections
if 'debug_da' in globals() and hasattr(debug_da, 'conn'):
    try:
        debug_da.conn.close()
        del debug_da
    except Exception:
        pass  # Best effort: the connection may already be closed

# Delete debug DB file entirely (force fresh start)
if DEBUG_DB_PATH.exists():
    import time
    for _ in range(3):  # Retry up to 3 times
        try:
            DEBUG_DB_PATH.unlink()
            print(f"🗑️  Deleted old debug DB: {DEBUG_DB_PATH}")
            break
        except PermissionError:
            time.sleep(0.5)  # Wait for file handles to release
    else:
        # All retries failed. No exception is active in the for-else branch,
        # so a bare `raise` would itself fail; raise explicitly instead.
        raise PermissionError(
            f"Could not delete debug DB (file in use): {DEBUG_DB_PATH}. "
            "Restart kernel and try again."
        )

# Create fresh debug DB
debug_da = DataAccess(db_path=str(DEBUG_DB_PATH), auto_create=True)
print(f"✅ Created fresh debug DB at {DEBUG_DB_PATH}")

# Load ONLY concepts relevant to DEBUG_TAKS
# Build list of input concepts needed by selected TAKs
mediator_temp = Mediator(knowledge_base_path=KB_PATH, data_access=debug_da)
repo_temp = mediator_temp.build_repository()

needed_concepts = set()
for tak_name in DEBUG_TAKS:
    tak = repo_temp.get(tak_name)
    if tak:
        if hasattr(tak, 'derived_from'):
            if isinstance(tak.derived_from, list):
                # Event/Context/Pattern
                for dep in tak.derived_from:
                    needed_concepts.add(dep['name'])
            else:
                # State/Trend
                needed_concepts.add(tak.derived_from)
        
        # Add attributes for raw-concepts
        if hasattr(tak, 'attributes'):
            for attr in tak.attributes:
                needed_concepts.add(attr['name'])

print(f"\nüéØ Concepts needed for selected TAKs:")
print(f"   {sorted(needed_concepts)}")

# Filter input data to only needed concepts
df_filtered = df_patient_input[df_patient_input['ConceptName'].isin(needed_concepts)].copy()

print(f"\nüì¶ Filtered input data:")
print(f"   Rows: {len(df_patient_input)} ‚Üí {len(df_filtered)}")
print(f"   Concepts: {df_patient_input['ConceptName'].nunique()} ‚Üí {df_filtered['ConceptName'].nunique()}")

# Load to debug DB
if not df_filtered.empty:
    df_filtered.to_sql('InputPatientData', debug_da.conn, if_exists='append', index=False)
    print(f"\n‚úÖ Loaded {len(df_filtered)} rows to debug DB")
else:
    print(f"\n‚ö†Ô∏è  No data to load (needed concepts not found in input)")

🗑️  Deleted old debug DB: backend\data\debug_mediator.db
[Info]: Initializing new database at backend\data\debug_mediator.db
[Info] Dropping existing tables...
[Info] Creating tables from DDL...
[Info]: DB initiated successfully!
[Info]: Total tables created: 4
[Info]: Table 'InputPatientData' - Rows: 0
[Info]: Table 'sqlite_sequence' - Rows: 0
[Info]: Table 'OutputPatientData' - Rows: 0
[Info]: Table 'PatientQAScores' - Rows: 0
✅ Created fresh debug DB at backend\data\debug_mediator.db

PHASE 1: Building TAK Repository


Loading TAKs:   8%|▊         | 3/37 [00:00<00:00, 210.54file/s, Raw Concepts: BMI_MEASURE]

Loading TAKs: 100%|██████████| 37/37 [00:00<00:00, 249.90file/s, Patterns: INSULIN_ON_HIGH_GLUCOSE_PATTERN]


[Validation] Running business-logic checks on TAK repository...

✅ TAK Repository Built Successfully
  Raw Concepts: 15
  Events:       4
  States:       5
  Trends:       2
  Contexts:     6
  Patterns:     5
  TOTAL TAKs:   37


🎯 Concepts needed for selected TAKs:
   ['ADMISSION', 'ADMISSION_EVENT', 'BASAL_DOSAGE', 'BASAL_ROUTE', 'DEATH', 'DIABETES_DIAGNOSIS', 'DIABETES_DIAGNOSIS_CONTEXT', 'GLUCOSE_MEASURE', 'RELEASE']

📦 Filtered input data:
   Rows: 63 → 31
   Concepts: 11 → 6

✅ Loaded 31 rows to debug DB
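
The cell above collects only the direct `derived_from` entries of each selected TAK, so `DEBUG_TAKS` has to list the full dependency chain by hand. A sketch of resolving the chain automatically is shown below. It uses a hypothetical, simplified repository (a plain dict mapping each TAK name to the names it derives from) rather than the real repository objects; the pattern's dependencies are copied from the cache inspection output in this notebook.

```python
# Hypothetical, simplified repository: TAK name -> names it is derived from.
TOY_REPO = {
    "ADMISSION": [],
    "DIABETES_DIAGNOSIS": [],
    "GLUCOSE_MEASURE": [],
    "ADMISSION_EVENT": ["ADMISSION"],
    "DIABETES_DIAGNOSIS_CONTEXT": ["DIABETES_DIAGNOSIS"],
    "GLUCOSE_MEASURE_ON_ADMISSION_PATTERN": [
        "DIABETES_DIAGNOSIS_CONTEXT",
        "GLUCOSE_MEASURE",
        "ADMISSION_EVENT",
    ],
}


def resolve_dependencies(selected, repo):
    """Return the selected TAKs plus everything they depend on, directly or indirectly."""
    needed = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name in needed:
            continue  # Already visited; also guards against cycles
        needed.add(name)
        stack.extend(repo.get(name, []))
    return needed


closure = resolve_dependencies(["GLUCOSE_MEASURE_ON_ADMISSION_PATTERN"], TOY_REPO)
print(sorted(closure))
```

With this helper, selecting a single pattern would be enough to pull in its context, event, and raw-concept inputs.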





In [15]:
# ============================================================
# Step 3: Run Mediator with TAK Cache Tracing
# ============================================================

# Initialize debug mediator
debug_mediator = Mediator(knowledge_base_path=KB_PATH, data_access=debug_da)
debug_repo = debug_mediator.build_repository()

# Enable detailed logging
# Note: logging.basicConfig() does nothing if the root logger already has handlers,
# which may be the case if setup_logging() ran earlier in this session.
logging.basicConfig(level=logging.INFO, format='%(name)s - %(levelname)s - %(message)s')

# Run processing for debug patient
print(f"\n🚀 Processing patient {DEBUG_PATIENT_ID} with cache tracing...\n")
print("="*80)

# Process patient synchronously (for easier debugging)
stats, tak_cache = debug_mediator._process_patient_sync(DEBUG_PATIENT_ID, return_cache=True)

print("\n" + "="*80)
print(f"\n✅ Processing complete!")
print(f"\n📊 Output stats:")
for tak_name, count in stats.items():
    if isinstance(count, int) and count > 0:
        print(f"   - {tak_name}: {count} rows")
    elif isinstance(count, str) and count.startswith("ERROR"):
        print(f"   ❌ {tak_name}: {count}")

print(f"\n🗄️  TAK Cache Size: {len(tak_cache)} TAKs cached")


PHASE 1: Building TAK Repository


Loading TAKs: 100%|██████████| 37/37 [00:00<00:00, 279.85file/s, Patterns: INSULIN_ON_HIGH_GLUCOSE_PATTERN]


[Validation] Running business-logic checks on TAK repository...

✅ TAK Repository Built Successfully
  Raw Concepts: 15
  Events:       4
  States:       5
  Trends:       2
  Contexts:     6
  Patterns:     5
  TOTAL TAKs:   37


🚀 Processing patient 1000 with cache tracing...


✅ Processing complete!

📊 Output stats:
   - ADMISSION: 1 rows
   - BASAL_BITZUA: 3 rows
   - DIABETES_DIAGNOSIS: 1 rows
   - GLUCOSE_MEASURE: 22 rows
   - NORMAL_GLUCOSE_IND: 19 rows
   - RELEASE: 1 rows
   - ADMISSION_EVENT: 1 rows
   - DISGLYCEMIA_EVENT: 5 rows
   - RELEASE_EVENT: 1 rows
   - BASAL_BITZUA_STATE: 3 rows
   - GLUCOSE_MEASURE_STATE: 10 rows
   - GLUCOSE_MEASURE_TREND: 15 rows
   - BASAL_BITZUA_CONTEXT: 3 rows
   - DIABETES_DIAGNOSIS_CONTEXT: 1 rows
   - HIGH_GLUCOSE_CONTEXT: 2 rows
   - GLUCOSE_MEASURE_ON_ADMISSION_PATTERN: 1 rows
   - INSULIN_ON_ADMISSION_PATTERN: 1 rows

🗄️  TAK Cache Size: 37 TAKs cached


In [16]:
# ============================================================
# Step 4: Inspect TAK Cache (Intermediate Outputs)
# ============================================================

# Query cache for each selected TAK
print(f"\n🔍 TAK-by-TAK Cache Inspection:\n")

for tak_name in DEBUG_TAKS:
    tak = debug_repo.get(tak_name)
    if not tak:
        print(f"❌ {tak_name}: TAK not found in repository")
        continue

    # Get from cache (not DB!)
    if tak_name not in tak_cache:
        print(f"{'='*80}")
        print(f"📌 {tak_name} ({tak.family})")
        print(f"   ⚠️  NOT IN CACHE (TAK not processed or error occurred)")
        print()
        continue

    df_tak_output = tak_cache[tak_name]

    print(f"{'='*80}")
    print(f"📌 {tak_name} ({tak.family})")

    # Show derived_from info
    if hasattr(tak, 'derived_from'):
        if isinstance(tak.derived_from, list):
            deps = [d['name'] for d in tak.derived_from]
            print(f"   Derived from: {deps}")
        else:
            print(f"   Derived from: {tak.derived_from}")

    print(f"   Cached rows: {len(df_tak_output)}")

    if not df_tak_output.empty:
        print(f"\n   📊 Column dtypes:")
        for col, dtype in df_tak_output.dtypes.items():
            print(f"      {col}: {dtype}")

        print(f"\n   📋 Sample output (first 5 rows):")
        display(df_tak_output.head(5))

    else:
        print(f"\n   ⚠️  EMPTY OUTPUT")

        # Check dependencies to diagnose why
        if hasattr(tak, 'derived_from'):
            print(f"\n   🔍 Dependency Check:")
            if isinstance(tak.derived_from, list):
                for dep in tak.derived_from:
                    dep_name = dep['name']
                    if dep_name in tak_cache:
                        dep_rows = len(tak_cache[dep_name])
                        print(f"      - {dep_name}: {dep_rows} rows")
                    else:
                        print(f"      - {dep_name}: NOT IN CACHE")
            else:
                dep_name = tak.derived_from
                if dep_name in tak_cache:
                    dep_rows = len(tak_cache[dep_name])
                    print(f"      - {dep_name}: {dep_rows} rows")
                else:
                    print(f"      - {dep_name}: NOT IN CACHE")

    print()


🔍 TAK-by-TAK Cache Inspection:

📌 DIABETES_DIAGNOSIS (raw-concept)
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,DIABETES_DIAGNOSIS,2025-01-21 09:00:00,2025-01-21 09:00:01,"(True,)",raw-concept



📌 ADMISSION (raw-concept)
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,ADMISSION,2025-01-21 09:00:00,2025-01-21 09:00:01,"(True,)",raw-concept



📌 GLUCOSE_MEASURE (raw-concept)
   Cached rows: 22

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,GLUCOSE_MEASURE,2025-01-21 12:00:00,2025-01-21 12:00:01,"(172.4,)",raw-concept
1,1000,GLUCOSE_MEASURE,2025-01-21 16:00:00,2025-01-21 16:00:01,"(216.2,)",raw-concept
2,1000,GLUCOSE_MEASURE,2025-01-21 20:00:00,2025-01-21 20:00:01,"(128.3,)",raw-concept
3,1000,GLUCOSE_MEASURE,2025-01-22 00:00:00,2025-01-22 00:00:01,"(182.1,)",raw-concept
4,1000,GLUCOSE_MEASURE,2025-01-22 04:00:00,2025-01-22 04:00:01,"(131.8,)",raw-concept



📌 BASAL_BITZUA (raw-concept)
   Cached rows: 3

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,BASAL_BITZUA,2025-01-21 21:00:00,2025-01-21 21:00:01,"(60.7, SubCutaneous)",raw-concept
1,1000,BASAL_BITZUA,2025-01-22 21:00:00,2025-01-22 21:00:01,"(11.8, IntraVenous)",raw-concept
2,1000,BASAL_BITZUA,2025-01-23 21:00:00,2025-01-23 21:00:01,"(20.5, IntraVenous)",raw-concept



📌 DEATH (raw-concept)
   Cached rows: 0

   ⚠️  EMPTY OUTPUT

📌 RELEASE (raw-concept)
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,RELEASE,2025-01-24 14:00:00,2025-01-24 14:00:01,"(True,)",raw-concept



📌 ADMISSION_EVENT (event)
   Derived from: ['ADMISSION']
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,ADMISSION_EVENT,2025-01-21 09:00:00,2025-01-21 09:00:01,True,event



📌 DEATH_EVENT (event)
   Derived from: ['DEATH']
   Cached rows: 0

   ⚠️  EMPTY OUTPUT

   🔍 Dependency Check:
      - DEATH: 0 rows

📌 RELEASE_EVENT (event)
   Derived from: ['RELEASE']
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,RELEASE_EVENT,2025-01-24 14:00:00,2025-01-24 14:00:01,True,event



📌 DIABETES_DIAGNOSIS_CONTEXT (context)
   Derived from: ['DIABETES_DIAGNOSIS']
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,DIABETES_DIAGNOSIS_CONTEXT,2025-01-21 09:00:01,2025-01-24 13:59:59,True,context



📌 GLUCOSE_MEASURE_ON_ADMISSION_PATTERN (local-pattern)
   Derived from: ['DIABETES_DIAGNOSIS_CONTEXT', 'GLUCOSE_MEASURE', 'ADMISSION_EVENT']
   Cached rows: 1

   📊 Column dtypes:
      PatientId: int64
      ConceptName: object
      StartDateTime: datetime64[ns]
      EndDateTime: datetime64[ns]
      Value: object
      AbstractionType: object

   📋 Sample output (first 5 rows):


Unnamed: 0,PatientId,ConceptName,StartDateTime,EndDateTime,Value,AbstractionType
0,1000,GLUCOSE_MEASURE_ON_ADMISSION_PATTERN,2025-01-21 09:00:01,2025-01-21 12:00:01,True,local-pattern
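
In the cached raw-concept frames above, `Value` is displayed as a tuple, for example `(60.7, SubCutaneous)` for `BASAL_BITZUA`. Assuming the cells really hold Python tuples (the `object` dtype is consistent with that, but it is not confirmed here), they can be expanded into separate columns for easier inspection. The column names `Dosage` and `Route` are hypothetical labels inferred from the `BASAL_DOSAGE` and `BASAL_ROUTE` input concepts.

```python
import pandas as pd

# Illustrative frame mimicking two cached BASAL_BITZUA rows.
cached = pd.DataFrame(
    {
        "ConceptName": ["BASAL_BITZUA", "BASAL_BITZUA"],
        "Value": [(60.7, "SubCutaneous"), (11.8, "IntraVenous")],
    }
)

# Turn each tuple position into its own column (hypothetical column names).
parts = pd.DataFrame(
    cached["Value"].tolist(), columns=["Dosage", "Route"], index=cached.index
)
expanded = cached.drop(columns="Value").join(parts)
print(expanded)
```

Passing `index=cached.index` keeps the expanded columns aligned with the original rows even when the cached frame does not use a default integer index.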





In [10]:
# ============================================================
# Step 5: Cleanup Debug DB
# ============================================================

# Close connection
debug_da.conn.close()

# Optionally delete debug DB
# DEBUG_DB_PATH.unlink()
# print(f"🗑️  Deleted debug DB: {DEBUG_DB_PATH}")

print(f"\n✅ Debug session complete!")
print(f"   Debug DB preserved at: {DEBUG_DB_PATH}")
print(f"   (Delete manually or re-run this block to recreate)")


✅ Debug session complete!
   Debug DB preserved at: backend\data\debug_mediator.db
   (Delete manually or re-run this block to recreate)
