# 00 - Master Orchestrator: Bronze → Silver Processing

Main orchestration notebook for processing parquet files through Bronze and Silver layers.

## Architecture Overview

```
Parquet Files (Files/{source}/{run_ts}/)
    ↓
Bronze Layer (append with run_ts for CDC)
    ↓
Silver Layer (CDC merge: INSERT/UPDATE/DELETE)
    ↓
Watermark Update (incremental tables only)
```

## Process Flow

1. **Load Configuration** (DAG, enabled tables, retry filter)
2. **Check Incremental** → Run watermark merge if needed
3. **Bronze Processing** → Parallel table loading (10 workers)
4. **Bronze Logging** → Batch log all results
5. **Silver Processing** → Parallel CDC merge (tables with business_keys)
6. **Silver Logging** → Batch log all results
7. **Summary Statistics** → Performance metrics, efficiency

## Key Features

- **Parallel Processing**: ThreadPoolExecutor for 5-10x speedup
- **Idempotency**: Check logs before reprocessing
- **Retry Support**: Process only specific tables
- **Error Resilience**: Continue on failure, comprehensive logging
- **Performance Tracking**: Efficiency metrics (theoretical vs actual time)
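
As a rough illustration of the efficiency metric, the sketch below applies the same formula the Bronze summary cell uses: the sum of per-table durations divided by the worker count gives a theoretical minimum, which is then compared with the wall-clock time. The function name `parallel_efficiency` and the timings are hypothetical, chosen only to make the arithmetic visible.

```python
def parallel_efficiency(task_seconds, workers, actual_seconds):
    """Return (theoretical_min_sec, efficiency_pct) for one parallel run."""
    total = sum(task_seconds)
    # Best case: the work is spread perfectly evenly over all workers.
    theoretical = total / workers if workers > 0 else total
    pct = theoretical / actual_seconds * 100 if actual_seconds > 0 else 0.0
    return theoretical, pct


# Hypothetical timings: 20 tables of 6 s each, 10 workers, 20 s wall clock.
theoretical, pct = parallel_efficiency([6] * 20, workers=10, actual_seconds=20)
print(f"theoretical={theoretical:.1f}s efficiency={pct:.1f}%")
```

A value well below 100% suggests that a few long-running tables, or scheduling overhead, dominate the run rather than the worker count.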

## Parameters

- `source`: Source system name (e.g., "vizier")
- `run_ts`: Run timestamp (e.g., "20251105T142752505")
- `dag_path`: Path to DAG configuration JSON
- `retry_tables`: Optional list of tables to retry
- `force_reload`: If `True`, ignore the processing logs and reload all tables
- `max_workers`: Parallel workers (default: 10)
- `debug`: Enable debug output
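
The `run_ts` examples look like `YYYYMMDD`, a literal `T`, `HHMMSS`, and three millisecond digits. The Bronze log output later in this notebook shows parquet paths of the form `Files/{base_files}/{source}/YYYY/MM/DD/{run_ts}/{table}`. The sketch below parses a `run_ts` and rebuilds such a path. Both helper names are hypothetical, and the layout is inferred from the logged paths rather than taken from the worker notebook.

```python
from datetime import datetime


def parse_run_ts(run_ts: str) -> datetime:
    # Assumed layout: YYYYMMDD + "T" + HHMMSS + millisecond digits.
    return datetime.strptime(run_ts, "%Y%m%dT%H%M%S%f")


def build_parquet_path(base_files: str, source: str, run_ts: str, table: str) -> str:
    # Date partition folders are derived from the timestamp itself.
    ts = parse_run_ts(run_ts)
    return f"Files/{base_files}/{source}/{ts:%Y/%m/%d}/{run_ts}/{table}"


path = build_parquet_path(
    "greenhouse_sources", "anva_meeus", "20251001T183103260", "Dim_Agent"
)
print(path)
```

Parsing the timestamp up front also rejects malformed values early, because `strptime` raises `ValueError` on input that does not match the pattern.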

In [23]:
# Parameters (Papermill compatible)
source = "anva_meeus"                               # Source system name
run_ts = "20251001T183103260"                       # Run timestamp
dag_path = "config/dag_anva_meeus_week.json"        # DAG configuration path
retry_tables = None                                 # Optional: list of table names to retry
force_reload = False                                # If True, ignore logs and reload all
max_workers = 10                                    # Parallel workers for table processing
debug = True                                        # Enable debug output

## [1] Setup and Imports

In [24]:
from datetime import datetime
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List, Dict, Any
import json
from uuid import uuid4

print("="*80)
print("MASTER ORCHESTRATOR STARTING")
print("="*80)
print(f"Source: {source}")
print(f"Run TS: {run_ts}")
print(f"DAG: {dag_path}")
print(f"Retry tables: {retry_tables}")
print(f"Force reload: {force_reload}")
print(f"Max workers: {max_workers}")
print(f"Debug: {debug}")
print("="*80)

MASTER ORCHESTRATOR STARTING
Source: anva_meeus
Run TS: 20251001T183103260
DAG: config/dag_anva_meeus_week.json
Retry tables: None
Force reload: False
Max workers: 10
Debug: True


## [2] Load Utility Notebooks

# Load logging utilities
# Load config utilities


In [25]:
# Debug: Check working directory
from pathlib import Path
import os

print(f"Current working directory: {os.getcwd()}")
print(f"Path.cwd(): {Path.cwd()}")
print(f"\nFiles in current dir:")
print(list(Path.cwd().iterdir())[:10])

print(f"\nDoes notebooks/ exist from here? {Path('notebooks').exists()}")
print(f"Does ../notebooks/ exist? {Path('../notebooks').exists()}")
print(f"Does 01_process_data.ipynb exist? {Path('01_process_data.ipynb').exists()}")

Current working directory: /home/sparkadmin/source/repos/dwh_spark_processing
Path.cwd(): /home/sparkadmin/source/repos/dwh_spark_processing

Files in current dir:
[PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/notebook_outputs'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/pyproject.toml'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/.gitignore'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/uv.lock'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/.venv'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/.python-version'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/metastore_db'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/backup'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/modules'), PosixPath('/home/sparkadmin/source/repos/dwh_spark_processing/.vscode')]

Does notebooks/ exist from here? True
Does ../notebooks/ exist? False
Does 

In [26]:
## [1.5] Initialize Spark Session

# Check if Spark session exists (Fabric/Databricks has it by default)
# For local environments, create it
try:
    spark
    print("✓ Spark session already available")
except NameError:
    print("Creating Spark session for local environment...")
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder \
        .appName("DWH_Bronze_Silver_Processing") \
        .enableHiveSupport() \
        .getOrCreate()
    
    print("✓ Spark session created")

# Verify Spark session
print(f"  Spark version: {spark.version}")
print(f"  Application ID: {spark.sparkContext.applicationId}")

✓ Spark session already available
  Spark version: 3.5.7
  Application ID: app-20251126164838-0013


In [27]:
%run "./notebooks/01_utils_logging.ipynb"

Bronze log schema defined: logs.bronze_processing_log
Bronze summary schema defined: logs.bronze_run_summary
Silver log schema defined: logs.silver_processing_log
Silver summary schema defined: logs.silver_run_summary
✓ Schema 'logs' ready
✓ Exists: logs.bronze_processing_log
✓ Exists: logs.bronze_run_summary
✓ Exists: logs.silver_processing_log
✓ Exists: logs.silver_run_summary

✓ All log tables ready
✓ Helper functions defined
✓ Bronze logging functions defined
✓ Silver logging functions defined
✓ Query helper functions defined
LOGGING INFRASTRUCTURE VERIFICATION
✓ logs.bronze_processing_log                        0 rows
✓ logs.bronze_run_summary                           0 rows
✓ logs.silver_processing_log                        0 rows
✓ logs.silver_run_summary                           0 rows

✓ Logging infrastructure ready for Bronze and Silver processing


In [28]:
%run "./notebooks/02_utils_config.ipynb"

✓ Imports loaded
Checking for custom cluster Files directory /data/lakehouse... True
Detected matches for custom cluster Files directories: ['/data/lakehouse/gh_b_avd/lh_gh_bronze/Files']
✓ Base path: /data/lakehouse/gh_b_avd/lh_gh_bronze/Files
✓ Environment: Custom Cluster
✓ Config directory: /data/lakehouse/gh_b_avd/lh_gh_bronze/Files/config
✓ Watermarks path: /data/lakehouse/gh_b_avd/lh_gh_bronze/Files/config/watermarks.json
✓ Runplan path: /data/lakehouse/gh_b_avd/lh_gh_bronze/Files/config/runplan.json
✓ Data base path: /data/lakehouse/gh_b_avd/lh_gh_bronze/Files/greenhouse_sources
✓ DAG loading functions defined
✓ Table filtering functions defined
✓ Watermark functions defined (READ-ONLY)
⚠️  NOTE: Watermarks are managed by extraction pipeline, not by notebooks
✓ Path building functions defined
✓ DAG query helper functions defined
✓ Runplan functions defined




## [3] Load DAG Configuration

In [29]:
# Load and validate DAG
print(f"\n📋 Loading DAG configuration...")
dag = load_dag(dag_path)
print(f"✓ DAG loaded: {dag.get('source')}")

# Get metadata
dag_metadata = get_dag_metadata(dag)
base_files = dag_metadata['base_files']

print(f"  Base files: {base_files}")

# Get tables to process
tables_to_process = get_tables_to_process(
    dag=dag,
    retry_tables=retry_tables,
    only_enabled=True
)

print(f"\n📊 Tables to process: {len(tables_to_process)}")

# Show summary
dag_summary = summarize_dag(dag)
print(f"  Total enabled: {dag_summary['enabled_tables']}")
print(f"  Load modes: {dag_summary['load_mode_counts']}")

if not tables_to_process:
    print("\n⚠️  No tables to process. Exiting.")
    raise SystemExit(0)


📋 Loading DAG configuration...
✓ DAG loaded: anva_meeus
  Base files: greenhouse_sources

📊 Tables to process: 58
  Total enabled: 58
  Load modes: {'snapshot': 57, 'window': 1}


## [4] Generate Run ID

In [30]:
# Generate unique run ID
RUN_ID = f"{run_ts}_{uuid4().hex[:8]}"
print(f"\n🆔 Run ID: {RUN_ID}")


🆔 Run ID: 20251001T183103260_5c9bb965


## [5] Check for Incremental Tables (Watermark Merge)

If incremental tables are present, run watermark merge notebook.
This must happen BEFORE Bronze loading starts.
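
The check itself amounts to filtering the table definitions by load mode. The notebook calls `get_tables_by_load_mode` from the config utilities, whose implementation is not shown here. The sketch below is a hypothetical stand-in that assumes each table definition is a dict with a `load_mode` key. The table `Fact_Example` is invented for illustration.

```python
def tables_by_load_mode(tables, load_mode):
    """Return the table definitions whose 'load_mode' equals load_mode."""
    return [t for t in tables if t.get("load_mode") == load_mode]


tables = [
    {"name": "Dim_Agent", "load_mode": "snapshot"},
    {"name": "Fact_Example", "load_mode": "incremental"},  # hypothetical table
]

incremental = tables_by_load_mode(tables, "incremental")
# The watermark merge only needs to run when at least one table is incremental.
needs_watermark_merge = len(incremental) > 0
print([t["name"] for t in incremental], needs_watermark_merge)
```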

In [31]:
print(f"\n💧 Checking for incremental tables...")

# Filter incremental tables
incremental_tables = get_tables_by_load_mode(tables_to_process, "incremental")

if len(incremental_tables) > 0:
    print(f"  Found {len(incremental_tables)} incremental tables")
    print(f"  Tables: {[t['name'] for t in incremental_tables[:5]]}")
    
    # Get watermarks path from DAG
    wm_configpath = dag_metadata.get('watermarks_path', 'config/watermarks.json')
    
    # Build watermark folder path (where extraction pipeline writes watermarks)
    wm_folder = f"runtime/{source}/{run_ts}/"
    
    print(f"  Config: {wm_configpath}")
    print(f"  Runtime folder: {wm_folder}")
    
    # Note: In Fabric, this would use mssparkutils.notebook.run()
    # For local testing, we skip watermark merge (not critical for Bronze/Silver testing)
    print(f"\n  ⚠️  Watermark merge would run here (11_bronze_watermark_merge.ipynb)")
    print(f"     Skipping for now - watermarks managed by extraction pipeline")
else:
    print(f"  ◯ No incremental tables - skipping watermark merge")

print("="*80)


💧 Checking for incremental tables...
  ◯ No incremental tables - skipping watermark merge


## [6] Bronze Processing (Parallel)

Load all tables from parquet to Bronze Delta tables in parallel.
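
The pattern used for the parallel load can be sketched in isolation: each table is submitted to a `ThreadPoolExecutor`, and a wrapper converts any unhandled exception into a `FAILED` result so that one bad table does not stop the others. `fake_load` and the table `Dim_Broken` are hypothetical stand-ins for the real Bronze worker and its inputs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_load(table_def):
    # Stand-in for the real worker; fails for one table on purpose.
    if table_def["name"] == "Dim_Broken":
        raise ValueError("parquet folder missing")
    return {"table_name": table_def["name"], "status": "SUCCESS", "rows_written": 100}


def safe_load(table_def):
    """Always return a result dict, even when the worker raises."""
    try:
        return fake_load(table_def)
    except Exception as exc:
        return {
            "table_name": table_def["name"],
            "status": "FAILED",
            "rows_written": None,
            "error_message": str(exc)[:500],
        }


tables = [{"name": n} for n in ("Dim_Agent", "Dim_Branche", "Dim_Broken")]
results = []
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(safe_load, t) for t in tables]
    for future in as_completed(futures):  # yields in completion order
        results.append(future.result())

statuses = {r["table_name"]: r["status"] for r in results}
print(statuses)
```

Because results arrive in completion order, anything that depends on table order should key on `table_name` rather than on list position.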

In [32]:
# Load Bronze worker notebook
%run "./notebooks/10_bronze_load.ipynb"

print("✓ Bronze worker loaded")

✓ Imports loaded
✓ Base path: Files
✓ Environment: Local
✓ Helper functions defined
✓ Bronze worker function defined

BRONZE WORKER READY
Base path: Files
Environment: Local

Function available: process_bronze_table(table_def, source_name, run_ts, ...)

⚠️  Remember to set RUN_ID before calling process_bronze_table()
✓ Bronze worker notebook loaded successfully
✓ Bronze worker loaded


In [None]:
print(f"\n🔵 BRONZE: Loading parquet to Delta tables...")
print(f"  Workers: {max_workers}")
print(f"  Tables: {len(tables_to_process)}")

bronze_start = datetime.utcnow()
bronze_results = []

# Filter tables if not force_reload (check logs)
if not force_reload:
    print(f"\n  📋 Checking logs for already processed tables...")
    
    # Get successfully processed tables from log
    processed_tables = get_successful_tables(run_ts, layer="bronze")
    
    if processed_tables:
        print(f"    Found {len(processed_tables)} already processed tables")
        
        # Filter out already processed
        tables_to_process_bronze = [
            t for t in tables_to_process 
            if t['name'] not in processed_tables
        ]
        
        print(f"    Remaining: {len(tables_to_process_bronze)} tables")
    else:
        tables_to_process_bronze = tables_to_process
else:
    tables_to_process_bronze = tables_to_process
    print(f"  ⚠️  Force reload enabled - processing all tables")

if not tables_to_process_bronze:
    print(f"\n  ✓ All tables already processed for this run_ts")
else:
    print(f"\n  🚀 Processing {len(tables_to_process_bronze)} tables in parallel...\n")
    
    # Wrapper function for parallel execution
    def process_table_wrapper(table_def):
        """Wrapper to catch exceptions and always return a result."""
        try:
            return process_bronze_table(
                table_def=table_def,
                source_name=source,
                run_ts=run_ts,
                base_files=base_files,
                debug=False  # Disable per-table debug in parallel mode
            )
        except Exception as e:
            # If worker throws unhandled exception, create error result
            return {
                "log_id": f"{source}:{table_def['name']}:{run_ts}:error",
                "run_id": RUN_ID,
                "run_ts": run_ts,
                "source": source,
                "table_name": table_def['name'],
                "load_mode": table_def.get('load_mode'),
                "status": "FAILED",
                "rows_read": None,
                "rows_written": None,
                "start_time": datetime.utcnow(),
                "end_time": datetime.utcnow(),
                "duration_seconds": 0,
                "error_message": f"Unhandled exception: {str(e)[:500]}",
                "parquet_path": None,
                "delta_table": None,
            }
    
    # Set RUN_ID (note: this overwrites the RUN_ID from step [4] with the bare run_ts, dropping the random suffix)
    RUN_ID = f"{run_ts}"
    print(f"RUN_ID set to {RUN_ID}")
    
    # Parallel execution
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(process_table_wrapper, table): table 
            for table in tables_to_process_bronze
        }
        
        completed = 0
        for future in as_completed(futures):
            result = future.result()
            bronze_results.append(result)
            completed += 1
            
            # Progress indicator
            status_icon = "✓" if result['status'] == 'SUCCESS' else "✗" if result['status'] == 'FAILED' else "◯"
            # Include a short error snippet (max 120 chars, single line)
            error_snippet = (result.get("error_message") or "")[:120].replace("\n", " ")
            
            print(
                f"[{completed}/{len(tables_to_process_bronze)}] "
                f"{status_icon} {result['table_name']:<30} {result['status']:<10} "
                f"{(result.get('rows_written') or 0):>10,} rows {error_snippet}"
            )

bronze_end = datetime.utcnow()
bronze_duration = int((bronze_end - bronze_start).total_seconds())

print(f"\n✓ Bronze processing completed in {bronze_duration}s")
# NOTE: debug stop. This raises SystemExit (see the output below); remove it for a full run.
sys.exit(0)


🔵 BRONZE: Loading parquet to Delta tables...
  Workers: 10
  Tables: 58

  📋 Checking logs for already processed tables...

  🚀 Processing 58 tables in parallel...

RUN_ID set to 20251001T183103260
[Dim_Agent] Starting (snapshot)
  Parquet: Files/greenhouse_sources/anva_meeus/2025/10/01/20251001T183103260/Dim_Agent
  Target: anva_meeus.Dim_Agent
[Dim_Branche] Starting (snapshot)
  Parquet: Files/greenhouse_sources/anva_meeus/2025/10/01/20251001T183103260/Dim_Branche
  Target: anva_meeus.Dim_Branche
[Dim_Calamiteit] Starting (snapshot)
  Parquet: Files/greenhouse_sources/anva_meeus/2025/10/01/20251001T183103260/Dim_Calamiteit
  Target: anva_meeus.Dim_Calamiteit
[Dim_Collectiviteit] Starting (snapshot)
  Parquet: Files/greenhouse_sources/anva_meeus/2025/10/01/20251001T183103260/Dim_Collectiviteit
  Target: anva_meeus.Dim_Collectiviteit
[Dim_DekkingCode] Starting (snapshot)
  Parquet: Files/greenhouse_sources/anva_meeus/2025/10/01/20251001T183103260/Dim_DekkingCode
  Target: anv

SystemExit: 0

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


## [7] Bronze Logging and Summary

In [None]:
if bronze_results:
    print(f"\n📊 Logging Bronze results...")
    
    # Batch log
    log_bronze_batch(bronze_results)
    
    # Calculate summary statistics
    success_count = sum(1 for r in bronze_results if r['status'] == 'SUCCESS')
    failed_count = sum(1 for r in bronze_results if r['status'] == 'FAILED')
    empty_count = sum(1 for r in bronze_results if r['status'] == 'EMPTY')
    skipped_count = sum(1 for r in bronze_results if r['status'] == 'SKIPPED')
    
    total_rows = sum(r.get('rows_written', 0) or 0 for r in bronze_results)
    
    # Performance metrics
    sum_task_seconds = sum(r.get('duration_seconds', 0) or 0 for r in bronze_results)
    theoretical_min_sec = sum_task_seconds / max_workers if max_workers > 0 else sum_task_seconds
    actual_time_sec = bronze_duration
    efficiency_pct = (theoretical_min_sec / actual_time_sec * 100) if actual_time_sec > 0 else 0
    
    # Failed tables list
    failed_tables = [r['table_name'] for r in bronze_results if r['status'] == 'FAILED']
    
    # Log summary
    bronze_summary = {
        "run_id": RUN_ID,
        "source": source,
        "run_ts": run_ts,
        "run_start": bronze_start,
        "run_end": bronze_end,
        "duration_seconds": bronze_duration,
        "total_tables": len(bronze_results),
        "tables_success": success_count,
        "tables_empty": empty_count,
        "tables_failed": failed_count,
        "tables_skipped": skipped_count,
        "total_rows": total_rows,
        "workers": max_workers,
        "sum_task_seconds": sum_task_seconds,
        "theoretical_min_sec": theoretical_min_sec,
        "actual_time_sec": actual_time_sec,
        "efficiency_pct": efficiency_pct,
        "failed_tables": failed_tables,
    }
    
    log_bronze_summary(bronze_summary)
    
    # Print summary
    print(f"\n  Summary:")
    print(f"    Success: {success_count}")
    print(f"    Failed:  {failed_count}")
    print(f"    Empty:   {empty_count}")
    print(f"    Skipped: {skipped_count}")
    print(f"    Total rows: {total_rows:,}")
    print(f"    Efficiency: {efficiency_pct:.1f}%")
    
    if failed_tables:
        print(f"\n  ⚠️  Failed tables: {failed_tables}")
else:
    print(f"\n  ℹ️  No Bronze results to log")


📊 Logging Bronze results...


PySparkValueError: [CANNOT_BE_NONE] Argument `obj` can not be None.

## [8] Silver Processing (Parallel CDC Merge)

Process tables that have business_keys defined for CDC merge.
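
To make the INSERT/UPDATE/DELETE outcome concrete, the sketch below classifies rows by comparing the current Silver rows with an incoming Bronze batch, matched on the business keys. It is a plain-Python illustration, not the Delta merge in `20_silver_cdc_merge.ipynb`. The function `classify_cdc` and the sample rows are hypothetical. It also assumes the Bronze batch is a full snapshot, so a key missing from Bronze counts as a delete.

```python
def classify_cdc(silver_rows, bronze_rows, business_keys):
    """Count inserts, updates, deletes and unchanged rows by business key."""
    def key(row):
        return tuple(row[k] for k in business_keys)

    current = {key(r): r for r in silver_rows}
    incoming = {key(r): r for r in bronze_rows}

    return {
        "rows_inserted": sum(1 for k in incoming if k not in current),
        "rows_updated": sum(
            1 for k in incoming if k in current and incoming[k] != current[k]
        ),
        "rows_unchanged": sum(
            1 for k in incoming if k in current and incoming[k] == current[k]
        ),
        # Only valid when the Bronze batch is a full snapshot (assumption).
        "rows_deleted": sum(1 for k in current if k not in incoming),
    }


silver = [
    {"agent_id": 1, "name": "A"},
    {"agent_id": 2, "name": "B"},
    {"agent_id": 3, "name": "C"},
]
bronze = [
    {"agent_id": 1, "name": "A"},      # unchanged
    {"agent_id": 2, "name": "B2"},     # updated
    {"agent_id": 4, "name": "D"},      # inserted; agent_id 3 is gone
]
counts = classify_cdc(silver, bronze, business_keys=["agent_id"])
print(counts)
```

Tables without business keys cannot be matched this way, which is consistent with the notebook restricting Silver processing to tables that define them.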

In [None]:
# Load Silver worker notebook
%run "./notebooks/20_silver_cdc_merge.ipynb"

print("✓ Silver CDC merge worker loaded")

In [None]:
print(f"\n🔷 SILVER: CDC merge from Bronze...")

# Filter tables for Silver processing:
# 1. Must have business_keys defined
# 2. Must have been successfully loaded to Bronze

successful_bronze_tables = [r['table_name'] for r in bronze_results if r['status'] == 'SUCCESS']

tables_for_silver = [
    t for t in tables_to_process 
    if t.get('business_keys') and t['name'] in successful_bronze_tables
]

print(f"  Tables with business_keys: {len([t for t in tables_to_process if t.get('business_keys')])}")
print(f"  Successful Bronze loads: {len(successful_bronze_tables)}")
print(f"  Tables to process in Silver: {len(tables_for_silver)}")

silver_results = []

if not tables_for_silver:
    print(f"\n  ℹ️  No tables to process in Silver")
else:
    silver_start = datetime.utcnow()
    
    print(f"\n  🚀 Processing {len(tables_for_silver)} tables in parallel...\n")
    
    # Wrapper function for parallel execution
    def process_silver_wrapper(table_def):
        """Wrapper to catch exceptions and always return a result."""
        try:
            return process_silver_cdc_merge(
                table_def=table_def,
                source_name=source,
                run_ts=run_ts,
                debug=False
            )
        except Exception as e:
            return {
                "log_id": f"{source}:{table_def['name']}:{run_ts}:silver:error",
                "run_id": RUN_ID,
                "run_ts": run_ts,
                "source": source,
                "table_name": table_def['name'],
                "load_mode": table_def.get('load_mode'),
                "status": "FAILED",
                "rows_inserted": None,
                "rows_updated": None,
                "rows_deleted": None,
                "rows_unchanged": None,
                "total_silver_rows": None,
                "bronze_rows": None,
                "bronze_table": None,
                "silver_table": None,
                "start_time": datetime.utcnow(),
                "end_time": datetime.utcnow(),
                "duration_seconds": 0,
                "error_message": f"Unhandled exception: {str(e)[:500]}",
            }
    
    # Parallel execution
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(process_silver_wrapper, table): table 
            for table in tables_for_silver
        }
        
        completed = 0
        for future in as_completed(futures):
            result = future.result()
            silver_results.append(result)
            completed += 1
            
            status_icon = "✓" if result['status'] == 'SUCCESS' else "✗"
            deletes = result.get('rows_deleted', 0) or 0
            delete_info = f" ({deletes} deleted)" if deletes > 0 else ""
            print(f"    [{completed}/{len(tables_for_silver)}] {status_icon} {result['table_name']:<30} {result['status']:<10}{delete_info}")
            
    
    silver_end = datetime.utcnow()
    silver_duration = int((silver_end - silver_start).total_seconds())
    
    print(f"\n✓ Silver processing completed in {silver_duration}s")

## [9] Silver Logging and Summary

In [None]:
if silver_results:
    print(f"\n📊 Logging Silver results...")
    
    # Batch log
    log_silver_batch(silver_results)
    
    # Calculate summary
    success_count = sum(1 for r in silver_results if r['status'] == 'SUCCESS')
    failed_count = sum(1 for r in silver_results if r['status'] == 'FAILED')
    skipped_count = sum(1 for r in silver_results if r['status'] == 'SKIPPED')
    
    total_inserts = sum(r.get('rows_inserted', 0) or 0 for r in silver_results)
    total_updates = sum(r.get('rows_updated', 0) or 0 for r in silver_results)
    total_deletes = sum(r.get('rows_deleted', 0) or 0 for r in silver_results)
    total_unchanged = sum(r.get('rows_unchanged', 0) or 0 for r in silver_results)
    
    failed_tables = [r['table_name'] for r in silver_results if r['status'] == 'FAILED']
    
    # Log summary
    silver_summary = {
        "run_id": RUN_ID,
        "source": source,
        "run_ts": run_ts,
        "run_start": silver_start,
        "run_end": silver_end,
        "duration_seconds": silver_duration,
        "total_tables": len(silver_results),
        "tables_success": success_count,
        "tables_failed": failed_count,
        "tables_skipped": skipped_count,
        "total_inserts": total_inserts,
        "total_updates": total_updates,
        "total_deletes": total_deletes,
        "total_unchanged": total_unchanged,
        "failed_tables": failed_tables,
    }
    
    log_silver_summary(silver_summary)
    
    # Print summary
    print(f"\n  Summary:")
    print(f"    Success: {success_count}")
    print(f"    Failed:  {failed_count}")
    print(f"    Skipped: {skipped_count}")
    if total_inserts or total_updates or total_deletes:
        print(f"    CDC: +{total_inserts or 0} ~{total_updates or 0} -{total_deletes}")
    
    if failed_tables:
        print(f"\n  ⚠️  Failed tables: {failed_tables}")
else:
    print(f"\n  ℹ️  No Silver results to log")

## [10] Final Summary

In [None]:
total_end = datetime.utcnow()
total_duration = int((total_end - bronze_start).total_seconds())

print("\n" + "="*80)
print("ORCHESTRATOR SUMMARY")
print("="*80)
print(f"Run ID: {RUN_ID}")
print(f"Source: {source}")
print(f"Run TS: {run_ts}")
print(f"\nTiming:")
print(f"  Bronze: {bronze_duration}s")
if silver_results:
    print(f"  Silver: {silver_duration}s")
print(f"  Total:  {total_duration}s")

print(f"\nBronze Results:")
if bronze_results:
    bronze_success = sum(1 for r in bronze_results if r['status'] == 'SUCCESS')
    bronze_failed = sum(1 for r in bronze_results if r['status'] == 'FAILED')
    print(f"  ✓ Success: {bronze_success}/{len(bronze_results)}")
    if bronze_failed > 0:
        print(f"  ✗ Failed:  {bronze_failed}")
else:
    print(f"  (No processing)")

print(f"\nSilver Results:")
if silver_results:
    silver_success = sum(1 for r in silver_results if r['status'] == 'SUCCESS')
    silver_failed = sum(1 for r in silver_results if r['status'] == 'FAILED')
    print(f"  ✓ Success: {silver_success}/{len(silver_results)}")
    if silver_failed > 0:
        print(f"  ✗ Failed:  {silver_failed}")
else:
    print(f"  (No processing)")

# Overall status
if bronze_results:
    all_bronze_ok = all(r['status'] in ('SUCCESS', 'EMPTY', 'SKIPPED') for r in bronze_results)
else:
    all_bronze_ok = True

if silver_results:
    all_silver_ok = all(r['status'] in ('SUCCESS', 'SKIPPED') for r in silver_results)
else:
    all_silver_ok = True

if not bronze_results and not silver_results:
    # Checked first: both all-OK flags default to True when nothing ran
    overall_status = "NO_WORK"
elif all_bronze_ok and all_silver_ok:
    overall_status = "SUCCESS"
else:
    overall_status = "PARTIAL"

print(f"\nOverall Status: {overall_status}")
print("="*80)

if overall_status == "PARTIAL":
    print(f"\n⚠️  Some tables failed. Check logs for details.")
    print(f"   Use retry_tables parameter to retry specific tables.")
elif overall_status == "NO_WORK":
    print(f"\nℹ️  Nothing was processed in this run.")
else:
    print(f"\n✓ All processing completed successfully!")