### BATCH-EXPORTING MULTIPLE TABLES AS A TABLEAU- AND POWER BI-READY CSV DATASET

In [6]:
# 1. ======================== SETUP AND DATA PATH =============================
import os
import pandas as pd
from sqlalchemy import create_engine

# Database path
db_path = "/Users/christopheredoziesunday/Documents/Chrom-Data-Analysis/qc_structured.db"
engine = create_engine(f"sqlite:///{db_path}")

# Output directory
os.makedirs('derived_metrics_outputs', exist_ok=True)

# 2. ========================= LOAD DATA FROM TABLES IN SQLITE ========================
df = pd.read_sql(
    """
    SELECT s.sample_id, s.instrument_id, s.run_date, s.month, s.retention_time_min, s.peak_area,
           s.peak_width_min, s.concentration_mgL, s.true_value_mgL,
           
           sm.error_mgL, sm.error_pct, sm.percent_recovery, sm.rsd_pct, sm.response_factor, 
           sm.peakarea_zscore, sm.retention_deviation, sm.is_outlier, sm.replicate_number, 
           sm.residual, sm.ewma, sm.cusum, sm.roll_mean, sm.roll_std, sm.roll_cv,
           
           c.slope, c.intercept, c.r2,
           
           cs.total_runs, cs.out_of_control, cs.mean_peak_area, cs.std_peak_area, cs.ucl, 
           cs.lcl, cs.out_of_control_runs,
           
           ss.resolution, ss.tailing, ss.plates
    
    FROM samples s

    LEFT JOIN sample_metrics sm
      ON s.sample_id     = sm.sample_id
     AND s.instrument_id = sm.instrument_id
     AND DATE(s.run_date) = DATE(sm.run_date)

    LEFT JOIN calibrations c
      ON s.instrument_id = c.instrument_id

    LEFT JOIN control_summary cs
      ON s.instrument_id = cs.instrument_id

    LEFT JOIN system_suitability ss
      ON s.sample_id     = ss.sample_id
     AND s.instrument_id = ss.instrument_id
     AND DATE(s.run_date) = DATE(ss.run_date);
    """, con=engine
)

# 3. ============================= DATA CLEANING =======================================
# Parse run_date as datetime so it is exported in a consistent date format;
# values that cannot be parsed become NaT (errors='coerce')
df['run_date'] = pd.to_datetime(df['run_date'], errors='coerce')

# 4. ====== EXPORTING THE JOINED MASTER DATASET TO CSV (TABLEAU/POWER BI-READY) ======
df.to_csv("derived_metrics_outputs/master_dataset.csv", index=False)

print("\n✔ All tables loaded into one tableau master DataFrame and exported successfully!")


✔ All tables loaded into one tableau master DataFrame and exported successfully!
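
The query joins `calibrations` and `control_summary` on `instrument_id` alone, so an instrument with more than one row in either table would duplicate its sample rows in the export. The sketch below shows one way to check for that. It assumes that `sample_id`, `instrument_id`, and `run_date` together identify a single sample row; the helper name `count_duplicate_keys` and the toy data are hypothetical and not part of the notebook.

```python
import pandas as pd

def count_duplicate_keys(df, keys):
    """Return how many rows repeat an earlier row's key combination."""
    return int(df.duplicated(subset=keys).sum())

# Toy data (hypothetical): instrument A has two calibration rows,
# so its single sample appears twice after the join.
samples = pd.DataFrame({
    "sample_id": [1, 2],
    "instrument_id": ["A", "B"],
    "run_date": ["2024-01-01", "2024-01-01"],
})
calibrations = pd.DataFrame({
    "instrument_id": ["A", "A", "B"],
    "slope": [1.00, 1.02, 0.98],
})

# Same join shape as the SQL: LEFT JOIN on instrument_id only.
merged = samples.merge(calibrations, on="instrument_id", how="left")
extra_rows = count_duplicate_keys(merged, ["sample_id", "instrument_id", "run_date"])
print(f"{len(merged)} rows after join, {extra_rows} duplicated sample rows")
```

Running the same check on the exported `df` (with the same three key columns) would show whether the master dataset contains duplicated samples, which would inflate counts and averages in Tableau or Power BI.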


### DASHBOARD TITLE:
CALIBRATION & INSTRUMENT PERFORMANCE ANALYTICS (QC & DRIFT DETECTION)

#### Description:
End-to-end analytical quality control dashboards designed to monitor calibration accuracy, instrument stability, precision, and method performance.

Integrates parity analysis, accuracy heatmaps, response factor trends, R² evaluation, EWMA, CUSUM, and rolling statistics to enable early detection of drift, bias, and non-linearity in analytical instruments.

Built for regulated laboratory environments to support root-cause diagnostics, data-driven decision-making, and proactive instrument health monitoring using Tableau, SQL, and Python-based QC methods.

### DASHBOARD DEVELOPMENT IN TABLEAU
From the `master_dataset.csv` file exported above to `derived_metrics_outputs/`, the following three dashboards are built:

#### 1. Calibration Performance Overview (For Detecting Drift, Bias, and Non-linearity)
This includes:
- Parity Plot per Instrument (True vs Measured)
- Calibration Accuracy Heatmap
- Response Factor Stability Timeline
- R² Linearity Bar Plot Per Instrument
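
The parity plot and the R² bar plot both rest on a per-instrument straight-line fit of measured against true concentration. A minimal sketch of that fit follows. It assumes `concentration_mgL` is the measured value and `true_value_mgL` the reference value; the function name `parity_fit` and the sample numbers are hypothetical, and the notebook's own `slope`, `intercept`, and `r2` columns may have been computed differently.

```python
import numpy as np
import pandas as pd

def parity_fit(group):
    """Least-squares line of measured vs. true concentration, with R²."""
    x = group["true_value_mgL"].to_numpy(dtype=float)
    y = group["concentration_mgL"].to_numpy(dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))      # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2)) # total variation
    return pd.Series({"slope": slope, "intercept": intercept, "r2": 1.0 - ss_res / ss_tot})

# Hypothetical data: instrument A is perfectly on the parity line, B is not.
data = pd.DataFrame({
    "instrument_id": ["A"] * 4 + ["B"] * 4,
    "true_value_mgL": [1.0, 2.0, 3.0, 4.0] * 2,
    "concentration_mgL": [1.0, 2.0, 3.0, 4.0, 1.2, 2.1, 3.4, 3.9],
})

rows = {inst: parity_fit(g) for inst, g in data.groupby("instrument_id")}
fit_table = pd.DataFrame(rows).T  # one row per instrument
print(fit_table.round(4))
```

A slope far from 1 or an intercept far from 0 points to bias, while a low R² points to scatter or non-linearity.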

#### 2. Instrument Health Monitoring (For Early Detection of Drifts and Faults)
This includes:
- Peak Area Control Chart
- EWMA Control Chart (Exponentially Weighted Moving Average)
- CUSUM Chart (Cumulative Sum Control Chart)
- Rolling Statistics Chart (Mean, STD, CV)
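
The statistics behind these charts can be sketched with pandas as shown below. This is an illustration under stated assumptions, not the notebook's actual computation: the smoothing factor `alpha=0.2`, the window of 3 runs, and the use of the series mean as the CUSUM target are all assumed, and the CUSUM here is the plain cumulative sum of deviations rather than a tabular CUSUM with a slack value. The function name `drift_statistics` is hypothetical.

```python
import pandas as pd

def drift_statistics(values, alpha=0.2, window=3):
    """EWMA, cumulative sum of deviations from the mean, and rolling mean/std/CV."""
    s = pd.Series(values, dtype=float)
    out = pd.DataFrame({"value": s})
    # EWMA recursion: y_t = (1 - alpha) * y_{t-1} + alpha * x_t, starting at x_0
    out["ewma"] = s.ewm(alpha=alpha, adjust=False).mean()
    # Simple CUSUM: running total of deviations from the overall mean (assumed target)
    out["cusum"] = (s - s.mean()).cumsum()
    rolling = s.rolling(window=window)
    out["roll_mean"] = rolling.mean()
    out["roll_std"] = rolling.std()  # sample standard deviation (ddof=1)
    out["roll_cv"] = 100.0 * out["roll_std"] / out["roll_mean"]  # percent
    return out

# Hypothetical peak areas with an upward shift in the last two runs.
peak_area = [100.0, 102.0, 98.0, 101.0, 110.0, 112.0]
stats = drift_statistics(peak_area)
print(stats.round(3))
```

A sustained slope in the CUSUM column or an EWMA that moves away from the long-run mean signals drift earlier than a single point crossing a control limit.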

#### 3. Method Performance (To Monitor Precision, Trueness, and Outliers)
This includes:
- Replicate Precision Distribution
- Trueness Error Distribution
- Outlier Map (|Z-score| > 3)
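
The quantities behind these three views can be illustrated as follows. The sketch assumes that precision is expressed as percent RSD over replicates and that outliers are flagged when the absolute z-score exceeds 3, using the sample standard deviation. The function names `rsd_pct` and `flag_outliers` and the data are hypothetical; the notebook's `rsd_pct`, `peakarea_zscore`, and `is_outlier` columns may follow different definitions.

```python
import pandas as pd

def rsd_pct(values):
    """Relative standard deviation in percent: 100 * std / mean."""
    s = pd.Series(values, dtype=float)
    return 100.0 * s.std(ddof=1) / s.mean()

def flag_outliers(values, threshold=3.0):
    """Z-score each value against the series mean and flag |z| > threshold."""
    s = pd.Series(values, dtype=float)
    z = (s - s.mean()) / s.std(ddof=1)
    return pd.DataFrame({"value": s, "zscore": z, "is_outlier": z.abs() > threshold})

# Hypothetical replicate concentrations (mg/L) for one sample.
replicates = [9.8, 10.0, 10.2]
print(f"RSD: {rsd_pct(replicates):.2f}%")

# Hypothetical peak areas: nineteen stable runs and one clear excursion.
peak_areas = [100.0] * 19 + [150.0]
flags = flag_outliers(peak_areas)
print(flags[flags["is_outlier"]])
```

With few points a z-score cannot exceed (n - 1) / sqrt(n), so a threshold of 3 can only trigger when at least 11 values are scored together.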
