# Data Governance Command Center
---
This notebook is the central hub for orchestrating the Data Governance Pipeline.
From here, you can profile the raw data, detect PII, validate quality, clean the dataset, and mask sensitive information.

## Part 1: Exploratory Data Quality Analysis
Run the profiling script to generate the initial data quality report.

In [None]:
!python eda_quality.py

with open("outputs/data_quality_report.txt", "r") as f:
    print(f.read())
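
The contents of `eda_quality.py` are not shown in this notebook. As a rough illustration of the kind of metrics a data quality profile usually reports (missing values, distinct values, duplicate rows), here is a minimal sketch using only the standard library. The sample data, the `profile_rows` function, and the shape of the returned dictionary are all hypothetical and may not match what the script actually produces.

```python
import csv
import io

# Hypothetical sample data; the real pipeline reads its own raw dataset.
SAMPLE_CSV = """id,name,email
1,Alice,alice@example.com
2,,bob@example.com
3,Carol,
3,Carol,
"""


def profile_rows(rows):
    """Return row count, duplicate-row count, and per-column missing/distinct counts."""
    rows = list(rows)
    columns = list(rows[0].keys()) if rows else []

    column_stats = {}
    for col in columns:
        values = [(row[col] or "").strip() for row in rows]
        non_empty = [v for v in values if v]
        column_stats[col] = {
            "missing": len(values) - len(non_empty),
            "distinct": len(set(non_empty)),
        }

    # A row counts as a duplicate if an identical row appeared earlier.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row[col] for col in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {
        "row_count": len(rows),
        "duplicate_rows": duplicates,
        "columns": column_stats,
    }


report = profile_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for column, stats in report["columns"].items():
    print(column, stats)
print("rows:", report["row_count"], "duplicates:", report["duplicate_rows"])
```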

## Part 2: PII Detection
Scan for sensitive data (names, emails, phone numbers, addresses) and assess exposure risk.

In [None]:
!python pii_detection.py

with open("outputs/pii_detection_report.txt", "r") as f:
    print(f.read())
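
For readers who want a feel for how pattern-based PII scanning can work, the sketch below counts email and phone matches per column with regular expressions. It is not the logic of `pii_detection.py`, which is not shown here. The patterns are deliberately simple, the phone pattern assumes a 3-3-4 digit layout, and `detect_pii` and the sample records are hypothetical. Names and addresses generally cannot be found with simple patterns and would need a different approach, which this sketch does not attempt.

```python
import re

# Simplified patterns for illustration only; production detection needs broader coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),  # assumes a 3-3-4 digit layout
}


def detect_pii(records):
    """Count, per column, how many values contain each PII pattern."""
    findings = {}
    for record in records:
        for column, value in record.items():
            text = str(value)
            for label, pattern in PATTERNS.items():
                if pattern.search(text):
                    column_hits = findings.setdefault(column, {})
                    column_hits[label] = column_hits.get(label, 0) + 1
    return findings


# Hypothetical records standing in for the raw dataset.
records = [
    {"name": "Alice", "contact": "alice@example.com", "notes": "call 555-123-4567"},
    {"name": "Bob", "contact": "555.987.6543", "notes": ""},
]

findings = detect_pii(records)
print(findings)
```

Scanning every column, rather than only the ones named `email` or `phone`, helps surface PII that has leaked into free-text fields such as `notes`.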

## Part 3: Data Validation
Run the strict validation engine against the raw data.

In [None]:
!python data_validator.py

with open("outputs/validation_results.txt", "r") as f:
    print(f.read())
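
The rules enforced by `data_validator.py` are not listed in this notebook. As one possible shape for a rule-based validator, the sketch below maps field names to check functions and records each failure as a `(row index, field, reason)` tuple. The field names (`id`, `email`, `signup_date`), the rules themselves, and the sample records are assumptions made for illustration.

```python
import re
from datetime import datetime


def is_iso_date(value):
    """Return True if value is a real calendar date in YYYY-MM-DD form."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Hypothetical rules; the real validator may check different fields and formats.
RULES = {
    "id": lambda v: v.isdigit(),
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "signup_date": is_iso_date,
}


def validate(records):
    """Return a list of (row_index, field, reason) tuples for every failed check."""
    errors = []
    for index, record in enumerate(records):
        for field, rule in RULES.items():
            value = record.get(field, "")
            if not value:
                errors.append((index, field, "missing"))
            elif not rule(value):
                errors.append((index, field, "invalid"))
    return errors


records = [
    {"id": "1", "email": "alice@example.com", "signup_date": "2024-01-15"},
    {"id": "x2", "email": "bob@example", "signup_date": "15/01/2024"},
    {"id": "3", "email": "", "signup_date": "2024-02-30"},
]

errors = validate(records)
for error in errors:
    print(error)
```

Collecting all failures instead of raising on the first one makes it easier to write a complete validation report in a single pass.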

## Part 4: Data Cleaning
Normalize formats, handle missing values, and re-validate the data.

In [None]:
!python data_cleaning.py

with open("outputs/cleaning_log.txt", "r") as f:
    print(f.read())
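
To make "normalize formats" and "handle missing values" concrete, here is a minimal sketch of a per-record cleaning step. It is an illustration under stated assumptions, not the behavior of `data_cleaning.py`: the field names, the `NNN-NNN-NNNN` phone format, and the `UNKNOWN` placeholder for missing values are all hypothetical choices.

```python
import re


def clean_record(record, missing_placeholder="UNKNOWN"):
    """Trim whitespace, normalize email and phone formats, and fill missing values."""
    cleaned = {}
    for field, value in record.items():
        text = (value or "").strip()
        if field == "email":
            text = text.lower()
        elif field == "phone":
            digits = re.sub(r"\D", "", text)
            # Reformat only when exactly 10 digits are present; otherwise leave as-is.
            if len(digits) == 10:
                text = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
        cleaned[field] = text or missing_placeholder
    return cleaned


# Hypothetical raw record.
raw = {
    "name": "  Alice ",
    "email": "Alice@Example.COM ",
    "phone": "(555) 123 4567",
    "city": None,
}

cleaned = clean_record(raw)
print(cleaned)
```

Returning a new dictionary rather than modifying the input keeps the raw record available for a before/after comparison in the cleaning log.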

## Part 5: PII Masking
Mask sensitive fields in the cleaned dataset so that PII is not exposed downstream.

In [None]:
!python pii_masking.py

with open("outputs/masked_sample.txt", "r") as f:
    print(f.read())
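
Masking can take several forms. The sketch below shows three common ones: partial masking of an email, keeping only the last four digits of a phone number, and replacing a value with a salted hash. These are illustrative techniques and are not necessarily what `pii_masking.py` does; the function names and the example salt are hypothetical.

```python
import hashlib


def mask_email(email):
    """Keep the first character of the local part and the domain; hide the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"
    return f"{local[0]}***@{domain}"


def mask_phone(phone):
    """Replace every digit except the last four with '*', preserving separators."""
    total_digits = sum(ch.isdigit() for ch in phone)
    seen = 0
    masked = []
    for ch in phone:
        if ch.isdigit():
            seen += 1
            masked.append(ch if seen > total_digits - 4 else "*")
        else:
            masked.append(ch)
    return "".join(masked)


def pseudonymize(value, salt):
    """Return a short, deterministic token derived from a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:8]


print(mask_email("alice@example.com"))
print(mask_phone("555-123-4567"))
print(pseudonymize("Alice Smith", salt="example-salt"))  # salt is a placeholder
```

Partial masking keeps values recognizable for debugging, while the salted hash lets records be joined on the same person without revealing the name. The salt should be kept secret, since hashes of low-entropy values can otherwise be guessed.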

## Part 6: Full Pipeline Orchestration
Run the full automated workflow end to end.

In [None]:
!python pipeline.py

with open("outputs/pipeline_execution_report.txt", "r") as f:
    print(f.read())
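
`pipeline.py` itself is not shown in this notebook. One plausible way to chain the individual scripts is to run them in order and stop at the first non-zero exit code, as sketched below. The step order is assumed from the section order above, and `run_script` and `run_steps` are hypothetical helpers rather than part of the actual pipeline.

```python
import subprocess
import sys

# Assumed order, taken from the sections of this notebook.
STEPS = [
    "eda_quality.py",
    "pii_detection.py",
    "data_validator.py",
    "data_cleaning.py",
    "pii_masking.py",
]


def run_script(script):
    """Run one script with the current Python interpreter and return its exit code."""
    return subprocess.run([sys.executable, script], check=False).returncode


def run_steps(scripts, run=run_script):
    """Run scripts in order, stopping after the first one that fails.

    Returns a list of (script, exit_code) pairs for the steps that were attempted.
    """
    results = []
    for script in scripts:
        code = run(script)
        results.append((script, code))
        if code != 0:
            break
    return results


# Example usage (commented out so that defining the helpers has no side effects):
# for script, code in run_steps(STEPS):
#     print(f"{script}: {'ok' if code == 0 else f'failed with exit code {code}'}")
```

Stopping on the first failure is a design choice: it avoids masking or reporting on data that never passed cleaning. Passing the runner as a parameter also makes the control flow testable without launching any processes.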