# Environment setup

In [None]:
import os
import requests
from pathlib import Path


# Branch of the PETsARD repository to fetch demo files from (default: main)
branch = "main"

# Check if running in Google Colab
is_colab = "COLAB_GPU" in os.environ

if is_colab:
    # Download the utils.py file from GitHub
    utils_url = (
        f"https://raw.githubusercontent.com/nics-tw/petsard/{branch}/demo/utils.py"
    )
    # A timeout keeps the cell from hanging if GitHub is unreachable
    response = requests.get(utils_url, timeout=30)

    if response.status_code == 200:
        # Save the utils.py file
        with open("utils.py", "w", encoding="utf-8") as f:
            f.write(response.text)

        # Create an empty __init__.py
        Path("__init__.py").touch()
    else:
        raise RuntimeError(
            f"Failed to download utils.py. Status code: {response.status_code}"
        )

In [None]:
# Now import and run the setup
from utils import (
    get_yaml_path,
    setup_environment,
)


setup_environment(
    is_colab,
    branch,
    benchmark_data=[
        "adult-income",
    ],
)

In [3]:
from petsard import Executor

# YAML Configuration for PETsARD

## Default Synthesis and Default Evaluation

In [None]:
yaml_file_case: str = "logging-configuration.yaml"

yaml_path_case: str = get_yaml_path(
    is_colab=is_colab,
    yaml_file=yaml_file_case,
    branch=branch,
)

Configuration content:
---
Executor:
  log_output_type: both
  log_level: DEBUG
  log_dir: demo_logs
  log_filename: PETsARD_demo_{timestamp}.log
Loader:
  log-data:
    filepath: 'benchmark/adult-income.csv'
Splitter:
  demo:
    num_samples: 1
    train_split_ratio: 0.8
Preprocessor:
  demo:
    method: 'default'
Synthesizer:
  demo:
    method: 'default'
Postprocessor:
  demo:
    method: 'default'
Evaluator:
  demo-diagnostic:
    method: 'sdmetrics-diagnosticreport'
  demo-quality:
    method: 'sdmetrics-qualityreport'
Reporter:
  output:
    method: 'save_data'
    source: 'Synthesizer'
  save_report_global:
    method: 'save_report'
    granularity: 'global'
...
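
The `Executor` block in the configuration above sets the log destination, level, directory, and a filename containing a `{timestamp}` placeholder. As a rough illustration of what `log_output_type: both` implies, the sketch below wires a standard-library logger to a file and to the console at the same time. This is not PETsARD's implementation: the helper name `build_demo_logger`, the timestamp format, and the formatter string are assumptions chosen to resemble the log lines shown in this notebook.

```python
import logging
import sys
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Tuple


def build_demo_logger(
    log_dir: str,
    log_filename: str = "PETsARD_demo_{timestamp}.log",
    log_level: str = "DEBUG",
) -> Tuple[logging.Logger, Path]:
    """Hypothetical helper: send every record to a file and to stdout."""
    # Assumed timestamp format; the one PETsARD uses may differ.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    log_path = Path(log_dir) / log_filename.format(timestamp=stamp)
    log_path.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("demo.logging_sketch")
    logger.setLevel(getattr(logging, log_level.upper()))
    logger.handlers.clear()  # avoid duplicate handlers when the cell is re-run
    logger.propagate = False

    # Assumed layout: timestamp - logger name - function - level - message
    formatter = logging.Formatter(
        "%(asctime)s - %(name)-21s - %(funcName)-17s - %(levelname)-8s - %(message)s"
    )
    for handler in (
        logging.FileHandler(log_path, encoding="utf-8"),  # the "file" half
        logging.StreamHandler(sys.stdout),  # the "console" half
    ):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger, log_path


# A temporary directory stands in for `log_dir: demo_logs`.
demo_dir = tempfile.mkdtemp()
demo_logger, demo_log_path = build_demo_logger(log_dir=demo_dir)
demo_logger.debug("written to both the console and the log file")
for handler in demo_logger.handlers:
    handler.flush()
```

Lowering `log_level` to `INFO` in this sketch would drop the `DEBUG` record, which mirrors how a less verbose level would shorten the output in the next section.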


### Execution and Result

In [5]:
exec_case = Executor(config=yaml_path_case)
exec_case.run()

2025-06-09 16:01:32,334 - PETsARD.Executor      - _get_config       - INFO     - Logger reconfigured with settings from YAML
2025-06-09 16:01:32,335 - PETsARD.Loader        - __init__          - INFO     - Initializing Loader
2025-06-09 16:01:32,335 - PETsARD.Loader        - __init__          - DEBUG    - Loader parameters - filepath: benchmark/adult-income.csv, method: None, column_types: None
2025-06-09 16:01:32,336 - PETsARD.LoaderConfig  - __post_init__     - DEBUG    - Initializing LoaderConfig
2025-06-09 16:01:32,337 - PETsARD.LoaderConfig  - __post_init__     - DEBUG    - File path information - dir: benchmark, name: adult-income, ext: .csv, ext code: 1
2025-06-09 16:01:32,337 - PETsARD.Loader        - __init__          - DEBUG    - LoaderConfig successfully initialized
2025-06-09 16:01:32,338 - PETsARD.Synthesizer   - __init__          - INFO     - Initializing Synthesizer with method: default, sample_num_rows: None
2025-06-09 16:01:32,339 - PETsARD.SynthesizerConfig - __post_i

Generating report ...

(1/2) Evaluating Data Validity: |██████████| 15/15 [00:00<00:00, 664.95it/s]|
Data Validity Score: 100.0%

(2/2) Evaluating Data Structure: |██████████| 1/1 [00:00<00:00, 645.18it/s]|
Data Structure Score: 100.0%



2025-06-09 16:01:36,709 - PETsARD.SDMetricsSingleTable - _eval             - INFO     - Successfully evaluating from data
2025-06-09 16:01:36,709 - PETsARD.SDMetricsSingleTable - _extract_scores   - DEBUG    - Extract scores level from SDMetrics
2025-06-09 16:01:36,710 - PETsARD.SDMetricsSingleTable - _extract_scores   - DEBUG    - Extracting properties level from SDMetrics
2025-06-09 16:01:36,710 - PETsARD.SDMetricsSingleTable - _extract_scores   - DEBUG    - Extracting details level from SDMetrics
2025-06-09 16:01:36,711 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Extracted scores: ['score', 'properties', 'details']
2025-06-09 16:01:36,711 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Extracting global level as PETsARD format
2025-06-09 16:01:36,712 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Extracting columnwise level as PETsARD format
2025-06-09 16:01:36,712 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Ex

Overall Score (Average): 100.0%

Now is petsard_Loader[log-data]_Splitter[demo_[1-1]]_Preprocessor[demo]_Synthesizer[demo] save to csv...


2025-06-09 16:01:37,023 - PETsARD.ReporterOp    - _run              - DEBUG    - Data reporting completed
2025-06-09 16:01:37,024 - PETsARD.ReporterOp    - run               - INFO     - Completed ReporterOp execution (elapsed: 0:00:00)
2025-06-09 16:01:37,025 - PETsARD.Executor      - _set_result       - DEBUG    - Collecting final results for Reporter
2025-06-09 16:01:37,025 - PETsARD.Executor      - run               - INFO     - Executing Reporter with save_report_global
2025-06-09 16:01:37,034 - PETsARD.ReporterOp    - run               - INFO     - Starting ReporterOp execution
2025-06-09 16:01:37,035 - PETsARD.ReporterOp    - _run              - DEBUG    - Starting data reporting process
2025-06-09 16:01:37,036 - PETsARD.ReporterOp    - _run              - DEBUG    - Reporting configuration initialization completed
2025-06-09 16:01:37,038 - PETsARD.ReporterOp    - _run              - DEBUG    - Data reporting completed
2025-06-09 16:01:37,038 - PETsARD.ReporterOp    - run       

Now is petsard[Report]_[global] save to csv...
Generating report ...

(1/2) Evaluating Column Shapes: |██████████| 15/15 [00:00<00:00, 95.47it/s]|
Column Shapes Score: 95.13%

(2/2) Evaluating Column Pair Trends: |██████████| 105/105 [00:00<00:00, 226.10it/s]|
Column Pair Trends Score: 61.47%





Overall Score (Average): 78.3%



2025-06-09 16:01:37,784 - PETsARD.SDMetricsSingleTable - _eval             - INFO     - Successfully evaluating from data
2025-06-09 16:01:37,785 - PETsARD.SDMetricsSingleTable - _extract_scores   - DEBUG    - Extract scores level from SDMetrics
2025-06-09 16:01:37,785 - PETsARD.SDMetricsSingleTable - _extract_scores   - DEBUG    - Extracting properties level from SDMetrics
2025-06-09 16:01:37,786 - PETsARD.SDMetricsSingleTable - _extract_scores   - DEBUG    - Extracting details level from SDMetrics
2025-06-09 16:01:37,786 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Extracted scores: ['score', 'properties', 'details']
2025-06-09 16:01:37,786 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Extracting global level as PETsARD format
2025-06-09 16:01:37,787 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Extracting columnwise level as PETsARD format
2025-06-09 16:01:37,788 - PETsARD.SDMetricsSingleTable - _eval             - DEBUG    - Ex

Now is petsard_Loader[log-data]_Splitter[demo_[1-1]]_Preprocessor[demo]_Synthesizer[demo] save to csv...


2025-06-09 16:01:38,111 - PETsARD.ReporterOp    - _run              - DEBUG    - Data reporting completed
2025-06-09 16:01:38,111 - PETsARD.ReporterOp    - run               - INFO     - Completed ReporterOp execution (elapsed: 0:00:00)
2025-06-09 16:01:38,112 - PETsARD.Executor      - _set_result       - DEBUG    - Collecting final results for Reporter
2025-06-09 16:01:38,113 - PETsARD.Executor      - run               - INFO     - Executing Reporter with save_report_global
2025-06-09 16:01:38,121 - PETsARD.ReporterOp    - run               - INFO     - Starting ReporterOp execution
2025-06-09 16:01:38,121 - PETsARD.ReporterOp    - _run              - DEBUG    - Starting data reporting process
2025-06-09 16:01:38,125 - PETsARD.ReporterOp    - _run              - DEBUG    - Reporting configuration initialization completed
2025-06-09 16:01:38,127 - PETsARD.ReporterOp    - _run              - DEBUG    - Data reporting completed
2025-06-09 16:01:38,127 - PETsARD.ReporterOp    - run       

Now is petsard[Report]_[global] save to csv...
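
Each log record above follows the same layout: timestamp, logger name, function name, level, and message, separated by ` - `. The sketch below parses that layout with a regular expression so that records can be filtered or counted by level. The pattern is inferred from the output shown here, not taken from PETsARD's source, and `parse_log_line` is a hypothetical helper. Lines that do not match, such as progress output or records cut off mid-line, are skipped.

```python
import re
from collections import Counter
from datetime import datetime
from typing import Optional

# Pattern inferred from the log lines in this notebook (an assumption).
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r" - (?P<logger>\S+)\s+"
    r"- (?P<function>\S+)\s+"
    r"- (?P<level>[A-Z]+)\s+"
    r"- (?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return the fields of one log record, or None if the line does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    record["timestamp"] = datetime.strptime(
        record["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
    )
    return record


sample_lines = [
    "2025-06-09 16:01:32,335 - PETsARD.Loader        - __init__          - INFO     - Initializing Loader",
    "2025-06-09 16:01:32,336 - PETsARD.LoaderConfig  - __post_init__     - DEBUG    - Initializing LoaderConfig",
    "2025-06-09 16:01:37,024 - PETsARD.ReporterOp    - run               - INFO     - Completed ReporterOp execution (elapsed: 0:00:00)",
    "2025-06-09 16:01:32,339 - PETsARD.SynthesizerConfig - __post_i",  # truncated record
    "Generating report ...",  # plain stdout, not a log record
]

records = [r for r in map(parse_log_line, sample_lines) if r is not None]
level_counts = Counter(r["level"] for r in records)
print(level_counts)
```

The same function can be applied line by line to the file written under `log_dir` to keep only `INFO` records, for example.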


In [None]:
exec_case.get_result()[
    "Loader[log-data]_Splitter[demo_[1-1]]_Preprocessor[demo]_Synthesizer[demo]_Postprocessor[demo]_Evaluator[demo-quality]_Reporter[save_report_global]"
]["[global]"]

| Unnamed: 0 | full_expt_name | Loader | Splitter | Preprocessor | Synthesizer | Postprocessor | Evaluator | Score | Data Validity | Data Structure | Column Shapes | Column Pair Trends |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Loader[log-data]_Splitter[demo_[1-1]]_Preproce... | log-data | demo_[1-1] | demo | demo | demo | [global] | 0.78296 | 1.0 | 1.0 | 0.951258 | 0.614662 |
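
The run output labels the overall score as an average, and the numbers in the global report bear that out. The short check below recomputes the quality score from `Column Shapes` and `Column Pair Trends`, and the diagnostic score from `Data Validity` and `Data Structure`. The values are copied by hand from the result above; the variable names are only for illustration.

```python
from statistics import mean

# Property scores copied from the global report row above.
quality_properties = {"Column Shapes": 0.951258, "Column Pair Trends": 0.614662}
diagnostic_properties = {"Data Validity": 1.0, "Data Structure": 1.0}
reported_quality_score = 0.78296

quality_score = mean(quality_properties.values())
diagnostic_score = mean(diagnostic_properties.values())

print(f"Quality score:    {quality_score:.5f}")
print(f"Diagnostic score: {diagnostic_score:.5f}")

# The recomputed average agrees with the reported Score column.
assert abs(quality_score - reported_quality_score) < 1e-5
```

Because the overall score is a plain average, the lower `Column Pair Trends` value accounts for most of the gap between the 78.3% quality score and the 100% diagnostic score.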
