# Environment setup / 環境設定

In [1]:
import os  # noqa: I001
import sys
from pathlib import Path


# Load demo_utils.py: download it on Colab, otherwise search locally
if "COLAB_GPU" in os.environ:
    import urllib.request

    demo_utils_url = (
        "https://raw.githubusercontent.com/nics-tw/petsard/main/demo/demo_utils.py"
    )
    # Fetch the helper script and run it in the notebook namespace
    with urllib.request.urlopen(demo_utils_url) as response:
        exec(response.read().decode("utf-8"))
else:
    # Search the current directory and up to 10 parent directories
    for p in [Path.cwd()] + list(Path.cwd().parents)[:10]:
        utils_path = p / "demo_utils.py"
        if utils_path.exists() and "demo" in str(utils_path):
            sys.path.insert(0, str(p))  # lets `from demo_utils import ...` work later
            exec(utils_path.read_text(encoding="utf-8"))
            break

📂 Current working directory: demo/petsard-yaml/constrainer-yaml
✅ PETsARD demo_utils loaded. Use quick_setup() to initialize.
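
The local branch of the cell above walks up the directory tree and runs the helper file with `exec`. As a hedged alternative, the same search can be paired with `importlib` so the file is loaded as a regular module. This is only a sketch: `find_upwards` and `load_module_from_path` are hypothetical helper names, not part of PETsARD or `demo_utils.py`.

```python
import importlib.util
import sys
from pathlib import Path


def find_upwards(filename, start=None, max_levels=10):
    """Return the first `filename` found in `start` or its nearest parents, else None."""
    start = Path(start) if start is not None else Path.cwd()
    for directory in [start, *list(start.parents)[:max_levels]]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def load_module_from_path(name, path):
    """Import the file at `path` as module `name` instead of exec-ing its text."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register so a later `import name` reuses it
    spec.loader.exec_module(module)
    return module
```

A call such as `load_module_from_path("demo_utils", find_upwards("demo_utils.py"))` would then keep the helper's names inside their own module namespace rather than in the notebook's globals, assuming the file is found.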


## Quick setup / 快速設定: Constrainer YAML

In [2]:
from demo_utils import display_results, display_yaml_info, quick_setup  # noqa: I001
from petsard import Executor  # noqa: I001


is_colab, branch, yaml_path = quick_setup(
    config_file=[
        "resample-mode_inline_constraints_configuration.yaml",
        "resample-mode_external-constraints-file.yaml",
        "validate-mode_single-data-source.yaml",
        "validate-mode_multiple-data-sources.yaml",
    ],
    benchmark_data=None,
    petsard_branch="main",
)

✅ Changed working directory to demo: petsard/demo
   📁 Notebook location: demo/petsard-yaml/constrainer-yaml/
   🔍 YAML search priority: 
      1. demo/petsard-yaml/constrainer-yaml/
      2. demo/
   💾 Output files will be saved in: demo/
🚀 PETsARD v1.7.0
📅 2025-10-16 19:40:20 UTC+8
🔧 Added to Python path: petsard/demo/petsard-yaml/constrainer-yaml
📁 Processing configuration files from subfolder: petsard-yaml/constrainer-yaml
✅ Found configuration (1/4): petsard/demo/petsard-yaml/constrainer-yaml/resample-mode_inline_constraints_configuration.yaml
✅ Found configuration (2/4): petsard/demo/petsard-yaml/constrainer-yaml/resample-mode_external-constraints-file.yaml
✅ Found configuration (3/4): petsard/demo/petsard-yaml/constrainer-yaml/validate-mode_single-data-source.yaml
✅ Found configuration (4/4): petsard/demo/petsard-yaml/constrainer-yaml/validate-mode_multiple-data-sources.yaml


# Execution and Results / 執行與結果

## Resample Mode: Inline Constraints Configuration / 反覆抽樣模式：內嵌約束配置

In [3]:
display_yaml_info(yaml_path[0])
executor = Executor(yaml_path[0])  # avoid shadowing the built-in exec() used in the setup cell
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: resample-mode_inline_constraints_configuration.yaml
📁 Path: petsard/demo/petsard-yaml/constrainer-yaml/resample-mode_inline_constraints_configuration.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Loader:
  load_benchmark_with_schema:
    filepath: benchmark://adult-income
    schema: benchmark://adult-income_schema
Synthesizer:
  default:
    method: default
Constrainer:
  inline_field_constraints:
    # Operating mode setting
    method: auto  # Operating mode, default 'auto' (auto-detect: has Synthesizer and not custom_data → resample)
    # Constraint conditions (mutually exclusive with constraints_yaml)
    field_constraints:      # Field constraint conditions, default none
                            # Age between 18 and 65
      - "age >= 18 & age <= 65"
    # Sampling parameters (resample mode only)
    target_rows: None        # Target number of output rows; default None uses the input data row count
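
To make the meaning of the field constraint `"age >= 18 & age <= 65"` concrete, the following minimal sketch applies the same condition to a few hand-written records. It illustrates only the semantics of the expression; it is not how PETsARD parses or evaluates constraints, and the sample rows are invented for illustration.

```python
def satisfies_age_constraint(row, low=18, high=65):
    """Mirror the expression "age >= 18 & age <= 65" for one record."""
    return low <= row["age"] <= high


# Illustrative records only, not taken from the adult-income benchmark
rows = [
    {"age": 17, "income": "<=50K"},
    {"age": 18, "income": "<=50K"},
    {"age": 40, "income": ">50K"},
    {"age": 65, "income": ">50K"},
    {"age": 70, "income": "<=50K"},
]

# Both bounds are inclusive, so 18 and 65 are kept
kept = [row for row in rows if satisfies_age_constraint(row)]
print([row["age"] for row in kept])  # [18, 40, 65]
```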


## Resample Mode: External Constraints File / 反覆抽樣模式：外部約束檔案

In [4]:
display_yaml_info(yaml_path[1])
executor = Executor(yaml_path[1])  # avoid shadowing the built-in exec() used in the setup cell
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: resample-mode_external-constraints-file.yaml
📁 Path: petsard/demo/petsard-yaml/constrainer-yaml/resample-mode_external-constraints-file.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Loader:
  load_benchmark_with_schema:
    filepath: benchmark://adult-income
    schema: benchmark://adult-income_schema
Synthesizer:
  default:
    method: default
Constrainer:
  external_constraints:
    # Operating mode setting
    method: auto  # Operating mode, default 'auto' (auto-detect: has Synthesizer and not custom_data → resample)
    # Constraint conditions (mutually exclusive with inline constraint settings)
    constraints_yaml: adult-income_constraints.yaml
    # Sampling parameters (resample mode only)
    target_rows: None        # Target number of output rows; default None uses the input data row count
    sampling_ratio: 10.0     # Sampling multiplier per attempt, default 10.0
    max_trials: 300          # Maximum number of attempts
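
The three sampling parameters above suggest a generate-and-filter loop. The sketch below shows one plausible reading of them: each attempt generates `target_rows * sampling_ratio` candidate rows, keeps those that satisfy the constraint, and stops once enough rows are collected or `max_trials` is reached. This is an assumption about the general technique, not PETsARD's actual implementation; `resample_until_enough` and `toy_generate` are hypothetical names, and the toy generator stands in for a real synthesizer.

```python
import random


def resample_until_enough(generate, is_valid, target_rows,
                          sampling_ratio=10.0, max_trials=300):
    """Collect `target_rows` valid rows by repeated generate-and-filter rounds."""
    accepted = []
    trials = 0
    while len(accepted) < target_rows and trials < max_trials:
        trials += 1
        # Oversample so that enough rows survive the constraint filter
        batch = generate(int(target_rows * sampling_ratio))
        accepted.extend(row for row in batch if is_valid(row))
    # Trim any surplus; fewer rows come back if max_trials ran out first
    return accepted[:target_rows], trials


rng = random.Random(0)


def toy_generate(n):
    # Stand-in for a synthesizer: ages drawn uniformly from 0..99
    return [{"age": rng.randint(0, 99)} for _ in range(n)]


rows, trials = resample_until_enough(
    toy_generate, lambda row: 18 <= row["age"] <= 65, target_rows=100
)
print(len(rows), trials)
```

Under this reading, a larger `sampling_ratio` trades memory per attempt for fewer attempts, while `max_trials` bounds the run time when a constraint is rarely or never satisfied.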

## Validate Mode: Single Data Source / 驗證檢查模式：單一資料來源

In [5]:
display_yaml_info(yaml_path[2])
executor = Executor(yaml_path[2])  # avoid shadowing the built-in exec() used in the setup cell
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: validate-mode_single-data-source.yaml
📁 Path: petsard/demo/petsard-yaml/constrainer-yaml/validate-mode_single-data-source.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Splitter:
  external_split:
    method: custom_data
    filepath:
      ori: benchmark://adult-income_ori
      control: benchmark://adult-income_control
    schema:
      ori: benchmark://adult-income_schema
      control: benchmark://adult-income_schema
Synthesizer:
  external_data:
    method: custom_data
    filepath: benchmark://adult-income_syn
    schema: benchmark://adult-income_schema
Constrainer:
  validate_single_data_source:
    method: auto          # Automatically selects validate mode
    source: Splitter.ori  # Specify single data source (optional if only one source exists)
    constraints_yaml: adult-income_constraints.yaml
Reporter:
  validation:
    method: save_validation
...
📊 Execution Results / 執行結果

[1] Splitter[
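
In contrast to resample mode, validate mode checks existing data against the constraints rather than generating new rows. A minimal sketch of that idea follows: it counts violations per constraint and reports an overall pass rate without modifying the data. The function name `validate_rows`, the report keys, the `hours` column, and both example constraints are hypothetical; the actual contents of `adult-income_constraints.yaml` and the format of PETsARD's validation report are not shown in this notebook.

```python
def validate_rows(rows, constraints):
    """Return per-constraint violation counts and the overall pass rate."""
    violations = {name: 0 for name in constraints}
    passed = 0
    for row in rows:
        ok = True
        for name, check in constraints.items():
            if not check(row):
                violations[name] += 1
                ok = False
        passed += ok  # a row passes only if every constraint holds
    total = len(rows)
    return {
        "total_rows": total,
        "passed_rows": passed,
        "pass_rate": passed / total if total else 0.0,
        "violations": violations,
    }


# Hypothetical constraints and records, for illustration only
constraints = {
    "age_range": lambda row: 18 <= row["age"] <= 65,
    "hours_positive": lambda row: row["hours"] > 0,
}
rows = [
    {"age": 17, "hours": 40},
    {"age": 30, "hours": 0},
    {"age": 45, "hours": 38},
    {"age": 70, "hours": 20},
]

report = validate_rows(rows, constraints)
print(report["pass_rate"], report["violations"])
```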

## Validate Mode: Multiple Data Sources / 驗證檢查模式：多個資料來源

In [6]:
display_yaml_info(yaml_path[3])
executor = Executor(yaml_path[3])  # avoid shadowing the built-in exec() used in the setup cell
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: validate-mode_multiple-data-sources.yaml
📁 Path: petsard/demo/petsard-yaml/constrainer-yaml/validate-mode_multiple-data-sources.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Loader:
  load_benchmark_with_schema:
    filepath: benchmark://adult-income
    schema: benchmark://adult-income_schema
Splitter:
  external_split:
    method: custom_data
    filepath:
      ori: benchmark://adult-income_ori
      control: benchmark://adult-income_control
    schema:
      ori: benchmark://adult-income_schema
      control: benchmark://adult-income_schema
Synthesizer:
  external_data:
    method: custom_data
    filepath: benchmark://adult-income_syn
    schema: benchmark://adult-income_schema
Constrainer:
  validate_multiple_data_sources:
    method: auto  # Automatically selects validate mode
    source:       # Use list format for multiple sources
      - Loader
      - Splitter.ori
      - Splitter.control
 