# Environment Setup

In [1]:
import os  # noqa: I001
import sys
from pathlib import Path


# Load demo_utils.py: download it on Colab, otherwise search locally
if "COLAB_GPU" in os.environ:
    import urllib.request

    demo_utils_url = (
        "https://raw.githubusercontent.com/nics-tw/petsard/main/demo/demo_utils.py"
    )
    # Run the downloaded helper module in this notebook's namespace
    with urllib.request.urlopen(demo_utils_url) as response:
        exec(response.read().decode("utf-8"))
else:
    # Search the current directory and up to 10 parents for a local demo_utils.py
    for p in [Path.cwd()] + list(Path.cwd().parents)[:10]:
        utils_path = p / "demo_utils.py"
        if utils_path.exists() and "demo" in str(utils_path):
            sys.path.insert(0, str(p))
            exec(utils_path.read_text(encoding="utf-8"))
            break

📂 Current working directory: demo/petsard-yaml/loader-yaml
✅ PETsARD demo_utils loaded. Use quick_setup() to initialize.
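
The local branch of the cell above walks from the current directory up through its parents until it finds `demo_utils.py`. As a minimal sketch of that search pattern, the same idea can be written as a reusable function. The name `find_upwards` and its parameters are hypothetical and not part of PETsARD or `demo_utils`, and this version omits the extra check that the path contains `"demo"`.

```python
from pathlib import Path
from typing import Optional


def find_upwards(filename: str, start: Path, max_parents: int = 10) -> Optional[Path]:
    """Return the first directory at or above `start` that contains `filename`.

    Hypothetical helper for illustration; not part of PETsARD or demo_utils.
    """
    # Check `start` itself first, then at most `max_parents` ancestors.
    candidates = [start, *list(start.parents)[:max_parents]]
    for directory in candidates:
        if (directory / filename).exists():
            return directory
    return None
```

For example, `find_upwards("demo_utils.py", Path.cwd())` would return the directory to add to `sys.path`, or `None` if nothing is found within the search limit.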


## Quick Setup: Loader YAML

In [2]:
from demo_utils import display_results, display_yaml_info, quick_setup  # noqa: I001
from petsard import Executor  # noqa: I001


is_colab, branch, yaml_path = quick_setup(
    config_file=[
        "basic-loading.yaml",
        "using-schema-file.yaml",
        "multiple-data-loading.yaml",
    ],
    benchmark_data=[
        "adult-income",
        "adult-income_schema",
        "adult-income_ori",
        "adult-income_control",
        "adult-income_syn",
    ],
    petsard_branch="main",
)

✅ Changed working directory to demo: petsard/demo
   📁 Notebook location: demo/petsard-yaml/loader-yaml/
   🔍 YAML search priority: 
      1. demo/petsard-yaml/loader-yaml/
      2. demo/
   💾 Output files will be saved in: demo/
🚀 PETsARD v1.7.0
📅 2025-10-12 11:23:06 UTC+8
✅ Loaded benchmark dataset: adult-income
✅ Downloaded benchmark schema: adult-income_schema
✅ Loaded benchmark dataset: adult-income_ori
✅ Loaded benchmark dataset: adult-income_control
✅ Loaded benchmark dataset: adult-income_syn
🔧 Added to Python path: petsard/demo/petsard-yaml/loader-yaml
📁 Processing configuration files from subfolder: petsard-yaml/loader-yaml
✅ Found configuration (1/3): petsard/demo/petsard-yaml/loader-yaml/basic-loading.yaml
✅ Found configuration (2/3): petsard/demo/petsard-yaml/loader-yaml/using-schema-file.yaml
✅ Found configuration (3/3): petsard/demo/petsard-yaml/loader-yaml/multiple-data-loading.yaml


# Execution and Results

## Basic Loading

In [3]:
display_yaml_info(yaml_path[0])
# Use a name that does not shadow the built-in exec() used in the setup cell
executor = Executor(yaml_path[0])
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: basic-loading.yaml
📁 Path: petsard/demo/petsard-yaml/loader-yaml/basic-loading.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Loader:
  load_csv:
    filepath: benchmark/adult-income.csv
...
📊 Execution Results / 執行結果

[1] Loader[load_csv]
------------------------------------------------------------
📈 DataFrame: 48,842 rows × 15 columns
📋 Showing first 3 rows / 顯示前 3 行:

   age  workclass  fnlwgt   education  educational-num      marital-status         occupation relationship   race gender  capital-gain  capital-loss  hours-per-week native-country income
0   25    Private  226802        11th                7       Never-married  Machine-op-inspct    Own-child  Black   Male             0             0              40  United-States  <=50K
1   38    Private   89814     HS-grad                9  Married-civ-spouse    Farming-fishing      Husband  White   Male             0             0              50  U
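
The `filepath` in this configuration is relative (`benchmark/adult-income.csv`), and the setup output reports that the working directory was changed to `demo`. Assuming relative paths are resolved against the current working directory (an assumption, not confirmed by the output above), a small pre-flight check can catch a wrong working directory before running the `Executor`. The helper name `missing_files` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(relative_paths: Iterable[str], base_dir: Path) -> List[str]:
    """Return the entries of `relative_paths` that are not files under `base_dir`.

    Hypothetical helper for illustration; not part of PETsARD.
    """
    return [p for p in relative_paths if not (base_dir / p).is_file()]
```

Calling `missing_files(["benchmark/adult-income.csv"], Path.cwd())` returns an empty list when the dataset is where the configuration expects it.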

## Using a Schema File

In [4]:
display_yaml_info(yaml_path[1])
# Use a name that does not shadow the built-in exec() used in the setup cell
executor = Executor(yaml_path[1])
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: using-schema-file.yaml
📁 Path: petsard/demo/petsard-yaml/loader-yaml/using-schema-file.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Loader:
  load_with_schema:
    filepath: benchmark/adult-income.csv
    schema: benchmark/adult-income_schema.yaml
...
📊 Execution Results / 執行結果

[1] Loader[load_with_schema]
------------------------------------------------------------
📈 DataFrame: 48,842 rows × 15 columns
📋 Showing first 3 rows / 顯示前 3 行:

   age  workclass  fnlwgt   education  educational-num      marital-status         occupation relationship   race gender  capital-gain  capital-loss  hours-per-week native-country income
0   25    Private  226802        11th                7       Never-married  Machine-op-inspct    Own-child  Black   Male             0             0              40  United-States  <=50K
1   38    Private   89814     HS-grad                9  Married-civ-spouse    Farming-fishing   

## Multiple Data Loading

In [5]:
display_yaml_info(yaml_path[2])
# Use a name that does not shadow the built-in exec() used in the setup cell
executor = Executor(yaml_path[2])
executor.run()
display_results(executor.get_result())

📋 YAML Configuration Files / YAML 設定檔案

📄 File: multiple-data-loading.yaml
📁 Path: petsard/demo/petsard-yaml/loader-yaml/multiple-data-loading.yaml

⚙️ Configuration content / 設定內容:
----------------------------------------
---
Loader:
  # Load training data
  load_train:
    filepath: benchmark/adult-income_ori.csv
    schema: benchmark/adult-income_schema.yaml

  # Load test data
  load_test:
    filepath: benchmark/adult-income_control.csv
    schema: benchmark/adult-income_schema.yaml

  # Load synthesizing data
  load_synthesizer:
    filepath: benchmark/adult-income_syn.csv
    schema: benchmark/adult-income_schema.yaml
...
📊 Execution Results / 執行結果

[1] Loader[load_train]
------------------------------------------------------------
📈 DataFrame: 39,073 rows × 15 columns
📋 Showing first 3 rows / 顯示前 3 行:

   age  workclass  fnlwgt   education  educational-num      marital-status         occupation relationship   race gender  capital-gain  capital-loss  hours-per-week native-countr
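
The configuration shown in this section repeats one pattern: each named entry under `Loader` has its own `filepath` and shares a `schema`. As a rough sketch of that structure, the text of such a file can be generated from a mapping of entry names to file paths. `build_loader_yaml` is a hypothetical helper, not a PETsARD function, and it assumes names and paths are plain strings that need no YAML quoting.

```python
from typing import Dict


def build_loader_yaml(filepaths: Dict[str, str], schema: str) -> str:
    """Render a Loader section with one named entry per item in `filepaths`.

    Hypothetical helper for illustration; values are written unquoted,
    so this only suits simple names and paths.
    """
    lines = ["---", "Loader:"]
    for name, filepath in filepaths.items():
        lines.append(f"  {name}:")
        lines.append(f"    filepath: {filepath}")
        lines.append(f"    schema: {schema}")
    lines.append("...")
    return "\n".join(lines) + "\n"


config_text = build_loader_yaml(
    {
        "load_train": "benchmark/adult-income_ori.csv",
        "load_test": "benchmark/adult-income_control.csv",
        "load_synthesizer": "benchmark/adult-income_syn.csv",
    },
    schema="benchmark/adult-income_schema.yaml",
)
print(config_text)
```

The generated text has the same three entries as `multiple-data-loading.yaml`, without the comments and blank lines of the original file.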