# Notebook cell: Run the evaluation job with ScriptProcessor

We use ScriptProcessor with the XGBoost container image (the same one used for training) so that Python and the required dependencies are already available (although the script also runs pip install for them again anyway).

In [1]:
from sagemaker.processing import ScriptProcessor, ProcessingInput, ProcessingOutput
from sagemaker import image_uris
import sagemaker
import boto3
from time import gmtime, strftime

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
region = boto3.Session().region_name
bucket = sess.default_bucket()

print("Region:", region)
print("Bucket:", bucket)



sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml
Region: us-east-1
Bucket: sagemaker-us-east-1-423623839320


In [2]:
# Load the S3 URIs saved in the previous steps
%store -r processed_test_data_s3_uri
%store -r model_artifact

print("Test data S3: ", processed_test_data_s3_uri)
print("Model artifact S3:", model_artifact)



Test data S3:  s3://sagemaker-us-east-1-423623839320/sagemaker-scikit-learn-2025-12-07-12-52-23-075/output/retail-test
Model artifact S3: s3://sagemaker-us-east-1-423623839320/retail-demand/xgboost-model-2025-12-06-13-01-48/sagemaker-xgboost-2025-12-06-13-01-48-299/output/model.tar.gz
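
Before launching the job, it can help to sanity-check the URIs restored by `%store -r`. The following is a minimal sketch, not part of the original notebook: `split_s3_uri` is a hypothetical helper name, and the example value is copied from the cell output above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key); raise ValueError otherwise."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Example value copied from the cell output above.
example_uri = (
    "s3://sagemaker-us-east-1-423623839320/retail-demand/"
    "xgboost-model-2025-12-06-13-01-48/"
    "sagemaker-xgboost-2025-12-06-13-01-48-299/output/model.tar.gz"
)
bucket_name, key = split_s3_uri(example_uri)
print("Bucket:", bucket_name)
print("Key ends with model.tar.gz:", key.endswith("model.tar.gz"))
```

In the notebook, the same check could be applied to `processed_test_data_s3_uri` and `model_artifact` so that a missing or malformed stored value fails early rather than inside the processing job.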


In [3]:
# Use the same XGBoost container version as in training
xgb_image_uri = image_uris.retrieve(
    framework="xgboost",
    region=region,
    version="1.7-1",
    py_version="py3",
    instance_type="ml.m5.xlarge",
)

print("XGBoost image for evaluation:", xgb_image_uri)

# S3 prefix for the evaluation outputs
timestamp = strftime("%Y-%m-%d-%H-%M-%S", gmtime())
eval_output_s3 = f"s3://{bucket}/retail-demand/evaluation-{timestamp}/"

script_processor = ScriptProcessor(
    image_uri=xgb_image_uri,
    command=["python3"],
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    base_job_name="evaluate-retail-demand",
    sagemaker_session=sess,
)

script_processor.run(
    code="evaluate.py",
    inputs=[
        ProcessingInput(
            input_name="test-data",
            source=processed_test_data_s3_uri,
            destination="/opt/ml/processing/test",
        ),
        ProcessingInput(
            input_name="model-artifact",
            source=model_artifact,
            destination="/opt/ml/processing/model",
        ),
    ],
    outputs=[
        ProcessingOutput(
            output_name="evaluation",
            source="/opt/ml/processing/output/evaluation",
            destination=eval_output_s3,
        ),
    ],
    arguments=[
        "--test_data",
        "/opt/ml/processing/test",
        "--model_dir",
        "/opt/ml/processing/model",
        "--output_dir",
        "/opt/ml/processing/output/evaluation",
    ],
    wait=True,
    logs=True,
)



INFO:sagemaker:Creating processing-job with name evaluate-retail-demand-2025-12-13-13-26-40-023


XGBoost image for evaluation: 683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgboost:1.7-1


[notice] A new release of pip is available: 25.2 -> 25.3
[notice] To update, run: pip install --upgrade pip
Collecting sagemaker==2.24.1
  Downloading sagemaker-2.24.1.tar.gz (397 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting attrs (from sagemaker==2.24.1)
  Downloading attrs-25.4.0-py3-none-any.whl.metadata (10 kB)
Collecting google-pasta (from sagemaker==2.24.1)
  Downloading google_pasta-0.2.0-py3-none-any.whl.metadata (814 bytes)
Collecting protobuf3-to-dict>=0.1.5 (from sagemaker==2.24.1)
  Downloading protobuf3-to-dict-0.1.5.tar.gz (3.5 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting smdebug_rulesconfig==1.0.1 (from sagemaker==2.24.1)
  Downloading smdebug_rulesconfig-1.0.1-py2.py3-none-any.whl.metadata (943 bytes)
Collecting importlib-metadata>=1.4.0 (from sagemaker==2.24.1)
  Downloading importlib_metadata-8.7.0-py3-none-any.whl.metadata (4
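
The notebook does not show `evaluate.py`, so the following is only a sketch of how such a script might consume the `arguments` passed to `script_processor.run` above. It assumes that the model input arrives as a single `model.tar.gz` inside the `--model_dir` directory (which is how a single S3 object is normally delivered to a `ProcessingInput` destination); the function names and the overall layout are hypothetical, and the actual scoring and metric code is left out.

```python
import argparse
import tarfile
from pathlib import Path


def parse_args(argv=None):
    """Parse the three path arguments passed by the processing job."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--test_data", default="/opt/ml/processing/test")
    parser.add_argument("--model_dir", default="/opt/ml/processing/model")
    parser.add_argument(
        "--output_dir", default="/opt/ml/processing/output/evaluation"
    )
    return parser.parse_args(argv)


def extract_model_archive(model_dir):
    """Unpack model.tar.gz in place and return the extracted file paths."""
    model_dir = Path(model_dir)
    archive = model_dir / "model.tar.gz"  # assumed archive name
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(path=model_dir)
    return sorted(
        p for p in model_dir.rglob("*") if p.is_file() and p != archive
    )


def main(argv=None):
    args = parse_args(argv)
    # Files written under output_dir are uploaded to S3 when the job ends.
    Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    model_files = extract_model_archive(args.model_dir)
    print("Extracted model files:", [p.name for p in model_files])
    # Loading the model, scoring the test data, and writing metrics
    # would follow here; that part depends on the real evaluate.py.
```

A real script would call `main()` under the usual `if __name__ == "__main__":` guard. Keeping the argument defaults equal to the container paths means the script also runs if the `arguments` list is omitted.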

In [11]:
print("Evaluation outputs stored at:", eval_output_s3)
%store eval_output_s3


Evaluation outputs stored at: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/
Stored 'eval_output_s3' (str)


In [1]:
%store -r eval_output_s3
print(eval_output_s3)

s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/


In [2]:
!mkdir -p report
!aws s3 cp $eval_output_s3 report/ --recursive

download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/bias_metrics.json to report/bias_metrics.json
download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/evaluation_summary.csv to report/evaluation_summary.csv
download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/data_profile.json to report/data_profile.json
download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/residual_hist.png to report/residual_hist.png
download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/pred_vs_actual.png to report/pred_vs_actual.png
download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/evaluation_summary.json to report/evaluation_summary.json
download: s3://sagemaker-us-east-1-423623839320/retail-demand/evaluation-2025-12-03-10-49-06/shap_feature_importance.csv to report/shap_featur
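
Once the files are in `report/`, the JSON outputs can be loaded for inspection. This is a minimal sketch under the assumption that the `.json` files are ordinary JSON documents; their schema is not shown in the notebook, so the code only parses them and reports the top-level type. `load_json_reports` is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_json_reports(report_dir="report"):
    """Return {filename: parsed JSON} for every *.json file in report_dir."""
    reports = {}
    for path in sorted(Path(report_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            reports[path.name] = json.load(f)
    return reports


# Only runs when the download step above has created the directory.
if Path("report").is_dir():
    for name, content in load_json_reports("report").items():
        print(name, "->", type(content).__name__)
```

The CSV and PNG outputs are deliberately skipped here; they could be opened with whatever tabular or image tooling the notebook already uses.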