
Pipeline code upload location invalid (standalone job pattern applied) for steps included in Condition step #4142


Describe the bug
I am deploying a SageMaker Pipeline with some Training and Processing steps included in a ConditionStep as its if_steps. I am noticing that the local code (scripts) for these Training and Processing steps is uploaded to S3 under the standalone-job pattern s3://&lt;bucket&gt;/&lt;prefix&gt;/&lt;base_job_name&gt;-&lt;timestamp&gt;/..., while it should go to a pipeline-scoped location like s3://&lt;bucket&gt;/&lt;prefix&gt;/&lt;pipeline_name&gt;/&lt;code_hash&gt;/....

I tried to trace this in the SDK code, and I believe the issue is that the build_steps() calls for the whole list of steps under the ConditionStep are executed within a single _pipeline_config_manager context (that of the parent ConditionStep), whose code_hash property is None.

See https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/workflow/utilities.py#L120

To reproduce
Create and deploy a Pipeline with Processing/Training steps (using local script code) wrapped in a ConditionStep, then observe the S3 location the code is uploaded to. A minimal sketch follows.
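
A minimal reproduction sketch, assuming a PipelineSession with default credentials, a local `preprocess.py`, and placeholder names (`my-condition-pipeline`, the role ARN, `my-preprocess`) that are hypothetical:

```python
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.conditions import ConditionEquals
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.steps import ProcessingStep

pipeline_session = PipelineSession()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder

mode = ParameterString(name="Mode", default_value="full")

processor = SKLearnProcessor(
    framework_version="1.2-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    base_job_name="my-preprocess",
    sagemaker_session=pipeline_session,
)

# Local script: under a PipelineSession, .run() returns step arguments
# instead of starting a job, so the code upload is deferred to compilation.
step_args = processor.run(code="preprocess.py")
process_step = ProcessingStep(name="Preprocess", step_args=step_args)

cond_step = ConditionStep(
    name="CheckMode",
    conditions=[ConditionEquals(left=mode, right="full")],
    if_steps=[process_step],  # steps nested here hit the standalone-job upload path
    else_steps=[],
)

pipeline = Pipeline(
    name="my-condition-pipeline",
    parameters=[mode],
    steps=[cond_step],
    sagemaker_session=pipeline_session,
)
pipeline.upsert(role_arn=role)
```

After upsert, `preprocess.py` lands under `s3://<bucket>/my-preprocess-<timestamp>/...` rather than `s3://<bucket>/my-condition-pipeline/<code_hash>/...` (the same step outside a ConditionStep uses the pipeline pattern).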

Expected behavior
Local code for pipeline steps should be uploaded to a pipeline-specific location in S3 (the &lt;pipeline_name&gt;/&lt;code_hash&gt; pattern above).
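
One way to check where the code actually landed (a sketch, reusing the hypothetical `my-preprocess` base job name from above; the bucket name is a placeholder for the session's default bucket):

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-sagemaker-bucket"  # placeholder: the session's default bucket

# Keys under "<base_job_name>-<timestamp>/..." indicate the standalone-job
# pattern; a correct upload would live under "<pipeline_name>/<code_hash>/...".
resp = s3.list_objects_v2(Bucket=bucket, Prefix="my-preprocess-")
for obj in resp.get("Contents", []):
    print(obj["Key"])
```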

System information
SageMaker Python SDK version: 2.188.0
