Error running a pipeline with a Processing Job using a LocalSession #4307

@svpino

Describe the bug
Trying to run a pipeline with a Processing Job using a LocalSession fails with the following error:

Pipeline step 'preprocess-data' FAILED. Failure message is: TypeError: create_processing_job() got multiple values for argument 'ProcessingJobName'

This happens with sagemaker version `2.199.0`.

To reproduce
The following pipeline definition reproduces the error:

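from sagemaker.processing import ProcessingInput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep

# role, dataset_location, and local_sagemaker_session are defined elsewhere (not shown in this report).
processor = SKLearnProcessor(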
    base_job_name="preprocess-data",
    framework_version="1.2-1",
    instance_type="ml.m5.xlarge",
    instance_count=1,
    role=role,
    sagemaker_session=local_sagemaker_session,
)

preprocessing_step = ProcessingStep(
    name="preprocess-data",
    step_args=processor.run(
        code="preprocessor.py",
        inputs=[
            ProcessingInput(source=dataset_location, destination="/opt/ml/processing/input"),
        ],
        outputs=[
             ...
        ],
    )
)

pipeline = Pipeline(
    name="sample-pipeline",
    parameters=[dataset_location],
    steps=[preprocessing_step],
    sagemaker_session=local_sagemaker_session,
)

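pipeline.upsert(role_arn=role)

# Assumption: the report omits the call that starts the execution shown in the logs below.
pipeline.start()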

Expected behavior
The pipeline should run locally like it does in version 2.192.1.

Screenshots or logs
Here are the full logs when running the sample code:

Starting execution for pipeline sample-pipeline. Execution ID is b3417a61-042d-4174-84ec-7d39cd529451
Starting pipeline step: 'preprocess-data'
Pipeline step 'preprocess-data' FAILED. Failure message is: TypeError: create_processing_job() got multiple values for argument 'ProcessingJobName'
Pipeline execution b3417a61-042d-4174-84ec-7d39cd529451 FAILED because step 'preprocess-data' failed.
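
For context, Python raises this kind of TypeError when the same parameter receives a value twice, for example once positionally and once through an unpacked keyword dictionary. The sketch below only illustrates that mechanism. The function `create_processing_job` here is a hypothetical stand-in, not the SDK's implementation, and the `request` dictionary and both helper functions are made up for illustration. It does not claim to show where the duplication happens inside sagemaker 2.199.0.

```python
def create_processing_job(ProcessingJobName, **kwargs):
    """Hypothetical stand-in for a job-creation function (not the SDK's code)."""
    return ProcessingJobName


def call_with_duplicate_name(request):
    # The name is passed positionally while it is still a key in the
    # unpacked dict, so the parameter receives two values.
    return create_processing_job(request["ProcessingJobName"], **request)


def call_without_duplicate(request):
    # Copy the dict and pop the name so it is passed exactly once.
    args = dict(request)
    name = args.pop("ProcessingJobName")
    return create_processing_job(name, **args)


# Illustrative request payload; the keys other than the job name are arbitrary.
request = {"ProcessingJobName": "preprocess-data", "AppSpecification": {}}

try:
    call_with_duplicate_name(request)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(call_without_duplicate(request))
```

The first call fails with the same class of message as the log above, and the second call succeeds because the job name is supplied only once.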

System information

  • SageMaker Python SDK version: 2.199.0
  • Framework name (e.g. PyTorch) or algorithm (e.g. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.9
  • CPU or GPU: Apple M1
  • Custom Docker image (Y/N): N
