feature: Allow passing a PipelineVariable-dependent location to ScriptProcessor constructor's "code" argument  #3846

@wimlewis-amazon

Describe the feature you'd like

I would like more control over how ScriptProcessor uploads and finds its code. In particular, I need to use a pipeline variable to define the S3 URI at which the code lives.

Internally, ScriptProcessor handles its code argument by converting it to a ProcessingInput and then adding it to its own inputs array (then updating its entrypoint). A fairly small change is to allow the caller to pass in a ProcessingInput instance directly.
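The proposed change could be sketched roughly as below. This is a self-contained illustration, not the SDK's actual code: `ProcessingInput` is a stand-in dataclass, `normalize_code_input` and its `script_name` argument are hypothetical, and the container path and `python3` command are assumptions. The issue does not say how the script's file name is conveyed when a ProcessingInput is passed, so the sketch simply asks the caller for it.

```python
import posixpath
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple, Union

# Assumed container directory for the code input (illustrative only).
CODE_DESTINATION = "/opt/ml/processing/input/code"


@dataclass
class ProcessingInput:
    """Stand-in for sagemaker.processing.ProcessingInput (illustration only)."""

    source: Any  # a plain string, or a value resolved only at pipeline run time
    destination: str
    input_name: str = "code"


def normalize_code_input(
    code: Union[str, ProcessingInput],
    command: List[str],
    script_name: Optional[str] = None,
) -> Tuple[ProcessingInput, List[str]]:
    """Hypothetical helper: return the code input and the container entrypoint."""
    if isinstance(code, ProcessingInput):
        # Caller-supplied input: never inspect code.source, it may be deferred.
        if script_name is None:
            raise ValueError("script_name is required when code is a ProcessingInput")
        code_input = code
    else:
        # Plain string path or URI: the basename can be read directly.
        script_name = posixpath.basename(code)
        code_input = ProcessingInput(source=code, destination=CODE_DESTINATION)
    # The entrypoint runs the script from the input's fixed destination.
    entrypoint = command + [posixpath.join(code_input.destination, script_name)]
    return code_input, entrypoint
```

With a string, the helper builds the input itself; with a ProcessingInput, it only reads the fixed `destination`, so the `source` is free to depend on a pipeline parameter.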

I have already implemented this locally and will be submitting a pull request shortly.

How would this feature be used? Please describe.

In my case, I am uploading a number of code assets to an S3 location that is not fully known until the pipeline is started; a pipeline parameter contains the specific S3 key prefix where this job run's files are to be found.

With this change, I can create a ProcessingInput whose source depends on the pipeline parameter but whose destination is a fixed string (allowing ScriptProcessor to use it).

Describe alternatives you've considered

My first thought was to allow passing a PipelineVariable to the code input. However, ScriptProcessor wants to be able to inspect the string (to detect whether it's local or remote, to get its basename, etc.). That is difficult or impossible if code is a PipelineVariable, because its value is not known until the pipeline runs.
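To illustrate the difficulty, here is a small sketch using a hypothetical `DeferredValue` class as a stand-in for a pipeline variable (it is not the SDK's PipelineVariable, and `try_basename` is likewise made up for this example). The point is only that a value resolved at run time has no string content to inspect when the pipeline is defined.

```python
import posixpath


class DeferredValue:
    """Hypothetical stand-in for a value that is only known at pipeline run time."""

    def __init__(self, name: str):
        self.name = name  # the parameter's name, not its eventual value


def try_basename(code):
    """Return the basename of `code`, or None if it cannot be inspected yet."""
    try:
        return posixpath.basename(code)
    except TypeError:
        # Not a str, bytes, or path-like object: nothing to parse at definition time.
        return None


print(try_basename("s3://example-bucket/jobs/preprocess.py"))
print(try_basename(DeferredValue("CodePrefix")))
```

The first call can read the file name from the string; the second has nothing to parse, which is why extra information such as a fixed destination has to come from somewhere else.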

Another approach would be for the user to pass the code input as a PipelineVariable but also pass extra information (basename, and an indication that it's an S3 URI) in another argument. But that's exactly the information that a ProcessingInput contains anyway, so it was simpler and more consistent with other sagemaker-sdk APIs to allow passing a ProcessingInput instance.
