Describe the feature you'd like
Definitions:
- Account A: where I will run SageMaker FrameworkProcessor jobs.
- Account B: where the data I need to access for ProcessingInput.source and ProcessingOutput.destination lives.
- Role A: IAM role in Account A that has the SageMakerFullAccess policy attached.
- Role B: IAM role in Account B that has S3 read and write access to the needed data.
Role A can assume Role B, and Role B has a trusted relationship with Role A.
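For concreteness, a minimal sketch of that trust relationship (the account ID and role names below are placeholders, and this assumes you are calling IAM with Account B credentials):

import json
import boto3

# Trust policy on Role B that allows Role A (in Account A) to assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::<ACCOUNT_A_ID>:role/RoleA"},
        "Action": "sts:AssumeRole",
    }],
}

# Apply it to Role B using Account B credentials (names are placeholders).
iam_b = boto3.client("iam")
iam_b.update_assume_role_policy(RoleName="RoleB", PolicyDocument=json.dumps(trust_policy))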
Problem:
I would like to create a FrameworkProcessor (specifically a TensorFlowProcessor) instance that runs in Account A but reads and writes data in Account B, to avoid copying data back and forth between the two accounts.
How would this feature be used? Please describe.
A role parameter could be added to the ProcessingInput and ProcessingOutput classes; the supplied role would be assumed before accessing the data.
processor = TensorFlowProcessor(role=role_A, ...)
processor.run(
    inputs=[ProcessingInput(source=.., destination=.., role=role_B)],
    outputs=[ProcessingOutput(source=.., destination=.., role=role_B)],
    ...
)
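Under the hood, the SDK could presumably assume the supplied role (via sts:AssumeRole, as sketched above) only for the S3 transfers of that input or output, while the job itself keeps running under Role A.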
Describe alternatives you've considered
There is a role parameter in the TensorFlowProcessor constructor. However:
- Using Role A fails with the error "No S3 objects found under S3 URL...". Reason: the object exists in Account B, not Account A.
- Using Role B fails with the error "botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateProcessingJob operation: RoleArn: Cross-account pass role is not allowed." Reason: the SageMaker permissions need to be defined in a role that lives in Account A.
How can I tell SageMaker to use Role A for creating and running the processing job, but to assume Role B to access the datasets in Account B? For example, I am able to do that easily in SageMaker notebooks.
Please let me know if there is a way to achieve this with the current FrameworkProcessor implementation.
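For context, this is roughly what I mean by the notebook case, as a minimal sketch (the role ARN, bucket name, and key below are placeholders): assume Role B with STS and use the temporary credentials for the S3 calls.

import boto3

# Running as Role A in Account A: assume Role B to get temporary
# credentials for Account B's data (the ARN is a placeholder).
creds = boto3.client("sts").assume_role(
    RoleArn="arn:aws:iam::<ACCOUNT_B_ID>:role/RoleB",
    RoleSessionName="cross-account-data-access",
)["Credentials"]

# S3 client that acts as Role B, so it can read and write Account B's bucket.
s3_b = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3_b.download_file("account-b-bucket", "path/to/input", "/tmp/input")  # placeholders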