Fix IDC domain S3 path resolution #67987

Draft
qaziashikin wants to merge 5 commits into apache:main from qaziashikin:idc_notebook_outputs_path_fix

@qaziashikin qaziashikin commented Jun 4, 2026

TL;DR: Scope notebook output reads to the project's S3 prefix.

The hook reads notebook outputs from a fixed bucket-root key:

s3://<prefix>/.sys/notebooks/<notebook_id>/runs/<run_id>/notebook_outputs.json

That works for IAM domains (the bucket has no per-project prefix) but fails for IDC domains, whose project role only grants S3 access under the project's own scope:

s3:PutObject / s3:GetObject on <bucket>/${aws:PrincipalTag/AmazonDataZoneDomain}/${aws:PrincipalTag/AmazonDataZoneProject}/*
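To see why the bucket-root key is rejected for IDC, here is a rough sketch of the key-space check implied by that resource pattern. It is a deliberate simplification: real IAM evaluation involves more than a prefix comparison, and `key_in_project_scope` is a hypothetical helper, not part of the hook. The domain, project, notebook, and run IDs are taken from the IDC log line in the Testing section.

```python
def key_in_project_scope(key: str, domain_id: str, project_id: str) -> bool:
    """Approximate the role's resource pattern <bucket>/<domain>/<project>/*.

    Hypothetical helper for illustration only: it compares the object key
    against the domain/project prefix and ignores everything else IAM does.
    """
    return key.startswith(f"{domain_id}/{project_id}/")


# IDs copied from the IDC test log in this PR description.
DOMAIN_ID = "dzd-bkzf6ldy3hpbs7"
PROJECT_ID = "d3f93aknrykrs7"
SUFFIX = ".sys/notebooks/4h1klahfqowoiv/runs/cx2c783193fr8n/notebook_outputs.json"

# Old behaviour: key sits at the bucket root, outside the project's scope.
old_key = SUFFIX
# New behaviour: key sits under the project's own prefix.
new_key = f"{DOMAIN_ID}/{PROJECT_ID}/dev/{SUFFIX}"

old_allowed = key_in_project_scope(old_key, DOMAIN_ID, PROJECT_ID)
new_allowed = key_in_project_scope(new_key, DOMAIN_ID, PROJECT_ID)
```

Under this approximation the old key falls outside the allowed key space and the prefixed key falls inside it, which matches the failure described above.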

The kernel that writes the file is moving to use the project's full ProjectS3Path as the prefix, matching the role's allowed key space. Mirror that on the read side here:

  • get_project_s3_path now returns (bucket, prefix). prefix is the path component of the s3BucketPath provisioned resource.
  • get_notebook_outputs prepends prefix when constructing the output key, so reads target the same path the kernel writes to.
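The two changes above can be sketched roughly as follows. This is not the code in the diff: `split_s3_path` and `build_outputs_key` are hypothetical stand-ins for the logic inside `get_project_s3_path` and `get_notebook_outputs`, and the example `s3BucketPath` values are assumptions reconstructed from the log lines in the Testing section.

```python
from urllib.parse import urlparse


def split_s3_path(s3_path: str) -> tuple[str, str]:
    """Split an s3://bucket/some/prefix URI into (bucket, prefix).

    Hypothetical helper: the prefix is the URI's path component with
    leading and trailing slashes removed, and may be empty.
    """
    parsed = urlparse(s3_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_path!r}")
    return parsed.netloc, parsed.path.strip("/")


def build_outputs_key(prefix: str, notebook_id: str, run_id: str) -> str:
    """Build the notebook outputs key, prepending the project prefix if any."""
    suffix = f".sys/notebooks/{notebook_id}/runs/{run_id}/notebook_outputs.json"
    return f"{prefix}/{suffix}" if prefix else suffix


# Assumed s3BucketPath value for the IDC project, inferred from the test log.
idc_bucket, idc_prefix = split_s3_path(
    "s3://amazon-maxdome-430606112922-us-east-1-805540179/dzd-bkzf6ldy3hpbs7/d3f93aknrykrs7/dev"
)
idc_key = build_outputs_key(idc_prefix, "4h1klahfqowoiv", "cx2c783193fr8n")

# Assumed s3BucketPath value for the IAM project, inferred from the test log.
iam_bucket, iam_prefix = split_s3_path(
    "s3://amazon-sagemaker-377228489309-us-west-2-bisglciuqpgv0w/shared"
)
iam_key = build_outputs_key(iam_prefix, "bvnn0s06pp42a8", "51i8t94t6mq974")
```

Joining `s3://`, the bucket, and the key reproduces the paths shown in the IAM and IDC log lines under Testing, so the same code path serves both domain types.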

Testing

Tested by invoking the DAGs in both IAM and IDC domains.

IAM

2026-06-04T01:59:02.203146Z [info     ] Exiting notebook run 51i8t94t6mq974. Status: SUCCEEDED [airflow.task.hooks.airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.SageMakerUnifiedStudioNotebookHook]
2026-06-04T01:59:04.213415Z [info     ] Reading notebook outputs from s3://amazon-sagemaker-377228489309-us-west-2-bisglciuqpgv0w/shared/.sys/notebooks/bvnn0s06pp42a8/runs/51i8t94t6mq974/notebook_outputs.json [airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook]

IDC

2026-06-04T01:49:05.124510Z [info     ] Exiting notebook run cx2c783193fr8n. Status: SUCCEEDED [airflow.task.hooks.airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.SageMakerUnifiedStudioNotebookHook]
2026-06-04T01:49:06.004935Z [info     ] Reading notebook outputs from s3://amazon-maxdome-430606112922-us-east-1-805540179/dzd-bkzf6ldy3hpbs7/d3f93aknrykrs7/dev/.sys/notebooks/4h1klahfqowoiv/runs/cx2c783193fr8n/notebook_outputs.json [airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook]

PASSED

=================== 1 passed, 1 warning in 443.18s (0:07:23) ===================

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

@boring-cyborg boring-cyborg Bot added area:providers provider:amazon AWS/Amazon - related issues labels Jun 4, 2026