SRE-747: Tag ECR :staging on the image, not a single-entry index #8823
`tag-if-newer` advanced the mutable tag with `docker buildx imagetools create`, which always synthesizes a manifest list — even from a single source. On ECR (single-arch) that wrapped :staging in a single-entry image index, so its digest no longer matched the :sha/:run image: promotion couldn't walk :staging back to a run- tag, and Inspector findings sat on the index instead of the scanned child. Advance the ECR tag with `aws ecr put-image` instead (re-PUT the source manifest bytes under the mutable tag): same digest, flat image, no index. GHCR stays on imagetools create — a multi-arch manifest list is correct there. crane could unify both later (SRE-759).
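A minimal sketch of the re-PUT described above, assuming the AWS CLI is installed and credentials and region are already configured. The function name `advance_ecr_tag` and its argument order are hypothetical and are not taken from the action itself; only the two `aws ecr` calls come from the description.

```shell
# Hypothetical helper: re-PUT the manifest behind SOURCE_TAG under MUTABLE_TAG.
# Usage: advance_ecr_tag <repository-name> <source-tag> <mutable-tag>
advance_ecr_tag() {
  local repo="$1" source_tag="$2" mutable_tag="$3"
  local manifest out

  # Fetch the manifest the source tag points at.
  # Caveat: command substitution strips trailing newlines; this sketch assumes
  # the stored manifest has none, otherwise the digest could change.
  manifest="$(aws ecr batch-get-image \
    --repository-name "$repo" \
    --image-ids "imageTag=${source_tag}" \
    --query 'images[0].imageManifest' \
    --output text)" || return 1

  if [ -z "$manifest" ] || [ "$manifest" = "None" ]; then
    echo "no manifest found for ${repo}:${source_tag}" >&2
    return 1
  fi

  # PUT the same manifest under the mutable tag: no index wrapper is created.
  if out="$(aws ecr put-image \
      --repository-name "$repo" \
      --image-tag "$mutable_tag" \
      --image-manifest "$manifest" 2>&1)"; then
    return 0
  fi

  # The tag already points at this digest (for example on a retry), which is
  # the desired end state, so it is treated as success.
  if printf '%s' "$out" | grep -q 'ImageAlreadyExistsException'; then
    return 0
  fi

  printf '%s\n' "$out" >&2
  return 1
}
```

A call such as `advance_ecr_tag hashintel/hash/graph "sha-<sha>" staging` would then move `:staging` to the image behind the `:sha-<sha>` tag.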
Pull request overview
This PR adjusts the tag-if-newer composite action so that advancing a mutable tag preserves the correct manifest shape per registry—preventing ECR :staging from becoming a single-entry image index (which breaks digest-based promotion and Inspector findings).
Changes:
- Makes the “advance mutable tag” step registry-aware: uses `aws ecr batch-get-image` + `aws ecr put-image` for ECR to re-PUT the exact manifest bytes (flat image, identical digest).
- Keeps GHCR behavior unchanged by continuing to use `docker buildx imagetools create` to produce/advance multi-arch manifest lists.
- Updates the action’s inline documentation to explain the registry-specific behavior and requirements (AWS credentials/region for ECR).
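The registry-aware branch could look roughly like the sketch below. The function name `classify_image_ref` and its output format are hypothetical; the ECR glob is the one quoted in the PR description, and the sketch assumes the reference is passed without a tag or digest suffix.

```shell
# Hypothetical helper: print "<branch> <repository>" for an image reference.
classify_image_ref() {
  local ref="$1"
  case "$ref" in
    *.dkr.ecr.*.amazonaws.com/*)
      # ECR API calls take the repository name without the registry host,
      # so drop everything up to and including the first "/".
      printf 'ecr %s\n' "${ref#*/}"
      ;;
    *)
      # Anything else (for example ghcr.io/...) stays on imagetools create.
      printf 'imagetools %s\n' "$ref"
      ;;
  esac
}
```

With a made-up account ID and region, `classify_image_ref 123456789012.dkr.ecr.us-east-1.amazonaws.com/hashintel/hash/graph` prints `ecr hashintel/hash/graph`.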
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
| | main | #8823 | +/- |
|---|---|---|---|
| Coverage | 59.08% | 59.08% | |
| Files | 1343 | 1343 | |
| Lines | 129731 | 129731 | |
| Branches | 5866 | 5866 | |
| Hits | 76651 | 76654 | +3 |
| Misses | 52177 | 52174 | -3 |
| Partials | 903 | 903 | |
Benchmark results
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2002 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1001 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 3314 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 1526 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 2078 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 1033 | Flame Graph |
policy_resolution_medium
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 102 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 51 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 269 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 107 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 133 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 63 | Flame Graph |
policy_resolution_none
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 8 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 3 | Flame Graph |
policy_resolution_small
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 52 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 25 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 94 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 26 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 66 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 29 | Flame Graph |
read_scaling_complete
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id;one_depth | 1 entities | Flame Graph | |
| entity_by_id;one_depth | 10 entities | Flame Graph | |
| entity_by_id;one_depth | 25 entities | Flame Graph | |
| entity_by_id;one_depth | 5 entities | Flame Graph | |
| entity_by_id;one_depth | 50 entities | Flame Graph | |
| entity_by_id;two_depth | 1 entities | Flame Graph | |
| entity_by_id;two_depth | 10 entities | Flame Graph | |
| entity_by_id;two_depth | 25 entities | Flame Graph | |
| entity_by_id;two_depth | 5 entities | Flame Graph | |
| entity_by_id;two_depth | 50 entities | Flame Graph | |
| entity_by_id;zero_depth | 1 entities | Flame Graph | |
| entity_by_id;zero_depth | 10 entities | Flame Graph | |
| entity_by_id;zero_depth | 25 entities | Flame Graph | |
| entity_by_id;zero_depth | 5 entities | Flame Graph | |
| entity_by_id;zero_depth | 50 entities | Flame Graph |
read_scaling_linkless
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | 1 entities | Flame Graph | |
| entity_by_id | 10 entities | Flame Graph | |
| entity_by_id | 100 entities | Flame Graph | |
| entity_by_id | 1000 entities | Flame Graph | |
| entity_by_id | 10000 entities | Flame Graph |
representative_read_entity
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1 | Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1 | Flame Graph |
representative_read_entity_type
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| get_entity_type_by_id | Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba | Flame Graph |
representative_read_multiple_entities
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_property | traversal_paths=0 | 0 | |
| entity_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=0 | 0 | |
| link_by_source_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true |
scenarios
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| full_test | query-limited | Flame Graph | |
| full_test | query-unlimited | Flame Graph | |
| linked_queries | query-limited | Flame Graph | |
| linked_queries | query-unlimited | Flame Graph |
🌟 What is the purpose of this PR?
Since #8801 (SRE-739), the ECR `:staging` tag has been applied to an image index instead of the image itself (SRE-747), blocking promotion to production and breaking Inspector findings.

Root cause: the `tag-if-newer` action advanced the mutable tag with `docker buildx imagetools create`, whose only job is to assemble a manifest list: it wraps even a single source in an index. On ECR (single-arch, arm64) that produced a single-entry image index whose digest no longer matched the `:sha`/`:run` image. Same problem class as SRE-613, different cause (there it was BuildKit provenance; here it's the retag tool).

Fix: advance the ECR tag with `aws ecr put-image`, re-PUTting the exact source manifest bytes under the mutable tag (same digest, flat image, no index). GHCR keeps `imagetools create`, where a multi-arch manifest list is the correct, intended shape.

🔗 Related links
🚫 Blocked by
🔍 What does this change?
`tag-if-newer` now branches the advance step by registry:
- ECR (`*.dkr.ecr.*.amazonaws.com/*`): `aws ecr batch-get-image` fetches the `:sha-<sha>` manifest, `aws ecr put-image` re-PUTs it under the mutable tag → identical digest, flat image. A retry after a partial success hits `ImageAlreadyExistsException`, which is the intended end state and is treated as success.
- GHCR: unchanged (`imagetools create`).
- The steps outside the advance (including the `git merge-base --is-ancestor` check) are untouched and stay shared.

Pre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
⚠️ Known issues
- The `is_main` path is not exercised by CI before merge: the `stage` job (which runs `tag-if-newer` for ECR `:staging`) only runs on `main` pushes, so this fix is first exercised on the first `main` run after merge. Two assumptions that run must confirm:
  - `aws ecr put-image` accepts the `batch-get-image` manifest without an explicit `--image-manifest-media-type` (aws-cli v2 infers it from the OCI manifest's `mediaType`).
  - `imagetools inspect` still reads `org.opencontainers.image.revision` from the now-flat ECR image (via `.image.config.Labels`), not only from an index.
- The existing `:staging` index self-heals on the first post-merge `main` run (put-image overwrites it with the flat image).

🐾 Next steps
- Possibly adopt `crane tag` to collapse the ECR/GHCR branch into one registry-agnostic line (tradeoff: new dependency, buildx stays for the resolve step). Marked with a `TODO(SRE-759)` at the branch.

🛡 What tests cover this?
- Registry detection and repository-name parsing (`469…amazonaws.com/hashintel/hash/graph` → `hashintel/hash/graph`; `ghcr.io/...` → imagetools branch). The ECR mutation itself is only exercisable on the first post-merge `main` run (see Known issues).

❓ How to test this?
- After the first post-merge `main` run's `stage` job: `:staging` on ECR should resolve to the same digest as the run's `:sha-<sha>` / `:run-<run_id>` tags (no separate index digest).
- `aws ecr describe-images --repository-name hashintel/hash/<service> --image-ids imageTag=staging` → `imageManifestMediaType` should be an image manifest, not `...image.index.v1+json`.

📹 Demo
n/a — CI / infra change.
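
For reference, the two checks listed under "How to test this?" could be scripted roughly as follows. This is a sketch under assumptions: the helper names `is_flat_image_media_type`, `describe_tag` and `check_staging` are hypothetical, and the AWS CLI is assumed to be configured for the account that owns the repository.

```shell
# Succeeds when a manifest media type names a single image, not an index/list.
is_flat_image_media_type() {
  case "$1" in
    *image.index.v1+json|*manifest.list.v2+json) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: print "<digest><TAB><media-type>" for one tag in a repo.
describe_tag() {
  aws ecr describe-images \
    --repository-name "$1" \
    --image-ids "imageTag=$2" \
    --query 'imageDetails[0].[imageDigest,imageManifestMediaType]' \
    --output text
}

# Succeeds when :staging has the same digest as SOURCE_TAG and is a flat image.
# Usage: check_staging <repository-name> <source-tag>
check_staging() {
  local repo="$1" source_tag="$2"
  local staging_row source_row staging_digest staging_type source_digest

  staging_row="$(describe_tag "$repo" staging)" || return 1
  source_row="$(describe_tag "$repo" "$source_tag")" || return 1

  # Text output is tab-separated: field 1 is the digest, field 2 the media type.
  staging_digest="$(printf '%s\n' "$staging_row" | cut -f1)"
  staging_type="$(printf '%s\n' "$staging_row" | cut -f2)"
  source_digest="$(printf '%s\n' "$source_row" | cut -f1)"

  [ "$staging_digest" = "$source_digest" ] && is_flat_image_media_type "$staging_type"
}
```

Running `check_staging "hashintel/hash/<service>" "sha-<sha>"` after the first post-merge `main` run should exit with status 0 if the fix behaves as described.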