go-images shared manifest tags missing linux/amd64 after partial platform publish #2313

@qmuntal

The microsoft/go-images repository has GitHub issues disabled, and its README directs image issues to this repository, so I am opening this here as a root-cause/tracking issue for the linux/amd64 manifest outage reported in #2308, #2309, and #2310.

Summary

Several shared tags for mcr.microsoft.com/oss/go/microsoft/golang were republished with manifest lists that did not include linux/amd64, even though the amd64 platform images still existed as simple tags.

This affected at least the default/shared Linux tags for Go 1.25 and Go 1.26. AMD64 Docker/BuildKit builds failed with:

no match for platform in manifest: not found

Current status

Resolved by a full rebuild/republish on 2026-05-25.

Direct MCR manifest inspection now shows linux/amd64 is present again for the affected shared tags:

1.25                  linux/amd64, linux/arm/v7, linux/arm64
1.25.10               linux/amd64, linux/arm/v7, linux/arm64
1.25-bookworm         linux/amd64, linux/arm/v7, linux/arm64
1.25.10-1-bookworm    linux/amd64, linux/arm/v7, linux/arm64
1.25-bullseye         linux/amd64, linux/arm/v7, linux/arm64
1.25-azurelinux3.0    linux/amd64, linux/arm64
1.26                  linux/amd64, linux/arm/v7, linux/arm64
1.26.3                linux/amd64, linux/arm/v7, linux/arm64
1.26-bookworm         linux/amd64, linux/arm/v7, linux/arm64
1.26-bullseye         linux/amd64, linux/arm/v7, linux/arm64
1.26-azurelinux3.0    linux/amd64, linux/arm64
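
A check like the one above can be scripted. The following is a minimal sketch, not the tooling used for this issue: it assumes the raw manifest list JSON has already been fetched (for example with `docker buildx imagetools inspect --raw <image>:<tag>`) and relies only on the standard `manifests[].platform` fields of a manifest list / OCI image index. The function names and the trimmed `BROKEN_INDEX` sample are hypothetical.

```python
import json

REQUIRED = ["linux/amd64", "linux/arm64", "linux/arm/v7"]


def platforms_in_index(index_json: str) -> list[str]:
    """Return sorted 'os/arch[/variant]' strings for each entry of a manifest list."""
    index = json.loads(index_json)
    found = []
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        parts = [platform.get("os"), platform.get("architecture"), platform.get("variant")]
        found.append("/".join(part for part in parts if part))
    return sorted(found)


def missing_platforms(index_json: str, required: list[str]) -> list[str]:
    """Return the required platforms that the manifest list does not contain."""
    return sorted(set(required) - set(platforms_in_index(index_json)))


# Hypothetical, trimmed index shaped like a broken shared tag (digests omitted).
BROKEN_INDEX = json.dumps({
    "schemaVersion": 2,
    "manifests": [
        {"platform": {"os": "linux", "architecture": "arm", "variant": "v7"}},
        {"platform": {"os": "linux", "architecture": "arm64"}},
    ],
})

if __name__ == "__main__":
    print("present:", platforms_in_index(BROKEN_INDEX))
    print("missing:", missing_platforms(BROKEN_INDEX, REQUIRED))
```

For the Azure Linux tags, the required list would only contain linux/amd64 and linux/arm64, matching the table above.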

The versions repo also shows fresh 2026-05-25 platform entries for 1.25/bookworm and 1.26/bookworm, including amd64.

Expected repo configuration

The go-images manifest config expects amd64 to be part of these shared tags. For example, the 1.25 bookworm image group in microsoft/go-images includes shared tags like 1.25, 1.25.10, bookworm, and latest, and includes all three Linux platforms:

linux/amd64
linux/arm64
linux/arm/v7

What happened

This appears to be a two-part failure.

1. The run skipped Linux amd64 before building

In the problematic build (2983574), the Build > Linux_amd64 phase was skipped because its matrix output variable was null:

Evaluating: and(dependencies['GenerateBuildMatrix']['outputs']['matrix.LinuxAmd64'], ...)
Expanded: and(Null, ...)
Result: False

GenerateBuildMatrix did generate a linuxAmd64: matrix in its diagnostics. It listed six amd64 build legs:

1.25 azurelinux3.0
1.25 bookworm
1.25 bullseye
1.26 azurelinux3.0
1.26 bookworm
1.26 bullseye

But the Azure Pipelines step log only shows processed output variables for:

linuxArm32
linuxArm64

The expected output variables for these matrices were absent from the ADO step log:

linuxAmd64
windowsLtsc2025Amd64
windowsLtsc2022Amd64

A local repro with the same repo, same ImageBuilder tag (mcr.microsoft.com/dotnet-buildtools/image-builder:2980918), and same generateBuildMatrix command emitted all expected matrix output variables, including linuxAmd64 and both Windows matrices. The payloads were small (~0.9 KB to ~1.5 KB), so this does not look like an output-variable size limit.

Current hypothesis: the problematic run lost the tail of ImageBuilder stdout / Azure Pipelines logging-command output after processing linuxArm64. That caused Azure Pipelines to have no matrix.LinuxAmd64 output value, so the Linux_amd64 phase skipped before scheduling.
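
To make this kind of loss visible early, the captured stdout of the GenerateBuildMatrix step could be checked for the expected logging commands. The sketch below is illustrative only: it assumes the raw stdout is available with the `##vso[task.setvariable ...]` lines intact, the `info:` prefix and the placeholder JSON values in `SAMPLE_LOG` are made up, and the expected names are the five matrices mentioned in this issue.

```python
import re

# Matches an Azure Pipelines setvariable logging command anywhere in a line.
SETVAR = re.compile(r"##vso\[task\.setvariable\s+([^\]]*)\](.*)")

EXPECTED_MATRICES = {
    "linuxAmd64",
    "linuxArm32",
    "linuxArm64",
    "windowsLtsc2022Amd64",
    "windowsLtsc2025Amd64",
}


def emitted_output_variables(log_text: str) -> set[str]:
    """Collect names of variables set with isOutput=true in the given log text."""
    names = set()
    for line in log_text.splitlines():
        match = SETVAR.search(line)
        if not match:
            continue
        # Properties look like "variable=NAME;isOutput=true".
        props = dict(kv.split("=", 1) for kv in match.group(1).split(";") if "=" in kv)
        name = props.get("variable")
        if name and props.get("isOutput", "").lower() == "true":
            names.add(name)
    return names


def missing_matrix_outputs(log_text: str) -> list[str]:
    """Return expected matrix names that never appeared as output variables."""
    return sorted(EXPECTED_MATRICES - emitted_output_variables(log_text))


# Hypothetical log tail resembling the bad run: only the arm matrices arrive.
SAMPLE_LOG = """\
info: ##vso[task.setvariable variable=linuxArm32;isOutput=true]{"placeholder": 1}
info: ##vso[task.setvariable variable=linuxArm64;isOutput=true]{"placeholder": 2}
"""

if __name__ == "__main__":
    print("missing:", missing_matrix_outputs(SAMPLE_LOG))
```

Failing the step when this list is non-empty would turn a silent phase skip into a visible build failure.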

2. The partial run then published partial shared manifests

The go-images pipeline uses the .NET Docker tooling. In that tooling, manifest list creation moved to Post_Build and is based on the merged image-info artifacts from the current pipeline run.

The relevant upstream .NET Docker tooling change is dotnet/docker-tools@f6e7c49 (Move manifest list creation to Post_Build, copy via ACR import), from dotnet/docker-tools#2030 / dotnet/docker-tools#2038 for dotnet/docker-tools#2002.

Because only arm32/arm64 build legs ran, Post_Build merged only arm image-info fragments. It then created shared manifest lists from that partial image-info, and Copy Images imported those shared tags into the publish repo. The result was shared tags that no longer contained linux/amd64.

This second-stage failure mode is the same class of issue tracked upstream in dotnet/docker-tools#2107: path-filtered or otherwise partial builds can overwrite shared tags with partial manifest lists.
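
The gap can be pictured as a missing merge-and-verify step before the shared manifest list is created. The sketch below is a simplified model, not how dotnet/docker-tools implements or plans to implement this: platforms are reduced to a hypothetical platform-to-digest mapping, entries from the previous publish are carried forward when the current run did not rebuild them, and creation is refused if an expected platform is still missing. The digests are the 1.25/bookworm values quoted later in this issue.

```python
EXPECTED = ("linux/amd64", "linux/arm64", "linux/arm/v7")


def merge_platform_digests(previous: dict[str, str], current: dict[str, str]) -> dict[str, str]:
    """Carry previously published platforms forward; the current run's entries win."""
    merged = dict(previous)
    merged.update(current)
    return merged


def manifest_entries(expected, previous, current) -> dict[str, str]:
    """Return platform -> digest for the shared manifest, or raise if incomplete."""
    merged = merge_platform_digests(previous, current)
    missing = [platform for platform in expected if platform not in merged]
    if missing:
        raise ValueError("refusing to create shared manifest; missing: " + ", ".join(missing))
    return {platform: merged[platform] for platform in expected}


# amd64 was last built on 2026-05-22; the partial run only produced arm entries.
PREVIOUS = {
    "linux/amd64": "sha256:67b7f5310b6bc8e43491a59edd97f51beef858f02a06d5000a822a4da535e084",
}
CURRENT = {
    "linux/arm/v7": "sha256:992fb13b52f7de993d04034a278cf8530ca6fdbbcc06448b1cd936d3649c7605",
    "linux/arm64": "sha256:fef2261a718cb4edfd0a3ed870aba6b9f1716b5d9e3e71611bd3d9b0216530f0",
}

if __name__ == "__main__":
    for platform, digest in manifest_entries(EXPECTED, PREVIOUS, CURRENT).items():
        print(platform, digest)
```

With an empty `PREVIOUS`, the same call raises instead of producing an arm-only manifest, which is the behavior that would have blocked the partial publish.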

Historical check

I checked completed microsoft-go-images (official) runs on microsoft/main since 2026-01-01 for the Build > Linux_amd64 phase result.

Findings:

Build      Date        Overall result  Linux_amd64  Notes
2902587    2026-02-12  canceled        skipped      Before the Post_Build manifest-list change; canceled run, so no successful publish.
2923939    2026-03-11  failed          failed       Before the Post_Build manifest-list change; amd64 did start, but failed.
2944509    2026-04-06  failed          skipped      After the Post_Build manifest-list change; failed run, so not a successful publish.
2983574    2026-05-25  succeeded       skipped      After the Post_Build manifest-list change; this is the run that published partial manifests.

So the Linux_amd64 skip/null-matrix symptom may not be completely new, but I did not find evidence of a successful pre-f6e7c49 official publish where Linux_amd64 was skipped. The visible outage required both conditions:

  1. A successful publish run where Linux_amd64 was skipped and only arm image-info fragments were present.
  2. The newer Docker tools behavior where Post_Build creates shared manifest lists from the current run's merged image-info and Publish copies those manifest lists.

Before the Docker tools change, a skipped platform could have meant an incomplete/stale rebuild, but it would not have had the same path to overwrite public shared tags with a partial manifest list.

Versions repo evidence from the broken publish

Before the repair, the published image-info ledger showed new shared manifest entries created on 2026-05-25, but the amd64 platform entries were older than the arm entries.

For 1.25/bookworm before repair:

manifest: sha256:340fa1282d06aff83635d0620c4b79c476e3c436f37f9786fc680e4aef8d557d
manifest created: 2026-05-25T10:27:47Z

bookworm amd64  created 2026-05-22  digest sha256:67b7f5310b6bc8e43491a59edd97f51beef858f02a06d5000a822a4da535e084
bookworm arm32  created 2026-05-25  digest sha256:992fb13b52f7de993d04034a278cf8530ca6fdbbcc06448b1cd936d3649c7605
bookworm arm64  created 2026-05-25  digest sha256:fef2261a718cb4edfd0a3ed870aba6b9f1716b5d9e3e71611bd3d9b0216530f0

For 1.26/bookworm before repair:

manifest: sha256:ddf6e3355e530e9f3ec15bd1b14e728f6a863e59741d6f57136e23d0783e2494
manifest created: 2026-05-25T10:27:47Z

bookworm amd64  created 2026-05-22  digest sha256:c9d5d150ecdcc5ff8d1c12ac96df99b3a663e3639098316a3f795cf2c9ba3c8a
bookworm arm32  created 2026-05-25  digest sha256:a1cb738d7986d83de238ad9a30297ac0c72f50c0bcbfdd0a7637522ae1382aea
bookworm arm64  created 2026-05-25  digest sha256:68fcbe733c85e44afd32c25aff64ee2cab355d22cb9bc84b07264d98580f810c

This matched the observed pipeline behavior: arm platforms built and published; amd64 did not run in that build.
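
The date mismatch above can be turned into a simple heuristic check. This is a sketch under assumptions: the function name and the flat platform-to-date mapping are hypothetical and do not reflect the actual image-info schema, and an older platform entry is only a signal worth investigating, since an unchanged platform can legitimately keep an older creation date.

```python
from datetime import date, datetime


def stale_platforms(manifest_created: str, platform_created: dict[str, str]) -> list[str]:
    """Return platforms whose creation date is earlier than the manifest's date."""
    # Normalize the trailing 'Z' so datetime.fromisoformat accepts the timestamp.
    manifest_day = datetime.fromisoformat(manifest_created.replace("Z", "+00:00")).date()
    return sorted(
        name
        for name, created in platform_created.items()
        if date.fromisoformat(created) < manifest_day
    )


# Values for 1.25/bookworm before the repair, as listed above.
BOOKWORM_1_25 = {
    "amd64": "2026-05-22",
    "arm32": "2026-05-25",
    "arm64": "2026-05-25",
}

if __name__ == "__main__":
    print(stale_platforms("2026-05-25T10:27:47Z", BOOKWORM_1_25))
```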

Mitigation used

A fresh full official rebuild/publish repaired the published tags. The repair run needed all Linux platform legs and had to publish from that same run:

Pipeline: microsoft-go-images
Branch: microsoft/main
sourceBuildPipelineRunId: $(Build.BuildId)  # default
publishRepoPrefix: public/                 # default
stages: build,test,sign,publish

Do not run publish only from the broken run's image-info, because that would republish the incomplete manifest list.

User workaround while broken

Affected amd64 users were able to bypass the broken shared manifest by using the amd64 simple tag temporarily:

FROM mcr.microsoft.com/oss/go/microsoft/golang:1.25.10-1-bookworm-amd64 AS builder

If they needed Azure Linux 3.0 specifically:

FROM mcr.microsoft.com/oss/go/microsoft/golang:1.25.10-1-azurelinux3.0-amd64 AS builder

These tags are architecture-specific and do not float to future patch releases, so users should switch back to the shared tag now that the manifest lists are repaired.

Additional evidence for output tail loss

The missing linuxAmd64 matrix output is likely caused by ImageBuilder emitting Azure Pipelines control messages through normal console logging at process shutdown.

Evidence:

  • GenerateBuildMatrixCommand.EmitVstsVariables emits the ##vso[task.setvariable ...] commands with _logger.LogInformation(PipelineHelper.FormatOutputVariable(...)), not direct Console.WriteLine(...).
  • ImageBuilder configures AddSimpleConsole(...) logging and returns from Program.cs on successful command completion.
  • ImageBuilder builds a host and returns host.Services, but the successful path does not explicitly dispose the host/service provider before process exit.
  • The .NET ConsoleLoggerProcessor queues log messages to a background thread (IsBackground = true) and drains that queue in Dispose() by calling CompleteAdding() and _outputThread.Join(1500).
  • In the bad run, the final logging-command block was partially observed: linuxArm32 and linuxArm64 were processed, while linuxAmd64 and Windows outputs were missing. In the repair run and local repro, the same command emitted all five output variables.

This makes a logger flush/tail-loss race a credible explanation: the final matrix output variables were queued through ILogger, the process exited successfully, and the tail of the console logger queue did not reach Azure Pipelines' logging-command parser in build 2983574.

Proposed prevention

Two defenses are needed:

  1. Prevent partial image-info from producing partial shared manifests. This is covered by the upstream direction in dotnet/docker-tools#2107, e.g. port unchanged platforms forward before createManifestList, or otherwise ensure shared manifests are created from a complete platform set.

  2. Make GenerateBuildMatrix output-variable emission more robust. Pipeline-control logging commands should probably be written directly to stdout and flushed, instead of relying on ordinary ILogger.LogInformation(...) output at process shutdown.
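
The second defense can be illustrated with a short sketch. ImageBuilder is a .NET tool, so this Python version only shows the idea and is not a proposed patch: the function names are hypothetical, and the only real format used is the Azure Pipelines `##vso[task.setvariable variable=...;isOutput=true]` logging command. The point is that each command is written synchronously to the output stream and the stream is flushed before the function returns, rather than being handed to a background logging queue.

```python
import io
import sys


def format_output_variable(name: str, value: str) -> str:
    """Format an Azure Pipelines logging command that sets an output variable."""
    return f"##vso[task.setvariable variable={name};isOutput=true]{value}"


def emit_output_variables(variables: dict[str, str], stream=None) -> None:
    """Write every logging command directly to the stream, then flush once."""
    out = stream if stream is not None else sys.stdout
    for name, value in variables.items():
        out.write(format_output_variable(name, value) + "\n")
    # Flush before returning so nothing is left buffered at process exit.
    out.flush()


if __name__ == "__main__":
    # Usage example with an in-memory stream and a placeholder matrix payload.
    buffer = io.StringIO()
    emit_output_variables({"linuxAmd64": "{}"}, buffer)
    print(buffer.getvalue(), end="")
```

Keeping the diagnostic matrix dump on the regular logger and only moving the control messages to a direct, flushed write would limit the change to the lines the pipeline actually depends on.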
