
w/ GitLab Registry: "invalid status code from registry 404" #16065

Closed
2xB opened this issue Oct 6, 2022 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

2xB commented Oct 6, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

When pushing to a GitLab Container Registry, I sometimes randomly get Error: trying to reuse blob sha256:... at destination: Requesting bearer token: invalid status code from registry 404 (Not Found). Oddly, when I upload multiple images built from the same base image, I get this error once for every image I try to push; the second push of each image then works again.

Steps to reproduce the issue:

  1. Have a GitLab project with a GitLab Runner that can build and push Podman images (e.g. via the shell executor or a custom executor); a rough sketch of such a job script follows this list

  2. Run builds and pushes with it very often

  3. Sometimes (rarely) get this error during podman push ...
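
A minimal sketch of such a job script (not the exact one used here), assuming the standard GitLab CI registry variables (CI_REGISTRY, CI_REGISTRY_USER, CI_REGISTRY_PASSWORD, CI_REGISTRY_IMAGE, CI_COMMIT_SHORT_SHA); the tag scheme is just illustrative:

# Log in to the project's container registry with the CI job credentials
echo "$CI_REGISTRY_PASSWORD" | podman login --username "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"

# Build and push; the error below appears intermittently on the push
podman build --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
podman push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"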

Describe the results you received:

Error: trying to reuse blob sha256:... at destination: Requesting bearer token: invalid status code from registry 404 (Not Found)

Describe the results you expected:

Successfully pushing the image to the GitLab Container Registry

Additional information you deem important (e.g. issue happens only occasionally):

I have a full log of podman --log-level=debug push ... from a run where it fails. I probably can't post the full log, but if there is something specific to check in that log, please tell me!
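
If useful, I can probably pull the token-related lines out of that log with something like this (the log file name here is just a placeholder):

grep --ignore-case --extended-regexp 'bearer|jwt/auth|401|404' podman-push-debug.log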

Output of podman version:

Client:       Podman Engine
Version:      4.2.1
API Version:  4.2.1
Go Version:   go1.18.5
Built:        Wed Sep  7 19:58:19 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.4-2.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 94.36
    systemPercent: 0.85
    userPercent: 4.78
  cpus: 6
  distribution:
    distribution: fedora
    variant: container
    version: "36"
  eventLogger: file
  hostname: 04e8b959c867
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.4.0-126-generic
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 2666459136
  memTotal: 4914487296
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.6-2.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 4243320832
  swapTotal: 4294963200
  uptime: 26h 8m 44.00s (Approximately 1.08 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.imagestore: /var/lib/shared
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.9-1.fc36.x86_64
      Version: |-
        fusermount3 version: 3.10.5
        fuse-overlayfs: version 1.9
        FUSE library version 3.10.5
        using FUSE kernel interface version 7.31
    overlay.mountopt: nodev,fsync=0
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 40501673984
  graphRootUsed: 16835969024
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /tmp/custom-executor1517200462
  imageStore:
    number: 57
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.2.1
  Built: 1662580699
  BuiltTime: Wed Sep  7 19:58:19 2022
  GitCommit: ""
  GoVersion: go1.18.5
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.1

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

The podman/stable image from quay.io, run inside another Podman instance inside a VirtualBox VM

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

The podman/stable image from quay.io, run inside another Podman instance inside a VirtualBox VM
openshift-ci bot added the kind/bug label on Oct 6, 2022

godmar commented Nov 12, 2022

Seeing this, too:

Getting image source signatures
Copying blob 5e20f8c18569 done  
Copying blob 737bba0c9709 done  
Copying blob b6f7abd6e37c done  
Copying blob c4274b5038ad done  
Copying blob b8163b8c9643 done  
Copying blob ad4ec4251bd6 done  
Copying blob 955ae623e991 done  
Copying blob b683911c7a17 done  
Copying blob d3b613e4fb58 done  
Copying blob db1368a8cb30 done  
Copying blob cbacafea21ba done  
Copying blob 18ad796038f4 done  
Copying blob 9a0cfd299f99 done  
Error: trying to reuse blob sha256:e1a3dbe31d144c87e8080650bb6b838476557d39ee44f2a991eeb7b9112087c1 at destination: Requesting bearer token: invalid status code from registry 404 (Not Found)
$ podman --version
podman version 4.2.0

A subsequent attempt, with an identical command, gives:

Getting image source signatures
Copying blob e1a3dbe31d14 done  
Copying blob be7d0bc88e59 skipped: already exists  
Copying blob e98bc7322851 skipped: already exists  
Copying blob a69b449af3f3 skipped: already exists  
Copying blob 28a3e39501a6 skipped: already exists  
Copying blob 2bd8259d2f9a skipped: already exists  
Copying blob a1e911549013 skipped: already exists  
Copying blob fb5af25374d6 skipped: already exists  
Copying blob cd0f7b3e11ea skipped: already exists  
Copying blob 1c7035554292 skipped: already exists  
Copying blob 0f7a2a4290d2 skipped: already exists  
Copying blob e7e772d1799e skipped: already exists  
Copying blob b654b5bdfa65 skipped: already exists  
Copying blob bc05c0b6d844 skipped: already exists  
Copying config 34405a8c2f done  
Writing manifest to image destination
Storing signatures
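
Since an immediate retry with the identical command succeeds, a blunt CI-side workaround (not a fix) would be to retry the push a couple of times; a minimal sketch, with IMAGE as a placeholder for the full image reference:

# Retry the push a few times, since the second attempt reliably succeeds here
ok=0
for attempt in 1 2 3; do
    if podman push "$IMAGE"; then
        ok=1
        break
    fi
    echo "push attempt $attempt failed, retrying..." >&2
    sleep 5
done
[ "$ok" -eq 1 ] || exit 1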

mheon (Member) commented Nov 16, 2022

This looks like it could be an intermittent failure in the registry itself?

mhio commented Dec 5, 2022

I've seen the same with a buildah push after moving a build from kaniko.

1.2.3.4 - gitlab-ci-token [05/Dec/2022:06:41:34 +0000] "GET /jwt/auth?account=gitlab-ci-token&scope=repository%3Agroup%2Fproject%2Fnamespace%3Apull%2Cpush&service=container_registry HTTP/1.1" 200 964 "" "Buildah/1.27.2" 1.25
1.2.3.4 - gitlab-ci-token [05/Dec/2022:06:41:34 +0000] "GET /jwt/auth?account=gitlab-ci-token&scope=repository%3Agroup%2Fproject%2Fnamespace%3Apull%2Cpush&service=container_registry HTTP/1.1" 404 9830 "" "Buildah/1.27.2" 5.78
1.2.3.4 - gitlab-ci-token [05/Dec/2022:06:41:34 +0000] "GET /jwt/auth?account=gitlab-ci-token&scope=repository%3Agroup%2Fproject%2Fnamespace%3Apull%2Cpush&service=container_registry HTTP/1.1" 200 965 "" "Buildah/1.27.2" 1.25
1.2.3.4 - gitlab-ci-token [05/Dec/2022:06:41:34 +0000] "GET /jwt/auth?account=gitlab-ci-token&scope=repository%3Agroup%2Fproject%2Fnamespace%3Apull%2Cpush&service=container_registry HTTP/1.1" 200 966 "" "Buildah/1.27.2" 1.25
1.2.3.4 - gitlab-ci-token [05/Dec/2022:06:41:34 +0000] "GET /jwt/auth?account=gitlab-ci-token&scope=repository%3Agroup%2Fproject%2Fnamespace%3Apull%2Cpush&service=container_registry HTTP/1.1" 200 966 "" "Buildah/1.27.2" 1.25
1.2.3.4 - gitlab-ci-token [05/Dec/2022:06:41:34 +0000] "GET /jwt/auth?account=gitlab-ci-token&scope=repository%3Agroup%2Fproject%2Fnamespace%3Apull%2Cpush&service=container_registry HTTP/1.1" 200 967 "" "Buildah/1.27.2" 1.25

In the GitLab application log, this showed up as:

"exception.class":"ActiveRecord::RecordNotFound",
"exception.message":"Couldn't find ContainerRepository",
"exception.backtrace":[
  "app/models/container_repository.rb:600:in `find_by_path!'",
  "app/models/container_repository.rb:592:in `find_or_create_from_path'",
  "app/services/auth/container_registry_authentication_service.rb:162:in `ensure_container_repository!'",
...

So yes, it looks like the core problem is on the GitLab side.

That issue could be exacerbated by the six or so duplicate GET /jwt/auth requests that buildah/podman sends all at once? Just trying to reason about why this popped up with the move to buildah.
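
To separate client behaviour from registry behaviour, the token request from the access log above can be replayed by hand; a rough sketch, where the GitLab host and repository path are placeholders and CI_JOB_TOKEN is the standard CI job token:

# Request a bearer token for pull/push on the repository, mirroring the
# GET /jwt/auth requests in the access log; host and repository path are placeholders
curl --silent --show-error --user "gitlab-ci-token:$CI_JOB_TOKEN" \
  --write-out '\nHTTP %{http_code}\n' \
  "https://gitlab.example.com/jwt/auth?service=container_registry&scope=repository:group/project:pull,push"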

MLNW commented Dec 6, 2022

I'm experiencing the same issue with Podman 4.2.0 and GitLab 15.6.1-ee.

@mhio did you raise or find an issue on the GitLab side about this?

mhio commented Dec 13, 2022

@MLNW I haven't followed up on the GitLab side.

We cleaned up a number of unrelated issues that were causing auth retries on this GitLab instance, and these 404s also stopped occurring.

containers locked and limited conversation to collaborators on Dec 14, 2022
rhatdan converted this issue into discussion #16842 on Dec 14, 2022
