SEGV when pushing layer from Google Cloud Build -> Artifact Registry #1604

Open
devjgm opened this issue Mar 20, 2021 · 5 comments
Labels: area/gcb, area/registry (for all bugs having to do with pushing/pulling into registries), platform/cloud-build, priority/p1 (basic need: feature compatibility with docker build; we should be working on this next), product/artifact-registry, provider/gcp

Comments


devjgm commented Mar 20, 2021

Actual behavior

...
Step #0: INFO[0148] Pushing layer us-central1-docker.pkg.dev/jgm-cloud-cxx/google-cloud-cpp-cloudbuild-docker/fedora-image/cache:c9a94ae6a3449e0c2d330d44632921f139be607646f008d7542890696f91e26f to cache now
Step #0: fatal error: unexpected signal during runtime execution
Step #0: [signal SIGSEGV: segmentation violation code=0x1 addr=0xe5 pc=0x7fd759f3dcb4]
Step #0:
Step #0: runtime stack:
Step #0: runtime.throw(0x7e0c4a, 0x2a)
Step #0:        /usr/local/go/src/runtime/panic.go:1116 +0x72
Step #0: runtime.sigpanic()
Step #0:        /usr/local/go/src/runtime/signal_unix.go:726 +0x4ac
Step #0:
Step #0: goroutine 1 [syscall]:
Step #0: runtime.cgocall(0x6a88c0, 0xc000085a30, 0xc000010038)
Step #0:        /usr/local/go/src/runtime/cgocall.go:133 +0x5b fp=0xc000085a00 sp=0xc0000859c8 pc=0x40563b
Step #0: os/user._Cfunc_mygetpwuid_r(0x0, 0xc0000b2ed0, 0xf2f070, 0x400, 0xc000010038, 0x7fd700000000)
Step #0:        _cgo_gotypes.go:175 +0x4d fp=0xc000085a30 sp=0xc000085a00 pc=0x68af8d
Step #0: os/user.lookupUnixUid.func1.1(0x0, 0xc0000b2ed0, 0xc000088dd0, 0xc000010038, 0xc000085ad0)
Step #0:        /usr/local/go/src/os/user/cgo_lookup_unix.go:103 +0xd0 fp=0xc000085a80 sp=0xc000085a30 pc=0x68bd30
Step #0: os/user.lookupUnixUid.func1(0x7987e0)
Step #0:        /usr/local/go/src/os/user/cgo_lookup_unix.go:103 +0x45 fp=0xc000085ab8 sp=0xc000085a80 pc=0x68bda5
Step #0: os/user.retryWithBuffer(0xc000088dd0, 0xc000085b90, 0x7fd75c9a1c00, 0x20300000000000)
Step #0:        /usr/local/go/src/os/user/cgo_lookup_unix.go:247 +0x3e fp=0xc000085b10 sp=0xc000085ab8 pc=0x68babe
Step #0: os/user.lookupUnixUid(0x0, 0x0, 0x0, 0x0)
Step #0:        /usr/local/go/src/os/user/cgo_lookup_unix.go:96 +0x132 fp=0xc000085bd8 sp=0xc000085b10 pc=0x68b3d2
Step #0: os/user.current(0xc000085c58, 0x4d40bc, 0xc0000b4ee0)
Step #0:        /usr/local/go/src/os/user/cgo_lookup_unix.go:49 +0x49 fp=0xc000085c18 sp=0xc000085bd8 pc=0x68b269
Step #0: os/user.Current.func1()
Step #0:        /usr/local/go/src/os/user/lookup.go:15 +0x25 fp=0xc000085c40 sp=0xc000085c18 pc=0x68bbe5
Step #0: sync.(*Once).doSlow(0xa09c40, 0x7ecf20)
Step #0:        /usr/local/go/src/sync/once.go:66 +0xec fp=0xc000085c90 sp=0xc000085c40 pc=0x474c8c
Step #0: sync.(*Once).Do(...)
Step #0:        /usr/local/go/src/sync/once.go:57
Step #0: os/user.Current(0x47705b, 0xa41360, 0xc0000b25d0)
Step #0:        /usr/local/go/src/os/user/lookup.go:15 +0x105 fp=0xc000085cc0 sp=0xc000085c90 pc=0x68ade5
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/util.unixHomeDir(0x4d4700, 0xc0000b4ee0)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/util/util.go:42 +0x25 fp=0xc000085cf0 sp=0xc000085cc0 pc=0x68e585
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/util.SdkConfigPath(0x0, 0x0, 0x0, 0x0)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/util/util.go:34 +0x26 fp=0xc000085d58 sp=0xc000085cf0 pc=0x68e466
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/store.dockerCredentialPath(0x7fd7834cb108, 0xc000088dc0, 0x7fd7834d4328, 0x18)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/store/store.go:215 +0x6d fp=0xc000085dc8 sp=0xc000085d58 pc=0x69186d
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/store.DefaultGCRCredStore(...)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/store/store.go:84
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/cli.(*helperCmd).Execute(0xc00000e360, 0x8392e0, 0xc000016018, 0xc00009e3c0, 0x0, 0x0, 0x0, 0x0)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/cli/dockerHelper.go:35 +0x35 fp=0xc000085e78 sp=0xc000085dc8 pc=0x6a6915
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/vendor/github.com/google/subcommands.(*Commander).Execute(0xc000012100, 0x8392e0, 0xc000016018, 0x0, 0x0, 0x0, 0x39)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/vendor/github.com/google/subcommands/subcommands.go:209 +0x30d fp=0xc000085f20 sp=0xc000085e78 pc=0x69252d
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/vendor/github.com/google/subcommands.Execute(...)
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/vendor/github.com/google/subcommands/subcommands.go:492
Step #0: main.main()
Step #0:        /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr/main.go:54 +0x63f fp=0xc000085f88 sp=0xc000085f20 pc=0x6a85bf
Step #0: runtime.main()
Step #0:        /usr/local/go/src/runtime/proc.go:204 +0x209 fp=0xc000085fe0 sp=0xc000085f88 pc=0x439c89
Step #0: runtime.goexit()
Step #0:        /usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x46b841
Step #0: Collecting git+git://github.com/googleapis/python-storage@8cf6c62a96ba3fff7e5028d931231e28e5029f1c
Step #0:   Cloning git://github.com/googleapis/python-storage (to revision 8cf6c62a96ba3fff7e5028d931231e28e5029f1c) to /tmp/pip-req-build-2spex1i1
Step #0:   Running command git clone -q git://github.com/googleapis/python-storage /tmp/pip-req-build-2spex1i1
CANCELLED
ERROR: context canceled

The above error happens after Kaniko builds a Docker layer and tries to push it from Cloud Build to Artifact Registry.
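A trace this long is easier to read once it is reduced to the frames that leave the Go runtime and standard library. The following is a minimal sketch, not part of the original report: it takes five lines copied from the trace above, strips the Cloud Build `Step #N: ` prefix and the trailing argument list, and keeps only cgo calls (`_Cfunc_`) and third-party packages. The variable names `trace` and `frames` are illustrative.

```shell
#!/bin/sh
# Sample input: a few lines copied verbatim from the trace in this report.
trace='Step #0: runtime.sigpanic()
Step #0: os/user._Cfunc_mygetpwuid_r(0x0, 0xc0000b2ed0, 0xf2f070, 0x400, 0xc000010038, 0x7fd700000000)
Step #0: os/user.Current(0x47705b, 0xa41360, 0xc0000b25d0)
Step #0: github.com/GoogleCloudPlatform/docker-credential-gcr/util.unixHomeDir(0x4d4700, 0xc0000b4ee0)
Step #0: main.main()'

# 1. Drop the "Step #N: " prefix that Cloud Build adds to every log line.
# 2. Drop the final "(...)" argument list so only the function name remains.
# 3. Keep cgo calls and frames from third-party (github.com) packages.
frames=$(printf '%s\n' "$trace" \
  | sed -e 's/^Step #[0-9]*: //' -e 's/([^()]*)$//' \
  | grep -E '_Cfunc_|^github\.com/')

printf '%s\n' "$frames"
```

On the sample this leaves two frames: the cgo call `os/user._Cfunc_mygetpwuid_r` and the caller in `docker-credential-gcr/util`, which suggests the crash is inside the credential helper binary rather than in Kaniko's own code.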

cloudbuild.yaml

options:
  machineType: 'N1_HIGHCPU_32'
  diskSizeGb: '512'

substitutions:
  _DISTRO: "unknown"
  _BUILD_NAME: "unknown"

timeout: 3600s

steps:
- name: 'gcr.io/kaniko-project/executor:latest'
  args: [
    '--context=dir:///workspace/ci',
    '--dockerfile=ci/cloudbuild/Dockerfile.${_DISTRO}',
    '--cache=true',
    '--destination=us-central1-docker.pkg.dev/$PROJECT_ID/google-cloud-cpp-cloudbuild-docker/${_DISTRO}-image:tag1',
  ]

- name: 'us-central1-docker.pkg.dev/$PROJECT_ID/google-cloud-cpp-cloudbuild-docker/${_DISTRO}-image:tag1'
  entrypoint: 'ci/cloudbuild/build.sh'
  args: [ '${_BUILD_NAME}' ]

The Dockerfile I'm building is here: https://github.com/googleapis/google-cloud-cpp/compare/master...devjgm:cloud-build?expand=1#diff-c1691ed788ae6246565bad5ac37a26da8a3ee735f4c2e8f07b5b205ad47b4f26

The artifact registry repo exists:

$ gcloud artifacts repositories list
Listing items under project jgm-cloud-cxx, across all locations.

                                                               ARTIFACT_REGISTRY
REPOSITORY                          FORMAT  DESCRIPTION        LOCATION     LABELS  ENCRYPTION          CREATE_TIME          UPDATE_TIME
google-cloud-cpp-cloudbuild-docker  DOCKER  Docker repository  us-central1          Google-managed key  2021-03-20T10:28:01  2021-03-20T10:28:01

Expected behavior

I believe I correctly followed the instructions at https://cloud.google.com/build/docs/kaniko-cache, and I expected Kaniko to upload the layers and the final image to Artifact Registry. Instead, it crashes.

To Reproduce

If this is not a known bug that I'm hitting, I can try to distill the repro steps to something smaller.

Additional Information

Triage Notes for the Maintainers

Description                                                      Yes/No
Please check if this is a new feature you are proposing          [ ]
Please check if the build works in docker but not in kaniko      [x]
Please check if this error is seen when you use --cache flag     [x]
Please check if your dockerfile is a multistage dockerfile       [ ]


devjgm commented Mar 20, 2021

Note: I switched to gcr.io/kaniko-project/executor:edge instead of :latest, AND I changed my images to use gcr.io instead of us-central1-docker.pkg.dev. With both changes, it works now.


devjgm commented May 25, 2021

This problem is still happening. I get the above stack trace when I run with gcr.io/kaniko-project/executor:v1.6.0.

BUT, it works when I use gcr.io/kaniko-project/executor:v1.6.0-debug.


devjgm commented May 25, 2021

Naive guess at the problem

My guess is that the problem is with the https://github.com/GoogleCloudPlatform/docker-credential-gcr/ helper. The stack trace above points there: the crashing call chain starts in its util.unixHomeDir.

The v1.6.0 and v1.6.0-debug images also get the docker-credential-gcr helper differently: one builds it from source, the other unpacks the v2.0.1 release tarball.

BROKEN

kaniko/deploy/Dockerfile

Lines 29 to 36 in 7b64954

# Get GCR credential helper
RUN GOARCH=$(cat /goarch) && CGO_ENABLED=0 && \
(mkdir -p /go/src/github.com/GoogleCloudPlatform || true) && \
cd /go/src/github.com/GoogleCloudPlatform && \
git clone https://github.com/GoogleCloudPlatform/docker-credential-gcr.git && \
cd /go/src/github.com/GoogleCloudPlatform/docker-credential-gcr && \
make deps OUT_DIR=/usr/local/bin && \
go build -ldflags "-linkmode external -extldflags -static" -i -o /usr/local/bin/docker-credential-gcr main.go

WORKS (the -debug version)

# Get GCR credential helper
ADD https://github.com/GoogleCloudPlatform/docker-credential-gcr/releases/download/v2.0.1/docker-credential-gcr_linux_$GOARCH-2.0.1.tar.gz /usr/local/bin/
RUN tar --no-same-owner -C /usr/local/bin/ -xvzf /usr/local/bin/docker-credential-gcr_linux_$GOARCH-2.0.1.tar.gz
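One detail of the BROKEN snippet may be relevant. The trace passes through `os/user._Cfunc_mygetpwuid_r`, which is a cgo call, so cgo appears to have been enabled when that helper was built, even though the RUN line contains `CGO_ENABLED=0`. A possible explanation (an assumption, not confirmed anywhere in this thread) is that `CGO_ENABLED=0 && ...` only sets a shell variable: unless the variable is already exported, for example by an earlier `ENV` line, child processes such as `go build` never see it. The sketch below demonstrates that shell behaviour with a stand-in variable named `DEMO_FLAG`, which is hypothetical and used only for illustration.

```shell
#!/bin/sh
# DEMO_FLAG stands in for CGO_ENABLED; start from a clean state.
unset DEMO_FLAG

# Plain assignment chained with &&: a shell variable only,
# so the child process does not inherit it.
DEMO_FLAG=0 && plain=$(sh -c 'echo "${DEMO_FLAG:-unset}"')

# Exported assignment: the child process sees the value.
unset DEMO_FLAG
export DEMO_FLAG=0 && exported=$(sh -c 'echo "${DEMO_FLAG:-unset}"')

# Per-command prefix (no &&): visible to that one child process.
unset DEMO_FLAG
prefixed=$(DEMO_FLAG=0 sh -c 'echo "${DEMO_FLAG:-unset}"')

echo "plain=$plain exported=$exported prefixed=$prefixed"
# prints: plain=unset exported=0 prefixed=0
```

If this is what happens in the Dockerfile, the helper would be built with cgo enabled and then linked statically, which matches the cgo frames in the trace.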

@jonjohnsonjr
Contributor

tl;dr if kaniko wants to build this static binary, they need to pass more flags: golang/go#24787 (comment)
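As a rough sketch of what such a build could look like: Go has the `osusergo` and `netgo` build tags, which select the pure-Go implementations of `os/user` and `net` instead of the cgo ones. Whether these are exactly the flags the linked comment recommends is an assumption here, and the `static_build` helper, the `RUN` switch, and the `OUT` default are hypothetical names chosen for illustration. By default the script only prints the command, so it can be inspected without a Go toolchain.

```shell
#!/bin/sh
# Hypothetical helper: print (or, with RUN=1, execute) a cgo-free build
# of the credential helper. Run it from the docker-credential-gcr checkout.
OUT="${OUT:-/usr/local/bin/docker-credential-gcr}"

static_build() {
    # env(1) places CGO_ENABLED=0 in the environment of `go` itself.
    # osusergo / netgo force the pure-Go os/user and net code paths.
    set -- env CGO_ENABLED=0 go build -tags osusergo,netgo -o "$OUT" main.go
    if [ "${RUN:-0}" = "1" ]; then
        "$@"            # really build
    else
        echo "$*"       # dry run: show the command only
    fi
}

build_cmd=$(static_build)
echo "$build_cmd"
```

With `CGO_ENABLED=0` actually in effect, Go already uses the pure-Go `os/user` and `net` code, so the tags are redundant in that case. They are kept in the sketch because they also apply if cgo is left enabled for other reasons.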

devjgm added a commit to googleapis/google-cloud-cpp that referenced this issue May 26, 2021
This change likely fixes #6327. 

As of today, Kaniko `v1.6.0` is the newest released version. Using that version directly results in the crash described in GoogleContainerTools/kaniko#1604. However, using the `v1.6.0-debug` image works, I _think_ because of a different version of the `docker-credential-gcr` helper, as I described in GoogleContainerTools/kaniko#1604 (comment).

Updating Kaniko may also help #6336
@brightsider

I have the same issue, and v1.6.0-debug works for me.

aaron-prindle added the product/artifact-registry, provider/gcp, platform/cloud-build, area/gcb, area/registry, priority/p2 (high impact feature/bug; will get a lot of users happy), and priority/p1 labels, and removed the priority/p2 label, on May 30, 2023.