Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

False-positive cache-hit on COPY --from #870

Open
haizaar opened this issue Nov 20, 2019 · 7 comments
Open

False-positive cache-hit on COPY --from #870

haizaar opened this issue Nov 20, 2019 · 7 comments

Comments

@haizaar
Copy link

@haizaar haizaar commented Nov 20, 2019

Actual behavior
I have a two stage docker image that first build sphinx docs and then copies it to the second nginx image:

FROM python:3.7.5-alpine3.10 AS builder
RUN apk add --no-cache build-base libxml2-dev libxslt-dev graphviz
COPY requirements.txt /
RUN pip install --upgrade -r /requirements.txt
WORKDIR /build
COPY . ./
RUN make html
########################
FROM nginx:1.17.5-alpine AS runner
COPY --from=builder /build/_build/html /usr/share/nginx/html

I build it on Google Cloud Build using gcr.io/kaniko-project/executor:latest builder.

The problem is the last copy instruction - even through generated HTML files differ from the previously built revision, kaniko still prefers to use the cached version:

Step #0: build succeeded.
Step #0:
Step #0: The HTML pages are in _build/html.
Step #0:
Step #0: Build finished. The HTML pages are in _build/html.
Step #0: INFO[0062] Taking snapshot of full filesystem...
Step #0: INFO[0064] Pushing layer gcr.io/xxx/docs-server/cache:cd602b7af9c972414b3241eb3a192a606755174cf665f9806155078915975932 to cache now
Step #0: INFO[0068] Saving file /build/_build/html for later use.
Step #0: INFO[0068] Deleting filesystem...
Step #0: INFO[0068] Downloading base image nginx:1.17.5-alpine
Step #0: INFO[0069] Error while retrieving image from cache: getting file info: stat /cache/sha256:b4c0378c841cd76f0b75bc63454bfc6fe194a5220d4eab0d75963bccdbc327ff: no such file or directory
Step #0: INFO[0069] Downloading base image nginx:1.17.5-alpine
Step #0: INFO[0069] Checking for cached layer gcr.io/xxx/docs-server/cache:5de22bf6a7763f10cd0ffa0592b8f51c876286a1d0f50564cf2f21bfd6262dc4...
Step #0: INFO[0069] Using caching version of cmd: COPY --from=builder /build/_build/html /usr/share/nginx/html
Step #0: INFO[0069] Skipping unpacking as no commands require it.
Step #0: INFO[0069] Taking snapshot of full filesystem...
Step #0: INFO[0070] COPY --from=builder /build/_build/html /usr/share/nginx/html
Step #0: INFO[0070] Found cached layer, extracting to filesystem

Expected behavior
Detect that stage 0 image files' have changes and discard cache for COPY --from directive in the stage 1.

To Reproduce
Steps to reproduce the behavior:

  1. Run:
mkdir /tmp/bug; cd /tmp/bug
python3.7 -mvenv /tmp/bug-env && source /tmp/bug-env/bin/activate
mkdir docs
echo 'Sphinx==2.2.1' >> docs/requirements.txt
pip install -r docs/requirements.txt
cd docs
echo no | sphinx-quickstart -p foo -a foo -v 0.0.1 -r 0.0.1 -l en

Then add the above Dockerfile under docs/.

  1. Now add it all to git, commit and trigger Cloud Build using the following step:
steps:
  - name: gcr.io/kaniko-project/executor:latest
    args:
      - --cache=true
      - --cache-ttl=336h  # 2 weeks
      - --context=/workspace/docs
      - --dockerfile=/workspace/docs/Dockerfile
      - --destination=gcr.io/$PROJECT_ID/docs-server:$BRANCH_NAME-$COMMIT_SHA
  1. Change docs/index.rst, commit, trigger new build and see that resulting images still has the old version of the docs

Additional Information

  • Dockerfile - see above
  • Build Context - see above
  • Kaniko Image sha256:9c40a04cf1bc9d886f7f000e0b7fa5300c31c89e2ad001e97eeeecdce9f07a29

Triage Notes for the Maintainers

Description Yes/No
Please check if this a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
@stroncium

This comment has been minimized.

Copy link

@stroncium stroncium commented Nov 20, 2019

I experience this problem too.

Current workaround for me is using gcr.io/kaniko-project/executor:v0.13.0.

@cvgw cvgw self-assigned this Nov 22, 2019
@cvgw cvgw mentioned this issue Nov 23, 2019
1 of 4 tasks complete
@mcfedr

This comment has been minimized.

Copy link
Contributor

@mcfedr mcfedr commented Nov 25, 2019

Just found this having the exact same issue using kaniko on cloud build. will try suggested work aorund/roll back.

@cvgw

This comment has been minimized.

Copy link
Contributor

@cvgw cvgw commented Nov 26, 2019

this appears to the same underlying problem as #845 and #589

@mcfedr

This comment has been minimized.

Copy link
Contributor

@mcfedr mcfedr commented Dec 2, 2019

@cvgw Any movement on getting that fix merged?

@cvgw

This comment has been minimized.

Copy link
Contributor

@cvgw cvgw commented Dec 2, 2019

@mcfedr hoping to get merged this week

@kzap

This comment has been minimized.

Copy link

@kzap kzap commented Dec 10, 2019

I can also confirm this does not exist in 0.13.0 but not sure why
hope the fixes can get merged soon and released

This is how I recreated the issue locally using a test Dockerfile and AWS ECR as a Cache repo

Testing

We can replicate this bug on our machine using Docker

  • Create our variables and test Dockerfile
# set some variables
docker_image="gcr.io/kaniko-project/executor:debug-v0.14.0"
ecr_endpoint="YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com"
ecr_repo="YOUR_ECR_REPO_NAME"

# create our test Dockerfile
mkdir -p ./tmp/
cat <<'EOF' > ./tmp/Dockerfile
FROM busybox:latest as builder
COPY ./date.txt /tmp/date.txt

FROM busybox:latest
COPY --from=builder /tmp/date.txt /tmp/test.txt
EOF
  • Run Kaniko locally using docker
# run kaniko debug image with some env vars
docker run \
  --tty \
  --interactive \
  --rm \
  --volume "$(pwd)/tmp:/app" \
  --volume "$HOME/.aws":/root/.aws \
  --env "AWS_DEFAULT_REGION=us-east-1" \
  --env "ECR_ENDPOINT=${ecr_endpoint}" \
  --env "ECR_REPO=${ecr_repo}" \
  --entrypoint "" \
  "${docker_image}" \
  sh
  • Create ECR Config inside the Kaniko container
# create ECR config file for kaniko
mkdir -p /kaniko/.docker/
cat <<EOF > /kaniko/.docker/config.json
{
  "credHelpers": {
    "${ECR_ENDPOINT}": "ecr-login"
  }
}
EOF
  • Create the 1st image w/ cache from inside the Kaniko container
# Create first image
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
  --context "/app" \
  --dockerfile "/app/Dockerfile" \
  --destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
  --cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
  --cache=true \
  --cleanup

echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image1.txt"
  • Create 2nd Image w/ cache
# Create 2nd Image
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
  --context "/app" \
  --dockerfile "/app/Dockerfile" \
  --destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
  --cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
  --cache=true \
  --cleanup

echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image2.txt"
  • Create 3rd Image w/o using Cache so we can compare the output
# Create 3rd Image w/o Cache
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
  --context "/app" \
  --dockerfile "/app/Dockerfile" \
  --destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
  --cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
  --cache=false \
  --cleanup

echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image3.txt"
  • Run the 3 docker commands which output 3 text files and cat them
$ cat image{1..3}.txt
cat image1.
Tue Dec 10 14:30:10 UTC 2019
Tue Dec 10 14:30:10 UTC 2019
Tue Dec 10 15:09:24 UTC 2019
  • The files should differ slightly but they are the same, meaning it did not give us the latest file, it used the cache. The third file image3.txt gave us an updated date.txt file because it did not use the cache.

How to fix:

Lets try again to build but with docker image gcr.io/kaniko-project/executor:debug-v0.13.0

  • Run kaniko 0.13.0:
# Run Kaniko Image 0.13.0
docker_image="gcr.io/kaniko-project/executor:debug-v0.13.0"

# run kaniko debug image with some env vars
docker run \
  --tty \
  --interactive \
  --rm \
  --volume "$(pwd)/tmp:/app" \
  --volume "$HOME/.aws":/root/.aws \
  --env "AWS_DEFAULT_REGION=us-east-1" \
  --env "ECR_ENDPOINT=${ecr_endpoint}" \
  --env "ECR_REPO=${ecr_repo}" \
  --entrypoint "" \
  "${docker_image}" \
  sh
  • Create ECR config for Kaniko
# create ECR config file for kaniko
mkdir -p /kaniko/.docker/
cat <<EOF > /kaniko/.docker/config.json
{
  "credHelpers": {
    "${ECR_ENDPOINT}": "ecr-login"
  }
}
EOF
  • Create 4th Docker Image w/ Cache and 0.13.0
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
  --context "/app" \
  --dockerfile "/app/Dockerfile" \
  --destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
  --cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
  --cache=true \
  --cleanup

echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image4.txt"
  • Create 5th Docker Image w/ Cache and 0.13.0
echo $(date) > /app/date.txt
ecr_tag="test-$(date +%s)"
/kaniko/executor \
  --context "/app" \
  --dockerfile "/app/Dockerfile" \
  --destination "${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag}" \
  --cache-repo "${ECR_ENDPOINT}/${ECR_REPO}/cache" \
  --cache=true \
  --cleanup

echo "docker run -it --rm ${ECR_ENDPOINT}/${ECR_REPO}:${ecr_tag} cat /tmp/test.txt > image5.txt"
  • Run docker commands that were output and compare outputs
$ cat image{4..5}.txt
Tue Dec 10 15:32:45 UTC 2019
cat image5.	Tue Dec 10 15:35:38 UTC 2019
  • The timestamps are different so they are not being cached and we are getting the latest version of date.txt. This is what we want
@cvgw

This comment has been minimized.

Copy link
Contributor

@cvgw cvgw commented Dec 10, 2019

Once #899 is fixed this issue should be resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.