
Harden restore-cache image verification and cache cleanup #12

Merged: jiaheng merged 3 commits into main from copilot/fix-cache-workflow-issues on May 30, 2026

Conversation

Copilot AI commented May 30, 2026


The restore-cache job could fail after a successful cache restore because the empty image check was matching unreliably, and cache cleanup could 404 on the delete request. This updates the workflow to verify the cached image by reference and to treat cache deletion misses as non-fatal.

  • Restore verification

    • Replace the label-based image lookup with a reference-based lookup for the empty image
    • Emit a clear failure message when the expected image is absent
  • Cache deletion

    • Build the cache key once and reuse it for the delete request
    • Log the computed key and DELETE URL for easier workflow diagnosis
    • Remove the ref query parameter from the cache deletion call
    • Allow cache deletion to fail without failing the job
  • Key consistency

    • Normalize the restore job’s cache key expression formatting to match the save/delete path exactly
id="$(docker images --quiet --filter 'reference=empty')"
[[ -n "$id" ]] || { echo 'Image "empty" not found'; exit 1; }

key='docker-cache-test-${{ matrix.os }}-${{ github.run_id }}-${{ github.run_attempt }}'
url="$GITHUB_API_URL/repos/$GITHUB_REPOSITORY/actions/caches?key=$key"
curl \
  --request DELETE \
  --header 'Accept: application/vnd.github.v3+json' \
  --header 'Authorization: Bearer ${{ github.token }}' \
  "$url" \
|| echo "Cache not found (non-fatal)."
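
The "build the key once, log it, reuse it" steps listed above can be sketched as a small helper. This is a minimal illustration rather than the workflow's actual code: `cache_delete_url` is a hypothetical name, and the sample API URL, repository, and key stand in for the `GITHUB_API_URL`, `GITHUB_REPOSITORY`, and `${{ ... }}` values that the runner would supply.

```shell
# Hypothetical helper: build the cache DELETE URL from its parts.
# Arguments: $1 = API base URL, $2 = "owner/repo", $3 = cache key.
cache_delete_url() {
  printf '%s/repos/%s/actions/caches?key=%s\n' "$1" "$2" "$3"
}

# Sample values (assumptions for illustration only; in the workflow these come
# from the runner environment and the matrix/run expressions).
key='docker-cache-test-ubuntu-22.04-1234567890-1'
url="$(cache_delete_url 'https://api.github.com' 'jiaheng/docker-cache' "$key")"

# Log both values so a failed DELETE can be diagnosed from the job output.
echo "Cache key: $key"
echo "DELETE URL: $url"
```

Deriving the URL from a single `key` variable means the logged value and the requested value cannot drift apart, which is the failure mode the description is guarding against.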
Original prompt

The GitHub Actions workflow for the restore-cache job in the jiaheng/docker-cache repository is failing due to the following problems:

1. Docker Image Verification Fails:
   - The check to verify cached Docker images does not reliably find the expected empty image because the filter does not match correctly.

2. Cache Deletion Request Fails:
   - The curl command used to delete the GitHub Actions cache returns a 404 Not Found error, likely due to incorrect $GITHUB_REF or cache key values.

### Proposed Fixes

1. Fix Docker Image Verification:
   Update the verification command to use a filter that reliably matches the Docker image by reference:

       id="$(docker images --quiet --filter 'reference=empty')"
       [[ -n "$id" ]] || { echo 'Image "empty" not found'; exit 1; }

2. Fix Cache Deletion Request:
   - Log the constructed URL and key for debugging.
   - Allow the workflow to continue even if the cache deletion fails, treating it as a non-critical error.
   - Updated curl command:

       curl --request DELETE \
         --header 'Accept: application/vnd.github.v3+json' \
         --header "Authorization: Bearer $GITHUB_TOKEN" \
         "$GITHUB_API_URL/repos/$GITHUB_REPOSITORY/actions/caches?key=docker-cache-test-${{ matrix.os }}-${{ github.run_id }}-${{ github.run_attempt }}" \
         || echo "Cache not found (non-fatal)."

3. Implement Workflow Changes:
   Apply the proposed fixes to the restore-cache job in .github/workflows/test.yaml.

4. Validation:
   - Test the workflow on pull request branches to ensure it handles both success and failure scenarios gracefully.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.

Assistant: The job failure indicates issues with verifying and removing cached Docker images and appears to stem from errors in the restore-cache job, specifically when:

  1. Verifying Docker loaded cached empty images:
    • The script cannot find images with the description="empty image" label, leading to a non-zero exit code.
  2. Deleting the test cache:
    • A 404 Not Found error occurs when attempting a DELETE request to remove caches. This likely happens due to an incorrect $GITHUB_REF value or a misconstructed cache key.

Diagnosis and Solutions

Problem 1: Verifying Cached Docker Images

The command below fails to find an image:

id="$(docker images --quiet --filter "label=description=empty image" empty)"
[[ -n "$id" ]]

Root Cause: There might be an issue with tag usage (empty), the label definition in the Dockerfile, or the image not being created or cached as intended.

Solution:

  • Check the Dockerfile label. It's correctly defined:
    LABEL description="empty image"
  • Confirm that the docker build commands in the "Build one tagged and one dangling empty test Docker image" step create images with the right label:
    docker build --tag empty . --file "$DOCKERFILE"
    docker build --tag empty --no-cache . --file "$DOCKERFILE"
    The labels and tags look fine here. Check the docker build output in the job logs to confirm that both builds succeeded.

Update the verification check to be more robust:

id="$(docker images --quiet --filter 'reference=empty')"
[[ -n "$id" ]] || { echo 'Image "empty" not found'; exit 1; }
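
The reference-based check above can also be wrapped in a reusable function. The following is a minimal sketch, assuming the `docker` CLI is on the PATH; `verify_image` is a hypothetical name and is not part of the workflow.

```shell
# Hypothetical helper: succeed only if at least one local image matches the
# given reference. $1 = image reference, for example "empty".
verify_image() {
  # --quiet prints only image IDs; the output is empty when nothing matches.
  id="$(docker images --quiet --filter "reference=$1")"
  if [ -z "$id" ]; then
    echo "Image \"$1\" not found" >&2
    return 1
  fi
  echo "Found image \"$1\": $id"
}
```

In a job step it would be called as `verify_image empty || exit 1`. Returning a status instead of exiting inside the function leaves the decision about failing the step to the caller.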

Problem 2: Cache Deletion HTTP Request

The curl DELETE operation fails with:

curl: (22) The requested URL returned error: 404
{
  "message": "Not Found"
}
  • Key Issue: $GITHUB_REF or key doesn't point to the existing cache on GitHub.
  • Cache Key Construction: Ensure cache keys are consistent:
    key: docker-cache-test-${{ matrix.os }}-${{ github.run_id }}-${{ github.run_attempt }}

Solution:

  • Log and verify variables passed to the DELETE request:

    echo "URL: $GITHUB_API_URL/repos/$GITHUB_REPOSITORY/actions/caches?key=docker-cache-test-${{ matrix.os }}-${{ github.run_id }}-${{ github.run_attempt }}&ref=$GITHUB_REF"
  • Use the following fallback for curl:

curl --request DELETE \
  --header 'Accept: application/vnd.github.v3+json' \
  --header "Authorization: Bearer $GITHUB_TOKEN" \
  "$GITHUB_API_URL/repos/$GITHUB_REPOSITORY/actions/caches?key=docker-cache-test-${{ matrix.os }}-${{ github.run_id }}-${{ github.run_attempt }}" \
  || echo "Cache not found (non-fatal)."

This approach ensures non-critical errors during cache deletion don’t halt workflows.
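
One caveat with the `||` fallback: curl reports an HTTP error status through its exit code only when `--fail` is passed (that flag is what produces the `curl: (22)` message quoted above). Without it, a 404 response still exits 0 and the fallback message never prints. A minimal sketch of the pattern follows; `run_non_fatal` is a hypothetical wrapper, and the curl invocation in the trailing comment is illustrative and assumes `url` and `GITHUB_TOKEN` are set.

```shell
# Hypothetical wrapper: run a command and report a failure, but always return 0
# so that a shell running with `set -e` continues to the next command.
run_non_fatal() {
  "$@" || echo "Cache not found (non-fatal)."
}

# Demonstration with stand-in commands instead of a real network call.
run_non_fatal true    # succeeds, prints nothing
run_non_fatal false   # fails, prints the fallback message

# In the workflow this could wrap the DELETE request, for example:
#   run_non_fatal curl --fail --silent --show-error --request DELETE \
#     --header 'Accept: application/vnd.github.v3+json' \
#     --header "Authorization: Bearer $GITHUB_TOKEN" \
#     "$url"
```

Note that the fallback fires on any non-zero exit, not only on a 404, so the logged URL and curl's own error output remain the places to look for the real cause.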

Updated Workflow Snippet

Here’s the relevant fix for the restore-cache job:

restore-cache:
  name: Restore Cache
  needs:
    - save-cache
  strategy:
    matrix:
      os:
        - ubuntu-22.04
...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*

Copilot AI changed the title from "[WIP] Fix Docker image verification and cache deletion in restore-cache job" to "Harden restore-cache image verification and cache cleanup" on May 30, 2026
Copilot AI requested a review from jiaheng on May 30, 2026 02:10
jiaheng marked this pull request as ready for review on May 30, 2026 02:13
jiaheng merged commit 1934b15 into main on May 30, 2026
10 checks passed
jiaheng deleted the copilot/fix-cache-workflow-issues branch on May 30, 2026 02:13
2 participants