Skip to content

Replace GitHub cache library with our own implementation#234

Merged
aomarks merged 17 commits intomainfrom
replace-github-cache-library
May 15, 2022
Merged

Replace GitHub cache library with our own implementation#234
aomarks merged 17 commits intomainfrom
replace-github-cache-library

Conversation

@aomarks
Copy link
Copy Markdown
Member

@aomarks aomarks commented May 15, 2022

Replaces @actions/cache with our own implementation for interacting with the GitHub Actions cache service.

API mismatch with @actions/cache

@actions/cache takes a key and a set of glob patterns, and the literal glob patterns themselves are automatically used to generate a cache "version". The key + version then serve as a compound key for the cache entry.

This didn't work well for us because the glob implementation it uses doesn't support exclusions the way we need, so it is incompatible with our glob behavior, and may also be different in other subtle ways. We really just want to provide a list of files/directories for the tarball, and set the key independently.

But if we list all the files directly, then we can never get a cache hit, because the full list of files needs to be part of the cache key, but knowing the list of files requires running the script, which is what we're trying to avoid by restoring from cache!

The workaround we had before

The workaround we had before was to reach into the internal/ directory of @actions/cache and call some of the lower level functions directly, which let us control exactly what the cache key + version was, while still passing an explicit list of files to tar.

This worked ok, but had a high risk of breakage, because it uses non-public APIs. #227 is an example of us breaking because of that. It also required us to write some of our own typings, and to turn off skipLibCheck in our tsconfig.json. It was still quite a lot of code, too.

In addition, we still had to deal with an edge case relating to empty directories. By default, tar includes all recursive contents of a directory -- so if we wanted to cache an empty directory that wasn't actually empty on disk, we had to resort to a weird "empty directories manifest" hack. But now we can just use tar --no-recursion flag which avoids that whole issue.

This PR

This PR completely replaces @actions/cache with a custom implementation that uses https.request and execFile('tar', ...).

As a big bonus, our install size decreased from 25MB to 2.4MB and our transitive dependency count decreased from 93 to 29.

Testing

Fixes #107

Copy link
Copy Markdown
Member

@rictic rictic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent comments, great work validating this

if (emptyDirs.size > 0) {
const emptyDirsManifest = JSON.stringify([...emptyDirs]);
emptyDirsManifestPath = this.#emptyDirectoriesManifestPath(
async #reserveUploadAndCommitTarball(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document what the boolean return means

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

const tarballPath = await this.#makeTarball([...files], compressionMethod);
async #upload(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

script: ScriptReference,
id: number,
tarballBytes: number
): Promise<boolean> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Base automatically changed from github-cache-testing to main May 15, 2022 15:59
@aomarks aomarks force-pushed the replace-github-cache-library branch from bcc7f25 to 21abafa Compare May 15, 2022 16:03
@aomarks aomarks enabled auto-merge (squash) May 15, 2022 16:04
@aomarks aomarks merged commit 0e93dcf into main May 15, 2022
@aomarks aomarks deleted the replace-github-cache-library branch May 15, 2022 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improvements around @actions/cache

2 participants