Skip to content

Use Azure storage SDK to download cache#497

Merged
konradpabjan merged 17 commits into
actions:masterfrom
dhadka:cache-sdk
Jul 10, 2020
Merged

Use Azure storage SDK to download cache#497
konradpabjan merged 17 commits into
actions:masterfrom
dhadka:cache-sdk

Conversation

@dhadka
Copy link
Copy Markdown
Contributor

@dhadka dhadka commented Jun 8, 2020

Use the Azure Storage SDK for Node (https://github.com/Azure/azure-storage-node) when downloading cache content hosted on Azure blob storage. This is intended to improve reliability and performance because the download call, downloadToBuffer, downloads the file in 4 MB chunks which can be independently retried and parallelized.

In testing, we observed download speeds reduced by about 50% using this approach and a significant improvement in reliability. Additionally, this follows the suggestions made by Azure in our investigation into download flakiness (actions/cache#267).

One limitation of downloadToBuffer is Node only supports buffers up to ~1 GB on 32-bit systems and ~2 GBs on 64-bit systems. While the vast majority of caches are < 1 GB, there are a good number of caches > 1 GB and we technically allow up to 5 GBs. As a result, any download that exceeds the buffer limit will be split up and downloaded in multiple calls to downloadToBuffer.

Proxies are still supported using the existing http_proxy and https_proxy environment variables.

Copy link
Copy Markdown
Contributor

@joshmgross joshmgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, left some comments around making this option more friendly to other action authors.

Could we add some more context to the PR description? Would be good to have some background and link to the actions/cache issue that this relates to

Comment thread packages/cache/src/internal/cacheHttpClient.ts Outdated
Comment thread packages/cache/src/internal/cacheHttpClient.ts Outdated
const disableAzureSdk = process.env['DISABLE_AZURE_SDK'] ?? ''

if (
archiveUrl.hostname.endsWith('.blob.core.windows.net') &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any guarantees that all blob storage downloads will be at this address?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guarantee? No. This is what they've always used and I don't see it changing anytime soon. If it were to change, we would just fall back to http-client.

Comment thread packages/cache/src/internal/cacheHttpClient.ts Outdated
@dhadka
Copy link
Copy Markdown
Contributor Author

dhadka commented Jun 16, 2020

@joshmgross Any idea why the toolkit npm test fails with:

    packages/cache/src/internal/cacheHttpClient.ts:290:35 - error TS2339: Property 'constants' does not exist on type 'typeof import("buffer")'.

    290     const maxSegmentSize = buffer.constants.MAX_LENGTH

When the action runs fine in the cache tests?

@joshmgross
Copy link
Copy Markdown
Contributor

@joshmgross Any idea why the toolkit npm test fails with:

    packages/cache/src/internal/cacheHttpClient.ts:290:35 - error TS2339: Property 'constants' does not exist on type 'typeof import("buffer")'.

    290     const maxSegmentSize = buffer.constants.MAX_LENGTH

When the action runs fine in the cache tests?

Looks like it was just a type issue with @types/node. Since all actions run on the LTS version of Node 12, I installed the latest version of @types/node@12

Comment thread package.json
"devDependencies": {
"@types/jest": "^24.0.11",
"@types/node": "^11.13.5",
"@types/node": "^12.12.47",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thboop @ericsciple FYI I bumped this package version since all actions run on Node 12 and it was causing issues in the test, see #497 (comment)

@dhadka
Copy link
Copy Markdown
Contributor Author

dhadka commented Jun 18, 2020

@joshmgross Can you please take one more pass? My last commit added some testing. In order to make it work with jest, I had to move around some methods, including adding the following:

  • requestUtils.ts - Contains the retry logic and methods that check HTTP status codes (needed to avoid circular reference)
  • downloadUtils.ts - Contains the actual download logic

If you're good, then this should be ready to merge (I don't have write permission on this repo so I can't merge).

Copy link
Copy Markdown
Contributor

@joshmgross joshmgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, but moving the exported functions to new files will break existing imports. We can either bump the major version or keep the functions where they are and have them call the new function locations.

This package is well tested, so I'm fine with releasing a version 1

retryOptions: {
// Override the timeout used when downloading each 4 MB chunk
// The default is 2 min / MB, which is way too slow
tryTimeoutInMs: options?.timeoutInMs ?? 30000
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do something similar to https://github.com/actions/toolkit/blob/master/packages/glob/src/internal-glob-options-helper.ts so that we can initialize a default options object and keep all the defaults in a single place.

Copy link
Copy Markdown
Contributor Author

@dhadka dhadka Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. The one issue I hit with this approach is that even though we fill in the default value if it's undefined, the interface still shows the properties as optional, so we still need some check if it's undefined. I ended up adding an assertDefined method that throws if the property is undefined, but that should never happen in practice...it's just to make TypeScript happy.

If you know of a better way to do this, please let me know.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I also tried using the non-null operator !, but lint doesn't like it)

Comment thread packages/cache/src/internal/cacheHttpClient.ts
Comment thread packages/cache/RELEASES.md Outdated
- Fix to await async function getCompressionMethod

### 1.0.0
- Downloads Azure-hosted caches using the Azure SDK for speed and reliability No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we list the breaking changes here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Are there any other breaking changes we would want to make now?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on a follow-on PR to display progress. Let's not release 1.0.0 just yet.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return false
}
const retryableStatusCodes = [
HttpCodes.BadGateway,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we get throttled here? Should we retry on 429?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should retry on throttling - it probably will make it worse, no? :D

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that I didn't add this code, just moved it to a different file for testing purposes)

I believe it's this way because a 429 would generally mean we need to wait for some period of time. The question then becomes whether it's better for us to wait and retry, or just "cache miss" and let the user rebuild the cache content. These retries are only used when accessing the ArtifactCache API, so this would mean RU is throttling the host and that host would likely remain throttled for some time.

}

if (copy) {
if (typeof copy.useAzureSdk === 'boolean') {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason for checking these types? we are already in typescript, we should have good typechecking. I guess are these sent by customers and they can send anything they want? Why don't we validate them and fail if that's the case

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the purpose of this if condition is to check if the value is set (not undefined and not null). So I think this is just an easier expression. (This pattern is used elsewhere, such as

if (typeof copy.followSymbolicLinks === 'boolean') {
)

Copy link
Copy Markdown
Contributor

@aiqiaoy aiqiaoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@konradpabjan konradpabjan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 👍

@konradpabjan konradpabjan merged commit 4964b0c into actions:master Jul 10, 2020
scottjacobsen pushed a commit to scottjacobsen/setup-ruby that referenced this pull request Aug 21, 2020
The @actions/cache version currently being used has performance issue
with large gem caches. See actions/cache#267

This was addressed here actions/toolkit#497

Bump the @actions/cache version to a newer release that contains that
fix.
eregon pushed a commit to ruby/setup-ruby that referenced this pull request Aug 22, 2020
The @actions/cache version currently being used has performance issue
with large gem caches. See actions/cache#267

This was addressed here actions/toolkit#497

Bump the @actions/cache version to a newer release that contains that
fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants