Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra round trips refreshing URL #1014

Open
turan18 opened this issue Dec 27, 2023 · 0 comments
Open

Extra round trips refreshing URL #1014

turan18 opened this issue Dec 27, 2023 · 0 comments
Labels
bug Something isn't working performance Improves performance

Comments

@turan18
Copy link
Contributor

turan18 commented Dec 27, 2023

Description

When creating our httpFetcher for each layer we loop through all hosts that contains the layer and attempt to fetch the "real" layer/blob URL (for registries that have multiple storage backends (eg: ECR -> S3). In the case of ECR, we get a pre-signed S3 URL with some expiration time. We use that URL to fetch span content, and if it expires we are required to go back to ECR to get a new pre-signed URL. If there are many concurrent span requests during this time, we will be forced to refresh the URL multiple times. This is generally not a common scenario, as we will most likely have fetched all the layer content before URL expiration. I still do think it's worth exploring ways we can either reduce these unneeded round-trips or remove them entirely as they open the door for network failures that can result in IO failures for the running container.

Steps to reproduce the bug

No response

Describe the results you expected

Re: #977 (comment)

We could potentially lock the URL over the entirety of refreshURL. This wouldn't solve the problem entirely, since there still could be races where other threads still manage to get a copy of the old URL, but it would make it harder to hit. In the worst case if many threads slip through, we will have a queue of span requests waiting, each trying to refresh the URL.

We could also use a RWMutex instead, where we use a read lock every time we try to read from URL and a write lock when we attempt to refresh the URL. This is functionally equivalent to the solution above, expect it doesn't require a normal lock to read the URL.

Host information

  1. OS:
  2. Snapshotter Version:
  3. Containerd Version:

Any additional context or information about the bug

No response

@turan18 turan18 added bug Something isn't working performance Improves performance labels Dec 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working performance Improves performance
Projects
None yet
Development

No branches or pull requests

1 participant