Ensure x-amz-meta-Surrogate-Control is set for /versions from S3 #3787

Merged
merged 1 commit into from May 10, 2023

Conversation

segiddins
Member

This should help Fastly avoid returning 503s when downloading the S3 compact index files

@segiddins segiddins requested a review from indirect May 10, 2023 03:24
@indirect
Member

Explanation from a Fastly engineer:

tl;dr: try setting a Surrogate-Control header.

okay, i am able to explain some more now:

  • the shield pop sets the TTL for any cacheable object to 3600 seconds if vcl_fetch runs. when the revalidation logic runs, no VCL is executed. this means that the refreshed TTL for the object on the shield is set to 60 seconds, taken from the origin's Cache-Control.
  • the edge pop sets the TTL by looking at Cache-Control: max-age=60

this combination sets up an unfortunate side effect: an edge POP cannot reuse an object with Age > 60 for anyone other than the requesting client. that means request collapsing cannot share the object across clients, so each client serializes its request to the shield. the result is that many of them end up waiting too long for the response and get a first byte timeout. incidentally, this is why you do not see these errors coming from your shield POP in seattle.
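
To make the interaction above concrete, here is a toy Python model of how a cache might pick a TTL from response headers. It is a sketch under stated assumptions, not Fastly's implementation: the function names (`max_age`, `cache_ttl`, `is_fresh`) are hypothetical, header lookup is case-sensitive for brevity, and only the `max-age` directive is considered (`s-maxage` and `Expires` are ignored). The one behavior it relies on is that `Surrogate-Control` takes priority over `Cache-Control` when both are present.

```python
import re


def max_age(header_value):
    """Return the max-age directive of a header value as an int, or None."""
    if not header_value:
        return None
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", header_value)
    return int(match.group(1)) if match else None


def cache_ttl(headers):
    """Simplified TTL choice: Surrogate-Control wins over Cache-Control."""
    for name in ("Surrogate-Control", "Cache-Control"):
        ttl = max_age(headers.get(name))
        if ttl is not None:
            return ttl
    return None


def is_fresh(age_seconds, ttl_seconds):
    """An object is reusable for other clients only while Age < TTL."""
    return age_seconds < ttl_seconds
```

In this model, an object that arrives from the shield with Age 120 and only `Cache-Control: max-age=60` is already stale at the edge, so it cannot be shared between clients. Adding `Surrogate-Control: max-age=3600, stale-while-revalidate=1800` to the same response makes it fresh.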

the solution you want depends on which end user behavior you want. i am making the assumption you want end users to cache the object for no more than 60 seconds.

  • first, set beresp.http.Surrogate-Control = "max-age=3600, stale-while-revalidate=1800" in the place in vcl_fetch where you set beresp.ttl when checking if you're using your origin directly instead of the shield director (that is, you're checking if you're running on the shield).
  • second, complain at me when i have inevitably gotten something wrong.

Your VCL should look something like this afterwards:

if (beresp.cacheable) {
  set beresp.ttl = 3600s;
  set beresp.stale_if_error = 86400s;
  set beresp.stale_while_revalidate = 1800s;
  set beresp.http.Surrogate-Control = "max-age=3600, stale-while-revalidate=1800";
}

Now, a caveat:

  • the shield pop will continue to revalidate with the origin every 60 seconds after the initial 3600 seconds expires. this is because the revalidation logic that happens automatically in varnish does not execute the user's vcl_fetch, so this updated logic won't hold.
  • you could attempt to have your origin send the Surrogate-Control header as well, in which case you'd expect consistent behavior everywhere
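
The caveat can be illustrated with another small Python sketch. It is a toy model built only from the description in this thread, and its names are hypothetical (`VCL_FETCH_TTL`, `header_ttl`, `shield_ttl`). It assumes the shield uses the 3600-second TTL whenever vcl_fetch runs, and falls back to the origin's own headers when a revalidation skips VCL.

```python
import re

# TTL the VCL assigns in vcl_fetch (3600s in the snippet quoted earlier).
VCL_FETCH_TTL = 3600


def header_ttl(headers):
    """TTL from headers alone: Surrogate-Control first, then Cache-Control."""
    for name in ("Surrogate-Control", "Cache-Control"):
        match = re.search(r"(?:^|[,\s])max-age=(\d+)", headers.get(name, ""))
        if match:
            return int(match.group(1))
    return None


def shield_ttl(origin_headers, vcl_fetch_runs):
    """TTL on the shield after a full fetch or after a revalidation."""
    if vcl_fetch_runs:
        # Full fetch: the VCL overrides whatever the origin sent.
        return VCL_FETCH_TTL
    # Revalidation: no VCL runs, so only the origin's headers count.
    return header_ttl(origin_headers)
```

With an origin that sends only `Cache-Control: max-age=60`, this model gives 3600 seconds on the first fetch and 60 seconds after each revalidation. If the origin also sends `Surrogate-Control: max-age=3600`, both paths give 3600, which is the consistent behavior the second bullet describes.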

@indirect indirect enabled auto-merge May 10, 2023 03:33