
Fallback: "blocking" #1007

Merged
merged 7 commits into from Apr 21, 2021
Merged

Conversation

@jvarho (Collaborator) commented Apr 19, 2021

Implements fallback: "blocking" and tests for it.

Fixes #833

Actual change in default-handler was pretty minimal now that the rest of the SSG stuff works correctly.
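
For anyone skimming, here is a minimal example of the Next.js option this PR wires up (an illustrative page, not code taken from the PR itself):

```tsx
// pages/posts/[slug].tsx — illustrative only; the file name and props are made up.
// With fallback: "blocking", paths not returned from getStaticPaths are rendered
// on the server on first request (the client waits for the full HTML), and the
// result is then cached like a regular SSG page.
import { GetStaticPaths, GetStaticProps } from "next";

export const getStaticPaths: GetStaticPaths = async () => ({
  paths: [{ params: { slug: "hello" } }], // pre-rendered at build time
  fallback: "blocking",                   // everything else rendered on demand
});

export const getStaticProps: GetStaticProps = async ({ params }) => ({
  props: { slug: String(params?.slug ?? "") },
});

export default function Post({ slug }: { slug: string }) {
  return <h1>{slug}</h1>;
}
```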

@codecov bot commented Apr 19, 2021

Codecov Report

Merging #1007 (4471d99) into master (5eedc12) will increase coverage by 0.08%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #1007      +/-   ##
==========================================
+ Coverage   72.38%   72.47%   +0.08%     
==========================================
  Files          75       75              
  Lines        2908     2917       +9     
  Branches      675      676       +1     
==========================================
+ Hits         2105     2114       +9     
  Misses        671      671              
  Partials      132      132              
Impacted Files Coverage Δ
...ackages/libs/lambda-at-edge/src/default-handler.ts 93.41% <100.00%> (+0.19%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5eedc12...4471d99.

@adamdottv

@jvarho just calling this out for visibility: if you wanted to take on #804, I'm offering a $2000 USD bounty for a PR that adds ISR (revalidate) to this repo.

@jvarho (Collaborator, Author) commented Apr 20, 2021

@adamelmore I'd also really like to see ISR – I'm currently using fallback + a separate lambda that deletes pages from S3 periodically to get a poor man's version of ISR.

I haven't been able to come up with a way to implement ISR without stepping outside lambda@edge, so I've been assuming CDK support and some core rework is needed first.

@adamdottv

> @adamelmore I'd also really like to see ISR – I'm currently using fallback + a separate lambda that deletes pages from S3 periodically to get a poor man's version of ISR.
>
> I haven't been able to come up with a way to implement ISR without stepping outside lambda@edge, so I've been assuming CDK support and some core rework is needed first.

I think the consensus was to submit to an SQS queue from the Lambda@Edge function, and have a lambda triggered from that queue. Dedup on the queue would be used to prevent multiple regen operations within the revalidate window.

Noting that this would carry a bit of a latency hit; it would be faster to emit an SNS notification from Lambda@Edge to identically named topics in every region, with an SQS queue subscribed to each of those topics. The queue would handle dedup, and the function would still be triggered off of the queue. This would be much faster from the Lambda@Edge side (non-blocking), but also more complex.

Any of that seem clear to you?
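
(For illustration only, a rough sketch of the SQS variant described above, assuming a FIFO queue — the queue URL, region, and payload shape are made up; none of this is in this PR.)

```ts
// Hypothetical sketch of the "enqueue a regeneration job from Lambda@Edge" idea.
import { SQS } from "aws-sdk";

const sqs = new SQS({ region: "us-east-1" });

// Fire-and-forget: if the message is lost, the next request for the same
// page simply enqueues a new one.
async function triggerRegeneration(pagePath: string): Promise<void> {
  await sqs
    .sendMessage({
      // A FIFO queue is required for deduplication.
      QueueUrl:
        "https://sqs.us-east-1.amazonaws.com/123456789012/regeneration.fifo",
      MessageBody: JSON.stringify({ pagePath }),
      MessageGroupId: pagePath,
      // Identical ids within SQS's 5-minute dedup window collapse into a
      // single regeneration job for this page.
      MessageDeduplicationId: pagePath,
    })
    .promise();
}
```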

@jvarho (Collaborator, Author) commented Apr 20, 2021

> I think the consensus was to submit to an SQS queue from the Lambda@Edge function, and have a lambda triggered from that queue. Dedup on the queue would be used to prevent multiple regen operations within the revalidate window.

That's more or less along the lines I was thinking. However, it would only support revalidate values larger than the 5-minute minimum deduplication window for SQS queues, right?

> Noting that this would carry a bit of a latency hit; it would be faster to emit an SNS notification from Lambda@Edge to identically named topics in every region, with an SQS queue subscribed to each of those topics. The queue would handle dedup, and the function would still be triggered off of the queue. This would be much faster from the Lambda@Edge side (non-blocking), but also more complex.

I don't really see the advantage. Whatever the edge lambda does, whether it's sending a message to a region-local SNS or a global SQS, it does not need to wait for confirmation, since at worst the message is lost and the next request sends a new one. The semantics of revalidation are (AFAIU) that rendering happens "at most" once in the revalidate window.

@jvarho (Collaborator, Author) commented Apr 20, 2021

Ah, sorry, I'm an idiot. The dedup could be based on metadata from S3, not just the key, so revalidate windows shorter than 5 minutes are certainly possible to support with SQS deduplication.
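
(A sketch of what that metadata-based dedup id could look like — a hypothetical helper, not part of this PR. Because the id changes as soon as the cached object is regenerated, revalidate windows shorter than SQS's fixed 5-minute dedup window still trigger at most one regeneration per rendered version of the page.)

```ts
// Hypothetical: derive the SQS MessageDeduplicationId from the S3 object's
// LastModified, so a fresh id is produced once the page has been re-rendered.
import { S3 } from "aws-sdk";
import { createHash } from "crypto";

const s3 = new S3({ region: "us-east-1" });

async function deduplicationIdFor(bucket: string, key: string): Promise<string> {
  const head = await s3.headObject({ Bucket: bucket, Key: key }).promise();
  const lastModified = head.LastModified?.toISOString() ?? "missing";
  // Same page + same cached version => same id => deduplicated by SQS.
  return createHash("sha256").update(`${key}:${lastModified}`).digest("hex");
}
```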

@adamdottv

> I don't really see the advantage. Whatever the edge lambda does, whether it's sending a message to a region-local SNS or a global SQS, it does not need to wait for confirmation, since at worst the message is lost and the next request sends a new one. The semantics of revalidation are (AFAIU) that rendering happens "at most" once in the revalidate window.

You could be right here. I had thought there would be more latency putting a message on an SQS queue (living in a specific region) from an arbitrary edge location, even ignoring any ack back from SQS, but I haven't tested this.

@@ -164,6 +179,33 @@ describe("Pages Tests", () => {
});
});

describe("Dynamic SSG fallback blocking", () => {
Collaborator:

I think you meant "Dynamic SSG fallback"?

Collaborator (Author):

Not only that, but it duplicates (some of) the test above. It's a rebase artifact, fixed now.

Collaborator (Author):

And since they duplicated the "not cached" test, they failed the e2e test run.

@dphang (Collaborator) commented Apr 20, 2021

Ran e2e tests again: https://github.com/serverless-nextjs/serverless-next.js/runs/2393671265?check_suite_focus=true. Looks like just one test failed; I can take a look later unless you have time to look at it.

@jvarho (Collaborator, Author) commented Apr 21, 2021

> Ran e2e tests again: https://github.com/serverless-nextjs/serverless-next.js/runs/2393671265?check_suite_focus=true. Looks like just one test failed; I can take a look later unless you have time to look at it.

I was able to reproduce. The request sometimes returns a cloudfront error because it has headers:

date: [object Object]
server: [object Object]

The actual response code and content are correct (the test successfully checks them), but cloudfront understandably doesn't like those headers. It's probably already an issue before this pull request, but retries can get around it if the test case returns the object directly from S3 and then cloudfront has a chance to cache it.

I think the bug is due to lambda-at-edge-compat rewriting the headers, but I haven't tried to understand that code yet.
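
(For reference — background only, not a diagnosis of where the compat code goes wrong — this is the shape CloudFront expects for headers coming out of a Lambda@Edge origin-response handler; a header value that is an object rather than a plain string ends up stringified as "[object Object]".)

```ts
// Each header maps to an array of { key, value } pairs, and value must be a
// plain string. Assigning an object there is what produces "[object Object]".
const response = {
  status: "200",
  statusDescription: "OK",
  headers: {
    "content-type": [{ key: "Content-Type", value: "text/html" }],
    "x-custom": [{ key: "X-Custom", value: "example" }],
  },
  body: "<html>…</html>",
};
```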

@dphang (Collaborator) commented Apr 21, 2021

> Ran e2e tests again: https://github.com/serverless-nextjs/serverless-next.js/runs/2393671265?check_suite_focus=true. Looks like just one test failed; I can take a look later unless you have time to look at it.
>
> I was able to reproduce. The request sometimes returns a cloudfront error because it has headers:
>
> date: [object Object]
> server: [object Object]
>
> The actual response code and content are correct (the test successfully checks them), but cloudfront understandably doesn't like those headers. It's probably already an issue before this pull request, but retries can get around it if the test case returns the object directly from S3 and then cloudfront has a chance to cache it.
>
> I think the bug is due to lambda-at-edge-compat rewriting the headers, but I haven't tried to understand that code yet.

Hm, I didn't see this fail in earlier tests, though this one also failed in the previous e2e test run, so it might be related to your change? I know there's some existing strangeness in origin response, since we're modifying whatever comes from S3 (the origin); sometimes I also saw "x-cache: Error from cloudfront" even when a proper page got returned.

@jvarho (Collaborator, Author) commented Apr 21, 2021

> Hm, I didn't see this fail in earlier tests, though this one also failed in the previous e2e test run, so it might be related to your change? I know there's some existing strangeness in origin response, since we're modifying whatever comes from S3 (the origin); sometimes I also saw "x-cache: Error from cloudfront" even when a proper page got returned.

I added a failing unit test and a fix for it. The issue looks to have been introduced in #544.

@jvarho (Collaborator, Author) commented Apr 21, 2021

Actually, looking at the headers brings up another issue. The initial SSG fallback data request (or page with blocking) had no cache control header so it took an extra request or two for cloudfront to cache it.

Also shows up in the most recent e2e run as an extra retry for "serves and caches page /optional-catch-all-ssg-with-fallback/not-found" and likewise for /not-prerendered.
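
(A sketch of the kind of fix this implies — the header value below is an assumption, not the exact change in the PR: give the on-demand rendered fallback/blocking response an explicit Cache-Control so CloudFront can cache it on the first hit.)

```ts
import { CloudFrontResultResponse } from "aws-lambda";

// Hypothetical helper: attach a Cache-Control header to the fallback /
// blocking response. The s-maxage value is illustrative only.
function withStaticCacheControl(
  response: CloudFrontResultResponse
): CloudFrontResultResponse {
  response.headers = {
    ...(response.headers ?? {}),
    "cache-control": [
      {
        key: "Cache-Control",
        value: "public, max-age=0, s-maxage=2678400, must-revalidate",
      },
    ],
  };
  return response;
}
```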

@dphang (Collaborator) commented Apr 21, 2021

> Actually, looking at the headers brings up another issue. The initial SSG fallback data request (or page with blocking) had no cache control header so it took an extra request or two for cloudfront to cache it.
>
> Also shows up in the most recent e2e run as an extra retry for "serves and caches page /optional-catch-all-ssg-with-fallback/not-found" and likewise for /not-prerendered.

Nice, thanks for a great find! I never realized that missing cache-control headers were why it had the x-cache error at origin response; I thought it was just CloudFront being flaky. Let me run the tests once again.

@dphang merged commit ef3c9ad into serverless-nextjs:master on Apr 21, 2021