Random 504 timeouts with SYNC_LOCK #32

Open
capJavert opened this issue Apr 12, 2021 · 5 comments

Comments

@capJavert
Contributor

capJavert commented Apr 12, 2021

Hello,

Great package! We cut our load times drastically after integrating next-boost.

The cache is working really well, even with thousands of requests per second (we managed to hit 40k req/s with response times still under 500ms).

While the cache works perfectly most of the time, I noticed that sometimes (under various loads) the SYNC_LOCK timeout of 10s expires, which then manifests as 504 errors on our web server.

I initially thought this was happening due to the revalidation logic, e.g.:

  1. Page is under intense load
  2. Cache becomes invalid
  3. The first request comes in and is reported as a miss
  4. The next request comes in but is blocked by SYNC_LOCK
  5. Revalidation takes some time and then expires after 10s (timeout inside library)
  6. 504 is generated

But after running some tests with a longer TTL, I still got the same errors (even though all requests were hits because the cache was still valid).

I am thinking that maybe we are getting some anomalies when reading from or writing to the cache?

Any thoughts or advice on this?

Thanks!

@rjyo
Collaborator

rjyo commented Apr 12, 2021

Hi,

Thanks for the feedback on performance. Haven't tried that hard yet ;)

Yes, it must be related to the logic here:
https://github.com/rjyo/next-boost/blob/master/src/cache-manager.ts#L51

It happens when

  1. next is rendering a certain url
  2. at the same time that url is requested again, and it tries to wait and get step 1's cached result
  3. next didn't finish the rendering in 10s, thus step 2 exits
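The three steps above can be sketched roughly as follows. This is a minimal illustration, not next-boost's actual code: `rendering`, `cache`, `waitForRender`, and `MAX_WAIT_MS` are hypothetical names, and polling is an assumption made only to keep the example short.

```typescript
// Hypothetical names throughout; this is NOT the real cache-manager.ts.
const MAX_WAIT_MS = 10000; // the 10s timeout discussed in this thread
const POLL_MS = 10;        // assumed polling interval

const rendering = new Set<string>();     // URLs currently being rendered (step 1)
const cache = new Map<string, string>(); // URL -> rendered body

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

type WaitResult = { status: "hit" | "miss" | "timeout"; body?: string };

// Step 2: a second request waits for the in-progress render of the same URL.
// Step 3: if the render is still running after maxWaitMs, the waiter gives up.
async function waitForRender(
  url: string,
  maxWaitMs: number = MAX_WAIT_MS,
): Promise<WaitResult> {
  const deadline = Date.now() + maxWaitMs;
  while (rendering.has(url)) {
    if (Date.now() >= deadline) {
      return { status: "timeout" }; // this is what would surface as a 504
    }
    await sleep(POLL_MS);
  }
  const body = cache.get(url);
  return body !== undefined ? { status: "hit", body } : { status: "miss" };
}
```

Note that in a scheme like this, every request that arrives while the render is in progress waits on the same lock, so a single slow render can time out many requests at once.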

Maybe you can:

  • change timeout to 60s
  • remove the timeout logic totally

Let me know whether these work for you.

Nice day!

@capJavert
Contributor Author

capJavert commented Apr 13, 2021

  • change timeout to 60s
  • remove the timeout logic totally

Is it possible to do those two things without using something like patch-package?

Also, I was wondering, could the size of the database impact this?

How large a database would hybrid-disk-cache support?

Ours is about 50-70 MB on average.

@rjyo
Collaborator

rjyo commented Apr 13, 2021

You can fork it and install the npm package from your fork with "npm install user/repo".
https://docs.npmjs.com/cli/v7/commands/npm-install

How large a database would hybrid-disk-cache support?

281 TB. https://www.sqlite.org/changes.html

@capJavert
Contributor Author

I managed to minimize the 504 errors by reducing the size of the database, using paramFilter and some stricter rules. It is now at ~20-30 MB.
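To illustrate why filtering query parameters shrinks the cache, here is a standalone sketch. The thread does not show the actual paramFilter configuration, so `ALLOWED_PARAMS`, `keepParam`, and `cacheKey` are hypothetical names and the allow-list is an assumption; the point is only that URLs differing in ignored parameters collapse into one cache entry.

```typescript
// Hypothetical allow-list; the real set of parameters depends on the site.
const ALLOWED_PARAMS = new Set<string>(["page", "q"]);

// Returns true when a query parameter should be part of the cache key.
function keepParam(name: string): boolean {
  return ALLOWED_PARAMS.has(name);
}

// Builds a cache key from a request URL: drops filtered parameters and
// sorts the remaining ones so that parameter order does not matter.
function cacheKey(rawUrl: string): string {
  // The base is only needed so that relative URLs can be parsed.
  const url = new URL(rawUrl, "http://localhost");
  const kept: [string, string][] = [];
  url.searchParams.forEach((value, name) => {
    if (keepParam(name)) kept.push([name, value]);
  });
  kept.sort(([a], [b]) => a.localeCompare(b));
  const query = new URLSearchParams(kept).toString();
  return query ? `${url.pathname}?${query}` : url.pathname;
}
```

With this filter, `/blog?utm_source=x&page=2` and `/blog?page=2` map to the same key, so only one entry is stored for both.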

I would say for sure that 50+ MB databases show some lag here and there, even though SQLite supports 281 TB.

I will probably go forward and increase MAX_WAIT to 20s to see whether it removes the error completely.

Have you thought about implementing true stale-while-revalidate logic, where the stale cache entry keeps being served right up until it is replaced by the new entry? For example:

  1. Hit
  2. Hit
  3. Stale
  4. Stale (but still hits cache)
  5. Stale (but still hits cache)
  6. Stale (but still hits cache)
  7. Hit (cache got replaced)
  8. Hit
  9. ...
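The sequence above could be sketched like this. It is a minimal in-memory illustration of the idea, not a proposal for next-boost's internals: `SwrCache`, `render`, `ttlMs`, and the injectable `now` clock are all hypothetical names.

```typescript
type Entry = { body: string; expiresAt: number };
type Status = "miss" | "hit" | "stale";

// Hypothetical stale-while-revalidate cache kept entirely in memory.
class SwrCache {
  private entries = new Map<string, Entry>();
  private inFlight = new Map<string, Promise<void>>();
  private render: (url: string) => Promise<string>;
  private ttlMs: number;
  private now: () => number;

  constructor(
    render: (url: string) => Promise<string>,
    ttlMs: number,
    now: () => number = Date.now,
  ) {
    this.render = render;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Starts at most one render per URL; concurrent callers share the promise.
  private refresh(url: string): Promise<void> {
    const existing = this.inFlight.get(url);
    if (existing) return existing;
    const p = this.render(url).then(
      (body) => {
        this.entries.set(url, { body, expiresAt: this.now() + this.ttlMs });
        this.inFlight.delete(url);
      },
      (err) => {
        this.inFlight.delete(url);
        throw err;
      },
    );
    this.inFlight.set(url, p);
    return p;
  }

  async get(url: string): Promise<{ status: Status; body: string }> {
    const entry = this.entries.get(url);
    if (!entry) {
      // Nothing to serve yet, so this request has to wait for the render.
      await this.refresh(url);
      return { status: "miss", body: this.entries.get(url)!.body };
    }
    if (this.now() < entry.expiresAt) {
      return { status: "hit", body: entry.body };
    }
    // Expired: revalidate in the background and keep serving the old body.
    void this.refresh(url).catch(() => {});
    return { status: "stale", body: entry.body };
  }
}
```

In this sketch only a cold miss ever waits on a render; once an entry exists, requests never block on a lock, so a slow revalidation cannot time them out.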

@capJavert
Contributor Author

capJavert commented Apr 14, 2021

I did some investigating...

I will probably go forward and increase MAX_WAIT to 20s to see whether it removes the error completely.

I will not do this, because it would just increase the number of 504 responses: every request that arrives while one request holds the lock will also time out.
