Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.x ?token query param is cached even if "Query String Caching" is OFF #467

Closed
Romanavr opened this issue Jan 31, 2023 · 22 comments
Closed
Labels
bug Something isn't working

Comments

@Romanavr
Copy link

Romanavr commented Jan 31, 2023

After updating a number of sites to craft 4.x, I noticed a problem that while job on warming up the cache, the job itself constantly fails and starts again
I canceled the job and tried to warmup the cache through the CLI
In the logs(log file from the blitz) after the warming is done, I noticed that pages with a token parameter (?token=XXXXX) appeared in the log file, while the cache warming up for query params was turned off
[2023-01-30 23:33:43] Inactivity timeout exceeded, more than 10000 ms elapsed from last data received [https://site.com/myurl?token=Eni7rmSnSoLVranG3QVgx3ftG9tCyy-A]
Enabling query string caching and adding a ?token to the exception solves the problem, but I would like to understand why this happens at all

I checked the database, there is no URL with this parameter in the blitz cache and site elements tables. As far as I understand, these pages come from craft(I mean token param itself), I don’t fully understand what this token does.
Maybe it's somehow related to craftcms/cms#12458 / craftcms/cms#5698 ?

PHP 8.1.14
Craft Pro 4.3.5
Blitz 4.2.3

@Romanavr Romanavr added the bug Something isn't working label Jan 31, 2023
@Romanavr Romanavr changed the title 4.x ?token query param is cached even "Query String Caching" 4.x ?token query param is cached even "Query String Caching" is OFF Jan 31, 2023
@Romanavr Romanavr changed the title 4.x ?token query param is cached even "Query String Caching" is OFF 4.x ?token query param is cached even if "Query String Caching" is OFF Jan 31, 2023
@bencroker
Copy link
Collaborator

Cache warmers have been replaced by generators in Blitz 4, which do use tokens for telling Craft that a URL is a generation request, so there's nothing to be concerned about there. You should not have to add an exception for token parameters, as Blitz already does that.

Can you help me understand what the current issue is? Be sure to check the queue.log and blitz.log log files for signs of errors.

@Romanavr
Copy link
Author

Romanavr commented Jan 31, 2023

Cache warmers have been replaced by generators in Blitz 4, which do use tokens for telling Craft that a URL is a generation request, so there's nothing to be concerned about there. You should not have to add an exception for token parameters, as Blitz already does that.

Can you help me understand what the current issue is? Be sure to check the queue.log and blitz.log log files for signs of errors.

Thanks for the reply.

This is from queue.log

2023-01-30 18:42:21 [queue.ERROR] [blitz] Inactivity timeout exceeded, more than 10000 ms elapsed from last data received [https://mysite.com/flip-your-standard?token=K_VxU-H72Vxqzq3NjHJIMhLsKM9PGh4c] {"memory":62770936}
2023-01-30 18:42:21 [queue.ERROR] [blitz] Allowed transfer timeout exceeded, took longer than 10000 ms [https://mysite.com/suburl?token=K_VxU-H72Vxqzq3NjHJIMhLsKM9PGh4c] {"memory":62854768}
2023-01-30 18:42:21 [queue.ERROR] [blitz] Allowed transfer timeout exceeded, took longer than 10000 ms [https://mysite.com/suburl/a-diet-we-can-all-agree-on?token=K_VxU-H72Vxqzq3NjHJIMhLsKM9PGh4c] {"memory":62844248}
2023-01-30 18:42:21 [queue.ERROR] [blitz] Allowed transfer timeout exceeded, took longer than 10000 ms [https://mysite.com/suburl/my-site-breakfast-challenge-7-inspirerende-zomerse-recepten-voor-een-week-vegan-ontbijten?token=K_VxU-H72Vxqzq3NjHJIMhLsKM9PGh4c] {"memory":62831184}
2023-01-30 18:42:21 [queue.ERROR] [blitz] Inactivity timeout exceeded, more than 10000 ms elapsed from last data received [https://mysite.com/suburl/test?token=K_VxU-H72Vxqzq3NjHJIMhLsKM9PGh4c] {"memory":62813440}

https://github.com/putyourlightson/craft-blitz/blob/4.2.3/src/services/CacheRequestService.php#L80
Here blitz checks for token param?
If yes, then probably It's necessary to add || $request->getToken() ?

@bencroker
Copy link
Collaborator

bencroker commented Jan 31, 2023

These timeouts are happening during the cache generation process only. Can you check whether those pages actually load within 10 seconds, which is the max time allowed per generation request.

@Romanavr
Copy link
Author

Romanavr commented Jan 31, 2023

These timeouts are happening during the cache generation process only. Can you check whether those pages actually load within 10 seconds, which is the max time allowed per generation request.

Yes, I can load these pages with a token through the browser within 10 seconds (literally takes 1 second), but the page content with token param is empty (as I understand it specically made for cache generators?).

@bencroker
Copy link
Collaborator

bencroker commented Jan 31, 2023

Hmm, which cache generation driver are you using and with what settings? And is the timeout happening only for the 5 pages shown above, or for more pages?

@Romanavr
Copy link
Author

Romanavr commented Jan 31, 2023

Hmm, which cache generation driver are you using and with what settings? And is the timeout happening only for the 5 pages shown above, or for more pages?

Blitz File Storage. HTTP generator(default). Reverse Proxy Purger - None. Do not cache URLs with query strings. No remote deployer.
The timeout occurs only for pages with a token, from log I can see ~20-25 such pages.

@bencroker
Copy link
Collaborator

bencroker commented Jan 31, 2023

The timeout occurs only for pages with a token, from log I can see ~20-25 pages.

Remember, a token is always used in cache generation, so it is failing for those 20-25 pages due to a timeout.

What concurrency value is selected under Cache Generation?

@Romanavr
Copy link
Author

What concurrency value is selected under Cache Generation?

3 concurrent requests. No custom URLs.

@bencroker
Copy link
Collaborator

In that case I'm not sure what the issue is. Do you have a CDN or firewall that might be blocking repeated HTTP requests from the server? Perhaps trying the Local Generator might make sense too.

@Romanavr
Copy link
Author

In that case I'm not sure what the issue is. Do you have a CDN or firewall that might be blocking repeated HTTP requests from the server? Perhaps trying the Local Generator might make sense too.

Thank you very much for the fast replies. I'll try something and let you know.

@bencroker
Copy link
Collaborator

bencroker commented Jan 31, 2023

Yes, it looks copied directly from the docs, which are correct.

@Romanavr
Copy link
Author

Romanavr commented Feb 2, 2023

After some research, I figured out a potential issue that causes my site to go down on cache generation.

Maybe I'm wrong, but here's what I found out:
As you mentioned and as I see in the source files, generator/warmer work differently in craft 4 rather than in craft 3.
From the logs, I can see that ampphp drops the connection after some timeout (10s?), however, under the hood, PHP continues to execute the script(ignore_user_abort=off in my case), and at the same moment the generator continues to warm up the URLs, triggering more PHP execution which eventually overload the server and it crashes (in the nginx access.log I can see that after 499 code, the server responds with 503, the DB crashes)

Did Gizzly work differently? Gizzly waited for a 200 server response or timeout, but never dropped the connection himself after certain time like it works with the amphp?

@bencroker
Copy link
Collaborator

bencroker commented Feb 2, 2023

You're right that the new HTTP Generator, which uses AMPHP, uses a timeout of 10 seconds, whereas the old Guzzle Warmer did not enforce a timeout. I could make the value configurable, but it seems wise to set a timeout. Your server should enforce its own timeout (via the max_execution_time setting), but if requests are really take that long to fulfill then I can see how the requests from the generator could build up fast and possibly overwhelm the server.

A few things you could try are:

  1. Reduce the HTTP Generator concurrency.
  2. Use the Local Generator instead.
  3. Figure out why requests are taking so long to be served (recommended!).

@Romanavr
Copy link
Author

Romanavr commented Feb 3, 2023

You're right that the new HTTP Generator, which uses AMPHP, uses a timeout of 10 seconds, whereas the old Guzzle Warmer did not enforce a timeout. I could make the value configurable, but it seems wise to set a timeout. Your server should enforce its own timeout (via the max_execution_time setting), but if requests are really take that long to fulfill then I can see how the requests from the generator could build up fast and possibly overwhelm the server.

A few things you could try are:

1. Reduce the HTTP Generator concurrency.

2. Use the Local Generator instead.

3. Figure out why requests are taking so long to be served (recommended!).

Thank you for the detailed answer!
If possible, please add such option so It'll work like under the craft 3.x.
Some pages take too long to load due to the scale of the server (this is we're actually using blitz).
Could you explain what is the main disadvantage of a local generator? Or where can I read about it?
I just want to understand the reason why it is worth using/not using it.

@bencroker
Copy link
Collaborator

bencroker commented Feb 3, 2023

If some pages are taking over 10 seconds to respond then it sounds like there might be opportunities for optimising using eager-loading. Be sure to check out the Blitz Hints utility if you haven't yet.

The Local Generator uses PHP processes to mock requests rather than making external HTTP requests. I imagine it might solve your issues.

@Romanavr
Copy link
Author

Romanavr commented Feb 3, 2023

@bencroker
Thanks for all your advices & tips.
If you get a chance, please make an option to disable automatic self-timeout for the HTTP generator (ampphp).
In my pont of view, even It make sense:
If I have heavy unoptimized pages, Craft requires (not strictly as I know from experience with fortrabbit) max_execution_time to be at least 120 seconds. https://i.imgur.com/dwYAYbc.png
So even I have a few unoptimized pages(response time >10s) and a server that doesn't scale well this may cause a RIP for the server.

In any case, I really appreciate your help, thanks again.

@Romanavr Romanavr closed this as completed Feb 3, 2023
@bencroker
Copy link
Collaborator

bencroker commented Feb 4, 2023

I added a timeout config setting to the HTTP Generator in 8d383e8, for the next release.

return [
    '*' => [
        // The generator settings.
        'cacheGeneratorSettings' => [
            'concurrency' => 3,
            // Override the default value of 10000 milliseconds
            'timeout' => 30000,
        ],
    ],
];

@bencroker
Copy link
Collaborator

bencroker commented Feb 4, 2023

FYI I set the new default timeout to 120 seconds, which is the minimum PHP max execution time that Craft requires.

@Romanavr
Copy link
Author

Romanavr commented Feb 4, 2023

FYI I set the new default timeout to 120 seconds, which is the minimum PHP max execution time that Craft requires.

Make sense, thanks!

@bencroker
Copy link
Collaborator

Just released in version 4.3.0.

@holmey
Copy link

holmey commented Nov 6, 2023

I added a timeout config setting to the HTTP Generator in 8d383e8, for the next release.

return [
    '*' => [
        // The generator settings.
        'cacheGeneratorSettings' => [
            'concurrency' => 3,
            // Override the default value of 10000 milliseconds
            'timeout' => 30000,
        ],
    ],
];

Should this feature be documented in the default configuration file?

//'cacheGeneratorSettings' => ['concurrency' => 3],

@bencroker
Copy link
Collaborator

@holmey not necessarily, as the value here is meant as an example rather than documentation, and it is unlikely that the majority of people will ever have to change it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants