Consider enabling nginx caching #195

Closed
Nutomic opened this issue Oct 23, 2023 · 11 comments · Fixed by #237
Labels
enhancement New feature or request help wanted Extra attention is needed
Milestone

Comments

@Nutomic
Member

Nutomic commented Oct 23, 2023

Lemmy now sends proper cache-control headers so that responses for unauthenticated users can be cached. This would help to reduce resource usage on the server. However, it could also lead to weird issues like stale responses. I previously attempted to add caching in #75, but back then it was quite hacky as Lemmy didn't set cache-control.

You can enable nginx caching by adding the following at the top of the config file:

proxy_cache_path /var/cache/lemmy/voyager.lemmy.ml/ levels=1:2 keys_zone=lemmy_cache_voyager.lemmy.ml:10m max_size=100m use_temp_path=off;

And this next to proxy_pass:

proxy_cache lemmy_cache_voyager.lemmy.ml;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
proxy_no_cache $cookie_jwt $http_authorization;
proxy_cache_bypass $cookie_jwt $http_authorization;
# for debugging, should probably be disabled in prod
add_header x-cache-status $upstream_cache_status;
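
A minimal sketch of how the two snippets above might sit together in one file. The domain `example.com`, the zone name `lemmy_cache`, and the upstream address `127.0.0.1:8536` are placeholders for illustration, not values taken from the lemmy-ansible template:

```nginx
# http context (top of the site config file): on-disk cache location, a 10 MB
# shared-memory zone for cache keys, and a 100 MB cap on cached responses.
proxy_cache_path /var/cache/lemmy/example.com/ levels=1:2 keys_zone=lemmy_cache:10m max_size=100m use_temp_path=off;

server {
    listen 80;
    server_name example.com;  # hypothetical domain

    location / {
        proxy_pass http://127.0.0.1:8536;  # hypothetical upstream address
        proxy_set_header Host $host;

        proxy_cache lemmy_cache;
        # Serve a stale copy when the backend errors out or while an entry is being refreshed.
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
        # Never store a response, and never answer from the cache, when the
        # request carries a jwt cookie or an Authorization header.
        proxy_no_cache $cookie_jwt $http_authorization;
        proxy_cache_bypass $cookie_jwt $http_authorization;
        # Debugging aid only; probably not wanted in production.
        add_header x-cache-status $upstream_cache_status;
    }
}
```

No `proxy_cache_valid` is set here, so how long a response is kept depends on the cache-control headers the backend sends.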

Caching is enabled on https://voyager.lemmy.ml/ so you can give it a try and look at x-cache-status header.
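
If the debug header is dropped in production, one option is to record the same variable in the access log instead. A sketch, assuming a hypothetical format name `cache_status` and a hypothetical log path; `$upstream_cache_status` takes values such as MISS, HIT, BYPASS, EXPIRED and STALE:

```nginx
# http context: a custom log format that appends the cache result to each entry.
log_format cache_status '$remote_addr [$time_local] "$request" $status cache=$upstream_cache_status';

# server or location context: write entries using that format.
access_log /var/log/nginx/lemmy_cache.log cache_status;
```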

@codyro codyro self-assigned this Oct 23, 2023
@ticoombs
Collaborator

ticoombs commented Dec 3, 2023

I've had a brief look at this previously, and had a look today. Looks like it works well. I had a MISS on the initial /communities and then a HIT immediately after.
I also saw code in lemmy that changed the auth related $jwt and whether we sent the cache status or not. So I'm not entirely sure I would want to enable this cache just yet.

I've enabled caching multiple times in $jobs, and there are so many edge cases that it's sometimes difficult to find them without very thorough testing. (Mind you, back then I wasn't able to ask the devs to change or fix bad code, so I'm sure we can get everything sorted here.)

If we want to do this, let's try a 1.4.0 release, after the dust settles on Lemmy 0.19.0.

Tasks/Thoughts

  • Ideally we would enable the cache in nginx at the server level, as that would reduce the amount of network traffic the Docker containers need to handle. Unfortunately, that would force people to use nginx as a proxy, which I think is probably okay for lemmy-ansible but might be an issue for others.
  • How will the caching affect the other containers? (pictrs?)
  • can we safely test PII/auth flows across multiple accounts
  • if users have a CDN/proxy in front (DDoS guard, etc.) and all the CDN users come from X number of IPs, how does this work then?

@codyro
Collaborator

codyro commented Dec 3, 2023

can we safely test PII/auth flows across multiple accounts

Have any thoughts on how to achieve this? You hit the nail on the head with strange edge cases that are difficult to account for, which has been my main trepidation.

@ticoombs ticoombs added enhancement New feature or request help wanted Extra attention is needed labels Dec 16, 2023
@Nutomic
Member Author

Nutomic commented Dec 28, 2023

How will the caching affect the other containers? (pictrs?)

Pictrs also sends cache-control headers, so there should be no problem.

can we safely test PII/auth flows across multiple accounts

Can you give any examples of potential edge cases? What I can say is that the logic for cache-control headers in Lemmy is really simple. If the request was made by an authenticated user it gets cache-control: private, otherwise cache-control: public, max-age=60. Additionally, some routes like nodeinfo or federation endpoints have hardcoded cache-control headers. With this simple logic I don't see much potential for problems, but of course it's still good to test things before putting them into production.

if users have a CDN/proxy in front (DDoS guard, etc.) and all the CDN users come from X number of IPs, how does this work then?

This is completely irrelevant for cache-control headers.

@ticoombs
Collaborator

If the request was made by an authenticated user it gets cache-control: private

Which has now invalidated all caching of images ;) But that's not a lemmy-ansible problem.
image

When you only cache users who are not logged in, you only get minimal returns, especially when nearly everyone is logged in. Microcaching logged-in users with bearer tokens as part of the binary-key would have better gains.
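
An untested sketch of what per-user microcaching along these lines might look like. It is not something proposed for lemmy-ansible in this thread, and the zone name `lemmy_cache`, the upstream address, and the one-second lifetime are assumptions. Because nginx does not store responses marked cache-control: private, the sketch has to ignore that header, which also overrides the lifetimes Lemmy sets for unauthenticated responses:

```nginx
location / {
    proxy_pass http://127.0.0.1:8536;  # hypothetical upstream address
    proxy_cache lemmy_cache;           # hypothetical zone name

    # Include the credentials in the cache key so that two users never share
    # an entry; anonymous requests all map to the same key.
    proxy_cache_key "$scheme$request_method$host$request_uri$cookie_jwt$http_authorization";

    # Required for "private" responses to be stored at all. This replaces the
    # backend's own cache lifetimes with the proxy_cache_valid value below.
    proxy_ignore_headers Cache-Control;

    # Microcache: keep successful responses for one second only.
    proxy_cache_valid 200 1s;
}
```

This would replace, not sit alongside, the `proxy_no_cache` and `proxy_cache_bypass` lines from the original snippet, since those skip the cache for every authenticated request. Any setup like this would need the PII/auth testing discussed above before going near production.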

@Nutomic
Member Author

Nutomic commented Jan 2, 2024

@ticoombs That should be fixed by LemmyNet/lemmy#4337

@codyro codyro removed their assignment Jan 17, 2024
@poVoq

poVoq commented Feb 20, 2024

Any news on this? What would be the current blockers to enable this?

How would alternative web UIs be affected?

@dessalines
Member

We could probably add this with the next lemmy release then. @poVoq or someone could do a PR to this repo to add those blocks above.

@Nutomic
Member Author

Nutomic commented Feb 26, 2024

Mainly, it would be good if some instances could try it in production before we enable it by default, to check for problems like receiving stale data, cached data being served while logged in, or, in the worst case, private API responses being cached and served to other users.

@kroese

kroese commented Feb 26, 2024

I use the above settings in production, and never noticed any issue. But my instance is only used by a handful of users, so I cannot be sure that any rare issues would have been noticed.

@ticoombs ticoombs added this to the 1.5.0 milestone Mar 24, 2024
@ticoombs
Collaborator

I've tested this and it looks good. Shall make it into 1.5.0

@kroese

kroese commented Mar 25, 2024

There is one minor issue that I have noticed for a while now: when I visit the front page, I am not logged in.

As soon as I hit the Refresh button in the browser, I am logged in again (and stay logged in for the next hours without any problems). But when visiting the site the next day, I need to hit Refresh again.

To be honest, I cannot say for certain that this behaviour is caused by the above Nginx config, as I also have Cloudflare caching that domain. So it could also be related to some setting in my Cloudflare configuration for the domain.

But it might be worth keeping an eye on this, because this issue started appearing after I enabled the Nginx caching (and I was using the Cloudflare cache long before that).
