WordPress sometimes returns JSON to a web browser instead of the HTML page. #580
Comments
This is normally a caching issue! It also causes ActivityPub issues, because the API requests receive HTML from time to time as well. |
@uk3 what caching plugins and/or hosting platform are you on? |
Austrian company "easyname". I am running redis, varnish and PHP Opcache. (All three are provided by the hoster) |
redis and OPcache should be no issue, because they only cache objects or PHP code. varnish on the other hand is an output cache and could cause issues when not configured to differentiate by Maybe this link helps https://dustinrue.com/2023/09/wordpress-activitypub-and-cloudflare/ or this one #446 (reply in thread) |
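To illustrate the output-cache point, here is a minimal sketch. It assumes the request header that matters is `Accept`, as the follow-up comment about the Accept header suggests. The `origin` function and both cache classes are hypothetical stand-ins, not Varnish or plugin code.

```python
from typing import Callable, Dict, Tuple

AS2_TYPE = "application/activity+json"


def origin(url: str, accept: str) -> str:
    """Hypothetical origin server that content-negotiates on Accept."""
    if AS2_TYPE in accept:
        return '{"type": "Note"}'
    return "<html>post</html>"


class UrlOnlyCache:
    """Output cache keyed by URL alone: the first response wins for everyone."""

    def __init__(self, backend: Callable[[str, str], str]) -> None:
        self.backend = backend
        self.store: Dict[str, str] = {}

    def get(self, url: str, accept: str) -> str:
        if url not in self.store:
            self.store[url] = self.backend(url, accept)
        return self.store[url]


class AcceptAwareCache:
    """Output cache keyed by (URL, Accept): each representation is kept apart."""

    def __init__(self, backend: Callable[[str, str], str]) -> None:
        self.backend = backend
        self.store: Dict[Tuple[str, str], str] = {}

    def get(self, url: str, accept: str) -> str:
        key = (url, accept)
        if key not in self.store:
            self.store[key] = self.backend(url, accept)
        return self.store[key]


naive = UrlOnlyCache(origin)
naive.get("/post", AS2_TYPE)               # a fediverse server fetches first
browser_view = naive.get("/post", "text/html")  # the browser now gets JSON

aware = AcceptAwareCache(origin)
aware.get("/post", AS2_TYPE)
browser_view_ok = aware.get("/post", "text/html")  # the browser gets HTML
```

Keying on the raw `Accept` string is the simplest way to separate the two representations, at the cost of one cache entry per distinct header value.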
Hi,
Now (since I was able to reproduce it with the hints) it seems to be fixed, and different content is served based on the Accept header. |
I will close it then; please reopen if it is still an issue! |
Looks like this is hitting other users: https://mozilla.social/@wedistribute/111660443230760980 . More background: https://snarfed.org/2023-12-28_we-distribute-in-mozilla-social-lrhodes-rndanger-tchamb-mozilla-social We discussed a bit on #indieweb-dev. Consider adding |
Just noting that |
Interesting! I assume you mean |
Well, there is another problem with the |
You can read more about how this can happen, because requests from different browsers are not all standard in what they send: https://www.fastly.com/blog/best-practices-using-vary-header
My concern is that adding this to the plugin in a generic, blanket way risks taking an even greater toll on a user's web server and cache than they expected. This setting should only be an option, with sufficient warning to users and a caution to monitor the effects. |
Also, I understand that my specific quote is about the |
Definitely understood! You can include multiple headers in However, as far as I can tell, without One alternative would be to change the plugin to use different URLs as AP object ids, instead of reusing post URLs as is. That would avoid this problem more or less entirely. However, AP object ids can't easily be changed once they've been used in the wild, so realistically the plugin would need to preserve object ids that it's already published as is, which means that existing users would all still need |
@timnolte @snarfed what about adding the Vary header latest possible and only if we are sure it is an ActivityPub request? Maybe here: wordpress-activitypub/includes/functions.php Line 338 in 245cda8
Then we shouldn't get the issue of cache flooding on classic requests, but only on AP requests? Does that make sense? |
The problem is, I expect HTML responses are also getting cached and served to AP requests for AS2 JSON. If you only serve (And apologies if I got heated here! I guess I care about correctness and the AP ecosystem, but I know you all do too. In the end, of course, it's entirely up to you all what you choose to do!) |
I totally get the passion. As a support developer at my agency and a WordPress plugin developer I'm also thinking about the potential impacts and the wider aspects of just WordPress hosting. I spent a good amount of time myself digging into a working OpenLiteSpeed caching configuration that can target the AP JSON & HTML requests properly. My biggest concern is that, depending on the caching setup someone is using on their site, if they are on a more limited hosting plan a caching setup that inflates the cache exponentially could end up costing users a lot of money unexpectedly. That is the last thing I'd want to see happen to someone. If the choice was between potentially serving bad data to an end user or potentially costing a site owner extra money they didn't plan for I'd live with potentially serving bad data. |
FYI, I'm also not saying we should completely eliminate the idea of adding |
Good points! Thanks for the detailed explanation. Sounds like we may need an option for whether to serve |
@snarfed the thing is, that the To my Vary header proposal (apologies if this sounds naive, as the Vary concept is relatively new to me): If we don't add the header to HTML requests, they'll be cached as before. For AP JSON requests, we'd instruct the cache to use a different (special) bucket through the Vary header. Why might this not work? I assume the special bucket won't be used to serve requests without the Vary header, ensuring they still receive HTML responses. On AP JSON requests, the cache will check the designated AP bucket instead, right? |
You're right about internal WP caching plugins! You need The problem with only |
So you are saying that the external cache is only checking the headers if there is nothing in the cache? But how could the cache know where to look at? I would assume, that it at least has to check the Vary header for every request?!? But to maybe limit it otherwise: Could we add the So maybe adding it here: wordpress-activitypub/includes/functions.php Line 321 in 245cda8 |
When there's nothing in the cache, it passes the request through to your server to get a response to serve. When the cache has an HTML response with no More: https://www.fastly.com/blog/best-practices-using-vary-header#normalization
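A toy model of the behaviour described above, as a sketch only. The cache below consults the `Vary` value it stored with each response, so an HTML response stored without `Vary` matches every later request for that URL. `VaryCache` and both origin functions are hypothetical, and real caches differ in detail.

```python
from typing import Callable, Dict, List, Tuple

AS2_TYPE = "application/activity+json"
Response = Tuple[Dict[str, str], str]  # (response headers, body)


def origin_vary_on_json_only(url: str, req: Dict[str, str]) -> Response:
    """Hypothetical origin that sends Vary only on ActivityPub responses."""
    if AS2_TYPE in req.get("accept", ""):
        return {"Vary": "Accept"}, '{"type": "Note"}'
    return {}, "<html>post</html>"


def origin_vary_always(url: str, req: Dict[str, str]) -> Response:
    """Hypothetical origin that sends Vary: Accept on every response."""
    if AS2_TYPE in req.get("accept", ""):
        return {"Vary": "Accept"}, '{"type": "Note"}'
    return {"Vary": "Accept"}, "<html>post</html>"


class VaryCache:
    """Simplified shared cache that honours the Vary header of stored responses."""

    def __init__(self, origin: Callable[[str, Dict[str, str]], Response]) -> None:
        self.origin = origin
        # url -> list of (vary field names, request values at store time, body)
        self.variants: Dict[str, List[Tuple[List[str], Dict[str, str], str]]] = {}

    def get(self, url: str, request_headers: Dict[str, str]) -> str:
        req = {k.lower(): v for k, v in request_headers.items()}
        for names, stored, body in self.variants.get(url, []):
            # An empty Vary list matches every request for this URL.
            if all(req.get(n, "") == stored[n] for n in names):
                return body
        headers, body = self.origin(url, req)
        names = [n.strip().lower() for n in headers.get("Vary", "").split(",") if n.strip()]
        stored = {n: req.get(n, "") for n in names}
        self.variants.setdefault(url, []).append((names, stored, body))
        return body


partial = VaryCache(origin_vary_on_json_only)
partial.get("/post", {"Accept": "text/html"})          # HTML stored with no Vary
ap_from_partial = partial.get("/post", {"Accept": AS2_TYPE})  # served the HTML

full = VaryCache(origin_vary_always)
full.get("/post", {"Accept": "text/html"})             # HTML stored with Vary: Accept
ap_from_full = full.get("/post", {"Accept": AS2_TYPE})  # miss, fetched as JSON
```

In this model, the response that tells the cache to check `Accept` is the one already stored, which is why the header has to be on the HTML responses too.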
Definitely! I expect post URLs are the majority of most WordPress sites' URLs, so I don't know how much benefit this will have, but yes, good idea regardless. |
Btw that article also has the good idea to minimize cache bloat by normalizing headers like The catch is that your external cache(s) need support for that kind of programmatic normalization, which may not always be easy or available. |
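A minimal sketch of that kind of normalization, assuming the header being normalized is `Accept` and that the cache only needs to tell ActivityPub requests from everything else. `normalize_accept` is a hypothetical helper; how (or whether) it can be wired into a given cache depends on that cache.

```python
# Media types treated as ActivityPub requests in this sketch.
AS2_TYPES = {"application/activity+json", "application/ld+json"}


def normalize_accept(accept: str) -> str:
    """Collapse an arbitrary Accept header into one of two cache-key values."""
    for part in accept.split(","):
        # Drop parameters such as q-values or the ld+json profile.
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in AS2_TYPES:
            return "application/activity+json"
    return "text/html"


# Two differently worded browser headers collapse to the same value.
browser_a = normalize_accept("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
browser_b = normalize_accept("text/html, image/webp, */*;q=0.5")
fediverse = normalize_accept('application/ld+json; profile="https://www.w3.org/ns/activitystreams"')
```

With the header reduced to two possible values before the cache key is computed, varying on it costs at most two entries per URL rather than one per distinct browser header.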
🤦‍♂️ I always make the same mistake! The Vary header is set by the responder, not the requester!!! Sorry @snarfed and thanks for your patience! |
@capjamesg and I are seeing this intermittently right now on wedistribute.org, eg https://wedistribute.org/podcast/bridgyfed-ryan-barrett/ , cc @DeadSuperHero. @pfefferle do you want to maybe consider reopening this issue until you've added the |
@snarfed the |
The site, including Content Negotiation, works fine on my end (and it would not work when having a caching issue). |
So I feel like the real question here is whether this plugin is going to start providing detection and support for various caching and server configurations or if that needs to be implemented into the caching plugins/services. Trying to get something into this plugin may be a pretty large maintenance effort, in terms of having to change things as those plugins/services change. I have actually started working on implementing support for the ActivityPub plugin within the LiteSpeed Cache plugin as there are very specific |
@pfefferle thanks for reopening, and @timnolte thanks for looking into server config! Re |
So knowing that the LiteSpeed web servers need some specific caching configuration for ActivityPub I started thinking about Nginx caching configuration. I happened upon a Kinsta article that indicates they are sending the |
Interesting, nice sleuthing @timnolte! Reading the bottom of that nginx ticket, it looks like they actually fixed this back in May 2022, so nginx now does handle multiple Vary headers. (I also doubt the interpretation of |
Grr, stupid GitHub app commenting is garbage. Reposting as I was trying to delete the appearance of a duplicate comment. So to restate. The Nginx fix was to address a 43 character |
@snarfed also I don't see any issues with the Kinsta header settings. What should be tested is setting up Nginx caching with the |
Huh. Are we maybe looking at different nginx changes? I'm looking at https://trac.nginx.org/nginx/changeset/cd73509f21e2daa817bf5e9074d266277915c941/nginx , linked from https://trac.nginx.org/nginx/ticket/1423#comment:7 . The message for that changeset is Upstream: handling of multiple Vary headers (ticket #1423). Previously, only the last header value was used when caching. Looking at the code, it adds a loop over the It does tweak a usage of |
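The difference that changeset message describes can be sketched as follows. This is not nginx's code, only an illustration in Python of "last header only" versus "all headers combined". Both function names are hypothetical, and the example header values are assumptions.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def split_fields(value: str) -> List[str]:
    """Split one Vary header value into lowercase field names."""
    return [f.strip().lower() for f in value.split(",") if f.strip()]


def vary_fields_last_only(headers: List[Header]) -> List[str]:
    """Old behaviour per the changeset message: only the last Vary header counts."""
    last = ""
    for name, value in headers:
        if name.lower() == "vary":
            last = value
    return split_fields(last)


def vary_fields_combined(headers: List[Header]) -> List[str]:
    """New behaviour per the changeset message: every Vary header contributes."""
    fields: List[str] = []
    for name, value in headers:
        if name.lower() == "vary":
            for field in split_fields(value):
                if field not in fields:
                    fields.append(field)
    return fields


# Assumed example: one layer adds Vary: Accept, another adds Vary: Accept-Encoding.
response_headers = [
    ("Content-Type", "text/html"),
    ("Vary", "Accept"),
    ("Vary", "Accept-Encoding"),
]
old_key_fields = vary_fields_last_only(response_headers)
new_key_fields = vary_fields_combined(response_headers)
```

In this assumed ordering, the last-only reading drops `Accept` from the cache key entirely, which would reintroduce the HTML/JSON mix-up even though the header was sent.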
@snarfed OK, I think I was perhaps misreading "multiple header values" in terms of multiple values in the |
Quick summary
Sometimes, and for some users, WordPress returns a JSON object to a regular browser request instead of the normal website.
This (sometimes) persists on reload
Steps to reproduce
No idea 🤷
What you expected to happen
The normal website.
What actually happened
I see JSON in the browser
Impact
Some (< 50%)
Available workarounds?
No and the platform is unusable
Logs or notes
No response