Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quirks with revalidation on new and unpublished posts #97

Open
tddostu opened this issue Nov 23, 2022 · 2 comments
Open

Quirks with revalidation on new and unpublished posts #97

tddostu opened this issue Nov 23, 2022 · 2 comments
Labels

Comments

@tddostu
Copy link

tddostu commented Nov 23, 2022

Hello, me again from #93. Thanks @stipsan for sorting that so quickly! I thought I'd make a new issue as this seems slightly different.

Unforuntately I've found a couple of other quirks with revalidation 馃槵

New posts:

When adding a new post, sometimes, some pages do not get revalidated. The webhook appears to send the right requests on all attempts, but some don't seem to update.

In the video below:

  • I publish 'New Post 3', the webhook sends the correct routes, but it does not update the actual site.
  • I then publish 'New Post 04', the webhook sends the correct routes again, but the site shows only the previous 'New Post 3' update (I promise I'm hitting refresh on the far right multiple times after an update, it's just out of the capture area 馃檪).
  • Finally I publish 'New Post 05', the webhook sends the correct routes, all it all works as expected with all pages up to date.

A similar thing (not in the video) happens when I update an authors image. The revalidation routes sent are correct, but only some actaully update (some posts have the old author image, or even a different author if that was changed). The behaviour seems the same as the post issue, with only some pages actually get the new/updated content via on-demand ISR, despite all being requested to do so.

Sometimes however, it seems to all work. I can't quite figure out a pattern behind when it does.

Deleted/unpublished posts:

When deleting a post, everything seems to work fine, revalidation occurs and the post is removed. However if I instead unpublish it, it only requests revalidation for the homepage and the unpublished route, leaving the now unpublished post in the 'More stories' section of other posts.

Nextjs_Sanity-ISR_Issue02.mp4
@stipsan
Copy link
Member

stipsan commented Nov 24, 2022

Hi and thanks again for another wonderful report 鈽猴笍

TL;DR

There's a known race condition due to a mismatch in how GROQ webhooks are intended to be used, and the constraints of Vercel's on-demand incremental static regeneration.
We added a short-term fix for it in the next-sanity/webhook package, that you can enable by
adding true as the third argument in the parseBody function

const { body, isValidSignature } = await parseBody(
req,
process.env.SANITY_REVALIDATE_SECRET
)

We've made it enabled by default in the latest version of next-sanity, and this starter will be updated to that version soon.

Deep dive into what's happening, why, and how we'll long-term fix it

These quirks are the result of a race condition where the GROQ webhook is dispatched before the indexes have reached eventual consistency. We designed our webhooks to have the lowest possible latency. It's functionally similar to the visibility=async option in our client libs. So it waits until there's an index available that can resolve possible GROQ queries defined in the Filter and Projection before it http POST to your pages/api/revalidate. This only takes a few ms, while it can take up to a second for eventual consistency to happen on all the indexes.

That's why the subsequent queries in pages/api/revalidate can return stale data. And since Vercel's infra is also super low latency (it takes 200ms from you call res.revalidate and until getStaticProps is called, queries run, react renders and new static HTML file is generated).

This is the result of a fundamental mismatch in the constraints GROQ webhooks are designed to solve really well and the architecture of Vercel's on-demand incremental static regeneration.
The problem of "my content changed so my static site needs to rebuild & redeploy" was already solved with other tooling and ISR didn't exist yet. Back then the main grievance with webhooks where that they always seemed to both include too much data in the payload but also not enough so you had to run subsequent queries.
That's why GROQ webhooks don't have a visibility option. The "Filter" and "Projection" GROQ queries are supposed to be used instead of running subsequent queries in your webhook handler and are always consistent. And with no subsequent queries there's no need to wait for eventual consistency before firing the webhook.

That's why, while GROQ Webhooks are the best solution for implementing on-demand ISR that we have to offer, it's not a good solution. It might be tempting to just slap a "visibility" option on GROQ webhooks that lets you opt-in to only fire the webhook after eventual consistency is reached. But that would be like trying to improve a hammer's 馃敤 ability to drive in screws 馃敥. What we really want is a good screwdriver 馃獩, right?
So that's what we're working on. We're partners with Vercel and actively exploring how we can build a deep integration with their invalidation APIs. Automatic revalidation, of getStaticProps handlers, or the new React Server Components in Next 13, shouldn't be harder to setup than pressing a button in our Vercel Integration.

While we're working hard to build that we'll continue to keep next-sanity/webhook up to date with the recommendations from our Content Lake team. If it becomes necessary to increase the timeout to ensure eventual consistency for subqueries, or it turns out we can decrease it, we'll publish a new version.

Hope this helps, and if anything is unclear I'm happy to elaborate 馃槃

@tddostu
Copy link
Author

tddostu commented Nov 24, 2022

Hi @stipsan, thanks so much for the detailed answer and short-term fix! I'll be checking those out shortly.

How curious. When trying to troubleshoot, I had a sneaking suspicion that I might be hitting Sanity or Vercel too fast for the update to have propagated or been 'ready' perhaps. Somewhat relieved to hear I wasn't totally off the mark in this line of thought.

Glad to hear that it's being looked into by Sanity along with Vercel, I think ISR is a wonderful addition to both services. Combined with NextJS's preview abilities I've found it wonderful, clients love getting a full fidelity preview and having the change go live immediately.

Also spotted that you sorted out deleted/unpublished posts in #94, amazing! Thanks again for your quick and detailed replies, along with the excellent solutions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants