Skip to content
This repository has been archived by the owner on Aug 3, 2023. It is now read-only.

Bug: Downtime on Site update #783

Closed
ashleymichal opened this issue Oct 14, 2019 · 8 comments · Fixed by #1115
Closed

Bug: Downtime on Site update #783

ashleymichal opened this issue Oct 14, 2019 · 8 comments · Fixed by #1115
Assignees
Labels
bug Something isn't working regression Something is broken, but works in previous releases sites

Comments

@ashleymichal
Copy link
Contributor

ashleymichal commented Oct 14, 2019

When we publish a site worker, often the worker and its accompanying asset manifest make it to the edge ahead of the new KV keys, and/or the "old" KV keys are removed before the worker is updated. This causes a window of downtime when a page is updated. We need to somehow delay removing "old" assets from KV until everything is live. There are a few ways to implement this, will update this issue with possible strategies soon.

@ashleymichal ashleymichal added bug Something isn't working regression Something is broken, but works in previous releases design sites labels Oct 14, 2019
@ashleymichal ashleymichal changed the title Bug: Downtime on initial Site publish Bug: Downtime on Site update Oct 14, 2019
@ashleymichal
Copy link
Contributor Author

@dihmeetree
Copy link

dihmeetree commented Dec 9, 2019

Any updates on this? :) @ashleygwilliams @ashleymichal

@ashleymichal ashleymichal self-assigned this Dec 9, 2019
@ashleymichal
Copy link
Contributor Author

hey @lolcoolkat I'll likely be investigating this on Friday, and will comment here with my conclusions.

@dihmeetree
Copy link

dihmeetree commented Dec 11, 2019

Thank you! @ashleymichal

@ashleymichal
Copy link
Contributor Author

so here's what happens.

Say i have an asset /sweet-blog-post.html. The first time Wrangler uploads it, it hashes the content and inserts it between the file stem and the extension, e.g. /sweet-blog-post.abcdef1234567890.html, and adds the following entry to the asset manifest:

"/sweet-blog-post.html": "/sweet-blog-post.abcdef.html"

If I update my sweet blog post, the next time Wrangler goes to upload, the content hash changes (e.g. /sweet-blog-post.fedcba.html. Wrangler then compares the local content with the remote content, and deletes the old version, updating the asset manifest to reflect the changes.

Unfortunately, the old version is deleted before the new asset manifest is uploaded with the worker, and so there is a window during which the old worker is trying to reference the deleted asset. This is sub-optimal.

We can reduce this window significantly by delaying deletes until after the worker has been published.

@ashleymichal
Copy link
Contributor Author

ashleymichal commented Jan 3, 2020

i'm going to defer work on this until the work on Multi-Route support has been approved and merged, there's a lot of changes to the publish command there, and there will be here as well.

My proposed solution implementation:

  • wee refactor to separate logic for publishing a site vs publishing a vanilla workers project. this just helps with cognitive load.
  • rename kv::bucket::sync to kv::bucket::diff, modify to return tuple (exclude_keys, delete_keys), and have function publish_site call upload before worker publish, and delete after.
  • (bonus) refactor to keep bucket logic specific to sites; at the moment only sites have buckets, and i'm not sure we plan to change that; at the moment it's sort of confusing that we have a bucket config on site but also on all kv namespaces.

@ashleymichal
Copy link
Contributor Author

status update: i've got a good amount of work done locally, but the number of in flight PRs, releases, and feature work being done coupled with a CI bug has me a little nervous about going much further with this, given the number of potential merge conflicts. will resume work once 1.7.0 is out.

@AngusMorton
Copy link

AngusMorton commented Feb 19, 2020

@ashleymichal are there any updates for this? This issue is a massive blocker for us, so, if the fix is going to take time then maybe we need to get a smaller interim fix in (e.g. an option to never delete resources from the KV store?).

If I can contribute to getting this solved let me know.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working regression Something is broken, but works in previous releases sites
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@AngusMorton @ashleymichal @dihmeetree and others