Fix NetworkFirst waitUntil #2744

Merged (5 commits) on Feb 17, 2021

Contributor

@joshkel joshkel commented Feb 4, 2021

NetworkFirst uses two promises: one for the network request and one for the timeout. If the network request succeeds, the timeout promise's timer is canceled, so that promise never settles. However, the strategy waited for both promises before treating the handler as done; as a result, whenever the network request succeeded, its handlerDone / awaitComplete promise never resolved.
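
The failure mode described above can be sketched in isolation. This is a minimal illustration, not the Workbox source: `cancellableTimeout`, `settlesWithin`, `fakeNetworkFirst`, and `demo` are hypothetical names, and waiting on `Promise.race()` is shown only as one way to avoid depending on the canceled timer.

```typescript
// Hypothetical stand-in for the strategy's timeout promise: it resolves only
// if the timer fires, so canceling the timer leaves it pending forever.
function cancellableTimeout(ms: number): { promise: Promise<string>; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const promise = new Promise<string>((resolve) => {
    timer = setTimeout(() => resolve("timed out"), ms);
  });
  return { promise, cancel: () => clearTimeout(timer) };
}

// Reports whether `p` settles (resolves or rejects) within `ms` milliseconds.
function settlesWithin(p: Promise<unknown>, ms: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), ms);
    const finish = () => {
      clearTimeout(timer);
      resolve(true);
    };
    p.then(finish, finish);
  });
}

// Simulates a network request that succeeds right away and cancels the timer.
function fakeNetworkFirst(): { waitOnBoth: Promise<unknown>; waitOnFirst: Promise<unknown> } {
  const timeout = cancellableTimeout(10000);
  const network = Promise.resolve("network response").then((response) => {
    timeout.cancel();
    return response;
  });
  return {
    // Depends on the canceled timeout promise, so it stays pending.
    waitOnBoth: Promise.all([network, timeout.promise]),
    // Settles as soon as either promise settles.
    waitOnFirst: Promise.race([network, timeout.promise]),
  };
}

async function demo(): Promise<{ waitOnBoth: boolean; waitOnFirst: boolean }> {
  const { waitOnBoth, waitOnFirst } = fakeNetworkFirst();
  return {
    waitOnBoth: await settlesWithin(waitOnBoth, 50),
    waitOnFirst: await settlesWithin(waitOnFirst, 50),
  };
}
```

In this sketch, `demo()` resolves with `waitOnBoth: false` and `waitOnFirst: true`: the combined promise is still pending after the network response has already arrived.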

Note: There are two potential issues with this PR:

  • I wasn't sure what to do about the "If Promise.race() resolved with null" case, when the NetworkFirst strategy falls back to waiting on the network request. Should the code ensure that handler.waitUntil(networkRequest) is called in the case of a network timeout + cache miss?
  • I updated a test to try and cover this case, but I couldn't figure out how to run the test suite to confirm my changes. (I found the test_server gulp task and navigated to http://localhost:3004/test/workbox-strategies/sw, but I get console errors, so I must be doing something wrong.)

R: @jeffposnick @philipwalton

Fixes #2721

NetworkFirst's design uses two promises (one for the network request, and one for the timeout). If the network request succeeds, then the timeout promise's timer is canceled. However, it attempted to wait until both promises were done before marking the event as done; this meant that, if the network request succeeded, then it never resolved its handlerDone / awaitComplete promise.

This fixes GoogleChrome#2721.
@jeffposnick
Contributor

Thanks a ton for investigating this and proposing a fix, @joshkel!

I'll take a look, but @philipwalton touched this code more recently, so it would be great if he could give a review as well.

(Regarding manual test running, instead of using http://localhost:3004/test/workbox-strategies/sw, it should be http://localhost:3004/test/workbox-strategies/sw/, with a trailing /, or else the service worker won't register properly. This always annoys me too, and we should fix the server to automatically redirect.)
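
One plausible reason the trailing slash matters is relative URL resolution. The sketch below assumes the test page registers its worker through a relative script path; the file name `sw.js` and the helper `resolveScript` are hypothetical and are not taken from the Workbox test setup.

```typescript
// Resolves a relative script path against the URL of the page that loads it,
// using the standard URL constructor.
function resolveScript(pageUrl: string, script: string = "sw.js"): string {
  return new URL(script, pageUrl).href;
}

// Without the trailing slash, the last path segment ("sw") is replaced.
const withoutSlash = resolveScript("http://localhost:3004/test/workbox-strategies/sw");
// With the trailing slash, the script resolves inside the "sw/" directory.
const withSlash = resolveScript("http://localhost:3004/test/workbox-strategies/sw/");

console.log(withoutSlash); // http://localhost:3004/test/workbox-strategies/sw.js
console.log(withSlash); // http://localhost:3004/test/workbox-strategies/sw/sw.js
```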

Member

@philipwalton philipwalton left a comment

Thanks @joshkel, and nice catch! I left a few comments in the test, but overall the approach LGTM.

});

await eventDoneWaiting(event);
await donePromise;
Member

I don't think you need both eventDoneWaiting() and donePromise (I expect them to resolve at the same time). Were you seeing a situation where that was not the case?

Contributor Author

In the scenario I was seeing, if the network request succeeded, then it canceled the timeout, and so donePromise never resolved. However, the handlePromise still resolved, so the bug (the unresolved promise) went unnoticed until I tried to upgrade the service worker; at that point, the promise kept the old service worker in the "running" state.

For this scenario ("should return the cached response if the network request times out"), both promises resolved even before my changes. But since the "successful completion before timeout" scenario involved a hard-to-notice bug in which only one of the two promises never resolved, it seemed worthwhile to add an explicit test for the "timeout" scenario, too.

I'm happy to delete this if you don't think it's worth keeping.

Member

The logic for donePromise is virtually identical to (actually, it was copy/pasted from) the logic for eventDoneWaiting(), so you shouldn't need both. I'd prefer to just use donePromise since it uses library code rather than a test helper.

request,
event,
});

await eventDoneWaiting(event);
await donePromise;
Member

I think you also want to test that the handler completes before 1000ms passes, right? I.e. from my read of your changes this test would still pass with the old logic.

Contributor Author

Prior to my changes, donePromise never resolved, instead of resolving after 1000ms.

But verifying that it completes quickly is a good addition; thank you. Please see the latest commit.

(I also changed the networkTimeoutSeconds from 1000 to 10; I thought that might avoid potential confusion between 1000 seconds and 1000 ms, and 10 seconds is still plenty to make the test serve its purpose of failing if the promise doesn't resolve.)
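
The distinction drawn in this exchange (a promise that never resolves versus one that resolves late) can both be caught by a deadline check. The following is a rough sketch of that idea, not the test added in this PR; `withDeadline`, `neverSettles`, and `check` are hypothetical names.

```typescript
// Rejects if `promise` has not settled within `ms` milliseconds; otherwise
// passes through its result or its rejection.
function withDeadline<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`not settled within ${ms}ms`)), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// A promise that never settles, standing in for the pre-fix donePromise.
const neverSettles = new Promise<void>(() => {});

// Runs the deadline check against a prompt promise and a stuck one.
async function check(): Promise<[string, string]> {
  const toVerdict = (p: Promise<unknown>) => p.then(() => "passed", () => "failed");
  const quick = await toVerdict(withDeadline(Promise.resolve("ok"), 50));
  const stuck = await toVerdict(withDeadline(neverSettles, 50));
  return [quick, stuck];
}
```

Here `check()` resolves with `["passed", "failed"]`: a bare `await` on the stuck promise would hang the test, whereas the deadline turns it into an explicit failure.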

Member

Same comment as above, just remove the call to eventDoneWaiting().

Let me know if for some reason the tests don't pass with that change (it shouldn't have any effect).

@jeffposnick jeffposnick self-requested a review February 9, 2021 20:50
Contributor

@jeffposnick jeffposnick left a comment

Thanks so much for digging into this, @joshkel!

To the point you made in your PR description: yes, I think that technically you should add to line 127:

handler.waitUntil(networkPromise); // add this
response = await networkPromise;

in the case where there's a cache miss and you need to fall back to the network.

Improve typings for waitUntil; since the revised logic uses a callback, it can't rely on type inference as much.
@joshkel
Contributor Author

joshkel commented Feb 15, 2021

Thanks, @jeffposnick. I don't think that adding

handler.waitUntil(networkPromise);

is the best solution, because that partially undoes the timeout - if the network is slow, the initial response can still come through when the timeout expires, but the handler's waitUntil / doneWaiting won't resolve until the network request resolves.
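
The ordering concern here can be sketched with a simplified stand-in for the handler. `MiniHandler`, `delay`, and `slowNetworkOrder` are hypothetical; they only mimic the waitUntil / doneWaiting idea and are not the Workbox implementation. The 20ms timer plays the role of the timeout that yields a cached response.

```typescript
// Simplified stand-in: collects promises and waits for all of them to settle.
class MiniHandler {
  private pending: Promise<unknown>[] = [];

  waitUntil<T>(promise: Promise<T>): Promise<T> {
    this.pending.push(promise);
    return promise;
  }

  async doneWaiting(): Promise<void> {
    let next: Promise<unknown> | undefined;
    while ((next = this.pending.shift())) {
      try {
        await next;
      } catch {
        // Rejections are ignored; this sketch only tracks completion.
      }
    }
  }
}

// Resolves with `value` after `ms` milliseconds.
function delay<T>(ms: number, value: T): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(value), ms));
}

async function slowNetworkOrder(): Promise<string[]> {
  const log: string[] = [];
  const handler = new MiniHandler();

  // Slow network (60ms) versus a timeout (20ms) that falls back to the cache.
  const networkPromise = delay(60, "network response").then((response) => {
    log.push("network settled");
    return response;
  });
  const timeoutPromise = delay(20, "cached response");

  handler.waitUntil(networkPromise); // the suggestion under discussion
  const response = await Promise.race([networkPromise, timeoutPromise]);
  log.push(`responded with ${response}`);

  await handler.doneWaiting();
  log.push("done waiting");
  return log;
}
```

In this sketch the log reads "responded with cached response", then "network settled", then "done waiting": the response is returned at the timeout, but completion is still tied to the slow network request.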

I pushed an alternate approach; please let me know what you think. Thanks.

@jeffposnick
Contributor

Thanks again for your continued work on getting this straight! This latest update is 👍 from me, but I would appreciate a final look from @philipwalton before we merge.

Follow coding style; remove eventDoneWaiting calls, since that duplicates donePromise.
@joshkel
Contributor Author

joshkel commented Feb 16, 2021

Done. Thanks for the reviews, @jeffposnick and @philipwalton.

Member

@philipwalton philipwalton left a comment

LGTM, thanks again for catching this!

This was referenced Mar 16, 2021
@joshkel joshkel deleted the networkfirst-waituntil branch February 1, 2023 18:56
Successfully merging this pull request may close these issues.

Service worker stuck in busy state with NetworkFirst requests