Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wait for all resources to load before indicating Watcher is idle #16

Merged
merged 3 commits into from
Sep 14, 2023

Conversation

jwir3
Copy link
Contributor

@jwir3 jwir3 commented Sep 7, 2023

Issue: CAP-1082

What Changed

This adds a set of promises to the Watcher class that represent network requests made by the client prior to the archive being created. For each of the network requests, as long as it is not ignored (i.e. within the baseUrl), the promise will resolve when the network request resolves. Once all of these promises are resolved, OR when a 4s timer has expired, the Watcher will indicate it is idle.

How to test

You can run the tests using yarn test. A new test was added with a large image that takes an amount of time to load. Alternatively, run the test-archiver after having linked a project that can utilize it with yarn link.

Change Type

  • maintenance
  • documentation
  • patch - This should introduce no new behavior and instead should be backward compatible.
  • minor
  • major
📦 Published PR as canary version: 0.0.25--canary.16.982fd10.0

✨ Test out this PR locally via:

npm install @chromaui/test-archiver@0.0.25--canary.16.982fd10.0
# or 
yarn add @chromaui/test-archiver@0.0.25--canary.16.982fd10.0

@jwir3 jwir3 requested a review from tmeasday September 7, 2023 20:56
@linear
Copy link

linear bot commented Sep 7, 2023

CAP-1082 Wait for all resources to load

Right now we simply wait 1s after takeArchive is called.

Copy link
Member

@tmeasday tmeasday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you missed an important but hairy case -- when a request triggers a second request.

Also I think we need to figure out a way to test this (the timeouts and the cascading requests). Might be something useful in the capture tests for that.

// eslint-disable-next-line @typescript-eslint/no-implied-eval
let timeoutId: any;
const timeoutPromise = new Promise((r) => {
timeoutId = setTimeout(r, 4000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add the unresolved requests/promises to the log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that's a good idea. I'll add logging information for this.


await Promise.race([
timeoutPromise,
Promise.all([...this.unfulfilledRequests.values(), ...this.unsettledResolvers.values()]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a request creates further requests in resolving?

I think probably we need to do something more complicated, ie check at the end of each request if "all requests are resolved".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can add this logic. I had another concern, too, that Promise.all might resolve before any of the requests are actually added to the set. I suppose I could add another 500ms wait at the very beginning, but this seems like the incorrect solution.

Comment on lines 24 to 26
private unfulfilledRequests: Map<string, Promise<any>>;

private unsettledResolvers: Map<string, any>;
Copy link
Member

@tmeasday tmeasday Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these names could be more descriptive. Also why is the second one string=>any?

I'm not sure I quite see why they need to be separate lists either? But maybe I am missing something (if so can you put a comment explaining ;) )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first one holds Promises, but the second one holds functions that resolve Promises. They probably don't need to be separate lists - I'm not 100% sure why I separated them, other than I think I was experimenting with a different way of doing some of these. Honestly, I don't think that the unsettledResolvers really needs to exist. I'll refactor this a bit and see if I can coalesce them.

Comment on lines 100 to 103
let resolver: any = null;
const prom = new Promise((resolve, reject) => {
resolver = resolve;
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth pulling this out to a helper to make it clearer, maybe something like:

const complete = this.registerUnfulfilledRequest(request.url);

// later
complete();

Then the registerUnfulfilledRequest could look something like:

registerUnfulfilledRequest() {
  const complete = // something similar to what you have
 
  complete.then(() => {
     if (/* all requests are resolved */) {
       this.allRequestsDoneComplete(); // this triggers a "super" promise representing all requests
     }
   }

  return complete;
}

@jwir3 jwir3 force-pushed the jwir3/cap-1082-wait-for-all-resources-to-load branch from 6ef64af to 0517b98 Compare September 8, 2023 18:47
@jwir3
Copy link
Contributor Author

jwir3 commented Sep 8, 2023

Also I think we need to figure out a way to test this (the timeouts and the cascading requests). Might be something useful in the capture tests for that.

How would you recommend approaching the testing of this? I can look through the capture tests, which is a good idea, but my naive initial idea was to add something like this to the express server:

      app.get('/timeout', async (req, res) => {
        await setTimeout(() => {
          res.sendStatus(204);
        }, 7000);
      });

Then make a test similar to the following:

    // eslint-disable-next-line jest/expect-expect
    it('should timeout if a request takes too long', async () => {
      const timeoutUrl = `${baseUrl}/timeout`;
      const indexPath = `/?inject=${encodeURIComponent(`<img src="${timeoutUrl}">`)}`;

      const complete = await createResourceArchive(page);

      await page.goto(new URL(indexPath, baseUrl).toString());
      const archive = await complete();

      // expectArchiveContains(archive, [indexPath, '/img.png', '/style.css']);
    });

This doesn't achieve what I want, though, because the test itself times out, rather than the setTimeout promise rejecting. Perhaps I'm doing something wrong?

@codykaup or @jmhobbs thoughts on how I might be able to implement a test for timeouts?

@codykaup
Copy link
Contributor

It looks like we have some capture tests for this and we accomplish it by setting the timeout to basically 0 then run the test. Since all the requests take longer than our timeout, we get the error we want.

I'll admit, I'm not up-to-date on this project so maybe this is warranted but it seems like we're doing a lot of heavy lifting with promises while watching these requests. The spirit of our internal navigation watcher is much like Playwright's waitUntil argument. We're essentially keeping track of requests that start/end/fail and reset a timer each time something updates. As soon as don't see any updates for X amount of time or we hit our max timeout, we exit.

@tmeasday
Copy link
Member

tmeasday commented Sep 11, 2023

I think setting the timeout much lower in the test is probably the key unlock here. Otherwise your testing strategy sounds right to me @jwir3.

I'll admit, I'm not up-to-date on this project so maybe this is warranted but it seems like we're doing a lot of heavy lifting with promises while watching these requests. The spirit of our internal navigation watcher is much like Playwright's waitUntil argument. We're essentially keeping track of requests that start/end/fail and reset a timer each time something updates. As soon as don't see any updates for X amount of time or we hit our max timeout, we exit.

I think this a matter of personal preference and so I would definitely leave it up to @jwir3 or y'all.

But I would say the technique of await Promise.all(requestsThatHadStartedWhenIdleWasCalled) isn't going to work because there may be other promises added later. I guess that makes keeping a list of promises less useful, rather than just a list of booleans of "has this request finished?".

To my mind, a conceptually similar approach is a single "summary" promise of "are all requests done?" that we (maybe) resolve when each request is done. So something like:

wait() {
  await Promise.race([ timeoutPromise, this.allRequestsDonePromise ])
}

// In handler
this.client
  .send(method, params)
  .then((value) => {
     // Or simply remove from list
     this.pendingRequests.set(request.url, true);

     if (!Object.values(this.pendingRequests).any(done => !done)) {
       this.this.allRequestsDoneResolver()
     }
  });

@jwir3 jwir3 force-pushed the jwir3/cap-1082-wait-for-all-resources-to-load branch 2 times, most recently from 1a5e701 to 8820447 Compare September 13, 2023 16:14
@jwir3
Copy link
Contributor Author

jwir3 commented Sep 13, 2023

@tmeasday @tevanoff @skitterm

I think this is ready for review now. I'm actually using playwright's built-in functionality to wait for the network to be idle. This is similar to what we do in capture, but, for some reason, we weren't able to use playwright's functionality there. It's possible that the capture use case is so much more complex than this one that this is the reason, but I'm honestly not sure.

Note: The CI tests are still failing, so I'll clean that up today if you want to review the code. Fixed now.

@jwir3 jwir3 force-pushed the jwir3/cap-1082-wait-for-all-resources-to-load branch 2 times, most recently from fe63176 to 941a61b Compare September 13, 2023 16:34
This was previously aliased instead of being actually renamed. This renames
both the file in which the function is declared, as well as the function
itself.
This replaces the 1s wait inside of Watcher with waiting on a set of two
promises. The first is a global network timer. The global network timer
will reject if it expires and there are still network requests pending.

The second promise is a network idle promise that will resolve if
playwright detects that the network has not been used in some amount of
time.

Fixes CAP-1082.
This adds a jest test for when the archiver times out (i.e. the resources
requested do not load in the total time allotted), which, by default, is
2s.

Refs CAP-1082.
@jwir3 jwir3 force-pushed the jwir3/cap-1082-wait-for-all-resources-to-load branch from 941a61b to 982fd10 Compare September 13, 2023 16:35
Copy link
Member

@tmeasday tmeasday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh neat! Well that's simpler :)

I wonder if we can use that same thing in capture somehow. I guess we'll learn more in this project.

Copy link
Contributor

@tevanoff tevanoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks good to me!

@jwir3 jwir3 merged commit 3fbbf9a into main Sep 14, 2023
2 checks passed
@thafryer
Copy link
Member

🚀 PR was released in v0.0.25 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants