@karlseguin
Collaborator

  • Add 2 internal notifications: (1) http_request_start and (2) http_request_complete

  • When Network.enable CDP message is received, browser context registers for these 2 events (when Network.disable is called, it unregisters)

  • On http_request_start, CDP will emit a Network.requestWillBeSent message. This does not include all the fields, but what we have appears to be enough for puppeteer.waitForNetworkIdle.

  • On http_request_complete, CDP will emit a Network.responseReceived message. This does not include all the fields, but what we have appears to be enough for puppeteer.waitForNetworkIdle.
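
The flow in the bullets above can be modelled roughly as follows. This is a sketch in Python, not the project's code: `Notifier`, `BrowserContext`, `handle_command` and the payload keys (`id`, `url`, `method`, `status`) are hypothetical names chosen for illustration, and the emitted messages carry only a small subset of the fields Chrome sends.

```python
from typing import Callable, Dict, List

Callback = Callable[[dict], None]


class Notifier:
    """Hypothetical in-process notification hub (not the project's API)."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callback]] = {}

    def register(self, event: str, callback: Callback) -> None:
        self._listeners.setdefault(event, []).append(callback)

    def unregister(self, event: str, callback: Callback) -> None:
        callbacks = self._listeners.get(event, [])
        if callback in callbacks:
            callbacks.remove(callback)

    def dispatch(self, event: str, payload: dict) -> None:
        # Iterate over a copy so a callback may unregister while we loop.
        for callback in list(self._listeners.get(event, [])):
            callback(payload)


class BrowserContext:
    """Hypothetical CDP-side consumer of the two internal notifications."""

    def __init__(self, notifier: Notifier) -> None:
        self.notifier = notifier
        self.network_enabled = False
        self.sent: List[dict] = []  # stands in for messages written to the client

    def handle_command(self, method: str) -> None:
        if method == "Network.enable" and not self.network_enabled:
            self.notifier.register("http_request_start", self.on_request_start)
            self.notifier.register("http_request_complete", self.on_request_complete)
            self.network_enabled = True
        elif method == "Network.disable" and self.network_enabled:
            self.notifier.unregister("http_request_start", self.on_request_start)
            self.notifier.unregister("http_request_complete", self.on_request_complete)
            self.network_enabled = False

    def on_request_start(self, req: dict) -> None:
        # Assumed payload keys; only a subset of Chrome's fields is emitted.
        self.sent.append({
            "method": "Network.requestWillBeSent",
            "params": {
                "requestId": req["id"],
                "request": {"url": req["url"], "method": req["method"]},
            },
        })

    def on_request_complete(self, res: dict) -> None:
        self.sent.append({
            "method": "Network.responseReceived",
            "params": {
                "requestId": res["id"],
                "response": {"url": res["url"], "status": res["status"]},
            },
        })
```

Before `Network.enable` and after `Network.disable`, dispatched notifications reach no listener, so nothing is sent to the client.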

We currently don't emit any other new events, including any network-specific lifecycleEvent (e.g. Chrome emits networkIdle and networkAlmostIdle).
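
Since no networkIdle lifecycle event is emitted, a client has to infer idleness from the request/response pair itself. The sketch below is a simplified Python model of that bookkeeping, not Puppeteer's actual implementation (real clients typically also require the idle state to hold for some period). `InFlightTracker` is a hypothetical name.

```python
from typing import Set


class InFlightTracker:
    """Tracks request IDs that have started but not yet received a response."""

    def __init__(self) -> None:
        self._in_flight: Set[str] = set()

    def on_message(self, message: dict) -> None:
        method = message.get("method")
        request_id = message.get("params", {}).get("requestId")
        if request_id is None:
            return
        if method == "Network.requestWillBeSent":
            self._in_flight.add(request_id)
        elif method == "Network.responseReceived":
            # discard() tolerates a response whose request was never observed.
            self._in_flight.discard(request_id)

    @property
    def idle(self) -> bool:
        return not self._in_flight
```

Under this model the two events are sufficient: every request that starts is eventually removed again when its response arrives.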

To support this, the following other things were done:

  • CDP now has a notification_arena which is re-used between browser contexts. Normally, CDP code runs in response to a "cmd", which has its own message_arena, but these notifications happen out-of-band, so we needed a new arena that is valid only while a single notification is being handled.

  • HTTP Client is notification-aware. The SessionState no longer includes the *http.Client directly. It instead includes an http.RequestFactory which is the combination of the client + a specific configuration (i.e. *Notification). This ensures that all requests made from that factory have the same settings.

  • However, despite the above, some requests do not appear to emit CDP events, such as loading a <script src="X">. So the page still deals directly with the *http.Client.

  • Playwright and Puppeteer (but Playwright in particular) are very sensitive to event ordering. These new events have introduced additional sensitivity. The result sent to Page.navigate had to be moved to inside the navigate event handler, which meant passing some cdp-specific data (the input.id) into the NavigateOpts. This is the only way I found to keep both happy - the sequence of events is now closer to (but still pretty far from) what Chrome does.
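
The RequestFactory idea from the list above can be illustrated with a small Python sketch. It is written under assumptions and does not mirror the project's real types: `Client`, `RequestFactory`, the `notify` callback and the payload keys are hypothetical, and no real network I/O is performed. It shows a factory binding a client to one fixed configuration, while a caller that uses the client directly bypasses the notifications.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Optional

Notify = Callable[[str, dict], None]


class Client:
    """Hypothetical stand-in for the shared HTTP client; performs no real I/O."""

    def __init__(self) -> None:
        self._ids = count(1)

    def request(self, method: str, url: str, notify: Optional[Notify] = None) -> int:
        request_id = next(self._ids)
        if notify is not None:
            notify("http_request_start", {"id": request_id, "method": method, "url": url})
        status = 200  # placeholder for the real transfer
        if notify is not None:
            notify("http_request_complete", {"id": request_id, "url": url, "status": status})
        return status


@dataclass(frozen=True)
class RequestFactory:
    """Client plus fixed per-session configuration (here, just the notifier)."""

    client: Client
    notify: Optional[Notify] = None

    def request(self, method: str, url: str) -> int:
        # Every request made through the factory gets the same settings.
        return self.client.request(method, url, notify=self.notify)
```

Because the factory is frozen, callers cannot forget or vary the notification setting per request. Code that holds the bare client, as the page does for script loads in this sketch, produces no notifications at all.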

@karlseguin karlseguin marked this pull request as ready for review May 21, 2025 11:11
@karlseguin karlseguin force-pushed the http_request_notifications branch 2 times, most recently from f49d2fe to ea3319c Compare May 22, 2025 05:55
This might not be specific to network notifications, but the issue happens all
the time when testing scenarios that rely on network notifications, so it's hard
to ignore.
@karlseguin karlseguin force-pushed the http_request_notifications branch from ea3319c to f59e3cd Compare May 24, 2025 01:01
@karlseguin karlseguin merged commit 7cc332a into main May 24, 2025
9 checks passed
@karlseguin karlseguin deleted the http_request_notifications branch May 24, 2025 02:10
@github-actions github-actions bot locked and limited conversation to collaborators May 24, 2025