Define session creation without HTTP driver #97

jgraham · 2021-03-30T14:12:24Z

Some implementations want to support connecting without requiring an initial HTTP request. This is required for feature parity with CDP-based clients, e.g. Puppeteer, that are able to establish a connection to the browser directly without going through a separate driver binary.

The following discussion will assume a WebSockets based transport, but the same issues would apply to an implementation that wanted to allow connections over e.g. a unix pipe without the initial HTTP handshake.

Currently, once a session is created, the HTTP layer returns a websocket url of the form ws://localhost:<port>/session/<sessionid> for the client to connect to. Since this requires the session id to be known, it's clear that this doesn't work well for establishing a session directly over WebSockets. An obvious implementation would be to allow connecting to ws://localhost:<port>/session and to define a command like

SessionNewCommand = {
  method: "session.new",
  params: SessionNewParameters
}

SessionNewParameters = {
  ? alwaysMatch: Capabilities,
  ? firstMatch: [*Capabilities],
}

Capabilities = {
  *text: any
}

Then, if you send this command when there's no existing session, it would create a session and return a response with the matched capabilities; otherwise it would error.
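As a rough illustration of that behaviour, here is a minimal Python sketch of a remote end that handles session.new. It is not taken from the spec or any implementation: the `SessionEndpoint` class, the `id` field on the message, the response shape, and the simplified capability matching (merging alwaysMatch with the first firstMatch entry) are all assumptions made for the example.

```python
import json
import uuid


class SessionEndpoint:
    """Toy remote end that holds at most one session (an assumption)."""

    def __init__(self):
        self.session_id = None

    def handle(self, raw_message: str) -> dict:
        message = json.loads(raw_message)
        command_id = message.get("id")  # hypothetical command id field
        if message.get("method") != "session.new":
            return {"id": command_id, "error": "unknown command"}
        if self.session_id is not None:
            # A session already exists, so session.new errors.
            return {"id": command_id, "error": "session not created"}
        params = message.get("params", {})
        first_match = params.get("firstMatch", [{}])
        # Simplified matching: alwaysMatch merged with the first firstMatch entry.
        matched = {
            **params.get("alwaysMatch", {}),
            **(first_match[0] if first_match else {}),
        }
        self.session_id = str(uuid.uuid4())
        return {
            "id": command_id,
            "result": {"sessionId": self.session_id, "capabilities": matched},
        }
```

Sending session.new twice to the same `SessionEndpoint` would yield a result the first time and an error the second time.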

One question is whether the session itself should reuse the connection; it would be the "wrong" resource since the session id wouldn't be in the path. But I don't immediately see a practical problem with reusing it in this case, and the alternative would require the client to establish a new connection which adds latency.

One wart might be if you want to allow reconnecting to the session once it has dropped; in this case it might be necessary to connect to the URL including the session id (to account for nodes accepting multiple sessions). That is also probably sufficient reason not to change the spec to make /session the only supported ws resource; in a node that supports multiple sessions that would make it hard to work out which session to reconnect to.

Another question with this setup is how to communicate the ws port to the local end. This is analogous to the problem of how to communicate the HTTP server address to local ends, and is usually solved either by putting the local end in control and allowing the client to select the address through remote-specific options (with some risk of races) or by communicating the address back through stdout of the client.
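For the stdout option, a local end could scan the launched process's output for a WebSocket address. The sketch below only illustrates that idea; the `find_ws_address` helper and the assumption that the address appears somewhere in a line as a plain ws:// or wss:// URL are hypothetical and not defined by any spec.

```python
import re
from typing import Iterable, Optional

# Assumption: the address is printed as a bare ws:// or wss:// URL.
WS_URL = re.compile(r"wss?://\S+")


def find_ws_address(lines: Iterable[str]) -> Optional[str]:
    """Return the first WebSocket URL found in the given output lines."""
    for line in lines:
        match = WS_URL.search(line)
        if match:
            return match.group(0)
    return None
```

In practice `lines` could be the `stdout` of a `subprocess.Popen(..., stdout=subprocess.PIPE, text=True)` object, read until the address shows up.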


whimboo · 2021-04-08T08:37:32Z

> Some implementations want to support connecting without requiring an initial HTTP request. This is required for feature parity with CDP-based clients, e.g. Puppeteer, that are able to establish a connection to the browser directly without going through a separate driver binary.

> The following discussion will assume a WebSockets based transport, but the same issues would apply to an implementation that wanted to allow connections over e.g. a unix pipe without the initial HTTP handshake.

When reading the design document from Google on how the BiDi handler will be implemented, there seems to be a strong position to use pipes only instead of a websocket connection. I think it would be good to get some feedback from Google and Microsoft here. Maybe @foolip and @bwalderman could give some insights? For our current CDP implementation in Firefox we do not use pipes, so we would have to add support for them, which might require additional platform work.

> One question is whether the session itself should reuse the connection; it would be the "wrong" resource since the session id wouldn't be in the path. But I don't immediately see a practical problem with reusing it in this case, and the alternative would require the client to establish a new connection which adds latency.

The connection that was created for the client will remain and could automatically be attached to the WebDriver session created via the /session endpoint. A current PoC of mine for Firefox works fine that way.

> One wart might be if you want to allow reconnecting to the session once it has dropped; in this case it might be necessary to connect to the URL including the session id (to account for nodes accepting multiple sessions). That is also probably sufficient reason not to change the spec to make /session the only supported ws resource; in a node that supports multiple sessions that would make it hard to work out which session to reconnect to.

Yes, if a client wants to reconnect to a WebDriver session initiated this way, it would have to know the session id. Given that with the former session creation the session has a listener running on /session/%uuid%, the new connection attempt could be correctly assigned to the existing session. Using an unknown session id should fail with an invalid session id error. Trying to connect to /session again should also fail, as already described above.
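Those connection rules can be sketched as a small routing function. This is only an illustration under assumptions: the `route_connection` name, the returned strings, and the idea that the remote end supports a single session (so a bare /session connection is refused once a session exists) are choices made for the example, not spec text.

```python
import re

# Matches /session and /session/<id>; anything else is an unknown resource.
SESSION_PATH = re.compile(r"^/session(?:/(?P<session_id>[^/]+))?$")


def route_connection(path: str, sessions: set[str]) -> str:
    """Decide what to do with a new WebSocket connection for `path`."""
    match = SESSION_PATH.match(path)
    if match is None:
        return "reject: unknown resource"
    session_id = match.group("session_id")
    if session_id is None:
        # Bare /session: only usable for session.new while no session exists.
        if sessions:
            return "reject: session not created"
        return "accept: await session.new"
    if session_id in sessions:
        # Reconnect: attach the new connection to the existing session.
        return f"accept: attach to {session_id}"
    return "reject: invalid session id"
```

A node that accepts multiple sessions would presumably keep accepting bare /session connections, which is the case the single-session assumption here does not cover.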

> Another question with this setup is how to communicate the ws port to the local end. This is analogous to the problem of how to communicate the HTTP server address to local ends, and is usually solved either by putting the local end in control and allowing the client to select the address through remote-specific options (with some risk of races) or by communicating the address back through stdout of the client.

I think this also depends on the above question regarding the use of pipes. Without them, the address might still have to be printed to stderr / stdout.

jgraham · 2021-04-08T08:48:35Z

I don't think the websockets-vs-pipes thing makes a big difference here. There will need to be an out-of-band way to communicate the entrypoint for communication irrespective of whether that's a file handle, a ws address or something else. The same problem exists for the HTTP drivers; you have to communicate the address of the HTTP server out of band.

whimboo · 2021-04-08T09:02:05Z

I had a look at the Puppeteer source code, and as of right now they have both the WebSocket and pipe connections implemented, although the pipe isn't used by default.

https://github.com/puppeteer/puppeteer/blob/943477cc1eb4b129870142873b3554737d5ef252/src/node/LaunchOptions.ts#L99-L103

Also CC'ing @mathiasbynens.

bwalderman · 2021-04-09T05:21:15Z

The Chromium BiDi handler design document proposes using pipes but it could just as easily be implemented over websockets, or something else. The transport mechanism isn't particularly important. One thing I do want to call out though is that the Chromium in-browser BiDi handler in this design wouldn't support the notion of multiple sessions.

One of the benefits of establishing a direct connection to the browser, as previously mentioned, is that clients would not need an intermediate driver binary. However, in Chromium at least, for this idea to work, the browser process first needs to be launched out-of-band, and the connection would implicitly be associated with a single session. In Chromium, there is a 1-to-1 relationship between a WebDriver session and a browser process. So in Chromium, to support the creation of multiple sessions over some kind of connection (whether that's a websocket or a unix pipe), the thing servicing the connection would have to be something apart from the browser process(es) which can launch the browser processes, and associate them with session IDs. Which essentially means we're back to having a driver binary.

So the BiDi handler design leaves session management up to its client (i.e. ChromeDriver, or Puppeteer). The client establishes a direct pipe connection to a browser process and all traffic going over that connection implicitly belongs to a single WebDriver session. This design is not being proposed as a replacement for establishing a websocket connection as defined in the spec. It's meant more as an internal implementation detail for ChromeDriver and Puppeteer. ChromeDriver would still maintain a mapping of session IDs to browser connections and would serve a spec-compliant websocket resource for each session so that ChromeDriver users can establish a WebDriver session through HTTP and then connect to a websocket as described in the spec.
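To make the driver-side bookkeeping concrete, here is a minimal sketch of such a mapping. It does not reflect ChromeDriver's actual code: `BrowserConnection` is a stand-in for a pipe to one browser process, and `DriverSessions` with its `create` and `forward` methods is a hypothetical registry invented for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class BrowserConnection:
    """Stand-in for a pipe to a single browser process (hypothetical)."""
    label: str
    sent: list = field(default_factory=list)

    def send(self, message: str) -> None:
        # A real connection would write to the pipe; here we just record it.
        self.sent.append(message)


class DriverSessions:
    """Maps session ids to browser connections, one connection per session."""

    def __init__(self):
        self._by_session = {}

    def create(self, connection: BrowserConnection):
        session_id = str(uuid.uuid4())
        self._by_session[session_id] = connection
        # Path of the per-session websocket resource the driver would serve.
        return session_id, f"/session/{session_id}"

    def forward(self, session_id: str, message: str) -> None:
        connection = self._by_session.get(session_id)
        if connection is None:
            raise KeyError("invalid session id")
        connection.send(message)
```

Traffic arriving on /session/<sessionid> would be passed to `forward`, so each browser process only ever sees messages for its own session.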

whimboo · 2021-10-06T15:33:07Z

This actually landed via PR #99.

foolip mentioned this issue May 3, 2021

Specify closing a session #104

Open

foolip mentioned this issue May 13, 2021

Consider whether to allow multiple sessions in browsers #103

Open

sadym-chromium mentioned this issue Jun 24, 2021

Add sessionId and reconnection GoogleChromeLabs/chromium-bidi#6

Open

whimboo mentioned this issue Sep 13, 2021

Add the ability to start a session directly with the bidi connection #99

Merged

whimboo closed this as completed Oct 6, 2021

whimboo added the enhancement (New feature or request) and module-session (Session module) labels Oct 6, 2021
