Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce and encourage use of an interface that is more easily testable #1159

Closed
exarkun opened this issue Mar 27, 2019 · 23 comments
Closed

Comments

@exarkun
Copy link
Contributor

exarkun commented Mar 27, 2019

I've had some difficulty writing good unit tests for WebSocketClientProtocol-based implementations. The tools available seem to encourage one of two paths:

  • Directly instantiate the relevant objects and then call methods on them as though they were in use by the real Autobahn WebSocket implementation.
  • Set up real network servers and clients that interact over real sockets so that the real Autobahn WebSocket implementation is driving them.

The first of these options has the drawback that there's no guarantee the tests will actually drive the application code correctly (and even if it is initially correct the chances of it diverging from reality increase as Autobahn is maintained and changed).

The second of these options has the drawback that it is prone to many different kinds of spurious failure (most code that takes this approach doesn't even manage to get the part where you start listening on a port correct 😢 ). It also draws in a lot more code and so failures can become harder to spot. Failures can also end up being caused by other code that's not the main focus of the test. It's also a lot slower/more expensive as a result of running so much more code.

I think it would be very useful if Autobahn itself offered and encouraged the use of a higher-level interface that allows details about connections/sockets/networks to be contained in an object that can easily be substituted for a testing fake. A model for this which appears to be working well is the regular HTTP client interface in Twisted, twisted.web.client.IAgent. This has one primary network-enabled implementation in Twisted, twisted.web.client.Agent and the third-party(-ish) treq library provides a test-focused implementation, treq.testing.RequestTraversalAgent.

Application code written against IAgent can accept an implementation and do what it needs to do. Real-world use of this application code passes in an Agent instance (or possibly a wrapper but that's not directly relevant here). Test use of this same application code passes in a correctly initialized RequestTraversalAgent. The application code doesn't know or care which implementation it gets. Agent goes off and makes real requests over a real network. RequestTraversalAgent decomposes the request and serves up a response to it using a twisted.web.resource.IResource provider - with no network, only Python method calls and correct copying of bytes from one place to another in memory.

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Okay, this sounds good. I agree with your "drawbacks" at the top. I also agree that while there are unit-tests in Autobahn they tend to be of the second variety (and thus lean on Mock a lot) -- it would be great to improve these.

There also isn't a great story for users of Autobahn: how should they test their websocket-using application?

So, I'm certainly curious to see what a better approach looks like. What would an IAgent-type approach for Autobahn look like? There's one method in IAgent: request(method, uri, headers=None) so I'm guessing we'd want something kind-of similar but maybe connect(uri, ...) since there's no "method" in a WebSocket connection request?

@exarkun
Copy link
Contributor Author

exarkun commented Mar 27, 2019

There's one method in IAgent: request(method, uri, headers=None) so I'm guessing we'd want something kind-of similar but maybe connect(uri, ...) since there's no "method" in a WebSocket connection request?

I think that's right... I had imagine an "open" which I bet does exactly what you are imagining for "connect". I am a rank WebSocket newb though, so input from folks who are more familiar with its capabilities would be really great.

I think ws_agent.open(...) would return a Deferred (or async-thingy-of-your-choice) that fires with ... the connected protocol instance? Or maybe the post-handshaked instance? I'm not sure, really. I've only thought about the connection setup utility/consequences here so far. Assuming you actually got a connection set up, then what would you want to do? This kind of API could be an opportunity to continue to move away from subclass-based APIs. If the returned object had more methods for interacting with the connection there'd be a lot less motivation to write WebSocketClientProtocol subclasses. Rather, you could write functions that just operate on such instances, calling methods and so forth.

@exarkun
Copy link
Contributor Author

exarkun commented Mar 27, 2019

Also, yea, I guess I should be more clear about my perspective: I am thinking about all of this as a developer writing an application with Autobahn. For development of Autobahn I think you would certainly want and need more testing strategies than just this one.

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

I think ws_agent.open(...) would return a Deferred (or async-thingy-of-your-choice) that fires with ... the connected protocol instance? Or maybe the post-handshaked instance? I'm not sure, really. I've only thought about the connection setup utility/consequences here so far.

Yes, that sounds approximately correct. The way Autobahn is split up is: there's "framework agnostic" code, where we try to make most stuff live and "framework specific" code (twisted vs asyncio), the details of which are MOSTLY hidden behind txaio. Basically this means most "async code" in Autobahn has to use txaio.* APIs (but e.g. something in autobahn.twisted.* can use twisted-specific APIs).

This kind of API could be an opportunity to continue to move away from subclass-based APIs. If the returned object had more methods for interacting with the connection there'd be a lot less motivation to write WebSocketClientProtocol subclasses. Rather, you could write functions that just operate on such instances, calling methods and so forth.

Yes, so this is very interesting to me. We HAVE gone down this path with the WAMP stuff, taking a "listener" approach to many of these things. There's also a decorator-based approach behind Component which is our high-level WAMP API (and a run() method that acts like react() to deal with logging etc to start+stop any number of Component instances.

Bringing a similar approach down to WebSocket too isn't conceptually that hard at this point. This would look like: replace a method like onConnect with a hook you can attach via a listener approach (e.g. on('connect', my_callback)). Where this might get a little less-clear is for return values: if we adopt the approach of "onConnect can return options, like headers" then what do you do if multiple hooks are registered but return different things?

...but that aside, I do like the idea of (at some point) extending this approach. In WAMP-land, the important thing is the Session -- in WebSocket land the important thing is a Protocol instance. So that would point to IWebSocketAgent.open() returning a future that fires with a WebSocketClientProtocol instance. This instance would have an on('message', ...) hook where you can hang any number of "things which respond to an incoming message". Probably also an on('close', ..) but I think that's all we'd need here?

For the "options we might want to give to connect" I think those would have to be part of the agent or .open call, unless we fire the open future "really early" (before we've sent headers, that is) so you'd have a chance to connect an on('connect', ..) "listener" (but at that point, it's arguably not really a listener anymore ... i.e. the above issues).

So, the life-cycle would look like this:

  • create an IWebSocketAgent instance
  • .open() on it gives a future
  • that future fires with an already-connected WebSocketClientProtocol instance
  • OR that future errbacks with some connection error
  • .on("message", ..) to attach your message-handler (edge-case: what happens if the server sends a message "right away on connect"?)
  • `.on("close", ..) to attach a cleanup/close handler

If we mimicked the Component decorator API, then maybe the IWebSocketAgent is where you "attach" handlers, so it has .open() but also .on_message and .on_close (which are decorators that deal with adding that function into the protocol instance).

This might look something like this, then:

async def main(reactor):
    agent = autobahn.twisted.websocket.create_agent(reactor, ...)

    @agent.on_message
    def print_it(message):
        print(f"got: {message}")

    @agent.on_close
    def done(reason):
        print(f"websocket closed: {reason}")

    client = await agent.connect(...)
    print(f"connected: {client}")
    # ...

Needs some more thought probably ;)

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Also, yea, I guess I should be more clear about my perspective: I am thinking about all of this as a developer writing an application with Autobahn. For development of Autobahn I think you would certainly want and need more testing strategies than just this one.

Yes, absolutely. However, I think if we have a good story for users of Autobahn, we can improve our own unit-tests using similar machinery :)

@oberstet
Copy link
Contributor

oberstet commented Mar 27, 2019

since there's no "method" in a WebSocket connection request?

yep, a WebSocket opening handshake always begins with a HTTP/GET and Upgrade header - POST/PUT/DELETE/WHATEVER is not valid.

also: the requested URL in the HTTP/GET can contain query parameters, but must not contain a fragment identifier (AB server is checking that)

for completeness: the WebSocket spec in principle allows a client to talk regular HTTP (all methods) first, and then (without closing the connection) can once upgrade to WebSocket - but cannot go back to the former (doing regular requests). Autobahn does not implement that ... and I am not aware of any library or server that does. But the spec has it. Not sure if it's in the text explicitly, but it was discuess on the IETF WG - and I could dig out the posts .. though I'd prefer not to, since the WG mailing list history is huge;)


rgd unit testing: I definitely agree that the unit tests (at the websocket level) in AB could be improved, and I am curious to learn here.

rgd functional testing at the websocket protocol level: guess it's no news that Autobahn pioneered the so-called "Autobahn testsuite" which has >600 fuzzing tests - but this is different from unit tests. It tests blackbox via network and is trying to fuzz the testee with various invalid stuff.

https://github.com/crossbario/autobahn-testsuite/
https://github.com/crossbario/autobahn-python/tree/master/wstest

the testsuite itself is using a version pinned AB and twisted

we don't currently have that fuzzing tests automated in CI here in AB - which we should, but never found time (we run it "occasionally sometimes" .. which isn't the right way of course)

another "defect": the testsuite has over 600 tests, that really are I would claim quite exhaustive rgd coverage of websocket protocol spec compliance - BUT: the tests are only for anything after the initial websocket opening handshake has finished (we never had time to add tests for that .. and that would probably been a rabbit hole, since eg testing only the HTTP request line in the same rigorosity as the testsuite does for the actual websocket protocol (after the opening HS) would be quite some work. HTTP is surprisingly complex when magnified to the finest detail level ..

@exarkun
Copy link
Contributor Author

exarkun commented Mar 27, 2019

Bringing a similar approach down to WebSocket too isn't conceptually that hard at this point. This would look like: replace a method like onConnect with a hook you can attach via a listener approach (e.g. on('connect', my_callback)). Where this might get a little less-clear is for return values: if we adopt the approach of "onConnect can return options, like headers" then what do you do if multiple hooks are registered but return different things?

A possible answer to this less-clearness could be that the .open(...) future evaluates to a non-handshaken-yet-websocket which has a method, say, handshake that accepts the handshake params and returns another future that evaluates to the ready-to-pass-messages websocket. This excludes the possibility of multiple listeners for the "connect" event without the application building that multiplexing on top, and in doing so finding its own answer to the question of "what if different things want different stuff".

I'm not super clear on when the current onConnect fires so I'm not sure this idea actually makes sense. If onConnect corresponds to the underlying transport connection, though, then I think it might.

I wonder about the other one-off events not just being futures. What's the reasoning behind proto.on("close", callback) vs proto.on_close.addCallback(callback) (or whatever the future API is)?

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Hrmm, I might have been getting confused by server vs client code. It looks like the client onConnect fires when the handshake is completed. In the server code, it fires after the transport opens but before the handshake is completed (because it can return headers).

I do think the most-clear approach is to pass in any "options needed for the handshake" instead of "something gets them via a callback, at some point". I'm not sure if we need two methods, though? Could we just pass in connections options to open ...? (I haven't traced through the code for that approach, but since all the options are on Factory right now and that's given to the connect I don't see why it can't work).

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

I wonder about the other one-off events not just being futures. What's the reasoning behind proto.on("close", callback) vs proto.on_close.addCallback(callback) (or whatever the future API is)?

Well, you'd at least need proto.when_closed() -> Deferred or similar because there might be multiple listeners (and if there's just one Deferred, "what arg you get" would depend on "which callback got added first"). That said for strictly one-time notifications, that could certainly work, I think? Part of the reason is to keep the API the same and for some of them you really do want a callback approach because it'll happen multiple times (e.g. on-message). I do remember discussing "when_*" but don't recall all the reasons :/

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Oh, I remember another reason: if you want to be able to use the API from Agent (like with Component) and you'll be doing re-connections, then those "one-time notifications" are really "one time per connection". So this means you can do: agent.on('close', ...) or protocol.on('close', ..) and they have the same API. Maybe this isn't a great reason in the websocket case...

@exarkun
Copy link
Contributor Author

exarkun commented Mar 27, 2019

Part of the reason is to keep the API the same and for some of them you really do want a callback approach because it'll happen multiple times (e.g. on-message). I do remember discussing "when_*" but don't recall all the reasons :/

Maybe different APIs for basically different things (one-time vs n-time) is okay? Also, a direction to explore for message delivery might be tubes (serious? or not? I don't know. 😄) Anyhow, I don't feel too strongly about this. I think anything that gives you composability without subclassing is already going to be a huge step forward. Also, I guess it's all wandering a little far from the original topic.

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

So, I think I'm proposing: IWebSocketClientAgent.open() takes something that contains all the options (or, maybe, has kwargs that are the options?). This object would be very much like how Factory is already (ab)used, right?

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Also, I guess it's all wandering a little far from the original topic.

Well, I think if we're proposing "a new thing that sets up websocket connections" this is all relevant. We don't have to implement it all right now ;)

I don't see Autobahn growing tubes as a dependency, but it should be possible to use these APIs to construct an "adaptor to tubes"...? (If not, maybe these APIs aren't ideal).

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Maybe different APIs for basically different things (one-time vs n-time) is okay?

Okay, so lets see. If you want to run some code "when the websocket closes", one re-factoring of the above example would look like this:

async def main(reactor):
    agent = autobahn.twisted.websocket.create_agent(reactor, ...)

    @agent.on_message
    def print_it(message):
        print(f"got: {message}")

    client = await agent.open(reactor, connection_args, some_options_object, ...)
    print(f"opened: {client}")
    await client.when_connected()
    # question: is this even worth it? seems like making open() only fire when
    # the connection is successfully established is better
    print("connected")
    await client.when_closed()
    print(f"websocket closed: {reason}")
    # ...

However, that example doesn't include re-connections. If this "agent" just doesn't handle re-connections (i.e. something higher-level does) then the above looks pretty good. It might make sense, then, to have a when_connected() or similar too, to mirror when_closed (edit: added to the example above). Actually, I think that doesn't make sense now: open/close are mirrors and you don't need a when_opened because you called agent.open() already.

A "higher-level thing that does re-connections" could provide a listener-based API instead (because "connected" and "closed" would happen multiple times in that case). I guess the question: is it an advantage to have the same API in the "higher level thing" and the protocol, or not?

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Oh, haha, there's already is_closed which is "your platform's future" on WebSocketProtocol. Arguably, this should really be when_closed() though .... :/

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

So I guess approximately (from Twisted perspective):

#in autobahn.websocket.protocol or similar ("the generic code")
class IWebSocketClientAgent:
    def open(self, loop, uri_or_transport_config, options):
        """
        returns a future that fires with a `WebSocketClientProtocol` instance
        which has already done its handshake, or an error.
        """"

    def on_message(self, callback):
        """
        Call `callback` whenever a message is received on the protocol (that is, add
        `callback` as a listener in `proto.on("message", ..)` for any protocol we create).
        (can be used as a decorator, too)
        """

# in autobahn.twisted.websocket
def create_client_agent():
    """
    return a new instance implementing `IWebSocketClientAgent`
    """"
    return _TwistedWebSocketClientAgent()

Note: "loop" can be reactor, or event-loop depending on which framework we're using

@oberstet
Copy link
Contributor

oberstet commented Mar 27, 2019

rgd onConnect:

For servers, this is called during the opening handshake (from the HTTP/GET the client sent) with an instance of autobahn.websocket.types.ConnectionRequest, and it can:

  • return the subprotocol (a string) to be spoken
  • return None to continue with no subprotocol
  • return a pair (subprotocol, headers)
  • raise a autobahn.websocket.types.ConnectionDeny to dismiss the client

Then, when the request comes back to the client, onConnect will be called with an instance of autobahn.websocket.types.ConnectionResponse to give the client a chance to notice the negotiated subprotocol (and websocket extensions), and maybe bail out.

For servers, onOpen will be called right after onConnect (when that has not raised) and sending the response (succeedHandshake).

For clients, onOpen will be called right after onConnect (when that has not raised).


The state machine for the WebSocket protocol state goes like this:

(STATE_PROXY_CONNECTING) => STATE_CONNECTING => STATE_OPEN => STATE_CLOSING => STATE_CLOSED

Thus, there is also a closing handshake.

A WebSocket peer (a server or a client) can send "close" (websocket level message). The other peer must reply with close to that (it cannot continue regular websocket messaging). Finally, a server must then drop the connection.

Hence, we have 2 timeouts for the closing handshake:

  • closeHandshakeTimeout: both clients and server have that to wait for the passive peer to respond to the initiated close
  • serverConnectionDropTimeout: only servers have that, as they are supposed to finally drop the connection (at the TCP level)

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

Ah, okay so maybe (based on @oberstet's comment above) the lifecycle should be:

  • create agent
  • proto = await agent.connect(...)
  • proto.on("message", msg_listener)
  • await proto.when_opened()
  • N "message" listener notifications
  • await proto.when_closed()

@meejah
Copy link
Contributor

meejah commented Mar 27, 2019

I guess practically speaking I'd want to encourage agent.on_message instead of waiting for the protocol and then adding it ("in case immediate server message") but that doesn't stop the above being the API.

@exarkun
Copy link
Contributor Author

exarkun commented Mar 27, 2019

I guess practically speaking I'd want to encourage agent.on_message instead of waiting for the protocol and then adding it ("in case immediate server message") but that doesn't stop the above being the API.

I'm not sure I completely understand this message. Maybe that means the rest of this comment is tilting at windmills... Nevertheless...

agent.on_message seems like it makes re-use of an agent pretty difficult. You won't really be able to call agent.open twice in a way that connects to two different "sorts" of websocket server (two different "websocket applications"? not sure what people call these things) because now you have two unrelated message streams going to one callback.

I think you want to group all of the "here's how to connect" stuff into agent state and "here's what to do with the connection" somewhere else.

So, the reactor/loop is part of agent state. All message handling state is part of some per-connection thing (eg a protocol). I agree you don't want to accidentally miss any messages but I'm think there are lots of ways to deal with this.

a. Guarantee that no messages can be delivered until after the loop regains control after the open() future has a result. This means a pattern like:

agent = create_agent(reactor, ...)
client = await agent.open(url, ...)
@client.on_message
def stuff(...):
    pass
await client.on_closed()

is perfectly safe. Implementation-wise, this might mean a buffer but it also just might mean taking care to dispatch the opened-and-handshaked-and-stuff event before delivering messages, which is pretty much naturally what the implementation would be, I would think (can't parse and dispatch messages if you haven't figured out the handshake succeeded, can you?).

b. Just buffer up some messages until the first call to client.on_message. Not as nice because now you have a buffer you have to worry about the bounds on, but still probably possible.

Alternatively, support an api like:

client = await agent.open(url, on_message, ...)

and you're definitely guaranteed to have the callback before any messages arrive. So, adapting your interface proposal:

#in autobahn.websocket.protocol or similar ("the generic code")
class IWebSocketClientAgent:
    def open(self, uri_or_transport_config, options):
        """
        returns a future that fires with a `WebSocketClientProtocol` instance
        which has already done its handshake, or an error.
        """"

    def on_message(self, callback):
        """
        Call `callback` whenever a message is received on the protocol (that is, add
        `callback` as a listener in `proto.on("message", ..)` for any protocol we create).
        (can be used as a decorator, too)
        """

# in autobahn.twisted.websocket
def create_client_agent(reactor):
    """
    return a new instance implementing `IWebSocketClientAgent`
    """"
    return _TwistedWebSocketClientAgent(reactor)

# in autobahn.twisted.testing
def create_memory_agent(resource):
    """
    return a new instance implementing `IWebSocketClientAgent`.

    connection attempts will be satisfied by traversing the Upgrade
    request path starting at `resource` to find a `WebSocketResource`
    and then exchange data between client and server using purely
    in-memory buffers.
    """

Or, this but with an open signature like def open(self, uri_or_transport_config, on_message, options):.

With create_memory_agent now explicitly included it may be more clear why we don't want a reactor/loop passed to open... What would an in-memory agent do with that argument? At best, ignore it, either surprising application/test code or making that code pass a nonsense value just to satisfy the signature.

@meejah
Copy link
Contributor

meejah commented Mar 28, 2019

You won't really be able to call agent.open twice in a way that connects to two different "sorts" of websocket server (two different "websocket applications"? not sure what people call these things) because now you have two unrelated message streams going to one callback.

Okay, yeah this makes sense I think -- I was taking the "this looks like this other WAMP thing" analogy too far in my mind. So, I'm convinced that on_message should NOT be part of agent and all we need to ensure is that there's a way to see ALL the messages.

I think your first pattern makes sense, and will work correctly if we say "a message cannot arrive until we do onConnect", which definitely comes AFTER the open is "completely done".

Implementation-wise, this might mean a buffer but it also just might mean taking care to dispatch the opened-and-handshaked-and-stuff event before delivering messages, which is pretty much naturally what the implementation would be, I would think (can't parse and dispatch messages if you haven't figured out the handshake succeeded, can you?).

It might be that we'd need to insert a callLater(0, ..) somewhere -- it might be possible in the spec to include data with the reply to the Upgrade: (i.e. a first message). I don't know either way though. Hopefully, this just isn't possible and then I think you're correct that it'll "just work".

@meejah
Copy link
Contributor

meejah commented Mar 28, 2019

@exarkun Added a "proof-of-concept" branch which I think represents what's been discussed up to here

@ambsw-technology
Copy link

Was this closed by #1186 or are there residual improvements to be implemented?

@meejah meejah closed this as completed Mar 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants