binary websocket message will cause parser to fail #182

Closed
dainis opened this Issue Aug 6, 2013 · 23 comments

Projects

None yet

6 participants

@dainis
Contributor
dainis commented Aug 6, 2013

WS library will emit same event message for text and binary websockets
But in case of binary websocket it is buffer and not a string, so when messages gets down to parser it will fail with an error Object 5 has no method 'charAt'

Probably based on WS flags received data should converted to string

Contributor
rauchg commented Aug 6, 2013

@dainis we should fail gracefully with an error event, as if there was a parser error

Contributor
dainis commented Aug 7, 2013

wouldn't it be just better to toString that buffer in websocket's onData handler?

lemonzi commented Sep 6, 2013

It would be really nice to transparently wrap Buffers into Base64-encoded strings, set a flag and decode them back to an ArrayBuffer (or to a normal array if ArrayBuffers are not supported) at the client-side. It would be a good workaround for supporting binary data.

I guess there's already been a lot of discussion on why not support native binary, but this would make the API ready for it, and hopefully somebody will implement it some day (especially now, where as far as I can see the only overhead added to the messages is a number, easy to insert into a binary array).

Contributor
rauchg commented Sep 6, 2013
Contributor
3rd-Eden commented Sep 6, 2013

@lemonzi @guille Implementing that kinda defeats the purpose of saying that engine.io is a low level library when you're going to implement high level abstractions like this directly in to engine.io.

You can just add your own library on top of engine.io to take care of the binary encoding/decoding. This is exactly what i've been doing with primus. It supports different message encoders which include EJSON, the JSON format used by meteor to transform arraybuffers, and BinaryPack for doing msgpack like encoding on binary.

lemonzi commented Sep 6, 2013

@3rd-Eden That's true if binary messages always rely on textual encoding, but my point was that implementing this functionality in the core would allow the use of native binary transports where available (AFAIK in websockets, but very likely in more).
Doing it in base64 adds a significant overhead (about 30% if I recall well), and it's costly to decode, especially on the client side where resources are limited, so doing it natively would be a win.
The higher level would be then implementing things like float arrays on top of the raw byte stream (which is not trivial, one reason being that endianness is up to the browser in typed arrays).

lemonzi commented Sep 10, 2013

So, it would be:

  • If using websockets (or xhr with custom encoding?), use native binary.
  • If not, use base64 encoding.
    • If ArrayBuffer is available, use it. If not, use Array.
    • If atob and btoa are available, use them. If not, use a manual implementation.

A preliminary test: http://jsperf.com/base64-to-uint8array

Contributor
rauchg commented Sep 30, 2013

@3rd-Eden I actually think binary support is lower-level than strings :)

Contributor

@guille Not when you start adding another layer of base64 decoder for it

Contributor
rauchg commented Oct 1, 2013

Definitely but I see that more as the "fallback" just like engine already
has polling for websocket. Its ugly but supporting binary is extremely
important
On Sep 30, 2013 11:07 AM, "Arnout Kazemier" notifications@github.com
wrote:

@guille https://github.com/guille Not when you start adding another
layer of base64 decoder for it


Reply to this email directly or view it on GitHubhttps://github.com/LearnBoost/engine.io/issues/182#issuecomment-25387597
.

Contributor
laino commented Jan 14, 2014

Sooooooo. How's it looking? I still have to use JSON instead of msgpack. I would really appreciate some work on this, with kisses and all.

Contributor
rauchg commented Jan 14, 2014

This is one of the features that I'm prioritizing this month. It's probably gonna look like this

var socket = new engine.Socket('ws://localhost', { binary: true });

if you want to disable base64 fallback and have it emit an error instead if the browser doesn't support binary:

var socket = new engine.Socket('ws://localhost', { binary: true, base64: false });

thoughts?

Contributor
rauchg commented Jan 14, 2014

One of the things I'm unsure about is whether we should emit error or throw right away.

lemonzi commented Jan 17, 2014

It would be more flexible to have hybrid sockets, accepting both text and
binary messages, as happens with websockets. Probably trickier to
implement, though, because any javascript object being sent would have to
be traversed first to check whether it contains binary data or not. But it
would be awesome to send objects with mixed text and binary data to the
client with a single call.

To avoid breaking things, the binary flag could activate the hybrid mode.
And another flag, defaulting to hybrid, could restrict it to only accept
binary messages. The base64 flag would work the same way.

Is it possible to do so with transports such as Flash?

Contributor
rauchg commented Jan 17, 2014

@lemonzi that's what we're going to be doing in socket.io :). You'll be able to emit streams, or json objects, or a hybrid.

Engine.IO itself needs to maintain the same functionality as websocket. We'll be supporting binaryType, and have a plaintext and a binary protocol depending on what the user specifies.

lemonzi commented Jan 17, 2014

But websockets is hybrid, isn't it? The binaryness can be specified for
every frame, at least with the WS library and the native client api (
https://github.com/einaros/ws#sending-binary-data)

Contributor
rauchg commented Jan 17, 2014

Sure. My understanding was that you wanted per-frame. After discussing it on #socket.io the other day we decided to do binaryType like WS as opposed to the constructor. Forgot to update here.

The other thing we discussed was implementing responseType for XHR as the fallback

lemonzi commented Jan 17, 2014

binaryType is fine, but as far as I understood it only defines which format to use after decoding binary data, not whether to work with binary or text. Any frame can be sent as text or binary by setting an option flag in node WS, and via type checking in the client API. I might have got it wrong though, the docs are a bit confusing...

Contributor
rauchg commented Jan 17, 2014

@lemonzi what i'm also considering is that the client advertises their capabilities in the handshake in terms of binary support and then the server responds appropriately (encoding in base64 when needed). We would only keep the base64 flag to force "no base64 fallback".

lemonzi commented Jan 27, 2014

Well, I assumed something like this. What's the altermative? The client or server directly using a certain protocol based on whether text data is being sent over websockets or not?

@rase- rase- referenced this issue Feb 1, 2014
Merged

Binary support #218

Contributor

The referenced PR was merged if that means anything towards closing this issue.

lemonzi commented Apr 29, 2014

I had been out for a while. Good to know it's already working! So yes, I guess the issue can be closed now...

Contributor
rauchg commented Apr 29, 2014

Thanks for staying on top of important issues @defunctzombie

@rauchg rauchg closed this Apr 29, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment