Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom serializer issue #29

Closed
dmill-bz opened this issue Mar 14, 2016 · 20 comments
Closed

Custom serializer issue #29

dmill-bz opened this issue Mar 14, 2016 · 20 comments

Comments

@dmill-bz
Copy link
Collaborator

Hey JB,

I'm trying to implement gremlin-javascript with a custom server side serializer. For the client this should be pretty transparent as the response message structure doesn't change (only the data being sent back). It however requires that the request message sent from the client have a different mimetype (application/gremlinbin). The following does not seem to work:

Gremlin.createClient(8182, "localhost", {accept:"application/gremlinbin"});

It's pretty easy to test manually as gremlin-server will complain about having to use GraphSONMessageSerializerV1d0 by default because it can't find the serializer associated with that mimetype.

Any hint as to what would need to be fixed for this? I'm a little lost running through your code but with some guidance I could probably pull a fix.

@dmill-bz dmill-bz changed the title Custom serializer Custom serializer issue Mar 14, 2016
@dmill-bz
Copy link
Collaborator Author

Ok I think I'm starting to understand. This might be more of an overhaul than expected. Are you by any chance using a text opcode? (value of 1). This might be the default for your ws client since I don't see you setting it anywhere.
This means that the server will always interpret the message as an application/json as per the documentation:

The above JSON represents the "body" of the request to send to Gremlin Server. When sending this "body" over websockets Gremlin Server can accept a packet frame using a "text" (1) or a "binary" (2) opcode. Using "text" is a bit more limited in that Gremlin Server will always process the body of that request as JSON. Generally speaking "text" is just for testing purposes.

Fixing/adding this would imply:

  • Changing the opcode to binary(2)
  • Creating a header for the message
  • Packing the header + body(message) into a binary.
  • unpacking accordingly

ouch

@jbmusso
Copy link
Owner

jbmusso commented Mar 14, 2016

The above should indeed make gremlin-javascript send the appropriate accept header. There currently is a test case for that.

Are you getting any specific error? If I get your point right, you will most likely run into issues if you use client.execute() or client.stream() for these methods internally expect each data parts of the responses to be parsed as Arrays (see here for execute() and here for stream()). I need to find a way to make these configurable (and wow, cleanup the code responsible for this).

For now, I think your best bet would be to use the lower level client.messageStream() method which makes no assumption regarding the type of data returned by Gremlin server (it performs no concat() operation and simply re-emits "as is" each raw messages returned by Gremlin server).
Alternatively, you can also override the default executeHandler by supplying your own handler when creating the client (this is undocumented yet).

Gremlin.createClient(8182, "localhost", {
  accept: "application/gremlinbin",
  executeHandler: function(stream, callback) {
    // Handle stream yourself and fire the callback when done.
  }
})

That handler is simply a function with a (stream, callback) signature. You need to be a bit familiar with Node.js canonical callback function signature ((error, values...)) and streams.
If you decide to use client.messageStream() or supply a custom executeHandler, you'll most likely want to use a higher-level stream library such as Highland.js.

@jbmusso
Copy link
Owner

jbmusso commented Mar 14, 2016

Woops, I didn't see your next message. Let me re-read that carefully.

@dmill-bz
Copy link
Collaborator Author

Thanks for all the info on client.messageStream(). I think you make a good point and I will probably have to use this to handle the response data. But right now I can't get the server to run my custom serializer (probably because of the above, though not 100% positive).

Thanks for looking into this though. Appreciate the time you're putting up

@dmill-bz
Copy link
Collaborator Author

I'm adding more info as I go. Hopefully this cuts down on research time and bugs and stuff:

Found the following pertaining to binary data from ws:

Sending binary data

var WebSocket = require('ws');
var ws = new WebSocket('ws://www.host.com/path');

ws.on('open', function open() {
  var array = new Float32Array(5);

 for (var i = 0; i < array.length; ++i) {
    array[i] = i / 2;
  }

  ws.send(array, { binary: true, mask: true });
});

Setting mask, as done for the send options above, will cause the data to be
masked according to the WebSocket protocol. The same option applies for text
data.

Note that for binary at least gremlin-server requires that you send masked packets :

NOTE Gremlin Server will only accept masked packets as it pertains to websocket packet header construction.

In terms of the message to be sent a breakdown from the gremlin-server docs would be that your binary message needs to be 1 byte (mime length x, 127 max obviously) + x bytes (mimeType) + payload (current message)

As far as the server response goes, If memory serves me well you will get the same thing you get now. A text opcode, no masking. (I will double check this info)

@dmill-bz
Copy link
Collaborator Author

So I've looked into a couple more things.

I think binary info should be packed using Uint8Array(). Assuming UTF-8 strings you should be able to just place the chars into the array.

The response from the server actually uses a binary opcode but it really only contains the payload so, again, assuming UTF-8 strings you can treat the response no different that you do now (as if it were a text opcode). It also has mask set to false so nothing to do here either (though I think in your case ws would handle this on it's own anyways).

So I guess it's not all that bad, changing WebSocketGremlinConnection.sendMessage() to add bin headers and send binary might be the only thing needed.

@dmill-bz
Copy link
Collaborator Author

Can you tell me what message.data || message is about in https://github.com/jbmusso/gremlin-javascript/blob/master/src/GremlinClient.js#L68 ? I'm going to need to work around this

@jbmusso
Copy link
Owner

jbmusso commented Mar 14, 2016

Thanks for working on a PR, I appreciate it. I remember discussing this with @spmallette a while ago.
Unless I misunderstand your question, I don't think you can get accept from message returned by Gremlin server. You'll have to re-use the value set at the client-level (using this.options.accept) or somehow store/attach that value on a per-command basis.

@dmill-bz
Copy link
Collaborator Author

woops I deleted my previous comment. But thanks that answers my question

@jbmusso
Copy link
Owner

jbmusso commented Mar 14, 2016

Can you tell me what message.data || message is about

Sure. message.data || message is the JavaScript-ish way for saying "get value of message.data if available (truthy), else fall back to message" (if undefined).

I did so because the client can theoretically be used both in Node.js and in browsers that supports WebSocket. If I recall correctly, there is a discrepancy between the WebSocket API provided by the ws module (which is used in Node.js) and the WebSocket API in the browser.
The former API (ws) puts the received data in message.data while the latter (browser) puts the data in message (leaving message.data undefined). This may have changed though, I haven't checked in a while.

@jbmusso
Copy link
Owner

jbmusso commented Mar 14, 2016

We may have to create a BinaryWebSocketGremlinConnection class unless we should only care about sending binary messages (and just drop the current way of sending messages). Also, there may be a performance impact under heavy load (unsure).
Since Gremlin server also has an HTTP channelizer, I was thinking about creating an HTTPGremlinConnection class as well so the client can be used in environments that do not support WebSocket (old browsers). This is off-topic and I will address this later.

@dmill-bz
Copy link
Collaborator Author

Ok thanks for the info. I'm currently using it in browser (via browserify) and message.data is populated so it may have changed. I will keep this BC though now that I understand what it's about.

The idea behind the custom serializer was to allow console like output (while retaining all graphson data) for http://www.gremlinbin.com/bin/view/56e70e4457f36


Regarding the implementation of this change I'm still trying to figure out what the best course of action is. In theory with the switch to binary there is no more need for the old text way so we can stick to WebSocketGremlinConnection.

The only thing that this changes is how messages are packed and unpacked and currently both of these are done in GremlinClient which means that switching Connection classes wouldn't work after this change (in the event of a HTTPGremlinConnection) .
Ideally a Message Object would be sent to the Connection that would handle the packing/unpacking.

I don't know if that change is warranted in this PR though.

@jbmusso
Copy link
Owner

jbmusso commented Mar 14, 2016

I agree that the packing/unpacking of messages shouldn't be done at the GremlinClient level. This should indeed be addressed in a separate PR/commit - I'll work on this :).
I really like the work you're doing with gremlinbin and I think this is important for the TinkerPop/graphdb community. If it gets popular, I'll have to slim down dependencies so the browserify/webpack bundle doesn't weigh several hundred kb (as I think it currently does right now).

dmill-bz added a commit to dmill-bz/gremlin-javascript that referenced this issue Mar 15, 2016
… sends the correct mimeType to the server
@dmill-bz
Copy link
Collaborator Author

Hey,

I'm using webpack to pack 852574f?diff=split and it doesn't work unfortunately. It chokes on :

const rawMessage = JSON.parse(data.toString());
                                   ^

The browser doesn't get the toString() on the Blob

@dmill-bz
Copy link
Collaborator Author

Do you have any ideas what the other options may be? Maybe another dependency (though I don't really like the sound of that)? I'd love to get this fixed sometime this weekend. (I might give this another crack)

@jbmusso
Copy link
Owner

jbmusso commented Mar 20, 2016

I don't think there's an easy way out when using Wepack or browserify. It looks like the Node.js ws package API and the browsers WebSocket API are slightly different. I think we have to handle this manually and handle both scenarios with different handlers that are conditionally loaded depending on the environment. We may have to define two entry points in the package.json file and load different message handlers (one for Node.js, one for browsers). This would require some refactoring.

When in browser, we may have some luck playing around with the ws.binaryType option (defaults to "blob" but can be set to "arraybuffer").

Interesting reads on the topic:

@dmill-bz
Copy link
Collaborator Author

Ok I'm going to look at those.
Maybe it's a good time to do the packing in the Connection then? Are there any easy conditional statements for checking the environment?

@dmill-bz
Copy link
Collaborator Author

ah wait indeed the ws.binaryType may actually do the trick.

@dmill-bz
Copy link
Collaborator Author

Ok I think I got this to work with minor changes. I'm going to make a PR against https://github.com/jbmusso/gremlin-javascript/tree/PommeVerte-switch-to-binary so you can test the node end implications

@dmill-bz
Copy link
Collaborator Author

Got the PR in #34 . Let me know how that goes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants