socketio_data:encode does not count escapes when determining the length of a JSON string #57

Closed
andymck opened this Issue Aug 12, 2011 · 20 comments

Comments

Projects
None yet
2 participants
@andymck

andymck commented Aug 12, 2011

I noticed when sending JSON that data is truncated when received in firefox.

I tracked this down to an issue in socketio_data:encode, after encoding the term via jsx:encode, the length of the returned Json is determined using:

Length = integer_to_list(length(JSON) + ?JSON_FRAME_LENGTH),

However length(JSON) does not count escape sequences, only escaped chars, so the following JSON string:

"[{"name":"andrew","address":"5 nowhere st"}]"

returns 44, where it should return 52.

This results in payloads sent to the browser being truncated to that of the value return from length(JSON) above....though it is only truncated on specific transport mechanisms. I have seen it using firefox which was using jsonp-polling.

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 12, 2011

Collaborator

I'm wondering if this can be related to #56

Do you have the original Erlang data structure so that I can make a test case out of it?

Collaborator

ferd commented Aug 12, 2011

I'm wondering if this can be related to #56

Do you have the original Erlang data structure so that I can make a test case out of it?

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 12, 2011

sure does look related. Sorry i missed that.

let me see if i can get you the original erlang structure....but the issue
can be seen with any erlang tuple list which includes binary strings....the
resulting json will have the binary strings escaped and its those that
length does not count.

thanks

On Fri, Aug 12, 2011 at 4:36 PM, ferd <
reply@reply.github.com>wrote:

I'm wondering if this can be related to
#56

Do you have the original Erlang data structure so that I can make a test
case out of it?

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 12, 2011

sure does look related. Sorry i missed that.

let me see if i can get you the original erlang structure....but the issue
can be seen with any erlang tuple list which includes binary strings....the
resulting json will have the binary strings escaped and its those that
length does not count.

thanks

On Fri, Aug 12, 2011 at 4:36 PM, ferd <
reply@reply.github.com>wrote:

I'm wondering if this can be related to
#56

Do you have the original Erlang data structure so that I can make a test
case out of it?

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 12, 2011

Collaborator

I have doubts on the idea of 'any binary string'. The current test suite does have these. Here's a simple counter-example:

1> socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there \"bud\\\"dy!">>}], json=true})}).
 [#msg{content = [{<<"hi">>,<<"there \"bud\\\"dy!">>}],
       json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of quotation marks in binaries or not. This is why I'm more or less wondering whether there were unicode sequences in there, which might mess up the length of the string, or the original data structure. It doesn't seem like escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there \"buddy">>}]}).                                                    
[126,109,126,49,126,109,126,{<<"hi">>,<<"there \"buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work. The list right there should be caught by the encoder. I'll try to at least make this more strict. It's purely luck (and wanting to implement something simple) that the decoder hasn' caught that one.

Collaborator

ferd commented Aug 12, 2011

I have doubts on the idea of 'any binary string'. The current test suite does have these. Here's a simple counter-example:

1> socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there \"bud\\\"dy!">>}], json=true})}).
 [#msg{content = [{<<"hi">>,<<"there \"bud\\\"dy!">>}],
       json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of quotation marks in binaries or not. This is why I'm more or less wondering whether there were unicode sequences in there, which might mess up the length of the string, or the original data structure. It doesn't seem like escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there \"buddy">>}]}).                                                    
[126,109,126,49,126,109,126,{<<"hi">>,<<"there \"buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work. The list right there should be caught by the encoder. I'll try to at least make this more strict. It's purely luck (and wanting to implement something simple) that the decoder hasn' caught that one.

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 12, 2011

Did you run the bif length on that example ? it will return a count lesser
than that of the string
The issue i was pointing out is not to do with encoding and decoding, i
agree that works fine....

However after you get the encoded json back, the bif length() is called on
that string and its that value which results in the truncating of what is
sent to the client..as length() is not counting escaped sequences.....so the
encoder works fine buts the number of bytes being sent over the wire is
incorrect ( when using firefox ).

I suspect the issue is to do with the decoding, the encoded message is in
the format

?FRAME ++ Length ++ ?FRAME ++ ?JSON_FRAME ++ JSON

That same length is being used in the decode function....so as it has a
value less than the size of the encoded json string, then that is resulting
in the decoder truncating the json...

Hth,

Andy

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 12, 2011

Did you run the bif length on that example ? it will return a count lesser
than that of the string
The issue i was pointing out is not to do with encoding and decoding, i
agree that works fine....

However after you get the encoded json back, the bif length() is called on
that string and its that value which results in the truncating of what is
sent to the client..as length() is not counting escaped sequences.....so the
encoder works fine buts the number of bytes being sent over the wire is
incorrect ( when using firefox ).

I suspect the issue is to do with the decoding, the encoded message is in
the format

?FRAME ++ Length ++ ?FRAME ++ ?JSON_FRAME ++ JSON

That same length is being used in the decode function....so as it has a
value less than the size of the encoded json string, then that is resulting
in the decoder truncating the json...

Hth,

Andy

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 12, 2011

Sorry that was a rushed response... up to my eyes here

I will be looking at this in more detail on monday and will be able to get
you more solid info then

Thanks

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 12, 2011

Sorry that was a rushed response... up to my eyes here

I will be looking at this in more detail on monday and will be able to get
you more solid info then

Thanks

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 15, 2011

Hi, havent had much time to play around with this but here is a description
of the problem from what i can see:

  1. When the transport is polling and on the server side using
    socketio_transport_polling.erl, JSON data received by the browser fails to
    parse the received msg to valid JSON.

  2. This is due to the length() BIF being used within socket_io_data:encode,
    the encoded JSON string returned from here includes a number of headers /
    frames one of which is the length in bytes of the JSON string. Full
    structure of the encoded msgs being ?FRAME ++ Length ++ ?FRAME ++
    ?JSON_FRAME ++ JSON;

  3. The length returned in the msg header described above does not equal the
    length of the JSON string as the length() function does not count escape
    sequence characters, so calling length(""name"") will resolve to 6,
    whereas it needs to resolve to 8 as we wish to send the entire string
    including escape sequences.

  4. On the client side then the msg received from the server is decoded
    using the ?FRAME headers, and uses the length header included in the
    received msg to construct the data payload. This can be see in
    the Transport.prototype._decode function of socket.io.js. As the length
    header received is less than the size of the data payload, the js at line
    198 of socket.io.js

    this.base._onMessage(JSON.parse(message.substr(3)));

Fails to parse the message, as the msgs it received does not equal the fully
payload sent by the server.

Let me know what you think or if you need any further info. I could be
completely wrong on all of this as i havent got my head around all the code
yet but this is what i am seeing and the theory matches the results i see in
firefox.

Thanks

Andrew

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 15, 2011

Hi, havent had much time to play around with this but here is a description
of the problem from what i can see:

  1. When the transport is polling and on the server side using
    socketio_transport_polling.erl, JSON data received by the browser fails to
    parse the received msg to valid JSON.

  2. This is due to the length() BIF being used within socket_io_data:encode,
    the encoded JSON string returned from here includes a number of headers /
    frames one of which is the length in bytes of the JSON string. Full
    structure of the encoded msgs being ?FRAME ++ Length ++ ?FRAME ++
    ?JSON_FRAME ++ JSON;

  3. The length returned in the msg header described above does not equal the
    length of the JSON string as the length() function does not count escape
    sequence characters, so calling length(""name"") will resolve to 6,
    whereas it needs to resolve to 8 as we wish to send the entire string
    including escape sequences.

  4. On the client side then the msg received from the server is decoded
    using the ?FRAME headers, and uses the length header included in the
    received msg to construct the data payload. This can be see in
    the Transport.prototype._decode function of socket.io.js. As the length
    header received is less than the size of the data payload, the js at line
    198 of socket.io.js

    this.base._onMessage(JSON.parse(message.substr(3)));

Fails to parse the message, as the msgs it received does not equal the fully
payload sent by the server.

Let me know what you think or if you need any further info. I could be
completely wrong on all of this as i havent got my head around all the code
yet but this is what i am seeing and the theory matches the results i see in
firefox.

Thanks

Andrew

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 16, 2011

Hi, wondering if you have had a chance to look at my last email?

Keen to know what think...

Thanks

Andrew

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 16, 2011

Hi, wondering if you have had a chance to look at my last email?

Keen to know what think...

Thanks

Andrew

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
reply@reply.github.com>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 16, 2011

Collaborator

Not yet. I'm having a pretty busy week. I know this is pretty important so I'll try to find some free time ASAP.

Collaborator

ferd commented Aug 16, 2011

Not yet. I'm having a pretty busy week. I know this is pretty important so I'll try to find some free time ASAP.

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 17, 2011

Hi, ok np.

FYI here is a test that will demo the issue:

simple_json_test2() ->
Msg = "{"hello":"world"}",
TestMsg = lists:concat(["~m", integer_to_list(length(Msg)), "m~~j",
Msg]),
[X] = decode(#msg{content=TestMsg}),
?assertMatch(#msg{content=[{<<"hello">>,<<"world">>}], json=true}, X).

Thanks

On Wed, Aug 17, 2011 at 12:16 AM, ferd <
reply@reply.github.com>wrote:

Not yet. I'm having a pretty busy week. I know this is pretty important so
I'll try to find some free time ASAP.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 17, 2011

Hi, ok np.

FYI here is a test that will demo the issue:

simple_json_test2() ->
Msg = "{"hello":"world"}",
TestMsg = lists:concat(["~m", integer_to_list(length(Msg)), "m~~j",
Msg]),
[X] = decode(#msg{content=TestMsg}),
?assertMatch(#msg{content=[{<<"hello">>,<<"world">>}], json=true}, X).

Thanks

On Wed, Aug 17, 2011 at 12:16 AM, ferd <
reply@reply.github.com>wrote:

Not yet. I'm having a pretty busy week. I know this is pretty important so
I'll try to find some free time ASAP.

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 17, 2011

Collaborator

Regarding the test case, that's because the correct length isn't 17. See:

20> socketio_data:encode(#msg{content=[{<<"hello">>,<<"world">>}], json=true}).
"~m~20~m~~j~{\"hello\":\"world\"}"

The 3 characters difference is due to the j framing, which counts towards the total length, but you didn't count it for the test. See https://github.com/LearnBoost/socket.io/blob/0.6.17/lib/socket.io/utils.js#L26 for the correct JS implementation as of 0.6.x.

Using the encoding functions we provide to create your messages seems to work.

Collaborator

ferd commented Aug 17, 2011

Regarding the test case, that's because the correct length isn't 17. See:

20> socketio_data:encode(#msg{content=[{<<"hello">>,<<"world">>}], json=true}).
"~m~20~m~~j~{\"hello\":\"world\"}"

The 3 characters difference is due to the j framing, which counts towards the total length, but you didn't count it for the test. See https://github.com/LearnBoost/socket.io/blob/0.6.17/lib/socket.io/utils.js#L26 for the correct JS implementation as of 0.6.x.

Using the encoding functions we provide to create your messages seems to work.

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 17, 2011

Collaborator

Note being, I still can't produce any of these problems you had when I start by submitting Erlang Data structures rather than a JSON string and then trying to shove it in the app. Are you using the interface functions or just submitting the JSON?

Collaborator

ferd commented Aug 17, 2011

Note being, I still can't produce any of these problems you had when I start by submitting Erlang Data structures rather than a JSON string and then trying to shove it in the app. Are you using the interface functions or just submitting the JSON?

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 17, 2011

k, i guess i am not communication this properly. Hopefully this example
using real data will help:

Here is the term i trying to send via socket.io

[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function

"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){
this.base._onMessage(JSON.parse(message.substr(3)));
} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works fine
when its websockets.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <
reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string and
then trying to shove it in the app. Are you using the interface functions or
just submitting the JSON?

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 17, 2011

k, i guess i am not communication this properly. Hopefully this example
using real data will help:

Here is the term i trying to send via socket.io

[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function

"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){
this.base._onMessage(JSON.parse(message.substr(3)));
} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works fine
when its websockets.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <
reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string and
then trying to shove it in the app. Are you using the interface functions or
just submitting the JSON?

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 17, 2011

Collaborator

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
reply@reply.github.com>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.
"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string and
then trying to shove it in the app. Are you using the interface functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Collaborator

ferd commented Aug 17, 2011

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
reply@reply.github.com>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.
"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string and
then trying to shove it in the app. Are you using the interface functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 17, 2011

yes, but in firefox running jsonp-polling here is what the browser receives
for that same msg

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_

As you can see the backslashes are present....and the string terminates at
the exact same length as what the erlang shell returns for the length of the
json string....

Firefox throws the following error, as it cannot obviously parse the
truncated string...

JSON.parse
[Break On This Error] this.base._onMessage(JSON.parse(message.substr(3)));
socket.io.js (line 199)

On Wed, Aug 17, 2011 at 7:47 PM, ferd <
reply@reply.github.com>wrote:

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
reply@reply.github.com>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.

"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed
to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i
am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error
is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the
transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string
and
then trying to shove it in the app. Are you using the interface
functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 17, 2011

yes, but in firefox running jsonp-polling here is what the browser receives
for that same msg

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_

As you can see the backslashes are present....and the string terminates at
the exact same length as what the erlang shell returns for the length of the
json string....

Firefox throws the following error, as it cannot obviously parse the
truncated string...

JSON.parse
[Break On This Error] this.base._onMessage(JSON.parse(message.substr(3)));
socket.io.js (line 199)

On Wed, Aug 17, 2011 at 7:47 PM, ferd <
reply@reply.github.com>wrote:

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
reply@reply.github.com>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.

"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed
to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i
am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error
is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the
transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string
and
then trying to shove it in the app. Are you using the interface
functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:
#57 (comment)

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 17, 2011

Sorry, just noticed the bottom half of your email....which correlates with
what i just send

So ignore that please. Glad to see that you are able to reproduce the
problem

On Wed, Aug 17, 2011 at 7:47 PM, ferd <
reply@reply.github.com>wrote:

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
reply@reply.github.com>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.

"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed
to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i
am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error
is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the
transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string
and
then trying to shove it in the app. Are you using the interface
functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 17, 2011

Sorry, just noticed the bottom half of your email....which correlates with
what i just send

So ignore that please. Glad to see that you are able to reproduce the
problem

On Wed, Aug 17, 2011 at 7:47 PM, ferd <
reply@reply.github.com>wrote:

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
reply@reply.github.com>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.

"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed
to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i
am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error
is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the
transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

reply@reply.github.com>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string
and
then trying to shove it in the app. Are you using the interface
functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 17, 2011

Collaborator

I'm out of time to check this for now, but the source of the problem seems to be

Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}]))

on line 298 in socketio_transport_polling.erl (send_message_1/5). The problem is that if the previous data was already JSON, then it ends up being escaped just a bit too much.

I don't have much time left today, but if this is critical for you, this quick patch ought to do the job until we find a real solution to the problem:

diff --git a/src/socketio_transport_polling.erl b/src/socketio_transport_polling.erl
index e52e0ce..0b76d03 100644
--- a/src/socketio_transport_polling.erl
+++ b/src/socketio_transport_polling.erl
@@ -296,7 +296,7 @@ send_message(Message, Req, Index, ServerModule, Sup) ->

 send_message_1(Headers, Message, Req, Index, ServerModule) ->
     Headers0 = [{"Content-Type", "text/javascript; charset=UTF-8"}|Headers],
-    Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}])),
+    Message0 = escape(binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}]))),
     Message1 = "io.JSONP["++Index++"]._(" ++ Message0 ++ ");",
     apply(ServerModule, respond, [Req, 200, Headers0, Message1]).

@@ -325,3 +325,7 @@ reset_duration({TimerRef, Time}) ->
     erlang:cancel_timer(TimerRef),
     NewRef = erlang:start_timer(Time, self(), polling),
     {NewRef, Time}.
+
+escape([]) -> [];
+escape("\\\\\\"++Rest) -> "\\"++escape(Rest);
+escape([C|Rest]) -> [C|escape(Rest)].

It will only 'unescape' some characters, so I'd consider it somewhat risky for user-submitted data. Risks of XSS are to be evaluated. I hope this helps in the meantime. As I said, we'll need to find a good solution, not just this toy thing. I've quickly tested the other transports and they seemed to work fine, in any case.

Collaborator

ferd commented Aug 17, 2011

I'm out of time to check this for now, but the source of the problem seems to be

Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}]))

on line 298 in socketio_transport_polling.erl (send_message_1/5). The problem is that if the previous data was already JSON, then it ends up being escaped just a bit too much.

I don't have much time left today, but if this is critical for you, this quick patch ought to do the job until we find a real solution to the problem:

diff --git a/src/socketio_transport_polling.erl b/src/socketio_transport_polling.erl
index e52e0ce..0b76d03 100644
--- a/src/socketio_transport_polling.erl
+++ b/src/socketio_transport_polling.erl
@@ -296,7 +296,7 @@ send_message(Message, Req, Index, ServerModule, Sup) ->

 send_message_1(Headers, Message, Req, Index, ServerModule) ->
     Headers0 = [{"Content-Type", "text/javascript; charset=UTF-8"}|Headers],
-    Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}])),
+    Message0 = escape(binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}]))),
     Message1 = "io.JSONP["++Index++"]._(" ++ Message0 ++ ");",
     apply(ServerModule, respond, [Req, 200, Headers0, Message1]).

@@ -325,3 +325,7 @@ reset_duration({TimerRef, Time}) ->
     erlang:cancel_timer(TimerRef),
     NewRef = erlang:start_timer(Time, self(), polling),
     {NewRef, Time}.
+
+escape([]) -> [];
+escape("\\\\\\"++Rest) -> "\\"++escape(Rest);
+escape([C|Rest]) -> [C|escape(Rest)].

It will only 'unescape' some characters, so I'd consider it somewhat risky for user-submitted data. Risks of XSS are to be evaluated. I hope this helps in the meantime. As I said, we'll need to find a good solution, not just this toy thing. I've quickly tested the other transports and they seemed to work fine, in any case.

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 17, 2011

Thanks. The patch works for our usage right now, so thats a big help.

Cheers

On Wed, Aug 17, 2011 at 8:29 PM, ferd <
reply@reply.github.com>wrote:

I'm out of time to check this for now, but the source of the problem seems
to be

Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message),
[{strict, false}]))

on line 298 in socketio_transport_polling.erl (send_message_1/5). The
problem is that if the previous data was already JSON, then it ends up being
escaped just a bit too much.

I don't have much time left today, but if this is critical for you, this
quick patch ought to do the job until we find a real solution to the
problem:

diff --git a/src/socketio_transport_polling.erl
b/src/socketio_transport_polling.erl
index e52e0ce..0b76d03 100644
--- a/src/socketio_transport_polling.erl
+++ b/src/socketio_transport_polling.erl
@@ -296,7 +296,7 @@ send_message(Message, Req, Index, ServerModule, Sup)
->

send_message_1(Headers, Message, Req, Index, ServerModule) ->
    Headers0 = [{"Content-Type", "text/javascript;

charset=UTF-8"}|Headers],

  • Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message),
    [{strict, false}])),
  • Message0 =
    escape(binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict,
    false}]))),
    Message1 = "io.JSONP["++Index++"]._(" ++ Message0 ++ ");",
    apply(ServerModule, respond, [Req, 200, Headers0, Message1]).

@@ -325,3 +325,7 @@ reset_duration({TimerRef, Time}) ->
erlang:cancel_timer(TimerRef),
NewRef = erlang:start_timer(Time, self(), polling),
{NewRef, Time}.
+
+escape([]) -> [];
+escape("\"++Rest) -> ""++escape(Rest);
+escape([C|Rest]) -> [C|escape(Rest)].

It will only 'unescape' some characters, so I'd consider it somewhat risky
for user-submitted data. Risks of XSS are to be evaluated. I hope this helps
in the meantime. As I said, we'll need to find a good solution, not just
this toy thing. I've quickly tested the other transports and they seemed to
work fine, in any case.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 17, 2011

Thanks. The patch works for our usage right now, so thats a big help.

Cheers

On Wed, Aug 17, 2011 at 8:29 PM, ferd <
reply@reply.github.com>wrote:

I'm out of time to check this for now, but the source of the problem seems
to be

Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message),
[{strict, false}]))

on line 298 in socketio_transport_polling.erl (send_message_1/5). The
problem is that if the previous data was already JSON, then it ends up being
escaped just a bit too much.

I don't have much time left today, but if this is critical for you, this
quick patch ought to do the job until we find a real solution to the
problem:

diff --git a/src/socketio_transport_polling.erl
b/src/socketio_transport_polling.erl
index e52e0ce..0b76d03 100644
--- a/src/socketio_transport_polling.erl
+++ b/src/socketio_transport_polling.erl
@@ -296,7 +296,7 @@ send_message(Message, Req, Index, ServerModule, Sup)
->

send_message_1(Headers, Message, Req, Index, ServerModule) ->
    Headers0 = [{"Content-Type", "text/javascript;

charset=UTF-8"}|Headers],

  • Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message),
    [{strict, false}])),
  • Message0 =
    escape(binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict,
    false}]))),
    Message1 = "io.JSONP["++Index++"]._(" ++ Message0 ++ ");",
    apply(ServerModule, respond, [Req, 200, Headers0, Message1]).

@@ -325,3 +325,7 @@ reset_duration({TimerRef, Time}) ->
erlang:cancel_timer(TimerRef),
NewRef = erlang:start_timer(Time, self(), polling),
{NewRef, Time}.
+
+escape([]) -> [];
+escape("\"++Rest) -> ""++escape(Rest);
+escape([C|Rest]) -> [C|escape(Rest)].

It will only 'unescape' some characters, so I'd consider it somewhat risky
for user-submitted data. Risks of XSS are to be evaluated. I hope this helps
in the meantime. As I said, we'll need to find a good solution, not just
this toy thing. I've quickly tested the other transports and they seemed to
work fine, in any case.

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 19, 2011

Collaborator

Should be fixed and safer as of a146778

andymck, if you could update your version, you'd have something safer with regards to XSS, although likely a bit slower. Safety over speed.

Collaborator

ferd commented Aug 19, 2011

Should be fixed and safer as of a146778

andymck, if you could update your version, you'd have something safer with regards to XSS, although likely a bit slower. Safety over speed.

@ferd ferd closed this Aug 19, 2011

@andymck

This comment has been minimized.

Show comment Hide comment
@andymck

andymck Aug 19, 2011

ferd, thanks for that

But can i ask that you dont use the data i sent you as an example. That
data is from a live product and I dont know if my boss would be that happy
to see it out in the open

Appreciate the fix.

Thanks

Andrew

On Fri, Aug 19, 2011 at 4:56 PM, ferd <
reply@reply.github.com>wrote:

Should be fixed and safer as of
a146778

andymck, if you could update your version, you'd have something safer with
regards to XSS, although likely a bit slower. Safety over speed.

Reply to this email directly or view it on GitHub:
#57 (comment)

andymck commented Aug 19, 2011

ferd, thanks for that

But can i ask that you dont use the data i sent you as an example. That
data is from a live product and I dont know if my boss would be that happy
to see it out in the open

Appreciate the fix.

Thanks

Andrew

On Fri, Aug 19, 2011 at 4:56 PM, ferd <
reply@reply.github.com>wrote:

Should be fixed and safer as of
a146778

andymck, if you could update your version, you'd have something safer with
regards to XSS, although likely a bit slower. Safety over speed.

Reply to this email directly or view it on GitHub:
#57 (comment)

@ferd

This comment has been minimized.

Show comment Hide comment
@ferd

ferd Aug 19, 2011

Collaborator

That should be fixed in the merge. Let me know if this is enough or are looking to have all traces gone (which I am not sure how to do)

Collaborator

ferd commented Aug 19, 2011

That should be fixed in the merge. Let me know if this is enough or are looking to have all traces gone (which I am not sure how to do)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment