Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Newlines aren't handled correctly #112

Closed
wants to merge 3 commits into from
Closed

Newlines aren't handled correctly #112

wants to merge 3 commits into from

Conversation

mstade
Copy link
Contributor

@mstade mstade commented Oct 21, 2016

I apologize for creating PR without a solution here, but I didn't want to just create an issue that didn't have some more substantial data attached, and figured a PR would be an easier read than a bunch of code snippets. Perhaps this PR can turn into a solution – it's just not apparent to me at this stage.

New lines don't seem to be serialized correctly. The test added in this PR breaks, when the stream tries parsing the chunk. For some reason – I don't know it yet – adding a \n seemingly anywhere in the message will produce invalid JSON which is then written to the output stream. I think it's because of fastRep, but haven't been able to verify this. I'm hoping by showing you this, maybe it'll be apparent to you what the problem is, and we can work together to resolve it. In the meantime, I'll try to figure this out as well.

This test shows a trailing newline, but it happens with newlines anywhere in the message string.

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

If I understand your code correctly, the asJSON is a kind of "faster" JSON.stringify, but the consequence of this is that not all unsafe characters are escaped. You've got code to deal with " but things like \n are left as-is. These would otherwise be escaped by JSON.stringify.

If I replace fastRep(msg) with JSON.stringify(msg) then the tests pass. Running the benchmarks seem to suggest there's no (obvious) performance degradation as a consequence of this. Maybe worth doing this as the message is guaranteed to be safe then?

This seems to fix message formatting, while not adding any obvious
performance penalties.
@mcollina
Copy link
Member

Thanks for reporting this! Let's add the fix, and see how it goes. I'll benchmark extensively and check what's going on.

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

Hey @mcollina I just pushed the change that uses JSON.stringify instead of fastRep. This is a very naive fix obviously, since I don't feel like I actually know what's going on and I'm riding on a lot of assumptions here, but from cursory testing this seems ok. Happy to work with you on a better solution if you think there is one!

@jsumners
Copy link
Member

It is most definitely that not all characters that should be escaped according to https://tools.ietf.org/html/rfc7159#section-7 are being escaped. fastRep is, in fact, only escaping the " character.

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

Following on @jsumners' comment, this can and probably should be extended to include all cases described in the spec. JSON.stringify should obviously cover that, but the tests should be added anyway I guess, especially if JSON.stringify turns out to be too expensive.

@jsumners
Copy link
Member

Maybe fastRep can be nothing more than:

function fastRep(s) {
  return JSON.stringify(s)
}
% node                                                                                                                                             [s:0 l:6225]
> JSON.stringify('this includes a \n')
'"this includes a \\n"'
> JSON.stringify('"this includes a \n"')
'"\\"this includes a \\n\\""'
>

@mcollina
Copy link
Member

I see a consistent slowdown by using JSON.stringify in place of fastRep.

We probably have to fix update fastRep to handle all those chars.

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

Really? I'm not able to reproduce that slowdown, but maybe I'm not doing it correctly? How do you benchmark this @mcollina?

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

With fastRep:

benchBunyan*10000: 1360.825ms
benchWinston*10000: 2128.220ms
benchBole*10000: 319.791ms
benchDebug*10000: 444.725ms
benchLogLevel*10000: 344.633ms
benchPino*10000: 325.967ms
benchPinoExreme*10000: 132.013ms
benchBunyan*10000: 1326.169ms
benchWinston*10000: 2089.695ms
benchBole*10000: 301.971ms
benchDebug*10000: 926.796ms
benchLogLevel*10000: 356.660ms
benchPino*10000: 292.415ms
benchPinoExreme*10000: 125.665ms

With JSON.stringify:

benchBunyan*10000: 1496.504ms
benchWinston*10000: 2594.201ms
benchBole*10000: 299.226ms
benchDebug*10000: 437.565ms
benchLogLevel*10000: 344.308ms
benchPino*10000: 313.119ms
benchPinoExreme*10000: 122.720ms
benchBunyan*10000: 1277.179ms
benchWinston*10000: 1968.836ms
benchBole*10000: 270.150ms
benchDebug*10000: 395.969ms
benchLogLevel*10000: 301.023ms
benchPino*10000: 278.244ms
benchPinoExreme*10000: 116.036ms

Works On My Machine(tm) certification naturally, but that's pretty consistently what I see – JSON.stringify is either similar or better performance. Maybe this is within the margin of error? Maybe I have no idea what I'm doing? (The latter is likely I think. :o))

@mcollina
Copy link
Member

Which version of node/OS are you in?

benchmarks/basic.js It's consistently faster here. Node v6.8.1/Mac OS X.

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

Node v6.1.0/macOS Sierra

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

This is what I get when running v6.9.1:

fastRep:

benchBunyan*10000: 978.038ms
benchWinston*10000: 2003.431ms
benchBole*10000: 287.674ms
benchDebug*10000: 387.887ms
benchLogLevel*10000: 314.808ms
benchPino*10000: 265.910ms
benchPinoExreme*10000: 133.663ms
benchBunyan*10000: 897.956ms
benchWinston*10000: 1893.973ms
benchBole*10000: 265.483ms
benchDebug*10000: 375.746ms
benchLogLevel*10000: 285.672ms
benchPino*10000: 257.504ms
benchPinoExreme*10000: 115.798ms

JSON.stringify:

benchBunyan*10000: 987.971ms
benchWinston*10000: 1954.223ms
benchBole*10000: 284.010ms
benchDebug*10000: 416.457ms
benchLogLevel*10000: 314.516ms
benchPino*10000: 290.437ms
benchPinoExreme*10000: 144.112ms
benchBunyan*10000: 926.902ms
benchWinston*10000: 1859.631ms
benchBole*10000: 278.802ms
benchDebug*10000: 387.563ms
benchLogLevel*10000: 311.918ms
benchPino*10000: 282.885ms
benchPinoExreme*10000: 131.419ms

This is more in line with what you're seeing.

@mstade
Copy link
Contributor Author

mstade commented Oct 21, 2016

To be fair, this isn't really an apples-to-apples comparison, since JSON.stringify covers all cases whereas fastRep only covers one. We should test a version where fastRep correctly covers all the cases, and then compare performance.

This adds a test for all characters that must be escaped in JSON,
according to the [section 7 of the spec][1], quoted below:

> All Unicode characters may be placed within the quotation marks,
> except for the characters that must be escaped: quotation mark,
> reverse solidus, and the control characters (U+0000 through U+001F).

[1]: https://tools.ietf.org/html/rfc7159#section-7
@github-actions
Copy link

github-actions bot commented Feb 9, 2022

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 9, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants