Fails to encode strings containing non-ASCII characters #1

Open
sorear opened this Issue Mar 2, 2013 · 3 comments


@sorear
sorear commented Mar 2, 2013

The browser encoder uses character length for strings, but the browser decoder, node encoder, and node decoder all expect byte lengths. This breaks decoding whenever a string contains non-ASCII characters.
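For concreteness, a hedged illustration of the mismatch (the example string and values are mine, not from the report):

```js
// The browser encoder writes str.length (UTF-16 code units) into the length
// header, but the decoders interpret that header as a UTF-8 byte count.
var str = 'héllo';
console.log(str.length);                      // 5 "characters"
console.log(Buffer.byteLength(str, 'utf8'));  // 6 bytes (Node)
// A header of 5 makes the decoder read one byte too few for this string,
// throwing off everything that follows in the stream.
```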

@ericz
Member
ericz commented Apr 3, 2013

Yes indeed. It's just hard to get byte lengths on the client side in a performant manner. I'll probably make a UTF-happy fork sometime that uses (new Blob([str])).size instead of str.length, but I don't want binarypack to be slow.
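A minimal sketch of the Blob-based measurement described here; the `byteLength` helper name is illustrative only, not part of binarypack's API:

```js
// The Blob constructor UTF-8-encodes its string parts, so .size is the
// encoded byte count rather than the UTF-16 code-unit count.
function byteLength(str) {
  return new Blob([str]).size;
}

console.log('日本語'.length);      // 3 characters
console.log(byteLength('日本語')); // 9 bytes in UTF-8
```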

@dnorman
dnorman commented Apr 3, 2013

I agree that performance is important, but this malfunctions in a particularly egregious way when UTF-8 data is passed, and since UTF-8 is ever-present on the modern web, it's very easy to hit. Notwithstanding this bug, binaryjs is an excellent library, btw.

@ericz
Member
ericz commented Apr 3, 2013

I see what you're saying.

Any suggestions for solutions?

Options:

  • Take a performance hit and get string lengths with (new Blob([str])).size instead of str.length
  • Offer a UTF8 flag / fork that takes the performance hit if people so desire (a rough sketch follows below)
  • Do nothing

Or other ideas?
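A rough sketch of the second option above, an opt-in UTF8 flag; `Packer` and the `utf8` option are hypothetical names for illustration, not binarypack's actual API:

```js
// Callers opt in to the slower but correct byte-length path.
function Packer(options) {
  this.utf8 = !!(options && options.utf8);
}

Packer.prototype.stringLength = function (str) {
  // Byte length via Blob when the flag is set; the fast (but wrong for
  // non-ASCII strings) character length otherwise.
  return this.utf8 ? new Blob([str]).size : str.length;
};

var packer = new Packer({ utf8: true });
console.log(packer.stringLength('héllo')); // 6 bytes, vs. 5 with the flag off
```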

@ericz ericz referenced this issue in binaryjs/binaryjs Apr 3, 2013
Closed

UTF8 non-binary data crashes #21
