Fix printing of 4 byte UTF-8 characters #1537

Merged
merged 5 commits into from Aug 20, 2013

Conversation

Projects
None yet
3 participants
@aidanhs
Contributor

aidanhs commented Aug 18, 2013

Carrying on from the original issue reported at #1382 where I screwed up the conversion to a PR, here is a clean set of commits ready to be pulled.

The only test I ran was python tests/runner.py test_utf.

@juj

View changes

src/runtime.js
buffer.push(code);
- if (code > 191 && code < 224) {
+ if (((code & 0xE0) ^ 0xC0) == 0) { // 110xxxxx

This comment has been minimized.

@juj

juj Aug 20, 2013

Collaborator

As a (micro(?)-)optimization, could this be written as if ((code & 0xE0) == 0xC0), and the same for the check below on line 404? That would save an xor.

@juj

juj Aug 20, 2013

Collaborator

As a (micro(?)-)optimization, could this be written as if ((code & 0xE0) == 0xC0), and the same for the check below on line 404? That would save an xor.

@kripken

This comment has been minimized.

Show comment
Hide comment
@kripken

kripken Aug 20, 2013

Owner

Please rebase on incoming, so the diff here contains just your new code.

Owner

kripken commented Aug 20, 2013

Please rebase on incoming, so the diff here contains just your new code.

@aidanhs

This comment has been minimized.

Show comment
Hide comment
@aidanhs

aidanhs Aug 20, 2013

Contributor

Yeah, sorry about that, sometimes I love git and sometimes it cheerfully permits me to screw myself over...let me see if I can unpick this mess...

Contributor

aidanhs commented Aug 20, 2013

Yeah, sorry about that, sometimes I love git and sometimes it cheerfully permits me to screw myself over...let me see if I can unpick this mess...

@aidanhs

This comment has been minimized.

Show comment
Hide comment
@aidanhs

aidanhs Aug 20, 2013

Contributor

Right, fixed.

@juj wasn't too concerned about performance but I do think it's semantically clearer so I like.

Contributor

aidanhs commented Aug 20, 2013

Right, fixed.

@juj wasn't too concerned about performance but I do think it's semantically clearer so I like.

@kripken

This comment has been minimized.

Show comment
Hide comment
@kripken

kripken Aug 20, 2013

Owner

I don't really understand this or utf ;) but looks good and if the previous test passes + a new testcase now works, then excellent. thanks!

Owner

kripken commented Aug 20, 2013

I don't really understand this or utf ;) but looks good and if the previous test passes + a new testcase now works, then excellent. thanks!

kripken added a commit that referenced this pull request Aug 20, 2013

Merge pull request #1537 from aidanhs/4-byte-utf8-chars
Fix printing of 4 byte UTF-8 characters

@kripken kripken merged commit 6f2c31d into kripken:incoming Aug 20, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment