This repository has been archived by the owner. It is now read-only.

punycode: encoder contains bugs #2072

Closed
bnoordhuis opened this issue Nov 11, 2011 · 2 comments

@bnoordhuis
Member

commented Nov 11, 2011

Not all test cases from http://tools.ietf.org/html/rfc3492#section-7.1 pass. Tests added in bnoordhuis/node@1f217ee.

FAIL: expected "ihqwcrb4cv8a8dqg056pqjye", got "ihqwcrb2cv8a8dqg056pqjye"
FAIL: expected "他们为什么不说中文", got "诵为斈不他们中什么"
FAIL: expected "ihqwctvzc91f659drss3x8bo0yb", got "ihqwctvzcv8e659drss3x8bo0yb"
FAIL: expected "他們爲什麽不說中文", got "ꁈ他谵什朒不倠中玼"
FAIL: expected "b1abfaaepdrnnbgefbaDotcwatmq2g4l", got "b1abfaaepdrnnbgefbadotcwatmq2g4l"
FAIL: expected "TisaohkhngthchnitingVit-kjcr8268qyxafd2f1b9g", got "TisaohkhngthchnitingVit-kjcr8268qyxafd2f1b3g"
FAIL: expected "TạisaohọkhôngthểchỉnóitiếngViệt", got "TạisaohkhôngtọhểchỉnóitiếngViệt"
FAIL: expected "3B-ww4c5e180e575a65lsy2b", got "3B-ww4c5e708d575a65lsy2b"
FAIL: expected "3年B組金八先生", got "廿3総鉜B八疪先"

cc @mathiasbynens
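For reference, the encoding procedure these test vectors exercise can be sketched directly from the pseudocode in RFC 3492 section 6.3. This is only an illustrative sketch, not the code in Node or in any fix: the names `punycodeEncode`, `adapt`, and `digitToChar` are hypothetical, and the overflow checks the RFC requires are omitted for brevity.

```javascript
// Bootstring parameters for Punycode (RFC 3492 section 5).
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128, DELIMITER = '-';

// Map a digit value 0..35 to 'a'..'z' (0..25) or '0'..'9' (26..35).
function digitToChar(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation (RFC 3492 section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - T_MIN) * T_MAX) / 2)) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Hypothetical helper: encode one label, without the ACE prefix.
function punycodeEncode(input) {
  // Iterate by code point so surrogate pairs count as one symbol.
  const codePoints = Array.from(input, (ch) => ch.codePointAt(0));
  const basic = codePoints.filter((c) => c < 0x80);
  let output = basic.map((c) => String.fromCharCode(c)).join('');
  const b = basic.length;
  let h = b; // number of code points handled so far
  if (b > 0) output += DELIMITER;

  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (h < codePoints.length) {
    // Smallest code point >= n that is still unhandled.
    const m = Math.min(...codePoints.filter((c) => c >= n));
    delta += (m - n) * (h + 1);
    n = m;
    for (const c of codePoints) {
      if (c < n) delta++;
      if (c === n) {
        // Emit delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, h + 1, h === b);
        delta = 0;
        h++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

console.log(punycodeEncode('他们为什么不说中文')); // ihqwcrb4cv8a8dqg056pqjye
console.log(punycodeEncode('3年B組金八先生'));     // 3B-ww4c5e180e575a65lsy2b
```

The two calls at the end use inputs and expected outputs taken from the failure list above, so they can serve as a quick sanity check for any encoder.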

@ghost ghost assigned bnoordhuis Nov 11, 2011

@mathiasbynens

commented Nov 11, 2011

I have a working implementation that passes all unit tests except the Russian (Cyrillic) example string. I suspect that’s a typo in the RFC.

https://github.com/bestiejs/punycode.js

To run the unit tests in Node.js, clone the repository, cd into it, and then run node tests/tests.js.

$ node tests/tests.js 
 PASS - Punycode.utf16.decode
 PASS - Punycode.utf16.encode
 PASS - Punycode.decode
 FAIL - Punycode.encode
    PASS | EQ | ASCII string that breaks the existing rules for host-name labels | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | Vietnamese | 
    PASS | EQ | Spanish | 
    FAIL | EQ | Russian (Cyrillic) | Expected: b1abfaaepdrnnbgefbaDotcwatmq2g4l, Actual: b1abfaaepdrnnbgefbadotcwatmq2g4l
    PASS | EQ | Korean (Hangul syllables) | 
    PASS | EQ | Japanese (kanji and hiragana) | 
    PASS | EQ | Hindi (Devanagari) | 
    PASS | EQ | Hebrew | 
    PASS | EQ | Czech | 
    PASS | EQ | Chinese (traditional) | 
    PASS | EQ | Chinese (simplified) | 
    PASS | EQ | Arabic (Egyptian) | 
    PASS | EQ | long string with both ASCII and non-ASCII characters | 
    PASS | EQ | mix of ASCII and non-ASCII characters | 
    PASS | EQ | multiple non-ASCII characters | 
    PASS | EQ | a single non-ASCII character | 
    PASS | EQ | a single basic code point | 
 PASS - Punycode.toUnicode
 PASS - Punycode.toASCII
----------------------------------------
    PASS: 63  FAIL: 1  TOTAL: 64
    Finished in 20 milliseconds.
----------------------------------------
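The single remaining failure differs from the expected string only in the case of one letter (`D` vs. `d`). RFC 3492 requires decoders to accept Punycode digits in either case, and neither string contains a delimiter, so every character in them is a digit. A small check along these lines can confirm that a mismatch is case-only; the helper name `caseOnlyDiff` is hypothetical and this is just a sketch.

```javascript
const expected = 'b1abfaaepdrnnbgefbaDotcwatmq2g4l';
const actual = 'b1abfaaepdrnnbgefbadotcwatmq2g4l';

// Hypothetical helper: return the positions where two strings differ
// only by letter case, or null if they differ in any other way.
function caseOnlyDiff(a, b) {
  if (a.length !== b.length || a.toLowerCase() !== b.toLowerCase()) {
    return null;
  }
  const positions = [];
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) positions.push(i);
  }
  return positions;
}

console.log(caseOnlyDiff(expected, actual)); // [ 19 ]
```

A non-null result means both strings decode to the same code points, so the mismatch would not change the decoded label.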
@bnoordhuis

Member Author

commented Nov 11, 2011

Fixed in 326b2cb. Thanks, Mathias.
