toASCII returns incorrect result #66

Open
nucks opened this issue Apr 2, 2017 · 5 comments

@nucks

nucks commented Apr 2, 2017

While converting various emoji, I've found that punycode.toASCII very often does not convert them correctly.

What I'd expect:
1. punycode.toASCII( "❄️🙊" ) should equal "xn--tdix719m" where instead it equals "xn--tdiy444e5vxg" (invalid punycode)
2. punycode.toASCII( "👰✉️" ) should equal "xn--4bi6168m" where instead it equals "xn--4biw254ehqwg" (invalid punycode)

I could give many more examples, but I haven't found a pattern in the mistake.

@mathiasbynens
Owner

mathiasbynens commented Apr 2, 2017

What makes you say that’s the expected output, and that the current output is “invalid Punycode”? Could you elaborate please?

Update: Entering e.g. ❄️🙊.la in Chrome’s address bar displays xn--tdix719m.la.

@nucks
Author

nucks commented Apr 2, 2017

Yes, sorry. I've been converting many emoji, and because the Punycode output isn't correct, many of them haven't been displaying correctly, which is what pushed me to investigate further. I checked the results against several conversion sites; the one I trust most is Punycoder. If you convert the strings above there, you will get the results I described.

@mathiasbynens changed the title from "Javascript Conversion Incorrect" to "toASCII returns incorrect result" on Apr 2, 2017
@hashtag2949

Cluster-Nobes has given a networking drive & wifi-drive. Having first error on Nobe-3

@mjethani
Contributor

mjethani commented Sep 5, 2021

This is not really a bug. It's a limitation.

> [...'❄️🙊']
[ '❄', '️', '🙊' ]

The second character is U+FE0F (VARIATION SELECTOR-16), an invisible code point that toASCII encodes like any other character. If you drop it from the input, you get the correct result. Take a look at idna-uts46 for a more detailed explanation.
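
To make the point about U+FE0F concrete, here is a minimal sketch that lists the code points in the input and strips the variation selectors before any Punycode conversion. The helper names `codePoints` and `stripVariationSelectors` are hypothetical (they are not part of punycode.js), and removing only U+FE00..U+FE0F is a simplification, not a full UTS #46 mapping.

```javascript
// Hypothetical helpers for illustration; not part of punycode.js.

// U+FE00..U+FE0F are the variation selectors; U+FE0F requests emoji presentation.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F]/g;

// List the code points of a string in U+XXXX notation.
function codePoints(str) {
  return [...str].map(
    ch => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// Remove variation selectors only (a simplification of UTS #46 processing).
function stripVariationSelectors(str) {
  return str.replace(VARIATION_SELECTORS, '');
}

// The first example from this issue, written with escapes so the
// invisible U+FE0F is explicit.
const input = '\u2744\uFE0F\u{1F64A}';

console.log(codePoints(input));
// [ 'U+2744', 'U+FE0F', 'U+1F64A' ]
console.log(codePoints(stripVariationSelectors(input)));
// [ 'U+2744', 'U+1F64A' ]
```

Passing the stripped string to punycode.toASCII should then match the result reported from Chrome earlier in this thread, since dropping U+FE0F is the only change needed for that input.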

Here's a quick alternative that works in the browser as well as Node.js:

let toASCII = d => d.split('.').map(l => new URL(`ws://${l}`).hostname).join('.');
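
For readers who want to see what that one-liner does step by step, here is an expanded sketch of the same idea. The function name `toASCIIViaURL` is hypothetical, and the empty-label guard is an addition of this sketch. It assumes the global WHATWG `URL` class found in browsers and Node.js. One caveat worth checking before relying on it: the URL parser may interpret a purely numeric label as an IPv4 address, so this is not a general-purpose replacement.

```javascript
// Hypothetical expansion of the one-liner above; assumes the global WHATWG URL class.
function toASCIIViaURL(domain) {
  return domain
    .split('.')
    .map(label => {
      // Assumption of this sketch: keep empty labels (e.g. from a trailing dot)
      // untouched, because `new URL('ws://')` throws on an empty host.
      if (label === '') return '';
      // "ws:" is a special scheme, so the parser runs domain-to-ASCII
      // processing on the host and exposes the result as `hostname`.
      return new URL(`ws://${label}`).hostname;
    })
    .join('.');
}

console.log(toASCIIViaURL('example.com'));           // example.com
console.log(toASCIIViaURL('b\u00FCcher.example'));   // xn--bcher-kva.example
```

Because the conversion is delegated to the platform's URL parser, the output follows whatever IDNA processing that platform implements, which is why it can differ from a plain Punycode encoder.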
