Unicode characters are mangled in JavaScript kata output

Unicode characters in the JavaScript kata output are being mangled.  

Each byte of the [UTF-8 encoding](https://en.wikipedia.org/wiki/UTF-8#Description) seems to be printing as a separate Unicode character.  So the Chinese greeting [`你好`](https://en.wikipedia.org/wiki/Ni_Hao) displays as `ä½ å¥½`.

**_This is very bad for anyone needing more than 7-bit ASCII._**

Now for some examples of what I think may be happening.  This code in a JavaScript kata:

``` javascript
console.log("£");
```

displays as the two characters `Â£` (`"\u00c2\u00a3"`).  The Unicode code point for `£` (`"\u00a3"`) is encoded in UTF-8 as the two bytes `0xc2 0xa3`.  But Codewars apparently treats each byte as a separate code point, `0xc2` and `0xa3`, and re-encodes them to get `Â£`.

This:

``` javascript
console.log("\uffff")
```

is displayed as the three characters `ï¿¿` (`"\u00ef\u00bf\u00bf"`).  The Unicode code point `0xffff` is encoded in UTF-8 as the three bytes `0xef 0xbf 0xbf`.  As above, Codewars then seems to treat `0xef`, `0xbf`, `0xbf` as separate code points and re-encode them to get `ï¿¿`.

I could give as many examples as there are multi-byte UTF-8 encodings, but these suffice to show the pattern for a single character.  Longer strings just repeat the problem: for example, `console.log("£££££");` displays as `Â£Â£Â£Â£Â£`.
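
The pattern can be reproduced locally in Node.js. This is only a sketch of what I suspect is happening, assuming the UTF-8 output bytes are decoded as Latin-1 (ISO-8859-1) somewhere between the runner and the page; I have not confirmed that this is the actual cause. `mangle` and `unmangle` are hypothetical helper names used for illustration, not anything from Codewars.

``` javascript
// Hypothetical helper: simulate the suspected bug by encoding the text
// as UTF-8 and then decoding every byte as its own Latin-1 character.
function mangle(text) {
  const utf8Bytes = Buffer.from(text, "utf8");
  return utf8Bytes.toString("latin1");
}

// Hypothetical helper: reverse the process, turning each Latin-1
// character back into one byte and decoding the bytes as UTF-8.
function unmangle(garbled) {
  return Buffer.from(garbled, "latin1").toString("utf8");
}

console.log(mangle("£") === "\u00c2\u00a3");            // true
console.log(mangle("\uffff") === "\u00ef\u00bf\u00bf"); // true
console.log(unmangle("\u00c2\u00a3") === "£");          // true
```

If this guess is right, the mangled output is still recoverable, which suggests the bytes themselves are intact and only the declared or assumed character encoding is wrong.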

As I said, this seems to be pretty serious for anyone needing Unicode.  

---

Note: I discovered this while completing [Simple Change Machine](http://www.codewars.com/kata/57238766214e4b04b8000011/train/javascript), which uses the pound symbol.

