Reading a file with ascii encoding and with multibyte characters #4413

Closed
gagle opened this Issue Dec 14, 2012 · 1 comment

gagle commented Dec 14, 2012

file: 聵

require("fs").readFile("file", "ascii", function (e, d) {
    console.log(d === "聵") // true
})

How is this possible? 聵 is not an ASCII character; in UTF-8 it is encoded with 3 bytes, 0xE8 0x81 0xB5. What I expect to get is è\u0081µ, because ASCII characters are encoded with a single byte each.
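
For reference, the byte layout described above can be checked directly. This is a minimal sketch assuming a current Node.js release, where `Buffer.from` is available (it did not exist in the 2012 versions this issue concerns); the variable names are illustrative only.

```javascript
// Sketch: inspect how "聵" (U+8075) is stored as UTF-8 bytes.
const ch = "\u8075"; // 聵

// Encode the one-character string as UTF-8.
const bytes = Buffer.from(ch, "utf8");

// Hex view of the encoded bytes and their count.
const hex = bytes.toString("hex");
const byteLength = bytes.length;

console.log(hex);        // e881b5
console.log(byteLength); // 3
```

One JavaScript character maps to three bytes here, which is why a byte-per-character decoding cannot return 聵.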

If I read the file using "binary" encoding, the comparison below prints true, which is the result I expected with "ascii" encoding:

require("fs").readFile ("file", "binary", function (e, d){
    console.log(d === "è\u0081µ") //true
})

Is this result an intentional feature?

EDIT: More info

This is the file content (viewed in the HxD hex editor):

Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  E8 81 B5                                         è.µ

and:

require("fs").readFile("file", function (e, d) {
    console.log(d.toString("ascii") === "聵") // true
    console.log(d.toString("utf8") === "聵") // true
    console.log(d.toString("binary") === "è\u0081µ") // true
    console.log(d) // <Buffer e8 81 b5>
})
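
For comparison, here is a sketch of the same experiment on a current Node.js release, not the version this issue was filed against. It writes the three bytes to a scratch file (the temp directory and file name are hypothetical) and decodes them with each encoding. The "ascii" expectation follows the current Node.js documentation, which says decoding clears the highest bit of each byte; "latin1" is the modern name for "binary".

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file holding the three bytes from the hex dump.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "ascii-issue-"));
const file = path.join(dir, "file");
fs.writeFileSync(file, Buffer.from([0xe8, 0x81, 0xb5]));

// Read without an encoding to get the raw Buffer, then decode three ways.
const raw = fs.readFileSync(file);
const asUtf8 = raw.toString("utf8");     // multibyte decoding
const asLatin1 = raw.toString("latin1"); // one character per byte
const asAscii = raw.toString("ascii");   // high bit cleared per byte

console.log(asUtf8 === "\u8075");                // true: 聵
console.log(asLatin1 === "\u00e8\u0081\u00b5");  // true: è\u0081µ
console.log(asAscii === "h\u00015");             // true: 0x68 0x01 0x35

// Remove the scratch file and directory.
fs.unlinkSync(file);
fs.rmdirSync(dir);
```

Under these assumptions, "ascii" never yields 聵: each byte is decoded on its own, so the result is always three characters long.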

bnoordhuis commented Dec 15, 2012

> Is this result an intentional feature?

No, it's a bug. See #4379 for details.

bnoordhuis closed this Dec 15, 2012
