Tar files in npm registry have incorrectly encoded test filenames #29

tomhughes opened this Issue Feb 25, 2013 · 3 comments



tomhughes commented Feb 25, 2013

The files in the test directory with Unicode names use precomposed codepoints (the base character and accent combined in a single codepoint, i.e. NFC form) when fetched from git, which matches what the test scripts expect.

When the tarball from the npm registry is used, however, the filenames use separate combining characters for the accents (decomposed, i.e. NFD form), which means the tests fail because the names do not match.

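As an example, consider tests/data/Clément, in which the é should be encoded as the single codepoint U+00E9 (UTF-8 bytes c3 a9),

but in the npm tarball is encoded as a plain e followed by the combining acute accent U+0301 (UTF-8 bytes 65 cc 81).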
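
The two encodings can be reproduced with a minimal sketch, assuming Python 3 and its standard `unicodedata` module (neither is part of the original report, and the variable names are illustrative only):

```python
import unicodedata

# The filename written with an explicit escape so the source encoding is unambiguous.
name = "Cl\u00e9ment"

# NFC: base character and accent combined in one codepoint (what git delivers).
nfc_bytes = unicodedata.normalize("NFC", name).encode("utf-8")

# NFD: base character followed by a separate combining accent (what the tarball contains).
nfd_bytes = unicodedata.normalize("NFD", name).encode("utf-8")

def hex_dump(data):
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)

print("NFC:", hex_dump(nfc_bytes))
print("NFD:", hex_dump(nfd_bytes))
```

Both strings render identically on screen, but they differ at the byte level, so a byte-for-byte filename comparison fails.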


springmeyer commented Aug 10, 2013

Any idea how to re-encode correctly? This has baffled me. I might just remove them...


tomhughes commented Aug 10, 2013

Well, I don't really know anything about how the tarballs in the npm registry are created. I'm guessing that there is an npm command that creates and uploads them? Presumably that is using a node implementation of tar rather than the normal tar command, and that implementation is broken in its handling of filename encodings...

A quick fix might be to change the names of the files and then have a test script that renames them before running the tests?
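
One variant of that idea, sketched here in Python purely for illustration: instead of shipping placeholder names, a pre-test step could rename whatever was unpacked to its NFC form. The function name `normalize_filenames` is hypothetical and not something the project provides, and some filesystems (notably on macOS) apply their own normalization, so behaviour there may differ.

```python
import os
import unicodedata

def normalize_filenames(directory, form="NFC"):
    """Rename entries in `directory` whose names are not already in `form`.

    Hypothetical helper: returns a list of (old_name, new_name) pairs
    for the entries that were actually renamed.
    """
    renamed = []
    for name in os.listdir(directory):
        target = unicodedata.normalize(form, name)
        if target == name:
            continue  # already in the expected form
        source_path = os.path.join(directory, name)
        target_path = os.path.join(directory, target)
        if os.path.exists(target_path):
            continue  # do not overwrite an existing entry
        os.rename(source_path, target_path)
        renamed.append((name, target))
    return renamed
```

A test runner would call something like `normalize_filenames("tests/data")` once before the tests start; this only handles a single directory level, which is an assumption about the layout of the test data.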


springmeyer commented Jan 14, 2015

Not seen this problem for a while; I'm assuming something was fixed upstream in npm.
