UTF-8 BOM causes first column value to be corrupted #131

matt-blanchette · 2016-03-22T23:45:09Z

UTF-8 files that start with a BOM cause the first value in the parsed CSV data to be corrupted.

If the first row contains column names,
then the first column can no longer be looked up by name.

Either a code change to strip a possible BOM as the first character or a documentation update would be helpful.
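
For illustration, here is a minimal sketch of the kind of check meant by "strip a possible BOM as the first character". The `stripBom` helper and the sample input are hypothetical and are not taken from the library; the sketch assumes the file has already been decoded as UTF-8, in which case the BOM appears as a single U+FEFF character at the start of the text.

```javascript
// Hypothetical helper, not part of the library discussed in this issue.
// Removes one leading U+FEFF, which is what a UTF-8 BOM decodes to.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Made-up sample input: a BOM followed by a header row and one data row.
const raw = '\uFEFFType,Name\nA,Widget\n';

// Without stripping, the first header name carries the invisible character.
const brokenHeader = raw.split('\n')[0].split(',');
console.log(brokenHeader[0] === 'Type'); // false

// After stripping, the name matches again.
const fixedHeader = stripBom(raw).split('\n')[0].split(',');
console.log(fixedHeader[0] === 'Type'); // true
```

Only a single leading character is examined, so a U+FEFF that appears later in the data is left alone.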

ajmas · 2016-10-01T03:17:16Z

I am observing this in version 2.3.0. In my case the first column should be 'Type', but it is coming out as '<U+FEFF>Type'.

We should probably add some code to handle all the common scenarios, as described here:

matt-blanchette · 2016-10-07T21:16:47Z

I've used the strip-bom-stream module for now to handle this.

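const fs = require('fs');
const stripBomStream = require('strip-bom-stream');
// `csv`, `fileName`, and `options` are assumed to be defined elsewhere.
fs.createReadStream(fileName)
    .pipe(stripBomStream())
    .pipe(csv(options));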
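
For readers who would rather not add a dependency, the same idea can be sketched with Node's built-in `stream.Transform`. This is only a rough illustration under stated assumptions: `createBomStripper` is a hypothetical name, it handles the UTF-8 BOM only (bytes EF BB BF), and it is not the strip-bom-stream implementation.

```javascript
const { Transform } = require('stream');

// Hypothetical sketch: drop a leading UTF-8 BOM (bytes EF BB BF) from a byte stream.
function createBomStripper() {
  const BOM = Buffer.from([0xef, 0xbb, 0xbf]);
  let pending = Buffer.alloc(0); // bytes held back until a decision can be made
  let decided = false;           // true once the start of the stream was inspected

  return new Transform({
    transform(chunk, encoding, callback) {
      if (decided) {
        callback(null, chunk); // past the start: pass data through untouched
        return;
      }
      pending = Buffer.concat([pending, chunk]);
      if (pending.length < BOM.length) {
        callback(); // not enough bytes yet, wait for the next chunk
        return;
      }
      decided = true;
      const hasBom = pending.subarray(0, BOM.length).equals(BOM);
      callback(null, hasBom ? pending.subarray(BOM.length) : pending);
    },
    flush(callback) {
      // The stream ended before three bytes arrived: emit whatever was held.
      if (!decided && pending.length > 0) {
        this.push(pending);
      }
      callback();
    },
  });
}
```

It could take the place of `stripBomStream()` in the pipeline above. Holding back the first bytes covers the case where the BOM is split across two chunks.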

pietercolpaert · 2017-01-27T09:33:15Z

Should this fix be added to the repo or should we always use stripBomStream?

dustinsmith1024 · 2017-01-30T15:49:44Z

2.3.1 is published with the fix.

pietercolpaert mentioned this issue Jan 27, 2017

Fix #131 #170

Merged

dustinsmith1024 closed this as completed in #170 Jan 30, 2017

dustinsmith1024 added a commit that referenced this issue Jan 30, 2017

Merge pull request #170 from pietercolpaert/fix-131

f8e6a4b

Fix #131
