UTF-8 BOM results in whitespace error #97

spikeheap · 2014-10-08T12:17:41Z

In Ruby, a UTF-8 file with a Byte Order Mark (BOM) has the BOM passed through into the content by default, which causes csvlint to report whitespace errors.

Here's an example: http://csvlint.io/validation/543526f36373760fc6020000.

The BOM only needs to be filtered from the first line in the file, e.g.:

row.delete!("\xEF\xBB\xBF")
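To make the "first line only" idea concrete, here is a minimal sketch. It is not csvlint's code: `strip_bom_from_first_line` is a hypothetical helper, and it assumes the lines are already UTF-8 encoded strings. The anchored pattern only removes a BOM at the very start of the first line, whereas `delete!` would drop the character wherever it appears in the row.

```ruby
# Hypothetical helper, not part of csvlint.
# Assumes each line is a UTF-8 encoded String.
UTF8_BOM_PATTERN = /\A\uFEFF/

def strip_bom_from_first_line(lines)
  lines.each_with_index.map do |line, index|
    # Only the first line can carry the BOM; later lines are returned as-is.
    index.zero? ? line.sub(UTF8_BOM_PATTERN, "") : line
  end
end

lines = ["\uFEFFid,name\r\n", "1,Alice\r\n"]
cleaned = strip_bom_from_first_line(lines)
puts cleaned.first.inspect
```

If the input comes from a file you open yourself, Ruby can also drop the BOM at read time via the `"r:BOM|UTF-8"` open mode, which may be simpler than filtering rows afterwards.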


ntkog · 2015-04-16T18:46:06Z

Another workaround is to strip any BOM sequence before the data reaches csvlint.
You can do it with the strip-bom module:

Ex:

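var fs = require('fs');
var stripBom = require('strip-bom');

// `file` and `csvlint` are assumed to be defined elsewhere in the program.
var rs = fs.createReadStream(file);
var csvlintInstance = csvlint();

rs
  // Remove a leading BOM before the data reaches csvlint.
  .pipe(stripBom.stream())
  .pipe(csvlintInstance)
  .on('error', function (errArr) {
    console.log(errArr);
  })
...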

By the way, if you convert a .csv file to UTF-8 directly from Windows Notepad, you'll get a file with a BOM in most cases. You can check it with:

file example.csv
example.csv: UTF-8 Unicode (with BOM) text, with CRLF line terminators
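Where the `file` utility is not available, roughly the same check can be sketched in Ruby by looking at the first three bytes. `utf8_bom?` is a hypothetical helper name, and the temporary file below only stands in for a real `example.csv`.

```ruby
require "tempfile"

UTF8_BOM_BYTES = [0xEF, 0xBB, 0xBF].freeze

# Hypothetical helper: true when the file starts with the UTF-8 BOM bytes.
def utf8_bom?(path)
  head = File.binread(path, 3)
  # An empty file yields nil or an empty string, so guard before comparing.
  !head.nil? && head.bytes == UTF8_BOM_BYTES
end

# Demo with a throwaway file standing in for example.csv.
Tempfile.create("example") do |f|
  f.binmode
  f.write("\xEF\xBB\xBFid,name\r\n".b)
  f.flush
  puts utf8_bom?(f.path)
end
```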

quadrophobiac added fn:Performance i:enhancement a:relocate labels Jul 5, 2017
