UTF-8 BOM results in whitespace error #97

Open
spikeheap opened this issue Oct 8, 2014 · 1 comment

Comments

@spikeheap

UTF-8 files with a Byte Order Mark have the BOM passed through to the content by default in Ruby, and the result is whitespace errors reported by csvlint.

Here's an example: http://csvlint.io/validation/543526f36373760fc6020000.
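The pass-through behaviour can be reproduced with plain Ruby file I/O. The following is a minimal sketch, not csvlint's actual code: the temporary file and its contents are made up for illustration. It also shows Ruby's `bom|utf-8` open mode, which consumes a leading BOM while the file is opened; whether csvlint can adopt that mode for all of its input sources is not established here.

```ruby
require "tempfile"

# Illustrative data only: a tiny CSV that starts with the UTF-8 BOM bytes (EF BB BF).
file = Tempfile.new(["bom_example", ".csv"])
file.binmode
file.write("\xEF\xBB\xBFid,name\r\n1,Ada\r\n".b)
file.close

# Default read: the BOM survives as U+FEFF at the start of the first line.
plain_first_line = File.open(file.path, "r:utf-8") { |f| f.gets }

# "bom|utf-8" asks Ruby to skip a leading BOM, if one is present.
clean_first_line = File.open(file.path, "r:bom|utf-8") { |f| f.gets }

puts plain_first_line.start_with?("\uFEFF")
puts clean_first_line.start_with?("id,name")
```

Only the first read keeps the U+FEFF character in front of the header row.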

The BOM only needs to be filtered from the first line in the file, e.g.:

row.delete!("\xEF\xBB\xBF")
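A slightly stricter variant of the same idea, as a sketch: `delete!` removes every U+FEFF character in the string, whereas anchoring the match with `\A` removes only one at the very start. The helper name `strip_leading_bom` and the sample rows are hypothetical, and the sketch assumes each line is already tagged as UTF-8.

```ruby
# Hypothetical helper: remove a single leading BOM (U+FEFF) and nothing else.
def strip_leading_bom(line)
  line.sub(/\A\uFEFF/, "")
end

# Illustrative rows; only the first line of a file can carry the BOM.
lines = ["\uFEFFid,name\r\n", "1,Ada\r\n"]

cleaned = lines.each_with_index.map do |line, index|
  index.zero? ? strip_leading_bom(line) : line
end

p cleaned
```

Limiting the work to the first line avoids scanning every row of a large file for a character that can only appear once as a BOM.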
@ntkog

ntkog commented Apr 16, 2015

Another workaround is to make sure any BOM sequence is stripped before the data reaches csvlint.
You can do it with the strip-bom module:

Example:

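var fs = require('fs');
var stripBom = require('strip-bom');

// `file` (path to the CSV) and `csvlint` are assumed to be defined elsewhere.
var rs = fs.createReadStream(file);
var csvlintInstance = csvlint();

rs
  .pipe(stripBom.stream()) // drop a leading BOM before the data reaches csvlint
  .pipe(csvlintInstance)
  .on('error', function (errArr) {
    console.log(errArr);
  })
...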

By the way, if you convert a .csv file to UTF-8 directly from Windows Notepad, you'll get a file with a BOM in most cases. You can check it with:

file example.csv
example.csv: UTF-8 Unicode (with BOM) text, with CRLF line terminators
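Where the `file` utility is not available, the same check can be sketched in Ruby by comparing the first three bytes of the file with the UTF-8 BOM bytes. The helper name `utf8_bom?` and the temporary files are hypothetical and exist only for this illustration.

```ruby
require "tempfile"

UTF8_BOM_BYTES = [0xEF, 0xBB, 0xBF].freeze

# Hypothetical helper: true when the file starts with the UTF-8 BOM bytes.
def utf8_bom?(path)
  # Read at most three raw bytes; to_s covers an empty file.
  head = File.binread(path, 3)
  head.to_s.bytes == UTF8_BOM_BYTES
end

# Illustrative files: one saved with a BOM, one without.
with_bom = Tempfile.new(["with_bom", ".csv"])
with_bom.binmode
with_bom.write("\xEF\xBB\xBFid,name\r\n".b)
with_bom.close

without_bom = Tempfile.new(["without_bom", ".csv"])
without_bom.binmode
without_bom.write("id,name\r\n")
without_bom.close

puts utf8_bom?(with_bom.path)
puts utf8_bom?(without_bom.path)
```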
