Description
I'm using node-csv to pluck data from large CSV files. On extremely large datasets (>500MB), parsing fails with the error shown below. I'm pretty sure it's related to incorrect handling of a buffer boundary, because the location of the error shifts if I change the preceding data. For instance, this particular failure disappeared when I shifted the 100,000-line window by 1 line (using head to pipe gunzip output to node).
It doesn't seem to be strictly size-based: I have been able to parse up to 250MB before seeing this error, yet I can reproduce it with subsets as small as 30MB by searching for ones that exhibit this behaviour. That's why I think it's likely due to a buffer boundary falling inside a quoted field.
All data in one of these subsets is parseable if divided into small batches, so I'm certain it's not a syntax error.
Unfortunately, the dataset consists of logs that can't be shared for security reasons, and I haven't had a chance to try constructing a random dataset exhibiting this behaviour yet.
events.js:292
throw er; // Unhandled 'error' event
^
CsvError: Invalid Closing Quote: got " " at line 90262 instead of delimiter, row delimiter, trimable character (if activated) or comment
at Parser.__parse (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/csv-parse/lib/index.js:529:17)
at Parser._transform (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/csv-parse/lib/index.js:403:22)
at Parser.Transform._read (_stream_transform.js:191:10)
at Parser.Transform._write (_stream_transform.js:179:12)
at doWrite (_stream_writable.js:403:12)
at writeOrBuffer (_stream_writable.js:387:5)
at Parser.Writable.write (_stream_writable.js:318:11)
at /Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:640:33
at Stream.s._send (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1560:9)
at Stream.write (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1661:18)
Emitted 'error' event on Stream instance at:
at Stream._send (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:998:18)
at push (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1526:19)
at /Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:2212:13
at Stream.s._send (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1560:9)
at Stream.write (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1658:18)
at Stream._send (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:984:26)
at push (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1526:19)
at /Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:2212:13
at Stream.s._send (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1560:9)
at Stream.write (/Users/USERNAME/Documents/src/tools/mobile-logspam/node_modules/highland/lib/index.js:1658:18) {
code: 'CSV_INVALID_CLOSING_QUOTE',
column: 'log',
empty_lines: 0,
header: false,
index: 6,
invalid_field_length: 0,
quoting: false,
lines: 90262,
records: 90260
}
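Since the original logs can't be shared, a synthetic reproduction might be built along the following lines. This is only a rough sketch, not a confirmed reproduction: it generates a deterministic CSV whose quoted fields contain spaces, commas, doubled quotes and newlines, then re-splits the same bytes at different chunk sizes so that chunk boundaries land inside quoted fields. The helper names (`seededRandom`, `quoteField`, `generateCsv`, `splitAt`) and the field alphabet are assumptions made for illustration and are not part of node-csv.

```javascript
// Small seeded PRNG so that a failing seed can be replayed exactly.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Quote a field CSV-style: wrap in double quotes and double any inner quote.
function quoteField(value) {
  return '"' + value.replace(/"/g, '""') + '"';
}

// Pieces chosen to stress quoted-field handling (hypothetical alphabet).
const PIECES = ['a', 'b', ' ', ',', '"', '\n', 'log'];

// Build a random field of 1 to 20 pieces.
function randomField(rand) {
  const length = 1 + Math.floor(rand() * 20);
  let field = '';
  for (let i = 0; i < length; i++) {
    field += PIECES[Math.floor(rand() * PIECES.length)];
  }
  return field;
}

// Generate `rows` records of `columns` quoted fields as a single Buffer.
function generateCsv(seed, rows, columns) {
  const rand = seededRandom(seed);
  const records = [];
  for (let r = 0; r < rows; r++) {
    const fields = [];
    for (let c = 0; c < columns; c++) {
      fields.push(quoteField(randomField(rand)));
    }
    records.push(fields.join(','));
  }
  return Buffer.from(records.join('\n') + '\n', 'utf8');
}

// Split a Buffer into consecutive chunks of at most `chunkSize` bytes.
function splitAt(buffer, chunkSize) {
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    chunks.push(buffer.subarray(offset, offset + chunkSize));
  }
  return chunks;
}
```

Each chunk could then be passed to the parser stream with `write()`, followed by `end()`, repeating the run for several chunk sizes. If identical bytes parse cleanly at one chunk size and fail at another, that would support the buffer-boundary explanation; if they always parse cleanly, the trigger is probably something this generator doesn't cover.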