Error when comment character contained within CSV data #325
Comments
Hi @missinglink, supporting a regular expression is impossible. It would have to apply to the whole record, but to know what a record is, we need to parse it, because a record separator could be escaped or present inside a quoted field. However, with a comment, attempting to parse the record will legitimately end in an error. I am not a big fan of introducing a new option, but I don't have much else to propose.
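The point about record boundaries can be illustrated with a small sketch. This is a simplified model, not the csv-parse parser: the function names `naiveLines` and `quoteAwareRecords` are hypothetical, only double quotes and `\n` delimiters are handled, and the sample data is made up.

```javascript
// Naive approach: treat every physical line as a record.
function naiveLines(text) {
  return text.split('\n').filter((line) => line.length > 0);
}

// Quote-aware approach: a newline only ends a record outside of quotes.
// Simplification: an escaped quote ("") toggles the state twice, so it cancels out.
function quoteAwareRecords(text) {
  const records = [];
  let current = '';
  let inQuotes = false;
  for (const ch of text) {
    if (ch === '"') inQuotes = !inQuotes;
    if (ch === '\n' && !inQuotes) {
      if (current.length > 0) records.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.length > 0) records.push(current);
  return records;
}

// The second record contains a quoted field spanning two lines,
// and the continuation line happens to start with '#'.
const input = 'id,note\n1,"first line\n# not a comment"\n# a real comment\n';

const lines = naiveLines(input);
const records = quoteAwareRecords(input);

// A pattern such as /^#/ matches two "lines" but only one real record.
const commentLines = lines.filter((l) => /^#/.test(l)).length;
const commentRecords = records.filter((r) => /^#/.test(r)).length;
console.log({ commentLines, commentRecords });
```

Applying `^#` to physical lines would wrongly discard part of a quoted field, which is why the pattern can only be evaluated once the parser already knows where each record starts.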
* chore: latest dependencies
* fix: uncaught errors with large stream chunks (fix adaltas#386)
* chore(release): publish - csv-demo-browser@0.1.6 - csv-demo-cjs@0.2.4 - csv-demo-eslint@0.1.10 - csv-demo-esm@0.0.18 - csv-issues-cjs@0.1.5 - csv-issues-esm@0.0.9 - csv-demo-ts-moduleresolution-node16-cjs@0.2.4 - csv-demo-ts-module-node16@0.2.4 - csv-demo-webpack-ts@0.1.6 - csv-demo-webpack@0.1.8 - csv-generate@4.2.3 - csv-parse@5.3.7 - csv-stringify@6.3.1 - csv@6.2.9 - stream-transform@3.2.3
* test(csv-stringify): fix legacy
* chore(release): publish - csv-demo-browser@0.1.7 - csv-demo-cjs@0.2.5 - csv-demo-eslint@0.1.11 - csv-demo-esm@0.0.19 - csv-issues-cjs@0.1.6 - csv-issues-esm@0.0.10 - csv-demo-ts-moduleresolution-node16-cjs@0.2.5 - csv-demo-ts-module-node16@0.2.5 - csv-demo-webpack-ts@0.1.7 - csv-demo-webpack@0.1.9 - csv-generate@4.2.4 - csv-parse@5.3.8 - csv-stringify@6.3.2 - csv@6.2.10 - stream-transform@3.2.4
* build: remove trailing slash in home url
* chore: latest dependencies
* fix(csv): fixed CJS types under modern `modernResolution` options (adaltas#388)
* fix(csv): remove ts files in cjs dist
* chore(release): publish - csv-demo-browser@0.1.8 - csv-demo-cjs@0.2.6 - csv-demo-eslint@0.1.12 - csv-demo-esm@0.0.20 - csv-issues-cjs@0.1.7 - csv-issues-esm@0.0.11 - csv-demo-ts-moduleresolution-node16-cjs@0.2.6 - csv-demo-ts-module-node16@0.2.6 - csv-demo-webpack-ts@0.1.8 - csv-demo-webpack@0.1.10 - csv-generate@4.2.5 - csv-parse@5.3.9 - csv-stringify@6.3.3 - csv@6.2.11 - stream-transform@3.2.5
* docs: minor upercase modification
* chore: latest dependencies
* chore(release): publish - csv-demo-browser@0.1.9 - csv-demo-cjs@0.2.7 - csv-demo-eslint@0.1.13 - csv-demo-esm@0.0.21 - csv-issues-cjs@0.1.8 - csv-issues-esm@0.0.12 - csv-demo-ts-moduleresolution-node16-cjs@0.2.7 - csv-demo-ts-module-node16@0.2.7 - csv-demo-webpack-ts@0.1.9 - csv-demo-webpack@0.1.11 - csv-generate@4.2.6 - csv-parse@5.3.10 - csv-stringify@6.3.4 - csv@6.2.12 - stream-transform@3.2.6
* feat: add unicode chars to formula escape (adaltas#387)
* fix(csv-stringify): use switch in formula escaping
* fix(csv-stringify): add unicode character equivalents in formula sanitization
* chore: update tests
* docs(csv-stringify): escape formulas references
* chore(release): publish - csv-demo-browser@0.1.10 - csv-demo-cjs@0.2.8 - csv-demo-eslint@0.1.14 - csv-demo-esm@0.0.22 - csv-issues-cjs@0.1.9 - csv-issues-esm@0.0.13 - csv-demo-ts-moduleresolution-node16-cjs@0.2.8 - csv-demo-ts-module-node16@0.2.8 - csv-demo-webpack-ts@0.1.10 - csv-demo-webpack@0.1.12 - csv-stringify@6.4.0 - csv@6.3.0
* feat(csv-parse): add `columns` property in `Info` object type (adaltas#390)
* fix(ts): Add `columns` property in `Info` object type
* Add disabled options to columns type
* build(csv-parse): build and write test after info ts definition
* chore(release): publish - csv-demo-browser@0.1.11 - csv-demo-cjs@0.2.9 - csv-demo-esm@0.0.23 - csv-issues-cjs@0.1.10 - csv-issues-esm@0.0.14 - csv-demo-ts-moduleresolution-node16-cjs@0.2.9 - csv-demo-ts-module-node16@0.2.9 - csv-demo-webpack-ts@0.1.11 - csv-demo-webpack@0.1.13 - csv-parse@5.4.0 - csv@6.3.1
* docs: update build badge urls
* docs(csv-generate): comment indentation in samples
* refactor(csv-issues-cjs): code format
* refactor(csv-issues-cjs): remove unused arguments
* test(csv-issues-cjs): fix stdout maxBuffer length exceeded
* test(csv-issues-esm): use spawn instead of exec
* fix: commonjs types, run tsc and lint to validate changes (adaltas#397)
* fix: types weren't working for commonjs. Run tsc and lint to validate changes
* chore: needs to work on linux and BSD
* chore: latest dependencies
* chore(release): publish - csv-demo-browser@0.1.12 - csv-demo-cjs@0.2.10 - csv-demo-eslint@0.1.15 - csv-demo-esm@0.0.24 - csv-issues-cjs@0.1.11 - csv-issues-esm@0.0.15 - csv-demo-ts-moduleresolution-node16-cjs@0.2.10 - csv-demo-ts-module-node16@0.2.10 - csv-demo-webpack-ts@0.1.12 - csv-demo-webpack@0.1.14 - csv-generate@4.2.7 - csv-parse@5.4.1 - csv-stringify@6.4.1 - csv@6.3.2 - stream-transform@3.2.7
* feat(csv-issues-cjs): 399 issue
* fix(csv-demo-ts-cjs-node16): upgrade module definition after latest typescript
* feat(csv-parse): new comment_no_infix option (fix adaltas#325)
* test(csv-issues-esm): reproduce issue adaltas#391
* refactor(csv-stringify): rename variable in sample
* test(csv-issues-cjs): reproduce issue 327
* chore(release): publish - csv-demo-browser@0.1.13 - csv-demo-cjs@0.2.11 - csv-demo-eslint@0.1.16 - csv-demo-esm@0.0.25 - csv-issues-cjs@0.2.0 - csv-issues-esm@0.0.16 - csv-demo-ts-cjs-node16@0.2.11 - csv-demo-ts-module-node16@0.2.11 - csv-demo-webpack-ts@0.1.13 - csv-demo-webpack@0.1.15 - csv-generate@4.2.8 - csv-parse@5.5.0 - csv-stringify@6.4.2 - csv@6.3.3 - stream-transform@3.2.8
* docs(csv-parse): comment_no_infix sample
---------
Co-authored-by: David Worms <david@adaltas.com>
Co-authored-by: Petter <petter@petterhaggholm.net>
Co-authored-by: Mateusz Burzyński <mateuszburzynski@gmail.com>
Co-authored-by: Tom Emelko <tom.emelko@gmail.com>
Co-authored-by: Elia Maino <eliamaino@gmail.com>
Co-authored-by: David Tanner <darthtanner@gmail.com>
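The commit list above includes `feat(csv-parse): new comment_no_infix option (fix adaltas#325)`, followed by a release that publishes csv-parse@5.5.0. A minimal sketch of an options object using it is shown here; the usage in the trailing comment assumes the `csv-parse/sync` entry point and is not executed, and the exact behaviour should be checked against the csv-parse documentation.

```javascript
// Options intended for csv-parse >= 5.5.0.
const options = {
  comment: '#',           // string that introduces a comment
  comment_no_infix: true, // only honour it at the start of a line, not inside a record
};

// Assumed usage, not run in this sketch:
//   import { parse } from 'csv-parse/sync';
//   const records = parse(input, options);
console.log(options);
```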
Summary
Hi 👋 thanks for the great lib!
We are using the option `{ "comment": "#" }` to remove a header section from the CSV file which contains multiple lines beginning with '#' (as per bash syntax).
Motivation
The issue we face is that the hash (`#`) character may also exist as a valid character within the body of some rows, which results in a fatal columns mismatch error.
For example:
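A hypothetical input of this kind can be sketched as follows. The data is made up for illustration, and `splitWithInfixComment` is a deliberately simplified model of infix comment handling (it ignores quoting), not the actual csv-parse code.

```javascript
// Simplified model: drop everything from the comment character onwards,
// then split what remains on commas. Lines that become empty are skipped.
function splitWithInfixComment(line, comment = '#') {
  const at = line.indexOf(comment);
  const kept = at === -1 ? line : line.slice(0, at);
  return kept === '' ? null : kept.split(',');
}

const sample = [
  '# generated by export tool',   // header comment, meant to be skipped
  'id,address,city',
  '1,Flat #4 High Street,London', // here '#' is legitimate data
];

const rows = sample
  .map((line) => splitWithInfixComment(line))
  .filter((row) => row !== null);

// The header row keeps three columns, the data row is cut short.
const widths = rows.map((row) => row.length);
console.log(widths);
```

The data row loses everything after `Flat `, so it no longer has the same number of columns as the header, which is the kind of mismatch described above.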
Alternative
My understanding of the documentation "Treat all the characters after this one as a comment" is that currently both infix and prefix matching are supported, which makes sense for lines like this: `a,b,c # this is a comment`.
In my case at least I was caught out by this: I assumed that the match was prefix only, expecting it to apply only to lines which begin with the `comment` string (as per bash).
Draft
What I'd love to have is the ability to control whether this is applied as an infix match or only as a prefix match.
For example, if I were able to supply a regular expression I could use `^#` to 'anchor' the string at the beginning of the row.
Additional context
We're using the `stream` API. I wasn't able to find the exact places in the code where this is implemented, but presumably it is handled in a streaming fashion and so may or may not have access to the newline, depending on where in the parser it is implemented.
If you'd like to point me to the places in the code which are relevant I might be able to draft a PR, although we'd need to discuss how best to change the JS API to allow users to configure whether infix matching is enabled or not.
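The concern about access to the newline when data arrives in chunks can be sketched with a small stateful filter. `PrefixCommentFilter` is a hypothetical helper, not part of csv-parse. It is line-based, so a quoted field containing a newline followed by `#` would be misread; it only illustrates how a prefix-only match has to carry an incomplete line across chunk boundaries.

```javascript
// Hypothetical helper: drops lines that start with the comment string,
// processing the input one chunk at a time.
class PrefixCommentFilter {
  constructor(comment = '#') {
    this.comment = comment;
    this.pending = ''; // incomplete trailing line carried over between chunks
  }

  // Accepts one chunk of text and returns the complete non-comment lines.
  push(chunk) {
    const parts = (this.pending + chunk).split('\n');
    this.pending = parts.pop(); // the last part has not seen its newline yet
    return parts
      .filter((line) => !line.startsWith(this.comment))
      .map((line) => line + '\n')
      .join('');
  }

  // Emits whatever is left once the input has ended.
  flush() {
    const rest = this.pending;
    this.pending = '';
    return rest.startsWith(this.comment) ? '' : rest;
  }
}

// Chunk boundaries fall in the middle of lines on purpose.
const filter = new PrefixCommentFilter('#');
const chunks = ['# header li', 'ne\nid,name\n1,Fl', 'at #4\n# trailing'];
const output = chunks.map((c) => filter.push(c)).join('') + filter.flush();
console.log(JSON.stringify(output));
```

The decision to drop a line is deferred until its newline has arrived, so a `#` in the middle of `1,Flat #4` is left alone even though the chunk boundary falls inside that line.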