-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Catch CSV parse errors and add validation notices #874
Conversation
If a CSV file failed to parse, there will be a clear user-visible notice, e.g.: ``` { "code": "csv_parsing_failed", "severity": "ERROR", "totalNotices": 1, "notices": [ { "filename": "stops.txt", "charIndex": 278201, "columnIndex": 2, "lineIndex": 2148, "message": "Length of parsed input (4097) exceeds the maximum number of characters defined in your parser settings (4096).", "content": "..." } ] } ```
Note that Univocity provides ExpandingCharAppender that can expand up to (Integer.MAX_VALUE - 8). We may switch to it in a separate PR if there is a use case for that (which I doubt because a CSV value longer than 4096 characters looks invalid to me). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for working on that @aababilov! Could you update RULES.md
with this new notice description please?
Note that Univocity provides ExpandingCharAppender that can expand up to (Integer.MAX_VALUE - 8). We may switch to it in a separate PR if there is a use case for that (which I doubt because a CSV value longer than 4096 characters looks invalid to me).
@MobilityData/transit-specs could the GTFS specification add a requirement? "each cell value must be shorter than 4096 characters"? Unless there is a use case for as @aababilov suggested.
Also @maximearmstrong you should take a look, see how his new notice code impacts the MobilityDatabase. |
Done.
Good question. This requires an extra PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Issue created to keep track https://github.com/MobilityData/gtfs-tasks/issues/118 |
If a CSV file failed to parse, there will be a clear user-visible
notice, e.g.: