1052 Simplify the results of parse-csv #1066
👍 I have various comments…
- I liked the record names proposed in parse-csv() - simplify output #1052 (`get`, `columns`, `column-index`),
- The type of `column-index` should be `map(xs:string, xs:integer*)`,
- We should get rid of the newline (by dropping `row-separator` or reducing CRLF/CR/LF to LF),
…but it may be better to create a separate issue for it.
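To illustrate the second point, here is a minimal Python sketch of one reading of `map(xs:string, xs:integer*)`: each column name maps to every position at which it occurs, so a repeated header name keeps all of its positions. The function name `build_column_index` and the 1-based numbering are assumptions made for illustration, not taken from the spec.

```python
def build_column_index(columns):
    """Map each column name to the list of positions where it occurs.

    Positions are 1-based here, which is an assumption for this sketch.
    """
    index = {}
    for position, name in enumerate(columns, start=1):
        # setdefault creates the list the first time a name is seen
        index.setdefault(name, []).append(position)
    return index


# A header with a repeated name keeps both positions.
column_index = build_column_index(["id", "name", "id"])
```

With a plain `xs:integer` value type, the second `id` column would either be lost or overwrite the first.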
Revised to address the comments. Thanks. I have also addressed issue #1044 on the row delimiter. This is now simplified to a single character, defaulting to newline; line ending normalization must be done first. Note that this means other line endings cannot be retained if they appear within quoted data; this seems a price worth paying to lose some complexity. Fix issue #1044.
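A rough Python sketch of the consequence described above, assuming normalization means reducing CRLF and CR to LF before rows are split on a single-character delimiter. The helper names `normalize_line_endings` and `split_rows` are hypothetical and are not part of the spec.

```python
def normalize_line_endings(text):
    """Reduce CRLF and bare CR to LF (assumed normalization step)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_rows(text, row_delimiter="\n"):
    """Split text into rows on a single-character delimiter.

    Delimiters inside double-quoted fields do not end a row.
    """
    if len(row_delimiter) != 1:
        raise ValueError("row delimiter must be a single character")
    rows, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            # An escaped quote ("") toggles twice, leaving the state unchanged.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == row_delimiter and not in_quotes:
            rows.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        rows.append("".join(current))
    return rows


raw = 'a,b\r\n"x\r\ny",z\r'
rows = split_rows(normalize_line_endings(raw))
```

The CRLF inside the quoted field comes out as a plain LF, which is the loss of information the comment accepts in exchange for a simpler option.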
Note, I renamed the option "column-names" to "header" as this seems more appropriate for the simple yes/no case, and avoids confusion with the names of entries in the result map. I'm going to propose a couple more changes.
I like it.
I wonder whether we should add this in the
+1
I would really be in favor of dropping this option completely. Apache’s CSVFormat has a “record separator”, but it’s restricted to
e855d2f to 8e8b6ce
@fidothe My apologies for being more insistent in the meeting than I had planned. I fully agree that users must be able to process CSV input, no matter which newlines are used; I just think we should take charge of the normalization automatically.
Changes parse-csv to deliver the results in a simpler format:
(a) the result structure is less deeply nested: one record with four entries
(b) the actual data is delivered as a sequence of arrays of strings, closely aligned with the result of `csv-to-arrays`
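As a rough picture of the flattened shape, the following Python sketch returns one dictionary with four entries. The entry names `columns`, `column-index` and `get` come from the discussion of #1052; the name `rows` and the exact behaviour of `get` (1-based row number, column name, empty string for a missing field) are assumptions for illustration only. Python's `csv` module stands in for the parser.

```python
import csv
import io


def parse_csv_sketch(text, header=True):
    """Return a flat result: one record (dict) with four entries."""
    # Data rows as a list of lists of strings, one list per row.
    rows = [list(r) for r in csv.reader(io.StringIO(text))]
    columns = rows.pop(0) if header and rows else []

    # Column name -> 1-based positions (assumed numbering).
    column_index = {}
    for position, name in enumerate(columns, start=1):
        column_index.setdefault(name, []).append(position)

    def get(row, column):
        """Field at a 1-based row number and column name, "" if absent."""
        position = column_index[column][0]  # raises KeyError if unknown
        fields = rows[row - 1]
        return fields[position - 1] if position <= len(fields) else ""

    return {
        "columns": columns,
        "column-index": column_index,
        "rows": rows,  # hypothetical entry name
        "get": get,
    }


result = parse_csv_sketch("name,city\nAda,London\nAlan,Wilmslow\n")
```

Here `result["rows"]` is a plain list of string lists, so code written against a `csv-to-arrays`-style result can consume it directly.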
The rules in the spec have also been rearranged to reflect this, so the rules are now organised according to the values delivered for each of these four fields.
The examples in the spec are changed to reflect the new output format; in addition they have been editorially reorganized so each example is more self-contained, avoiding the need for extensive scrolling to find the values of variables referenced in each example.
Fix issue #1052