Functions for splitting a sequence (or array) based on predicate matching #149
Comments
What about just adding to the two functions a boolean argument: $inclusive as xs:boolean?
Or having "inclusive" or "exclusive" in the name of the function:
items-to-inclusive()
items-to-exclusive()
items-from-inclusive()
items-from-exclusive() |
The problem with boolean arguments is that someone reading the code `items-to($input, ->{boolean(self::h2)}, true())` has to either remember or read up on what the semantics of the third argument are. There's almost a case for using a string argument that must be set to "inclusive" or "exclusive".
These suggestions are all possible, but I think my preference is for having four functions rather than two functions with options.
There's always a fine balance between making the semantics of a function evident from its name, and keeping it short. We do know from experience that naming is really important; there's an awful lot of incorrect XPath code written because people jump to wrong conclusions about what contains() does.
Michael Kay
On 21 Sep 2022, at 01:51, dnovatchev wrote:
What about just adding to the two functions a boolean argument: $inclusive as xs:boolean ?
Or having "inclusive" or "exclusive" in the name of the function:
items-to-inclusive()
items-to-exclusive()
items-from-inclusive()
items-from-exclusive()
|
I’d be happy to see only 2 functions (items-from, items-to), as it feels like overkill to have 4 functions that are pretty similar. Alternatively, we might manage to define a single function with additional options. I believe this has been discussed before; I just can't find the sources. |
Maybe |
I've pushed a proposal that uses the function names
|
I’ll add the examples from #80 (comment), and some more, to document possible alternatives:
The common prolog for all queries:
declare variable $INPUT := 1 to 10000;
The function
items-starting-where($INPUT, function($item) { $item <= 5000 })
…could also be realized with
while(
  function($seq) { head($seq) <= 5000 },
  function($seq) { tail($seq) },
  $INPUT
)
Or, taking advantage of various new proposals:
while(=> { head(.) <= 5000 }, => { tail(.) }, $INPUT)
The alternative writing of
items-starting-where(1 to 10000, -> { . <= 5000 })
while(=> { foot(.) >= 5000 }, => { init(.) }, $INPUT)
If we decide to add a predicate for aborting a loop to
items-ending-where($INPUT, -> { . = 5 })
fold-left($INPUT, (), ->($seq, $curr) { $seq, $curr }, => { foot(.) = 5 })
items-starting-where($INPUT, -> { . = 5 })
fold-right($INPUT, (), ->($curr, $seq) { $curr, $seq }, => { foot(.) = 5 })
With
let $pos := head(index-where($INPUT, -> { . = 5 }))
return subsequence($INPUT, 1, $pos)
If we add a positional argument to
let $pos := head(for-each($INPUT, ->($item, $pos) { $pos[$item = 5] }))
return subsequence($INPUT, 1, $pos) |
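The fold-left variant with an extra predicate for aborting the loop can be modelled outside XPath. The following is only a rough Python sketch of that idea, not part of any proposal: the name fold_left_until and the choice to test the stop predicate after each step are assumptions made for illustration.

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")
T = TypeVar("T")


def fold_left_until(
    items: Iterable[T],
    zero: A,
    step: Callable[[A, T], A],
    stop: Callable[[A], bool],
) -> A:
    """Left fold that aborts as soon as stop(accumulator) is true.

    Hypothetical helper: the stop predicate is checked after each step,
    which is an assumption about how an aborting fold would behave.
    """
    acc = zero
    for item in items:
        acc = step(acc, item)
        if stop(acc):
            break
    return acc


# Models: fold-left($INPUT, (), ->($seq, $curr) { $seq, $curr }, => { foot(.) = 5 })
INPUT = range(1, 10001)
result = fold_left_until(
    INPUT,
    [],
    lambda seq, curr: seq + [curr],   # append the current item
    lambda seq: seq[-1] == 5,         # stop once the last item is 5
)
```

Under these assumptions the fold stops after the fifth item, so only the items up to and including the first match are ever visited.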
I love those people who contribute ideas if things have more or less been finalized. Still… I did some research on how common challenges on lists and arrays are tackled in other programming languages, and noticed there's quite a bunch of languages today that come with |
I'm all for reuse rather than reinvention, but unless I'm being more than usually thick, |
It's true, the functions are not equivalent. I'm just wondering if our requirements are really that different from those of Scala, Python, C#/F#, Kotlin, Java, and other languages? The main difference I can see is that we have sequences, but apart from some specific and cool features such as implicit flattening, the data structure pretty much resembles lists and arrays. |
Accepted and resolved (#199 (comment)) |
fn:items-(until|from) → fn:items-(ending|starting)-where. qt4cg/qtspecs#149
This is concerned with use cases like "How do I select all the paragraphs before the first H2?" or "How do I select items between and ?".
Currently in the draft spec we have proposals for:
range-from($input, $predicate): Returns a sequence containing items from an input sequence, starting with the first item that matches a supplied predicate.
range-to($input, $predicate): Returns a sequence containing items from an input sequence, ending with the first item that matches a supplied predicate.
These both include the matching item, on the theory that it's easier to drop it if it's not wanted than to add it if it's needed.
I've also proposed (as an alternative) a family of four functions, items-before, items-to, items-from, and items-after, giving the four combinations of taking the subsequence before/after the first match of the predicate, and including or not including the matched item.
It's worth pointing out that these can all be defined in terms of index-where. For example range-to (assuming at least one item matches the predicate) is subsequence($input, 1, head(index-where($input, $predicate))).
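To make the four combinations concrete, here is a minimal Python model of the family built on an index-where helper. It is a sketch only: positions are 1-based to mirror XPath, and the behaviour when nothing matches (the "before"/"to" variants return the whole input, the "from"/"after" variants return nothing) is an assumption, not something settled in this thread.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def index_where(items: Sequence[T], predicate: Callable[[T], bool]) -> List[int]:
    """Return the 1-based positions of all items matching the predicate."""
    return [pos for pos, item in enumerate(items, start=1) if predicate(item)]


def first_match(items: Sequence[T], predicate: Callable[[T], bool]) -> Optional[int]:
    """1-based position of the first match, or None if nothing matches."""
    positions = index_where(items, predicate)
    return positions[0] if positions else None


def items_before(items: Sequence[T], predicate: Callable[[T], bool]) -> List[T]:
    """Items preceding the first match, excluding it."""
    pos = first_match(items, predicate)
    return list(items) if pos is None else list(items[: pos - 1])


def items_to(items: Sequence[T], predicate: Callable[[T], bool]) -> List[T]:
    """Items up to the first match, including it."""
    pos = first_match(items, predicate)
    return list(items) if pos is None else list(items[:pos])


def items_from(items: Sequence[T], predicate: Callable[[T], bool]) -> List[T]:
    """Items starting at the first match, including it."""
    pos = first_match(items, predicate)
    return [] if pos is None else list(items[pos - 1:])


def items_after(items: Sequence[T], predicate: Callable[[T], bool]) -> List[T]:
    """Items following the first match, excluding it."""
    pos = first_match(items, predicate)
    return [] if pos is None else list(items[pos:])


# "All the paragraphs before the first H2", modelled with strings.
doc = ["p1", "p2", "h2", "p3", "h2", "p4"]
before_first_h2 = items_before(doc, lambda x: x == "h2")
```

Only the first match is treated as special; the second "h2" is just another item in the result of the "from" and "after" variants.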
These functions all treat the first match of the predicate as special: they partition the sequence before or after the first item that matches the predicate. An alternative, inspired by XSLT's for-each-group group-ending|starting-with, would be to partition the sequence breaking immediately before or after every item that matches the predicate:
But these logically return a sequence of sequences, which would typically be presented either as an array of sequences or a sequence of arrays, neither of which is ideal. (An alternative would be to return a sequence of arity-0 functions)
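The partition-at-every-match alternative can be sketched in Python, where nested lists stand in for the "array of sequences" result. The function names split_starting_where and split_ending_where are hypothetical and chosen only for this illustration; they model the group-starting-with and group-ending-with behaviour described above.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def split_starting_where(
    items: Iterable[T], predicate: Callable[[T], bool]
) -> List[List[T]]:
    """Start a new group immediately before every matching item."""
    groups: List[List[T]] = []
    for item in items:
        # Open a new group on a match, or for the very first item.
        if predicate(item) or not groups:
            groups.append([])
        groups[-1].append(item)
    return groups


def split_ending_where(
    items: Iterable[T], predicate: Callable[[T], bool]
) -> List[List[T]]:
    """End the current group immediately after every matching item."""
    groups: List[List[T]] = []
    current: List[T] = []
    for item in items:
        current.append(item)
        if predicate(item):
            groups.append(current)
            current = []
    if current:  # trailing items after the last match
        groups.append(current)
    return groups


doc = ["p1", "h2", "p2", "p3", "h2", "p4"]
sections = split_starting_where(doc, lambda x: x == "h2")
```

Python lists nest freely, which hides the difficulty raised here: XDM sequences flatten, so the equivalent XPath result needs arrays or some other wrapper.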
Having reviewed the options, I think my preference remains having a family of four functions which I have called items-before, items-to, items-from, items-after. But I'm certainly open to other options. The logical names would probably be subsequence-before etc, but that's a bit of a mouthful.
Whatever family of functions we decide upon, there's logically a requirement to offer the same for arrays.
Michael Kay