[XPath] Introduce the lookup operator for sequences #50
Comments
Actually both Saxon and BaseX do allow |
@martin-honnen Based on the update, do you have any remaining questions? |
The proposal is appealing. I guess the extension only makes sense for Postfix Lookups, although it may be more consistent to define the same rules for Unary Lookups. As typing may not be strict, we should only check the first item of the left-hand operand, as is done, for example, when computing the Effective Boolean Value: “…a sequence whose first item is a node, fn:boolean returns true.” (https://www.w3.org/TR/xquery-31/#id-ebv):
(: the test is successful, although the second item of the test is an integer :)
let $input := (1 to 5)
let $test := (<a/>, 1)
return $input[$test]
So I would suggest rephrasing your rule to:
This would generally be cheaper (in particular if the input is streamed or processed in an iterative manner). It would then still be possible to evaluate…
let $left := ($array1, $array2, ..., 1)
let $right := 1
return head( $left?($right) )
…and skip evaluation after the first result.
A similar question arises with the type of the right operand. The following filter expression is legal (but it may raise an error if the position is compared with the string):
(1 to 5)[position() = (1 to 10, 'a')]
So it would be consistent to make this legal as well:
(1 to 5)?((1 to 10, 'a'))
And we may need to consider usability aspects: People might forget the question mark, and might wonder why |
So |
Oh, you are right. So I think the following expressions would be equivalent then (provided that all position values are numbers)?
(10, 20, 30) ? (2, 3, 1, 1, 2),
for $i in (2, 3, 1, 1, 2)
return (10, 20, 30)[$i]
Array lookups can be rewritten as follows:
[10, 20, 30] ? (2, 3, 1, 1, 2),
for $i in (2, 3, 1, 1, 2)
return [10, 20, 30]($i) |
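The equivalence suggested above can be modelled outside XPath. The following is only a rough Python sketch of the proposed sequence lookup, not an implementation from any processor; the function name `sequence_lookup` is hypothetical, and sequences are represented as Python lists.

```python
def sequence_lookup(seq, positions):
    """Model of the proposed `seq ? positions` on a plain sequence.

    Mirrors `for $i in positions return seq[$i]`: positions are 1-based,
    results follow the order (and multiplicity) of `positions`.
    """
    result = []
    for i in positions:
        # A numeric predicate that is out of range selects nothing in XPath.
        if 1 <= i <= len(seq):
            result.append(seq[i - 1])
    return result


print(sequence_lookup([10, 20, 30], [2, 3, 1, 1, 2]))  # [20, 30, 10, 10, 20]
```

One difference the sketch makes visible: `(10, 20, 30)[4]` yields the empty sequence, whereas the array form `[10, 20, 30](4)` raises an error in XPath 3.1, so the two rewrites only agree for in-range positions.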
I think the first example captures Dimitre's intent, let's hear what he himself thinks. Whether the second is a complete rewrite strategy for already existing abilities I am not sure, we can already do e.g. |
Thanks for your comment. Of course you’re right, if we have multiple input items, we need to have two
([10, 20, 30], map { 1 : 'a'}) ? (1,2)
for $i in ([10, 20, 30], map { 1 : 'a'})
for $k in (1, 2)
return $i($k) |
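The nested iteration above (one loop over the input items, one over the keys) can be sketched in Python as follows. This is a hedged model rather than processor code: `lookup_all` is a hypothetical name, arrays are modelled as Python lists and maps as dicts.

```python
def lookup_all(items, keys):
    """Model of `items ? keys` where each item is an array (list) or a map (dict).

    Equivalent to: for $i in items, for $k in keys return $i($k)
    """
    out = []
    for item in items:
        for k in keys:
            if isinstance(item, dict):
                # Map lookup: a missing key contributes nothing (empty sequence).
                if k in item:
                    out.append(item[k])
            else:
                # Array lookup: 1-based, and out-of-range positions are an error.
                if not 1 <= k <= len(item):
                    raise IndexError(f"array index {k} out of bounds")
                out.append(item[k - 1])
    return out


print(lookup_all([[10, 20, 30], {1: "a"}], [1, 2]))  # [10, 20, 'a']
```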
@ChristianGruen , @martin-honnen , I think that the original proposal states quite clearly: Thus:
produces
and
produces (using the current XPath 3.1 rule, as the first item in the LHS is an array)
Thank you very much @ChristianGruen for the clever disambiguation solution, based on the type of the first item in the LHS sequence. I also thought about this but wanted to hear the first responses before digging in deeper. |
I do not think this is a good idea Currently, It would be very confusing to change it such that the position sometimes matters and sometimes not. Depending on the first element is also confusing. It was already a bad idea to do that for the EBV. (extremely unconvincing that One could allow sequences in |
@benibela It seems that you are not against the idea per se but against using the Using another symbol, such as Is my understanding correct? Also, this proposal is not in conflict with the proposal for ranges. In fact these two complement each other well.
implements the reverse() function. In fact, this proposal has already inspired another great proposal by @ChristianGruen: #51 |
Yes, Without special handling of arrays/map If Although |
The way that the meaning of |
I am not going to defend something that is considered "ugly" even by its creators... And indeed, Still, using an operator would be better than a function. Why not:
Any idea for a better (or more suitable) operator string is welcome. |
I don’t think it’s a good idea to include yet another operator. If we created a completely new language, things would look completely different, but we already have If we have (10, 20, 30) ? (2, 3, 1, 1, 2)
→ (2, 3, 1, 1, 2) ! item-at((10, 20, 30), .) …we don’t save too many characters by extending the lookup operator, so it’s probably cleaner indeed to restrict lookups to function items. Talking about numeric predicates: I hated the language design while I was implementing it many years ago, I still struggle with the implications when adding optimizations, and I wouldn’t recommend the design for any new language; but I’m frequently surprised how intuitive the syntax is when I give lectures on XPath for (real) beginners. |
I had a discussion with James Clark about this while XPath 1.0 was still in draft; I felt the overloading of [] was very confusing, but he felt that both meanings of [] were intuitive to users and it was up to the language designers and implementors to make it work.
Michael Kay
Saxonica
|
Interesting. I could assume that your point of view might have prevailed if all had anticipated the further development and complexity of the languages we have today. |
Here is something intuitive:
or even
or
|
With XQuery 3.1, sequence lookups can be achieved by wrapping a sequence into an array:
let $seq := (10, 20, 30)
let $lookup := (1, 3, 2, 2, 3, 1)
return array { $seq }?($lookup)
I’m still wondering if we really want to introduce yet another operator. Maybe we could collect some more use cases to find out how often people would require such an operator? I’m also interested in the feedback and opinion of everyone else. |
@ChristianGruen, your latest comment looks to me like an option good enough to avoid introducing a new operator, in particular given all the different opinions we have seen as to whether an ASCII operator, a Unicode symbol or a function or overloading |
@ChristianGruen Who will remember
array { $seq }?($lookup)
given the nothing-to-remember (actually no new syntax) of:
$seq => [ 1, 3, 2, 2, 3, 1 ]
We already know that in XPath, for any given integer k, [k] is a function which, when applied on a sequence $vSeq, produces the k-th item of the sequence. The above just applies this function on the sequence, using the arrow operator. Nothing new in fact! The arrow operator is already given to us in XPath 3.1. |
I find this incredibly confusing. The thing on the RHS of the arrow operator is, in effect, a partially applied function (that has been partially applied by supplying its first argument). An array is a function from integers to array members. So I would expect this construct to take integers from $seq, and use them to extract items from the array on the RHS. But you seem to be proposing that it should do the reverse!
Michael Kay
Saxonica
|
@michaelhkay, in #50 (comment) I proposed three alternative syntaxes all of which seem intuitive. Am I right to expect that you would reject all of them due to some reason? Thanks, |
I am not at all convinced that any operator-based syntax is going to be significantly more usable than a function such as $seq => items-at($positions). I'd really like to avoid inventing new operators where a function will do the job - the grammar is far too fragile to make this an easy undertaking, and it's not clear to me that it improves usability. |
For what it’s worth, I find DSLs to be great if you’re an expert in that DSL and difficult-to-incomprehensible if you’re not. It’s impossible to tell what would work - i.e., what magic incantation will do the job that you need to get done - even if it’s relatively easy to tell how something that’s already been written might work.
If the syntax needs to be accessible - and I would think that’s a priority - then I would agree with Michael.
Cheers,
Damian
|
Actually,
[{integer k}]
has always been an existing operator used in XPath consistently starting with XPath 1.0, and when it is applied on a sequence (on its left-hand side) it selects the k-th item of the sequence.
The operator above is actually the operator below, in the case when the length of the sequence of integers is 1:
[{integer k1}, {integer k2}, ..., {integer kN}]
So there is nothing new and confusing in this operator. It has been used for more than 20 years in its abbreviated form by everyone, thus there is almost 0 barrier for understanding/using this operator.
In order to eliminate any confusion, the proposal is to use this operator applied explicitly with the => (arrow) operator on the left-hand side sequence:
$vSeq => [{integer k1}, {integer k2}, ..., {integer kN}]
If there is still someone confused, please say so and ask your questions. I will be happy to answer them :)
Dimitre |
It's the use of "=>" I found confusing, not the use of square brackets. It seems to have no relationship to the existing "=>" operator.
My view is that there's nothing in this proposed capability that requires custom syntax rather than a regular function call, and that a regular function call will be easier for implementors and more comprehensible to users.
Michael Kay
Saxonica
|
Here's a proposal.
(a) We introduce a new construct A [# B ] where A is an arbitrary sequence, and B evaluates to a sequence of integers. If the integers are positive, the result is equivalent to
for $b in B return A[position() = $b]
If an integer is negative then it counts from the end, so the full expansion becomes
let $C := count(A) + 1 return
for $b in B return A[position() = $b ge 0 then $b else $C + $b]
Examples:
X[#1] returns the first item
X[#1 to 3] returns the first three items, in order
X[#-1] returns the last item
X[#-3 to -1] returns the last three items, in order
("A", "B", "C")[#3, 1, 2] returns "C", "A", "B"
The [#..] operator is available in both places predicates are allowed in the grammar.
The expression B is evaluated with the same focus as A. So
<xsl:for-each select="C">
<xsl:value-of select="B[#position()+1]"/>
</xsl:for-each>
selects items in B corresponding to the positions of the selected items from C.
(b) We introduce an operator "downto", analogous to "to". The result of A downto B is the same as reverse(B to A). For example, X[#-1 downto -3] selects the last three items in a sequence, starting with the last.
(c) We introduce an inverse subscripting operator X[^Y]. Y evaluates to a sequence of integers, the expression selects all items whose position is NOT in Y. Negative numbers again count from the end. For example
A[^1] is equivalent to tail(A)
A[^-1] is equivalent to A[position() != last()]
A[^3] is equivalent to remove(A, 3)
(Slightly confusing for C# users, admittedly, since C# uses ^ in a subscript to mean counting from the end. But the use here is analogous to its use in regular expressions. We could use "!" instead.) |
I suppose
is meant to say
|
The section
kind of confuses or at least shatters my existing understanding of the use of |
Sorry about the messed-up attempt to use a ternary conditional. Basically, the expression A[#B] evaluates A and B with the same focus. So using "." and "position()" and "last()" within B gives the same result as using them outside the "predicate". So I wondered about having a function I also wondered about alternative syntax "#[" in place of "[#" so it becomes X#[1 to 5] in place of X[#1 to 5]. Another possibility would be "?[" which suggests a (perhaps misleading or perhaps helpful?) analogy with the lookup operator. |
I like this alternative, it looks more intuitive to me than All the same, I still wonder if the number of users who would benefit from the proposed extension is large enough. Personally, I would probably use the new syntax whenever it gets available due to my tendency to write compact code, but I would claim that the number of users who can tell the difference between |
I find this a good and almost complete definition and exploration of the main idea.
I prefer the # syntax over the ? one (indeed there are now too many uses for ?).
I would only wish that it is made clear (stressed) that in:
A [# B ]
B can be any expression.
Thanks,
Dimitre
|
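To make the A[#B], "downto" and A[^Y] constructs proposed earlier in this thread easier to compare, here is a rough Python model of their described behaviour. It is a sketch under stated assumptions, not the proposal's normative semantics: the names `subscript`, `downto` and `inverse_subscript` are hypothetical, sequences are Python lists, and a position of 0 is assumed to select nothing.

```python
def _resolve(pos, n):
    """Map a possibly negative 1-based position onto 1..n (negatives count from the end)."""
    return pos if pos >= 0 else n + 1 + pos


def subscript(a, b):
    """Model of A[#B]: items of `a` at the positions in `b`, in the order given by `b`."""
    n = len(a)
    out = []
    for pos in b:
        p = _resolve(pos, n)
        if 1 <= p <= n:  # out-of-range positions select nothing
            out.append(a[p - 1])
    return out


def downto(a, b):
    """Model of `A downto B`, described as reverse(B to A)."""
    return list(range(a, b - 1, -1))


def inverse_subscript(a, y):
    """Model of A[^Y]: all items of `a` whose position is NOT in `y`."""
    n = len(a)
    excluded = {_resolve(pos, n) for pos in y}
    return [item for i, item in enumerate(a, start=1) if i not in excluded]


x = ["A", "B", "C", "D", "E"]
print(subscript(x, [1]))                    # ['A']
print(subscript(x, [-3, -2, -1]))           # ['C', 'D', 'E']
print(subscript(["A", "B", "C"], [3, 1, 2]))  # ['C', 'A', 'B']
print(subscript(x, downto(-1, -3)))         # ['E', 'D', 'C']
print(inverse_subscript(x, [1]))            # ['B', 'C', 'D', 'E']
```

The sketch does not model the focus rule (B being evaluated with the same focus as A), which only has meaning inside an XPath evaluation context.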
Issue #213 takes this forward as a concrete proposal, and proposes that this issue now be closed. |
I suggest that if a proposal is raised to resolve an issue and we have at least one other person agreeing to close an issue, we can close it. I don't really want to spend telcon time discussing whether or not an issue can be closed. There's a bare minimum amount of time required to just ask the question on a telcon. If one of us sees that an issue is closed and we disagree that it should have been closed, we can reopen it. (Preferably with a comment that explains what issue we feel is unresolved.) |
In XPath 3.1 it is convenient to use the ? lookup operator on arrays and maps. It is easy and readable to construct expressions, such as:
And this understandably produces the sequence:
However, it is not possible to write:
or
or
This proposal is to allow the use on sequences of the postfix lookup operator ? with the same syntax as it is now used for arrays.
The ? lookup operator will be applied on sequences whose first item isn't an array or a map. The only change would be to allow the type of the left-hand side to be a sequence, in addition to the currently allowed map and array types. At present, applying ? on any such sequence results in error. In case the first item of the LHS sequence is an array or a map, then the current XPath 3.1 semantics is in force, which applies the RHS to each item in the sequence.
The restriction in the above paragraph can be eliminated if we decide to use a different symbol than ? for this operator, for example ^
The goal of this feature is achieving conciseness, readability, understandability and convenience.
For example, now one could easily produce from a sequence a projection / rearrangement with any desired multiplicity and ordering.
Thus, it would be easy to express the function reverse() as simply: