Context item → Context value? #129

Closed
ChristianGruen opened this issue Aug 18, 2022 · 46 comments · Fixed by #703
Labels
Feature A change that introduces a new feature XPath An issue related to XPath

Comments

@ChristianGruen
Contributor

ChristianGruen commented Aug 18, 2022

This has already been discussed before at various places, I’d like to raise it again: What about generalizing the context item and allowing it to reference sequences? Are there definitive showstoppers?

The Context Item

As its name says, the context item is a container for a single item in the current context. A value that is bound to the context item is referenced with the Context Item Expression, the single dot (.).

The context item shares many similarities with variables. The main difference is that it currently cannot be used for sequences. I propose to generalize the semantics and introduce a “context value”:

  • Items that have formerly been bound to the context item (via the Context Item Declaration, within predicates, the simple map operator, path expressions, the transform with expression, etc.) are now bound to the context value.
  • The revised Context Item Expression returns sequences instead of single items.
  • We cannot drop context items completely – for example, we have a Context Item Declaration in the prolog of XQuery expressions, which uses the item keyword – but we can treat it as a secondary concept.

Context Value Declaration

It has become a common pattern to use declare context item to bind a document to the context item and process queries on that item:

declare context item := doc('flowers');
.//flower[name = 'Tigridia']

If data is distributed across multiple documents (which is often, if not usually, the case in databases), this approach does not work. It would work if we could bind sequences:

declare context value := collection('flowers');
.//flower[name = 'Tigridia']
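
As a rough illustration of the intended behaviour, here is a minimal Python sketch (not XQuery) that applies the same relative path to a context value holding one document and to one holding several. The sample documents and the `eval_path` helper are hypothetical, and Python's `xml.etree.ElementTree` is only a stand-in that supports a small XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for collection('flowers'): two small documents.
DOCS = [
    ET.fromstring("<garden><flower><name>Tigridia</name></flower></garden>"),
    ET.fromstring(
        "<garden><flower><name>Iris</name></flower>"
        "<flower><name>Tigridia</name></flower></garden>"
    ),
]


def eval_path(context_value, path):
    """Apply `path` to every item of the context value and concatenate the results."""
    results = []
    for item in context_value:
        results.extend(item.findall(path))
    return results


# The same path works for a single document and for a sequence of documents.
single = eval_path([DOCS[0]], ".//flower[name='Tigridia']")
many = eval_path(DOCS, ".//flower[name='Tigridia']")
print(len(single), len(many))
```

The point of the sketch is only that the query text stays the same; whether the real semantics should concatenate, deduplicate, or sort the results is part of what the proposal would have to define.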

External Bindings

Many processors allow users to bind external values to the context item. This approach is particularly restricting for databases, in which data is often distributed across multiple documents. With the generalized concept, it would become possible to bind sequences and collections to the context. Paths like the following one could be used, no matter whether the contents are stored in a single document or in a collection:

//flower[name = 'Iridaceae']

Focus Functions

The focus function provides a compact syntax for common arity-one functions. The single argument is bound to the context item:

sort($flowers, (), function { @petals })

With the generalization to values, we could easily enhance focus functions to accept arbitrary sequences:

array:sort($flower-species, (), function { count(.) })

let $flowers := array:join(
  for $flower in //flower
  group by $_ := $flower/name
  return [ $flower ]
)
(: some $p in petals satisfies $p gt 4 :)
return array:filter($flowers, function { petals > 4 })

Use Case: Arrow Expressions

The arrow expression provides an intuitive syntax for performing multiple subsequent operations on a given input. With the context value generalization, we could also process chained sequences:

//flower[name = 'Psychotria']
=> function { count(.) || ' flower(s) found' }()
@rhdunn
Contributor

rhdunn commented Aug 19, 2022

I like the idea of supporting the EnclosedExpr syntax on fat arrows for consistency and symmetry between the new syntax.

I also like the idea of passing a context value (sequence) to a query.

This needs to work with the current focus definition (https://qt4cg.org/branch/master/xquery-40/xquery-40-diff.html#dt-focus), where context item, context position, and context size in the dynamic context form the focus.

We would therefore need something that binds to the context item -- such as function () { ... } or array { ... }, which don't quite work -- that when used evaluates to the contents of the sequence for the purpose of evaluating the items in that sequence. In the case of the function, we would want the context value . to be evaluated as .().

To achieve this, we can define a context value as a sequence bound to the context item that when evaluated returns the content of its containing sequence. If we wanted a syntax for this, we could have something like sequence { 1, 2, 3 }. Then U => EnclosedExpr would bind the context item to the context value of the evaluated UnaryExpr.

@ChristianGruen
Contributor Author

I would hope that we can effectively replace the term context item by context value, and indicate in the Change Log that the context value was formerly restricted to a single item.

As I see no general stumbling blocks, I’m trying to adapt the definitions and parts of the further context references. It could read as follows:


2.1.1 Static Context

[Definition: Context value static type. This component defines the static sequence type of the context value.]

2.1.2 Dynamic Context

[Definition: The first three components of the dynamic context (context value, context position, and context size) are called the focus of the expression.] The focus enables the processor to keep track of which items are being processed by the expression. If any component in the focus is defined, all components of the focus are defined. [Definition: A singleton focus is a focus that refers to a single item; in a singleton focus, context value is set to the item, context position = 1 and context size = 1.]

  • [Definition: The context value is the value currently being processed.] [Definition: When the context value is a single node, it can also be referred to as the context node.] The context value is returned by an expression consisting of a single dot (.). When an expression E1/E2 or E1[E2] is evaluated, each item in the sequence obtained by evaluating E1 becomes the context value in the inner focus for an evaluation of E2.
    [Definition: In the dynamic context of every module in a query, the context value component must have the same setting. If this shared setting is not absent, it is referred to as the initial context value.]
  • [Definition: If a sequence of items is processed and successively bound to the context value, the context position is the position within this sequence.] It changes whenever the context value changes. When the focus is defined, the value of the context position is an integer greater than zero. The context position is returned by the expression fn:position(). When an expression E1/E2 or E1[E2] is evaluated, the context position in the inner focus for an evaluation of E2 is the position in the sequence obtained by evaluating E1. The position of the first item in a sequence is always 1 (one). The context position is always less than or equal to the context size.
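
The interplay of the three focus components for E1[E2] can be sketched in Python as follows. This is only a model under assumptions: the `Focus` class and the `filter_seq` helper are hypothetical names, and the predicate is passed as a Python callable rather than parsed from an expression.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Focus:
    value: Any     # context value (here: one item of E1)
    position: int  # context position, starting at 1
    size: int      # context size, i.e. the length of the E1 sequence


def filter_seq(seq: List[Any], predicate: Callable[[Focus], bool]) -> List[Any]:
    """Model E1[E2]: evaluate E2 once per item of E1, each time in an inner focus."""
    size = len(seq)
    return [
        item
        for pos, item in enumerate(seq, start=1)
        if predicate(Focus(value=item, position=pos, size=size))
    ]


# Models (10, 20, 30, 40)[position() < last()]
kept = filter_seq([10, 20, 30, 40], lambda f: f.position < f.size)
print(kept)
```

In every inner focus the position is at least 1 and never exceeds the size, which matches the constraints stated in the definitions above.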

2.4.4 Input Sources

An expression can access input data either by calling one of these input functions or by referencing some part of the dynamic context that is initialized by the external environment, such as a variable or context value.

4.3.4 Context Item Expression

ContextValueExpr   ::=   "."

A context value expression evaluates to the context value, which may be an arbitrary sequence of nodes, atomic values and functions.

If the context value is absent, a context value expression raises a dynamic error [err:XPDY0002].

4.4.2.1 Evaluating Dynamic Function Calls

Example: Using the Context Value in an Anonymous Function

The following example will raise a dynamic error [err:XPDY0002]:

let $vat := function() { @vat + @price }
return shop/article/$vat()

Instead, the context value can be implicitly bound with a function item expression and the => symbol …

declare context value := collection('items')/shop/article;
=> { ./(@vat + @price) }()

…or one by one with the -> symbol:

let $vat := -> { @vat + @price }
return shop/article/$vat(.)

5.17 Context Item Declaration

If a module contains more than one context item declaration and context value declaration altogether, a static error is raised [err:XQST0099].

Should be treated as a special-case legacy version of the “Context Value Expression”.

5.xx Context Value Declaration (…parts)

ContextValueDecl ::= "declare" "context" "value" ("as" SequenceType)? ((":=" VarValue) |
                     ("external" (":=" VarDefaultValue)?))

A context value declaration allows a query to specify the static type, value, or default value for the initial context value.

In every module that does not contain a context value declaration, the effect is as if the declaration

declare context value as item()* external;

appeared in that module.

If a module contains more than one context value declaration and context item declaration altogether, a static error is raised [err:XQST0099].

During query evaluation, a focus is created in the dynamic context for the evaluation of the QueryBody in the main module, and for the initializing expression of every variable declaration in every module. The context value of this focus is called the initial context value, …


“The context value is the value currently being processed” is still misleading. It’s already the original definition “The context item is the item currently being processed.” that I found confusing: It matches well for a singleton focus that’s temporarily created for the evaluation of a predicate or simple map expression, but it doesn’t really fit if the context item is declared in the prolog and globally available in the main module.

@rhdunn
Contributor

rhdunn commented Aug 19, 2022

For the prolog and main module, the "currently being processed" will be in the context of the QueryBody, so that should be fine. The ContextItemDecl documentation (and thus equivalently also the ContextValueDecl documentation) as you reference describes how that is determined in that case.

@graydon2014

  • [Definition: In the dynamic context of every module in a query, the context value component must have the same setting. If this shared setting is not absent, it is referred to as the initial context value.]

That second sentence is defeating me.

I think the first sentence is something like "there is one and only one focus at any point in time in an entire query, so the context value component of the dynamic context must have the same setting everywhere in the query."

But there MUST be a focus, conceptually; there has to be a dynamic context if the query can be evaluated. So I think the second sentence might be "If the context value component has a defined value, it is referred to as the initial context value."

@ChristianGruen
Contributor Author

For the prolog and main module, the "currently being processed" will be in the context of the QueryBody, so that should be fine.

Yes, I assume it's logically correct. Maybe something like “The context value is available for reference within the corresponding context.” would sound more intuitive to me, in accordance with the definition of in-scope variables.

That second sentence is defeating me.

The original definition is as follows:

  • [Definition: In the dynamic context of every module in a query, the context item component must have the same setting. If this shared setting is not absentDM31, it is referred to as the initial context item. ]

I imagine that the set of definitions needs to be rephrased as a whole in order to get consistent again.

@michaelhkay
Contributor

I'm concerned about the sheer number of things in the spec that would need to change.

I'm also concerned about performance. If we can't statically infer that "." is a singleton, there's a significant risk that existing code slows down because existing optimisations are no longer safe.

At the same time, I recognise the need - for example, when iterating over an array. But I feel it's a bridge too far.

@ChristianGruen
Contributor Author

ChristianGruen commented Sep 14, 2022

I’m pretty optimistic that the performance concerns can be resolved. At least that's the impression I got when implementing it myself, including all corner cases I managed to find.

@michaelhkay
Contributor

In XQuery I think you can always statically determine what expression provides the focus for any occurrence of ".". That's not true in XSLT (and it's not true for XPath if the context item is supplied by the host language, as will often be the case). XSLT is much more heavily dependent on the context item than XQuery is.

@rhdunn rhdunn added XPath An issue related to XPath XQuery An issue related to XQuery XSLT An issue related to XSLT Feature A change that introduces a new feature labels Sep 15, 2022
@ChristianGruen ChristianGruen removed XQuery An issue related to XQuery XSLT An issue related to XSLT labels Sep 21, 2022
@ChristianGruen
Contributor Author

ChristianGruen commented Oct 5, 2022

In #149 (comment), some more use cases are given for binding sequences to the context.

@michaelhkay
Contributor

To take an example of the problems this would cause in XSLT, consider named templates. The context item, position, and size are passed through a call-template instruction, so code in the called template has no idea what the context item might be. Which means that we wouldn't be able to determine statically that expressions such as . or ./@code will always deliver a singleton; meaning that (a) we're no longer able to dispense with run-time type checking when the expression is used as an argument of a function that requires a singleton, and (b) we don't know that an expression like ./author delivers nodes in document order, which means we have to be prepared to do a sort to get them into document order.

Frankly, I think this is a non-starter.

@michaelhkay
Contributor

michaelhkay commented Oct 13, 2022

I think a better approach might be to introduce a separate concept called the context value, represented by the symbol ~.

We could certainly use this to refer to the implicit argument of an expression such as (1 to 5) => { ~[. gt 3] }.

I would also like to do something similar in XSLT allowing a pipeline of instructions to feed into each other - rather like the arrow operator:

<xsl:pipe select="//x">
  <xsl:apply-templates select="~"/>
  <xsl:for-each select="~">
    ....
  </xsl:for-each>
</xsl:pipe>

where each instruction in the pipe binds its output to the implicit variable ~ which can then be referenced in the next instruction. This saves the verbosity involved in binding the intermediate results to variables.

This could also be useful for iterating over arrays and filtering arrays:

$array[| count(~) eq 3 |]

selects all the members of an array that are sequences of length 3. (I'm not sure what symbol one might use for a mapping operator in XPath; but we could certainly do xsl:for-each-member in XSLT, binding each member to ~.)

In xsl:for-each-group, we could bind ~ to the current group.

I'm not sure how position() and size() fit into this, or whether there is some relationship between . and ~.

@ChristianGruen
Contributor Author

You mentioned named templates. Could it be an option to ensure that the input of named templates will always be singletons, similar to the context inside predicates?

If performance considerations in XSLT are too troublesome, maybe we can restrict the proposed extension to XQuery, or provide support for syntactic extensions in all languages, but disallow bindings of sequences in XSLT?

@michaelhkay
Contributor

I guess it would be possible to say that if a named template has no context item declaration, then the default is <xsl:context-item as="item()" use="optional"/> which means that if the context value isn't an item, then you have to declare it. (But the fact that the instruction is named xsl:context-item shows just how deeply embedded the notion is that "." is a singleton, and I still worry that generalising it would be a massive project.)

@ChristianGruen
Contributor Author

Still being somewhat tenacious and enthusiastic about my initial proposal, I fully understand your concerns.

I was positively surprised to see that it was close to a no-brainer to integrate the generalization in XQuery, and the result feels clean and conclusive to me. But it didn’t escape me that numerous sections in the specification would need to be revised, as you already indicated. It seems onerous indeed to restrict the generalization to XQuery; and I clearly lack the XSLT perspective.

Sigh. Maybe it’s best to postpone it to a potential 4.1 or 5.0 release. Hope dies last…

@michaelhkay
Contributor

I think we should definitely file this under "too difficult". I've just been worrying about the semantics of simple expressions like PARA. Does this sort nodes into document order? What about PARA[1] -- does it mean ./PARA[1] or (./PARA)[1] or .!PARA[1] or (.!PARA)[1], all of which potentially have different results if "." contains multiple nodes? And it gets worse with reverse axes like preceding-sibling. Perhaps that can all be solved by saying that for an axis expression, the context item must be a single node. But then ./X no longer means the same as X, which has potential to cause immense confusion.
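
The ambiguity around PARA[1] can be made concrete with a small Python sketch. It assumes a hypothetical context holding two `sec` elements and uses `xml.etree.ElementTree` as a stand-in; the two readings shown correspond roughly to applying PARA[1] per context node versus taking the first of all PARA children.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<doc>"
    "<sec><PARA>a1</PARA><PARA>a2</PARA></sec>"
    "<sec><PARA>b1</PARA><PARA>b2</PARA></sec>"
    "</doc>"
)

# Hypothetical context value containing two nodes instead of one.
context = doc.findall("sec")

# Reading 1: evaluate PARA[1] separately for each context node.
per_node = [p.text for node in context for p in node.findall("PARA[1]")]

# Reading 2: collect all PARA children first, then take the first one overall.
all_paras = [p for node in context for p in node.findall("PARA")]
overall = [p.text for p in all_paras[:1]]

print(per_node, overall)
```

With a single context node both readings agree, which is why the question never arises today; with two nodes the first reading yields one result per node and the second yields exactly one result.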

@michaelhkay
Contributor

michaelhkay commented Oct 15, 2022

I think having "context value" (or some other name...) as a separate concept from context item is much more workable. It can be represented by ~.

The following use cases would then be conceivable, among others:

  1. Bind collections locally or globally

let ~ := collection('flowers')
return ~//flower[name = 'Tigridia']

  2. Use fat arrow operator to bind sequences

//flower[name = 'Psychotria']
=> { count(~) || ' flowers found'}

  3. Use fat arrow as inline function operator

(: some $p in petals satisfies $p gt 4 :)
return array:filter($flowers, => { ~/petals > 4 })

  4. Expression syntax for array filtering

$flowers [| ~/petals > 4 |]

  5. Expression syntax for array mapping

$flowers ?! count(~)

  6. Array iteration in XSLT

<xsl:for-each-member select="$array">{ count(~) }</xsl:for-each-member>

  7. Pipeline in XSLT, where the result of each instruction is available as "~" in the next instruction (semantically equivalent to A => {B} => {C} at the XPath level):
<xsl:pipe>
  <xsl:call-template name="do-some-magic"/>
  <xsl:for-each select="~" >
    <x att="{.}"/>
  </xsl:for-each>
  <xsl:for-each select="~/@att">
    ...
  </xsl:for-each>
</xsl:pipe>

@ChristianGruen
Contributor Author

I've just been worrying about the semantics of simple expressions like PARA. […] But then ./X no longer means the same as X, which has potential to cause immense confusion.

As X and ./X are equivalent, it will be the natural decision to keep up the analogy if we allow sequences.


I’ll give some more background on my proposal (sorry in advance for being verbose). I mentioned earlier that it was a “no-brainer to integrate the generalization in XQuery”; I still agree on that, but I actually referred to the specific three syntax extensions in this proposal. The general idea has a longer history.

If path expressions are run against databases, it’s often irrelevant if data is located in a single document or spread across a set of documents. Users want to answer the same questions, no matter if one, thousands, or millions of documents are stored in a database.

The officially legal way to run queries on multiple documents is to use fn:collection and prepend it to a path expression, bind it to a variable, etc. Obviously, that’s something users wish to avoid if they repeatedly run simple path expressions. In our database system, we have thus deliberately been ignoring the restriction to single items on the root level for a long time, as one of the most basic workflows is to:

  1. open a database globally and
  2. run XPath expressions against that database.

All APIs we offer (REST, programming language bindings, command line, GUI) work similarly: The selection of a document (i.e., the selection of the initial sequence of document nodes) is separated from the actual evaluation of the query.

We certainly don’t aim to enforce idiosyncratic behavior of our processor to be applied to the official language just to make it legal (it has been existing for too long, and no one ever complained about it, so we’ll stick to it anyway). Over the years, we have simply found that it feels most natural to be allowed to run //my/results or a/b/c, no matter if the input is a single document or a collection.


I can well imagine, though, that this is not an issue in XSLT, which focuses on single documents. As the overall implications are too far-reaching, the idea of adding a syntax for binding sequences may be the only realistic choice (@michaelhkay thanks for further pursuing this). For the first use case – binding collections globally – I would press for a declaration to bind sequences (however named) …

declare context value := collection('flowers');
~//flower[name = 'Tigridia']

…and not restricting it to FLWOR expressions. If we wanted to additionally enhance the let clause, I think we should have both let ~ := and let . := (but that could probably be discussed in #131 or in a separate issue).

@michaelhkay
Contributor

OUTLINE PROPOSAL

This proposal is in two parts. Part A introduces the notion of "context value" to the dynamic context -- except that I will call it the initial input value, abbreviated for the purpose of this proposal to IIV. Part B, which is dependent on part A, introduces the idea that some expressions might use the IIV implicitly.

PART A

The dynamic context is extended with a component called the initial input value. Its value is either a sequence, or absent.

A new kind of primary expression is introduced, the initial input value expression, written ~. It evaluates to the initial input value if present, or raises an error if absent.

In XQuery, the initial input value for the main query may be set using a "declare initial input" clause in the prolog, whose syntax parallels the "declare context item" declaration. For the time being, we will allow the context item and the initial input to be set independently of each other.

In XPath, the initial input value may be set by the calling application.

In XSLT, the initial input value will be absent when the transformation is initiated.

The initial input value is set to absent on entry to a global function or variable declaration, and in XSLT, on entry to other callable constructs such as templates and attribute sets.

The initial input value is set to a non-absent value by the following constructs:

(a) An inline function declaration using the fat-arrow syntax with implicit signature, for example =>{ string-join(~, ',') }. Here ~ is bound to the value supplied for the single argument in the function call.

(b) An enclosed expression on the RHS of the binary => operator, for example characters('abc') => { '(', ~, ')' }. Here ~ is bound to the value of the LH operand of the => operator.

(c) An instruction in an XSLT pipeline. A new XSLT instruction xsl:pipe is introduced; its content is a sequence of instructions called a pipeline. The value of each instruction other than the last becomes the initial input value for the following instruction; the value of the final instruction is delivered as the value of the xsl:pipe instruction.

(d) An array filter expression EXPR "[|" predicate "|]" is introduced. EXPR must evaluate to an array; within the predicate, ~ is bound to each member of the array in turn. As with existing predicates, the construct is overloaded to do both boolean filtering and numeric indexing.

(e) A new operator is introduced to do array mapping. The expression takes the form E1 !{ E2 }. E1 must evaluate to an array; E2 is evaluated with ~ bound to each member of the array in turn, and the results are combined into a new array. For example array { 1 to 5 } !{ [~, ~+1] } returns [[1,2], [2,3], [3,4], [4,5], [5,6]].

PART B

A new component called implicit mapping enabled is added to the static context for an expression. Its value is a boolean. If implicit mapping enabled is true for a path expression or context item expression E, then the result of the expression is effectively ~!E. The value of implicit mapping enabled is set to true for the main expression of a query if the query prolog contains a declare initial input declaration; it may also be set (explicitly or implicitly) by the calling application. It is set to false for any operand of an expression where the focus changes, for example E2 in E1/E2, E1!E2, or E1[E2], including expressions that set the focus to absent. In other cases it propagates to subexpressions, for example for $x in //* ... means for $x in ~//* ....

@ChristianGruen
Contributor Author

Great to see how the different requirements are coalescing. I do appreciate your efforts, and I’ll let it sink in.

I assume that if…

the result of the expression is effectively ~!E.

…and not ~/E, the equivalent representation for…

for example for $x in //* ... means for $x in ~//* ....

…must be for $x in ~!//* ....

I like the example for arrays. We may also need to clarify what’s supposed to happen if both the context item and the initial input value are declared.

@michaelhkay
Contributor

must be for $x in ~!//* ....

Yes indeed. I chose ! rather than / because the requirement to sort a collection of documents into document order is usually a completely unnecessary overhead.
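
To illustrate the difference between the two operators, here is a minimal Python model. It is a sketch under assumptions: `simple_map`, `path`, and `parent_step` are hypothetical helpers, node identity is approximated with `id()`, and document order is taken from the iteration order of `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<r><a><b/></a><a><b/></a></r>")

# Document-order index and parent lookup, keyed by node identity.
order = {id(n): i for i, n in enumerate(doc.iter())}
parent = {id(child): p for p in doc.iter() for child in p}


def parent_step(node):
    """Stand-in for the step '..' applied to one context node."""
    return [parent[id(node)]]


def simple_map(seq, step):
    """E1 ! E2: concatenate the results in the order they are produced."""
    return [r for item in seq for r in step(item)]


def path(seq, step):
    """E1 / E2: as above, then remove duplicate nodes and sort into document order."""
    unique = {id(n): n for n in simple_map(seq, step)}
    return sorted(unique.values(), key=lambda n: order[id(n)])


b1, b2 = doc.findall(".//b")
context = [b2, b1, b2]  # out of document order, with one duplicate

mapped = [order[id(n)] for n in simple_map(context, parent_step)]
pathed = [order[id(n)] for n in path(context, parent_step)]
print(mapped, pathed)
```

The extra deduplication and sort in `path` is the overhead referred to above; for a collection of independent documents it rarely changes the result in a way users care about.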

@michaelhkay
Contributor

michaelhkay commented Feb 8, 2023

VERSION II PROPOSAL

A revised version of my previous outline proposal.

The dynamic context is extended with a component called the context value. It is always present, and its value is a sequence. Its default value (for example on entry to a function) is an empty sequence. The context value can be accessed using the expression ~, which never throws an error.

The "context item" is no longer an independent quantity. If the context value is a single item, then we call this the "context item". If the context value is empty or contains multiple items, we say that the context item is absent, and throw an error if it is referenced, either explicitly using "." or implicitly.

In XQuery, the context value for the main query may be set using a "declare context [value|item]" clause in the prolog, whose syntax parallels the "declare context item" declaration. If the keyword "item" is used, rather than "value", then the supplied value must be a singleton.

In XPath and XSLT (and XQuery in the absence of "declare context value"), the initial context value may be set by the calling application.

The context value is set to a non-absent value by the following constructs:

(0) All constructs that currently bind the context item are redefined so they now bind the context value to a singleton. This means that within a predicate, for example, . and ~ can be used synonymously.

(a) It may be set explicitly using the syntax let ~ := expression return expression.

(b) An enclosed expression on the RHS of the binary => operator, for example characters('abc') => { '(', ~, ')' } which produces ('(', 'a', 'b', 'c', ')') . Here ~ is bound to the value of the LH operand of the => operator.

(c) We change the abbreviated function syntax ->{expression} so the function now accepts a sequence rather than a singleton item, and binds the sequence to ~ (it remains available as the context item if it is a singleton). So we can now write array:filter($array, ->{exists(~)}) to remove empty members from an array.

(c) An instruction in an XSLT pipeline. A new XSLT instruction xsl:pipe is introduced; its content is a sequence of instructions called a pipeline. The value of each instruction other than the last becomes the context value for the following instruction; the value of the final instruction is delivered as the value of the xsl:pipe instruction.

(d) An array filter expression EXPR "?[" predicate "]" is introduced. EXPR must evaluate to an array; within the predicate, ~ is bound to each member of the array in turn. The array filter expression always delivers an array, and the predicate is always treated as boolean; for positional selection, the position() function is available. In the case of an array whose members are all singletons, . can be used in place of ~.

(e) A new operator is introduced to do array mapping. The expression takes the form E1 !! E2. E1 must evaluate to an array; E2 is evaluated with ~ bound to each member of the array in turn, and the results are combined into a new array, without flattening. For example array { 1 to 5 } !! [~, ~+1] returns [[1,2], [2,3], [3,4], [4,5], [5,6]]. Again, if the members of the array are all singletons, they can be referred to as . rather than ~.

(f) In XSLT, <xsl:for-each-member select="[1 to 5]"> does array mapping in the same way. (Though this requires further thought: does the body of the instruction become an "array constructor" rather than a "sequence constructor", and what does this mean?)

(g) Leading / (and //) is changed so that instead of /X meaning root(.)/X, it means ~!(root(.)/X). That is, it selects from multiple roots, but doesn't force sorting into document order or deduplication.

(h) Relative path expressions such as @item continue to mean ./@item as before, and raise an error if there is no context item. You can write ~/@item or ~!@item if you want something different.
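
The relationship between ~ and . in this version, together with the array constructs in (d) and (e), can be modelled roughly as follows. This is a Python sketch, not a definitive reading of the proposal: arrays are represented as lists of members, each member is a tuple standing in for a sequence, and all function names are hypothetical.

```python
class ContextItemAbsent(Exception):
    """Stands in for the dynamic error raised when '.' is referenced but absent."""


def context_item(context_value):
    """'.' is defined only when the context value '~' is a single item."""
    if len(context_value) != 1:
        raise ContextItemAbsent("context value is not a single item")
    return context_value[0]


def array_filter(array, predicate):
    """EXPR?[predicate]: bind ~ to each member in turn; always returns an array."""
    return [member for member in array if predicate(member)]


def array_map(array, expr):
    """E1 !! E2: bind ~ to each member in turn; collect results without flattening."""
    return [expr(member) for member in array]


# An array with three members: a singleton, an empty sequence, and a pair.
array = [(1,), (), (2, 3)]

non_empty = array_filter(array, lambda tilde: len(tilde) > 0)  # like ?[exists(~)]
counts = array_map(array, lambda tilde: (len(tilde),))         # like !! count(~)
print(non_empty, counts)
```

For the first member `context_item` succeeds, so . and ~ could be used interchangeably there; for the other two members only ~ is usable, which is the case the proposal is meant to cover.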

@ChristianGruen
Contributor Author

Thanks again for the comprehensive proposal and the resulting constructs. I like the approach to make the context value the new default, and to treat the context item as a subordinate concept.

(c) We change the abbreviated function syntax ->{expression} so the function now accepts a sequence rather than a singleton item, and binds the sequence to ~ (it remains available as the context item if it is a singleton). So we can now write array:filter($array, ->{exists(~)}) to remove empty members from an array.

From today’s perspective, I think the fat arrow could be an even better choice for binding the context:

  1. With ->() { }, a function item without arguments can be declared. The syntax -> { } is even shorter, but it would be semantically richer. This recently caused confusion in one of our internal feedback rounds.
  2. We already use the thin arrow for iteratively processing single items: E -> { }
  3. In your proposal in (b), the fat arrow is proposed for processing sequences, so it seems consistent to use it for inline functions as well.

We would then have:

Construct                              | Example
Thin arrow operator (binding items)    | EXPR -> { (: context item :) }
Fat arrow operator (binding values)    | EXPR => { (: context value :) }
Inline Function (with bound context)   | => { (: context value :) }
Inline Function                        | ->() { }, ->($a) { }, ->($a, $b) { }

The fat arrow could generally be used for inline functions =>() { }, but due to the above observation in 1., I imagine that a different syntax would be helpful (it would also allow us to generate better error messages).

(h) Relative path expressions such as @item continue to mean ./@item as before, and raise an error if there is no context item. You can write ~/@item or ~!@item if you want something different.

It would be great if we could also relax the semantics for relative paths. I think it should make no difference if the path expression is absolute or relative. Otherwise, expressions such as the following ones would not work:

(: bound externally or in the prolog declaration :)
declare context value external := collection('store')//person;
name

If performance concerns prevent us from doing so, maybe we could let the implementation decide if sequences are allowed as input for path expressions.

@ChristianGruen
Contributor Author

So what about generalizing the single dot? Would it be confusing if a/b worked on a sequence of input nodes?

Even this is confusing... 😒

What is "a" here?

a/b is a valid XPath 1.0 expression. It is a relative location path: https://www.w3.org/TR/1999/REC-xpath-19991116/

@dnovatchev
Contributor

So what about generalizing the single dot? Would it be confusing if a/b worked on a sequence of input nodes?

Even this is confusing... 😒
What is "a" here?

a/b is a valid XPath 1.0 expression. It is a relative location path: https://www.w3.org/TR/1999/REC-xpath-19991116/

Yes, what is confusing is: where is the proposed new concept in this expression, and what does it give us?

Without any explanation or description, how would one even know that this has something to do with the proposed new concept?

@ChristianGruen
Contributor Author

Yes, what is confusing is: where is the proposed new concept in this expression, and what does it give us?

Sorry. The topic was discussed somewhere in this thread, but the thread is already too long. a/b and ./a/b are equivalent: both start from the current context item. I was wondering whether you think it would be confusing if these expressions also yielded results when the input is not a single node but a sequence of nodes.

@ChristianGruen
Contributor Author

How about .* for . as a possible sequence? .*/leg/socks is currently a syntax error, I think.

@liamquin Thanks for the proposal: I like .* as an alternative. We could even think about also offering .? and .+. – What do you think about the initial proposal to enhance the context item concept and use . for arbitrary sequences?

@dnovatchev
Contributor

Yes, what is confusing is: where is the proposed new concept in this expression, and what does it give us?

Sorry. The topic was discussed somewhere in this thread, but the thread is already too long. a/b and ./a/b are equivalent: both start from the current context item. I was wondering whether you think it would be confusing if these expressions also yielded results when the input is not a single node but a sequence of nodes.

I would definitely prefer having a separate, unambiguous representation, such as current-container().

If not a function, then something that is not too easy to arrive at by an accidental misspelling. And, if possible, visually adequate for the intended meaning.

This is why I also proposed (...)

@benibela

benibela commented Apr 9, 2023

(0) All constructs that currently bind the context item are redefined so they now bind the context value to a singleton. This means that within a predicate, for example, . and ~ can be used synonymously.

If an implementation has to update two things, that is bad for performance, perhaps worse than redefining . to allow sequences.

I’m not sure how I'd teach ~ yet.

It looks like a snake. Longer than a single dot. Or like multiple dots in a sequence.

How about .* for . as a possible sequence? .*/leg/socks is currently a syntax error, I think.

It is not a syntax error: the * is parsed as a multiplication, as in this example:

document{<leg><socks>10</socks></leg>}/(.*/leg/socks)

Another unused symbol is °. Like a bigger .

@graydon2014

graydon2014 commented Apr 9, 2023

How about .* for . as a possible sequence? .*/leg/socks is currently a syntax error, I think.

@liamquin Thanks for the proposal: I like .* as an alternative.

For what it's worth, I do as well.

There isn't much left except the at sign if we're going for single characters, and the day has not quite come to start wandering through Unicode for operators.

We could even think about also offering .? and .+. – What do you think about the initial proposal to enhance the context item concept and use . for arbitrary sequences?

I can't find either of those earlier in the thread.

I'd guess that .? is the lookup operator applied to the members of the context value (and that the lookup operator would return the value for values that aren't maps or arrays), and that .+ means "add the value on the right to every member of the context value", but I have little confidence in either guess.

[Edited to add] Or, duh, .* is a sequence of zero or more items, .? is a sequence of zero or one items, and .+ is a sequence of one or more items, where . is a sequence of exactly one item.

@michaelhkay
Contributor

.*/a already has a well-defined meaning in XPath 1.0 - the * is a multiplication operator.

@michaelhkay
Contributor

Well, ~ for me means not, or regexp-match

Of course, it has many meanings in many different contexts. I guess the usage that made it feel natural to me was its use in Unix filenames, which a lot of XPath syntax is inspired by. It's not exactly the same meaning of course, but when I see ~/a/b then I at least think in terms of hierarchic selection from some defined origin.

@ChristianGruen changed the title from "[XPath] [XQuery] Context item → Context value?" to "Context item → Context value?" on Apr 27, 2023
@ChristianGruen added the "Propose for V4.0" label (The WG should consider this item critical to 4.0) on Jun 15, 2023
@ChristianGruen
Contributor Author

I have reworked the initial comment of this issue by adding explanatory comments and aligning it with the features from the latest version of the specs. I’ll be happy to present it in the meeting; it may help us resolve QT4CG-026-01.

@ChristianGruen
Contributor Author

Reopening this (for the pending edits).

@ChristianGruen added the "Tests Needed" label (Tests need to be written or merged) on Sep 19, 2023
ChristianGruen added a commit to qt4cg/qt4tests that referenced this issue Sep 28, 2023
ChristianGruen added a commit to qt4cg/qt4tests that referenced this issue Sep 28, 2023
@ChristianGruen removed the "Propose for V4.0" label (The WG should consider this item critical to 4.0) on Oct 25, 2023
@ChristianGruen
Contributor Author

…closed. The major remaining challenge to solve is #755.

@ChristianGruen removed the "Tests Needed" label (Tests need to be written or merged) on Mar 18, 2024