Support general union sequence types #122

@rhdunn

Description

Use Case

There are a number of cases where it is beneficial to define a type more precisely (specifically in parameters and return types) as a union of item or sequence types, for example:

  1. a binary type over xs:hexBinary and xs:base64Binary;
  2. an element that accepts ol or ul html list element names;
  3. an options parameter that accepts strings (xs:string*), an element (element(options)), or a map;
  4. a function that takes JSON types (map, array, xs:integer, xs:decimal, xs:string).

A number of MarkLogic APIs already use types of this kind, and several EXPath and EXQuery specifications could take advantage of them. I've also used this in my XQuery IntelliJ plugin when defining vendor APIs whose signatures have changed between versions.

Examples

(: BaseX API change in 8.5 :)
declare function archive:options($archive as xs:base64Binary)
     as (element(archive:options) | map(*)) external;

declare function html:list($list as (element(ol) | element(ul))) { ... };

(: https://docs.marklogic.com/cts:classify -- MarkLogic defines this as `(element() | map:map)?` :)
declare function cts:classify($data-nodes as node()*,
                              $classifier as element(cts:classifier),
                              $options as (element()? | map:map?))
     as element(cts:label)* external;

(: https://docs.marklogic.com/cts:search :)
declare function cts:search($expression as node()*,
                            $query as cts:query?,
                            $options as (cts:order* | xs:string*))
     as node()* external;

Existing Support

  1. Local Union Types -- This handles support for unions over atomic types.
  2. Extending element and attribute tests to NameTest unions #23 -- This provides a more concise syntax for unions over element or attribute names.
  3. Types -- The Formal Semantics specification defines union types.
  4. Sequence Type Union -- This is the definition in my XQuery IntelliJ plugin.

Note: Because SequenceTypeUnion is already used in typeswitch expressions, XQuery implementations will have existing code for matching these union types.

Syntax

3.4 Sequence Types

SequenceTypeUnion ::= SequenceType  ("|"  SequenceType)*
SequenceType ::= EmptySequenceType | (ItemType OccurrenceIndicator?) | ParenthesizedSequenceType
EmptySequenceType ::= "empty-sequence" "(" ")"
ParenthesizedSequenceType ::= "(" SequenceTypeUnion ")"
ItemType ::= AnyItemTest | TypeName | KindTest | FunctionTest | MapTest | ArrayTest |
             AtomicOrUnionType | RecordTest | LocalUnionType | EnumerationType

Design Note: SequenceTypeUnion is an existing BNF symbol used in typeswitch expressions that is unchanged in this issue.

3.6 Item Types

ItemTypeUnion ::= ItemType  ("|"  ItemType)*
ParenthesizedItemType ::= "("  ItemTypeUnion  ")"
ParenthesizableItemType ::= ItemType | ParenthesizedItemType

Design Note: ItemTypeUnion mirrors SequenceTypeUnion, allowing the non-sequence unions to be used in the contexts where only item types are allowed. Implementations can make use of the SequenceTypeUnion logic after the syntax/parser validates the item type restriction in those contexts.
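
The validation step described in this note can be sketched roughly as follows. This is a minimal illustration in Python, not part of the proposal: the `Item`, `Empty`, and `UnionType` classes and the `is_item_type_union` function are hypothetical names for a parsed SequenceTypeUnion, and item types are represented only by their source text.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Item:
    """An item type with an optional occurrence indicator ("", "?", "*" or "+")."""
    name: str
    occurrence: str = ""


@dataclass(frozen=True)
class Empty:
    """The empty-sequence() type."""


@dataclass(frozen=True)
class UnionType:
    """A parsed SequenceTypeUnion of the form T1 | T2 | ..."""
    members: Tuple["SeqType", ...]


# Hypothetical alias for anything the parser can produce for a sequence type.
SeqType = Union[Item, Empty, UnionType]


def is_item_type_union(t: SeqType) -> bool:
    """Return True if t is also acceptable where only item types are allowed.

    That is: no occurrence indicators and no empty-sequence() anywhere in
    the (possibly nested) union.
    """
    if isinstance(t, Item):
        return t.occurrence == ""
    if isinstance(t, UnionType):
        return all(is_item_type_union(m) for m in t.members)
    return False


html_list = UnionType((Item("element(ol)"), Item("element(ul)")))
search_options = UnionType((Item("cts:order", "*"), Item("xs:string", "*")))
```

Under these assumptions `html_list` passes the item type restriction, while `search_options` does not because its members carry occurrence indicators. Once the check passes, the same union matching logic can be reused for both cases.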

Design Note: An alternative to this -- in order to minimize grammar changes -- would be to replace the ItemType with an ItemTypeBase symbol (or appropriately named alternative), and then define ItemType accordingly:
ItemTypeBase ::= AnyItemTest | TypeName | KindTest | ...
ItemTypeUnion ::= ItemTypeBase ("|" ItemTypeBase)*
ItemType ::= ItemTypeBase | ParenthesizedItemType
SequenceType ::= EmptySequenceType | (ItemTypeBase OccurrenceIndicator?) | ParenthesizedSequenceType

Other Changes

Design Note: If ItemType is changed to ParenthesizableItemType, these are the other places in the current XPath/XQuery 4.0 grammar that need changing.

ContextItemDecl ::= "declare"  "context"  "item"  ("as"  ParenthesizableItemType)?
                    ((":="  VarValue)  |  ("external"  (":="  VarDefaultValue)?))
ItemTypeDecl ::= "item-type" EQName "as" ParenthesizableItemType
TypedMapTest ::= "map" "(" ParenthesizableItemType "," SequenceType ")"
LocalUnionType ::= "union" "(" ParenthesizableItemType ("," ParenthesizableItemType)* ")"

Text

4.22.2 Typeswitch

The effective case definition is defined as:

The effective case in a typeswitch expression is the first case clause in which the value of the operand expression matches a SequenceType in the SequenceTypeUnion of the case clause, using the rules of SequenceType matching.

In order to make that fit this proposal, the wording should be updated to something like:

The effective case in a typeswitch expression is the first case clause in which the value of the operand expression matches the SequenceTypeUnion of the case clause, using the rules of SequenceType matching.
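
As a rough illustration of the reworded rule, the sketch below selects the effective case by matching the operand value against each clause's union as a whole. It is a simplified Python model, not an implementation of SequenceType matching: values are Python lists, item tests are plain predicates, and `seq_of`, `matches_union`, `effective_case`, `is_int`, and `is_str` are hypothetical helper names.

```python
from typing import Callable, List, Tuple

# Hypothetical model: a sequence type is a predicate over a list of items.
SequenceType = Callable[[list], bool]

# Allowed lengths for each occurrence indicator (None means unbounded).
BOUNDS = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}


def seq_of(item_test: Callable[[object], bool], occurrence: str = "") -> SequenceType:
    """Build a sequence type from an item test and an occurrence indicator."""
    lo, hi = BOUNDS[occurrence]

    def matches(value: list) -> bool:
        n = len(value)
        in_range = n >= lo and (hi is None or n <= hi)
        return in_range and all(item_test(v) for v in value)

    return matches


def matches_union(value: list, union: List[SequenceType]) -> bool:
    """A value matches a union if it matches at least one alternative."""
    return any(t(value) for t in union)


def effective_case(value: list, cases: List[Tuple[List[SequenceType], str]], default: str) -> str:
    """Return the label of the first case whose union matches, else the default."""
    for union, label in cases:
        if matches_union(value, union):
            return label
    return default


def is_int(v: object) -> bool:
    return isinstance(v, int) and not isinstance(v, bool)


def is_str(v: object) -> bool:
    return isinstance(v, str)


cases = [
    ([seq_of(is_int, "+"), seq_of(is_str, "+")], "ints-or-strings"),
    ([seq_of(is_str, "*")], "maybe-strings"),
]
```

In this model `effective_case([1, 2], cases, "default")` picks the first clause, an empty list falls through to the second clause, and a mixed list such as `[1, "a"]` reaches the default.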

3.7.2 The judgement subtype-itemtype(A, B)

Section (2), Conditions for atomic and union types, should add the following rules:

  1. A is an ItemTypeUnion in the form (T1 | T2 | ...) and every type T in (T1, T2, ...) satisfies subtype-itemtype(T, B).
  2. B is an ItemTypeUnion in the form (T1 | T2 | ...) and at least one type T in (T1, T2, ...) satisfies subtype-itemtype(A, T).
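
The two rules can be sketched in Python as below. This is only an illustration under stated assumptions: an ItemTypeUnion is modelled as a tuple of alternatives, the `PARENT` table is a toy fragment of the atomic type hierarchy, and `subtype_itemtype` is a hypothetical function name that stands in for the full judgement.

```python
from typing import Optional, Tuple, Union

# Toy fragment of the atomic type hierarchy (child -> parent); assumption
# for illustration only, the real judgement covers many more item types.
PARENT = {
    "xs:integer": "xs:decimal",
    "xs:decimal": "xs:anyAtomicType",
    "xs:string": "xs:anyAtomicType",
    "xs:hexBinary": "xs:anyAtomicType",
    "xs:base64Binary": "xs:anyAtomicType",
}

# An item type is a type name, or a tuple modelling (T1 | T2 | ...).
ItemType = Union[str, Tuple["ItemType", ...]]


def subtype_itemtype(a: ItemType, b: ItemType) -> bool:
    # Rule 1: every alternative of A must be a subtype of B.
    if isinstance(a, tuple):
        return all(subtype_itemtype(t, b) for t in a)
    # Rule 2: A must be a subtype of at least one alternative of B.
    if isinstance(b, tuple):
        return any(subtype_itemtype(a, t) for t in b)
    # Base case: walk up the toy hierarchy from A looking for B.
    current: Optional[str] = a
    while current is not None:
        if current == b:
            return True
        current = PARENT.get(current)
    return False


binary = ("xs:hexBinary", "xs:base64Binary")
```

With this model `xs:hexBinary` is a subtype of the `binary` union, the union is a subtype of `xs:anyAtomicType`, and the union is not a subtype of `xs:hexBinary` alone.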

3.7.1 The judgement subtype(A, B)

The first paragraph in this section shall be replaced by:

The judgement subtype(A, B) determines if the sequence type A is a subtype of the sequence type B. A can either be empty-sequence(), xs:error, an ItemType, Ai, possibly followed by an occurrence indicator, or a SequenceTypeUnion. Similarly B can either be empty-sequence(), xs:error, an ItemType, Bi, possibly followed by an occurrence indicator, or a SequenceTypeUnion.

The result of the subtype(A, B) judgement can be determined as follows:

  1. If A is a SequenceTypeUnion in the form (T1 | T2 | ...) and every type T in (T1, T2, ...) satisfies subtype(T, B), then subtype(A, B) is true.
  2. If B is a SequenceTypeUnion in the form (T1 | T2 | ...) and at least one type T in (T1, T2, ...) satisfies subtype(A, T), then subtype(A, B) is true.
  3. Otherwise, the result of the subtype(A, B) judgement can be determined from the table below, which makes use of the auxiliary judgement subtype-itemtype(Ai, Bi) defined in 3.7.2 The judgement subtype-itemtype(A, B).
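
A minimal Python sketch of these three steps follows. It makes several simplifying assumptions: `Seq`, `SeqUnion`, `item_subtype`, and `subtype` are hypothetical names, empty-sequence() and xs:error are left out, the table lookup is approximated by comparing occurrence ranges, and `item_subtype` is a stand-in for subtype-itemtype that only knows name equality and `item()`.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Seq:
    """An item type name with an occurrence indicator ("", "?", "*" or "+")."""
    item: str
    occurrence: str = ""


@dataclass(frozen=True)
class SeqUnion:
    """A SequenceTypeUnion of the form (T1 | T2 | ...)."""
    members: Tuple["SeqT", ...]


SeqT = Union[Seq, SeqUnion]

# Cardinality range per occurrence indicator (None means unbounded).
RANGES = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}


def item_subtype(a: str, b: str) -> bool:
    """Stand-in for subtype-itemtype: equal names, or B is item()."""
    return a == b or b == "item()"


def subtype(a: SeqT, b: SeqT) -> bool:
    # Step 1: every alternative of A must be a subtype of B.
    if isinstance(a, SeqUnion):
        return all(subtype(t, b) for t in a.members)
    # Step 2: A must be a subtype of at least one alternative of B.
    if isinstance(b, SeqUnion):
        return any(subtype(a, t) for t in b.members)
    # Step 3 (approximation of the table): A's cardinality range must fit
    # inside B's, and the item types must satisfy the item judgement.
    (alo, ahi), (blo, bhi) = RANGES[a.occurrence], RANGES[b.occurrence]
    within = alo >= blo and (bhi is None or (ahi is not None and ahi <= bhi))
    return within and item_subtype(a.item, b.item)


options = SeqUnion((Seq("cts:order", "*"), Seq("xs:string", "*")))
```

Here `subtype(Seq("xs:string", "+"), options)` holds because one alternative covers it, and `subtype(options, Seq("item()", "*"))` holds because every alternative is covered. The sketch checks alternatives of B one at a time, so in this model `xs:string*` is not reported as a subtype of `(xs:string? | xs:string+)`.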

Labels: Feature, XPath