From 1932f32ae4b4ea6e79e7af6f4618f62c390a9f37 Mon Sep 17 00:00:00 2001
From: Michael Kay
Date: Fri, 8 Sep 2023 20:38:19 +0100
Subject: [PATCH] 688 Semantics of local union types, enumeration types, and type alias names

---
 specifications/grammar-40/xpath-grammar.xml | 5 +-
 specifications/xquery-40/src/errors.xml | 13 ++
 specifications/xquery-40/src/expressions.xml | 184 +++++++++++++-----
 specifications/xquery-40/src/query-prolog.xml | 15 +-
 4 files changed, 167 insertions(+), 50 deletions(-)

diff --git a/specifications/grammar-40/xpath-grammar.xml b/specifications/grammar-40/xpath-grammar.xml
index 0bbd03e23..4f0846778 100644
--- a/specifications/grammar-40/xpath-grammar.xml
+++ b/specifications/grammar-40/xpath-grammar.xml
@@ -2573,7 +2573,7 @@ ErrorVal ::= "$" VarName - + ?
@@ -2756,11 +2756,12 @@ ErrorVal ::= "$" VarName - + +
diff --git a/specifications/xquery-40/src/errors.xml b/specifications/xquery-40/src/errors.xml
index 9529e167a..0337cde4c 100644
--- a/specifications/xquery-40/src/errors.xml
+++ b/specifications/xquery-40/src/errors.xml
@@ -1081,6 +1081,19 @@ It is a static error if the name of a feature in will always return an empty sequence.

+ +

It is a static error if two or more item types + declared or imported by a module have equal expanded QNames (as defined by the eq + operator).

+
+ + +

It is a static error if a type used as a member + type in a LocalUnionType is not a + .

+
+ diff --git a/specifications/xquery-40/src/expressions.xml b/specifications/xquery-40/src/expressions.xml index 76dc9b3df..3af9fb799 100644 --- a/specifications/xquery-40/src/expressions.xml +++ b/specifications/xquery-40/src/expressions.xml @@ -4270,7 +4270,7 @@ the schema type named us:address.

Although the grammar allows any ItemType to appear, each ItemType - must identify a . [TODO: error code]

+ must identify a .

A LocalUnionType is a generalized atomic type. It is classified @@ -4309,26 +4309,54 @@ the schema type named us:address.

-

An item matches an EnumerationType if it is an instance of xs:string, - and is equal to one of the string literals listed within the parentheses, when compared - using the codepoint collation.

+

An EnumerationType has a value space consisting of a set of xs:string + values. When matching strings against an enumeration type, strings are always compared + using the Unicode codepoint collation.

-

For example, the type enum("red", "green", "blue") matches the string "green".

+

For example, if an argument of a function declares the required type + as enum("red", "green", "blue"), then the string "green" is accepted, + while "yellow" is rejected with a type error.

- -

Unlike a schema-defined type that restricts xs:string with an enumeration facet, - matching of an EnumerationType is based purely on value comparison, and not on type - annotations. For example, if color is a schema-defined atomic type derived from - xs:string with an enumeration facet permitting the values - ("red", "green", "blue"), - the expression "green" instance of color returns false, because the type annotation - does not match. By contrast, "green" instance of enum("red", "green", "blue") - returns true.

-

An EnumerationType matches only xs:string values, not - xs:untypedAtomic or xs:anyURI values, even though these might compare - equal. However, the coercion rules allow xs:untypedAtomic or - xs:anyURI values to be supplied where the required type is an enumeration type.

-
+

Technically, enumeration types are defined as follows:

+ + +

An enumeration type with a single enumerated value (such as + enum("red")) is an atomic type + derived from xs:string by restriction using an enumeration facet + that permits only the value "red". This is referred to + as a singleton enumeration type. It is equivalent to the XSD-defined type:

+ + + + + ]]>
+

Two singleton enumeration types are the same type if and only + if they have the same (single) enumerated value, as determined using the Unicode + codepoint collation.

+

An enumeration type with multiple + enumerated values is a union of singleton enumeration types, + so enum("red", "green", "blue") + is equivalent to union(enum("red"), enum("green"), enum("blue")).

+

In consequence, an enumeration type T is a subtype + of an enumeration type U if the enumerated values of T + are a subset of the enumerated values of U: + see .

+ +
+ +

An enumeration type is thus a .
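
The definitions above can be modelled outside XQuery to check their subtype consequences. The following Python sketch is illustrative only and is not part of the specification: the names `enum`, `singletons`, and `is_subtype` are hypothetical, and an enumeration type is represented simply by the set of its enumerated strings.

```python
# Illustrative model, not normative: an enumeration type is represented by
# the set of its enumerated strings. Python compares str values code point
# by code point, which stands in here for the Unicode codepoint collation.

def enum(*values):
    """Model enum("a", "b", ...) as a frozenset of its enumerated values."""
    return frozenset(values)


def singletons(enum_type):
    """Decompose an enumeration into its singleton member types."""
    return {frozenset([value]) for value in enum_type}


def is_subtype(t, u):
    """T is a subtype of U when every singleton member of T is a member of U,
    which is the same as the values of T being a subset of the values of U."""
    return singletons(t) <= singletons(u)


print(is_subtype(enum("red"), enum("red", "green")))                    # True
print(is_subtype(enum("red", "green"), enum("red", "green", "blue")))   # True
print(is_subtype(enum("red", "yellow"), enum("red", "green", "blue")))  # False
```

Representing each type as a set makes the "union of singleton enumeration types" reading and the "subset of enumerated values" reading visibly coincide.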

+ +

It follows from these rules that an atomic value will only satisfy an instance of + test if it has the correct type annotation, and this can only be achieved using an explicit cast or + constructor function. So the expression "red" instance of enum("red", "green", "blue") + returns false. + However, the ensure that where a variable + or function declaration specifies an enumeration type as the required type, a string + (or indeed an xs:untypedAtomic or xs:anyURI value) equal + to one of the enumerated values will be accepted.
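
The difference between an instance of test and coercion can be sketched as follows. This Python model is an assumption-laden simplification, not the normative rules: `AtomicValue`, `instance_of_color`, and `coerce_to_color` are hypothetical names, type annotations are plain strings, and types derived from xs:string are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicValue:
    value: str
    type_name: str  # the type annotation carried by the value


COLOR_VALUES = ("red", "green", "blue")
# Member types of enum("red", "green", "blue"), one singleton per value.
COLOR_MEMBERS = {f'enum("{v}")' for v in COLOR_VALUES}
# Source types that this sketch lets the coercion rules accept.
COERCIBLE = ("xs:string", "xs:untypedAtomic", "xs:anyURI")


def instance_of_color(item):
    """instance of: inspects only the type annotation, never the value."""
    return item.type_name in COLOR_MEMBERS


def coerce_to_color(item):
    """Coercion (simplified): accept a matching value and relabel it as an
    instance of the corresponding singleton enumeration type."""
    if instance_of_color(item):
        return item
    if item.type_name in COERCIBLE and item.value in COLOR_VALUES:
        return AtomicValue(item.value, f'enum("{item.value}")')
    raise TypeError(f"{item.value!r} is not accepted by the enumeration type")


red = AtomicValue("red", "xs:string")
print(instance_of_color(red))                   # False
print(instance_of_color(coerce_to_color(red)))  # True
```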

+ + @@ -5848,6 +5876,12 @@ declare function flatten($tree as tree) as item()* { by restriction from xs:decimal.

xs:decimal ⊆ xs:numeric because xs:numeric is a pure union type that includes xs:decimal as a member type.

+

enum("red") ⊆ xs:string because the singleton + enumeration type enum("red") is defined to be an atomic + type derived from xs:string.

+

enum("red") ⊆ enum("red", "green") because the + enumeration type enum("red", "green") is defined to be a union type + that has the atomic type enum("red") as a member type.

@@ -5862,23 +5896,24 @@ declare function flatten($tree as tree) as item()* { because xs:short ⊆ xs:integer and xs:long ⊆ xs:integer.

union(P, Q) ⊆ union(P, Q, R) because P ⊆ union(P, Q, R) and Q ⊆ union(P, Q, R).

+

enum("red", "green") ⊆ xs:string because the + enumeration type enum("red") ⊆ xs:string + and enum("green") ⊆ xs:string.

+

enum("red", "green") ⊆ enum("red", "green", "blue") because + enum("red") ⊆ enum("red", "green", "blue") and + enum("green") ⊆ enum("red", "green", "blue").

+

enum("red", "green", "blue") ⊆ union(enum("red", "green"), enum("blue")) because + each of the types enum("red"), enum("green"), and enum("blue") + is a subtype of one of the two members of the union type.

This rule applies both when A is a schema-defined union type - and when it is a LocalUnionType. + and when it is a LocalUnionType; in addition it + applies when A is an enumeration type with multiple enumerated values, + which is defined to be equivalent to a union type.

- -

A is an , and B matches - every string literal in the enumeration of A.

- - Examples: - -

enum("red", "green", "blue") ⊆ xs:string

-

enum("red", "green", "blue") ⊆ enum("red", "green", "blue", "yellow")

-
-
-
+ @@ -8868,12 +8903,11 @@ return $incrementors[2](4)]]>

If T is a SequenceType - whose ItemType is a - other than an enumeration type, (possibly with an occurrence indicator + whose ItemType is a (possibly with an occurrence indicator *, +, or ?), then the following conversions are applied, in order:

-

TODO: coercion for enumeration types needs further work.

+

Enumeration types are generalized atomic types, so these rules apply.

@@ -8886,7 +8920,7 @@ return $incrementors[2](4)]]>

Each item in the atomic sequence that is of type xs:untypedAtomic is cast to the expected - atomic type. If the expected atomic type is an + generalized atomic type. If the expected atomic type is an , the value is cast to xs:string. If the item is of type xs:untypedAtomic and the expected type is Relabeling occurs only when T is a derived type. Promotion and relabeling are therefore never combined.

+ +

If T is a sequence type whose item type is a U, + then any atomic value A in the atomic sequence is relabeled as an instance of + some member type M in the transitive membership of U if M satisfies + all the conditions for relabeling defined in the previous rule, and if it is the first member type + in the transitive membership of U to satisfy those conditions. For example, if T + is the type union(xs:negativeInteger, xs:positiveInteger)* and the supplied value is the + sequence (20, -20), then the first item 20 is relabeled as type + xs:positiveInteger and the second item -20 is relabeled as type + xs:negativeInteger. + 

+ +

This rule also ensures that if the required type is enum("red", "green", "blue") + and the supplied value is "green", then the supplied value will be accepted, and + will be relabeled as an instance of the derived atomic type enum("green").
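
The "first member type that satisfies the conditions" selection can be sketched in Python. This is a hedged model rather than the specification's algorithm: `relabel` is a hypothetical helper, and each member type is reduced to a name plus a predicate standing in for "all the conditions for relabeling".

```python
def relabel(value, member_types):
    """Return the name of the first member type, in declaration order,
    whose predicate accepts the value; None if no member type accepts it."""
    for type_name, accepts in member_types:
        if accepts(value):
            return type_name
    return None


# union(xs:negativeInteger, xs:positiveInteger), members in declared order.
SIGNED = [
    ("xs:negativeInteger", lambda v: isinstance(v, int) and v < 0),
    ("xs:positiveInteger", lambda v: isinstance(v, int) and v > 0),
]

# enum("red", "green", "blue") viewed as a union of singleton enumerations.
COLOR = [
    (f'enum("{c}")', lambda v, c=c: v == c) for c in ("red", "green", "blue")
]

print([relabel(v, SIGNED) for v in (20, -20)])
# ['xs:positiveInteger', 'xs:negativeInteger']
print(relabel("green", COLOR))
# enum("green")
```

Member order matters only when more than one member type accepts the same value; in both unions above the member types are disjoint.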

+
+
@@ -18830,6 +18881,16 @@ matching
; otherwise it returns false. For example:

>dynamic error is raised .

+ +

An instance of test does not allow any kind of casting or coercion. + The results may therefore be counterintuitive. For example, the expression + 3 instance of xs:positiveInteger returns false, because + the expression 3 evaluates to an instance of xs:integer, + not xs:positiveInteger. For similar reasons, "red" instance of + enum("red", "green", "blue") returns false.

+ +

On such occasions, a castable as test may be more appropriate: + see

@@ -18964,6 +19025,7 @@ be used to process an expression in a way that depends on its +

Sometimes it is necessary to convert a value to a specific datatype. For this @@ -18972,13 +19034,17 @@ creates a new value of a specific type based on an existing value. A cast expression takes two operands: an input expression and a target type. The type of the atomized value of the input expression is called the input type. -The target type must be either of:

+The target type must be one of:

+

The name of an item type alias + defined in the , which in turn must refer to an item + type in one of the following categories.

The name of a type defined in the in-scope schema types, which must be a simple type . In addition, the target type cannot be xs:NOTATION, xs:anySimpleType, or xs:anyAtomicType

A LocalUnionType such as union(xs:date, xs:dateTime).

+

An EnumerationType such as enum("red", "green", "blue").

Otherwise, a static error is raised .

The optional occurrence indicator ? denotes that an empty @@ -19084,7 +19150,12 @@ The result of a cast expression is one of the following:

- +

Casting to an enumeration type relies on the fact that an enumeration type + is a generalized atomic type. So cast $x as enum("red") is equivalent + to casting to an anonymous atomic type derived from xs:string + whose enumeration facet restricts the value space to the single string "red", + while cast $x as enum("red", "green") is equivalent to casting + to union(enum("red"), enum("green")).

@@ -19100,19 +19171,24 @@ The result of a cast expression is one of the following: +

&language; provides an expression that tests whether a given value is castable into a given target type. -The target type must be either of:

+The target type must be one of:

+

The name of an item type alias + defined in the , which in turn must refer to an item + type in one of the following categories.

The name of a type defined in the in-scope schema types, which must be a simple type . In addition, the target type cannot be xs:NOTATION, xs:anySimpleType, or xs:anyAtomicType

A LocalUnionType such as union(xs:date, xs:dateTime).

+

An EnumerationType such as enum("red", "green", "blue").

@@ -19122,8 +19198,9 @@ if the result of evaluating E can be successfully cast into the target type T by using a cast expression; otherwise it returns false. If evaluation of E fails with a dynamic error or if the value of E cannot be atomized, -the castable expression as a whole fails. -The castable expression can be used as a castable expression as a whole fails.

+ +

The castable expression can be used as a predicate to avoid errors at evaluation time. @@ -19136,18 +19213,31 @@ if ($x castable as hatsize) else if ($x castable as IQ) then $x cast as IQ else $x cast as xs:string]]> + + +

The expression $x castable as enum("red", "green", "blue") + is for most practical purposes equivalent to $x = ("red", "green", "blue"); + the main difference is that it uses the Unicode codepoint collation for comparing strings, + not the default collation from the static context.
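
The collation difference can be illustrated with a small Python model restricted to string inputs. The function names are hypothetical, and the case-insensitive comparison is only an assumed example of a default collation that a static context might supply.

```python
COLORS = ("red", "green", "blue")


def castable_as_color(x):
    """Model of castable as enum(...): exact code point equality only."""
    return any(x == c for c in COLORS)


def general_equals(x, collation_key=lambda s: s):
    """Model of $x = ("red", "green", "blue"), with the default collation
    represented by a key function applied to both operands."""
    return any(collation_key(x) == collation_key(c) for c in COLORS)


print(castable_as_color("green"))                         # True
print(castable_as_color("GREEN"))                         # False
print(general_equals("GREEN", collation_key=str.casefold))  # True
```

With a codepoint default collation the two functions agree on every string; they diverge only when the default collation treats distinct code point sequences as equal.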

+
Constructor Functions -

For every simple type in the For every simple type in the in-scope schema types (except xs:NOTATION and xs:anyAtomicType, and xs:anySimpleType, which are not instantiable), a constructor function is implicitly defined. In each case, the name of the constructor function is the same as the name of its target type (including namespace). The signature of the constructor function for a given type depends on the type that is being constructed, and can be found in . All constructor functions for atomic types are system functions.

+ spec="FO31" ref="constructor-functions"/>. + There is also a constructor function for every named item type in the + that maps to a .

+ +

All such constructor functions are classified as + system functions.

The constructor function for a given type is used to convert instances of other simple types into the given type. The semantics of the constructor function call T($arg) are defined to be equivalent to the expression The constructor function for a given type is used to convert instances of other simple types into the given type. + The semantics of the constructor function call T($arg) are defined to be equivalent to the expression (($arg) cast as T?).

The following examples illustrate the use of constructor functions:

@@ -19187,7 +19277,7 @@ equivalent to +

If usa:zipcode is a user-defined atomic type in the .

+ +

If my:chrono is defined as a type alias for + union(xs:date, xs:time, xs:dateTime), then the result + of my:chrono("12:00:00Z") is the xs:time + value 12:00:00Z.
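
The outcome of the my:chrono example follows from trying the member types of the union in order until one accepts the lexical form. The Python sketch below models that under stated assumptions: `construct_chrono` is a hypothetical name, and the regular expressions are simplified lexical forms that do not validate field ranges such as month or hour values.

```python
import re

# Simplified lexical forms (assumption: no range checks on the fields).
TZ = r"(Z|[+-]\d{2}:\d{2})?"
DATE = r"-?\d{4,}-\d{2}-\d{2}"
TIME = r"\d{2}:\d{2}:\d{2}(\.\d+)?"

# Member types of union(xs:date, xs:time, xs:dateTime), in declared order.
MY_CHRONO = [
    ("xs:date", re.compile(DATE + TZ)),
    ("xs:time", re.compile(TIME + TZ)),
    ("xs:dateTime", re.compile(DATE + "T" + TIME + TZ)),
]


def construct_chrono(lexical):
    """Return (member type name, value) for the first member type whose
    lexical form matches the whitespace-trimmed input."""
    text = lexical.strip()
    for type_name, pattern in MY_CHRONO:
        if pattern.fullmatch(text):
            return (type_name, text)
    raise ValueError(f"{lexical!r} cannot be cast to any member type")


print(construct_chrono("12:00:00Z"))
# ('xs:time', '12:00:00Z')
```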

+
diff --git a/specifications/xquery-40/src/query-prolog.xml b/specifications/xquery-40/src/query-prolog.xml index d7273e7e2..9cb8aa6e2 100644 --- a/specifications/xquery-40/src/query-prolog.xml +++ b/specifications/xquery-40/src/query-prolog.xml @@ -1751,7 +1751,7 @@ local:depth(doc("partlist.xml")) Item Type Declarations

An item type declaration defines a name for an item type. Defining a name for an item type - allows it to be referenced (using the syntax item-type(name) rather than repeating + allows it to be referenced by name rather than repeating the item type definition in full.

@@ -1779,7 +1779,7 @@ local:depth(doc("partlist.xml")) -

If the name of the item type is written as an (unprefixed) NCName, then +

If the name of the item type being declared is written as an (unprefixed) NCName, then it is interpreted as being in no namespace.

All item type names declared in a library module must (when expanded) be in the target namespace of the @@ -1803,8 +1803,15 @@ local:depth(doc("partlist.xml")) %private and a %public annotation, more than one %private annotation, or more than one %public annotation.

-

A static error must be reported if the definition of item types is cyclic: that is, if the definition - of an item type depends directly or indirectly on itself. [TODO: ERROR CODE]

+

It is a static error if two item type declarations (whether locally declared in a module or + imported from a public declaration in an imported module) share the same name. +

+ +

The declaration of an item type (whether locally declared in a module or + imported from a public declaration in an imported module) must precede any use of the + item type name: that is, the name only becomes available in the static context of constructs + that lexically follow the relevant item type declaration or module import. A consequence + of this rule is that cyclic and self-referential definitions are not allowed.
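
The way the declare-before-use rule excludes cycles can be sketched in Python. This is an illustrative checker, not the processor's algorithm: `check_declarations` is a hypothetical name, each declaration is reduced to its name plus the item type names its definition refers to, and imported declarations are assumed to appear in the list at the position of the module import.

```python
def check_declarations(declarations):
    """declarations: list of (name, referenced_names) in lexical order.
    Raises ValueError on a duplicate name or on a reference to a name
    that has not been declared earlier in the list."""
    declared = set()
    for name, references in declarations:
        if name in declared:
            raise ValueError(f"duplicate item type name: {name}")
        for ref in references:
            # The name being declared is not yet in scope, so a
            # self-reference fails here just like a forward reference.
            if ref not in declared:
                raise ValueError(f"{name} refers to undeclared type {ref}")
        declared.add(name)
    return declared


# Accepted: every reference points at an earlier declaration.
check_declarations([("color", []), ("palette", ["color"])])

# Rejected: a cycle always contains at least one forward or self reference.
for bad in ([("a", ["b"]), ("b", ["a"])], [("tree", ["tree"])]):
    try:
        check_declarations(bad)
    except ValueError as err:
        print(err)
```

Because a name enters scope only after its own declaration, any cyclic chain of definitions must contain a reference that fails this check.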