[XDM] Terminology around "Atomic value" and "Type Annotation" #225

Closed
michaelhkay opened this issue Oct 30, 2022 · 3 comments
Labels
XDM An issue related to the XPath Data Model

Comments

@michaelhkay
Contributor

In response to the comments (see #202) against issue 196, I propose that we tighten up the terminology associated with atomic values.

In XDM, §2.7.5 says:

An atomic value can be constructed from a lexical representation. Given a string and an atomic type, the atomic value is constructed in such a way as to be [consistent with schema validation]. If the string does not represent a valid value of the type, an error is raised. When xs:untypedAtomic is specified as the type, no validation takes place. The details of the construction are described in [Section 18 Constructor functions ] and the related [Section 19 Casting ] section of [[XQuery and XPath Functions and Operators 3.1]].

The actual definition of "atomic value" is found in 2.1 Terminology and reads:

[Definition: An atomic value is a value in the value space of an [atomic type] and is labeled with the name of that atomic type.]

(Oddly, this definition isn't linked from §2.7.5.)

There's another little issue here, which is that an atomic value created by atomizing a schema-validated node may end up having an anonymous type, in which case its most specific type is inexpressible using XPath ItemType syntax. Is the type annotation in this case an atomic type, or is it the name of an atomic type? And should we avoid assuming that the concept of "item type" is synonymous with the ItemType construct in the XPath grammar? As a start, I would prefer to say that the type annotation is a type, rather than a name.

In a non-normative note in §2.7, XDM also says:

Values including element and attribute nodes, and atomic values, have a property called a type annotation whose value is a type: this is a reference to a type definition in the Schema Component Model.

It goes on to say (normatively, but rather informally):

Every [item] in the data model has both a value and a type. In addition to nodes, the data model can represent atomic values like the number 5 or the string “Hello World.” For each of these atomic values, the data model contains both the value of the item (such as 5 or “Hello World”) and its type. The property that holds the type is sometimes referred to as the type annotation: its value is a type definition component as defined in the Schema Component Model. This may be a built-in type (a type with a name such as xs:integer or xs:string), or a user-defined type.

This statement is misleading in a number of ways. Firstly, some items such as empty maps and arrays conform to many types, but they do not have a single defining type that trumps all the others. Secondly, the way the term "type annotation" is introduced fails to make clear that the type annotation of a node is something quite different from the type annotation of an atomic value; if an element named N has a type annotation of my:part-number, then its (most specific) type is element(N, my:part-number), while if an atomic value has a type annotation of my:part-number, then its most specific type is my:part-number. It also fails to make it clear that items like maps and arrays do not have a type annotation; their type is inferred from their content.
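To make the contrast concrete, here is a minimal Python sketch (an illustration only, not spec text). The class names `ElementNode` and `AtomicValue` and the function `most_specific_type` are hypothetical, and returning `None` for items without a type annotation is an assumption made purely for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementNode:
    name: str
    type_annotation: str  # name of a schema type, e.g. "my:part-number"


@dataclass(frozen=True)
class AtomicValue:
    type_annotation: str
    datum: object


def most_specific_type(item: object) -> Optional[str]:
    """Return a string rendering of the item's most specific type, if it has one."""
    if isinstance(item, ElementNode):
        # For a node, the annotation is only one ingredient of the item type.
        return f"element({item.name}, {item.type_annotation})"
    if isinstance(item, AtomicValue):
        # For an atomic value, the annotation is itself the item type.
        return item.type_annotation
    # Maps and arrays carry no type annotation; their types are inferred
    # from their content, so this sketch reports no single defining type.
    return None
```

The same annotation `my:part-number` therefore yields `element(N, my:part-number)` for an element named N but plain `my:part-number` for an atomic value, while an empty map (modelled here as a Python `dict`) yields nothing.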

The fuzziness of some of these definitions makes it very difficult to be sufficiently formal elsewhere in the language. For example in 4.0 we're proposing to allow "down-casting" (or "relabelling") of atomic values in the coercion rules, and it's very hard to describe this operation formally without a better model.

I would like to start by changing the definition to:

An atomic item (also known as an atomic value) is a pair (T, D) where T (the "type annotation") is an atomic type, and D (the "datum") is a point in the value space of T.

followed by references to atomic types and value spaces as concepts defined in XSD.

I don't expect we will want to use the term "datum" very often, but it's useful to have a name for the concept when we need it. We currently tend to call it the "value", which is very confusing: if the (T, D) pair is a value, then D can't also be a value.
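As a rough sketch of how the (T, D) model might support "relabelling", the Python below treats the type annotation as a type object rather than a name, so an anonymous type is simply one whose name is absent. Everything here is hypothetical and illustrative: the names `AtomicType`, `AtomicItem`, `relabel` and `my:small-int`, the `in_value_space` predicate, and the simplification that xs:integer has no base type (in XSD it derives from xs:decimal).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class AtomicType:
    name: Optional[str]            # None models an anonymous type
    base: Optional["AtomicType"]   # type this one is derived from by restriction
    in_value_space: Callable[[object], bool]

    def derives_from(self, other: "AtomicType") -> bool:
        """True if self is other, or is derived from it by restriction."""
        t: Optional[AtomicType] = self
        while t is not None:
            if t is other:
                return True
            t = t.base
        return False


@dataclass(frozen=True)
class AtomicItem:
    type_annotation: AtomicType  # T
    datum: object                # D, a point in the value space of T

    def __post_init__(self) -> None:
        if not self.type_annotation.in_value_space(self.datum):
            raise ValueError("datum is not in the value space of the type")


def relabel(item: AtomicItem, target: AtomicType) -> AtomicItem:
    """Down-cast: keep D, replace T by a type derived from the current T."""
    if not target.derives_from(item.type_annotation):
        raise TypeError("target is not derived from the current type")
    # The AtomicItem constructor re-checks that D lies in target's value space.
    return AtomicItem(target, item.datum)


# Simplified example types (assumptions for illustration only).
XS_INTEGER = AtomicType(
    "xs:integer", None,
    lambda d: isinstance(d, int) and not isinstance(d, bool))
MY_SMALL = AtomicType(
    "my:small-int", XS_INTEGER,
    lambda d: XS_INTEGER.in_value_space(d) and 0 <= d < 100)
ANONYMOUS = AtomicType(
    None, XS_INTEGER,
    lambda d: XS_INTEGER.in_value_space(d) and d % 2 == 0)
```

With these definitions, `relabel(AtomicItem(XS_INTEGER, 5), MY_SMALL)` keeps the datum 5 and changes only the annotation, whereas relabelling the datum 500 raises an error because 500 is outside the value space of the target type.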

michaelhkay added the XDM and Clarification labels Oct 31, 2022
@michaelhkay
Contributor Author

Note also that the definition of "Atomic type" in XDM 3.1 is completely wrong. It says "[Definition: An atomic type is a primitive simple type or a type derived by restriction from another atomic type.] (Types derived by list or union are not atomic.)", and the link to "primitive simple type" takes you to a definition that includes built-in list and union types as well as atomic types.
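A small Python sketch of what the definition presumably intends, assuming that "primitive" is meant to cover only the XSD primitive datatypes such as xs:string, and not built-in list types such as xs:NMTOKENS. The `SimpleType` class, the `is_atomic` function and the type `my:short-token-list` are hypothetical, and the derivation chain from xs:string to xs:NMTOKEN is shortened for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimpleType:
    name: str
    derivation: str                      # "primitive", "restriction", "list" or "union"
    base: Optional["SimpleType"] = None  # base type, for a restriction


def is_atomic(t: SimpleType) -> bool:
    """Atomic = primitive, or derived by restriction from an atomic type."""
    if t.derivation == "primitive":
        return True
    if t.derivation == "restriction" and t.base is not None:
        return is_atomic(t.base)
    # Types derived by list or union are not atomic.
    return False


XS_STRING = SimpleType("xs:string", "primitive")
# Intermediate types between xs:string and xs:NMTOKEN are omitted here.
XS_NMTOKEN = SimpleType("xs:NMTOKEN", "restriction", XS_STRING)
XS_NMTOKENS = SimpleType("xs:NMTOKENS", "list")
MY_TOKENS = SimpleType("my:short-token-list", "restriction", XS_NMTOKENS)
```

Under this reading, xs:string and xs:NMTOKEN are atomic, while xs:NMTOKENS and any restriction of it are not, which is the distinction the linked definition of "primitive simple type" loses.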

@michaelhkay
Contributor Author

PR #232 has been raised.

@michaelhkay
Contributor Author

The issue is now resolved.
