How to format decimals? #58

wschella · 2019-05-16T08:36:46Z

With regard to the formatting of decimals, how should spec-compliant engines handle trailing zeros in decimals? I find that the expected results in this test suite are inconsistent, i.e. sometimes they expect a trailing zero, sometimes they do not.

Some examples that have no trailing zero:

Some examples that do have a trailing zero:

Should this test suite be consistent in its formatting? Either consistently have a trailing zero (in conformance with the canonical representation), or consistently remove it.
Or should we not compare the outputs by string equality?

cc: @rubensworks

rubensworks · 2019-05-16T08:59:56Z

Following the RDF spec, literals should definitely be strictly equal:

Literal term equality: Two literals are term-equal (the same RDF literal) if and only if the two lexical forms, the two datatype IRIs, and the two language tags (if any) compare equal, character by character. Thus, two literals can have the same value without being the same RDF term.
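To make the quoted definition concrete, here is a minimal Python sketch. The `Literal` class and both helper functions are hypothetical stand-ins written for this illustration, not the API of any RDF library. It shows two literals that have the same value without being the same RDF term.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


@dataclass(frozen=True)
class Literal:
    """Hypothetical minimal RDF literal, for illustration only."""
    lexical: str
    datatype: str
    language: Optional[str] = None


def term_equal(a: Literal, b: Literal) -> bool:
    # Term equality: lexical form, datatype IRI and language tag
    # must all compare equal, character by character.
    return (a.lexical, a.datatype, a.language) == (b.lexical, b.datatype, b.language)


def decimal_value(lit: Literal) -> Decimal:
    # Map an xsd:decimal lexical form to its value.
    # Assumes the lexical form is valid for xsd:decimal.
    return Decimal(lit.lexical)


one = Literal("1", XSD_DECIMAL)
one_point_zero = Literal("1.0", XSD_DECIMAL)

print(term_equal(one, one_point_zero))                      # False
print(decimal_value(one) == decimal_value(one_point_zero))  # True
```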

I think we could go two ways:

1. It should be defined somewhere (probably SPARQL 1.2) that functions should produce literals in canonical form.
2. Mention in the test suite that literals in results should be compared by first converting to their canonical form, and then comparing character by character.

1 is IMO the cleanest solution, but it requires a spec update (unless I'm missing something). 2 is probably the most practical one.
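The second option could look roughly like the Python sketch below: normalize both lexical forms, then compare character by character. The function names and the `require_point` switch are assumptions of this sketch. The switch exists only because an integer-valued decimal can be rendered with or without the decimal point, and the sketch does not claim which rendering a given XSD version calls canonical.

```python
import re

# Lexical space of xsd:decimal: optional sign, digits, optional fraction.
_DECIMAL = re.compile(r"([+-]?)(?:(\d+)(?:\.(\d*))?|\.(\d+))")


def normalize_decimal(lexical: str, require_point: bool = True) -> str:
    """Return one normalized lexical form per xsd:decimal value."""
    m = _DECIMAL.fullmatch(lexical)
    if m is None:
        raise ValueError(f"not an xsd:decimal lexical form: {lexical!r}")
    sign, whole, frac_a, frac_b = m.groups()
    whole = (whole or "").lstrip("0") or "0"       # drop leading zeros
    frac = (frac_a or frac_b or "").rstrip("0")    # drop trailing zeros
    if sign == "+" or (whole == "0" and not frac):
        sign = ""                                  # no "+", and zero is unsigned
    if frac:
        return f"{sign}{whole}.{frac}"
    return f"{sign}{whole}.0" if require_point else f"{sign}{whole}"


def same_decimal(a: str, b: str) -> bool:
    # Normalize first, then compare character by character.
    return normalize_decimal(a) == normalize_decimal(b)


print(normalize_decimal("1.00"))                       # 1.0
print(normalize_decimal("1.00", require_point=False))  # 1
print(same_decimal("0", "0.0"))                        # True
```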

gkellogg · 2019-05-16T18:25:03Z

According to the SPARQL 1.1 Grammar trailing digits in number representations are preserved, and so 1.0 and 1.00 would be considered different terms. IIRC, if the tests represent literals differently, it is intentional.

rubensworks · 2019-05-17T06:29:09Z

According to the SPARQL 1.1 Grammar trailing digits in number representations are preserved, and so 1.0 and 1.00 would be considered different terms.

Indeed, if terms originate from the underlying data source, then the terms must be returned as-is. However, this issue is about the response of functions, for which there seems to be no guideline on how decimals should be formatted.

IIRC, if the tests represent literals differently, it is intentional.

It is indeed possible that this was intentional, but it is unclear to me what the reason for this was.

For example in the spec tests, SECONDS("2010-06-21T11:28:01Z"^^xsd:dateTime) produces "1"^^http://www.w3.org/2001/XMLSchema#decimal, while 0/2 produces "0.0"^^http://www.w3.org/2001/XMLSchema#decimal.

However, the SPARQL spec does not seem to make any indications regarding the required format of these decimals, so I would expect the spec tests to be at least consistent in this regard. Perhaps there is some other reason for this inconsistency?

rubensworks · 2019-05-17T06:39:55Z

Pinging @cygri to confirm that this is indeed something underspecified in SPARQL 1.1, and whether or not this should be added to the errata.

afs · 2019-05-17T09:21:31Z

Functions return values so the way to compare is by value, not by term. (2 is one way to do that but the principle to state is that it is value-equality).

XSD changed the canonical form of decimals between 1.0 and 1.1 to require the decimal point. It was "1"^^xsd:decimal, and became "1.0"^^xsd:decimal.

The SECONDS case is different because of context of use. There is a reasonable expectation that the term format is like xsd:dateTime, with a fixed two digits, so that (concat (str (hours ?dt)) ":" (str (minutes ?dt)) ":" (str (seconds ?dt))) generates a legal lexical form.
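A small Python sketch of why the digit count matters when the components are concatenated. The pattern is a simplified assumption covering only the time-of-day part of an xsd:dateTime lexical form (no timezone, no range checks), and `join_time` is a hypothetical helper that mirrors the concat expression.

```python
import re

# Simplified time-of-day pattern: two-digit hours, minutes and seconds,
# with optional fractional seconds. Timezone and range checks are omitted.
TIME_PART = re.compile(r"\d{2}:\d{2}:\d{2}(\.\d+)?")


def join_time(hours: str, minutes: str, seconds: str) -> str:
    # Mirrors concat(str(hours), ":", str(minutes), ":", str(seconds)).
    return ":".join((hours, minutes, seconds))


unpadded = join_time("11", "28", "1")   # seconds rendered as "1"
padded = join_time("11", "28", "01")    # seconds rendered as "01"

print(bool(TIME_PART.fullmatch(unpadded)))  # False
print(bool(TIME_PART.fullmatch(padded)))    # True
```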

Or what about xsd:decimal("0") or STRDT("0", xsd:decimal)? Canonical, or did the application writer write it like that because they wanted exactly that lexical form? Both are reasonable.

xsd:precisionDecimal preserves trailing zeros in the canonical form.

I don't think being totally prescriptive about term representation is a good idea. It is the value that matters.

cygri · 2019-05-18T11:33:22Z

If the SPARQL spec doesn’t say that a particular lexical form is generated, then any lexical form that has the correct value should be considered correct.

So, as Andy said, the test suite documentation should state that literals are compared by value. Canonicalising actual and expected value before comparing them in the test runner is one way to achieve comparison by value.

(For strDT, the language of the spec demands that a literal with exactly the given lexical form is generated, so arguably substituting a different form of the same value would be incorrect there. For the cast/constructor functions, the spec makes no such demand.)
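A test runner could also compare by value directly, without producing a canonical string first. The Python sketch below is one possible shape under stated assumptions: the `PARSERS` table and `literals_match` are hypothetical names, literals are modelled as plain (lexical form, datatype IRI) pairs, and only two datatypes are covered. Python's `int` and `Decimal` parsers accept some strings outside the XSD lexical spaces, so a real runner would want stricter validation.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical table of value parsers, keyed by datatype IRI.
PARSERS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
}


def literals_match(expected, actual):
    """Compare two (lexical form, datatype IRI) pairs by value where possible."""
    (exp_lex, exp_dt), (act_lex, act_dt) = expected, actual
    if exp_dt != act_dt:
        # Assumption of this sketch: datatypes must agree exactly.
        return False
    parse = PARSERS.get(exp_dt)
    if parse is None:
        # No known value mapping: fall back to comparing lexical forms.
        return exp_lex == act_lex
    try:
        return parse(exp_lex) == parse(act_lex)
    except (ValueError, ArithmeticError):
        # Ill-formed lexical form: fall back to comparing lexical forms.
        return exp_lex == act_lex


print(literals_match(("0.0", XSD + "decimal"), ("0", XSD + "decimal")))  # True
print(literals_match(("1", XSD + "decimal"), ("1", XSD + "integer")))    # False
```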

ericprud · 2021-09-23T08:15:33Z

If the SPARQL spec doesn’t say that a particular lexical form is generated, then any lexical form that has the correct value should be considered correct.

But that leaves STR() of that value (needlessly?) underspecified. Also, those STRs might be used to construct important things like identifiers.

I think we'd like predictable behavior for at least integers. I.e. we'd like IRI(CONCAT("http://a.example/id=", STR(1123-1000))) to return <http://a.example/id=123> and not <http://a.example/id=0123> or <http://a.example/id=+123>.

Decimals are used less frequently to construct terms but if we say that integer constructors MUST return canonical forms, we may as well do the same for all supported XSD datatypes with a canonical form: e.g. IRI(CONCAT("http://a.example/id=", STR(1123.0-1000))) produces exactly <http://a.example/id=123.0> and not something with arbitrary leading and trailing zeros.
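To illustrate the concern, here is a short Python sketch in which `make_id_iri` is a hypothetical stand-in for IRI(CONCAT("http://a.example/id=", STR(...))). Three lexical forms of the same integer value yield three different identifiers.

```python
def make_id_iri(lexical: str) -> str:
    # Stand-in for IRI(CONCAT("http://a.example/id=", STR(...))),
    # where `lexical` plays the role of the STR() result.
    return "<http://a.example/id=" + lexical + ">"


# Three lexical forms that all denote the integer 123.
forms = ("123", "0123", "+123")
iris = [make_id_iri(form) for form in forms]

for iri in iris:
    print(iri)

# Same value, but three distinct IRIs.
print(len(set(iris)))  # 3
```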

There may be some cases where it's hard to tell if something is being constructed vs. a lexical form is being cast, but hopefully they'll surface in this issue.

iirc, the SPARQL 1.0 tests assumed canonical forms, which means any vestiges of them should be updated to reflect the mildly spec-breaking 1 -> 1.0 change that @afs mentioned above.

wschella mentioned this issue May 16, 2019

Decimal formatting comunica/sparqlee#28

Closed

rubensworks added a commit to rubensworks/rdf-tests that referenced this issue May 20, 2019

Clarify comparison of literals in SPARQL test suite

46ac922

Closes w3c#58

This was referenced May 20, 2019

Clarify comparison of literals in SPARQL test suite #59

Closed

Compare SPARQL literals by value rubensworks/rdf-test-suite.js#37

Closed

jitsedesmet mentioned this issue Aug 13, 2021

Feature/auto casting function arguments comunica/sparqlee#103

Merged

gkellogg added the SPARQL label Nov 1, 2023
