Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests to confirm detection of various integer types #122

Merged
merged 7 commits into from Sep 16, 2022

Conversation

ajnelson-nist
Copy link
Contributor

This patch series will add tests to confirm behaviors of XSD integer types aside from xsd:integer. It stems from trying to test a sample with a literal "0"^^xsd:positiveInteger and not receiving an error.

It is not entirely clear to me if corrections from this patch series should be in pySHACL or upstream in RDFLib.

This list of integer types was drawn from OWL 2.

At the time of this test's initial implementation, only the lexical
error in `"zero"^^xsd:integer` was detected.

References:
* The general style of this test was adapted from
  `/test/test_extra.py`.

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist
Copy link
Contributor Author

@ashleysommer , I don't intend for this to block your plans to do a release soon.

@ajnelson-nist
Copy link
Contributor Author

@ashleysommer , in light of v0.19.0 being a 1.0.0 Release Candidate, I suggest this PR get a closer look due to missing some validation issues.

I also just caught a few copy-paste errors, will correct in a moment with a follow-on patch.

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
The third test in test_integers.py now constructs Literal objects for a
strict set-equivalence of expected failing path-value pairs.  However,
it's not clear whether these Literals should have been constructible
with rdflib, so this patch might revert significantly.

References:
* RDFLib/rdflib#1757

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist
Copy link
Contributor Author

It looks like discussion of how to resolve this PR should be anchored upstream, on RDFLib Issue 1757.

@ajnelson-nist
Copy link
Contributor Author

The upstream issue is now closed, but an API was introduced to expose the not-well-formed literals. The present PR should probably wait for the next release of RDFLib.

@ajnelson-nist
Copy link
Contributor Author

@ashleysommer , I stumbled across a data XPASS today that reminded me about this open issue.

It looks like there is still a pause on this while waiting for the next rdflib release, which hopefully incorporates PR 2003. However, I'd like to be ready for the future, and have a typing question for you.

I'm trying to define a shape on a property that has rdfs:range xsd:positiveInteger. Which would be the correct syntax for this?

What I have now is the following, which doesn't throw a validation error on passing in "0"^^xsd:integer:

ex:my-shape
  a sh:PropertyShape ;
  sh:datatype xsd:positiveInteger ;
  sh:nodeKind sh:Literal ;
  sh:path ex:my-property ;
  sh:targetSubjectsOf ex:my-property ;
  .

I figure I can get the important bit about needing to be >=1 across with sh:minInclusive, while waiting for the datatype-based enforcement. But, what is the syntax that should be used?

ex:my-shape sh:minInclusive "1"^^xsd:positiveInteger .

Or:

ex:my-shape sh:minInclusive "1"^^xsd:integer .

@ashleysommer
Copy link
Collaborator

ashleysommer commented Sep 8, 2022

Hi @ajnelson-nist

Sorry, I forgot about this PR, and I didn't see your typing question above.

You're right, to emulate rdfs:positiveInteger in SHACL on older RDFLib versions, you will need
sh:datatype xsd:positiveInteger ; sh:minInclusive "1"^^xsd:integer ;

Consider, in Turtle, numerals are implicitly xsd:integer so for minInclusive you can simply use sh:minInclusive 1;

HOWEVER, RDFlib v6.2.0 has been out for a few weeks, and has the new ill-typed literals feature. I have just released a new version of PySHACL (v0.20.0) that takes advantage of this feature when evaluating sh:datatype that allows PySHACL to leverage RDFLib's more comprehensive data type checking (sorry, I forgot you had this PR open to do the same).

Note also, RDFLib v6.2.0 has added a new check for xsd:positiveInteger among others, so I believe if you upgrade your RDFLib to v6.2.0 and PySHACL to v0.20.0, your issue with type checking should now be resolved.

In regard to the release candidate, that was a little premature. v0.19.x is no longer a release candidate, some big changes were implemented with the move to RDFLib v6.2.0, so the new PySHACL v0.20.0 is the new release candidate.

With this, `test_integers.py` passes `mypy --strict`.

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist
Copy link
Contributor Author

Hi @ashleysommer ,

Apologies for the delayed reply, I was on annual leave.

Thanks for integrating the ill-typed literals feature. Unfortunately, I'm not quite sure how to use it. I would still like this PR to be eventually be merged because the unit test will serve as a demonstration of triggering and detecting the errors. I think the test still needs a little more development to take care of what I'm not clear on.

I believe something between pyshacl and rdflib should raise an error for "-1"^^xsd:positiveInteger, and the other more numeric samples in the unit test. It's not clear to me if this is supposed to cause an effect in the SHACL validation results graph. I would like it to, but I don't know if the SHACL specification covers datatype constraints and ill-typed literals, so I don't know if it's a bend of the SHACL specification to have "-1"^^xsd:positiveInteger fail a sh:datatype xsd:positiveInteger constraint. I'd personally prefer this, because I think that's at least in the spirit of SHACL, and possibly in scope of SHACL by vaguely pointing at how sh:pattern gives us regexes.

test/test_integers.py Outdated Show resolved Hide resolved
fix sh.path to sh.resultPath in integers test
@ashleysommer
Copy link
Collaborator

ashleysommer commented Sep 16, 2022

Hi @ajnelson-nist

Unfortunately, I'm not quite sure how to use it
I believe something between pyshacl and rdflib should raise an error for "-1"^^xsd:positiveInteger,

I'm not sure I'm following. What kind of error are you expecting?

As-of PySHACL v0.20.0 (and RDFLib v6.2.0) the units tests you contributed now pass. PySHACL now detects the ill-typed literals when executing the sh.datatype constraint, and that constraint emits a validation result as expected.

Note, you'll notice my commit in this thread above, I needed to make one tiny correction to the third test in the file, to enable it to search the validation graph for the correct sh:resultPath values.

I don't know if it's a bend of the SHACL specification to have "-1"^^xsd:positiveInteger fail a sh:datatype xsd:positiveInteger constraint

It is in the textual definition of the sh:datatype specification in the SHACL Spec document

"A literal matches a datatype if the literal's datatype has the same IRI and, for the datatypes supported by SPARQL 1.1, is not an ill-typed literal."

Before RDFLIb v6.2.0 was released, PySHACL had no way to determine if a Literal was ill-typed (beyond some crude string, int and float checks). That is why it these tests were not able to pass before now. But since RDFLib v6.2.0 was released, PySHACL can now take advantage of the new ill-typed Literal test and incorporates that check into the sh:datatype constraint validation algorithm.

Thank you for your contribution of this unit test.

@ashleysommer ashleysommer merged commit 6437830 into RDFLib:master Sep 16, 2022
@ajnelson-nist ajnelson-nist deleted the test_integer_types branch September 16, 2022 13:16
@ajnelson-nist
Copy link
Contributor Author

Huh. I'm not sure why that test was still failing on my laptop yesterday, but it does indeed pass on my laptop now. Apologies for the confusion. Thank you for working through this with me, and for the spec. citation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants