JENA-2097: Bad URIs are parser warnings #989

Merged May 5, 2021 (2 commits)

afs commented Apr 30, 2021

JENA-2097 shows two things:

  1. Checking of urn:uuid IRIs was broken (4.0.0 added checking of UUIDs).
  2. IRI problems should be warnings, so that parsing does not stop.

Only badly broken IRI syntax should be a parse-stop error, as it was in Jena 3.17.0, especially for N-Triples. That means cases where the closing > cannot be matched to the opening <, or where impossible characters (e.g. a space) are used.
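
The warning/error split described above can be illustrated with a small standalone sketch. This is not the code in this PR and does not use Jena's classes: the class `IriSeveritySketch`, the `Severity` enum, and the exact set of "impossible" characters are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; this is not Jena's IRIx or RIOT code.
class IriSeveritySketch {
    enum Severity { OK, WARNING, ERROR }

    // Assumed shape of a UUID: 8-4-4-4-12 hexadecimal digits.
    private static final Pattern UUID_PATTERN = Pattern.compile(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    // Classify the text that appeared between '<' and '>'.
    static Severity classify(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            char ch = iri.charAt(i);
            // Assumption: spaces, control characters and stray angle brackets
            // are the "impossible" characters that stop the parse.
            if (ch <= ' ' || ch == '<' || ch == '>') {
                return Severity.ERROR;
            }
        }
        String prefix = "urn:uuid:";
        if (iri.regionMatches(true, 0, prefix, 0, prefix.length())) {
            String uuid = iri.substring(prefix.length());
            // A malformed UUID is reported, but parsing can continue.
            if (!UUID_PATTERN.matcher(uuid).matches()) {
                return Severity.WARNING;
            }
        }
        return Severity.OK;
    }

    public static void main(String[] args) {
        System.out.println(classify("http://example/a b"));   // ERROR
        System.out.println(classify("urn:uuid:1234"));        // WARNING
        System.out.println(classify("urn:uuid:6ba7b810-9dad-11d1-80b4-00c04fd430c8")); // OK
    }
}
```

In this sketch only the character check produces `ERROR`; every scheme-specific problem is downgraded to `WARNING`, which mirrors the intent stated above rather than the actual rule set.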

To this end, add a test suite, TestIRIxRIOT, with test cases for all these situations. It checks the number of errors and warnings produced when the new IRIx abstraction is used in RIOT parsing.
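
The idea of asserting on error and warning counts can be sketched as follows. This is a standalone illustration, not the contents of TestIRIxRIOT: `CountingHandlerSketch`, `Counts`, `ParseStop`, and the two placeholder checks are hypothetical names and rules chosen for the example.

```java
// Hypothetical sketch of the testing idea; it does not use Jena's classes.
class CountingHandlerSketch {
    // Thrown to stop processing at the first error.
    static class ParseStop extends RuntimeException {
        ParseStop(String message) { super(message); }
    }

    // Records how many warnings and errors were reported.
    static class Counts {
        int warnings = 0;
        int errors = 0;

        void warning(String message) { warnings++; }

        void error(String message) {
            errors++;
            throw new ParseStop(message);
        }
    }

    // Placeholder checks, assumed for illustration only.
    static Counts check(String... iris) {
        Counts counts = new Counts();
        try {
            for (String iri : iris) {
                if (iri.indexOf(' ') >= 0) {
                    counts.error("Space in IRI: " + iri);
                } else if (iri.startsWith("urn:uuid:") && iri.length() != 45) {
                    // "urn:uuid:" is 9 characters and a UUID is 36.
                    counts.warning("Bad UUID: " + iri);
                }
            }
        } catch (ParseStop stop) {
            // An error ends processing; later input is not examined.
        }
        return counts;
    }

    public static void main(String[] args) {
        Counts counts = check("urn:uuid:1234", "http://example/a b", "urn:uuid:zz");
        System.out.println(counts.warnings + " warning(s), " + counts.errors + " error(s)");
    }
}
```

A test case then reduces to one input plus the expected pair of counts, which makes any change from error to warning visible as a changed number.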

Relative IRIs are now a special case with their own exception, RelativeIRIException. Some formats (N-Triples when not strict) accept them. Turtle always resolves them (and there is always a base) unless specific low-level setup is done, in which case a relative IRI can be a parse-stop error.
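
The three outcomes for a relative IRI (resolved against a base, accepted as-is, or rejected) can be sketched with `java.net.URI`. This is an assumption-laden illustration rather than Jena's behaviour: `RelativeIriSketch`, the `allowRelative` flag, and `RelativeIriProblem` (a local stand-in, not the RelativeIRIException added by this PR) are hypothetical.

```java
import java.net.URI;

// Hypothetical sketch; RelativeIriProblem is a stand-in, not Jena's exception.
class RelativeIriSketch {
    static class RelativeIriProblem extends RuntimeException {
        RelativeIriProblem(String iri) {
            super("Relative IRI: " + iri);
        }
    }

    // base may be null to model a low-level setup with no base IRI.
    static String resolve(String iri, String base, boolean allowRelative) {
        URI uri = URI.create(iri);
        if (uri.isAbsolute()) {
            return uri.toString();
        }
        if (base != null) {
            // Turtle-like case: a base exists, so the IRI is resolved.
            return URI.create(base).resolve(uri).toString();
        }
        if (allowRelative) {
            // Non-strict N-Triples-like case: the relative IRI is kept.
            return iri;
        }
        // No base and relative IRIs not accepted: stop.
        throw new RelativeIriProblem(iri);
    }

    public static void main(String[] args) {
        System.out.println(resolve("a", "http://example/base/", false)); // http://example/base/a
        System.out.println(resolve("a", null, true));                    // a
    }
}
```

Giving the relative case its own exception type lets a caller decide per format whether to accept, resolve, or stop, instead of treating it like any other IRI violation.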

Behaviour in normal use should be the same as in 3.17.0; the only possible change is that some messages go from ERROR to WARN. Behaviour in detailed setup for Turtle now has test cases.

Tidy.
Add assertion in TestDatabaseOps.
afs merged commit 37c3697 into apache:main on May 5, 2021
afs deleted the jena2097-iri branch on May 5, 2021 07:49