Why does a query that works in Jena 3.16 throw an error in Jena 4.10? #2102
Comments
Jena 4.x more strictly enforces certain tests around URI validity because allowing bad URIs into the system always leads to problems down the road. Since this is a URI, it should be URL encoded appropriately (i.e. Java/SPARQL backslash escapes are not suitable here), and that needs to happen in both your data and your queries. In general, the way your URIs are structured looks strange. You seem to be trying to put a lot of "structure" into the URI fragment (the portion after the #). Your data should probably have URIs more like
Indeed, the IRI validation has changed for SPARQL queries since 3.16. You have to get your data into the correct format. If this is not possible, you have to URL encode the path component.
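A minimal sketch of the percent-encoding suggested above, in Python using only the standard library. In the IRI from the error message the offending `[` and `]` sit in the fragment, so this sketch encodes that part. The helper name `encode_fragment` and the constant `FRAGMENT_SAFE` are hypothetical, not part of Jena. The encoded IRI is a different IRI from the raw one, so the data and every query have to be changed together.

```python
from urllib.parse import quote

# Characters RFC 3986 allows unescaped in a fragment, beyond the letters,
# digits and "-._~" that quote() never escapes. "%" is kept so that
# existing %XX escapes are not encoded a second time.
FRAGMENT_SAFE = "!$&'()*+,;=:@/?%"


def encode_fragment(iri: str) -> str:
    """Percent-encode characters that are illegal in the fragment of `iri`."""
    base, sep, fragment = iri.partition("#")
    return base + sep + quote(fragment, safe=FRAGMENT_SAFE)


raw = ("https://brickschema.org/schema/1.0.2/building_example"
       "#building:gtc/vavs/2/port[zn]")
print(encode_fragment(raw))
# https://brickschema.org/schema/1.0.2/building_example#building:gtc/vavs/2/port%5Bzn%5D
```

Running the helper twice gives the same result, because `%` is treated as safe and existing escapes are left alone.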
It's a bug IMO. The Turtle parser accepts the data with a warning. There ought to be consistency between the SPARQL parser and the Turtle parser. That said, the data is not legal. No amount of escaping will change that. %-encoding as @rvesse mentions is replacing the character Not using Even if Jena allowed it, the current data is likely to cause problems elsewhere. The SPARQL query can work with the string form of the URI in the data in a potentially inefficient manner:
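The query from the original comment was lost from this page, so the following is only a sketch of the general idea: compare `STR(?s)` against a string literal instead of writing the illegal IRI between angle brackets. It is shown as a small Python helper that builds the query text. The function name `string_match_query` and the `?s ?p ?o` pattern are assumptions for illustration.

```python
def string_match_query(iri: str) -> str:
    """Build a SPARQL query that finds a subject by the string form of its IRI."""
    # Escape backslashes and double quotes so the IRI is a valid
    # SPARQL string literal.
    literal = iri.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        "  ?s ?p ?o .\n"
        f'  FILTER(STR(?s) = "{literal}")\n'
        "}"
    )


print(string_match_query(
    "https://brickschema.org/schema/1.0.2/building_example"
    "#building:gtc/vavs/2/port[zn]"
))
```

The filter is evaluated against every candidate `?s`, which is why this form can be slow on large datasets: the engine cannot look the subject up directly.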
Side note: Parsing in SPARQL and parsing in Turtle are significantly different in the way dubious (error or warning) IRIs are treated. The W3C specs define the IRI token as
and then expect further checking for the legality of the string that matches that rule. Jena's SPARQL parser, ARQ, uses that rule (via javacc) then performs IRI validation. Jena's Turtle parser uses a custom tokenizer and does more limited checking on the characters between < and >. Any IRI validation has to parse the string, so it duplicates the character exclusion rules of the token rule. This is all known and intended by the W3C working groups: both specs intentionally did not include the full RFC 3986/3987 grammar. It is quite large and it would have to be modified for UCHAR. An effect is that
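To illustrate the two stages described above, here is a rough Python sketch. The `IRIREF` pattern follows the SPARQL 1.1 grammar as I understand it (Turtle's version additionally allows UCHAR escapes), and `FRAGMENT` is an ASCII-only approximation of the RFC 3986 fragment rule. The function names are hypothetical, and none of this is Jena's actual code.

```python
import re

# Token-level rule: '<', then any characters except < > " { } | ^ ` \
# and U+0000..U+0020, then '>'.
IRIREF = re.compile(r'<[^<>"{}|^`\\\x00-\x20]*>')

# ASCII-only approximation of the RFC 3986 fragment rule:
# unreserved / sub-delims / ":" / "@" / "/" / "?" / pct-encoded.
FRAGMENT = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*")


def is_iri_token(text: str) -> bool:
    """Stage 1: does `text` match the grammar's IRI token?"""
    return IRIREF.fullmatch(text) is not None


def has_legal_fragment(iri: str) -> bool:
    """Stage 2 (partial): is the part after '#' a legal fragment?"""
    _, _, fragment = iri.partition("#")
    return FRAGMENT.fullmatch(fragment) is not None


iri = ("https://brickschema.org/schema/1.0.2/building_example"
       "#building:gtc/vavs/2/port[zn]")
print(is_iri_token("<" + iri + ">"))  # True: '[' and ']' are not excluded by the token rule
print(has_legal_fragment(iri))        # False: '[' and ']' are not allowed in a fragment
```

This is why the IRI passes tokenizing in both parsers and only fails once a validation step is applied to the matched string.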
Version
4.10
Question
This query works in Jena 3.16
But it throws an error in Jena 4.10:
[line: 5, col: 31] Bad IRI: 'https://brickschema.org/schema/1.0.2/building_example#building:gtc/vavs/2/port[zn]': <https://brickschema.org/schema/1.0.2/building_example#building:gtc/vavs/2/port[zn]> Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
I know the problem is '[zn]'.
I tried to escape '[zn]', but that also throws an error.
Here is the fragment of the TTL data
How can I fix it?