RDFXML: possible to load incorrect XML #2620
Comments
There are a lot of warnings, but it parses for me using
Would it be possible to have more concise examples of the problems? A short example should be enough.
What about this?
String data = """
<?xml version="1.0"?>
<Ontology xmlns="http://www.w3.org/2002/07/owl#"
xml:base="http://www.w3.org/2002/07/owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<Prefix name="owl" IRI="http://www.w3.org/2002/07/owl#"/>
<Prefix name="rdf" IRI="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
<Prefix name="xml" IRI="http://www.w3.org/XML/1998/namespace"/>
<Prefix name="xsd" IRI="http://www.w3.org/2001/XMLSchema#"/>
<Prefix name="rdfs" IRI="http://www.w3.org/2000/01/rdf-schema#"/>
<Declaration>
<Class IRI="http://x#X"/>
</Declaration>
</Ontology>
""";
// Parse the OWL/XML text as RDF/XML and print whatever graph results as Turtle.
ModelFactory.createDefaultModel()
        .read(new StringReader(data), null, "rdf/xml")
        .setNsPrefixes(PrefixMapping.Standard)
        .write(System.out, "ttl");
output
To create such a document, OWLAPI or owlcs/ONTAPI can be used:
OWLOntologyManager m = OntManagers.createOWLAPIImplManager();
OWLOntology ont = m.createOntology();
// Declare the class <http://x#X>, then serialize the ontology as OWL/XML.
ont.add(m.getOWLDataFactory().getOWLDeclarationAxiom(m.getOWLDataFactory().getOWLClass(IRI.create("http://x#X"))));
ont.saveOntology(new OWLXMLDocumentFormat(), System.out);
I can't make Jena4 cause an error when parsing; it does throw an exception when writing the model, due to a relative IRI for a property. Jena4 parsing issues warnings, as does Jena5. The output differs between Jena4 and Jena5.
<?xml version="1.0"?>
<Ontology xmlns:p="http://example/ns#">
  <Prefix p:name="rdf"/>
</Ontology>
with
In Jena4, ARP issues a warning and outputs a relative URI property. It also generates a relative URI for
In Jena5, RRX issues a warning and skips the triple. It resolves the
This is by design. While writing RRX, I checked all the cases ARP supports. The RDF 1.0 WG decided that bare attributes were not legal: they had been in the earlier design phases, but that never made it into a spec. So RRX issues a warning and skips the triple; it is not a hard error, because of the legacy with ARP. Relative URIs will get into trouble later! I'd be happy for that to become an error, not a warning.
The fact that the doc parses as RDF/XML at all is because the root qname does not have to be
ARP (0 - the original; 1 - more integrated into RIOT error handling) is available in Jena5, at the moment, as lang name "arp0", "arp1" or as a file extension, and deprecated constant
Expect ARP0 to go away soon. ARP1 was the RDF/XML parser from 4.7.0 to 4.10.0,
As for distinguishing RDF/XML from OWL/XML, it is hard or impossible in the most general case. It would be better to read the input, snoop for the top-level element, decide, and then reparse (all after checking the MIME type and file extension, if available). File extension
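The "snoop for the top-level element" idea can be sketched with the JDK's StAX reader. This is only an illustration, not anything Jena or ONT-API ships: the class `RootElementSniffer`, the `Guess` enum and the rule itself are hypothetical. The rule assumed here is that a root `Ontology` element in the OWL namespace with no `rdf:*` attributes is probably OWL/XML, while an `rdf:RDF` root is RDF/XML. Because RDF/XML does not require an `rdf:RDF` root, anything else is reported as unknown and would still need the MIME type, file extension, or a trial parse.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: guesses the XML dialect from the root element only. */
final class RootElementSniffer {
    enum Guess { LIKELY_OWL_XML, RDF_XML, UNKNOWN }

    static final String OWL_NS = "http://www.w3.org/2002/07/owl#";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    static Guess guess(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities while sniffing.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // skip comments, processing instructions, whitespace
                }
                // First start element is the root; decide and stop reading.
                String ns = reader.getNamespaceURI();
                String local = reader.getLocalName();
                if (RDF_NS.equals(ns) && "RDF".equals(local)) {
                    return Guess.RDF_XML;
                }
                if (OWL_NS.equals(ns) && "Ontology".equals(local) && !hasRdfAttribute(reader)) {
                    return Guess.LIKELY_OWL_XML;
                }
                return Guess.UNKNOWN;
            }
            return Guess.UNKNOWN;
        } finally {
            reader.close();
        }
    }

    /** True if the current element carries an attribute in the RDF namespace (e.g. rdf:about). */
    private static boolean hasRdfAttribute(XMLStreamReader reader) {
        for (int i = 0; i < reader.getAttributeCount(); i++) {
            if (RDF_NS.equals(reader.getAttributeNamespace(i))) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would run `RootElementSniffer.guess(text)` before choosing a parser and skip the RDF/XML attempt on `LIKELY_OWL_XML`. The heuristic can still be wrong: an RDF/XML document whose root is a blank-node `owl:Ontology` typed node would be classified as OWL/XML.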
It seems we need to use the original ontology document: https://github.com/owlcs/ont-api/blob/4.x.x/src/test/resources/owlapi/owlxml_anonloop.owx
What's the error message, and is there a small extract that picks up the feature in error?
The error occurs because of two objects:
Only the SAX-based RRX parser is affected; the two StAX-based ones, and ARP, detect this mistake. Parsing OWL/XML (".owx") as RDF/XML and expecting an error isn't guaranteed to work: some simple documents will pass.
…to ignore OWLXML (workaround for apache/jena#2620)
Version
5.1.0
What happened?
This time I expect failure (i.e. the behavior of Jena 4.x).
The document is OWL/XML, not RDF/XML.
This seems to be important: owlcs/ONTAPI has a loading mechanism that iterates over formats (both Jena and OWLAPI).
If the document is parsed as RDF/XML successfully, then the mechanism stops and returns the resulting Graph. In this case it contains rubbish.
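To illustrate why a "first format that parses wins" loop is fragile here, the following is a minimal sketch of such a loop with one possible mitigation: an attempt is accepted only if it neither throws nor reports warnings. Everything in it is hypothetical (the `Parser` interface, the `FirstCleanParse` class and the warning sink) and it is not how ONT-API or Jena actually implement loading. It also has a cost, since legitimate RDF/XML that merely triggers warnings would be rejected as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical format-probing loop: tries parsers in order, keeps the first clean result. */
final class FirstCleanParse {

    /** Assumed parser contract: returns a result and reports warnings to the given sink. */
    interface Parser<T> {
        T parse(String input, List<String> warnings) throws Exception;
    }

    /**
     * Iterates over the parsers in map order and returns the format name and result of the
     * first one that completes without an exception and without warnings.
     */
    static <T> Optional<Map.Entry<String, T>> load(String input, Map<String, Parser<T>> parsers) {
        for (Map.Entry<String, Parser<T>> candidate : parsers.entrySet()) {
            List<String> warnings = new ArrayList<>();
            try {
                T result = candidate.getValue().parse(input, warnings);
                if (warnings.isEmpty()) {
                    return Optional.of(Map.entry(candidate.getKey(), result));
                }
                // Parsed "successfully" but with warnings: treat as a mismatch, try the next format.
            } catch (Exception e) {
                // Hard failure: this format does not match, try the next one.
            }
        }
        return Optional.empty();
    }
}
```

With an insertion-ordered map such as `LinkedHashMap`, an RDF/XML attempt that only warns about skipped triples would no longer end the iteration, so a later OWL/XML parser still gets its turn.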
Relevant output and stacktrace
Are you interested in making a pull request?
None