turtle parsing of input stream + relative uris #122

Closed
dtm opened this Issue Jun 30, 2015 · 4 comments

Contributor

dtm commented Jun 30, 2015

49% curl http://eprints.soton.ac.uk/375233/7/provenance.ttl | provconvert -infile - -informat ttl -outformat provn -outfile -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 48319  100 48319    0     0   691k      0 --:--:-- --:--:-- --:--:--  693k
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" org.openprovenance.prov.interop.InteropException: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: /starting-points.png [line 44]
    at org.openprovenance.prov.interop.InteropFramework.readDocument(InteropFramework.java:604)
    at org.openprovenance.prov.interop.InteropFramework.readDocument(InteropFramework.java:533)
    at org.openprovenance.prov.interop.InteropFramework.doReadDocument(InteropFramework.java:790)
    at org.openprovenance.prov.interop.InteropFramework.run(InteropFramework.java:841)
    at org.openprovenance.prov.interop.CommandLineArguments.main(CommandLineArguments.java:227)
Caused by: org.openrdf.rio.RDFParseException: Not a valid (absolute) URI: /starting-points.png [line 44]
    at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:622)
    at org.openrdf.rio.turtle.TurtleParser.reportFatalError(TurtleParser.java:1114)
    at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:340)
    at org.openrdf.rio.helpers.RDFParserBase.resolveURI(RDFParserBase.java:327)
    at org.openrdf.rio.turtle.TurtleParser.parseURI(TurtleParser.java:855)
    at org.openrdf.rio.turtle.TurtleParser.parseValue(TurtleParser.java:525)
    at org.openrdf.rio.turtle.TurtleParser.parseObject(TurtleParser.java:413)
    at org.openrdf.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:339)
    at org.openrdf.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:315)
    at org.openrdf.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:301)
    at org.openrdf.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:208)
    at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:186)
    at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:131)
    at org.openprovenance.prov.rdf.Utility.parseRDF(Utility.java:67)
    at org.openprovenance.prov.rdf.Utility.parseRDF(Utility.java:58)
    at org.openprovenance.prov.interop.InteropFramework.readDocument(InteropFramework.java:584)
    ... 4 more
Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) URI: /starting-points.png
    at org.openrdf.model.impl.URIImpl.setURIString(URIImpl.java:68)
    at org.openrdf.model.impl.URIImpl.<init>(URIImpl.java:57)
    at org.openrdf.model.impl.ValueFactoryImpl.createURI(ValueFactoryImpl.java:38)
    at org.openrdf.rio.helpers.RDFParserBase.createURI(RDFParserBase.java:337)
    ... 17 more

But this works fine:

50% curl http://eprints.soton.ac.uk/375233/7/provenance.ttl > provenance.ttl
51% provconvert -infile provenance.ttl -informat ttl -outformat provn -outfile -

Line 44:

<http://openprovenance.org/include#20892220-a071-4ef3-a799-3056447ec8a2-1>schema:contentLocation <starting-points.png> . 

I would guess it's because we don't have a default base URI defined when we're parsing from a stream: the file-based conversion output has pre_88:starting-points.png, where pre_88 starts with file:.
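To illustrate the guess above, here is a minimal sketch using java.net.URI as a stand-in for the parser's resolution step (it does not reproduce the Sesame code in the stack trace). The class and method names are hypothetical, and using the download URL as the base is only an assumption for the sake of the example.

```java
import java.net.URI;

public class RelativeReferenceDemo {

    /** True when the reference carries its own scheme and needs no base URI. */
    static boolean isAbsolute(String reference) {
        return URI.create(reference).isAbsolute();
    }

    /** Resolves a reference against a base URI, as a parser that was given a base could do. */
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String ref = "starting-points.png";

        // Without a base, the reference stays relative and cannot name a resource on its own.
        System.out.println("absolute without base? " + isAbsolute(ref));

        // Assumption: the URL the document was fetched from is used as the base.
        String base = "http://eprints.soton.ac.uk/375233/7/provenance.ttl";
        System.out.println("resolved: " + resolve(base, ref));
    }
}
```

With that assumed base, the reference resolves to http://eprints.soton.ac.uk/375233/7/starting-points.png; when reading from a file, the base would instead be the file's own location, which matches the file: prefix seen in the output.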

Contributor

dtm commented Jun 30, 2015

As a side point, pre_88 in the output is:

prefix pre_88 <file:/home/...>

I think the file URI should be file:///home/....

lucmoreau added the prov-rdf label Jul 8, 2015

Owner

lucmoreau commented Jul 23, 2015

This is a perfectly valid RDF file, but does it allow for PROV interoperability? The other PROV representations don't have this notion of a URI relative to a base URI.

So, how do we handle this? Should we have a base URI parameter for provconvert?

Owner

lucmoreau commented Jul 23, 2015

I have implemented a fix, setting the base URI to file://stdin/.

The example now parses in this specific case. However, as said above, we have not solved the interoperability issue here. Should we explicitly disallow relative URIs?

As for the file URI, it's generated by the java.io library, so there is very little I can do here.
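On the file: form, a small sketch (not ProvToolbox code) that compares the two spellings with java.net.URI. The path /home/me/data.ttl and the class name are made up for illustration. As far as I know, RFC 8089 accepts both the single-slash and the triple-slash form for local files, and the sketch checks that they carry the same path.

```java
import java.io.File;
import java.net.URI;

public class FileUriForms {

    /** Returns the path component of a URI string. */
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    /** True when two file URIs point at the same path, whatever their slash style. */
    static boolean sameLocalPath(String a, String b) {
        return pathOf(a).equals(pathOf(b));
    }

    public static void main(String[] args) {
        String singleSlash = "file:/home/me/data.ttl";   // hypothetical path
        String tripleSlash = "file:///home/me/data.ttl"; // same path, empty authority

        System.out.println(pathOf(singleSlash));
        System.out.println(pathOf(tripleSlash));
        System.out.println("same path? " + sameLocalPath(singleSlash, tripleSlash));

        // What java.io produces for a local file; the exact text is platform dependent.
        System.out.println(new File("/home/me/data.ttl").toURI());
    }
}
```

A consumer that compares prefixes as plain strings would still see the two spellings as different, so the choice of form matters for output stability even if both are acceptable.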

Owner

lucmoreau commented Jul 23, 2015

I added an example file in prov-rdf/src/test/resources/examples/relative-uri.ttl.

If we read the file with

cat prov-rdf/src/test/resources/examples/relative-uri.ttl | provconvert -infile - -informat ttl -outfile - -outformat provn

then we get file://stdin/ as the "base URI".

document
prefix bnode <http://openprovenance.org/provtoolbox/bnode/>
prefix pre_0 <file://stdin/>
prefix ex <http://example.com/>
prefix owl <http://www.w3.org/2002/07/owl#>
prefix rdf <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs <http://www.w3.org/2000/01/rdf-schema#>
activity(ex:experiment,-,-)
entity(ex:inconsistentResult)
wasEndedBy(ex:experiment,ex:inconsistentResult,-,-)
wasEndedBy(ex:experiment,ex:inconsistentResult,-,2011-07-16T01:52:02Z,[prov:location = 'pre_0:scienceLab_003'])
endDocument

If we read the file with

 provconvert -infile prov-rdf/src/test/resources/examples/relative-uri.ttl -outfile - -outformat provn 

then we get the base URI file:/home/me/workspace/ProvToolbox/prov-rdf/src/test/resources/examples/

document
prefix bnode <http://openprovenance.org/provtoolbox/bnode/>
prefix pre_0 <file:/home/me/workspace/ProvToolbox/prov-rdf/src/test/resources/examples/>
prefix ex <http://example.com/>
prefix owl <http://www.w3.org/2002/07/owl#>
prefix rdf <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs <http://www.w3.org/2000/01/rdf-schema#>
activity(ex:experiment,-,-)
entity(ex:inconsistentResult)
wasEndedBy(ex:experiment,ex:inconsistentResult,-,-)
wasEndedBy(ex:experiment,ex:inconsistentResult,-,2011-07-16T01:52:02Z,[prov:location = 'pre_0:scienceLab_003'])
endDocument
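The interoperability concern can be made concrete with a short sketch, again using java.net.URI as a stand-in rather than the actual parser. It resolves the same relative reference against the two bases shown above; the class and method names are hypothetical.

```java
import java.net.URI;

public class BaseUriComparison {

    /** Resolves a relative reference against the given base URI. */
    static URI resolve(String base, String reference) {
        return URI.create(base).resolve(reference);
    }

    public static void main(String[] args) {
        String ref = "scienceLab_003";

        // Base used when reading from standard input.
        URI fromStdin = resolve("file://stdin/", ref);

        // Base derived from the location of the input file.
        URI fromFile = resolve(
            "file:/home/me/workspace/ProvToolbox/prov-rdf/src/test/resources/examples/", ref);

        System.out.println(fromStdin);
        System.out.println(fromFile);

        // The same relative reference ends up as two different identifiers.
        System.out.println("same identifier? " + fromStdin.equals(fromFile));

        // In file://stdin/ the word "stdin" sits in the authority position.
        System.out.println("host of stdin base: " + fromStdin.getHost());
    }
}
```

The two runs therefore assign different absolute identifiers to the same location value, which is why a single explicit base URI parameter, or rejecting relative URIs, would give more predictable results across formats.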

lucmoreau closed this Jul 23, 2015
