Jena doesn't accept the urn:uuid: UUID prefix #120

Closed
mielliott opened this issue May 21, 2021 · 1 comment

mielliott (Collaborator) commented May 21, 2021

Apache Jena doesn't seem to accept the urn:uuid: prefix on UUIDs. Trying to build an index from a list of N-Quads gave the following error:

$ preston cat --remote http://preston.acis.ufl.edu hash://sha256/8ff46ae6a30bf9647df0294b92434a83784626b3f8c37163db3edefb049daead | tee progress.log | preston grep --no-cache --remote http://preston.acis.ufl.edu 'http://www.botanicalcollections.be/specimen/[a-zA-Z]+[0-9]+V?' | bzip2 > grep-out.nq.bz2
$ tdbloader --loc index/ grep-out.nq.bz2 
13:01:13 INFO  loader          :: -- Start triples data phase
13:01:13 INFO  loader          :: ** Load empty triples table
13:01:13 INFO  loader          :: -- Start quads data phase
13:01:13 INFO  loader          :: ** Load empty quads table
13:01:13 INFO  loader          :: Load: grep-out.nq.bz2 -- 2021/05/21 13:01:13 EDT
13:01:13 ERROR riot            :: [line: 1, col: 1 ] Bad IRI: Not a valid UUID string: urn:uuid:d1ddb004-ff12-491d-9a93-3282a6aba454
org.apache.jena.riot.RiotException: [line: 1, col: 1 ] Bad IRI: Not a valid UUID string: urn:uuid:d1ddb004-ff12-491d-9a93-3282a6aba454

Looks like Jena complained about the first UUID it encountered:

$ bzcat grep-out.nq.bz2 | grep -n "d1ddb004-ff12-491d-9a93-3282a6aba454"
1:<d1ddb004-ff12-491d-9a93-3282a6aba454> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <d1ddb004-ff12-491d-9a93-3282a6aba454> .
2:<d1ddb004-ff12-491d-9a93-3282a6aba454> <http://www.w3.org/ns/prov#wasInformedBy> <a1e86586-c7c4-4f90-b066-4c9a53bf6c7a> <d1ddb004-ff12-491d-9a93-3282a6aba454> .
3:<d1ddb004-ff12-491d-9a93-3282a6aba454> <http://www.w3.org/ns/prov#used> <hash://sha256/030e42b29d292e98ed25f081adf789e28b085a4cfd667fff4d3a34fc554a3030> <d1ddb004-ff12-491d-9a93-3282a6aba454> .
4:<d1ddb004-ff12-491d-9a93-3282a6aba454> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression 'http://www.botanicalcollections.be/specimen/[a-zA-Z]+[0-9]+V?' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <d1ddb004-ff12-491d-9a93-3282a6aba454> .
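
To rule out a malformed identifier, the UUID named in the error message can be checked independently of Jena. The following is a minimal sketch using Python's standard `uuid` module; it is not part of the original report, and the helper name `parse_urn_uuid` is hypothetical.

```python
import uuid

URN_PREFIX = "urn:uuid:"


def parse_urn_uuid(iri: str) -> uuid.UUID:
    """Parse an IRI of the form urn:uuid:<uuid>; raise ValueError otherwise.

    Hypothetical helper for illustration only.
    """
    if not iri.startswith(URN_PREFIX):
        raise ValueError(f"not a urn:uuid: IRI: {iri!r}")
    # uuid.UUID raises ValueError if the remainder is not a well-formed UUID.
    return uuid.UUID(iri[len(URN_PREFIX):])


# The identifier from the tdbloader error message above.
parsed = parse_urn_uuid("urn:uuid:d1ddb004-ff12-491d-9a93-3282a6aba454")
print(parsed, "version", parsed.version)
```

If this parses without raising, the identifier itself is well-formed, which would suggest the problem lies in how the loader handles the prefix rather than in the data.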

After dropping the urn:uuid: prefix via sed 's/urn:uuid://g', the tdbloader command succeeded.
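
For anyone who prefers to script the workaround, here is a rough Python equivalent of the sed step, offered as a sketch rather than a recommended fix. It is deliberately narrower than `sed 's/urn:uuid://g'`: it only strips the prefix where it opens an IRI term (`<urn:uuid:`), so occurrences inside string literals are left alone. The function names and the output filename in the usage comment are hypothetical.

```python
import bz2
import re

# Match the prefix only where it starts an IRI term, i.e. "<urn:uuid:".
URN_UUID_IN_IRI = re.compile(r"<urn:uuid:")


def strip_urn_uuid(line: str) -> str:
    """Drop the urn:uuid: prefix from IRI terms in one N-Quads line."""
    return URN_UUID_IN_IRI.sub("<", line)


def rewrite_nquads(src_path: str, dst_path: str) -> None:
    """Stream a bzip2-compressed N-Quads file, stripping the prefix line by line."""
    with bz2.open(src_path, "rt", encoding="utf-8") as src, \
         bz2.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            dst.write(strip_urn_uuid(line))


# Example call (output filename is an assumption, not from the report):
# rewrite_nquads("grep-out.nq.bz2", "grep-out-stripped.nq.bz2")
```

Note that the rewritten terms are bare UUIDs rather than urn:uuid: IRIs, so this changes the identifiers in the data and should be treated as a lossy workaround.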

Tool versions:

$ tdbloader --version
Jena:       VERSION: 4.0.0
Jena:       BUILD_DATE: 2021-03-27T10:32:16+0000
ARQ:        VERSION: 4.0.0
ARQ:        BUILD_DATE: 2021-03-27T10:32:16+0000
TDB:        VERSION: 4.0.0
TDB:        BUILD_DATE: 2021-03-27T10:32:16+0000
$ preston version
0.2.5
jhpoelen changed the title from "Jena doesn't accept the urn:uud: UUID prefix" to "Jena doesn't accept the urn:uuid: UUID prefix" on Jun 2, 2021
jhpoelen (Member) commented Mar 7, 2024

Jena is no longer used, and a workaround exists.

Oxigraph seems to handle the UUIDs fine.

jhpoelen closed this as completed Mar 7, 2024