RDF/XML and Turtle serializations broken for sub-namespaces involving special characters #262

Open
osma opened this issue Apr 1, 2016 · 1 comment

osma commented Apr 1, 2016

EasyRdf cannot properly serialize data from the EuroVoc thesaurus (see NatLibFi/Skosmos#421 for original issue report).

The problem seems to be that URIs are shortened naively against the registered namespaces, so the resulting shortened URIs (qnames) may contain characters that are invalid in RDF/XML and/or Turtle. EuroVoc has two namespaces: http://eurovoc.europa.eu/ for thesaurus concepts and labels, and http://eurovoc.europa.eu/schema# for schema elements such as custom classes and properties. Note that the latter is a sub-namespace of the former, so any schema URI also matches the shorter namespace.

Code to demonstrate:

$graph = new EasyRdf_Graph();
// Register only the shorter (parent) namespace
EasyRdf_Namespace::set('euro', 'http://eurovoc.europa.eu/');

// The type URI lives in the schema# sub-namespace
$types = array('http://eurovoc.europa.eu/schema#PreferredTerm');
$res = $graph->resource('http://eurovoc.europa.eu/212626', $types);

echo $graph->serialise('rdfxml');

This outputs:

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:euro="http://eurovoc.europa.eu/">

  <euro:schema#PreferredTerm rdf:about="http://eurovoc.europa.eu/212626">
  </euro:schema#PreferredTerm>

</rdf:RDF>

This is invalid XML, as # cannot be used in tag names.

If I change the output format to turtle instead I get this:

@prefix euro: <http://eurovoc.europa.eu/> .

euro:212626 a euro:schema#PreferredTerm .

In my reading of the RDF 1.1 Turtle grammar, # is not allowed unescaped in local names. In older Turtle versions it is certainly not allowed.

The safest fix would be to avoid shortening a URI whenever the result would not be a valid XML QName.
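
To make the suggestion concrete, here is a rough sketch of such a check. It is not EasyRdf code: `isSafeLocalName()` and `shortenIfSafe()` are hypothetical helpers, and the pattern is a deliberately conservative ASCII-only subset of the XML NCName rules (it also rejects local names that start with a digit, which Turtle 1.1 would accept).

```php
<?php
// Hypothetical helper, not part of EasyRdf.
// Returns true if $localName could serve as the local part of an XML QName.
// Assumption: ASCII-only subset of NCName (letter or underscore first,
// then letters, digits, '_', '.', '-'); stricter than the XML spec.
function isSafeLocalName(string $localName): bool
{
    return preg_match('/^[A-Za-z_][A-Za-z0-9_.\-]*$/', $localName) === 1;
}

// Hypothetical helper, not part of EasyRdf.
// $namespaces maps prefix => namespace URI. Returns "prefix:local" only
// when the remainder after the namespace is a safe local name; returns
// null otherwise, so the caller can fall back to the full URI.
function shortenIfSafe(string $uri, array $namespaces): ?string
{
    foreach ($namespaces as $prefix => $ns) {
        if ($ns !== '' && strpos($uri, $ns) === 0) {
            $local = substr($uri, strlen($ns));
            if (isSafeLocalName($local)) {
                return $prefix . ':' . $local;
            }
        }
    }
    return null;
}
```

With only `euro` registered, `shortenIfSafe('http://eurovoc.europa.eu/schema#PreferredTerm', ['euro' => 'http://eurovoc.europa.eu/'])` returns null, because the remainder `schema#PreferredTerm` contains `#`. If the `schema#` namespace is also registered under its own prefix, the same URI shortens cleanly to that prefix plus `PreferredTerm`.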


osma commented Apr 4, 2016

I only now noticed that this has been discussed before, in #115 and PR #248.
Anyway, this bug affects EasyRdf 0.9.1.
