Hextuples Serializer #1489

Merged
merged 26 commits into from Dec 9, 2021

nicholascar
Member

This HexTuples Serializer partly addresses Issue #1437. No parser has been written yet.

@nicholascar
Member Author

@joepio we are some of the way there!

@nicholascar
Member Author

@joepio: now we have a parser as well! It's as simple as I could make it, so it just loops through strings or files. No remote URI fetching of content yet.

@joepio: please can you at least verify that the following test data is valid Hextuples:

        ["http://example.com/s01", "http://example.com/a", "http://example.com/Type1", "", "", ""]
        ["http://example.com/s01", "http://example.com/label", "This is a Label", "http://www.w3.org/2001/XMLSchema#string", "en", ""]
        ["http://example.com/s01", "http://example.com/comment", "This is a comment", "http://www.w3.org/2001/XMLSchema#string", "", ""]
        ["http://example.com/s01", "http://example.com/creationDate", "2021-12-01", "http://www.w3.org/2001/XMLSchema#date", "", ""]
        ["http://example.com/s01", "http://example.com/creationTime", "2021-12-01T12:13", "http://www.w3.org/2001/XMLSchema#dateTime", "", ""]
        ["http://example.com/s01", "http://example.com/age", 42, "http://www.w3.org/2001/XMLSchema#integer", "", ""]
        ["http://example.com/s01", "http://example.com/trueFalse", false, "http://www.w3.org/2001/XMLSchema#boolean", "", ""]
        ["http://example.com/s01", "http://example.com/op1", "http://example.com/o1", "", "", ""]
        ["http://example.com/s01", "http://example.com/op1", "http://example.com/o2", "", "", ""]
        ["http://example.com/s01", "http://example.com/op2", "http://example.com/o3", "", "", ""]

Since the test comparisons are made against this data, if it is correct, we should be fine!

My only query was whether a datatype of XSD string must be indicated if a language is present. For completeness, I think it should be, hence the "label" line in the data above.

@nicholascar
Member Author

OK, so round-tripping is failing and I haven't yet worked out why. It is likely due to the Hextuples parser putting in bad default graph values.

@rescribet

@nicholascar No that's not valid.

  • All elements are always a string, not json literal true or 42.
  • The individuals encoding the meaning of a named or blank node are missing in the datatype column
  • rdf:langString is the correct datatype if a language tag is present, as per the rdf spec
  • Empty value in the graph position means these statements should end up in the default graph

"No remote URI fetching of content yet." Not sure what you mean by this, it is a serialization format.
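The review points above can be turned into a small line checker. This is a minimal sketch in plain Python, not rdflib code: the function name `check_hextuple` is hypothetical, and the rule that the datatype column must never be empty is an assumption read from the second bullet rather than a quotation from the HexTuples specification.

```python
import json

RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


def check_hextuple(line):
    """Return a list of problems found in one HexTuples line (empty if none)."""
    row = json.loads(line)
    if not isinstance(row, list) or len(row) != 6:
        return ["row must be a JSON array of six elements"]

    problems = []
    # Every element must be a JSON string, never a bare number or boolean.
    for position, element in enumerate(row):
        if not isinstance(element, str):
            problems.append(f"element {position} is not a string: {element!r}")

    datatype, language = row[3], row[4]
    # Assumption: the datatype column is always filled, for nodes as well as literals.
    if datatype == "":
        problems.append("datatype column is empty")
    # A language tag requires the rdf:langString datatype.
    if language != "" and datatype != RDF_LANGSTRING:
        problems.append("language tag present but datatype is not rdf:langString")
    return problems
```

Run against the test data earlier in the thread, the "age" row is flagged for its bare `42`, the "label" row for pairing a language tag with xsd:string, and the "a" row for its empty datatype column.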

@nicholascar
Member Author

I have worked out why most of the failing round-trip tests fail: my current serializer adds xsd:string to literals with no declared type. I think this is an rdflib implementation issue because, in RDF in general, a literal with no declared type is interpreted as a string (https://www.w3.org/TR/turtle/#turtle-literals), so the graphs should be isomorphic whether or not the xsd:string declaration is present. This will be partly solved by addressing the comments below.
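One way to picture the equivalence described here is to normalise literals before comparing them. The sketch below is a plain-Python model using tuples, not rdflib's `Literal` class or its comparison logic; `normalise_literal` is a hypothetical helper introduced only for illustration.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


def normalise_literal(lexical, datatype=None, language=None):
    """Return a (lexical, datatype, language) tuple with the implicit datatype filled in."""
    if language:
        # Language-tagged strings always carry rdf:langString.
        return (lexical, RDF_LANGSTRING, language)
    # A literal with no declared datatype is treated as an xsd:string.
    return (lexical, datatype or XSD_STRING, None)


# The untyped and explicitly typed forms compare equal after normalisation.
untyped = normalise_literal("This is a comment")
typed = normalise_literal("This is a comment", XSD_STRING)
same = untyped == typed
```

Comparing normalised tuples on both sides of a round trip would make the added xsd:string declaration invisible to the test, which is the behaviour the comment argues for.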

Some other tests are failing for a reason related to HexT not being formula-aware, but I can't work it out yet. Basically, HexT doesn't have the expressive power for Quoted Graphs in the Subject position (I think). So, for now, I will just get HexT round-tripping to pass for all N-Triples examples that are not formula-aware and ignore the N3 tests, which are formula-aware.

@rescribet I think I can make most of those changes:

  • All elements are always a string, not json literal true or 42.

    • easy
  • The individuals encoding the meaning of a named or blank node are missing in the datatype column

    • easy
  • rdf:langString is the correct datatype if a language tag is present, as per the rdf spec

    • easy
  • Empty value in the graph position means these statements should end up in the default graph

    • yes, this is what my parser is doing, but the serializer is, so far, putting Blank Node IDs in the graph position for default graphs. I'll replace this with ""
  • "No remote URI fetching of content yet." Not sure what you mean by this, it is a serialization format.

    • the rdflib parse() method can parse strings or files or content at the end of a URI (online). I just haven't implemented that "end of a URI" version yet.
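The default-graph fix mentioned in the list above amounts to a small mapping on the serializer side. This is a sketch under stated assumptions: `graph_column` is a hypothetical helper, and graph identifiers are modelled as plain strings rather than rdflib terms.

```python
def graph_column(graph_identifier, default_graph_identifier):
    """Return the value for the sixth (graph) column of a hextuple."""
    # Statements in the default graph get an empty string, not the
    # default graph's internal (blank node) identifier.
    if graph_identifier is None or graph_identifier == default_graph_identifier:
        return ""
    return graph_identifier
```

A parser that treats an empty graph column as the default graph would then round-trip these rows without inventing a new graph name.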

@rescribet

rescribet commented Dec 6, 2021

Cool. I've just extracted our parser code into a modified version of our n-quads-parser package. It also has an rdflib-compatible class, but that was rdflib-compatible at the time the n-quads parser was written; I have no idea whether it is still useful today.

At least it should be useful if people outside of rdflib want to add this capability to their JS/TS project: https://github.com/ontola/hextuples-parser

@rescribet

@nicholascar A small but significant change: the individuals that were in the rdf namespace have been replaced with just globalId or localId. So:

["http://example.com/s01", "http://example.com/a", "http://example.com/Type1", "globalId", "", ""]
["http://example.com/s01", "http://example.com/b", "b432", "localId", "", ""]
["http://example.com/s01", "http://example.com/age", "42", "http://www.w3.org/2001/XMLSchema#integer", "", ""]
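A reader could interpret the revised rows roughly as follows. This is a minimal sketch using only the standard library, not the parser merged in this PR; `classify_object` and `parse_hextuples` are hypothetical names, and the tuple shapes they return are an illustrative choice.

```python
import json


def classify_object(row):
    """Decide what the value column means from the datatype column."""
    value, datatype, language = row[2], row[3], row[4]
    if datatype == "globalId":
        return ("iri", value)
    if datatype == "localId":
        return ("bnode", value)
    # Anything else is a literal whose datatype is given as an IRI.
    return ("literal", value, datatype, language)


def parse_hextuples(text):
    """Yield (subject, predicate, object, graph) for each non-empty line."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        yield row[0], row[1], classify_object(row), row[5]


sample = """
["http://example.com/s01", "http://example.com/a", "http://example.com/Type1", "globalId", "", ""]
["http://example.com/s01", "http://example.com/b", "b432", "localId", "", ""]
["http://example.com/s01", "http://example.com/age", "42", "http://www.w3.org/2001/XMLSchema#integer", "", ""]
"""
statements = list(parse_hextuples(sample))
```

Because every column is a string, the datatype column alone tells the reader whether the value is an IRI, a blank node label, or a literal's lexical form.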

@nicholascar nicholascar merged commit 327de4e into master Dec 9, 2021
@nicholascar nicholascar deleted the hextuples branch December 9, 2021 09:59
4 participants