RDF to JSON should use @annotation, where possible #11

gkellogg · 2021-01-30T20:57:54Z

The existing algorithm only uses embedded nodes.

pchampin · 2021-02-03T18:44:15Z

I see two things that can make this tricky:

- the annotating triple may be encountered before the asserted annotated triple, so this would need to be done in a second phase
- annotations themselves can be nested.

Here's an idea:

- first generate the node maps normally (using embedded nodes)
- for each node map (corresponding to a named graph or the default graph):
  - collect all subjects that are embedded nodes into annotation_candidates, sorted by decreasing "depth"
  - for each embedded node en in annotation_candidates
    - search the node map for the corresponding asserted triple
    - if found, move the en entry of the node map onto the object of that triple as an @annotation (without its @id)

Example: we start with:

{
    "<< << ex:a ex:b ex:c >> ex:d ex:e >>": {
        "@id": { "@id": { "@id": "ex:a", "ex:b": {"@id": "ex:c"} }, "ex:d": {"@id": "ex:e"} },
        "ex:f": [ { "@id": "ex:g" } ]
    },
    "<< ex:a ex:b ex:c >>": {
        "@id": { "@id": "ex:a", "ex:b": {"@id": "ex:c"} },
        "ex:d": [ {"@id": "ex:e"} ]
    },
    "ex:a": {
        "ex:b": {"@id": "ex:c"}
    }
}

We find that the first entry has a match in the second entry, so we move it there as an annotation:

{
    "<< ex:a ex:b ex:c >>": {
        "@id": { "@id": "ex:a", "ex:b": {"@id": "ex:c"} },
        "ex:d": [ {"@id": "ex:e",
            "@annotation": { "ex:f": [ { "@id": "ex:g" } ] }
        } ]
    },
    "ex:a": {
        "ex:b": {"@id": "ex:c"}
    }
}

and then we find that the (originally) second entry has a match in the (originally) third entry, so we move it there as an annotation:

{
    "ex:a": {
        "ex:b": {"@id": "ex:c",
            "@annotation": {
                "ex:d": [ {"@id": "ex:e",
                    "@annotation": { "ex:f": [ { "@id": "ex:g" } ] }
                } ]
            }
        }
    }
}
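
The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions, not the algorithm as implemented anywhere: the helper names (`split_triple`, `key_of`, `depth`, `fold_annotations`) are made up for this sketch, node-map keys are assumed to follow the `<< s p o >>` form used in the example, and every embedded triple is assumed to have exactly one predicate besides `@id`.

```python
import json


def split_triple(triple):
    """Split an embedded-triple @id into (subject, predicate, object)."""
    pred, obj = next((p, o) for p, o in triple.items() if p != "@id")
    return triple["@id"], pred, obj


def key_of(term):
    """Hypothetical node-map key: the IRI itself, or '<< s p o >>' for a triple."""
    if isinstance(term, str):
        return term
    subj, pred, obj = split_triple(term)
    obj_key = key_of(obj["@id"]) if "@id" in obj else json.dumps(obj, sort_keys=True)
    return f"<< {key_of(subj)} {pred} {obj_key} >>"


def depth(term):
    """Nesting depth: 0 for an IRI, 1 for a plain triple, and so on."""
    if isinstance(term, str):
        return 0
    subj, _, obj = split_triple(term)
    return 1 + max(depth(subj), depth(obj.get("@id", "")))


def fold_annotations(node_map):
    """Second phase: turn embedded-node entries into @annotation where possible."""
    candidates = [k for k, n in node_map.items() if isinstance(n.get("@id"), dict)]
    # Deepest embedded nodes first, so nested annotations are folded inside out.
    candidates.sort(key=lambda k: depth(node_map[k]["@id"]), reverse=True)
    for key in candidates:
        node = node_map[key]
        subj, pred, obj = split_triple(node["@id"])
        asserted = node_map.get(key_of(subj), {}).get(pred, [])
        for value in asserted if isinstance(asserted, list) else [asserted]:
            if not isinstance(value, dict):
                continue
            plain = {k: v for k, v in value.items() if k != "@annotation"}
            if plain == obj:
                # Move the entry's properties (without its @id) onto the object.
                value["@annotation"] = {p: v for p, v in node.items() if p != "@id"}
                del node_map[key]
                break
    return node_map


# The starting node map from the example above.
node_map = {
    "<< << ex:a ex:b ex:c >> ex:d ex:e >>": {
        "@id": {"@id": {"@id": "ex:a", "ex:b": {"@id": "ex:c"}}, "ex:d": {"@id": "ex:e"}},
        "ex:f": [{"@id": "ex:g"}],
    },
    "<< ex:a ex:b ex:c >>": {
        "@id": {"@id": "ex:a", "ex:b": {"@id": "ex:c"}},
        "ex:d": [{"@id": "ex:e"}],
    },
    "ex:a": {"ex:b": {"@id": "ex:c"}},
}
print(json.dumps(fold_annotations(node_map), indent=2))
```

Embedded nodes whose triple is not asserted in the same node map are left untouched, so they would still be serialized as embedded nodes.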

gkellogg · 2021-02-03T21:25:01Z

That's essentially the same method I had been thinking about, although I think we can simply reverse sort by length to get the same effect. Of course, there may be some tricky corner cases, such as a chain where annotations are used at the beginning and the end but not in the middle.

Thanks for thinking this through!

pchampin · 2021-02-04T11:21:42Z

If by "length" you mean the string length of the key, I find this too brittle. A simple triple with a very long term may be longer than a deeply nested triple with short terms.
If by "length" you mean the number of atomic terms, then yes, it might be a good proxy for depth, but is that much easier to check?

Idea: could we encode the depth of a triple in their key in the node map? That way, the sorting would be much easier ;-)

gkellogg · 2021-02-04T17:27:54Z

My reasoning is that the goal is to look for more deeply embedded triples before more shallowly, and that terms that might be chosen before deeper embedded terms would not interfere. It's possible that subject- vs object- embedding could be fooled, though. Did you have some examples of where this would go wrong?

> Idea: could we encode the depth of a triple in their key in the node map? That way, the sorting would be much easier ;-)

Yes, we could come up with an algorithm for creating the key, which would get back a measure of the depth of embedding in addition to the canonicalized form, and that would probably work okay.
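
As a rough illustration of such a key-creation routine, here is a minimal Python sketch. The function name `key_and_depth` is hypothetical, and the `<< s p o >>` key form is simply assumed from the example node map earlier in this thread; a real implementation would need a properly canonicalized serialization.

```python
import json


def key_and_depth(term):
    """Return (key, depth) for an IRI string or an embedded-triple dict.

    An embedded triple is assumed to be a dict with "@id" (the subject)
    plus exactly one predicate whose value is the object.
    """
    if isinstance(term, str):
        return term, 0
    pred, obj = next((p, o) for p, o in term.items() if p != "@id")
    s_key, s_depth = key_and_depth(term["@id"])
    if "@id" in obj:
        o_key, o_depth = key_and_depth(obj["@id"])
    else:
        # Value objects are serialized as sorted JSON (an assumption of this sketch).
        o_key, o_depth = json.dumps(obj, sort_keys=True), 0
    return f"<< {s_key} {pred} {o_key} >>", 1 + max(s_depth, o_depth)


nested = {"@id": {"@id": "ex:a", "ex:b": {"@id": "ex:c"}}, "ex:d": {"@id": "ex:e"}}
key, depth = key_and_depth(nested)
```

Sorting candidates would then amount to sorting on the second element of the returned pair.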

pchampin · 2021-02-05T10:26:10Z

> My reasoning is that the goal is to look for more deeply embedded triples before more shallowly,

Yes, we agree on that.

> and that terms that might be chosen before deeper embedded terms would not interfere

Oh, I see it now! Of course, length is not a good proxy for depth in general, but that is not a problem. We are actually only interested in the partial order "contains", for which length is a good proxy: a key that contains another key is necessarily longer than it.

So yes, key length is much simpler, and does the job perfectly. I stand corrected 👏
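
To make the argument concrete, here is a small Python sketch. The helper names `sort_for_folding` and `respects_containment` are invented for illustration, and the keys are assumed to follow the `<< s p o >>` form from the earlier example. It shows that a shallow triple with a very long term may sort first, yet every key still comes before the keys it contains.

```python
def sort_for_folding(keys):
    """Order node-map keys by decreasing string length."""
    return sorted(keys, key=len, reverse=True)


def respects_containment(ordered):
    """True if every key appears before all keys that are substrings of it."""
    return all(
        i < j
        for i, outer in enumerate(ordered)
        for j, inner in enumerate(ordered)
        if outer != inner and inner in outer
    )


keys = [
    "ex:a",
    "<< ex:a ex:b ex:c >>",
    "<< ex:" + "x" * 40 + " ex:p ex:o >>",  # shallow, but with a very long term
    "<< << ex:a ex:b ex:c >> ex:d ex:e >>",
]
ordered = sort_for_folding(keys)
print(ordered)
print(respects_containment(ordered))
```

The long shallow key ends up ahead of the deeply nested one, which is harmless because neither contains the other.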

gkellogg mentioned this issue Feb 9, 2021

Flattened annotations #13

Merged

gkellogg closed this as completed in #13 Feb 12, 2021
