Literal as type #35

MattesWhite · 2020-01-22T13:17:23Z

This PR contributes to #34 .

With this PR the contents of Term::Literal are extracted into a new type LiteralData. This should allow third-party authors to easier add specific behavior for RDF-literals, e.g. serialization.

Furthermore, the calls to n3() has been reduced in the modified code (see #34).

In addition, the function Term::new_literal() is added. A shortcut for creating simple literals, i.e. literals with datatype xsd:string.

The content of the variant Term::Literal() is turned into an own type. This way it should be easier for thrid party libraries to implement custom functionality for RDF-literals.

…iteral-as-type

New constructor for simple literals, i.e. literals with datatype `xsd:string`. Where suitable the new constructor is invoked.

pchampin · 2020-01-22T18:35:48Z

I'm not sure exactly what problem you are trying to solve with the LiteralData type.

All I can see is additional verbosity due to the intermediate type.

I can see a few interesting ideas:

naming subfields of literals (text/value & kind),
having a simpler constructor for xsd:strings (although I tended to favor uniformity by having all (safe) term constructors return a result, even when they could not fail...),

but all of them could be implemented without the additional LiteralData type.

What am I missing?

MattesWhite · 2020-01-23T08:59:28Z

My intention is to seperate handling of the single RDF-terms (IRI, blank node, literal, variable) from an RDF-term that can be anything. In my opinion this makes it easier to reason about how a certain type of term is handled.

Let me explain it for serialization. Assume a trait for serializing something conformant to NTriples:

trait NtSerializable {
    fn seriallize<W: io::Write>(&self, w: &mut W) -> io::Result<()>;
}

This can be implemented for basic types like IriData, BNodeId, ... then serializing a Term would just be (simplyfied):

impl NtSerializable for Term {
    fn serialize<W: io::Write>(&self, w: &mut W) -> io::Result<()> {
        match self {
            Iri(iri) => iri.serialize(w),
            Literal(literal) => literal.serialize(w),
            ...
        }
    }
}

Continuing this up to Graph or Database (maybe with specialized traits for strict and so on). As a user it enables me to precisely find the code how different elements of RDF are serialized.

Another point I want to make is uniformity. As the same is already done for IRIs and blank nodes I thought it would just be consequent to do the same for literals (and maybe variables). Especially, as blank nodes are less complex as literals but already have an own type.

In a broader scale the seperation of Literals also supports a higher idea of mine. Actually, I wanted to come up with this later. The idea is to make Term a trait. Yes, this probably adds a lot more of generics, where-clauses and verbosity. But I think this only appies to the library side.

Applications will pick probably only a bunch of concrete crates from the sophia ecosystem. Therefore, they will not need to handle complex where-clauses.

On the other hand this would allow us to support external term implementations with existing sophia code, For example, use rio's terms directly in sophia::HashGraph without the need for conversion. Another example is to handle strict and generalized RDF by different implementations of Term reusing the base types for IRIs, blank nodes and literals.

Furthermore, extensions of the RDF-standard could easily interoperate with the sophia ecosystem. In fact, I'm thinking of Notation 3 here. In order to support quoted graphs in triples it is essential to support another kind of Term.

However, this is not the topic of this PR and should prabably be moved into an own issue.

pchampin · 2020-01-23T15:05:42Z

Ok, that is clearer, thanks.

In the current design, I thought of the Termenum as a sort of "class hierarchy", where the enum itself is the superclass, and the variants (Iri, Literal, etc...) are the subclasses. The "inner types" (such as IriData and BNodeId) were mere artifacts, introduced to overcome the limitations on enum variants (they can not have private attributes or methods of their own).

As I understand, you propose an approach where each subclass is defined as a separate type, and variants of the superclass enum are mere wrappers. I agree this is more regular, probably easier to understand, and possibly more idiomatic (see for example https://docs.rs/json/0.12.1/json/enum.JsonValue.html).

If we go that way, though, I would prefer:

to name the subclass types as closely as possible to their variant (so renaming IriData to Iri, and so on), and have them defined in respective submodules (term::iri, term::litral, etc.);
to make Literal itself an enum, with variants Typed{value: T, datatype: T}, Lang{value: T, tag: T} and get rid of LiteralKind, because I find that the currently proposed design makes pattern matching on literals way too verbose...

As for turning Term into a trait, this is indeed a totally different topic. Feel free to open an issue about it...

MattesWhite · 2020-01-24T07:38:18Z

I'm happy to see that we came to an agreement, my intentions are exactly as you understand.

Your preferred points are reasonable and I agree on them, too.

As this PR starts to heavily conflict with #37 and the changes required differ more and more from my initial proposal I close this PR and will commit a new one when the serializer PR is merged.

MattesWhite added 3 commits January 22, 2020 13:38

Introduce LiteralData as an own type

32330c8

The content of the variant Term::Literal() is turned into an own type. This way it should be easier for thrid party libraries to implement custom functionality for RDF-literals.

Merge branch 'master' of https://github.com/pchampin/sophia_rs into l…

6a032de

…iteral-as-type

Add Term::new_literal()

8c29802

New constructor for simple literals, i.e. literals with datatype `xsd:string`. Where suitable the new constructor is invoked.

MattesWhite mentioned this pull request Jan 22, 2020

Remove error chain #36

Merged

MattesWhite closed this Jan 24, 2020

This was referenced Feb 25, 2020

Make sophia ready to support external serializers #34

Closed

[merged] Variable as own type #47

Closed

MattesWhite deleted the literal-as-type branch February 27, 2020 06:43

This was referenced Mar 12, 2020

IRI as own public type #49

Merged

Introduce Literal as an own type #53

Merged

MattesWhite mentioned this pull request Mar 24, 2020

Make 'Term' a trait #55

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Literal as type #35

Literal as type #35

MattesWhite commented Jan 22, 2020

pchampin commented Jan 22, 2020

MattesWhite commented Jan 23, 2020

pchampin commented Jan 23, 2020

MattesWhite commented Jan 24, 2020

Literal as type #35

Literal as type #35

Conversation

MattesWhite commented Jan 22, 2020

pchampin commented Jan 22, 2020

MattesWhite commented Jan 23, 2020

pchampin commented Jan 23, 2020

MattesWhite commented Jan 24, 2020