
RFC for RDF support in SAFE Client Libs #288

Open
wants to merge 1 commit into base: master

Conversation

nbaksalyar
Contributor

The rendered Markdown version can be found here.

@happybeing

Looks very nice @nbaksalyar, great to see 😄

For Solid compatibility we need to support the LDP protocol at some level. As that isn't mentioned in this RFC, I take it that this would be provided in safe_app_nodejs or in separate libraries on top of both. I'm curious how this will work. Is there anything that shows how LDP, for example, might be supported, and how LDP requests would be translated into calls to this API? I only know how node-solid-server does this atm, so I'm trying to imagine how we can do the same without a file-system-like layer (as in storing RDF resources via SAFE NFS).

Going from low level to higher level, I'm imagining something like:

  • LOWEST LEVEL: creation, storage and access to RDF resources as immutable data (using this RDF API)
  • HIGHEST LEVEL: Solid protocols implemented as libraries and/or within safe_app_nodejs (so WebID profiles, LDP containers)

I was thinking there might be more layers than that, but I see it might be all that is needed. If so, it raises the question of what happens to SAFE NFS, and whether that can sit on top of this API - is that a possibility or the intention? If LDP and NFS were able to work with the same underlying data structures I think that would be cool 😺

If so, it makes me wonder about the efficiency of implementing those higher level functions via RDF based data structures, and what would be in an individual RDF resource stored at an XorURL. For example, might each NFS and/or LDP container be a single RDF resource? I have the same difficulty thinking about these questions with the existing implementations! For example, whether an NFS directory should be a separate resource (e.g. NFS container or RDF resource in this case), and when it should just be an entry in an existing resource (e.g. NFS keys with multiple '/').

In the case of RDF, maybe the question is answered for us - for example, every LDP container might be expected to be a separate resource - but I'm not sure that is mandated. So the question of how to implement this may still remain, with obvious implications for performance (i.e. number of requests, size of RDF resources, etc.).

Most/all of the above is outside the scope of this RFC, but I find it helpful to discuss it while trying to understand where this fits and whether it has implications for the 'whole' - I'm trying to see how this RFC fits into the bigger plan, or possible plans. So there's no need to try and answer all this. If you can point out any wrong thinking on my part, or fill in any areas, that would be great. Or if further RFCs are imminent for other areas I'm content to wait. I can't wait to see this in action!

@joshuef
Contributor

joshuef commented Mar 15, 2019

I have to agree about putting the RDF APIs in the core. I think we need to rely on this for building out our client-side structures, etc. (NFS/PNS, as you touch on), and this makes much more sense than a separate lib or getting into vaults (which, if it were wanted later on, could be added in, I think).


For Solid compatibility we need to support the LDP protocol at some level.

I think that's largely out of scope for this RFC, @theWebalyst. If LDP/Solid compatibility is desirable, IMO that should be catered for by an application further up the stack (e.g. a modified Solid server).

@happybeing

happybeing commented Mar 15, 2019

@joshuef sounds good. So I'm hearing:

  • Client Libs: RDF data support available via this RDF API alongside other 'primitives' (such as NFS, WebID, immutable data)

  • SAFE Nodejs API: still exposing Client Libs pretty much as is (so not LDP)

  • TBD: separate modular libraries, MaidSafe or third party, providing features such as LDP emulation and closer compatibility with Solid. This is what I now have working at PoC standard for Solid.

At the moment that means: swap this fork of solid-auth-client into your Solid app, and it will work [cough] on SAFE (or on http if you upload to the Web). That module uses the LDP emulation from Safenetworkjs which in turn uses SAFE NFS, so my question is about MaidSafe's plans for NFS.

For example, do you think NFS will remain as is, or might the data structures change to use RDF to represent the file system rather than the keys to Mutable Data?

Any thoughts or guidance on how that might turn out in practice?

Whatever happens, I prefer LDP and NFS to share the same underlying data structures to make it easy for developers to maintain access to the same files and resources, whichever API is used to create and access them.

Sorry this is out of scope for the RFC - we could take it to the forum if there's more to discuss.

@ashwinkumar6

  1. When removing triples from the graph, will the user have to fetch the graph from the network, perform the 'remove triple' operation and then finally perform a mutate operation to update the graph? Or can the user directly perform the remove triple operation without having to fetch the entire graph from the network?

  2. The RFC says we do not force the user to follow a particular serialization format, but it's recommended to support Turtle and JSON-LD. Would we be exposing APIs for serializing and deserializing all types of formats (e.g. RDFa, TriG, RDF/XML)?

  3. How does Schema work on the SAFE Network, as according to our fundamentals the SAFE Network will not use any clearnet services?

  4. Would we have a Schema specific to the SAFE Network later on?

@lionel-faber
Contributor

This looks great! Bringing RDF functionality to the core of safe_app will greatly improve the representation standard of data on the network. I have a couple of doubts though:

  • By saying support the ability to change and archive the resources, we mean that the data can be updated and the previous versions will be archived, yes? Only then will data be perpetual.
  • With regard to encrypted RDF data, there can be multiple ways of doing this:
Store plain text data in RDF -> Serialize and encrypt -> Store on the network

This way the data is completely obfuscated, and even though it is in RDF format, it would make no sense unless decrypted. Another way would be:

Serialize and encrypt data -> Store ciphertext in RDF with some metadata -> Serialize and store the entire RDF document.

Now I can make some sense of the data using the metadata, which I don't need a key to decrypt.
Both of these have their own pros and cons, and there might be other approaches too. Will approaches be documented / recommended?

  • Looking at RDF graphs, can a "disjoint" triple be a part of the same graph? For example:
    Consider people A, B, C, D, X & Y, where -> represents a 'knows' relationship.
    If I have a graph where:
A -> B
B -> C
B -> D

Then should I be able to add a triple defining X -> Y to the same graph?

@nbaksalyar
Contributor Author

@lionel1704

support the ability to change and archive the resources <...> we mean that the data can be updated and the previous versions will be archived, yes?

Yes, exactly - it's basically the same deal as with web pages on the SAFE Network: you should be able to modify them and remove information, but ultimately it will be preserved and archived perpetually, and you should always be able to refer to an older version. That's a very important feature for Linked Data.
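To make that versioning behaviour concrete, here is a minimal Rust sketch assuming a hypothetical `RdfResource` type with version lookup; none of these names come from the RFC or Client Libs, they only illustrate the "old versions stay addressable" idea.

```rust
// Hypothetical sketch only: `RdfResource`, `fetch_latest` and `fetch_version`
// are illustrative names, not part of the SAFE Client Libs API.
struct RdfResource {
    // Each entry is one archived serialisation (e.g. Turtle) of the resource.
    versions: Vec<String>,
}

impl RdfResource {
    // The latest snapshot: what a client normally reads and mutates.
    fn fetch_latest(&self) -> Option<&String> {
        self.versions.last()
    }

    // Older snapshots remain addressable forever (perpetual data).
    fn fetch_version(&self, v: usize) -> Option<&String> {
        self.versions.get(v)
    }
}

fn main() {
    let resource = RdfResource {
        versions: vec![
            "<#me> <http://schema.org/name> \"Alice\" .".to_string(),
            "<#me> <http://schema.org/name> \"Alice A.\" .".to_string(),
        ],
    };
    // The latest version reflects the edit, but the original is still there.
    assert_eq!(resource.fetch_latest(), resource.fetch_version(1));
    assert!(resource.fetch_version(0).is_some());
}
```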

I agree that the RFC text could use better wording here though.

With regard to encrypted RDF data, there can be multiple ways of doing this

Yes, the simplest one would be to just encrypt all serialised triples. It's more of a problem if we want more efficient retrieval and querying though, and that is left up to the implementation. The plain encryption scheme should be good enough for most cases, but it might be inefficient for larger resources (e.g. ones containing some 100K-1M triples).
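As a rough illustration of that "serialise everything, then encrypt" scheme (the first of the two flows described above), here is a sketch in Rust. The `serialise_to_turtle`, `encrypt` and `put_immutable_data` functions are placeholders made up for the example, not real Client Libs calls, and the "encryption" is a stub standing in for a real cipher.

```rust
// Sketch of the "serialise the whole graph, then encrypt the blob" approach.
// All function names here are placeholders, not SAFE Client Libs APIs.

// Placeholder: serialise an in-memory graph to Turtle-like text.
fn serialise_to_turtle(triples: &[(String, String, String)]) -> String {
    triples
        .iter()
        .map(|(s, p, o)| format!("<{}> <{}> {} .", s, p, o))
        .collect::<Vec<_>>()
        .join("\n")
}

// Placeholder: stands in for a real cipher (e.g. AES-GCM or self-encryption).
// It returns the input unchanged; a real implementation would return ciphertext.
fn encrypt(plaintext: &[u8], _key: &[u8]) -> Vec<u8> {
    plaintext.to_vec()
}

// Placeholder: store bytes on the network and return an address.
fn put_immutable_data(bytes: Vec<u8>) -> String {
    format!("safe://<xor-address-of-{}-bytes>", bytes.len())
}

fn main() {
    let triples = vec![(
        "safe://alice#me".to_string(),
        "http://schema.org/name".to_string(),
        "\"Alice\"".to_string(),
    )];
    // Whole-graph encryption: serialise first, then encrypt the resulting blob.
    let turtle = serialise_to_turtle(&triples);
    let ciphertext = encrypt(turtle.as_bytes(), b"some secret key");
    let address = put_immutable_data(ciphertext);
    println!("stored encrypted graph at {}", address);
}
```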

Will approaches be documented / recommended?

In a separate RFC maybe, but so far I believe we should deliberately omit this from the RFC and go for a simple first implementation.

can a "disjoint" triplet be a part of the same graph?

It can be, for sure - RDF graphs can contain any sort of data. It's just a matter of your preference and design. :) You can think about RDF resources as HTML pages: specifications tell you nothing about what kind of data you should store there, and it's not necessary to follow any particular logic when choosing which triples to store in a particular resource.
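To illustrate the point with a toy example (the `Triple` struct and the in-memory `Vec` graph below are just for demonstration, not the API proposed in the RFC): two completely unconnected sets of triples can sit in one and the same graph.

```rust
// Toy example: an RDF graph is just a set of triples, so "disjoint"
// subgraphs can coexist in the same resource without any problem.
#[derive(Debug)]
struct Triple {
    subject: &'static str,
    predicate: &'static str,
    object: &'static str,
}

fn main() {
    let knows = "http://xmlns.com/foaf/0.1/knows";
    let graph = vec![
        // One connected component: A -> B, B -> C, B -> D.
        Triple { subject: "safe://people#A", predicate: knows, object: "safe://people#B" },
        Triple { subject: "safe://people#B", predicate: knows, object: "safe://people#C" },
        Triple { subject: "safe://people#B", predicate: knows, object: "safe://people#D" },
        // A completely disjoint pair in the same graph: X -> Y.
        Triple { subject: "safe://people#X", predicate: knows, object: "safe://people#Y" },
    ];
    println!("{} triples in one graph: {:#?}", graph.len(), graph);
}
```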

@nbaksalyar
Contributor Author

@ashwinkumar6

can the user directly perform the remove triple operation without having to fetch the entire graph from the network

This is a good question. It might be possible to support this operation if we choose a specific triple storage format. But, more importantly, the network should also support removal as an operation, and it's yet to be decided how that will work with perpetual data, or if we'll have it at all. So yes, for now it's like you said: fetch the data, apply the remove operation, and sync it with the network.
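A rough sketch of that fetch -> remove -> sync round trip, with made-up placeholder functions (`fetch_graph`, `sync_graph`) standing in for whatever the eventual Client Libs calls end up being:

```rust
// Sketch of the fetch -> remove triple(s) -> sync flow described above.
// `fetch_graph` and `sync_graph` are placeholders, not real API calls.
type Triple = (String, String, String);

fn fetch_graph(_xor_url: &str) -> Vec<Triple> {
    // Placeholder: would fetch and deserialise the resource from the network.
    vec![
        ("safe://alice#me".into(), "http://schema.org/name".into(), "\"Alice\"".into()),
        ("safe://alice#me".into(), "http://schema.org/email".into(), "\"a@example.org\"".into()),
    ]
}

fn sync_graph(_xor_url: &str, graph: &[Triple]) {
    // Placeholder: would serialise the graph and send a mutation to the network.
    println!("syncing {} triples", graph.len());
}

fn main() {
    let url = "safe://alice";
    // 1. Fetch the whole graph locally.
    let mut graph = fetch_graph(url);
    // 2. Apply the removal locally.
    graph.retain(|(_, p, _)| p != "http://schema.org/email");
    // 3. Push the updated graph back as a single mutation.
    sync_graph(url, &graph);
}
```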

Would we be exposing APIs for serializing and deserializing all types of formats (e.g. RDFa, TriG, RDF/XML)?

Not necessarily; we can support only the widespread ones (e.g. JSON-LD and Turtle), and the rest can be added by other libraries building on top of Client Libs. Hence the idea of having a trait for supporting different serialisation formats.
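For illustration, such a trait could look roughly like the sketch below; the names (`RdfSerialiser`, `TurtleSerialiser`) are mine, not taken from the RFC, and the Turtle parsing is deliberately stubbed out.

```rust
// Illustrative extension point: a trait for pluggable serialisation formats,
// so formats beyond Turtle/JSON-LD could be added by libraries on top of Client Libs.
type Triple = (String, String, String);

trait RdfSerialiser {
    fn serialise(&self, triples: &[Triple]) -> String;
    fn deserialise(&self, input: &str) -> Result<Vec<Triple>, String>;
}

struct TurtleSerialiser;

impl RdfSerialiser for TurtleSerialiser {
    fn serialise(&self, triples: &[Triple]) -> String {
        triples
            .iter()
            .map(|(s, p, o)| format!("<{}> <{}> {} .", s, p, o))
            .collect::<Vec<_>>()
            .join("\n")
    }

    fn deserialise(&self, _input: &str) -> Result<Vec<Triple>, String> {
        // A real implementation would parse Turtle here.
        Err("parsing not implemented in this sketch".to_string())
    }
}

fn main() {
    let serialiser = TurtleSerialiser;
    let triples: Vec<Triple> = vec![(
        "safe://alice#me".to_string(),
        "http://schema.org/name".to_string(),
        "\"Alice\"".to_string(),
    )];
    println!("{}", serialiser.serialise(&triples));
}
```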

How does Schema work on the SAFE Network, as according to our fundamentals the SAFE Network will not use any clearnet services?

It doesn't have to: it's just a convention and a recommendation to make schemas/URIs in RDF resolvable to valid schema descriptions or resources. It's not required though, so we can still safely use the http:// scheme in RDF URIs on the SAFE Network. And of course it makes sense to use the safe:// scheme for Linked Data: that's why supporting XOR URLs is essentially a prerequisite for implementing this RFC.

Would we have a Schema specific to the SAFE Network later on?

Yes, it's defined by the RFC for XOR URLs.

@nadiaburborough
Contributor

Just a thought @nbaksalyar wrt

Will approaches be documented / recommended? ... In a separate RFC maybe, but so far I believe we should deliberately omit this from the RFC and go for a simple first implementation.

It might be worth briefly documenting the alternative proposals to show why they are not suitable for this first implementation. That way it's clear that the option you are recommending is the 'simplest' for now.
