Discoverability of root and controllers of Pods - some thoughts #10

Closed
megoth opened this issue Aug 19, 2019 · 31 comments
Comments

@megoth

megoth commented Aug 19, 2019

This is probably a bit too early in the process, but wanted to share some thoughts for a couple of data points that we've had to assume in the current data browser (as the current spec doesn't help us).

I've written up the thoughts on https://megoth.inrupt.net/public/SolidDataBrowser/pod-info.html. If you use dokieli to annotate, please let me know, so I can add you to my WebID as foaf:knows (which should enable me to see your annotations).

I would hope that we can standardize some mechanism that allows for discoverability of storage for a given resource. I also think discoverability of controllers is useful, but I understand that there are tougher questions to deal with on that one.

@RubenVerborgh
Member

I don't think "root of pod" should be a thing; I don't think the concept of "pod" has a place in the data model. It's only resources and folders.

@megoth
Author

megoth commented Aug 19, 2019

I don't think "root of pod" should be a thing; I don't think the concept of "pod" has a place in the data model. It's only resources and folders.

If all of these instances are replaced with storage, does that make it better?

The goal is to have some way of knowing what the storage for a given resource is without assuming anything about the setup for the given Pod.

@RubenVerborgh
Member

If all of these instances are replaced with storage, does that make it better?

Probably, yes.

The goal is to have some way of knowing what the storage for a given resource is

Or in the solution-oriented (which is not always good) terminology I currently have in mind: what shape it conforms to. So then the resource would be part of a folder, which would conform to a shape structure, recursively.

@megoth
Author

megoth commented Aug 19, 2019

The goal is to have some way of knowing what the storage for a given resource is

Or in the solution-oriented (which is not always good) terminology I currently have in mind: what shape it conforms to. So then the resource would be part of a folder, which would conform to a shape structure, recursively.

I agree that shapes are important for resources, but that doesn't solve the question of which storage a resource belongs to, I think? Just asking in case I have missed something wrt shapes.

@RubenVerborgh
Member

doesn't solve the question of which storage a resource belongs to, I think?

I don't think that question really has an answer though. A resource might be associated with zero, one, or multiple storages.

@megoth
Author

megoth commented Aug 19, 2019

doesn't solve the question of which storage a resource belongs to, I think?

I don't think that question really has an answer though. A resource might be associated with zero, one, or multiple storages.

True, but in the cases where straightforward answers exist, why not serve them? I understand that there are complex cases around this, but I think these data points are useful to clients such as data browsers to offer context.

@RubenVerborgh
Member

True, but in the cases where straightforward answers exist, why not serve them?

Name one 🙂

I randomly picked https://drive.verborgh.org/public/icon-decentralization-markets.png, but feel free to pick any other example. What is the storage of this resource?

@megoth
Author

megoth commented Aug 19, 2019

True, but in the cases where straightforward answers exist, why not serve them?

Name one 🙂

I randomly picked https://drive.verborgh.org/public/icon-decentralization-markets.png, but feel free to pick any other example. What is the storage of this resource?

I would guess https://drive.verborgh.org/ is the storage of this resource.

@RubenVerborgh
Member

I would guess drive.verborgh.org is the storage of this resource.

Possibly. Could also be https://drive.verborgh.org/public/.
They are not really distinguishable from an external point of view.

The point I was trying to make is that the straightforward cases are probably the exceptions rather than the other way round.

@megoth
Author

megoth commented Aug 19, 2019

I would guess drive.verborgh.org is the storage of this resource.

Possibly. Could also be https://drive.verborgh.org/public/.
They are not really distinguishable from an external point of view.

Exactly, but only one of them is the root of the Pod.

Now, perhaps this is a good time to ask why shouldn't the root of the Pod be a thing in the data model? (I should've asked that in the beginning.)

The point I was trying to make is that the straightforward cases are probably the exceptions rather than the other way round.

I understand, and it might be that this is too expensive for too little value. But in the current data browser I think it gives value, so I would like us to explore the possibility of having something like a pointer to the root of the Pod standardized.

@RubenVerborgh
Member

Exactly, but only one of them is the root of the Pod.

You don't know that actually. I can perfectly declare https://drive.verborgh.org/public/ to be the root of the pod, and the contrary cannot be proven.

Now, perhaps this is a good time to ask why shouldn't the root of the Pod be a thing in the data model?

As much as I dislike turning the tables as an argument, I need to here 🙂 Solid is just LDP, and LDP does not define any special root concept. So the question is: does Solid need such a root concept, and why?

So far, we have not had that need. My profile points to a location apps can write to. That might or might not be a (root) pod. It's just a folder and it behaves the same as all other folders. The root has no interesting observable behavior that makes it different.

having something like a pointer to the root of the Pod standardized.

But what is the use case? Like, on a high level?

Knowing that there's no 1-to-1 user to pod mapping, i.e., there's just WebIDs with access to folders.

@ericprud
Contributor

@megoth, can you elaborate a bit on the use cases for the root? My guess is that it serves as a start of authority for ACLs/delegation and a slew of metadata we've not yet explored like associations between shapes and storage locations. Also, maybe a place to look for a .well-known directory.

@TallTed
Contributor

TallTed commented Aug 21, 2019

@RubenVerborgh

Solid is just LDP

If Solid were just LDP, there'd be no need for a distinct Solid spec.

@RubenVerborgh
Member

If Solid were just LDP, there'd be no need for a distinct Solid spec.

We both know that—I was trying to summarize something more specific:

Now, perhaps this is a good time to ask why shouldn't the root of the Pod be a thing in the data model?

As much as I dislike turning the tables as an argument, I need to here 🙂 Solid is just LDP, and LDP does not define any special root concept.

"The parts of the Solid specification that deal with resource access and relations between resources are simply lifted from the LDP spec. The LDP spec does not mention the notion of a root or a pod. So on the resource/HTTP level, a pod does not really 'exist'. Just like the HTML and HTTP specifications will not define a "website", but rather access to a set of interconnected resources."

@TallTed
Contributor

TallTed commented Aug 21, 2019

Thank you for the detailed expansion. I think that does a much better job of answering the concern, and avoids down-the-road misunderstandings stemming from the shorthand.

@megoth
Author

megoth commented Aug 22, 2019

@megoth, can you elaborate a bit on the use cases for the root? My guess is that it serves as a start of authority for ACLs/delegation and a slew of metadata we've not yet explored like associations between shapes and storage locations. Also, maybe a place to look for a .well-known directory.

The main reason for this is that I think it is helpful to give some context to the resource a user is exploring. We've done this in the latest iteration of mashlib (aka the data browser) where a user can click on the Solid icon to navigate to the root of the current Pod, so that they can explore more of the Pod that way. But as this may be based on faulty assumptions, I thought it prudent to standardize it some way.

(The user can of course try this out manually by manipulating the URL, but I thought it useful to offer this navigation.)

I understand this might not be how LDP and Solid are envisioned. And even though there might be no concept of root in LDP, there is one in the WAC spec, which states:

  1. The root container of a user's account MUST have an ACL resource specified. (If all else fails, the search stops there.)

Side note: Another use case is for the data browser to know which ACL files cannot be deleted. But I think this can be handled better by using the WAC-Allow header when getting the ACL resource.

In the end I cannot offer any good use cases for the ability to discover the root of a Pod, except that I think it's useful for users who want to explore the storage(s) where the resource is located. If this runs counter to the philosophy of Solid, I understand that we should not extend the data model to describe this.

@RubenVerborgh
Member

The main issue is going to be defining what the root of a pod is. It's similar to (but more complex than) determining what website a page belongs to. If I have http://sub.example.de/site/subsite/page.html, what website does it belong to? Justifications can be given for any of:

So I think we are just working with ill-defined concepts, and that would be our main problem forward.

@megoth
Author

megoth commented Aug 22, 2019

OK, so the root of a Pod is not a concept we want to introduce; I understand that it is problematic.

But what about the other part of the document I started this thread with, the notion of getting a list of controllers for a given resource?

@RubenVerborgh
Member

"Controllers" meaning "agents with Control access"?

Two comments there:

  1. We might want to limit access to that list to people who have Control access themselves (cfr. how it works on platforms like GitHub, I think). But in that case, this requirement might already be fulfilled, given that Control agents have access to the ACL and thus to other Controllers' WebID.
  2. Given 1., we might want to look into the use case above. Do we need access to the list of people, or do we want to send those people a message (such as an access request)? Because for the latter, the list itself does not need to be exposed.

@megoth
Author

megoth commented Aug 22, 2019

"Controllers" meaning "agents with Control access"?

First and foremost, yes - but we could extend the discussion to whether this list should also include teams/groups who have access to the server, e.g. the maintainers of the server.

  1. We might want to limit access to that list to people who have Control access themselves (cfr. how it works on platforms like GitHub, I think). But in that case, this requirement might already be fulfilled, given that Control agents have access to the ACL and thus to other Controllers' WebID.

If we require Control access to see the list of controllers, then we already have all the support in WAC. So what I'm asking is if this list should be exposed to users who do not have Control access. Again, this is about providing context for resources (I think it's useful to know who has Control access to the resource).

  2. Given 1., we might want to look into the use case above. Do we need access to the list of people, or do we want to send those people a message (such as an access request)? Because for the latter, the list itself does not need to be exposed.

A request to get the list might be a solution - I'm a bit curious about how the flow would work, though. For example, when do users get a notification that they've gotten access? Every time a controller responds (e.g. a notification that the request has been declined or accepted), and then perhaps a final notification when everyone in the list has responded? What if one or more controllers never respond? Should there perhaps be a time limit? Should there be a possibility of sending reminders? Perhaps automate the reminder?

I guess the question is really about which level of anonymity a controller can assume?

@RubenVerborgh
Member

(I think it's useful to know who has Control access to the resource).

Not disagreeing, but this seems counter to how existing platforms are implemented. (e.g., I can't see for random GitHub repositories who manages issues, despite my rights to read and write them.) So at the very least, might need to be opt-in.

A request to get the list might be a solution

Sorry, expressed myself unclearly. I was thinking about use cases for needing the list of controllers; one such use case is "asking permission to read/write a resource". And I wanted to point out that this specific use case could be solved by a read/write request, so it does not require that list. But there might be other use cases that do require that list.

@megoth
Author

megoth commented Aug 30, 2019

I think I'll close this issue for now. I still think it's useful to be able to point to the root of a storage and the ability to get a list of controllers for a resource, but I cannot argue for it better than that I think it can be useful for users familiarizing themselves with decentralized resources.

I might revisit this when I have better use-cases in mind, but unless anyone objects I'll close this thread for now. (Good to not clutter with unresolved issues, after all.)

megoth closed this as completed Aug 30, 2019
@megoth
Author

megoth commented Sep 2, 2019

Remembered one use case that we have in the current data browser, so wanted to add it to hear what people think.

We have something we call the homepagePane which we want to show when a user is visiting the top-most container of a Pod, aka the root (at least in my mind). Is this something that warrants the need for knowing that a certain container is in fact the root of the Pod?

If not, please feel free to close this thread again.

megoth reopened this Sep 2, 2019
@RubenVerborgh
Member

Top-most container seems like a well-defined concept, so it's at least technically possible.

@elf-pavlik
Member

While I consider "pod" an informal term that different people use to mean different things (solid/specification#16), I think we could try to further clarify the object of statements with space:storage, as in:

<https://ruben.verborgh.org/profile/#me> pim:storage <https://drive.verborgh.org/> .

From http://www.w3.org/ns/pim/space

    ws:Storage     a :Class;
         :comment """A storage is a space of URIs in which you have access to data.
""";
         :label "storage" .

    ws:storage     a rdf:Property;
         :comment "The storage in which this workspace is";
         :label "storage";
         :range ws:Storage;
         owl:inverse  [
             :label "workspace included" ] .
    ws:uriPrefix     a rdf:Property,
                owl:DatatypeProperty;
         :comment """URIs which start with this string are in this workspace or storage.
This may be used for constructing URIs for new storage resources.
""";
         :label "URI prefix";
         ui:prompt "Give the first part of the URis in this workspace" .

I think https://www.w3.org/TR/void/ has some similar and possibly clearer definitions

  • void:Dataset: A set of RDF triples that are published, maintained or aggregated by a single provider.
  • void:inDataset: Points to the void:Dataset that a document is a part of.
  • void:subset: A void:Dataset that is part of another void:Dataset.
  • void:uriSpace: A URI that is a common string prefix of all the entity URIs in a void:Dataset.
  • void:rootResource: A top concept or entry point for a void:Dataset that is structured in a tree-like fashion.

@megoth, the last one, void:rootResource, sounds like what you are talking about.

An old issue about VoID and PIM solid/vocab#7

@elf-pavlik
Member

Another reference from RFC 7235 Hypertext Transfer Protocol (HTTP/1.1): Authentication

Protection Space (Realm)

The "realm" authentication parameter is reserved for use by
authentication schemes that wish to indicate a scope of protection.

A protection space is defined by the canonical root URI (the scheme
and authority components of the effective request URI; see Section
5.5 of [RFC7230]) of the server being accessed, in combination with
the realm value if present. These realms allow the protected
resources on a server to be partitioned into a set of protection
spaces, each with its own authentication scheme and/or authorization
database. The realm value is a string, generally assigned by the
origin server, that can have additional semantics specific to the
authentication scheme. Note that a response can have multiple
challenges with the same auth-scheme but with different realms.

It appeared in conversations around authentication and authorization. I'm not sure yet how those Protection Spaces (Realms) are supposed to align with space:storage instances, but we do have the concept of a root URI for a partition of resources defined in the HTTP spec. The issue mentioned: "consider handling multiple realms per origin" #9

@csarven
Member

csarven commented Mar 13, 2020

Generally speaking, the discovery of a resource's root would require a declaration, like a relation (eg. via Link, an RDF property). On the other hand, deriving the root resource (which is typically a container but not necessarily if we really broaden the definition [I'd suggest against that]) by examining the URI is straightforward as per RFC 3986. Moreover, there is the slash semantics and hierarchical containment that clarify the structure of memberships, as well as the restriction on having the same authority for a container and its member resource.

I'm not quite convinced about the use case which may end up requiring a relation, but I do think that applications needing to determine the root for their use case can do it by examining the URI.
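As a rough illustration of "examining the URI", the sketch below lists the candidate containers above a resource, assuming slash-delimited hierarchical paths in the RFC 3986 sense. The function name `ancestor_containers` is hypothetical and not part of any Solid library. It only enumerates candidates; it cannot tell which of them, if any, is meant to be the root.

```python
from __future__ import annotations

from urllib.parse import urlsplit, urlunsplit


def ancestor_containers(resource_url: str) -> list[str]:
    """Return container URLs from the nearest parent up to the origin root.

    Hypothetical helper: relies only on slash semantics, with no network access.
    """
    parts = urlsplit(resource_url)
    # Path segments without the empty strings produced by leading/trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    containers = []
    # Drop the last segment (the resource itself), then keep shortening the path.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "".join(seg + "/" for seg in segments[:depth])
        # Query and fragment are dropped; scheme and authority stay the same.
        containers.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return containers


print(ancestor_containers("https://storage.test/foo/bar"))
# ['https://storage.test/foo/', 'https://storage.test/']
```

The last element is always the origin root, which is the "top-most container" in the purely syntactic sense discussed earlier in the thread.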

@elf-pavlik
Member

I think this issue should be transferred to https://github.com/solid/specification /cc @csarven

@csarven
Member

csarven commented Feb 3, 2022

The Solid Protocol describes Storage and its owners. If @megoth is satisfied, the issue should be closed.

@megoth
Author

megoth commented Feb 3, 2022

The Solid Protocol describes Storage and its owners. If @megoth is satisfied, the issue should be closed.

TL;DR: The original problems should be solvable, and this issue can be closed.

Yes, I think this should be enough, but just to be sure I want to lay out how the original problems could be solved. Apologies for the use of informal terms beforehand, I hope I make it clear enough anyway.

Context: A user has opened a resource in a Pod browser app (an app that allows users to browse data in a Pod, e.g. Inrupt's PodBrowser, mashlib). The app wants to provide some context about the resource.

Find the root for a breadcrumb navigation

Use-case: To add context about the location of a resource

Given that resources might reside in multiple storages, I understand that this use case might not always make sense. I do still think that in a lot of cases it might be useful to show the location if a storage's URL is part of the resource's URL, e.g. the resource https://storage.test/foo/bar has one of its topmost containers at https://storage.test/. So, given this assumption, an app could check each container URL, and if any of them states that it is a storage, the app can assume that this should be presented as the root of the breadcrumb navigation.

In the case that none of the container URLs are described as a storage, the app cannot know what the root of the breadcrumb navigation is, and could hide the navigation altogether or indicate some other way that the root cannot be determined.

In the case that there are multiple containers that describe themselves as storage (e.g. https://storage.test/foo/ and https://storage.test/ both describe themselves as storage), the app should probably stop at the first container when traversing from the resource URL (i.e. it should stop at https://storage.test/foo/ and present it as root of the breadcrumb).

I'm OK with this solution, although I'm not thrilled about how expensive it can get. An app might have to do a lot of fetches to determine the root of a breadcrumb navigation (e.g. https://storage.test/a/b/c/d/e/f/g/h/i/j.ttl could result in 10 fetches to determine the storage). But since it would probably be more expensive (not to mention hard to manage) to maintain a link from resources to their storages, I understand that this is the lesser of two evils (and it probably avoids a bunch of other complications that I can't think of now).
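The traversal described above could be sketched roughly as follows. This assumes a storage advertises itself with an HTTP Link header of rel="type" targeting http://www.w3.org/ns/pim/space#Storage, which is my reading of the Solid Protocol and worth double-checking. All function names are hypothetical, and the network fetch is left to the caller through the `is_storage` callback, so the example runs against canned headers.

```python
from __future__ import annotations

import re
from typing import Callable, Optional
from urllib.parse import urlsplit, urlunsplit

# Type IRI a storage is assumed to advertise in its Link header.
PIM_STORAGE = "http://www.w3.org/ns/pim/space#Storage"


def container_chain(resource_url: str) -> list[str]:
    """List container URLs from the nearest parent up to the origin root."""
    parts = urlsplit(resource_url)
    segments = [s for s in parts.path.split("/") if s]
    chain = []
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "".join(seg + "/" for seg in segments[:depth])
        chain.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return chain


def declares_storage(link_header: str) -> bool:
    """Check a raw Link header value for rel="type" targeting pim:Storage."""
    # Simplified parsing: each link is "<target>" followed by its parameters.
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        rel = re.search(r'rel\s*=\s*"?([^";,]+)"?', params)
        if rel and target == PIM_STORAGE and "type" in rel.group(1).split():
            return True
    return False


def find_breadcrumb_root(
    resource_url: str, is_storage: Callable[[str], bool]
) -> Optional[str]:
    """Return the first container (nearest first) that declares itself a storage."""
    for container in container_chain(resource_url):
        if is_storage(container):  # one request per container in a real app
            return container
    return None  # root cannot be determined; the app may hide the breadcrumb


# Canned Link headers standing in for real responses (hypothetical data).
FAKE_LINK_HEADERS = {
    "https://storage.test/": f'<{PIM_STORAGE}>; rel="type"',
    "https://storage.test/foo/": f'<{PIM_STORAGE}>; rel="type"',
}


def fake_is_storage(url: str) -> bool:
    return declares_storage(FAKE_LINK_HEADERS.get(url, ""))


print(find_breadcrumb_root("https://storage.test/foo/bar", fake_is_storage))
# https://storage.test/foo/
```

In a real app, `is_storage` would issue a HEAD or GET per container, so caching the result per container URL would soften the cost concern raised above.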

Display the owner's name in the root of the breadcrumb navigation

Use-case: To provide context to visitors about the owner of a storage (it could be argued that "Arne's Pod" is probably more useful than "https://storage.test").

Given that a storage can be resolved (as explained in the previous section), the server links to the owner's WebID (the server always knows the owner, but might withhold that information), the WebID can be dereferenced (the WebID can be private), and the WebID document provides a name (e.g. foaf:name, vcard:fn), the app could use that name in the root of the breadcrumb navigation. Lots of things might fail, so the app needs to be aware of these possible scenarios and handle them appropriately.

I think this is a satisfactory solution.
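The fallback chain for the label could look something like the sketch below. It assumes the server exposes the owner through a Link header with rel="http://www.w3.org/ns/solid/terms#owner" (optional in the Solid Protocol, as far as I understand it). The `lookup_name` callback is a hypothetical stand-in for dereferencing the WebID and reading foaf:name or vcard:fn, and all function names are made up for illustration.

```python
from __future__ import annotations

import re
from typing import Callable, Optional

# Link relation assumed to point from a storage to its owner's WebID.
SOLID_OWNER = "http://www.w3.org/ns/solid/terms#owner"


def owner_from_link_header(link_header: str) -> Optional[str]:
    """Return the WebID targeted by a solid:owner link, if the server exposes one."""
    # Simplified parsing: each link is "<target>" followed by its parameters.
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        rel = re.search(r'rel\s*=\s*"?([^";,]+)"?', params)
        if rel and SOLID_OWNER in rel.group(1).split():
            return target
    return None


def breadcrumb_root_label(
    storage_url: str,
    link_header: str,
    lookup_name: Callable[[str], Optional[str]],
) -> str:
    """Prefer "<name>'s Pod"; fall back to the storage URL when any step fails."""
    webid = owner_from_link_header(link_header)
    if webid is None:
        return storage_url  # the server withheld the owner
    try:
        name = lookup_name(webid)  # may fail if the WebID document is private
    except Exception:
        name = None
    return f"{name}'s Pod" if name else storage_url


# Hypothetical header and name lookup standing in for real requests.
header = f'<https://arne.test/profile/card#me>; rel="{SOLID_OWNER}"'
print(breadcrumb_root_label("https://storage.test/", header, lambda webid: "Arne"))
# Arne's Pod
```

Each failure mode listed above (no owner link, private WebID, no name) collapses to the same fallback here; an app might instead want to distinguish them in its UI.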

Given that my understanding of this is correct (or close enough at least, in that details might be somewhat off, but the problems can be solved), I think we should close this issue. I'll assume that it is, and close this issue for now. Please reopen if there are any problems that need to be addressed.

megoth closed this as completed Feb 3, 2022
6 participants