Add pim:storage in profile documents by default #910

Open
Vinnl opened this issue Aug 16, 2021 · 19 comments

@Vinnl

Vinnl commented Aug 16, 2021

According to the spec, as quoted by @csarven, pim:storage should exist in the profile:

Clients can discover a storage by making an HTTP GET request on the target URL to retrieve an RDF representation [RDF11-CONCEPTS], whose encoded RDF graph contains a relation of type http://www.w3.org/ns/pim/space#storage. The object of the relation is the storage (pim:Storage).

@matthieubosquet believed this wasn't the case and was going to respond to the linked chat, but since @RubenVerborgh asked me to report it anyway, here we are :)
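The discovery step quoted from the spec can be sketched as follows. This is a minimal illustration, assuming the profile has already been fetched and parsed into triples by an RDF library; the `Triple` shape, the `storagesInProfile` helper, and the example URLs are hypothetical, and the sketch assumes the WebID is the subject of the relation.

```typescript
// Predicate IRI quoted in the spec text above.
const PIM_STORAGE = "http://www.w3.org/ns/pim/space#storage";

// Hypothetical minimal triple shape; a real app would use its RDF library's types.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Returns every storage the profile graph declares for the given WebID.
function storagesInProfile(webId: string, triples: Triple[]): string[] {
  return triples
    .filter((t) => t.subject === webId && t.predicate === PIM_STORAGE)
    .map((t) => t.object);
}

// Example data (made up for illustration).
const webId = "https://pod.example/alice/profile/card#me";
const profile: Triple[] = [
  { subject: webId, predicate: "http://xmlns.com/foaf/0.1/name", object: "Alice" },
  { subject: webId, predicate: PIM_STORAGE, object: "https://pod.example/alice/" },
];
console.log(storagesInProfile(webId, profile));
```

An empty result is exactly the case this issue is about: the profile is valid, but a client that only looks for this triple has nowhere to store data.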

@joachimvh
Member

According to the current version of the spec the only thing that is required is the link header, so perhaps that part is outdated? https://solid.github.io/specification/protocol#storage

But we can easily update the template, so it's not a big issue to add it.

@csarven

csarven commented Aug 16, 2021

I didn't say "should". I'd say "encouraged" or "can" :) The Protocol does not require that a profile includes pim:storage.

@Vinnl
Author

Vinnl commented Aug 16, 2021

@joachimvh That sounds like a good idea. Apparently there is no good alternative, so this is what apps will look for today. Otherwise they'll only work if they happen to also implement the fallback of travelling up the tree looking for the relevant Link header, and the profile is in the same tree.

@matthieubosquet
Contributor

According to the current version of the spec the only thing that is required is the link header, so perhaps that part is outdated? https://solid.github.io/specification/protocol#storage

There is already a MUST, and clients should rely on that Link header to determine that a resource is a storage. We should discourage clients from relying on non-standard behaviour to assert the properties of a resource (for example, it being a storage).

Relying on the statements in a resource as authoritative is really problematic since there is to my knowledge no standard way of restricting what data will be written to a resource (anyone having write access to a resource X will be able to state that resource X is a pim:storage even if it isn't).

Beyond that, @Vinnl's question is about a client determining where to store data. A client should not rely on a WebID to determine where to store data because there is no requirement for a WebID to be stored in a Solid server or in the Solid server where the user might want to store data created using a particular app.


PS:

if they happen to also implement the fallback of travelling up the tree looking for the relevant Link header.

This is not a fallback. It is the standard behaviour.

@Vinnl
Author

Vinnl commented Aug 16, 2021

We should discourage clients from relying on non-standard behaviour to assert the properties of a resource

Is it non-standard if the spec explicitly says clients can do that? Besides, I would emphasise that giving no alternative is not "discouraging", it's "making impossible". If we want clients to adhere to standardised behaviour (which I agree is the ideal to strive for), then there needs to be viable standard behaviour.

The choice at this point in time is, in my view, between making sure CSS will work most of the time, or theoretical purity and hoping the ecosystem will land somewhere workable years down the road. Up to you all I think, but I guess it's clear what my preference would be :P

@matthieubosquet
Contributor

Is it non-standard if the spec explicitly says clients can do that?

Yes, it is non-standard if it relies on something that is not a MUST in the spec. That statement should probably be removed from the spec as far as I'm concerned.

I have to say that ultimately you are right about a user experience relying on manual entry of base URIs being suboptimal. Specifically because said URI would have to be re-entered every time one decides to use a new application.

However, as discussed in Gitter, by relying on this non-standard and restrictive convention, you are forcing people using your applications to publicly advertise their storages. One could think of clever tricks making part of their WebID private and pointing to that private part through an owl:sameAs statement to make this work, but I doubt your application would support that or that it would be easy for you to implement. And in any case, that would still be relying on convention, not specified behaviour.

Providing a user with the freedom to choose the base URI for storing/reading resources with a specific app seems quite fundamental, doesn't it?

By the way, it also makes it easier to enforce client based restrictions. Let's say you create a todo app and I want to sandbox it in my solid server to mysolidserver.example.com/vincent'stodoapp/, then I would just have to open app permissions on that base URI to keep it away from everything else instead of having it at the root storage level... which doesn't seem unreasonable or like a secondary requirement at all.
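The sandboxing idea above can be illustrated from the client side. This is a minimal sketch, assuming the user has given the app a base URI to work under; `isWithinBase` and the URLs are hypothetical, and the real enforcement would still be the server's access control.

```typescript
// Returns true when resourceUrl lives under baseUrl (same origin, path inside the base container).
function isWithinBase(resourceUrl: string, baseUrl: string): boolean {
  const resource = new URL(resourceUrl);
  const base = new URL(baseUrl);
  // Treat the base as a container, so "/todoapp" does not match "/todoapp-other/".
  const basePath = base.pathname.endsWith("/") ? base.pathname : base.pathname + "/";
  return resource.origin === base.origin && resource.pathname.startsWith(basePath);
}

// Hypothetical base URI chosen by the user for one app.
const appBase = "https://mysolidserver.example.com/todoapp/";
console.log(isWithinBase("https://mysolidserver.example.com/todoapp/lists/1", appBase));
console.log(isWithinBase("https://mysolidserver.example.com/private/notes", appBase));
```

Because `URL` normalises `..` segments, a path such as `/todoapp/../private/x` is compared in its resolved form and falls outside the base.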

@ianconsolata
Contributor

So what is the expected way to find the root storage for a given WebID, if this predicate is not present? This broke our app when trying to support CSS based pods, and it seems like it has broken other apps as well based on that media kraken issue and Vincent's comments in gitter. It seems like there is a clear expectation in the developer community that the pim:storage predicate is managed by the pod server and publicly exported by it.

Justin mentions in that thread that there are alternative proposals in the spec for how to solve this problem, but it's unclear to me whether any of those are supported in community solid server yet. Are they? If so, how do I use them?

If not, the question remains: how do I know where to put user data? Are we really expected to ask folks to know and input a URI string manually (as seems to be @matthieubosquet's solution)? How is someone who just signed up for a pod on a new CSS pod host supposed to know exactly which path that was just automatically created for them corresponds to their storage?

@NoelDeMartin
Contributor

@ianconsolata An alternative to pim:storage is to read parent containers from the webId document until a storage is found. This is actually what is documented in the spec:

Clients can determine the storage of a resource by moving up the URI path hierarchy until the response includes a Link header with rel="type" targeting http://www.w3.org/ns/pim/space#Storage.

That's actually what Media Kraken is doing, and as far as I know it works with CSS. If you're referring to #1138, it seems like the issue there was that the storage and the webId document were in different PODs. I'm not sure how common that scenario is, but I think if a user has enough knowledge to do that (having a webId in a different POD), they should know how to add the proper pim:storage triple themselves. In any case, it's always a good idea to add an option for advanced users to specify the location manually (Media Kraken doesn't do this yet, but I'll do it at some point).
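The "move up the URI path hierarchy" approach quoted from the spec can be sketched roughly as below. This is an illustration under assumptions, not Media Kraken's actual code: `containerChain`, `advertisesStorage`, and `findStorageRoot` are hypothetical names, the Link header parsing is deliberately naive, and the HTTP call is injected as `getLinkHeader` so the sketch does not depend on a particular client.

```typescript
const PIM_STORAGE_TYPE = "http://www.w3.org/ns/pim/space#Storage";

// Lists the containers above a resource, nearest first, ending at the origin root.
function containerChain(resourceUrl: string): string[] {
  const url = new URL(resourceUrl);
  const segments = url.pathname.split("/").filter((s) => s.length > 0);
  // A path without a trailing slash names a document; start from its parent container.
  const start = url.pathname.endsWith("/") ? segments.length : segments.length - 1;
  const chain: string[] = [];
  for (let depth = Math.max(start, 0); depth >= 0; depth--) {
    const path = segments.slice(0, depth).map((s) => s + "/").join("");
    chain.push(url.origin + "/" + path);
  }
  return chain;
}

// True if the Link header contains <...#Storage> with rel="type".
// Naive: splits on commas, so it would break on link targets that contain commas.
function advertisesStorage(linkHeader: string | null): boolean {
  if (linkHeader === null) return false;
  return linkHeader.split(",").some((part) => {
    const segments = part.split(";").map((s) => s.trim());
    if (segments[0] !== "<" + PIM_STORAGE_TYPE + ">") return false;
    return segments.slice(1).some((param) => {
      const rel = /^rel\s*=\s*"?([^"]*)"?$/i.exec(param);
      return rel !== null && rel[1].split(/\s+/).indexOf("type") !== -1;
    });
  });
}

// Walks up from the resource until a container advertises itself as a storage.
async function findStorageRoot(
  resourceUrl: string,
  getLinkHeader: (url: string) => Promise<string | null>,
): Promise<string | null> {
  for (const candidate of containerChain(resourceUrl)) {
    if (advertisesStorage(await getLinkHeader(candidate))) return candidate;
  }
  return null;
}
```

In a browser or recent Node.js, `getLinkHeader` could plausibly be `(url) => fetch(url, { method: "HEAD" }).then((r) => r.headers.get("Link"))`. The walk costs one request per path level, and it only finds a storage when the starting resource actually lives inside one.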

Also, related with this, there are some efforts going on to standardize these fields in the webId document, you may be interested to check this out: https://github.com/solid/webid-profile/

@matthieubosquet
Contributor

matthieubosquet commented May 20, 2022

In any case, it's always a good idea to add an option for advanced users to specify the location manually (Media Kraken doesn't do this yet, but I'll do it at some point).

I agree with @NoelDeMartin. The option to add storage manually should be a default in Solid Apps, if anything, because it is supposed to be decentralised and allow users to have multiple storages, some of which they may not wish to advertise publicly. They might also want to pull data from random Solid resources.

PS: Also worth mentioning that WebIDs are not necessarily hosted in a Solid server.

@ianconsolata
Contributor

ianconsolata commented May 24, 2022

So if the WebId is not hosted in a Solid server, crawling up to parent containers might never result in a pim:storage, right? Or is pim:storage not necessarily always used to denote a Solid-compliant storage space? At best, this solution results in multiple API calls every time I want to figure out where to store things for a user, and will never return to me a full list of all the storage spaces a user has. I can do this if I have to, but it doesn't seem like a very good solution to the problem.

I agree having the option to add multiple storages is a good thing, but crawling up the path making multiple requests makes implementing that a lot harder. For example, with that solution I cannot easily display a UI to the user to allow them to choose where to store something from a prefilled list, but rather must ask them to input a URI manually. After I do that, then what? Do I ask them to do that every time, or do I set it in some file somewhere? It sounds like what is being proposed here is that the list of storages should be managed and set by users / applications, which requires developers to duplicate a lot of effort.

I think the list of storage spaces is something the pod server needs to be responsible for managing and exposing to applications in a secure way. Asking users / application developers to manage a triple like this manually seems silly and error-prone when it could trivially be done by the pod host, especially given that a storage space cannot be allocated without a pod host being involved. Honestly, it seems like a resource that should never really be changed by applications, because only storage providers can declare with any authority that a given space belongs to a particular user.

I understand that the public profile document may not be the right place to expose them, but what about exposing them in a private pim:preferencesFile file, and then including a link to that document in the public webId? That way anyone with permission only has to make one additional call to figure out where to store things, and can get the full list of configured storage spaces regardless of where they are. It also provides a clearly defined space for applications / users to add new spaces, if they want to add additional choices manually.
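The proposal in the paragraph above, a public WebID that links to a private preferences file listing the storages, might look like this from a client's point of view. This is only a sketch of the idea as described, not specified behaviour: `LoadTriples` is a hypothetical loader that fetches a document with the user's credentials and parses it, and the sketch assumes the preferences file states pim:storage with the WebID as subject.

```typescript
const PIM = "http://www.w3.org/ns/pim/space#";

// Hypothetical minimal triple shape and authenticated loader.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}
type LoadTriples = (documentUrl: string) => Promise<Triple[]>;

function objectsOf(triples: Triple[], subject: string, predicate: string): string[] {
  return triples
    .filter((t) => t.subject === subject && t.predicate === predicate)
    .map((t) => t.object);
}

// Follows pim:preferencesFile from the public profile, then reads pim:storage from each file.
async function storagesViaPreferences(webId: string, load: LoadTriples): Promise<string[]> {
  const profileUrl = webId.replace(/#.*$/, "");
  const profile = await load(profileUrl);
  const storages: string[] = [];
  for (const prefsUrl of objectsOf(profile, webId, PIM + "preferencesFile")) {
    // This request is the one extra call; it fails for clients without permission.
    const prefs = await load(prefsUrl);
    for (const storage of objectsOf(prefs, webId, PIM + "storage")) {
      if (storages.indexOf(storage) === -1) storages.push(storage);
    }
  }
  return storages;
}
```

The public profile then only reveals that a preferences file exists, while the list of storages stays behind whatever access control protects that file.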

I'm also a little confused, because this document in the webid-profile repo just referenced (https://github.com/solid/webid-profile/blob/main/notes/storage.md) and this document in the spec repo that it links to (https://github.com/solid/solid-spec/blob/master/solid-webid-profiles.md) both clearly state that profile documents SHOULD expose this field. So is this something that is expected to be part of the profile document or not? It seems like some efforts to standardize profiles say it SHOULD be included, but yet it seems like folks here are stating that it should not be included because it's not part of the spec. Is the position that Community Solid Server only implements the required features, not the recommended ones? If so, why? It seems like the main open source implementation should default to the recommended settings rather than the minimum settings, and allow for removing those settings if the host desires.

@joachimvh
Member

From how I understand it, the repository that you linked (https://github.com/solid/webid-profile/blob/main/notes/storage.md) is something that was started specifically to investigate and create such a consensus on what profiles should contain, but is not finalized yet as far as I know. This issue here is actually older than that repository (and one of the causes for it to get started). Some related resources where the discussion also got started and where the people working on solving this are involved:

@NoelDeMartin
Contributor

NoelDeMartin commented May 24, 2022

So if the WebId is not hosted in a Solid server, crawling up to parent containers might never result in a pim:storage, right? Or is pim:storage not necessarily always used to denote a Solid-compliant storage space? At best, this solution results in multiple API calls every time I want to figure out where to store things for a user, and will never return to me a full list of all the storage spaces a user has. I can do this if I have to, but it doesn't seem like a very good solution to the problem.

Something you can do is that your app writes the pim:storage in the profile if it doesn't exist. In my apps, I'm doing something similar by writing solid:privateTypeIndex if it doesn't exist and that's what I use to locate containers every time instead of doing multiple API requests. One worry about this approach would be that your app could be exposing the storage location, but personally I don't think that's an issue with the current state of the protocol (you can always ask for confirmation before modifying anything as well, but I think it will be confusing for most users).

Of course, this would not work for static webIds but I don't think that's a common use-case and if anyone is doing that I'd consider that they are experienced enough with Solid to know how to configure pim:storage and solid:privateTypeIndex on their own. We just need to make sure to communicate errors clearly in our apps.
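The "write pim:storage if it doesn't exist" approach could be sketched as building a SPARQL Update body for a PATCH on the profile document. This is a hedged illustration: `storagePatchIfMissing` and `assertIri` are hypothetical helpers, and it assumes the server accepts `application/sparql-update` patches and that the app has write access to the profile.

```typescript
const PIM_STORAGE = "http://www.w3.org/ns/pim/space#storage";

// Rejects characters that may not appear inside <...> in SPARQL,
// so a crafted URL cannot smuggle extra statements into the patch.
function assertIri(value: string): string {
  if (/[\s<>"{}|\\^`]/.test(value)) throw new Error("Not a valid IRI: " + value);
  return value;
}

// Returns a SPARQL Update body, or null when the profile already declares a storage.
function storagePatchIfMissing(
  webId: string,
  knownStorages: string[],
  storageUrl: string,
): string | null {
  if (knownStorages.length > 0) return null;
  return `INSERT DATA { <${assertIri(webId)}> <${PIM_STORAGE}> <${assertIri(storageUrl)}> . }`;
}

// Example with made-up URLs: the profile has no storage yet, so a patch is produced.
console.log(
  storagePatchIfMissing("https://pod.example/alice/profile/card#me", [], "https://pod.example/alice/"),
);
```

As noted in the comment above, writing this triple makes the storage location public, so an app may want to ask for confirmation before sending the patch.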

I think the list of storage spaces is something the pod server needs to be responsible for managing and exposing to applications in a secure way. Asking users / application developers to manage a triple like this manually seems silly and error-prone when it could trivially be done by the pod host, especially given that a storage space cannot be allocated without a pod host being involved. Honestly, it seems like a resource that should never really be changed by applications, because only storage providers can declare with any authority that a given space belongs to a particular user.

I agree with most of that, but I think you should open an issue in the specification repo, not here (actually, it seems like there is already one: solid/specification#310).

I'm also a little confused, because this document in the webid-profile repo just referenced (https://github.com/solid/webid-profile/blob/main/notes/storage.md) and this document in the spec repo that it links to (https://github.com/solid/solid-spec/blob/master/solid-webid-profiles.md) both clearly state that profile documents SHOULD expose this field.

The first repo is an ongoing effort to document what applications are currently using; technically it's not part of the spec yet (I don't know if it ever will be). The second repo has been deprecated and archived; the Solid spec is here: https://github.com/solid/specification. In any case, as you mentioned, both say SHOULD, so as long as it doesn't say MUST you can't rely on a pod provider implementing it.

Is the position that Community Solid Server only implements the required features, not the recommended ones? If so, why? It seems like the main open source implementation should default to the recommended settings rather than the minimum settings, and allow for removing those settings if the host desires.

I'd be interested to hear an official position about this as well :).

@joachimvh
Member

Is the position that Community Solid Server only implements the required features, not the recommended ones? If so, why? It seems like the main open source implementation should default to the recommended settings rather than the minimum settings, and allow for removing those settings if the host desires.

I'd be interested to hear an official position about this as well :).

I posted a reply in a semi-similar discussion with my opinion of how this server relates to the spec, specifically the first section: #233 (comment). @RubenVerborgh gave it a thumbs up so I guess that can be seen as an official reply 😄.

The summary is that CSS follows the official spec (or at least tries to), but is also open for potential features that are not strictly defined but can be used to push specifications forward. Of course the last part is a bit fuzzy and can't be strictly defined. On the one hand we don't want to fill the server with a bunch of features that are only vaguely Solid related, but on the other hand we also don't want to block things off.

It also depends on what we are focussing on and how much time/effort is available. All of the specs are still evolving, meaning that each new one that gets added also requires the additional effort of maintaining it. But we do want to add them, it just sometimes takes time. E.g., the Notifications spec is something that has been looked into but has not fully been added yet.

Things like this are one of the reasons why we made the server so modular. In the end this repository is just a bunch of building blocks we provide that can be used to set up a Solid server. If you want a Solid server with an extra feature that is not provided by this repository, you can create it and inject it through Components.js in an external dependency. This is something that has already been done by different parties, e.g. #1154 (comment). Specifically for the pim:storage triple, it requires adding 1 line to the pod templates provided here.

I can't tell you explicitly why that triple wasn't added originally since it has been a while, perhaps because the spec now had this new way of finding the root of a pod, or simply because it was not explicitly defined somewhere. Or even to explicitly differentiate from existing assumptions to reach consensus on what is expected.

I do think it's important to not have features that are required by Solid clients but are not explicitly defined somewhere. Otherwise you end up in a situation where it is harder to both make a client or a server as you can't know what is available or needed.

In the case of the pim:storage triple, it is clear from the discussions above and the linked issue that the problem is not a lack of potential solutions, where CSS could have been used to implement a suggestion, but more a lack of consensus on what solution is needed, which I don't think would be helped by CSS randomly implementing one of the available suggestions since it is already clear what is missing.

I hope I covered most of the relevant points. Summary is that we want CSS to be something people can use to experiment and play around with, and we want to (try to) do it in such a way that it doesn't break the Solid ecosystem.

@matthieubosquet
Contributor

matthieubosquet commented May 24, 2022

Maybe of interest in terms of storage discovery, the WebID Profile group just summarized their current thinking on the subject: https://github.com/solid/webid-profile/blob/main/notes/pre-final-draft.md

@ianconsolata
Contributor

Ok, I'm a little unclear on which of those specifications needs to include the MUST statement for CSS to support it. For now, I just added it to the issue in the main Solid spec repo.

In the meantime, I'd like to do this:

Specifically for the pim:storage triple, it requires adding 1 line to the pod templates provided here.

Is there a documented list somewhere of the values passed into those template files? I'd be happy to add that one line to our server, but I don't actually know the value for the storage, nor which variable (if any) in the template file exposes it.

@ianconsolata
Contributor

It looks like those values might be coming from here (https://github.com/CommunitySolidServer/CommunitySolidServer/blob/main/src/pods/settings/PodSettings.ts), but that doesn't include storage so I am not sure what 1 line change you are referring to.

@joachimvh
Member

That interface is not fully accurate (and has some issues that still need to be resolved).

The field you're looking for is podBaseUrl, which will point to the root of the pod.

If you don't want the storage to be in the root of the pod you can append a container to that path in the template.
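To make the "1 line" concrete, here is a toy sketch of what such a template line could produce. It is not the server's real template engine or template file: `renderTemplate` and the template string are hypothetical stand-ins, and the only names taken from this thread are the `podBaseUrl` field and the assumption that a `webId` value is available alongside it.

```typescript
// Hypothetical one-line profile addition, written with {{placeholders}}.
const STORAGE_LINE = "<{{webId}}> <http://www.w3.org/ns/pim/space#storage> <{{podBaseUrl}}>.";

// Toy placeholder substitution; throws if a referenced value is missing.
function renderTemplate(template: string, values: { [key: string]: string }): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error("Missing template value: " + key);
    }
    return values[key];
  });
}

// Example values (made up): the storage is the root of the pod.
console.log(
  renderTemplate(STORAGE_LINE, {
    webId: "https://pod.example/alice/profile/card#me",
    podBaseUrl: "https://pod.example/alice/",
  }),
);
```

Pointing the storage at a sub-container instead would only change the object of the triple, for example `<{{podBaseUrl}}data/>`.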

dj-sf pushed a commit to djsf-kobayashi/penny that referenced this issue Sep 30, 2023
Unfortunately, CSS does not set a pim:storage to the user's WebID
by default:
CommunitySolidServer/CommunitySolidServer#910

In that case, we make a bunch of additional HTTP requests to parent
URLs to find one that advertises itself as a storage root.
@elf-pavlik
Contributor

If the WebID profile is hosted in a Solid storage, information about that storage can be found in the storage description:

Servers MUST include the Link header with rel="http://www.w3.org/ns/solid/terms#storageDescription" targeting the URI of the storage description resource in the response of HTTP GET, HEAD and OPTIONS requests targeting a resource in a storage.

IMO users should always opt in to having their storage listed in their public WebID document.
To respect the user's privacy any discovery of protected resources, which aren't meant to be publicly indexed and accessible, should require prior authorization.
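Finding the storage description from the Link header quoted above can be sketched like this. It is a simplified parser for illustration, not a complete RFC 8288 implementation; `linkTargets` and the example URLs are hypothetical, and only the rel value comes from the quoted requirement.

```typescript
const STORAGE_DESCRIPTION_REL = "http://www.w3.org/ns/solid/terms#storageDescription";

// Returns the targets of all links in a Link header that carry the given rel,
// resolved against the URL of the resource the header came from.
function linkTargets(linkHeader: string, rel: string, requestUrl: string): string[] {
  const targets: string[] = [];
  // Each entry looks like: <target>; param=value; ...
  const entry = /<([^>]*)>([^,]*)/g;
  let match: RegExpExecArray | null;
  while ((match = entry.exec(linkHeader)) !== null) {
    const relParam = /;\s*rel\s*=\s*"?([^";]*)"?/i.exec(match[2]);
    if (relParam !== null && relParam[1].split(/\s+/).indexOf(rel) !== -1) {
      targets.push(new URL(match[1], requestUrl).href);
    }
  }
  return targets;
}

// Example header (made up) as it might be returned for a resource inside a storage.
const exampleHeader =
  '<http://www.w3.org/ns/ldp#Resource>; rel="type", ' +
  '<https://pod.example/alice/.well-known/solid>; rel="' + STORAGE_DESCRIPTION_REL + '"';
console.log(
  linkTargets(exampleHeader, STORAGE_DESCRIPTION_REL, "https://pod.example/alice/profile/card"),
);
```

A client would then fetch the returned description resource to learn about the storage, which again only helps when the WebID profile itself is hosted inside that storage.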

@csarven

csarven commented Dec 18, 2023

solid/specification#607
