Implement/ensure Portability of user data #285
Comments
Perhaps the phrasing should be about interoperability? Two apps that use different RDF vocabularies are not interoperable, by default. With Solid, there are other things that break interop, too, like container layout and data shapes and permissions and inbox use.
Interoperability is for a different issue. (In general, interop is a much broader issue, and is an overall aim of our project.) Portability specifically means that "the user can take their data elsewhere", whenever they wish. I updated the issue description to clarify.
Ah, portability between pods instead of portability between applications. Sure. I'd suggest phrasing the issue in terms of user experience, not technologies. I think it will emerge from that framing that the heart of this issue is actually the redirection mechanism (of which 301 is only one option). Saying that's optional is a bit like saying freedom of the press is optional. It's only a problem when you need it.
Saying it's optional recognizes the fact that maintaining 301 redirects by the old server, when the user is no longer a paying customer and has moved on, is a policy decision on the part of the service provider. Hence a strong recommendation, and not a spec-level requirement. As for phrasing -- the specific phrasing here is in terms of server capabilities. As in, this is a placeholder/reminder for us to implement those features.
I don't think this is a feature that should be advertised too heavily, as it creates a false expectation. Moving links on the web is always going to be hard. We can help a bit, but never fully solve the problem, nor should we try to. If you want to guarantee portability, don't use the web. Use another scheme. The web's strength isn't portability, it's stability, which leads to a giant network effect. You can't have both. Users should instead be encouraged to choose URIs well, with long-term expectations. I've seen too many projects get hung up on this issue, worry about redirects, invent new URI schemes and end up losing their value proposition, for marginal or negligible gain. Let's do what we can, but bear in mind the diminishing returns inevitable with this difficult issue.
From what I understand, one of the core goals of both Solid and the Crosscloud Project is to enable portability of user data on the web.
Use case: Alice uses databox.me as her pod provider for a couple of years, at alice.databox.me. Then she decides she wants to switch to her own domain. Maybe databox.me gets sold to some company she doesn't like. Maybe they have a security breach that concerns her. How can she move without significant pain? The current designs for Solid basically leave her stuck, as far as I can tell. At absolute minimum, she has to be able to set up 301 redirects from everywhere in alice.databox.me. But that still allows databox's owners to see and influence lots about her visitors. On the current web, no one actually changes links because of a 301, so software might be relying on those redirects for many years, and every time giving the now-untrustworthy databox.me all their request headers, possibly including their webid. And what if it goes down? For some years, I've been calling this problem "subdomain portability". Are there other data portability issues?
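To make the "301 redirects from everywhere in alice.databox.me" requirement concrete, here is a minimal sketch of how an old server might compute the `Location` value for each request. The function name `redirect_target` and the new host `alice.example` are hypothetical, chosen only for illustration; nothing in this thread specifies how a Solid server would actually do this.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def redirect_target(request_url: str, old_host: str, new_host: str) -> Optional[str]:
    """Return a Location value for a 301 response, or None if the URL is not on old_host.

    Hypothetical helper: the same path and query are assumed to exist on the new host.
    """
    parts = urlsplit(request_url)
    if parts.hostname != old_host:
        return None
    # Keep scheme, path, and query; swap only the host.
    # Fragments are never sent to the server, so none is emitted here.
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, ""))


if __name__ == "__main__":
    print(redirect_target("https://alice.databox.me/profile/card",
                          "alice.databox.me", "alice.example"))
```

Note that this only covers the mechanics. Every redirected request still reaches the old host first, which is exactly the tracking and availability concern raised above.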
It's quite interesting that Alice expects a company that she doesn't like to give her a free service after she leaves. As an aside: that company she doesn't like can also track her activity. In general, links out are relatively portable, and links in (when not relative, i.e. cross-origin) are relatively hard to change. I think we could mathematically prove that the open world assumption makes this an intractable problem. It's important to note that you'll never solve every use case, though we can get some.
(I removed my previous comments, given that they make no sense now that the initial post has been filled up.)
@bblfish +1
This does not look like the right way to do things, but as a data point: WebDAV Redirect Reference Resources
It seems to me that if one is to edit the headers of a resource, it may be better to have a Link header to a resource that contains those headers that it makes sense to edit, and allow those to be edited. Perhaps Link: <doc,meta>; rel="meta". Then, with an ontology of headers, a client could just PUT or PATCH that resource with the right vocab, so that it read: <doc> redirect <http://other.server/doc> . This clearly needs a lot more thinking about.
I think there's a privacy issue here. The user experience may be apparently seamless in many cases, and transparent to the user. But should the user be informed that their previous provider will continue to be able to track aspects of their social activity, even after they have moved (and in some cases indefinitely)?
Longer term, the key is to have Solid software implement permanent redirects properly, changing the referrer, and probably checking all links at least once every 30 days. That way, traffic to the old URLs should drop off quickly to just the URLs humans are still using. Maybe after 30 days or so, HTML requests can switch to a page which explains the old URL is going away soon.
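As a rough illustration of the client behaviour described above, the sketch below shows the two decisions involved: whether a stored link is due for a re-check, and what the stored URL should become once a response comes back. The names `needs_recheck` and `updated_link` are hypothetical, and the 30-day interval is simply the figure suggested in the comment, not a specified value.

```python
from datetime import datetime, timedelta
from typing import Optional
from urllib.parse import urljoin

PERMANENT_REDIRECTS = {301, 308}          # the two permanent redirect status codes
RECHECK_INTERVAL = timedelta(days=30)     # assumption: the "every 30 days" suggestion


def needs_recheck(last_checked: datetime, now: datetime) -> bool:
    """True when the link has not been verified within the re-check interval."""
    return now - last_checked >= RECHECK_INTERVAL


def updated_link(stored_url: str, status: int, location: Optional[str]) -> str:
    """Return the URL to store after one request to stored_url.

    Only permanent redirects change the stored link; a temporary
    redirect (302, 307) or any other status leaves it untouched.
    """
    if status in PERMANENT_REDIRECTS and location:
        # Location may be relative, so resolve it against the old URL.
        return urljoin(stored_url, location)
    return stored_url
```

A client would run `needs_recheck` over its stored links, issue a request for each one that is due, and persist whatever `updated_link` returns. The network call itself is deliberately left out of this sketch.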
Thanks @bblfish, it was looking pretty good until the XML :-)
I assume you mean rel=describedBy when you say rel=meta
I don't think rel=describedBy works for linking to a graph of link header triples. The describedBy graph is clearly under the control of applications. It's where you put the metadata about a JPEG, for instance, in LDP. The headers, on the other hand, are meant to be communication to/from the server. I can't find any specs about the meaning of link headers on PUT or PATCH. How about we go with the semantics that PUT resets (removes) any links the client can control and PATCH leaves them intact; in both cases the provided ones are then added. That doesn't work for my redirect suggestion though, since the PUT would be redirected, not clear the redirection header. Hmm. Acl is a link rel the client is not allowed to affect. Type is a link rel the client is allowed to affect. The use cases are pretty varied and I don't know some of them. So, my revised proposal is that links are added by using them on POST/PUT/PATCH and removed by pointing them at some kind of magic flag value, or something like that.
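For reference, the first set of semantics floated above (PUT resets client-controllable links, PATCH keeps them, and in both cases the provided links are added) could be sketched as follows. This is only an illustration of that earlier idea, not of the revised "magic flag value" proposal. The function `apply_links` and the set `CLIENT_CONTROLLED` are hypothetical; the only facts taken from the comment are that `type` is client-controllable and `acl` is not.

```python
from typing import Set, Tuple

Link = Tuple[str, str]  # (rel, target URI)

# Assumption: only "type" is listed here because it is the one rel the
# comment names as client-controllable; "acl" is deliberately absent.
CLIENT_CONTROLLED = {"type"}


def apply_links(method: str, existing: Set[Link], provided: Set[Link]) -> Set[Link]:
    """Return the resource's links after a PUT or PATCH carrying Link headers."""
    if method == "PUT":
        # PUT resets every link the client is allowed to control.
        kept = {(rel, target) for rel, target in existing if rel not in CLIENT_CONTROLLED}
    elif method == "PATCH":
        # PATCH leaves all existing links intact.
        kept = set(existing)
    else:
        raise ValueError(f"unsupported method: {method}")
    # In both cases the provided links are added, ignoring rels the client may not set.
    allowed = {(rel, target) for rel, target in provided if rel in CLIENT_CONTROLLED}
    return kept | allowed
```

As the comment itself notes, this scheme cannot clear a redirect, because the PUT that was meant to remove it would itself be redirected.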
So are you proposing that the protocol will potentially change my Turtle files if I link somewhere? Will the linked-to resource need to have an entry in the ACL to do this?
The specs for the permanent redirects, 301 and 308, say that when you get them, you're supposed to change the referring URL if you can. I'm suggesting:
I think this works. It relies on the idea that when you put something in RDF, you're actually stating what the RDF states, so you shouldn't mind if it gets rewritten into another form that means the same thing. This is why Skolemization and triple-reordering are okay.
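A minimal sketch of the kind of rewrite being discussed: once a document URL is known to have moved permanently, every IRI that points into that document is replaced by the corresponding IRI in the new document, with fragments preserved. This is an illustration under simplifying assumptions, not the proposal itself: triples are modelled as plain tuples of IRI strings (a real implementation would use an RDF library and touch only IRI terms, never literals), and `rewrite_iri`, `rewrite_triples`, and the host `alice.example` are hypothetical names.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urldefrag

Triple = Tuple[str, str, str]  # assumption: every term is an IRI given as a string


def rewrite_iri(term: str, old_doc: str, new_doc: str) -> str:
    """Map an IRI inside old_doc to the same fragment inside new_doc."""
    doc, fragment = urldefrag(term)
    if doc != old_doc:
        return term
    return new_doc + ("#" + fragment if fragment else "")


def rewrite_triples(triples: Iterable[Triple], old_doc: str, new_doc: str) -> List[Triple]:
    """Apply rewrite_iri to subject, predicate, and object of every triple."""
    return [
        (rewrite_iri(s, old_doc, new_doc),
         rewrite_iri(p, old_doc, new_doc),
         rewrite_iri(o, old_doc, new_doc))
        for s, p, o in triples
    ]
```

Strictly speaking, this rewrite preserves meaning only on the assumption that the old and new IRIs denote the same thing, which is what the permanent redirect is being taken to signal.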
(I.e., this is something we can do with RDF but not other data formats.)
Thanks for pointing this out. Good info.
Is it worth sketching this protocol out as a proposal?
I see advantages, particularly in the general case of giving users more control over return codes. But I'm cautious about adding this complexity at this point to the servers, which are supposed to be as dumb as possible (though I'm not saying it's wrong). Anecdotal evidence over the years has shown redirects to be a pain and to add complexity (both to client and server). Ultimately it's a problem that can never be fully solved, because a provider has no obligation to offer a service that costs them resources if a customer has terminated their relationship with them. Neither do links in have to comply with automatically changing; some will not want to do that, and can never be forced to. It's an interesting problem to think about, though.
I'm leaning toward the notification protocol just being a PATCH. I'd like to add a flag suggesting it's because of redirection, but I can't think of a good way to do that. Ideas? A weird hack might be you set the webid to https://www.w3.org/2016/anonymous-link-corrector . I agree it's complexity, and I don't think we need it right away, but I think we should have it in mind as possibly necessary for the ecosystem. True, the problem can't be 100% solved, but I think it can be sufficiently mitigated, just like when people move to another house. Once in a while they'll lose an important connection, but no one feels trapped, unable to relocate, because they might miss some mail. This proposal doesn't rely on providers being nice. You keep paying your old provider during the redirection period, which is likely to be one month, although you watch your traffic dropping off and decide for yourself when it's low enough to stop paying. In fact, I think the baseline RTOS should allow customers to take subdomains with them (paying a nominal service fee), so this stuff isn't needed just because you want to switch providers. So this isn't about service providers, it's about branding and image. Maybe you got alice.databox.me and you don't want to be associated with databox.me any more, in the minds of your contacts. Or maybe you've been using cleverkid251.com and decided that name doesn't fit you any more. People need to be able to change names.
In the proposal I think there's one obvious method overlooked that is actually optimal. As part of your storage you can either
Many web hosters do this already. The web has a well-deployed redirection system, namely DNS. History has shown us that almost all projects that try to replicate this in some way have failed to get traction. Let's not be one of them! If someone cares a lot about moving, this is the tried and tested way to do it. As soon as an HTTP URI is shared in the global namespace, it starts to accrue reputation and value. In an open world assumption, moving it will destroy some of that reputation and value. We should not be trying to tell the user otherwise, as I feel that is misleading.
I wouldn't want to use a system where I was stuck with one public persistent identification string for the rest of my life, or longer. Rebranding will never be trivial, as you point out, but it needs to be seen as possible, and it would be great if we can make our part of it relatively painless.
This issue is to discuss possible ways to enable portability of user data.
(Originally opened by @nicola)
(Moved from solid/solid-spec#72)
Portability specifically means that "the user can take their data elsewhere", whenever they wish. It is a combination of the following features: