Why MUST /foo and /foo/ be the same ? #242

Closed
bblfish opened this issue Feb 24, 2021 · 10 comments

Comments

@bblfish
Member

bblfish commented Feb 24, 2021

In the spec's § URI slash semantics I read:

If two URIs differ only in the trailing slash, and the server has associated a resource with one of them, then the other URI MUST NOT correspond to another resource. Instead, the server MAY respond to requests for the latter URI with a 301 redirect to the former.
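
To make the quoted requirement concrete, here is a minimal sketch in Python. The names `toggle_slash`, `resolve`, and `create`, and the in-memory set of paths, are assumptions for illustration only; they are not taken from the spec or from any Solid server.

```python
from typing import Optional, Set, Tuple


def toggle_slash(path: str) -> str:
    """Return the sibling path that differs only in the trailing slash."""
    return path[:-1] if path.endswith("/") else path + "/"


def resolve(path: str, resources: Set[str]) -> Tuple[int, Optional[str]]:
    """Hypothetical lookup returning (status, Location header or None)."""
    if path in resources:
        return 200, None
    sibling = toggle_slash(path)
    if sibling in resources:
        # The spec only says MAY here: answering 404 would also conform.
        return 301, sibling
    return 404, None


def create(path: str, resources: Set[str]) -> bool:
    """Refuse to create a resource whose slash-sibling already exists (the MUST NOT)."""
    if path in resources or toggle_slash(path) in resources:
        return False
    resources.add(path)
    return True


resources = {"/foo/"}
print(resolve("/foo", resources))   # redirect to the existing container
print(create("/foo", resources))    # rejected, because /foo/ already exists
```

The point of the sketch is that the MUST NOT bites at creation time, while the 301 is an optional convenience at lookup time.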

I think restrictions using MUST must have very good justifications, or else one is closing doors unnecessarily.

This restriction seems unnecessary and feels confusing. As shown in the predecessor to this issue, Node Solid Server which was widely deployed behaves differently. Did that cause problems? What problems?

There is of course no problem if servers do behave the way described. But why could there not be good use cases for them behaving differently?

Here are some use cases I can think of

/Ulysses as an external view to /Ulysses/

Let us imagine that /Ulysses/ is protected by the publisher, but the publisher would like people to know what the contents (the cover) of /Ulysses/ are. Then /Ulysses could return the cover information, price, age restrictions or whatever in human-readable form, i.e. /Ulysses could content-negotiate to /Ulysses.html.

Tar files

Perhaps we want a way to upload tar files with a POST in one go. So a request to /cats content-negotiates to /cats.tar, but a request on /cats/ could show the (read-only) contents of the tar file.

Blog with picture content

Someone writes a blog at /2021/04/01 quickly telling a story of what happened to them that day. Later that person wants to add pictures. Why not put those in /2021/04/01/ while keeping /2021/04/01 and /2021/04/01.html for the blog post ?
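
For this use case it may help to see that, under standard relative-reference resolution, the two URIs already act as different bases. A small sketch using Python's `urllib.parse.urljoin`; the host `blog.example` and the file name `photo.jpg` are made up for illustration.

```python
from urllib.parse import urljoin

post = "https://blog.example/2021/04/01"
container = "https://blog.example/2021/04/01/"

# A relative reference replaces the last path segment of its base,
# so the same reference lands in different places for the two bases.
from_post = urljoin(post, "photo.jpg")
from_container = urljoin(container, "photo.jpg")
from_post_explicit = urljoin(post, "01/photo.jpg")

print(from_post)           # https://blog.example/2021/04/photo.jpg
print(from_container)      # https://blog.example/2021/04/01/photo.jpg
print(from_post_explicit)  # https://blog.example/2021/04/01/photo.jpg
```

So a post served at /2021/04/01 would have to reference its pictures as `01/photo.jpg`, whereas a document inside /2021/04/01/ could just say `photo.jpg`.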

These are three use cases that occur to me. Perhaps there is a good reason to exclude them?

Addendum

Furthermore the next sentence in § URI slash semantics goes on to say

Instead, the server MAY respond to requests for the latter URI with a 301 redirect to the former.

So the Resources MUST be the same but they MAY redirect to one another. What is the sense of "the same" that we are speaking of here?

@bblfish changed the title from "Why MUST one of /foo and /foo/ be the same ?" to "Why MUST /foo and /foo/ be the same ?" on Feb 24, 2021
@bblfish
Member Author

bblfish commented Feb 24, 2021

My guess is that the redirect from /foo to /foo/ is just a good default to have when people are entering URLs by hand. In that case I suggest replacing the text with a best practice guideline instead:

If two URIs differ only in the trailing slash, and the server has associated a resource with only one of them, then the server should by default redirect on a GET from the non-existent resource to the other using a 303.
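
As a rough illustration of how this suggested default differs from the current MUST, here is a sketch in Python. The function name `resolve_default` and the set-of-paths model are hypothetical; the only behaviour taken from the text above is "303 on GET when only the sibling exists".

```python
from typing import Optional, Set, Tuple


def resolve_default(method: str, path: str,
                    resources: Set[str]) -> Tuple[int, Optional[str]]:
    """Hypothetical lookup under the suggested best-practice wording."""
    if path in resources:
        # Both /foo and /foo/ may exist and be served independently.
        return 200, None
    sibling = path[:-1] if path.endswith("/") else path + "/"
    if method == "GET" and sibling in resources:
        # Default convenience for hand-typed URLs, not a requirement.
        return 303, sibling
    return 404, None


both = {"/Ulysses", "/Ulysses/"}
only_container = {"/foo/"}
print(resolve_default("GET", "/Ulysses", both))
print(resolve_default("GET", "/foo", only_container))
print(resolve_default("PUT", "/foo", only_container))
```

With this wording nothing stops a server from creating /foo later, at which point the redirect simply stops being issued.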

That allows the other use cases to be built where it makes sense, but helps guide server authors to good defaults.

@mwherman2000

mwherman2000 commented Feb 24, 2021

I don't know what the Solid project's view is towards the use of arbitrary DIDs (decentralized identifiers) and DID method spaces for identifying and locating resources, but, in that world, two different identifiers (strings) that differ only in a trailing slash would still be considered to be identifiers for two distinct Subjects (aka resources).

Something to consider...

@TallTed
Contributor

TallTed commented Feb 24, 2021

Maybe sticking with one thread on this apparently slippery topic makes some sense? At least, linking back from each newer thread to its precursor will help new readers catch up on this round of this undying thread (not quite, but coming close to, httpRange-14).


First thing, please note that this is not about the generic question of whether 2 URIs that differ only in trailing slash identify different resources. This is about the specific question of how *SS handles requests for two such URIs within its purview.

Please note that the question of the /foo->/foo/ (and /foo/->/foo/index.ttl or ->/foo/index.html or ...) redirect has been discussed multiple times (each for a long time), and that at least the latter is a configurable option in apache and nginx and similar, for which *SS is trying (in some if not all ways) to be a substitute.

Decisions were made about how *SS should behave, and those decisions were eventually enshrined in (what I hope is the current version of) the spec (which may or may not be aligned with current versions of *SS). These decisions were explicitly not about any software or general principle other than *SS.

@bblfish
Member Author

bblfish commented Feb 24, 2021

@TallTed I clarified the link to the previous issue above. I opened a new one as I previously confused a couple of issues and it did lead to a minor PR. So I thought I would try to summarize the argument cleanly.

Note: I think the defaults are good as defaults, and that is why I suggested the text above. They don't seem to be needed as requirements, though. But I may very well be missing something.

@bourgeoa
Member

@bblfish

Node Solid Server which was widely deployed behaves differently.

This used to be true and is not anymore.

Did that cause problems? What problems?

That indeed caused problems, mainly seen when listing the parent folder content using mashlib or any editor that does not display the / as part of the name.
The semantics of / do not seem to imply that the folder name includes a / in POSIX or Windows.
I think we are back to Tim's objective to maintain a high compatibility with how existing filesystems are viewed both by humans and machines.
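
The filesystem side of this can be checked with Python's `pathlib`, used here purely as a convenient model of POSIX and Windows path semantics (the path `/foo` is just an example):

```python
from pathlib import PurePosixPath, PureWindowsPath

with_slash = PurePosixPath("/foo/")
without_slash = PurePosixPath("/foo")

# The trailing separator is not part of the name: both spell the same path.
same_path = with_slash == without_slash
posix_name = with_slash.name
windows_name = PureWindowsPath("C:\\foo\\").name

print(same_path)     # True
print(posix_name)    # foo
print(windows_name)  # foo
```

In other words, a client that thinks in filesystem terms has no slot in which to store the difference between /foo and /foo/.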

@bblfish
Member Author

bblfish commented Feb 27, 2021

Thanks for the feedback, @bourgeoa.
Ah, I see mashlib is a Solid desktop project which TimBL presented at TED a while ago.

It would be nice to understand the problem more precisely. Perhaps there is a link to where the issue was discussed?

I am not quite clear what is meant by Editor. Is that something like TextEdit or Vim? Or is it an editor using mashlib?

If it is a problem with desktop clients then I can see that they would have problems already with content negotiation.
For example in order to build good RDF one wants a file /card.ttl to be able to link to </group#g1>, whose content is in </group.ttl> on the web so that one does not tie oneself to one representation. This is discussed in Cool URLs for the Semantic Web, and there was a recent thread on this File System to Solid Mapping.
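
As a sketch of what that link resolves to, Python's `urllib.parse` can be used; the host `alice.example` is an assumption, since the paths above are given without one.

```python
from urllib.parse import urldefrag, urljoin

base = "https://alice.example/card.ttl"

# </group#g1> is resolved against the document that contains it.
target = urljoin(base, "/group#g1")

# The fragment is stripped before the request is made, so the client
# asks for /group and leaves the choice of representation to the server.
document, fragment = urldefrag(target)

print(target)    # https://alice.example/group#g1
print(document)  # https://alice.example/group
print(fragment)  # g1
```

Nothing in the resolved URI mentions `.ttl`; which file answers for /group is up to content negotiation on the server.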

But perhaps the problem comes when one wants to write a FS view of Solid, in order to retrofit existing applications? Perhaps then one could not distinguish a file system path for /cat from one for /cat/? And indeed content negotiation would be difficult to implement there too... But then perhaps the file system analogy is being pushed a bit too far. I am all for thinking of Solid as an extension of the concept of file systems, indeed mostly as file systems done right, but we should not need to be constrained that much by older technologies.

@TallTed
Contributor

TallTed commented Feb 27, 2021

@bblfish - Challenges abound. *SolidServers seek to present a filesystem-like interface via the web, while also permitting actual filesystem interaction with the same data & resources, while doing other things. Some of these things are actively in competition/contradiction, like the /foo/ convention for http directories and the /foo convention for filesystem directories. Permissions are another mishmash, with filesystems mostly being Read/Write/Execute via Users/Groups, and Solid enabling lots more granularity, and questions about filesystem owner vs Solid admin vs Solid user.
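
A quick way to see the /foo versus /foo/ clash on an ordinary filesystem, sketched in Python with a throwaway temporary directory (the name `foo` and the file contents are arbitrary):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "foo")

    # Store the /foo resource as a plain file.
    with open(path, "w") as f:
        f.write("the /foo resource\n")

    # Now try to back /foo/ with a directory of the same name.
    try:
        os.mkdir(path)
        collided = False
    except FileExistsError:
        collided = True

print(collided)  # True
```

A server that maps URLs one-to-one onto paths therefore needs some extra naming convention if it wants /foo and /foo/ to be distinct resources.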

@bblfish
Member Author

bblfish commented Feb 27, 2021

Yes, I am currently implementing it as a mapping to the file system, so I am grappling with these issues. I'll report back on my findings...

@bblfish
Member Author

bblfish commented Mar 15, 2021

I am finding that my implementation is following the redirects as specified currently, as it is easier to do that way.
So this issue can be closed if cleanup of issues is needed. I may be able to come back to it at a later date.

@csarven
Member

csarven commented Mar 16, 2021

Will close the issue with the understanding that the requirements in the specification are currently deemed to be useful and implemented/able - or at least do not pose a problem as it stands. If there are differences of opinion in the design or wording, we can revisit.
