Address the possibility of using both file paths and URIs #12

Open
prathe opened this Issue Jul 13, 2012 · 8 comments

2 participants

@prathe

especially since filenames can be taken as relative URIs from the original context.

File systems are less restrictive than RFC 3986 regarding the accepted character set and the set of reserved characters. After a bit of thinking, I don't see how one can be seen as a superset of the other. Some URI parsers would not be able to match a file path, and some file path parsers would not be able to match a URI.
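To illustrate the mismatch, here is a minimal sketch using Python's standard `urllib.parse`. The helper name `uri_reference_path` and the sample names are only assumptions for illustration; the point is that the same string can name one file on disk but a different path once it is read as a URI reference.

```python
from urllib.parse import urlsplit, unquote


def uri_reference_path(reference):
    """Return the path a URI parser would see in `reference`.

    '#' starts a fragment, '?' starts a query, and percent-escapes are
    decoded, so the result can differ from the literal file name.
    """
    return unquote(urlsplit(reference).path)


# A file system would treat each of these strings as a literal name.
for name in ("src/main.c", "report#2.txt", "a%20b.txt"):
    print(repr(name), "->", repr(uri_reference_path(name)))
```

Only the first name survives unchanged; the other two are legal file names on most file systems but mean something else as URI references.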

The algorithm that determines the context (the base directory or base URI) may need to be different if the document uses URI references instead of file paths. So I guess the use of relative URI references must be specified explicitly, because the two are not distinguishable: some file paths are also valid relative URI references.
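A small sketch of why the base matters, again in Python. The function names and the `example.org` base are assumptions made up for this example. Joining against a base directory appends to it, while RFC 3986 resolution replaces the last segment of the base URI unless it ends with a slash.

```python
import posixpath
from urllib.parse import urljoin


def resolve_as_path(base_dir, reference):
    """Resolve `reference` the way a file path consumer would."""
    return posixpath.normpath(posixpath.join(base_dir, reference))


def resolve_as_uri(base_uri, reference):
    """Resolve `reference` as a relative URI reference against `base_uri`."""
    return urljoin(base_uri, reference)


# The same reference, "x.txt", lands in different places.
print(resolve_as_path("/a/b", "x.txt"))
print(resolve_as_uri("http://example.org/a/b", "x.txt"))
```

With the base `/a/b`, the path reading keeps `b` as a directory, whereas the URI reading treats `b` as the document being replaced.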

@jonm
Owner

Ok, this is a fair point. I suspect the right way to go here is to revert and standardize the current format on POSIX filenames, if we can find a definition for that. That way we can at least capture the current diff output.

@prathe

This is what I'm thinking at the moment:

There should be a spec for the latest version, which is POSIX.1-2008. But it looks quite restrictive, such as allowing no more than 14 characters, drawn from:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789._-
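As a rough sketch, the restriction described above could be checked like this in Python. The function name is hypothetical, and the 14-character limit is taken from the reading in this comment rather than verified against the specification text.

```python
import re

# Letters, digits, period, underscore and hyphen, as listed above.
# The 1-14 length bound is an assumption based on this comment.
PORTABLE_NAME = re.compile(r"[A-Za-z0-9._-]{1,14}")


def is_portable_filename(name):
    """Return True if `name` uses only the listed characters and fits the limit."""
    return PORTABLE_NAME.fullmatch(name) is not None


for candidate in ("README", "my file.txt", "a-very-long-file-name.txt"):
    print(candidate, is_portable_filename(candidate))
```

This checks a single path component only; a full path would need to be split on `/` first.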

text/diff could specify nothing about filenames or file paths and leave that to the OS that processes the diff, because the portability of a diff is the consumer's responsibility. By default, maybe the spec should not restrict the path character set, or should define it as any 8-bit character sequence, which should not cause any problems?

But it may be interesting for those who want to produce diffs that are portable and comply with the POSIX specification to advertise their diffs as such. A parameter could be used.

text/diff; path-format=posix

And for uri

text/diff; path-format=uri

The default value for path-format could be something like char.
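For what it's worth, a consumer could read such a parameter roughly as follows. This is only a sketch: `path-format` and its `char` default are the proposal from this comment, not a registered parameter, and the function name is made up. It relies on Python's standard `email.message.Message.get_param`.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value


def path_format(content_type, default="char"):
    """Return the proposed path-format parameter of a Content-Type value.

    Falls back to `default` when the parameter is absent.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    value = msg.get_param("path-format", failobj=default)
    # RFC 2231 encoded parameters come back as a tuple; flatten them.
    return collapse_rfc2231_value(value).lower()


print(path_format("text/diff; path-format=posix"))
print(path_format("text/diff; path-format=uri"))
print(path_format("text/diff"))
```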

@prathe

Hi Jon,

It may be interesting to support the normal diff output: diff original new. I would like the draft to also define new link relations. In fact, the original and the new resource could be identified by link relations, so it would be possible not to have the paths inline in the entity.

I'm working on a couple of new sections of the draft. Don't know if I'll be able to produce something good.

@jonm
Owner

Hi Philippe,

How would that work? Remember that this format has to stand alone, to some extent, so we cannot guarantee that the transport mechanism has a concept of links at all (for example, text/diff could get carried in the body of a MIME message). Standard practice thus far has been to document link relations as their own RFC, unless they are necessary to the processing of a media type (which, arguably they aren't, given that 'diff' doesn't support them currently).

I do think the link relations are going to be useful when figuring out how to work this into HTTP, for sure; I'm just not convinced they are part of this particular RFC.

Jon

@prathe

Jon,

I don't see the unified format as standing alone either. When you receive a unified diff in an email, you have to rely on information such as the mail subject or the sender's name, email, etc. to figure out what that diff is about and/or where you could apply it on your file system.

Unless a unique target can be determined, a client will need to rely on out-of-band information in order to guess where to apply the diff. Relative file paths do not give you enough information about where the diff document applies. I don't see what in a diff file allows you to determine the target just by inspecting the output. Any directory on your system can hold the same relative file paths as the ones that appear in a diff. How would you determine the target of a diff file if a single-line change was made to ./README?
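To make the ./README case concrete, here is a tiny Python sketch. The directory names and the function name are invented for the example. The same relative path resolves to a different target for every base directory the consumer might pick, and nothing in the diff says which one is meant.

```python
import posixpath


def candidate_targets(relative_path, base_dirs):
    """List every file the relative path could refer to, one per base directory."""
    return [
        posixpath.normpath(posixpath.join(base, relative_path))
        for base in base_dirs
    ]


# Hypothetical working directories on the consumer's machine.
for target in candidate_targets("./README", ["/home/alice/project", "/tmp/checkout"]):
    print(target)
```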

And why is "standing alone" important anyway? When you download a file from a form on a Web site, does that file itself have to stand alone? What does that mean?

The link relations affordance is just a matter of giving representation designers the possibility to link resources together and add semantics. I believe that in its simplest form, without any file involved, a normal diff output is useful. Imagine a resource that represents the difference between two strings. What would "stand alone" mean, and why is it relevant? I guess it would be better to argue that it would be more useful if the representation were hypermedia.

This media type should be useful and should not restrict simple Web usage like the one described in my previous example. Of course some designers will use them as if the Web were a big file system, but this media type should [edited] not define it only as such, because that would prevent it from being used as the Web is meant to be used.

@jonm
Owner
@prathe

Yes Jon, I would like to write some parts of the spec and send you a pull request. But I don't consider these conversations abstract. They help me minimize the risk of going in the wrong direction.

@jonm
Owner

Fair enough! Just be careful of writing a long comment here when a pull request would be about as much work. :)
