File URI Scheme #59

mnot · 2015-07-15T08:48:50Z

The IETF APPSAWG is working on the file scheme:
https://tools.ietf.org/html/draft-ietf-appsawg-file-scheme-02

mnot · 2015-07-16T09:18:48Z

Mike West notes that it'd sure be nice if this referred to the Origin spec.

torgo · 2015-07-16T10:36:27Z

At f2f: @travisleithead says MS people say it "looks great" - didn't answer the question about origins in relation to the file scheme, but that is maybe an operating system-specific question.

mnot · 2015-07-16T10:39:16Z

@annevk any thoughts? Seems like fetch integration would be good.

annevk · 2015-07-16T10:44:40Z

Skimming through that draft I don't see it really tackling any problem (parsing or retrieving) well... Agreed that if someone defined retrieval based on a file URL that would be good for Fetch.

travisleithead · 2015-07-16T10:49:22Z

There hasn’t been – to this point – a document that has written down how this scheme works, and for that reason this is a good document.

The Web Origin Concept seems to punt on how to handle origins though--and that's what I'm most keen on seeing described, at least the mapping to the OS I care about. This is not directly related to this issue, however. See also this old blog post.

annevk · 2015-07-16T11:31:12Z

But it doesn't define how it works.

mnot · 2015-07-16T11:42:46Z

@annevk - do you think it would be possible to define implementation details in fetch, and align with this doc?

annevk · 2015-07-16T11:52:52Z

I don't understand what you think this document defines that's novel or useful?

File URL parsing should be defined by the URL Standard (that also defines the origin of URLs btw). File URL fetching/retrieval could be defined by Fetch, if we really wanted to (not sure why it would be useful).

mnot · 2015-07-16T12:01:36Z

Last I checked URL standard was still in limbo, no?

annevk · 2015-07-16T12:09:52Z

Not sure what you mean. Maybe in politics land. (There's open issues for sure, including around file URLs, but this draft addresses none of those.)

mnot · 2015-07-16T12:12:29Z

everything is political, Anne - even the WHATWG :)

annevk · 2015-07-16T12:20:49Z

Anyway, I guess what I'm saying is that it's being implemented and somewhat actively maintained (as time allows).

mikewest · 2015-07-17T13:24:57Z

file: URLs' origins don't interop, and this new document ignores that issue completely. I agree with @travisleithead that we need a document that doesn't punt on this, but I agree with Anne that this document isn't it. :)

Note, of course, that URL also explicitly leaves file: URL origins as an exercise for the reader.

annevk · 2015-07-17T13:27:29Z

Other than parsing it's still not clear to me that interop on file URLs is needed much, but I'm happy to assist anyone who wants to do work on the other aspects. I'll fry some other fish meanwhile.

mnot · 2015-07-17T14:44:54Z

I can make suggestions / requests of the WG next week in Prague; anything in particular you'd like to see them do? Note that "don't publish this" is probably not going to fly, especially considering that (to them) the URL Standard doc has no status.

mnot · 2015-07-17T14:45:21Z

(mm fried 🐠)

annevk · 2015-07-17T15:05:43Z

They should do whatever makes them happy. It's just not of use to us.

dbaron · 2015-07-17T20:55:56Z

It also seems like it might be worth explaining slightly more of the security aspects that go beyond the Web's origin model. Gecko has linking prohibitions that apply to file URLs and are separate from origin checks; they forbid, e.g., inclusion of images from http to file and similar, to prevent attacks like the ones mentioned briefly in Security Considerations (e.g., using files in /dev/ for various things such as depleting randomness sources). (In Gecko, I think these are the CheckLoadURI* checks.) It may also be worth explaining origins (e.g., different directories being different origins).

travisleithead · 2015-07-17T22:44:21Z

+1 to @dbaron; Trident has a variety of blocks from http -> file from things like iframe, object, and img sources to name a few.

Re: origins, we also have a request to define which file: origins have access to other file: origins, if at all. The common case is whether file://c:/directory/file.htm has access to something in its current directory (file://c:/directory/otherfile.htm) and/or in a sub-directory (file://c:/directory/subdir/file.htm). I'd love to explore thoughts about that.
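To make the two cases in that question concrete, here is a minimal Python sketch of the candidate policies. It is not what Trident or any other engine does; the function names `same_directory` and `same_or_sub_directory` are hypothetical, and the sketch assumes the URLs are already normalized (no `..` segments, no percent-encoding differences, consistent case).

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def _authority_and_dir(url):
    """Split a file URL into (authority, containing directory)."""
    parts = urlsplit(url)
    return parts.netloc, PurePosixPath(parts.path).parent


def same_directory(page, target):
    """Hypothetical policy A: access only to files in the page's own directory."""
    return _authority_and_dir(page) == _authority_and_dir(target)


def same_or_sub_directory(page, target):
    """Hypothetical policy B: policy A plus anything below the page's directory."""
    page_auth, page_dir = _authority_and_dir(page)
    target_auth, target_dir = _authority_and_dir(target)
    if page_auth != target_auth:
        return False
    return target_dir == page_dir or page_dir in target_dir.parents


page = "file://c:/directory/file.htm"
sibling = "file://c:/directory/otherfile.htm"
nested = "file://c:/directory/subdir/file.htm"

print(same_directory(page, sibling), same_directory(page, nested))
print(same_or_sub_directory(page, nested), same_or_sub_directory(nested, page))
```

Policy B is deliberately asymmetric: a page may reach down into sub-directories, but a file in a sub-directory may not reach back up.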

diracdeltas · 2015-07-22T18:40:24Z

I was assigned to review this but am hosed with work these few weeks. @travisleithead do you want to take the lead on putting together the review?

travisleithead · 2015-07-23T17:46:38Z

I'd be OK with that, but I already have Issue 61 to prepare for 7/29... mayhaps someone else would like to grab this?

diracdeltas · 2015-07-23T22:55:12Z

I read through this and didn't have any comments; like others, I thought this was going to address issues like origin separation and fetch. It makes sense that it doesn't, I guess.

diracdeltas · 2015-07-27T21:21:06Z

@mnot are you interested in taking this on? Otherwise, I'll just write a doc summarizing this thread before the meeting tomorrow.

diracdeltas · 2015-07-29T19:18:57Z

very rough draft: https://github.com/w3ctag/spec-reviews/pull/65/files

diracdeltas · 2015-07-29T21:00:20Z

TAG meeting decision:

@mnot and others, please give thoughts on whether this should be part of the URL Standard.
I'll update the draft above with those comments.

domenic · 2015-07-29T21:07:25Z

"@mnot and others, please give thoughts on whether this should be part of the URL Standard."

Not sure if I'm others, but I'll be bold:

I think this document is not quite aligned in goals and style with the URL Standard. For example, it defines a BNF for matching or rejecting a given string as "a file URL," instead of rules for parsing strings. (Some of these are present, e.g. "Translating Local File Path to file URI" would be part of the parsing rules. But that does not handle the 'nonstandard' variations.) It also has case-sensitivity as optional instead of normatively choosing one way or another, and depends on network conditions for translating strings into file URLs ("Translating Non-local File Path to file URI"), as well as the host's Unicode support.

Finally, there's no real indication as to whether the doc has been tested against real-world browser behavior (or other URL parsing/serialization libraries). That might not be something RFCs include though, or I might have missed it, or it might be elsewhere...

As such I don't think it really can give a helpful answer to how to parse or serialize file URLs in an interoperable way across multiple pieces of software.

annevk · 2015-08-16T06:01:04Z

@travisleithead my thoughts on origins are that I would love to isolate all file URLs from each other, even from themselves, by always returning a new globally unique identifier as origin. I don't think folks should use file URLs for web development. Browsers should probably make it easy to serve up pages from http://localhost/ somehow.

However, I've not written that in the specification since implementations are doing different things and it's not clear they are interested in aligning on that.
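As a rough illustration of the "always a new globally unique identifier" idea, the Python sketch below returns a fresh opaque value every time a file URL's origin is requested, so a file URL is not same-origin even with itself. The function name `origin_of` and the tuple representation are assumptions made for this example, not something taken from the URL Standard, and default ports are not normalized.

```python
import uuid
from urllib.parse import urlsplit


def origin_of(url):
    """Return a comparable origin value for a URL (illustrative only)."""
    parts = urlsplit(url)
    if parts.scheme == "file":
        # Assumption: every request yields a brand-new opaque origin,
        # so two file URLs never compare equal, even if identical.
        return ("opaque", uuid.uuid4().hex)
    # Simplified tuple origin for other schemes (scheme, host, port).
    return (parts.scheme, parts.hostname, parts.port)


a = origin_of("file:///home/user/index.html")
b = origin_of("file:///home/user/index.html")
print(a == b)  # the same file URL is isolated from itself
print(origin_of("http://localhost/") == origin_of("http://localhost/app"))
```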

domenic · 2015-08-16T13:20:53Z

FWIW it seems Blink is at least somewhat interested in aligning as such; see https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/--KjSROtcMQ. However, https://code.google.com/p/chromium/issues/detail?id=517819 and in particular https://src.chromium.org/viewvc/blink?view=revision&revision=200313 is not encouraging.

torgo · 2015-09-15T20:17:30Z

Taken up at Boston f2f. New version of File URI draft published in late July. Does not seem to address the issues raised.

torgo · 2015-09-15T20:27:14Z

Working on feedback https://github.com/w3ctag/wiki/wiki/fileUrl

torgo · 2015-09-15T20:39:27Z

closed on the basis of that feedback - @mnot to send over to apps area wg people at ietf.

mnot · 2016-05-31T01:31:20Z

FYI: https://tools.ietf.org/html/draft-ietf-appsawg-file-scheme-10

4.  File Name Encoding

File systems use various encoding schemes to store file and directory
names.  Many modern file systems store file and directory names as
arbitrary sequences of octets, in which case the representation as an
encoded string often depends on the user's localization settings, or
defaults to UTF-8 [STD63].

When a file URI is produced, characters not allowed by the syntax in
Section 2 SHOULD be percent-encoded as characters using UTF-8
encoding, as per [RFC3986], Section 2.5.

However, encoding information for file and/or directory names might
not be available.  In these cases, implementations MAY use heuristics
to determine the encoding.  If that fails, they SHOULD percent-encode
the raw bytes of the label directly.

@annevk seem sane(r)?

annevk · 2016-05-31T06:28:07Z

It doesn't seem useful since it basically leaves it up to the implementation. I don't think that RFC matters though.

mnot · 2016-05-31T06:29:56Z

Why, because of the SHOULD? Would a MUST make you happier (to the extent you care about it)?

annevk · 2016-05-31T06:41:19Z

I'm not sure, but this is basically saying that file systems do whatever (true) and implementations can do whatever (true) except with a lot of confusing requirements that make it seem like there's some logic here (not true).

The algorithm you need takes a file URL, an OS-flavor, and returns a file. Describing that and saying what is not defined yet seems much more helpful than this.

mnot · 2016-05-31T06:57:21Z

This text is about starting with a file on a filesystem, some environment, and ending up with a file URI; i.e., generation, not parsing.

Parsing would be defined in URL Standard, presumably. I think this helps by making what the encoding is supposed to be less ambiguous (as opposed to previous drafts).

If your reaction is "mostly harmless", that's OK.
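For readers who want to see the generation direction spelled out, here is a minimal Python sketch of the quoted Section 4 text under some loud assumptions: the input is an absolute POSIX-style path held as raw bytes, the caller supplies the name encoding if it is known, and the "heuristics" step the draft allows is skipped entirely. The function name `path_to_file_uri` is hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def path_to_file_uri(raw_path: bytes, encoding: Optional[str] = "utf-8") -> str:
    """Build a file URI from a raw absolute path (illustrative only)."""
    if encoding is not None:
        try:
            text = raw_path.decode(encoding)
        except UnicodeDecodeError:
            text = None
        if text is not None:
            # Known encoding: percent-encode the characters as UTF-8.
            return "file://" + quote(text, safe="/", encoding="utf-8")
    # Unknown or wrong encoding: percent-encode the raw bytes directly.
    return "file://" + quote(raw_path, safe="/")


print(path_to_file_uri("/tmp/café.txt".encode("utf-8")))
print(path_to_file_uri("/tmp/café.txt".encode("latin-1"), encoding="latin-1"))
print(path_to_file_uri(b"/tmp/caf\xe9.txt"))  # not valid UTF-8, falls back to raw bytes
```

The last call shows the ambiguity raised earlier in the thread: the same on-disk name can yield `%C3%A9` or `%E9` depending on what the implementation knows about the encoding.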

torgo assigned diracdeltas Jul 16, 2015

torgo added this to the tag-telcon-2015-07-29 milestone Jul 16, 2015

travisleithead closed this as completed Jul 16, 2015

travisleithead reopened this Jul 16, 2015

diracdeltas mentioned this issue Jul 29, 2015

Start File URI feedback #65

Merged

annevk mentioned this issue Aug 16, 2015

Is an URL’s path a list of strings or a single string? whatwg/url#33

Closed

torgo closed this as completed Sep 15, 2015
