File URI Scheme #59

Closed
mnot opened this issue Jul 15, 2015 · 36 comments

@mnot
Member

mnot commented Jul 15, 2015

The IETF APPSAWG is working on the file scheme:
https://tools.ietf.org/html/draft-ietf-appsawg-file-scheme-02

@mnot
Member Author

mnot commented Jul 16, 2015

Mike West notes that it'd sure be nice if this referred to the Origin spec.

@torgo
Member

torgo commented Jul 16, 2015

At f2f: @travisleithead says MS people say it "looks great" - didn't answer the question about origins in relation to the file scheme, but that is maybe an operating-system-specific question.

@torgo torgo added this to the tag-telcon-2015-07-29 milestone Jul 16, 2015
@mnot
Member Author

mnot commented Jul 16, 2015

@annevk any thoughts? Seems like fetch integration would be good.

@annevk
Member

annevk commented Jul 16, 2015

Skimming through that draft I don't see it really tackling any problem (parsing or retrieving) well... Agreed that if someone defined retrieval based on a file URL that would be good for Fetch.

@travisleithead
Contributor

There hasn’t been – to this point – a document that has written down how this scheme works, and for that reason this is a good document.

The Web Origin Concept seems to punt on how to handle origins though--and that's what I'm most keen on seeing described, at least the mapping to the OS I care about. This is not directly related to this issue, however. See also this old blog post.

@annevk
Member

annevk commented Jul 16, 2015

But it doesn't define how it works.

@mnot
Member Author

mnot commented Jul 16, 2015

@annevk - do you think it would be possible to define implementation details in fetch, and align with this doc?

@annevk
Member

annevk commented Jul 16, 2015

I don't understand what you think this document defines that's novel or useful?

File URL parsing should be defined by the URL Standard (that also defines the origin of URLs btw). File URL fetching/retrieval could be defined by Fetch, if we really wanted to (not sure why it would be useful).

@mnot
Member Author

mnot commented Jul 16, 2015

Last I checked URL standard was still in limbo, no?

@annevk
Member

annevk commented Jul 16, 2015

Not sure what you mean. Maybe in politics land. (There's open issues for sure, including around file URLs, but this draft addresses none of those.)

@mnot
Member Author

mnot commented Jul 16, 2015

everything is political, Anne - even the WHATWG :)

@annevk
Member

annevk commented Jul 16, 2015

Anyway, I guess what I'm saying is that it's being implemented and somewhat actively maintained (as time allows).

@mikewest

file: URL origins don't interop, and this new document ignores that issue completely. I agree with @travisleithead that we need a document that doesn't punt on this, but I agree with Anne that this document isn't it. :)

Note, of course, that URL also explicitly leaves file: URL origins as an exercise for the reader.

@annevk
Member

annevk commented Jul 17, 2015

Other than parsing it's still not clear to me that interop on file URLs is needed much, but I'm happy to assist anyone who wants to do work on the other aspects. I'll fry some other fish meanwhile.

@mnot
Member Author

mnot commented Jul 17, 2015

I can make suggestions / requests of the WG next week in Prague; anything in particular you'd like to see them do? Note that "don't publish this" is probably not going to fly, especially considering that (to them) the URL Standard doc has no status.

@mnot
Member Author

mnot commented Jul 17, 2015

(mm fried 🐠)

@annevk
Member

annevk commented Jul 17, 2015

They should do whatever makes them happy. It's just not of use to us.

@dbaron
Member

dbaron commented Jul 17, 2015

It also seems like it might be worth explaining slightly more of the security aspects that go beyond the Web's origin model. Gecko has linking prohibitions that apply to file URLs that are separate from origin checks, that, e.g., forbid inclusion of images from http to file and similar, to prevent attacks like the ones mentioned briefly in Security Considerations (e.g., using files in /dev/ for various things such as depleting randomness sources). (In Gecko, I think these are the CheckLoadURI* checks.) It also may be worth explaining origins (e.g., different directories being different origins) as well.
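
A scheme-based linking prohibition of the kind described above could be sketched roughly as follows. This is an illustrative assumption only, not Gecko's actual CheckLoadURI logic; the function name `may_load` is hypothetical, and real engines apply many more rules.

```python
from urllib.parse import urlsplit


def may_load(source_url: str, target_url: str) -> bool:
    """Hypothetical check: only file: documents may link to file: resources.

    This runs independently of any origin comparison, mirroring the idea
    that linking prohibitions are separate from origin checks.
    """
    source_scheme = urlsplit(source_url).scheme
    target_scheme = urlsplit(target_url).scheme
    if target_scheme == "file" and source_scheme != "file":
        # e.g. an http page embedding an image from the local file system
        return False
    return True


print(may_load("http://example.com/page.html", "file:///dev/random"))
print(may_load("file:///home/user/page.html", "file:///home/user/pic.png"))
```

Under this sketch, a file: document loading an http resource is still allowed; only the http to file direction is blocked.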

@travisleithead
Contributor

+1 to @dbaron; Trident has a variety of blocks from http -> file from things like iframe, object, and img sources to name a few.

Re: origins, we also have a request to define which file: origins have access to other file: origins, if at all. The common case is whether file://c:/directory/file.htm has access to something in its current directory (file://c:/directory/otherfile.htm) and/or in a sub-directory (file://c:/directory/subdir/file.htm). I'd love to explore thoughts about that.
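
To make the two candidate policies concrete, here is a minimal sketch of a directory-based access check. The names `directory_of` and `may_access` are hypothetical, the policy is an assumption for illustration rather than any browser's behavior, and the sketch does no normalization of `..` segments or case.

```python
import posixpath
from urllib.parse import urlsplit


def directory_of(file_url: str) -> tuple[str, str]:
    """Return (authority, containing directory) for a file: URL."""
    parts = urlsplit(file_url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {file_url!r}")
    return parts.netloc.lower(), posixpath.dirname(parts.path)


def may_access(source: str, target: str, allow_subdirs: bool = False) -> bool:
    """Hypothetical policy: same directory, optionally plus sub-directories."""
    src_host, src_dir = directory_of(source)
    dst_host, dst_dir = directory_of(target)
    if src_host != dst_host:
        return False
    if src_dir == dst_dir:
        return True
    if allow_subdirs:
        # Compare with a trailing slash so /directory does not match /directoryX.
        return dst_dir.startswith(src_dir.rstrip("/") + "/")
    return False


page = "file://c:/directory/file.htm"
print(may_access(page, "file://c:/directory/otherfile.htm"))
print(may_access(page, "file://c:/directory/subdir/file.htm"))
print(may_access(page, "file://c:/directory/subdir/file.htm", allow_subdirs=True))
```

The `allow_subdirs` flag is the whole question in miniature: the same-directory case and the sub-directory case are separate decisions that a specification would have to make explicitly.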

@diracdeltas
Contributor

I was assigned to review this but am hosed with work these few weeks. @travisleithead do you want to take the lead on putting together the review?

@travisleithead
Contributor

I'd be OK with that, but I already have Issue 61 to prepare for 7/29... mayhaps someone else would like to grab this?

@diracdeltas
Contributor

I read through this and didn't have any comments; like others, I thought this was going to address issues like origin separation and fetch. It makes sense that it doesn't, I guess.

@diracdeltas
Contributor

@mnot are you interested in taking this on? Otherwise, I'll just write a doc summarizing this thread before the meeting tomorrow.

@diracdeltas
Contributor

TAG meeting decision:

  • @mnot and others, please give thoughts on whether this should be part of the URL Standard.
  • I'll update the draft above with those comments

@domenic
Member

domenic commented Jul 29, 2015

@mnot and others, please give thoughts on whether this should be part of the URL Standard.

Not sure if I'm others, but I'll be bold:

I think this document is not quite aligned in goals and style with the URL Standard. For example, it defines a BNF for matching or rejecting a given string as "a file URL," instead of rules for parsing strings. (Some of these are present, e.g. "Translating Local File Path to file URI" would be part of the parsing rules. But that does not handle the 'nonstandard' variations.) It also has case-sensitivity as optional instead of normatively choosing one way or another, and depends on network conditions for translating strings into file URLs ("Translating Non-local File Path to file URI"), as well as the host's Unicode support.

Finally, there's no real indication as to whether the doc has been tested against real-world browser behavior (or other URL parsing/serialization libraries). That might not be something RFCs include though, or I might have missed it, or it might be elsewhere...

As such I don't think it really can give a helpful answer to how to parse or serialize file URLs in an interoperable way across multiple pieces of software.
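
As a rough illustration of why parsing rules matter more than a match-or-reject grammar, the sketch below feeds several spellings of the same Windows-style file URL to one parser. It uses Python's `urllib.parse.urlsplit` purely as an example parser (an assumption of this sketch, not something the draft or the URL Standard references); the helper name `split_file_url` is hypothetical.

```python
from urllib.parse import urlsplit


def split_file_url(url: str) -> tuple[str, str]:
    """Return (authority, path) as this one parser sees them."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    return parts.netloc, parts.path


for candidate in (
    "file:///c:/directory/file.htm",  # empty authority, drive letter in path
    "file://c:/directory/file.htm",   # drive letter lands in the authority
    "file:/c:/directory/file.htm",    # no authority component at all
    "file:c:/directory/file.htm",     # no leading slash in the path
):
    print(candidate, "->", split_file_url(candidate))
```

With this parser the second spelling puts `c:` in the authority position while the others keep it in the path, so software that needs interoperable results has to be told what to do with each variation, not only which strings are valid.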

@annevk
Member

annevk commented Aug 16, 2015

@travisleithead my thoughts on origins are that I would love to isolate all file URLs from each other, even from themselves, by always returning a new globally unique identifier as origin. I don't think folks should use file URLs for web development. Browsers should probably make it easy to serve up pages from http://localhost/ somehow.

However, I've not written that in the specification since implementations are doing different things and it's not clear they are interested in aligning on that.
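
The isolation idea above, where every file URL gets a fresh globally unique identifier as its origin, can be sketched as follows. The `OpaqueOrigin` class and the `origin_of` and `same_origin` helpers are hypothetical names for this illustration; the tuple origin for other schemes is a simplification that skips default-port normalization.

```python
import uuid
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass(frozen=True)
class OpaqueOrigin:
    """An origin that is equal only to itself (a fresh unique token)."""
    token: uuid.UUID = field(default_factory=uuid.uuid4)


def origin_of(url: str):
    parts = urlsplit(url)
    if parts.scheme == "file":
        # Every call mints a new identifier, so no two lookups ever match.
        return OpaqueOrigin()
    return (parts.scheme, parts.hostname, parts.port)


def same_origin(a: str, b: str) -> bool:
    return origin_of(a) == origin_of(b)


print(same_origin("file:///tmp/a.html", "file:///tmp/a.html"))
print(same_origin("http://localhost/a.html", "http://localhost/b.html"))
```

Because the origin is recomputed per lookup, a file URL is not same-origin even with itself, whereas pages served from http://localhost/ share an origin as usual.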

@domenic
Member

domenic commented Aug 16, 2015

FWIW it seems Blink is at least somewhat interested in aligning as such; see https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/--KjSROtcMQ. However, https://code.google.com/p/chromium/issues/detail?id=517819 and in particular https://src.chromium.org/viewvc/blink?view=revision&revision=200313 is not encouraging.

@torgo
Member

torgo commented Sep 15, 2015

Taken up at Boston f2f. New version of File URI draft published in late July. Does not seem to address the issues raised.

@torgo
Member

torgo commented Sep 15, 2015

Working on feedback https://github.com/w3ctag/wiki/wiki/fileUrl

@torgo
Member

torgo commented Sep 15, 2015

closed on the basis of that feedback - @mnot to send over to apps area wg people at ietf.

@torgo torgo closed this as completed Sep 15, 2015
@mnot
Member Author

mnot commented May 31, 2016

FYI: https://tools.ietf.org/html/draft-ietf-appsawg-file-scheme-10

4.  File Name Encoding

File systems use various encoding schemes to store file and directory
names.  Many modern file systems store file and directory names as
arbitrary sequences of octets, in which case the representation as an
encoded string often depends on the user's localization settings, or
defaults to UTF-8 [STD63].

When a file URI is produced, characters not allowed by the syntax in
Section 2 SHOULD be percent-encoded as characters using UTF-8
encoding, as per [RFC3986], Section 2.5.

However, encoding information for file and/or directory names might
not be available.  In these cases, implementations MAY use heuristics
to determine the encoding.  If that fails, they SHOULD percent-encode
the raw bytes of the label directly.

@annevk seem sane(r)?
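
One possible reading of the quoted section, as a minimal sketch: when the file system's encoding is known, decode the stored name and percent-encode it as UTF-8; when it is unknown or the decode fails, percent-encode the raw bytes. The function name `encode_file_name` and the `fs_encoding` parameter are hypothetical, and the draft's optional heuristics step is not modeled.

```python
from typing import Optional
from urllib.parse import quote


def encode_file_name(raw: bytes, fs_encoding: Optional[str] = None) -> str:
    """Percent-encode one file or directory name for use in a file URI."""
    if fs_encoding is not None:
        try:
            text = raw.decode(fs_encoding)
        except UnicodeDecodeError:
            text = None
        if text is not None:
            # quote() encodes str input as UTF-8 before percent-encoding.
            return quote(text, safe="")
    # Encoding unknown (or decode failed): fall back to the raw bytes.
    return quote(raw, safe="")


latin1_name = "é.txt".encode("latin-1")
print(encode_file_name(latin1_name, "latin-1"))
print(encode_file_name(latin1_name))
```

The same stored bytes yield two different URIs depending on whether the encoding is known, which is the ambiguity the surrounding comments are arguing about.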

@annevk
Member

annevk commented May 31, 2016

It doesn't seem useful since it basically leaves it up to the implementation. I don't think that RFC matters though.

@mnot
Member Author

mnot commented May 31, 2016

Why, because of the SHOULD? Would a MUST make you happier (to the extent you care about it)?

@annevk
Member

annevk commented May 31, 2016

I'm not sure, but this is basically saying that file systems do whatever (true) and implementations can do whatever (true) except with a lot of confusing requirements that make it seem like there's some logic here (not true).

The algorithm you need takes a file URL, an OS-flavor, and returns a file. Describing that and saying what is not defined yet seems much more helpful than this.
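
The shape of the algorithm described above (file URL plus OS flavor in, file location out) might look like the sketch below. The function name `file_url_to_path`, the two flavor strings, and every handling choice are assumptions made for illustration; the sketch returns a path rather than an open file and deliberately rejects cases it does not define.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import unquote, urlsplit


def file_url_to_path(url: str, os_flavor: str):
    """Map a file: URL to a path for the given OS flavor ('posix' or 'windows')."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    if parts.netloc not in ("", "localhost"):
        # Remote hosts (and drive letters in the host slot) are left undefined.
        raise ValueError("non-local authority is not handled in this sketch")
    path = unquote(parts.path)
    if os_flavor == "posix":
        return PurePosixPath(path)
    if os_flavor == "windows":
        # Turn "/c:/dir/file" into "c:/dir/file" when a drive letter is present.
        if len(path) >= 3 and path[0] == "/" and path[1].isalpha() and path[2] == ":":
            path = path[1:]
        return PureWindowsPath(path)
    raise ValueError(f"unknown OS flavor: {os_flavor!r}")


print(file_url_to_path("file:///etc/hosts", "posix"))
print(file_url_to_path("file:///c:/directory/file.htm", "windows"))
```

Each `raise` marks a spot where the behavior is simply not defined here, which is one way of "saying what is not defined yet".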

@mnot
Member Author

mnot commented May 31, 2016

This text is about starting with a file on a filesystem, some environment, and ending up with a file URI; i.e., generation, not parsing.

Parsing would be defined in URL Standard, presumably. I think this helps by making what the encoding is supposed to be less ambiguous (as opposed to previous drafts).

If your reaction is "mostly harmless", that's OK.
