Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: net/url: add FromFilePath and ToFilePath #32456

Open
bcmills opened this issue Jun 5, 2019 · 7 comments
Open

proposal: net/url: add FromFilePath and ToFilePath #32456

bcmills opened this issue Jun 5, 2019 · 7 comments

Comments

@bcmills
Copy link
Contributor

bcmills commented Jun 5, 2019

In the course of investigating #27698, I learned that the conversion between URLs and file paths on Windows is rather non-trivial. (See also #6027.)

According to https://blogs.msdn.microsoft.com/ie/2006/12/06/file-uris-in-windows/, a file path beginning with a UNC prefix should use the UNC hostname as the “host” part of the URL, with only two leading slashes, whereas a file path beginning with a drive letter should omit the “host” part and prepend a slash to the remainder of the path.

Once you know those rules, the implementation in each direction is only about ten lines of code, but it's easy to get wrong (for example, by neglecting the possibility of a UNC prefix) if you haven't thought about it in depth.

I propose that we add two functions to the net/url package to make these cases more discoverable, with the following signatures:

// ToFilePath converts a URL with scheme "file" to an absolute file path.
// It returns a non-nil error if the URL does not have the scheme "file" or
// the resulting path is not well-formed.
func ToFilePath(u *URL) (string, error) {
	[…]
}

// FromFilePath converts an absolute file path to a URL.
func FromFilePath(path string) (*URL, error) {
	[…]
}

CC @alexbrainman

@gopherbot gopherbot added this to the Proposal milestone Jun 5, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Jun 6, 2019

There is at least one place where we have already gotten this wrong in the standard library:

if !browser.Open("file://" + out.Name()) {

@mvdan
Copy link
Member

mvdan commented Jun 7, 2019

I assume most uses will be string-based - is there a reason why you've designed the API around url.URL? It's easy to use url.Parse and url.URL.String to use the proposed API with strings, but the same could be said about using a string-based API with URLs.

@bcmills
Copy link
Contributor Author

bcmills commented Jun 7, 2019

@mvdan, two reasons:

  1. Using url adds type-safety at the call site. (It's much less likely that someone will accidentally confuse a URL for a file-path — or vice-versa — if they have different types.)
  2. The subtle parts of converting Windows paths to file:// URLs involve the Host field of the URL. It's much easier to express those subtle parts in terms of the Host and Path fields, and seems easier for users to diagnose unexpected results when they can clearly see how the path was split into a Host and Path (rather than counting slashes).

@rsc
Copy link
Contributor

rsc commented Aug 13, 2019

I think there's enough subtlety in the implementation here - even though the docs sound simple - that this is worth a design doc laying out the issues that an implementation will have to address. There's no rush on this, @bcmills, but I think that's the next step.

Will mark it on-hold for design doc.

@cuishuang
Copy link
Contributor

Is there any progress on adding this method? ^_^

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/438175 mentions this issue: client: handle file URI scheme on windows

gopherbot pushed a commit to golang/vuln that referenced this issue Oct 4, 2022
A file URI takes the form of
  file://host/path

https://en.wikipedia.org/wiki/File_URI_scheme

On windows, for example, vulndb located in c:\dir\vulndb will be

  file:///c:/dir/vulndb

Previously, the code took `file://` prefix and attempted to use the
remaining as a directory of local vulndb. On windows, that caused
to os.Stat on /c:/dir/vulndb when a correctly encoded URI was passed in.

Turned out file-uri parsing is a known, complex issue.

See golang/go#32456 for context.

This CL includes the code copied from the Go project.

https://github.com/golang/go/blob/2ebe77a2fda1ee9ff6fd9a3e08933ad1ebaea039/src/cmd/go/internal/web/url.go

Updated the tests to exercise the windows-like file path correctly
when testing on windows. Previously, tests depended on relative paths
or assumed an incorrect form of windows file uri (e.g. file://C:\workdir\gopath\src\golang.org\x\vuln\cmd\govulncheck/testdata/vulndb)

Change-Id: I5fc451e5ca44649b9623daee28ee3210a7b2b96c
Reviewed-on: https://go-review.googlesource.com/c/vuln/+/438175
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Zvonimir Pavlinovic <zpavlinovic@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
softdev050 added a commit to softdev050/Golangvuln that referenced this issue Apr 5, 2023
A file URI takes the form of
  file://host/path

https://en.wikipedia.org/wiki/File_URI_scheme

On windows, for example, vulndb located in c:\dir\vulndb will be

  file:///c:/dir/vulndb

Previously, the code took `file://` prefix and attempted to use the
remaining as a directory of local vulndb. On windows, that caused
to os.Stat on /c:/dir/vulndb when a correctly encoded URI was passed in.

Turned out file-uri parsing is a known, complex issue.

See golang/go#32456 for context.

This CL includes the code copied from the Go project.

https://github.com/golang/go/blob/2ebe77a2fda1ee9ff6fd9a3e08933ad1ebaea039/src/cmd/go/internal/web/url.go

Updated the tests to exercise the windows-like file path correctly
when testing on windows. Previously, tests depended on relative paths
or assumed an incorrect form of windows file uri (e.g. file://C:\workdir\gopath\src\golang.org\x\vuln\cmd\govulncheck/testdata/vulndb)

Change-Id: I5fc451e5ca44649b9623daee28ee3210a7b2b96c
Reviewed-on: https://go-review.googlesource.com/c/vuln/+/438175
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Zvonimir Pavlinovic <zpavlinovic@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
sayjun0505 added a commit to sayjun0505/Golangvuln that referenced this issue Apr 8, 2023
A file URI takes the form of
  file://host/path

https://en.wikipedia.org/wiki/File_URI_scheme

On windows, for example, vulndb located in c:\dir\vulndb will be

  file:///c:/dir/vulndb

Previously, the code took `file://` prefix and attempted to use the
remaining as a directory of local vulndb. On windows, that caused
to os.Stat on /c:/dir/vulndb when a correctly encoded URI was passed in.

Turned out file-uri parsing is a known, complex issue.

See golang/go#32456 for context.

This CL includes the code copied from the Go project.

https://github.com/golang/go/blob/2ebe77a2fda1ee9ff6fd9a3e08933ad1ebaea039/src/cmd/go/internal/web/url.go

Updated the tests to exercise the windows-like file path correctly
when testing on windows. Previously, tests depended on relative paths
or assumed an incorrect form of windows file uri (e.g. file://C:\workdir\gopath\src\golang.org\x\vuln\cmd\govulncheck/testdata/vulndb)

Change-Id: I5fc451e5ca44649b9623daee28ee3210a7b2b96c
Reviewed-on: https://go-review.googlesource.com/c/vuln/+/438175
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Zvonimir Pavlinovic <zpavlinovic@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
stanislavkononiuk added a commit to stanislavkononiuk/Golangvuln that referenced this issue Jun 26, 2023
A file URI takes the form of
  file://host/path

https://en.wikipedia.org/wiki/File_URI_scheme

On windows, for example, vulndb located in c:\dir\vulndb will be

  file:///c:/dir/vulndb

Previously, the code took `file://` prefix and attempted to use the
remaining as a directory of local vulndb. On windows, that caused
to os.Stat on /c:/dir/vulndb when a correctly encoded URI was passed in.

Turned out file-uri parsing is a known, complex issue.

See golang/go#32456 for context.

This CL includes the code copied from the Go project.

https://github.com/golang/go/blob/2ebe77a2fda1ee9ff6fd9a3e08933ad1ebaea039/src/cmd/go/internal/web/url.go

Updated the tests to exercise the windows-like file path correctly
when testing on windows. Previously, tests depended on relative paths
or assumed an incorrect form of windows file uri (e.g. file://C:\workdir\gopath\src\golang.org\x\vuln\cmd\govulncheck/testdata/vulndb)

Change-Id: I5fc451e5ca44649b9623daee28ee3210a7b2b96c
Reviewed-on: https://go-review.googlesource.com/c/vuln/+/438175
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Zvonimir Pavlinovic <zpavlinovic@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
@bcmills
Copy link
Contributor Author

bcmills commented Jan 29, 2024

I'm not sure what detail of our own we would need to add in a design doc laying out the issues — there are already at least two such documents:

The behavior currently implemented in cmd/go/internal/web, and which I propose to adopt here, is:

POSIX and Plan 9 paths should be handled as described in RFC 8089 appendix D.1, accepting three forms:

  • FromFilePath should prefer the “empty auth-path” form (file:///path/to/file), which seems to be the most common today.
  • ToFilePath should also accept the “no auth-path” form (file:/path/to/file) and the “explicit localhost auth-path” form (file://localhost/path/to/file).

Windows paths should be handled as described in https://learn.microsoft.com/en-us/archive/blogs/ie/file-uris-in-windows, with reference to RFC 8089 appendix E.2 and E.3.

  • FromFilePath should prefer the “empty auth-path” form (file:///C:/path/to/file), since that is preferred in the Microsoft post.
  • ToFilePath should also accept the “no auth-path” form described in E.2 (file:c:/path/to/file), because it is easy and unambiguous to do so. (url.Parse puts the path in the Opaque section of the parsed URL.)
  • ToFilePath should reject the four- and five-slash variations from E.3.2, because the Microsoft post explicitly calls out that form as inappropriate, and because path.Clean would change the //host.example.com part of the path to be just /host.example.com, which is semantically different.

Note that using the url.URL type as input and output sidesteps the question of what to do with two of the variations (the “vertical line character” variation in E.2.2 and the “backslash as separator” variation in E.4) because url.Parse already rejects those, and I am not proposing that we change url.Parse in any way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants