isValid "\\\\?\\UNC\\" #191

Open
Bodigrim opened this issue Mar 11, 2023 · 7 comments

@Bodigrim
Contributor

> System.FilePath.Windows.isValid "\\\\?\\UNC\\"
True
> putStrLn "\\\\?\\UNC\\"
\\?\UNC\

I think this is wrong: \\?\UNC\ is incomplete; it is neither a file name nor a folder name.

-- | Is a FILEPATH valid, i.e. could you create a file like it? This function checks for invalid names,
-- and invalid characters, but does not check if length limits are exceeded, as these are typically
-- filesystem dependent.

If we are in agreement that isValid should return False on this input, there is a harder question ahead. What should be the output of makeValid? Something like \\?\UNC\_\_?
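To make the "incomplete" objection concrete, here is a minimal sketch of a check that only accepts a \\?\UNC\ path when a non-empty server and share follow the prefix. The names `splitOnBackslash` and `uncTarget` are hypothetical helpers, not part of filepath, and the sketch assumes an exact-case `UNC` prefix and backslash separators only.

```haskell
import Data.List (stripPrefix)

-- | Split on backslashes, keeping empty components.
splitOnBackslash :: String -> [String]
splitOnBackslash s = case break (== '\\') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnBackslash rest

-- | Hypothetical helper (not filepath API): extract the server and share
-- that follow a literal \\?\UNC\ prefix, or Nothing if either is missing.
-- Assumption: exact-case "UNC" and backslash separators only.
uncTarget :: String -> Maybe (String, String)
uncTarget path = do
  rest <- stripPrefix "\\\\?\\UNC\\" path
  case splitOnBackslash rest of
    (server : share : _)
      | not (null server), not (null share) -> Just (server, share)
    _ -> Nothing
```

With this sketch, `uncTarget "\\\\?\\UNC\\"` is `Nothing`, while `uncTarget "\\\\?\\UNC\\localhost\\c$\\foo"` is `Just ("localhost", "c$")`. It says nothing about what makeValid should produce for the rejected input.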

@hasufell
Member

hasufell commented Mar 11, 2023

Related: #92

isValid is a hot mess on windows.

I'm not sure how much improvement we can drive here with ad-hoc bugfixes.

The underlying problem is that we're not parsing windows filepaths, although there are pieces that allow us to put together a proper grammar:

With that we could implement a more meaningful version of isValid.

@hasufell
Member

{--
; ABNF for windows paths
; based on https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/62e862f4-2a51-452e-8eeb-dc4ff5ee33cc?redirectedfrom=MSDN
; missing: unix separators
filepath = namespace *"\" namespace-tail
         / UNC
         / [ disk ] *"\" relative-path
         / disk *"\"
relative-path = 1*(path-name 1*"\") [ file-name ] / file-name
path-name = 1*pchar
file-name = 1*pchar [ stream ]
; namespaces
namespace = file-namespace / device-namespace / nt-namespace
namespace-tail = ( disk 1*"\" relative-path ; C:foo\bar is not valid
                                             ; namespaced paths are all absolute
                 / disk *"\"
                 / relative-path
                 )
file-namespace = "\" "\" "?" "\"
device-namespace = "\" "\" "." "\"
nt-namespace = "\" "?" "?" "\"
UNC = "\\" 1*pchar "\" 1*pchar [ 1*"\" [ relative-path ] ]
disk = ALPHA ":"
stream = ":" 1*schar [ ":" 1*schar ] / ":" ":" 1*schar
; path component characters (all printable chars except '\')
pchar = %x21-5B / %x5D-7E
; stream component characters (all printable chars except '\' and ':')
schar = %x21-39 / %x3B-5B / %x5D-7E
--}
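As an illustration of how a few of these rules could be turned into code, here is a rough sketch covering only `namespace`, `disk`, `pchar` and a stream-agnostic reading of `relative-path`. All names below (`Namespace`, `parseNamespace`, `parseDisk`, `isPchar`, `isRelativePath`) are hypothetical and not part of filepath; the sketch skips `UNC`, `stream` and the full `filepath` rule.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper)
import Data.List (stripPrefix)

-- | The three namespace prefixes named in the grammar.
data Namespace = FileNamespace | DeviceNamespace | NtNamespace
  deriving (Eq, Show)

-- | namespace = file-namespace / device-namespace / nt-namespace
-- Returns the namespace and the remaining input.
parseNamespace :: String -> Maybe (Namespace, String)
parseNamespace s
  | Just rest <- stripPrefix "\\\\?\\" s = Just (FileNamespace, rest)
  | Just rest <- stripPrefix "\\\\.\\" s = Just (DeviceNamespace, rest)
  | Just rest <- stripPrefix "\\??\\" s  = Just (NtNamespace, rest)
  | otherwise                            = Nothing

-- | disk = ALPHA ":"
parseDisk :: String -> Maybe (Char, String)
parseDisk (c : ':' : rest)
  | isAsciiUpper c || isAsciiLower c = Just (c, rest)
parseDisk _ = Nothing

-- | pchar = %x21-5B / %x5D-7E (printable ASCII except backslash)
isPchar :: Char -> Bool
isPchar c = c /= '\\' && c >= '\x21' && c <= '\x7E'

-- | relative-path, ignoring the stream rule: a non-empty string that
-- starts with a pchar and contains only pchars and backslashes.
isRelativePath :: String -> Bool
isRelativePath s = case s of
  (c : _) -> isPchar c && all (\x -> isPchar x || x == '\\') s
  []      -> False
```

Under this reading, `parseNamespace "\\\\?\\UNC\\"` yields `Just (FileNamespace, "UNC\\")`, and the tail `UNC\` satisfies `isRelativePath`, so the input from this issue is treated like any other namespaced path with a directory component named `UNC`.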

@Bodigrim
Contributor Author

> I'm not sure how much improvement we can drive here with ad-hoc bugfixes.

I agree. My bigger concern is that while isValid could, at least in theory, be made correct, makeValid is fundamentally broken on Windows. It's not like you can meaningfully repair an arbitrary Windows path. Even the current behaviour, makeValid "test*" == "test_", is a bit of a WAAAAT? Maybe mark it as deprecated?..

@hasufell
Member

hasufell commented Mar 11, 2023

Ok, so things are a little more complicated on windows wrt "\\\\?\\UNC\\".

These are not statically assigned special names afaiu. Instead those are some form of object symlinks that are maintained inside of windows (and can be viewed in the WinObj browser tool). Also see: https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#nt-namespaces

There are many more, e.g. look at:

\\?\UNC\localhost\c$\foo\bar                       -> \\localhost\c$\foo\bar
\\?\GLOBALROOT\GLOBAL??\UNC\localhost\c$\foo\bar   -> \\localhost\c$\foo\bar
\\?\HarddiskVolume2\foo\bar                        -> C:\foo\bar (if HarddiskVolume2 is C:)
\\?\GLOBALROOT\GLOBAL??\HarddiskVolume2\foo\bar    -> C:\foo\bar (if HarddiskVolume2 is C:)
\\?\GLOBALROOT\Device\Harddisk0\Partition2\foo\bar -> C:\foo\bar (if Harddisk0\Partition2 is C:)

(all the above are somewhat equal)

The fact that filepath as a library treats \\\\?\\UNC\\ special is in my opinion more of a wart than a feature. I don't consider \\\\?\\UNC\\ a special case in my grammar. The meaning of those object links can only fully be understood when performing IO. Some of them may be somewhat conventional, but still...

Maybe @Mistuke has another opinion.

@Bodigrim
Contributor Author

AFAIU https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#dos-device-paths, \\?\UNC\ is a special case. Namely, Windows filenames can be:

  • Traditional DOS paths, C:\foo\bar
  • UNC paths, which is a confusing name, because they are commonly known as "shared drive paths", \\server\share\file.
  • DOS device paths, which are an attempt at universal resource identification going beyond the file system, roughly approaching the Unix model. These start with \\.\, followed by a resource name.

Now there is a bit of confusion. If you want to format a traditional DOS path as a device path, you can just prepend \\.\ to C:\foo\bar, obtaining \\.\C:\foo\bar. The same does not work for UNC paths to shared drives, because you end up with \\.\\server\share\file, and device paths are not supposed to contain \\ anywhere except at the beginning. To overcome this restriction Windows introduces a workaround: instead of \\.\\server\share\file you are supposed to write \\.\UNC\server\share\file. So this is a special syntax.
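The rewriting described above can be sketched as a purely textual function. `toDevicePath` is a hypothetical name, not a filepath or Win32 function, and the sketch assumes its input is either an absolute DOS path, a UNC path, or already a \\.\ or \\?\ path.

```haskell
import Data.List (isPrefixOf, stripPrefix)

-- | Hypothetical helper: rewrite a path into DOS device path form.
-- Assumption: the input is an absolute DOS path (C:\...), a UNC path
-- (\\server\share\...), or already starts with \\.\ or \\?\.
-- Relative paths are not handled meaningfully by this sketch.
toDevicePath :: String -> String
toDevicePath path
  -- Already a device path: leave it alone.
  | any (`isPrefixOf` path) ["\\\\.\\", "\\\\?\\"] = path
  -- UNC path: drop the leading \\ and insert the UNC component.
  | Just rest <- stripPrefix "\\\\" path = "\\\\.\\UNC\\" ++ rest
  -- Traditional DOS path: just prepend \\.\
  | otherwise = "\\\\.\\" ++ path
```

For example, `toDevicePath "C:\\foo\\bar"` gives `\\.\C:\foo\bar` and `toDevicePath "\\\\server\\share\\file"` gives `\\.\UNC\server\share\file`, matching the two cases in the paragraph above.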

@hasufell
Member

> So this is a special syntax.

It's not syntax; those are simply symbolic links. Again, there's also \\?\GLOBALROOT\GLOBAL??\UNC ... why don't we support that form? We can even do \\?\\GLOBALROOT\Device\Mup\localhost\c$\foo\bar.

[Image: UNC]

@Mistuke

Mistuke commented Mar 12, 2023

> The fact that filepath as a library treats \\?\UNC\ special is in my opinion more of a wart than a feature. I don't consider \\?\UNC\ a special case in my grammar. The meaning of those object links can only fully be understood when performing IO. Some of them may be somewhat conventional, but still...

FWIW I agree. Inside GHC's handling we only really treat \\?\ and \\.\ as special.
