Provide case-sensitivity for normal Windows processes. #2954

Closed
DDoSolitary opened this issue Feb 15, 2018 · 42 comments

Comments

@DDoSolitary

DDoSolitary commented Feb 15, 2018

Previous discussion:
#449 (comment)
#449 (comment)
#449 (comment)
#449 (comment)

Per-process control of case sensitivity would really help programs that want to interact with WSL's file system.

@benhillis
Member

I believe @SvenGroot is in the middle of writing a blog post you'll be very interested to read. Long story short is Windows will be supporting case-sensitivity on a per-directory basis.

@fpqc

fpqc commented Feb 21, 2018

@benhillis I'm really impressed with how much of the WSL stuff (or stuff driven by WSL needs) is making it into the core functionality of Windows itself!

@therealkenc
Collaborator

therealkenc commented Feb 21, 2018

The ask here stems from:

But if obcaseinsensitive is set to 1 (which is the default setting), NtCreateFile is case-insensitive even if OBJ_CASE_INSENSITIVE is not set. I don't think this behavior is fine-grained at all. This is just why I'm asking here...

In other words, the straightforward ask is for an OBJ_CASE_SENSITIVE flag to go with OBJ_CASE_INSENSITIVE. OBJ_CASE_INSENSITIVE doesn't get you there, because the installation default registry state is insensitive, and changing the installation default has undesirable consequences.

I am not sure that question will be answered in the forthcoming blog, which I believe (?) is related to NTFS extended attributes on directories, not NT object case sensitivity. If the ask is not related to the blog post, it might be polite to answer that here, because it is a good question, and because DDoSolitary was asked politely to file a new issue. Or, putting it in my own words: what blocked OBJ_CASE_SENSITIVE? It seems self-evidently desirable, allowing for the possibility that both DDoSolitary and I are missing something fundamental. Which would be edifying in and of itself.
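
The rule quoted above can be modeled in a few lines. This is a portable sketch, not the kernel's implementation: `lookup_is_case_insensitive` and its parameters are hypothetical names, and only the `OBJ_CASE_INSENSITIVE` value (0x40) is taken from the public headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of OBJ_CASE_INSENSITIVE in the public Windows headers. */
#define OBJ_CASE_INSENSITIVE 0x00000040u

/*
 * Hypothetical model of the rule quoted above: a lookup is case-insensitive
 * when the caller sets OBJ_CASE_INSENSITIVE, or when the system-wide
 * obcaseinsensitive registry value is non-zero (the installation default).
 */
static bool lookup_is_case_insensitive(uint32_t object_attributes,
                                       uint32_t obcaseinsensitive)
{
    if (obcaseinsensitive != 0) {
        return true; /* the registry default overrides the caller */
    }
    return (object_attributes & OBJ_CASE_INSENSITIVE) != 0;
}
```

With the default registry value of 1, the function returns true whether or not the caller sets the flag, which is why leaving `OBJ_CASE_INSENSITIVE` clear is not enough to request a case-sensitive open.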

@fpqc

fpqc commented Feb 22, 2018

@therealkenc I'm guessing there will be a flag on Windows directories in the "properties" dialogue, and NtCreateFile will now check the case sensitivity setting on the target directory?

@therealkenc
Collaborator

I'm guessing there will be a flag on Windows directories in the "properties" dialogue

Sure, exactly like that. Explorer could highlight the case sensitive versus case insensitive directories too, perhaps with some new folder artwork.

[No, wait. I think the cheese was bad on my pizza tonight and I am not feeling very well. Nevermind. I have no idea if a flag in the Windows "properties" dialog is in the plans or not. Time will tell.]

@DDoSolitary
Author

I just noticed that in build 17110 there is a command fsutil.exe file setcasesensitiveinfo enable to enable per-directory case-sensitivity. I wonder if it works for NtCreateFile.

@fpqc

fpqc commented Mar 1, 2018

@DDoSolitary. Not file, directory.

@DDoSolitary
Author

DDoSolitary commented Mar 1, 2018

@fpqc Well, I mean whether I can use NtCreateFile to create files that differ only in case inside directories marked as case sensitive.

@SvenGroot
Member

We've just put up a blog post covering per-directory case sensitivity. Check it out!
https://blogs.msdn.microsoft.com/commandline/2018/02/28/per-directory-case-sensitivity-and-wsl/

@fpqc

fpqc commented Mar 1, 2018

@SvenGroot Are you discussing shell support with the shell team for the properties dialog?

@therealkenc
Collaborator

Are you discussing shell support with the shell team for the properties dialog?

Curious if they're getting the IEEE 1003.1-2008 band back together. Been a while since mode_t got a new bit.

@therealkenc
Collaborator

@DDoSolitary

I just noticed that in build 17110 there is a command fsutil.exe file setcasesensitiveinfo enable to enable per-directory case-sensitivity. I wonder if it works for NtCreateFile.

Hmmm. No joy looking at fsutil.exe file setCaseSensitiveInfo foo enable with procmon:

"8:06:28.1748674 PM","fsutil.exe","10280","CreateFile","C:\","SUCCESS","Desired Access: Synchronize, Disposition: Open, Options: Directory, Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, Write, AllocationSize: n/a, OpenResult: Opened"
"8:06:28.1748904 PM","fsutil.exe","10280","QueryNameInformationFile","C:\","SUCCESS","Name: \"
"8:06:28.1749028 PM","fsutil.exe","10280","QueryAttributeInformationVolume","C:\","SUCCESS","FileSystemAttributes: Case Preserved, Case Sensitive, Unicode, ACLs, Compression, Named Streams, EFS, Object IDs, Reparse Points, Sparse Files, Quotas, Transactions, 0x3c00600, MaximumComponentNameLength: 255, FileSystemName: NTFS"
"8:06:28.1749142 PM","fsutil.exe","10280","CloseFile","C:\","SUCCESS",""
"8:06:28.1750510 PM","fsutil.exe","10280","CreateFile","C:\Users\there\source\foo","SUCCESS","Desired Access: Read Attributes, Write Attributes, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, AllocationSize: n/a, OpenResult: Opened"
"8:06:28.1750904 PM","fsutil.exe","10280","<Unknown>","C:\Users\there\source\foo","SUCCESS",""
"8:06:28.1758618 PM","fsutil.exe","10280","CreateFile","C:\Windows\System32\en-US\fsutil.exe.mui","SUCCESS","Desired Access: Generic Read, Disposition: Open, Options: , Attributes: n/a, ShareMode: Read, Delete, AllocationSize: n/a, OpenResult: Opened"
"8:06:28.1759302 PM","fsutil.exe","10280","CreateFileMapping","C:\Windows\System32\en-US\fsutil.exe.mui","FILE LOCKED WITH ONLY READERS","SyncType: SyncTypeCreateSection, PageProtection: PAGE_EXECUTE_READ|PAGE_NOCACHE"

"<Unknown>"

You can probably spawn fsutil on every directory as you unwrap the tarball. I'm thinking that will get you there.

But don't. You've put a lot of effort into the project (and props for that), but the right way to unpack the thing is with a static ELF tar on the WSL side, just like the guys in the Premier League are doing it.

@fpqc

fpqc commented Mar 1, 2018

@therealkenc 😢

@DDoSolitary
Author

DDoSolitary commented Mar 1, 2018

@therealkenc Well, it is the official way and a correct way but it's not necessarily the only correct way. I don't want to use it because it forces me to edit the registry only for runtime configuration and distribute bsdtar with my binary. Also, my current way makes features like LxRunOffline move and LxRunOffline duplicate possible.

@Biswa96

Biswa96 commented Mar 1, 2018

I've made a program which can set the case sensitivity. @therealkenc The <Unknown> one is NtSetInformationFile. Here's my code snippet:

	RtlDosPathNameToNtPathName_U(path, &FileName, NULL, NULL);
	InitializeObjectAttributes(&ObjectAttributes, &FileName, 0, NULL, NULL);
	NtCreateFile(&FileHandle, GENERIC_READ | GENERIC_WRITE, &ObjectAttributes, &IoStatusBlock, NULL, 0,
		FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, FILE_OPEN_IF, 0, NULL, 0);

	/* caseSensitive enable = 1; disable = 0; */
	int unknown_var = 71;

	NTSTATUS result = NtSetInformationFile(
		FileHandle, &IoStatusBlock, (void*)&caseSensitive, sizeof(int), unknown_var);

You can find the full code in WSL_Reverse.
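
For comparison, later public Windows headers describe the buffer for information class 71 as a one-field structure rather than a bare int. The sketch below uses stand-in definitions so that it compiles anywhere; on Windows the real ones come from the SDK/WDK headers, and `make_case_sensitive_info` is a hypothetical helper, not part of any API.

```c
#include <stdint.h>

/* Stand-ins mirroring the names used in the public Windows headers. */
#define FILE_CS_FLAG_CASE_SENSITIVE_DIR 0x00000001u
#define FileCaseSensitiveInformation 71 /* FILE_INFORMATION_CLASS value */

typedef struct {
    uint32_t Flags; /* ULONG Flags in the real header */
} FILE_CASE_SENSITIVE_INFORMATION;

/* Hypothetical helper: build the buffer handed to NtSetInformationFile. */
static FILE_CASE_SENSITIVE_INFORMATION make_case_sensitive_info(int enable)
{
    FILE_CASE_SENSITIVE_INFORMATION info;
    info.Flags = enable ? FILE_CS_FLAG_CASE_SENSITIVE_DIR : 0u;
    return info;
}
```

The structure is four bytes wide, so passing an int holding 1 or 0, as the snippet above does, puts the same bytes in the buffer.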

@therealkenc
Collaborator

In the 17095 WDK Insider Preview:

typedef enum _FILE_INFORMATION_CLASS {
    FileDirectoryInformation         = 1,
    FileFullDirectoryInformation,   // 2
    FileBothDirectoryInformation,   // 3
    FileBasicInformation,           // 4
[...]
    FileIdExtdBothDirectoryInformation,      // 63
    FileDispositionInformationEx,            // 64
    FileRenameInformationEx,                 // 65
    FileRenameInformationExBypassAccessCheck, // 66
    FileDesiredStorageClassInformation,      // 67
    FileStatInformation,                     // 68
    FileMemoryPartitionInformation,          // 69
    FileStatLxInformation,                   // 70
    FileCaseSensitiveInformation,            // 71
    FileMaximumInformation
} FILE_INFORMATION_CLASS, *PFILE_INFORMATION_CLASS;

@therealkenc
Collaborator

Be fun to throw a buffer at FileStatLxInformation too and see what comes back. Here's what I think, with a reasonable chance I'm wrong because I can only speculate. The LXATTRB EA lives on, but got some kind of citizenship in NT (outside the WSL lxcore.sys/lxsys.sys drivers). Meaning, possibly, they can (or could) be cached and so forth between Windows and WSL. Maybe.

@fpqc

fpqc commented Mar 2, 2018

@therealkenc Could that be something they are introducing to support ext4 mounting in WSL, or maybe a thing for like vmware to implement WSL support in their shared drive driver? It seems weird that it would be in the driver kit.

Like, you could imagine that is an interface for driver writers to make their linux-style metadata read by the driver and then forwarded into this WSL attribute?

@Biswa96

Biswa96 commented Mar 2, 2018

@fpqc Maybe ext4 mounting is not implemented. When fsutil.exe file setCaseSensitiveInfo is run, GetDriveTypeW() checks that the drive is not a DRIVE_REMOTE and GetVolumeInformationW() checks that the volume is formatted as NTFS or CSVFS.
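
The two checks described above amount to a small predicate. This is a sketch under assumptions: `volume_allows_case_flag` is a hypothetical name, the file system name is taken as a narrow string for portability, and whether fsutil compares the name case-sensitively is not known from this thread.

```c
#include <stdbool.h>
#include <string.h>

#define DRIVE_REMOTE 4u /* GetDriveTypeW() result for network drives */

/*
 * Hypothetical helper modelling the two checks described above. drive_type
 * stands for the GetDriveTypeW() result, fs_name for the file system name
 * reported by GetVolumeInformationW().
 */
static bool volume_allows_case_flag(unsigned drive_type, const char *fs_name)
{
    if (drive_type == DRIVE_REMOTE) {
        return false; /* network drives are rejected up front */
    }
    return strcmp(fs_name, "NTFS") == 0 || strcmp(fs_name, "CSVFS") == 0;
}
```

A local NTFS volume passes, while a network share or a FAT32 stick does not.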

@fpqc

fpqc commented Mar 2, 2018

@Biswa96 It looks like an interface for fs driver writers to make their devices available in WSL, whatever it is.

@therealkenc
Collaborator

therealkenc commented Mar 2, 2018

@fpqc

It looks like an interface for fs driver writers to make their devices available in WSL, whatever it is.

Sort of. That enum is userspace facing (and kernel facing too, of course). What's happened is that someone writing an NT filesystem driver can now, in theory, implement NtQueryInformationFile/NtSetInformationFile with FileStatLxInformation. The FS API expanded a couple of notches. Bear in mind I haven't actually run the thing, and have no idea of the shape of the structure it takes (I couldn't find it in the WDK in a brief look). But from a practical standpoint we are talking about NTFS here for the time being. Pie in the sky future we can gaze, sure, but that's not what this is about.

Anything that has access to NTFS has access to "WSL" of course. DrvFs and LxFs aren't materially that different these days, notwithstanding the (important) fact that with LxFs some state appears to live in the WSL drivers and is thus not accessible to (let's call it) the Windows side. There are other loose ends. Symlinks are always going to be tricky, for example, because the POSIX filesystem semantics and the NT filesystem semantics are still different enough to matter. There are certainly other details too.

There are a couple of hints in the two blog posts. From the first blog:

DrvFs also disables directory entry caching to ensure it always presents the correct, up-to-date information even if a Windows process has modified the contents of a directory.

And the second:

Note: if you change the case sensitive flag on an existing directory while WSL is running, please make sure WSL has no references to that directory. That means no WSL processes may have that directory, or any of its descendants, open.

The first makes DrvFs slower than they'd probably like. The second is a limitation they'd probably rather not have. Liking and getting are different things though.

@fpqc

fpqc commented Mar 2, 2018

@therealkenc I guess, but I/O perf on lxfs isn't anything to read or write home about either. I understand that performance on DrvFS is worse than lxfs because it doesn't cache inodes (I think this is what you were referring to), but I wonder what the big bottleneck in lxfs is (and how much meddling in other parts of Windows they'll be allowed to do to improve it).

Have you done any private investigations?

@therealkenc
Collaborator

I guess, but I/O perf on lxfs isn't anything to read or write home about either.

I have been holding off for two years on asking about that. There has to be a "reason" (good or otherwise), I just haven't quite figured it out. Russ (when he was around) used to allude to the problem but never tipped enough info to tell for sure. There are various things you can cache (dentries, data), and we know lxfs does cache stuff.

One big difference brought up a lot is *unix filesystems can stat a file on a dime, and NTFS cannot. Cygwin doesn't have the ability to cache anything (in the kernel), and they suffer from basically the same operational limitations as LxFs. Cygwin can't meddle with anything. But Cygwin's performance is fine, in the scheme, while LxFs is not. Someone will roll eyes and say "Cygwin is different", and okay they're not wrong. But I've never been able to square a 5x (much more in some scenarios, maybe less in others) performance difference, given WSL p0wns ring zero. They're working on it. If it were easy to fix they would have. Etc etc. Which is why I haven't asked.

@SHwareSystemsDev

@therealkenc

Curious if they're getting the IEEE 1003.1-2008 band back together. Been a while since mode_t got a new bit.

The 1003.1 band is still going. IEEE 1003.1 is at 2017 now, not 2008, and 202x and 203x are on the horizon. Due to procedural requirements, the fixes for some known issues can't be made part of the base until the 203x edition. mode_t will not be gaining bits for 202x, so that it can still be typedef'd as short, which is how some older file systems still supported by some platforms store it on media. This has been discussed a few times.

What the stat struct might get in 202x is new required or optional fields, but presently there is no existing practice I'm aware of, by Microsoft or any other company, suitable for basing a proposal on. So it getting any extensions, like for case insensitivity support, is doubtful. This is partly why the stat struct definition, and mode_t, have been static since the 2001 edition. What practice has been implemented by various file systems and platforms is more of the "works for me" variety instead. The FileStatLxInformation control and FILE_STAT_LX_INFORMATION struct are specific to the two MS file systems, but no one else's, as an example of "works for MS". The same applies to how various Unix/Linux-specific file systems implement features like xattrs or alternate data streams; it isn't just an MS thing.

@therealkenc
Collaborator

I was being sardonic (my bad; that doesn't always come across well in prose).

So it getting any extensions, like for case insensitivity support, is doubtful.

Correct.

"works for me"

Also correct. But YRMV.

@Biswa96

Biswa96 commented Apr 9, 2018

Does the Insider SDK have that FILE_STAT_LX_INFORMATION struct definition?

@SHwareSystemsDev

SHwareSystemsDev commented Apr 9, 2018

@Biswa96
Yes, the full WDK one does. Haven't checked the VS integration dist, but it should. Should be in km\ntifs.h or wdm.h.

@therealkenc
Collaborator

typedef struct _FILE_STAT_LX_INFORMATION {
    LARGE_INTEGER FileId;
    LARGE_INTEGER CreationTime;
    LARGE_INTEGER LastAccessTime;
    LARGE_INTEGER LastWriteTime;
    LARGE_INTEGER ChangeTime;
    LARGE_INTEGER AllocationSize;
    LARGE_INTEGER EndOfFile;
    ULONG FileAttributes;
    ULONG ReparseTag;
    ULONG NumberOfLinks;
    ACCESS_MASK EffectiveAccess;
    ULONG LxFlags;
    ULONG LxUid;
    ULONG LxGid;
    ULONG LxMode;
    ULONG LxDeviceIdMajor;
    ULONG LxDeviceIdMinor;
} FILE_STAT_LX_INFORMATION, *PFILE_STAT_LX_INFORMATION;
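
As a rough illustration of what the Lx fields might carry, the sketch below renders an LxMode value the way `ls -l` would. It assumes, without confirmation from the WDK, that LxMode uses the Linux st_mode layout (file type in the top bits, nine permission bits at the bottom). The `LX_S_*` constants and `format_lx_mode` are hypothetical names introduced here.

```c
#include <stdint.h>
#include <string.h>

/* Linux st_mode file-type constants (octal), as in <sys/stat.h>. */
#define LX_S_IFMT  0170000u
#define LX_S_IFDIR 0040000u
#define LX_S_IFREG 0100000u
#define LX_S_IFLNK 0120000u

/*
 * Hypothetical helper: render an LxMode value like `ls -l` does, assuming
 * LxMode uses the Linux st_mode layout. out must hold at least 11 bytes.
 */
static void format_lx_mode(uint32_t lx_mode, char out[11])
{
    static const char perms[] = "rwxrwxrwx";
    uint32_t type = lx_mode & LX_S_IFMT;
    int i;

    out[0] = type == LX_S_IFDIR ? 'd'
           : type == LX_S_IFLNK ? 'l'
           : type == LX_S_IFREG ? '-'
           : '?';
    memcpy(out + 1, "---------", 10); /* nine dashes plus the terminator */
    for (i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is other-execute. */
        if (lx_mode & (1u << (8 - i))) {
            out[i + 1] = perms[i];
        }
    }
}
```

Under that assumption, an LxMode of 0100644 would be a regular file readable by everyone and writable by its owner.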

@SHwareSystemsDev

SHwareSystemsDev commented Apr 9, 2018

@therealkenc

Also correct. But YRMV.

That's the issue, people's results vary, in features and reliability; so it's difficult to say any of them is a standard platforms have to support.

@Biswa96

Biswa96 commented Apr 9, 2018

So one does not have to add the EA value with the FILE_FULL_EA_INFORMATION struct. As the documentation says:

The value(s) associated with each entry follows the EaName array. That is, an EA's values are located at EaName + (EaNameLength + 1).

Just set LXATTRB extended attributes and add the values from FILE_STAT_LX_INFORMATION struct. Am I right?

This may be useful for @DDoSolitary DDoSolitary/LxRunOffline#36
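
The offset rule quoted from the documentation can be written down directly. This is a portable sketch: `EA_ENTRY` is a stand-in that copies the field order of FILE_FULL_EA_INFORMATION using fixed-width types, and `ea_value` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same field order as FILE_FULL_EA_INFORMATION. */
typedef struct {
    uint32_t NextEntryOffset;
    uint8_t  Flags;
    uint8_t  EaNameLength;   /* excludes the terminating NUL */
    uint16_t EaValueLength;
    char     EaName[1];      /* name, NUL, then the value bytes */
} EA_ENTRY;

/* The value starts right after the name and its terminating NUL. */
static const uint8_t *ea_value(const EA_ENTRY *entry)
{
    return (const uint8_t *)entry->EaName + entry->EaNameLength + 1;
}
```

For an entry named LXATTRB (EaNameLength 7), the value therefore begins 8 bytes past the start of EaName.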

@DDoSolitary
Author

DDoSolitary commented Apr 9, 2018

@Biswa96 yes, it's definitely useful. It means no more NtQuery/SetEaFile and its opaque EA data structure. I believe I can use FILE_STAT_LX_INFORMATION and NtQuery/SetInformationFile to access WSL's filesystem information if they work as expected.

@DDoSolitary
Author

I'm closing this issue because MS has provided fine-grained case sensitivity settings. (I just tried it out and FileCaseSensitiveInformation worked well.)

Biswa96 referenced this issue in 0xbadfca11/lxsstat Apr 20, 2018
@therealkenc
Collaborator

therealkenc commented Jun 15, 2018

I'm closing this issue because MS has provided fine-grained case sensitivity settings.

Might as well. But I'd sure still appreciate an answer to the original question. Which paraphrasing is: How did WSL implement case sensitivity in LxFS prior to the per-directory case sensitivity flags, on a system where HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\kernel\obcaseinsensitive is set to 0. The blog cited states:

As mentioned previously, once support for case sensitivity system-wide has been established, the ability to use files in a case sensitive way is then determined by the file system in use on a per-volume basis.

But of course, on most people's systems, it is set to 1. We know it is possible to coerce case sensitivity, because WSL did it. It is most unfortunate this question has not been answered.

@SvenGroot
Member

SvenGroot commented Jun 15, 2018

WSL used to use a mechanism that allowed us to override the ObCaseInsensitive registry key on a per-thread basis. This option is highly locked down, and cannot be used by normal applications.

Even WSL had to do a number of mitigations to avoid security problems, such as not allowing the use of case sensitivity in the root of a drive. Even then, it was not fool proof, and that's why we eventually abandoned this approach. Now that WSL no longer requires it, we will probably remove per-thread case sensitivity from the system, because there is no way to use it safely.

@therealkenc
Collaborator

This option is highly locked down, and cannot be used by normal applications.

And it wouldn't be. It would be used by a filter driver. What is the mechanism?

@SvenGroot
Member

SvenGroot commented Jun 15, 2018

The mechanism is undocumented, unsafe to use, and will probably be removed in the future. Attempting to use it, even if it was a public API, would be extremely unwise. That's all I'll say about it, sorry. :)

@therealkenc
Collaborator

Take a deep dive in ntdll.dll.

No, that's not where the mechanism lives. ntdll.dll has been deep dived for a quarter century and is well understood. Like Sven was saying, this isn't an 'application' (normal or otherwise) aka .dll thing. It is a .sys thing.

Anyway Sven, do appreciate the reply.

@Biswa96

Biswa96 commented Jun 16, 2018

@therealkenc Here are your sweets, extracted from LxCore.sys. These may override the registry "per-thread-basis":

BOOLEAN LxpDrvFsEnableCaseSensitivityIfNeeded(int a1, int a2) {
  int ThreadInformation = 1;
  if ( !(*(_DWORD *)(a1 + 48) & 8) || *(_DWORD *)(a1 + 80) || a2 && a2 == *(_QWORD *)(*(_QWORD *)(a1 - 112) + 48i64) )
    return FALSE;
  ZwSetInformationThread(handle, (THREADINFOCLASS)43, &ThreadInformation, sizeof(int));
  return TRUE;
}

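NTSTATUS LxpDrvFsRestoreCaseSensitivity(BOOLEAN var) {
  /* Declared out here so the return below is in scope; the initial value is a guess. */
  NTSTATUS result = STATUS_SUCCESS;
  if (var) {
    int ThreadInformation = 0;
    result = ZwSetInformationThread(handle, (THREADINFOCLASS)43, &ThreadInformation, sizeof(int));
  }
  return result;
}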

There are also some other registry values that seem interesting:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem]
"NtfsEnableDirCaseSensitivity"=dword:1

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\kernel]
"obcaseinsensitive"=dword:1

[HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\lxss]
"DrvFsAllowForceCaseSensitivity"=dword:1

@therealkenc
Collaborator

@therealkenc Here are your sweets, extracted from LxCore.sys

Nice work, impressive 🏆. That THREADINFOCLASS second param is, well, odd. You wouldn't | the values like that. But anyway, it works out to (in base-10) 11 | 32 = 43 which is indeed just a little under MaxThreadInfoClass=50. 43 is curiously missing from the public enum (44 is also missing, 45 is ThreadSubsystemInformation).

But for now I think I'm going to heed Sven's warning that (I'm paraphrasing here): "Ken, the thing is fragile as ****. I can barely keep it afloat myself and I work here 50 hours a week and I have access to all the source code. Only some kind of idiot would try."

I'm willing to take his word for it. 😏

@therealkenc
Collaborator

ThreadSubsystemInformation=45. But anything further on this would probably be better served in some kind of system development site like OSR.

@therealkenc
Collaborator

Where did you find that THREADINFOCLASS?

It's in ntddk.h

Can you provide what is the last value of FILE_INFORMATION_CLASS enum in WDK?

Sorry I misread the question. FileCaseSensitiveInformation=71. But like I was saying...

@SHwareSystemsDev

SHwareSystemsDev commented Jun 27, 2018

@therealkenc

No, that's not where the mechanism lives. ntdll.dll has been deep dived for a quarter century and is well understood. Like Sven was saying, this isn't an 'application' (normal or otherwise) aka .dll thing. It is a .sys thing.

Mostly correct about it not being in ntdll, but it starts there. The guts are split between *krnl and the file system .sys drivers, apparently. Somewhere, based on a header grep for "SENSITIVE", !OBJ_CASE_INSENSITIVE passed into Zw/NtCreateFile or Zw/NtOpenFile gets converted to FO_OPENED_CASE_SENSITIVE and driver-specific flags in IRPs for the file system driver to process appropriately. That makes it a per-request consideration, which may or may not take into account symbolic roots in the registry; it is not stored on the filesystem with a FO_CREATED_CASE_SENSITIVE (sic) or equivalent private flag. This is obfuscated further by the client Create/OpenFile routines that interface through ntdll: they call it FILE_FLAG_POSIX_SEMANTICS, use nothing named *SENSITIVE, and the kernel does not fully implement what the term "POSIX semantics" entails. There are unaddressed issues besides case sensitivity; how many I couldn't really say.


7 participants