MSYS/Cygwin performance is extremely low #15

Open
ghost opened this issue Jul 29, 2020 · 32 comments
Labels
Area-File-IO Issue in the IO layers above filesystem (e.g. filter drivers) Area-Filesystem Issue in the NTFS/*FAT filesystem

Comments

@ghost

ghost commented Jul 29, 2020

Environment

Item | Value
OS, Version / Build | Win32NT 10.0.18363.0 Microsoft Windows NT 10.0.0.18363.0
Processor Architecture | ___ (no output? but it's x64)
Processor Type & Model | Intel Broadwell
Memory | 16GB
Storage Type, free / capacity (e.g. C: SSD 128GB / 512GB) | SSD 14.5 GB / 81.6 GB
Relevant apps installed | mingw-w64/msys2

Description

Lots of software uses GNU autoconf or build systems written in POSIX shell. Typically, these use msys/cygwin to run on Windows at all (otherwise they couldn't support Windows). However, performance is extremely low. Essentially, Windows is missing a well-integrated POSIX environment, and the emulation done by Cygwin is, in many situations, extremely slow.

Please note that WSL does not solve the problem in general. It's not well-integrated, but it's more like a VM (WSL2 is literally a VM). A solution that is well-integrated is required. POSIX shell execution is actually only an example. Any project that makes use of POSIX, a standard for portable OS access, suffers from win32's non-orthogonal mess, that makes porting to Windows a nightmare. Even projects supported by Microsoft suffer from this problem. For example, consider: https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/fileio.c https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/w32fd.c.

It should not be necessary for every program to invent a POSIX compat. layer for Windows, which will be slow (as win32 does not provide the required capabilities). This will generally reduce the Windows experience, and it's Microsoft's responsibility to offer a better solution.

Steps to reproduce

A good test is running ffmpeg's configure program. FFmpeg is well-known enough, so no further description is necessary. However, it affects a lot of other projects.

Expected behavior

POSIX shell execution is as fast as on Linux.

Actual behavior

POSIX shell execution is several orders of magnitude slower than native or even virtualized Linux.

@bitcrazed bitcrazed added Area-File-IO Issue in the IO layers above filesystem (e.g. filter drivers) Area-Filesystem Issue in the NTFS/*FAT filesystem labels Jul 29, 2020
@bitcrazed
Contributor

Thanks for posting this issue @wm4, though a polite ask: please file one issue per GitHub Issue - combining issues makes it very difficult to discuss and track over time. Thanks.

First, some context and background for others reading along who may be unfamiliar with some of these technologies:

On WSL

WSL currently comes in two versions:

  • WSL1 - runs unmodified Linux binaries atop a Linux-compatible layer in the NT kernel. All kernel operations are either provided by NT, or by WSL emulating Linux-specific kernel/OS behavior.

  • WSL2 - available in Windows 10 2004 or later, WSL2 runs unmodified Linux binaries in Linux containers (one per distro) atop a Linux kernel hosted in a lightweight VM that can boot from cold in < 2s. Because WSL2 runs atop a real Linux kernel, it can offer ~100% Linux syscall compatibility, and near-native IO perf when accessing files within the distro's filesystem.

Both versions of WSL also provide several useful integration features that enable you to:

  • Run unmodified Linux binaries on Windows, alongside your Windows apps and tools

  • Access your Linux distros' filesystems from Windows, and vice-versa

  • Execute Linux commands, scripts, and binaries from Windows, and vice-versa

The guidance re. WSL is, if you need to run Linux native binaries and tools, and/or build and run code that you plan to deploy & run in Linux environments, then run them in WSL.

On Cygwin

Cygwin delivers a collection of GNU shells and tools ported to run atop Windows/Win32. Cygwin is a great toolset for those who need to run key GNU tools and scripts on Windows, and for cross-platform projects that share the same build system but generate Windows executables and binaries. However, Cygwin does not run unmodified Linux binaries and so you cannot "apt install ... your way to happiness".

On The Issues Described

So, to the issues you describe above:

Building POSIX apps on Windows

If you're able to pass arguments to a build system to emit binaries for a given platform, you may want to explore that as an option rather than relying on ./configure to configure and build based on the environment. I know this isn't always possible, but if it is, one might then be able to build within WSL, but target Win32.

If you DO have to build on Windows for Windows, then why do builds run slower on Windows than on other platforms and why isn't porting easier?

In reverse order ...

POSIX compatibility

It doesn't matter which way one slices or dices it, Windows and POSIX (UNIX, BSD, Linux, etc.) have two very different and orthogonal philosophies, assumptions, architectures, and implementations: In *NIX, everything is a stream; in Windows, everything is an object. In *NIX, systems are constructed by chaining together lots of small tools that "do one thing and do it well"; in Windows, systems are built out of larger, more sophisticated apps and tools. Etc.

These differences manifest EVERYWHERE and, as you point out, can complicate porting from one to the other, especially while maintaining correct behaviors and expected levels of performance.

Performance of POSIX apps on Windows

Performance of POSIX apps on Windows is, indeed, a fundamental issue that is affected by some fundamental differences in the *NIX vs. NT IO subsystems:

For example, in POSIX systems, files and folders are enumerated by first calling opendir() to open the directory, then repeatedly calling readdir() until it returns NULL, and then calling closedir(). If one then needs some/all of the file's attributes (length, last updated date/time, permissions, etc.), one must then call the stat() syscall on each file in question. In most *NIX systems, stat() is practically "free" from a performance perspective, and as a result, is called A LOT!

Windows has no direct equivalent to stat()! Why? The mechanism in Windows to enumerate the contents of a folder is to call FindFirstFile[Ex]() and then repeatedly call FindNextFile() until it returns zero, and then call FindClose(). Similar to POSIX, right? Yes, except in Windows, there's no need to then call stat() on each file in a folder to get its attributes because the file's attributes were already returned by Find[First|Next]File()!

So, if a POSIX app is naively ported to Windows, it can result in a list of files and their properties being enumerated twice!
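To make the double enumeration concrete, here is a minimal Python sketch of the two access patterns. It is an illustration only, not code from Cygwin or any project in this thread: the function names and the demo directory are made up, and Python is used only because its documented os.scandir() behaviour mirrors the point above (on Windows, DirEntry.stat() is normally served from data gathered during enumeration).

```python
import os
import tempfile


def list_sizes_naive(path):
    """POSIX-style pattern: enumerate names, then stat() each one separately."""
    sizes = {}
    for name in os.listdir(path):
        st = os.stat(os.path.join(path, name))  # one extra query per entry
        sizes[name] = st.st_size
    return sizes


def list_sizes_scandir(path):
    """Enumerate once and reuse whatever the directory listing already returned."""
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # On Windows, DirEntry.stat() is normally answered from data cached
            # during enumeration; on POSIX it still costs one stat() per entry.
            sizes[entry.name] = entry.stat().st_size
    return sizes


def make_demo_dir():
    """Create a throwaway directory with a few small files (hypothetical data)."""
    root = tempfile.mkdtemp()
    for i in range(5):
        with open(os.path.join(root, f"file{i}.txt"), "w") as fh:
            fh.write("x" * i)
    return root


if __name__ == "__main__":
    demo = make_demo_dir()
    # Both approaches report the same sizes; only the number of queries differs.
    print(list_sizes_naive(demo) == list_sizes_scandir(demo))
```

Both functions return the same mapping; the difference is only in how many times the filesystem is asked about each entry.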

This is just one example and nicely demonstrates that this is not a simple issue to fix. There are many, MANY more, including how file information is cached, how files are deleted, copied and moved, etc.!

Stay Tuned!

However, don't think we're ignorant of these issues and not doing anything about them!

It's too early to discuss in detail yet, but we are working on a set of improvements to address some of the key, fundamental differences between POSIX and Win32, which we expect will provide substantial performance benefits for many POSIX apps on Windows, as well as many Windows-native apps!

We will share details when we have a better picture of what we'll be delivering and when. Until then, stay tuned!

@driver1998

It's too early to discuss in detail yet, but we are working on a set of improvements to address some of the key, fundamental differences between POSIX and Win32, which we expect will provide substantial performance benefits for many POSIX apps on Windows, as well as many Windows-native apps!

Would it be possible to provide some POSIX-like APIs in the Win32 subsystem? I know Windows once had a POSIX subsystem, but like WSL, being a subsystem means POSIX/Linux apps are separated from Win32 apps. Therefore people can't do things like call POSIX APIs from a Win32 app, or call Win32 APIs from a POSIX app, both of which are enabled by Cygwin.

@ghost
Author

ghost commented Jul 30, 2020

please file one issue per GitHub Issue

I know, this issue is very broad. There are multiple possible reasons as to why Cygwin could be slow. General mismatch between POSIX/win32 is often suspected (especially fork() performance in the case of shell scripts), it could be filesystem performance, it could just be something sub-optimal that Cygwin or win32 do but which was not identified as cause yet. The stat() issue is also something I didn't hear about before.

Which repository would be a better match for filing issues about win32/POSIX impedance mismatches, that are not necessarily about performance, but which affect developers? For now, this repository seems to allow performance-related issues only. So I just made this issue about performance. Such workarounds often end up in bad performance, as seen on Cygwin.

The guidance re. WSL is, if you need to run Linux native binaries and tools, and/or build and run code that you plan to deploy & run in Linux environments, then run them in WSL.

Yes, but it's still just a glorified VM. It doesn't help in the example of FFmpeg, unless you cross-compile to Windows. But there are problems with that (I could go into details). For other types of programs, this isn't feasible, because they need access to windows APIs.

Cygwin delivers a collection of GNU shells and tools ported to run atop Windows/Win32. Cygwin is a great toolset for those who need to run key GNU tools and scripts on Windows, and for cross-platform projects that share the same build system but generate Windows executables and binaries.

That omits the quite important fact that Cygwin is a POSIX environment on top of Windows. It's not just for GNU tools. You can use it to port almost any kind of POSIX-compliant software. If carefully written, an application that didn't even attempt to target Cygwin, will build and run just fine on Cygwin.

Even git for windows appears to use Cygwin, even though it's not a GNU tool. (I didn't look too closely though. I've only seen the dev folder, which seems to be an artifact of Cygwin going a bit too far to pretend to be Unix.)

However, Cygwin does not run unmodified Linux binaries and so you cannot "apt install ... your way to happiness".

No, that isn't Cygwin's goal. However, Cygwin has its own repository of pre-built binaries, which surely made a lot of people happy. In any case, I consider WSL1/2 to be out of scope wrt. this issue.

It doesn't matter which way one slices or dices it, Windows and POSIX (UNIX, BSD, Linux, etc.) have two very different and orthogonal philosophies, assumptions, architectures, and implementations: In *NIX, everything is a stream; in Windows, everything is an object.

I mean, that's certainly true, but on the other hand there are a lot of staggering similarities. For example, win32's HANDLE is extremely similar to a UNIX FD. At least on Linux, FDs are used whenever userspace needs a handle to a kernel object. There are many types of FDs that are not associated with any kind of byte stream (consider device files, memfd, epoll, signalfd, pidfd, listener-only sockets). HANDLE on win32 is surprisingly similar. It is used for file I/O, I/O completion ports (vaguely equivalent to epoll on a conceptual level), threads, and even devices (equivalent to device files on UNIX).

Microsoft's libc (MSVCRT) emulates some POSIX primitives to some degree. For example, the open/read/write functions, which all use UNIX FDs. And indeed, the libc just maps FDs to HANDLEs in a table. Portable programs can (mostly) just use open instead of CreateFile. But this "emulation" often has problems, so advanced portable programs keep doing similar stuff (like https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/w32fd.c).

(And where win32 gets a real pain is because sockets are neither HANDLEs nor emulated FDs. They're their own thing, and it's awful. So awful.)

My point is, you shouldn't have to do this when porting to Windows.

Sorry, I guess that got quite offtopic wrt. the performance topic. Though going through these layers will also cost performance, and they require making a lot of choices that might impact performance.

A nice example which I've seen in libusb: they use win32 "events" to emulate wakeup pipes. Their central mainloop is a poll() call, which waits on all wakeup pipes. But unlike poll, WaitForMultipleObjects has a limit on the number of objects it can wait on. So they start an additional thread for every 64 objects, and at the end of the wait, they destroy the threads. Every time. Man, I sure hope I never run into this case on Windows with my libusb CLI program. Code: https://github.com/libusb/libusb/blob/master/libusb/os/poll_windows.c#L239
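The batching that the 64-object limit forces can be sketched in a few lines of Python. This is not libusb's code: the helper name and the integer stand-ins for handles are hypothetical, and the sketch only shows how a handle list has to be partitioned so that each group fits into a single wait call (each group would then need its own waiter thread).

```python
MAXIMUM_WAIT_OBJECTS = 64  # per-call limit of WaitForMultipleObjects


def split_into_wait_groups(handles, limit=MAXIMUM_WAIT_OBJECTS):
    """Partition handles into groups that each fit into one wait call."""
    return [handles[i:i + limit] for i in range(0, len(handles), limit)]


if __name__ == "__main__":
    fake_handles = list(range(150))  # hypothetical stand-ins for event handles
    groups = split_into_wait_groups(fake_handles)
    # One waiter thread per group would be required.
    print([len(g) for g in groups])  # prints [64, 64, 22]
```

With 150 objects that is three groups, so three waiting threads for what is a single poll() call on POSIX.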

Windows has no direct equivalent to stat()! Why? The mechanism in Windows to enumerate the contents of a folder is to call FindFirstFileEx and then repeatedly call FindNextFile() until it returns zero, and then call FindClose(). Similar to POSIX, right? Yes, except in Windows, there's no need to then call stat() on each file in a folder to get its attributes because the file's attributes were already returned by Find[First|Next]File()!

That doesn't seem to be ideal. This probably affects native windows programs as well. Listing directory contents isn't the only purpose of stat(). Often, you may want to run it on a single file, or on a separate list of files (maybe "git status"? I don't know), so I wonder what native win32 programs do in these cases.

@bitcrazed
Contributor

One issue at a time

We'd prefer if specific issues are filed, e.g.

  • "Enumerating files in a folder takes longer on Windows than on Linux"
  • "Forking sub/worker processes is faster on Linux than on Windows"
  • "Windows should better support POSIX style ____"

Assuming underlying reasons for an issue should be avoided at the outset - the more specific and reproducible the issue, the better.

On POSIX issues

We welcome the discussion about POSIX compat in this repo - we intend to broaden our scope out to include such issues anyhow. The perf caveat just indicates that we'd prefer perf issues at this time as a way of gating input to a level we can handle as we build our team and skills here.

Literally on a call discussing this as I type ;)

On WSL

The point re WSL was that WSL1 was not a VM, WSL 2 uses our current VM infrastructure, but the underlying infrastructure should be considered an internal implementation detail.

But yes, WSL provides a parallel POSIX / Linux runtime environment - it doesn't add POSIX capability to Windows per se. We're actively working to figure out how we can better support POSIX apps & runtimes on Windows itself in the future. Stay tuned for more info.

On FDs vs. Handles

Except in specific cases, handles are to be considered per-process, unique, and opaque. They should (generally) not be shared across processes, and one should avoid assuming underlying layout and structure of the handle's internal implementation. There are also several different underlying types of HANDLE on Windows (e.g. file handles, GDI handles, Registry handles, Console handles), but again, they should simply be considered as unique and opaque.

FDs describe files and are unique to a machine, so may be shared across processes. FDs index into file table entries which index into inodes - a fact that is often assumed and utilized for better, or for worse.

On stat()

Of course, Win32 provides GetFileAttributesEx to query the attributes for a specific file, but it isn't as cheap as stat() is in POSIX based systems. On Windows this isn't a major perf issue because code doesn't HAVE to call stat() on each file in a list to obtain its attributes since those attributes are already returned during enumeration.

In our testing, individual or small batches of calls to stat(), which generally translate to calls to GetFileAttributesEx() make little perf impact, but code which naively issues storms of stat() calls (often repeatedly and unnecessarily several times in a call stack), can show up as a major perf issue.
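As a rough illustration of what a "storm" of redundant stat() calls looks like, and of the usual remedy, consider the Python sketch below. Everything in it (CountingStat, describe_naive, describe_once) is hypothetical and exists only to count queries; it does not measure GetFileAttributesEx() cost and is not taken from any real code base.

```python
import os
import tempfile


class CountingStat:
    """Callable wrapper around os.stat() that counts how often it is invoked."""

    def __init__(self):
        self.calls = 0

    def __call__(self, path):
        self.calls += 1
        return os.stat(path)


def describe_naive(path, stat_fn):
    """Re-query the file for every attribute: three calls for one logical lookup."""
    size = stat_fn(path).st_size
    mtime = stat_fn(path).st_mtime
    mode = stat_fn(path).st_mode
    return size, mtime, mode


def describe_once(path, stat_fn):
    """Query once and reuse the result for every attribute."""
    st = stat_fn(path)
    return st.st_size, st.st_mtime, st.st_mode


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    naive_counter, single_counter = CountingStat(), CountingStat()
    describe_naive(tmp_path, naive_counter)
    describe_once(tmp_path, single_counter)
    print(naive_counter.calls, single_counter.calls)
    os.remove(tmp_path)
```

The two functions return identical results; on a platform where each query is comparatively expensive, only the second pattern scales.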

In closing

So, to summarize:

  1. We hear you and understand & appreciate you raising the issues above
  2. We are actively working on some of the root causes of many of the issues and scenarios discussed above. We will share details when appropriate to do so
  3. We encourage you to file individual specific issues with repro steps if possible in order to help identify and focus on issues that we can action into improvements

Many thanks.

@bitcrazed
Contributor

@driver1998 Great question: VC++ already implements many POSIX APIs which are implemented to call Win32 APIs. What we lack in Win32 are some of the fundamental APIs that behave and perform as they do on POSIX systems. This is an area we're actively exploring as I type.

@ghost
Author

ghost commented Jul 30, 2020

Thanks, I appreciate that MS is working on this.

Though I'm getting confused about the following:

Except in specific cases, handles are to be considered per-process, unique, and opaque. They should (generally) not be shared across processes, and one should avoid assuming underlying layout and structure of the handle's internal implementation. There are also several different underlying types of HANDLE on Windows (e.g. file handles, GDI handles, Registry handles, Console handles), but again, they should simply be considered as unique and opaque.

There must be some sort of misunderstanding. win32 HANDLEs are not necessarily unique or process-local:

https://docs.microsoft.com/en-us/windows/win32/api/handleapi/nf-handleapi-duplicatehandle
https://docs.microsoft.com/en-us/windows/win32/sysinfo/handle-inheritance

Of course the HANDLE value itself will be different, but it still refers to the same kernel object.

FDs describe files and are unique to a machine, so may be shared across processes. FDs index into file table entries which index into inodes - a fact that is often assumed and utilized for better, or for worse.

A file descriptor is just an integer that can be used in a single process only. If you open() a file in one process, you can't use the same integer value in another process to perform a read(). FDs can be shared by fork() (then the integer value stays actually the same), or by sendmsg() when using unix domain sockets (the integer value may change in the target process).

A FD doesn't describe files either. A FD returned by socket() refers to an object in the network stack (often a network connection), a FD returned by memfd_create() refers to a block of memory, and a FD returned by epoll_create() doesn't even reference any kind of resources, just a specific kernel management object. It's possible that the Linux kernel has some sort of inode object per FD internally, but that's just an opaque implementation detail.
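A small Python sketch of the "different integer, same kernel object" point, using os.dup() (dup() on POSIX) as a stand-in. The function name is hypothetical, and the comparison with DuplicateHandle is only an analogy: the sketch shows that two distinct descriptor values can share one open file, including its file offset.

```python
import os
import tempfile


def demo_shared_descriptor():
    """Show two different FD numbers referring to one open file."""
    fd, path = tempfile.mkstemp()
    try:
        dup_fd = os.dup(fd)     # new integer value, same underlying open file
        os.write(fd, b"hello")  # advances the shared file offset by 5 bytes
        offset_seen_by_dup = os.lseek(dup_fd, 0, os.SEEK_CUR)
        os.close(dup_fd)
        return fd != dup_fd, offset_seen_by_dup
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo_shared_descriptor())  # prints (True, 5)
```

The write through the first descriptor is visible as an offset change through the second, which is what "refers to the same kernel object" means in practice.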

@nmoinvaz

nmoinvaz commented Jul 30, 2020

What we lack in Win32 are some of the fundamental APIs that behave and perform as they do on POSIX systems.

There are some functions that are just missing, to name a few:
opendir,readdir,closedir,fsync,gettimeofday,getopt,getopt_long,getopt_long_only,strcasecmp

It seems like it would be an easy thing for Microsoft to add these and other missing functions compared to the amount of work it causes for developers all over the world.

@bitcrazed
Contributor

@nmoinvaz Great point - we'll definitely discuss this with the VC libs team. It'd be great if we can close the gap between our current POSIX API support and modern-day POSIX API reality, esp. if there's a pretty close mapping between, for example opendir() / FindFirstFile() or fsync() / FlushFileBuffers().
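For a sense of what such a thin mapping looks like from the caller's side, Python's os.fsync() is one existing example: per its documentation it calls fsync() on Unix and the CRT's _commit() on Windows. The sketch below is only an illustration with a hypothetical helper name; it says nothing about how the VC libs team would implement anything.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes and ask the OS to flush them to the storage device."""
    with open(path, "wb") as fh:
        fh.write(data)
        fh.flush()             # push Python's user-space buffer to the OS
        os.fsync(fh.fileno())  # fsync() on Unix, _commit() on Windows


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "out.bin")  # hypothetical path
    durable_write(target, b"abc")
    print(os.path.getsize(target))
```

The calling code is identical on both platforms; the portability cost is paid once, inside the wrapper.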

@ghost
Author

ghost commented Jul 31, 2020

Indeed, as I have pointed out, a lot of programs have such wrappers. Often they even replace wrappers that already exist in the CRT, for implementation quality reasons. MinGW-w64 also has a bunch of these. (I wonder whether we can get an issue about this topic somewhere, without the focus on performance considerations, which was just my way to make this not out of scope, to be honest.)

@bitcrazed
Contributor

@wm4 LOL 😁 Don't worry about the perf scoping right now - you're spot-on above in your observation that some of the POSIX API differences do impact perf, so you're in-scope. Plus we absolutely do plan on broadening scope of this repo to discuss developer productivity and other scenarios too - just wanted to gate the repo at launch so that we weren't deluged at the start 😜

@avih

avih commented Aug 5, 2020

though a polite ask: please file one issue per GitHub Issue

An off-topic comment, pardon the irony of making it even more meta, but while that quote is definitely true pretty much everywhere, this approach can fail to grasp some bigger pictures.

With big-picture issues (which I do consider this one to be), there's also a value IMHO in being able to discuss them as a whole, rather than discussing each micro-issue on its own.

For what it's worth, personally I find the issue itself, the responses, and the discussion exceptionally on (this) topic and to the point, despite the seeming impossibility of doing that.

I applaud this discussion so far and all sides which take part in it, and hope to see other big-picture issues discussed as beautifully as here.

@orlando2378

orlando2378 commented Oct 5, 2020

Thanks for the very interesting thread.

We are working on a project with very similar issues, trying to run some linux libraries (with lots of POSIX native calls) on windows.

At first we went through WSL1+Docker and while we were okay with the lower performance, as @wm4 described, the solution is not well integrated considering deployment at scale of the application.

In order to provide better integration, we went down the path of compiling and running the libraries on Windows using Cygwin, with quite big performance issues and not a few headaches.

I know that the goals of Cygwin and WSL differ, and that the historical differences between POSIX apps and Windows make the integration anything but easy, but at a higher level, what's Windows' answer to easily and tightly integrating Linux binaries in your Win application? @bitcrazed Will WSL2 answer this need somehow?

Again thanks for all the interesting points addressed here.

@bitcrazed
Contributor

Hi @orlando2378 - thanks for sharing. Could I ask what the major perf issues were that you found when porting your Linux libraries to Windows?

The goal of WSL (regardless of version) is primarily to provide an environment in which you can run unmodified Linux binaries alongside all your favorite Windows apps and tools.

It is NOT a goal of WSL to enable one to build apps that contain Linux libs hosted and running in WSL within a Windows app process ... in fact, that'd be prohibitive in so many ways as to be impossible.

If you have code in a Linux lib project and want to reuse that code on Windows, then building it with MSYS/Cygwin is a great first step. If that code has perf etc. issues on Windows, you may need to adjust its implementation to better adapt to Windows' architecture/behaviors.

We are keen to figure out where we may be able to expose additional features in Windows that better support POSIX apps, but note that this will take some time to happen.

@orlando2378

@bitcrazed Thanks for the prompt reply. The main issue we identified is very poor performance when using multithreading. By disabling it, we actually run faster than when enabled. It seems like a common issue using Cygwin unfortunately.

Indeed we would need to adjust the implementation to adapt to Windows' needs, but that could require quite some work, especially on big projects, defeating a bit the whole purpose of having a POSIX compatibility layer in the first place. (I know, too idealistic :))

Is better support for POSIX apps on the near-future Windows roadmap, or is it something more long term?

@bitcrazed
Contributor

bitcrazed commented Oct 6, 2020

@orlando2378

Without knowing anything about the nature of the perf issues you're seeing when "using multithreading" it's difficult to know if the root cause is simply in MSYS' implementation of threading, inherent perf issues mapping POSIX threads to Windows threads, perf issues in Windows threading, or something else.

We'd love it if you could file an issue detailing specifically what you're seeing with an easy to recreate repro case, etc. to help us narrow-down the root cause of the issue.

@ghost
Author

ghost commented Oct 6, 2020

Uh what? POSIX threading is quite straight-forward and simple. The only problem I see is that win32 adds weird requirements, possibly has worse scheduling and worse startup performance than Linux.

@microsoft microsoft deleted a comment from Eli-Black-Work Oct 7, 2020
@Eli-Black-Work

@bitcrazed Haha, okay, no worries. Sorry; hard to read tone through the internet sometimes 🙂

@insinfo

insinfo commented Oct 22, 2023

Is there anything new regarding better POSIX compatibility in the Windows kernel and C++ runtime?

@jcrben

jcrben commented Nov 10, 2023

@bitcrazed one thing that's interesting is that as I've shifted over from MacOS to Windows - drawn by WSL as well as the broader Windows ecosystem - I've found myself using msys2 / git-for-windows / cygwin a lot. I still want my underlying host for the VM to be rock-solid and useful for scripts and services and I want those comfortable linux tools available in Windows.

My main ask is that Windows just consider Cygwin as it makes updates so as not to break existing functionality. It's not replaced or deprecated by WSL for me.

The message I'm getting here from the replies is that this is something that you all are thinking about which is encouraging.

In perusing the commits to cygwin, I noticed this commit for example: Cygwin: Adjust CWD magic to accommodate for the latest Windows previews. It's nice that the git-for-windows maintainer @dscho works for Microsoft and submitted that patch but hopefully this is on more than just them. People outside of Microsoft have limited ability and motivation to make patches for "magic" updates to Windows.

@bitcrazed
Contributor

Hey Ben. I left Microsoft last March, and returned back to the UK to try out this thing folks refer to as "retirement", so am not able to drive this issue internally any longer. However, the awesome @marcpems @snickler and others are working on a bunch of stuff that will help improve MSYS2 on Windows.

Also, the new Windows Developer Drive was conceived in large part to address the POSIX file IO perf issues I discuss above and should deliver very sizeable perf improvements when running POSIX workloads & scripts on Windows itself.

Rest assured that the team are working on improving the performance of many POSIX-first apps, tools, libs, etc. when running on Windows. Do file additional new issues, esp. if you can provide repro cases to demonstrate the biggest offenders - this will be super-useful to the team when trying to diagnose and remedy.

Thanks for your continued patience and support.

@sskras

sskras commented Nov 10, 2023

Although the project is still pre-alpha, an interested person could just try running the Midipix environment:
https://github.com/lalbornoz/midipix_build#1-what-is-midipix-and-how-is-it-different

Currently building it requires Linux.

It also requires a secret reference to a temporary and small code repo, which can be obtained by chatting on the #midipix IRC channel on Libera.chat.

Also I could try to share my own build from 2022.11.18 (if a person happens to trust that) via some web means:

image

In general, it uses NTAPI instead of WinAPI and is like 3-6 times faster than Cygwin.

@AdamBraden
Collaborator

Wanted to update this thread: last year the Windows filesystem team worked on new APIs to improve the perf of binaries that have a POSIX background and rely heavily upon stat() behavior, with the goal of minimizing the work required to port the code while still achieving great perf on Windows. As noted above, this is called out as a pain point when porting to Windows. With this new API, Windows no longer needs to open the file, and thus perf is greatly improved. These new APIs are available in the Windows Insiders builds, including the upcoming Windows 11 24H2 and Windows Server 2025 releases. You can find the headers in the Windows Insiders Preview SDK:
GetFileInformationByName function (winbase.h) - Win32 apps | Microsoft Learn
FILE_STAT_BASIC_INFORMATION - Windows drivers | Microsoft Learn

Note that we've worked with a few OSS repos to take advantage of this already, notably Python and libuv (which is used by NodeJS, CMake, etc).
• Python - gh-99726: Improves correctness of stat results for Windows, and uses faster API when available
• libuv - win,fs: use the new Windows fast stat API

@bitcrazed
Contributor

@AdamBraden HUGE thanks to you and the whole team involved in continuing to implement and deliver these important improvements!! It's awesome to see the features that so many across the company worked together to deliver finally arrive 😀🎉🥳

@jcrben

jcrben commented Sep 20, 2024

@marcpems @snickler @bitcrazed random question - did some googling and couldn't find a straight answer - is there any movement in the Dev Drive world or elsewhere to align the file locking behavior with Linux/Unix? I feel like the differences there create a significant amount of friction. There may be other things different which create friction also besides the performance or locking. For example if I could mount a dev drive with an ext4 filesystem that could help?

@bitcrazed
Contributor

@jcrben - Having worked with several teams across Microsoft that have made improvements across NTFS, Defender, and many x-plat libs & tools to improve scenarios impacted by differences in how file/folder locking is handled between *NIX and Windows, I can say that this is not a simple problem to solve.

Alas, Microsoft cannot simply make Windows adhere to *NIX file locking semantics, just as the *NIX world cannot simply decide to default to Windows' file locking semantics: Doing either would break each ecosystem - many apps/tools/systems depend upon the file locking semantics of their default ecosystem.

Instead, if you do find that file locking semantics cause problems, do file issues with the owner of the app/tool/system impacted - from there, they should be able to chase-down solutions to the specific problem.

Specifically w.r.t. WSL, when operating on files within a distro's filesystem, WSL honours *NIX locking semantics. You may see odd behaviour if accessing the Windows filesystem from within a WSL distro, but the team have gone to considerable lengths to handle many edge-cases and to "do the right thing"™ when interopping between filesystems. If you find issues here, do file an issue in the WSL GitHub Repo and/or ping @craigloewen-msft to discuss.

@fithisux

@bitcrazed Are these changes documented somewhere for projects like Cygwin/MSYS2 to take advantage? I also do not directly use WSL (only through Rancher/Podman).

@fithisux

Although the project is still pre-alpha, an interested person could just try running the Midipix environment: https://github.com/lalbornoz/midipix_build#1-what-is-midipix-and-how-is-it-different

Currently building it requires Linux.

It also requires a secret reference to a temporary and small code repo, which can be obtained by chatting on the #midipix IRC channel on Libera.chat.

Also I could try to share my own build from 2022.11.18 (if a person happens to trust that) via some web means:

image

In general, it uses NTAPI instead of WinAPI and is like 3-6 times faster than Cygwin.

Do you know @sskras why they are so secretive and do not have regular builds?

Also there is Superconfigure+Cosmopolitan that are also very performant. Still not buildable under windows, still no package manager but they release builds.

@sskras

sskras commented Oct 24, 2024

@fithisux commented
5 hours ago:

Also I could try to share my own build from 2022.11.18 (if a person happens to trust that) via some web means:
image
In general, it uses NTAPI instead of WinAPI and is like 3-6 times faster than Cygwin.

Do you know @sskras why they are so secretive and do not have regular builds?

I am not exactly sure. It just seems to be a strategy chosen by the main developer. The project lacks manpower (manhours), and the leader hesitates to publish untested changes in an anonymous manner. IIUC, that's why the project asks every user to identify themself and to become a bit of a developer (and build the project on their own, by cross-compiling it from Linux). Also it might be that the lead waits for / would be fine with a person who would provide regular builds on their own (for the initial phase).

It also might have to do with the fact of the environment already being used as a commercially supported solution, though that's just my personal guess:
https://sysdeer.net/#clients

@fithisux

Thank you @sskras. For many years I was excited about this project, but the lack of Cygwin cross-compilation kept me far.

But, hey, it's open source. So we should help.

@sskras

sskras commented Oct 24, 2024

@fithisux wrote 12 hours ago:

the lack of Cygwin cross-compilation kept me far.

Oh, that's a nice idea – fixing bugs that prevent Midipix from being built on Cygwin. Thanks, will try.

@Eli-Black-Work

@bitcrazed Just wanted to say "happy retirement!", and thanks for your work over the years :)

@bitcrazed
Contributor

@bitcrazed Just wanted to say "happy retirement!", and thanks for your work over the years :)

Awww ... thanks so much! Am very flattered 😃 ❤

Even though I am now "retired" I am still watching and participating here and there if I can be of help. Alas though, I no longer have a hotline into the many teams I used to be in touch with to find & fix various perf related issues. However, others like @AdamBraden continue the hard work with what little "spare" time he has 😜
