MSYS/Cygwin performance is extremely low #15
Comments
Thanks for posting this issue @wm4, though a polite ask: please file one issue per GitHub Issue - combining issues makes it very difficult to discuss and track over time. Thanks.
First, some context and background for others reading along who may be unfamiliar with some of these technologies:
On WSL
WSL currently comes in two versions:
Both versions of WSL also provide several useful integration features that enable you to:
The guidance re. WSL is: if you need to run Linux-native binaries and tools, and/or build and run code that you plan to deploy & run in Linux environments, then run them in WSL.
On Cygwin
Cygwin delivers a collection of GNU shells and tools ported to run atop Windows/Win32. Cygwin is a great toolset for those who need to run key GNU tools and scripts on Windows, and for cross-platform projects that share the same build system but generate Windows executables and binaries. However, Cygwin does not run unmodified Linux binaries and so you cannot "
On The Issues Described
So, to the issues you describe above:
Building POSIX apps on Windows
If you're able to pass arguments to a build system to emit binaries for a given platform, you may want to explore that as an option rather than relying on If you DO have to build on Windows for Windows, then why do builds run slower on Windows than on other platforms and why isn't porting easier? In reverse order ...
POSIX compatibility
It doesn't matter which way one slices or dices it, Windows and POSIX (UNIX, BSD, Linux, etc.) have two very different and orthogonal philosophies, assumptions, architectures, and implementations: In *NIX, everything is a stream; in Windows, everything is an object. In *NIX, systems are constructed by chaining together lots of small tools that "do one thing and do it well"; in Windows, systems are built out of larger, more sophisticated apps and tools. Etc. These differences manifest EVERYWHERE and, as you point out, can complicate porting from one to the other, especially while maintaining correct behaviors and expected levels of performance.
Performance of POSIX apps on Windows
Performance of POSIX apps on Windows is, indeed, a fundamental issue that is affected by some fundamental differences in the *NIX vs. NT IO subsystems: For example, in POSIX systems, files and folders are enumerated by first collecting a list of files by calling Windows has no direct equivalent to So, if a POSIX app is naively ported to Windows, it can result in a list of files and their properties being enumerated twice! This is just one example and nicely demonstrates that this is not a simple issue to fix. There are many, MANY more, including how file information is cached, how files are deleted, copied and moved, etc.!
Stay Tuned!
However, don't think we're ignorant of these issues and not doing anything about them! It's too early to discuss in detail yet, but we are working on a set of improvements to address some of the key, fundamental differences between POSIX and Win32, which we expect will provide substantial performance benefits for many POSIX apps on Windows, as well as many Windows-native apps! We will share details when we have a better picture of what we'll be delivering and when. Until then, stay tuned! |
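The double-enumeration pattern described in the comment above can be sketched in Python. This is only an illustration, not what any particular ported tool does: `list_then_stat` collects names and then queries each file separately, while `scan_once` uses `os.scandir`, whose entries can reuse metadata returned by the directory enumeration itself where the OS provides it (as Windows does). The function names and the demo directory are hypothetical.

```python
import os
import tempfile


def list_then_stat(path):
    """Two-step pattern: get the names first, then stat() every name separately."""
    sizes = {}
    for name in os.listdir(path):
        sizes[name] = os.stat(os.path.join(path, name)).st_size
    return sizes


def scan_once(path):
    """Single-pass pattern: ask each directory entry for its metadata."""
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # On Windows this is answered from data cached during enumeration;
            # on POSIX systems it still costs one stat call per entry.
            sizes[entry.name] = entry.stat().st_size
    return sizes


def make_demo_dir(file_count=5):
    """Create a temporary directory holding small files of known sizes (demo only)."""
    path = tempfile.mkdtemp()
    for i in range(file_count):
        with open(os.path.join(path, f"file_{i}.txt"), "wb") as handle:
            handle.write(b"x" * i)
    return path


if __name__ == "__main__":
    demo = make_demo_dir()
    print(list_then_stat(demo) == scan_once(demo))
```

Both functions return the same mapping; the difference is only in how many times the filesystem is asked for the same information.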
Would it be possible to provide some POSIX-like APIs in the Win32 subsystem? I knew Windows once had a POSIX subsystem, but like WSL, being a subsystem means POSIX/Linux apps are separated from Win32 apps. Therefore people can't do things like call POSIX APIs from Win32 app, or call Win32 APIs from POSIX app, both of which are enabled by cygwin. |
I know, this issue is very broad. There are multiple possible reasons as to why Cygwin could be slow. General mismatch between POSIX/win32 is often suspected (especially fork() performance in the case of shell scripts), it could be filesystem performance, it could just be something sub-optimal that Cygwin or win32 do but which was not identified as cause yet. The Which repository would be a better match for filing issues about win32/POSIX impedance mismatches, that are not necessarily about performance, but which affect developers? For now, this repository seems to allow performance-related issues only. So I just made this issue about performance. Such workarounds often end up in bad performance, as seen on Cygwin.
Yes, but it's still just a glorified VM. It doesn't help in the example of FFmpeg, unless you cross-compile to Windows. But there are problems with that (I could go into details). For other types of programs, this isn't feasible, because they need access to windows APIs.
That omits the quite important fact that Cygwin is a POSIX environment on top of Windows. It's not just for GNU tools. You can use it to port almost any kind of POSIX-compliant software. If carefully written, an application that didn't even attempt to target Cygwin, will build and run just fine on Cygwin. Even git for windows appears to use Cygwin, even though it's not a GNU tool. (I didn't look too closely though. I've only seen the
No, that isn't Cygwin's goal. However, Cygwin has its own repository of pre-built binaries, which surely made a lot of people happy. In any case, I consider WSL1/2 to be out of scope wrt. this issue.
I mean, that's certainly true, but on the other hand there are a lot of staggering similarities. For example, win32's HANDLE is extremely similar to a UNIX FD. At least on Linux, FDs are used whenever userspace needs a handle to a kernel object. There are many types of FDs that are not associated with any kind of byte stream (consider device files, memfd, epoll, signalfd, pidfd, listener-only sockets). HANDLE on win32 is surprisingly similar. It is used for file I/O, I/O completion ports (vaguely equivalent to epoll on a conceptual level), threads, and even devices (equivalent to device files on UNIX). Microsoft's libc (MSVCRT) emulates some POSIX primitives to some degree. For example, the open/read/write functions, which all use UNIX FDs. And indeed, the libc just maps FDs to HANDLEs in a table. Portable programs can (mostly) just use open instead of CreateFile. But this "emulation" often has problems, so advanced portable programs keep doing similar stuff (like https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/w32fd.c). (And where win32 gets a real pain is because sockets are neither HANDLEs nor emulated FDs. They're their own thing, and it's awful. So awful.) My point is, you shouldn't have to do this when porting to Windows. Sorry, I guess that got quite offtopic wrt. the performance topic. Though going through these layers will also cost performance, and they require making a lot of choices that might impact performance. A nice example which I've seen in libusb: they use win32 "events" to emulate wakeup pipes. Their central mainloop is a
That doesn't seem to be ideal. This probably affects native windows programs as well. Listing directory contents isn't the only purpose of |
One issue at a time
We'd prefer if specific issues are filed, e.g.
Assuming underlying reasons for an issue should be avoided at the outset - the more specific and reproducible the issue, the better.
On POSIX issues
We welcome the discussion about POSIX compat in this repo - we intend to broaden our scope out to include such issues anyhow. The perf caveat just indicates that we'd prefer perf issues at this time as a way of gating input to a level we can handle as we build our team and skills here. Literally on a call discussing this as I type ;)
On WSL
The point re WSL was that WSL1 was not a VM, WSL 2 uses our current VM infrastructure, but the underlying infrastructure should be considered an internal implementation detail. But yes, WSL provides a parallel POSIX / Linux runtime environment - it doesn't add POSIX capability to Windows per se. We're actively working to figure out how we can better support POSIX apps & runtimes on Windows itself in the future. Stay tuned for more info.
On FDs vs. Handles
Except in specific cases, handles are to be considered per-process, unique, and opaque. They should (generally) not be shared across processes, and one should avoid assuming underlying layout and structure of the handle's internal implementation. There are also several different underlying types of HANDLE on Windows (e.g. file handles, GDI handles, Registry handles, Console handles), but again, they should simply be considered as unique and opaque. FDs describe files and are unique to a machine, so may be shared across processes. FDs index into file table entries which index into inodes - a fact that is often assumed and utilized for better, or for worse.
On
|
@driver1998 Great question: VC++ already implements many POSIX APIs which are implemented to call Win32 APIs. What we lack in Win32 are some of the fundamental APIs that behave and perform as they do on POSIX systems. This is an area we're actively exploring as I type. |
Thanks, I appreciate that MS is working on this. Though I'm getting confused about the following:
There must be some sort of misunderstanding. win32 HANDLEs are not necessarily unique or process-local: https://docs.microsoft.com/en-us/windows/win32/api/handleapi/nf-handleapi-duplicatehandle Of course the HANDLE value itself will be different, but it still refers to the same kernel object.
A file descriptor is just an integer that can be used in a single process only. If you A FD doesn't describe files either. A FD returned by |
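The point about duplicated handles can be shown on the descriptor side with a minimal Python sketch. It assumes only the standard `os.dup` behavior: the duplicate is a different integer, yet both integers refer to the same open file, so they share a single file offset. The function name is hypothetical.

```python
import os
import tempfile


def dup_shares_offset():
    """Write through one descriptor, then read the offset through its duplicate."""
    fd, path = tempfile.mkstemp()
    try:
        dup_fd = os.dup(fd)  # new integer, same underlying open file
        try:
            os.write(fd, b"hello")
            # The offset moved by the write above is visible via the duplicate.
            offset_via_dup = os.lseek(dup_fd, 0, os.SEEK_CUR)
            return fd != dup_fd, offset_via_dup
        finally:
            os.close(dup_fd)
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(dup_shares_offset())
```

The integer value is process-local bookkeeping; the object it names is what matters, which is the same relationship the DuplicateHandle documentation linked above describes for HANDLEs.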
There are some functions that are just missing, to name a few: It seems like it would be an easy thing for Microsoft to add these and other missing functions compared to the amount of work it causes for developers all over the world. |
@nmoinvaz Great point - we'll definitely discuss this with the VC libs team. It'd be great if we can close the gap between our current POSIX API support and modern-day POSIX API reality, esp. if there's a pretty close mapping between, for example |
Indeed, as I have pointed out, a lot of programs have such wrappers. Often they even replace wrappers that already exist in the CRT, for implementation quality reasons. MinGW-w64 also has a bunch of these. (I wonder whether we can get an issue about this topic somewhere, without the focus on performance considerations, which was just my way to make this not out of scope, to be honest.) |
@wm4 LOL 😁 Don't worry about the perf scoping right now - you're spot-on above in your observation that some of the POSIX API differences do impact perf, so you're in-scope. Plus we absolutely do plan on broadening scope of this repo to discuss developer productivity and other scenarios too - just wanted to gate the repo at launch so that we weren't deluged at the start 😜 |
An off-topic comment, pardon the irony of making it even more meta, but while that quote is definitely true pretty much everywhere, this approach can fail to grasp some bigger pictures. With big-picture issues (which I do consider this one to be), there's also value IMHO in being able to discuss them as a whole, rather than discussing each micro-issue on its own. For what it's worth, personally I find the issue itself, the responses, and the discussion exceptionally on (this) topic and to the point, despite the seeming impossibility of doing that. I applaud this discussion so far and all sides which take part in it, and hope to see other big-picture issues discussed as beautifully as here. |
Thanks for the very interesting thread. We are working on a project with very similar issues, trying to run some Linux libraries (with lots of native POSIX calls) on Windows. At first we went through WSL1+Docker and, while we were okay with the lower performance, as @wm4 described, the solution is not well integrated when it comes to deploying the application at scale. In order to provide better integration, we went down the path of compiling and running the libraries on Windows using Cygwin, with quite big performance issues and more than a few headaches. I know that the goals of Cygwin and WSL differ, and that the historical differences between POSIX apps and Windows make the integration anything but easy, but at a higher level, what's Windows' answer to easily and tightly integrating Linux binaries in your Windows application? @bitcrazed Will WSL2 answer this need somehow? Again, thanks for all the interesting points addressed here. |
Hi @orlando2378 - thanks for sharing. Could I ask what the major perf issues were that you found when porting your Linux libraries to Windows? The goal of WSL (regardless of version) is primarily to provide an environment in which you can run unmodified Linux binaries alongside all your favorite Windows apps and tools. It is NOT a goal of WSL to enable one to build apps that contain Linux libs hosted and running in WSL within a Windows app process ... in fact, that'd be prohibitive in so many ways as to be impossible. If you have code in a Linux lib project and want to reuse that code on Windows, then building it with MSYS/Cygwin is a great first step. If that code has perf etc. issues on Windows, you may need to adjust its implementation to better adapt to Windows' architecture/behaviors. We are keen to figure out where we may be able to expose additional features in Windows that better support POSIX apps, but note that this will take some time to happen. |
@bitcrazed Thanks for the prompt reply. The main issue we identified is very poor performance when using multithreading. With it disabled, we actually run faster than with it enabled. It seems to be a common issue with Cygwin, unfortunately. Indeed, we would need to adjust the implementation to adapt to Windows' needs, but that could require quite some work, especially on big projects, which somewhat defeats the purpose of having a POSIX compatibility layer in the first place. (I know, too idealistic :)) Is better support for POSIX apps on the near-term Windows roadmap, or is it something more long term? |
Without knowing anything about the nature of the perf issues you're seeing when "using multithreading" it's difficult to know if the root cause is simply in MSYS' implementation of threading, inherent perf issues mapping POSIX threads to Windows threads, perf issues in Windows threading, or something else. We'd love it if you could file an issue detailing specifically what you're seeing with an easy to recreate repro case, etc. to help us narrow-down the root cause of the issue. |
Uh what? POSIX threading is quite straight-forward and simple. The only problem I see is that win32 adds weird requirements, possibly has worse scheduling and worse startup performance than Linux. |
@bitcrazed Haha, okay, no worries. Sorry; hard to read tone through the internet sometimes 🙂 |
something new with better compatibility of the kernel and C++ runtime of Windows with POSIX |
@bitcrazed one thing that's interesting is that as I've shifted over from MacOS to Windows - drawn by WSL as well as the broader Windows ecosystem - I've found myself using msys2 / git-for-windows / cygwin a lot. I still want my underlying host for the VM to be rock-solid and useful for scripts and services and I want those comfortable linux tools available in Windows. My main ask is that Windows just consider Cygwin as it makes updates so as not to break existing functionality. It's not replaced or deprecated by WSL for me. The message I'm getting here from the replies is that this is something that you all are thinking about which is encouraging. In perusing the commits to cygwin, I noticed this commit for example: Cygwin: Adjust CWD magic to accommodate for the latest Windows previews. It's nice that the git-for-windows maintainer @dscho works for Microsoft and submitted that patch but hopefully this is on more than just them. People outside of Microsoft have limited ability and motivation to make patches for "magic" updates to Windows. |
Hey Ben. I left Microsoft last March, and returned back to the UK to try out this thing folks refer to as "retirement", so am not able to drive this issue internally any longer. However, the awesome @marcpems @snickler and others are working on a bunch of stuff that will help improve MSYS2 on Windows. Also, the new Windows Developer Drive was conceived in large part to address the POSIX file IO perf issues I discuss above and should deliver very sizeable perf improvements when running POSIX workloads & scripts on Windows itself. Rest assured that the team are working on improving the performance of many POSIX-first apps, tools, libs, etc. when running on Windows. Do file additional new issues, esp. if you can provide repro cases to demonstrate the biggest offenders - this will be super-useful to the team when trying to diagnose and remedy. Thanks for your continued patience and support. |
Although the project is still pre-alpha, an interested person could just try running the Midipix environment: Currently building it requires Linux. It also requires a secret reference to a temporary, small code repo, which can be obtained by chatting on the #midipix IRC channel on Libera.chat. I could also try to share my own build from 2022.11.18 (if a person happens to trust that) via some web means: In general, it uses NTAPI instead of WinAPI and is about 3-6 times faster than Cygwin. |
Wanted to update this thread: last year the Windows filesystem team worked on new APIs to improve the perf of binaries that have a POSIX background and rely heavily upon stat() behavior, with the goal of minimizing the work required to port the code while still achieving great perf on Windows. As noted above, this is called out as a pain point when porting to Windows. With this new API, Windows no longer needs to open the file, and thus perf is greatly improved. These new APIs are available in the Windows Insiders builds, including the upcoming Windows 11 24H2 and Windows Server 2025 releases. You can find the headers in the Windows Insiders Preview SDK. Note that we've worked with a few OSS repos to take advantage of this already, notably Python and libuv (which is used by NodeJS, CMake, etc). |
@AdamBraden HUGE thanks to you and the whole team involved in continuing to implement and deliver these important improvements!! It's awesome to see the features that so many across the company worked together to deliver finally arrive 😀🎉🥳 |
@marcpems @snickler @bitcrazed random question - did some googling and couldn't find a straight answer - is there any movement in the Dev Drive world or elsewhere to align the file locking behavior with Linux/Unix? I feel like the differences there create a significant amount of friction. There may be other things different which create friction also besides the performance or locking. For example if I could mount a dev drive with an ext4 filesystem that could help? |
@jcrben - Having worked with several teams across Microsoft on improvements made across NTFS, Defender, and many x-plat libs & tools for scenarios impacted by differences in how file/folder locking is handled between *NIX and Windows, I can say that this is not a simple problem to solve. Alas, Microsoft cannot simply make Windows adhere to *NIX file locking semantics, just as the *NIX world cannot simply decide to default to Windows' file locking semantics: Doing either would break each ecosystem - many apps/tools/systems depend upon the file locking semantics of their default ecosystem. Instead, if you do find that file locking semantics cause problems, do file issues with the owner of the app/tool/system impacted - from there, they should be able to chase down solutions to the specific problem. Specifically w.r.t. WSL, when operating on files within a distro's filesystem, WSL honours *NIX locking semantics. You may see odd behaviour if accessing the Windows filesystem from within a WSL distro, but the team have gone to considerable lengths to handle many edge-cases and to "do the right thing"™ when interopping between filesystems. If you find issues here, do file an issue in the WSL GitHub Repo and/or ping @craigloewen-msft to discuss. |
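One concrete difference in this area can be probed with a short Python sketch. It assumes that "file locking" here includes the question of whether a file can be deleted while a process still holds it open, which is an interpretation rather than something stated in the thread. On POSIX systems the delete normally succeeds; on Windows, with a file opened the way Python opens it by default, it raises `PermissionError`. The function name is hypothetical.

```python
import os
import tempfile


def can_delete_open_file():
    """Try to delete a file while this process still has it open."""
    fd, path = tempfile.mkstemp()
    try:
        try:
            os.remove(path)
            return True   # typical on POSIX: the name goes away, the open file lives on
        except PermissionError:
            return False  # typical on Windows: the open handle blocks the delete
    finally:
        os.close(fd)
        # Clean up if the delete above was refused.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    print(can_delete_open_file())
```

A tool written against one behavior and run on the other is the kind of case worth reporting to that tool's maintainers with a small repro like this.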
@bitcrazed Are these changes documented somewhere for projects like Cygwin/MSYS2 to take advantage? I also do not directly use WSL (only through Rancher/Podman). |
Do you know @sskras why they are so secretive and do not have regular builds? Also there is Superconfigure+Cosmopolitan that are also very performant. Still not buildable under windows, still no package manager but they release builds. |
@fithisux commented
I am not exactly sure. It just seems to be a strategy chosen by the main developer. The project lacks manpower (man-hours), and the leader hesitates to publish untested changes in an anonymous manner. IIUC, that's why the project asks every user to identify themselves and to become a bit of a developer (and build the project on their own, by cross-compiling it from Linux). It might also be that the lead is waiting for, or would be fine with, a person who would provide regular builds on their own (for the initial phase). It also might have to do with the fact that the environment is already being used as a commercially supported solution, though that's just my personal guess: |
Thank you @sskras. For many years I was excited about this project, but the lack of Cygwin cross-compilation kept me away. But, hey, it's open source. So we should help. |
@fithisux wrote 12 hours ago:
Oh, that's a nice idea – fixing bugs that prevent Midipix from being built on Cygwin. Thanks, will try. |
@bitcrazed Just wanted to say "happy retirement!", and thanks for your work over the years :) |
Awww ... thanks so much! Am very flattered 😃 ❤ Even though I am now "retired" I am still watching and participating here and there if I can be of help. Alas though, I no longer have a hotline into the many teams I used to be in touch with to find & fix various perf related issues. However, others like @AdamBraden continue the hard work with what little "spare" time he has 😜 |
Environment
Description
Lots of software uses GNU autoconf or build systems written in POSIX shell. Typically, these use msys/cygwin to run on Windows at all (otherwise they couldn't support Windows). However, performance is extremely low. Essentially, Windows is missing a well-integrated POSIX environment, and the emulation done by Cygwin is, in many situations, extremely slow.
Please note that WSL does not solve the problem in general. It's not well-integrated; it's more like a VM (WSL2 is literally a VM). A well-integrated solution is required. POSIX shell execution is actually only an example. Any project that makes use of POSIX, a standard for portable OS access, suffers from win32's non-orthogonal mess, which makes porting to Windows a nightmare. Even projects supported by Microsoft suffer from this problem. For example, consider: https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/fileio.c and https://github.com/PowerShell/openssh-portable/blob/latestw_all/contrib/win32/win32compat/w32fd.c.
It should not be necessary for every program to invent a POSIX compat. layer for Windows, which will be slow (as win32 does not provide the required capabilities). This will generally reduce the Windows experience, and it's Microsoft's responsibility to offer a better solution.
Steps to reproduce
A good test is running ffmpeg's configure program. FFmpeg is well-known enough, so no further description is necessary. However, it affects a lot of other projects.
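For a smaller, self-contained measurement than a full configure run, a minimal Python sketch such as the following times repeated process launches, which is roughly what a shell-heavy configure script does thousands of times. The helper name `time_spawns` and the run count are hypothetical, and the child command used here (the current Python interpreter doing nothing) is only a placeholder so the script runs anywhere.

```python
import subprocess
import sys
import time


def time_spawns(command, runs=20):
    """Run `command` `runs` times and return the mean wall-clock seconds per run."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Placeholder child process; swap in e.g. ["sh", "-c", "true"] to measure
    # a POSIX shell under Cygwin/MSYS2 versus a Linux machine.
    mean_seconds = time_spawns([sys.executable, "-c", "pass"])
    print(f"mean spawn time: {mean_seconds * 1000:.1f} ms")
```

Running the same script in each environment gives comparable per-launch numbers without depending on any one project's build system.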
Expected behavior
POSIX shell execution is as fast as on Linux.
Actual behavior
POSIX shell execution is several orders of magnitude slower than native or even virtualized Linux.