-
Notifications
You must be signed in to change notification settings - Fork 623
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cygwin and msys2 support (bash) #1291
base: master
Are you sure you want to change the base?
Conversation
.gitattributes
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the tests require that these files be in LF. On Windows, the default on clone (on my system) is CRLF. This ensures that it uses EOL as expected by the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be enough to use * text=auto eol=lf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would, I'll update it accordingly.
test/direnv-test-common.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On MSYS2, symlinks are full file copies. And on Windows, the modtime of state-A and state-B and symlink are all the same. I touch the files to ensure that their modtime is changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might have some implications on the direnv reload logic. Can you check if direnv reload
and direnv allow
behave as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check and get back to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They behave as expected. When I execute those commands, I expect the environment to reload, and in both cases it does. These results make sense, since direnv reload
performs a touch that updates the modtime of the file and triggers the reload, and direnv allow
performs a file write operation that also updates the modtime of the file.
One reason why the msys2 unit test needs to touch the files, is because both state-A
and state-B
have the same modtime (probably the time that the project was cloned), and the msys2 version of ln -fs
does some kind of copy magic that makes exact copies of those files when creating the ./symlink
file, keeping their modtime values in tact and equal. The unix version of ln
probably updates the ./symlink
modtime as expected.
The direnv export
command (used by direnv_eval
) probably does not touch the watched files, to avoid any weird side effects, but as indicated, both direnv reload
and direnv allow
do "touch" the watched files, so things work as expected on msys2.
I could update the test to perform a direnv allow
instead of a touch
if you prefer. A part of me thinks that if a symlink changes in such a way that it would cause another environment to be loaded (in this case only the value of STATE is changed but it just as well could have executed an rm -rf *
statement), then it might be more correct to require that the user execute direnv allow
first, before executing the export.
I'm happy with this PR now. This version of direnv works well for my projects on Git Bash. @pjeby Would you be able to give this implementation a try on your system? |
It looks like this is msys specific, and I use Cygwin. So as long as the PR's changes don't affect Cygwin I should be good. |
Thanks @pjeby. This PR should not activate on Cygwin environments. From what I gather from these sources (and a few others), it looks like I am correctly distinguishing between Cygwin and MSYS2 environments using the MSYSTEM variable:
UPDATE: I installed Cygwin and verified that the MSYSTEM variable is not available there, so the MSYS2 functionality will not be triggered. I also cloned direnv's master branch (i.e without this PR) in my Cygwin environment, but couldn't get the bash unit tests to pass, so I'm not too confident that direnv would work as expected in Cygwin. But as I understand it, you have your own fork overcoming those limitations. |
Now that you've added all this cygpath usage, it might be useful to note that making Doesn't need to be in this PR or anything, but I thought it might be useful to know. I'm currently using a patched [[ "$PATH" != *'\'* ]] || export PATH=$(/usr/bin/cygpath -p "$PATH") right after running the So if direnv did the cygpath -p internally when running on Cygwin, it'd be compatible with Cygwin shells, at least to the extent my current setup is. (AFAIK Cygwin only mangles PATH, not command-line arguments or other environment variables.) (Of course, this could also be done by changing the output of |
Thanks for the tip! If this PR makes the cut, then I'll see if I can run with the momentum and get some basic Cygwin support into another PR. My first implementation of this PR involved a similar approach where I resolved the PATH issue directly in the export, but this ended up working against the way direnv performs its environment diffing. This is not that big of an issue when you are only exporting the environment, but it does pose problems when you start using the shell hook, which is probably how most people use direnv anyway. I'd probably investigate performing the cygpath call earlier in the stack for Cygwin, in a similar way I did for this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @W1M0R, thanks for the contribution.
Past the minor changes, the only blocker is that I need a Windows machine to test this. I will need your help to maintain some of that code. If you are ready to do this, we can proceed.
.gitattributes
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be enough to use * text=auto eol=lf
?
internal/cmd/msys2.go
Outdated
if len(os.Getenv("MSYSTEM")) == 0 { | ||
// This is Windows, but not an MSYS2 environment. | ||
return m | ||
} | ||
|
||
if _, err := exec.LookPath("cygpath.exe"); err != nil { | ||
// This is an MSYS2 environment...without cygpath. We require cygpath | ||
// for some initial setup. | ||
return m | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These requirements need to be added to the docs. Otherwise, it's hard for users to know whether direnv is working or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. Do you want me to add entries to the installation.md
file, the README.md
file and the direnv.1.md file
?
Just to be clear, this is the standard MSYS2 environment, so user's don't need to do anything to set this up. An msys2 environment would be broken if it didn't have the MSYSTEM
variable set and cygpath.exe
available in the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the code slightly to log an error as a hint to the user that something seems odd about their MSYS2 environment.
internal/cmd/msys2.go
Outdated
// toWinMixedPath uses cygpath.exe to convert the Unix input path to a Windows | ||
// Mixed path. | ||
func (m msys2) toWinMixedPath(input string) (string, error) { | ||
return m.execCygpath("-m", input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much logic is it to port over if we wanted all the logic embedded in direnv?
I'm asking because Windows is a bit slow in spawning new programs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To implement cygpath, you would at minimum need to be able to parse /etc/fstab, which controls the drive mapping prefix for both Cygwin and msys (unless msys has an API for this that you can call?). See also the source code for the bit that doesn't involve /etc/fstab.
(Oh, and in order to find/etc/fstab you'll probably need to run cygpath first, as there are many possible install locations for msys and cygwin; my msys is in a subdirectory of ~/scoop/apps/
, for example.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zimbatm I agree, the spawning part can be slow. This PR already does some effort to reduce the amount of calls to cygpath. This specific function, for example, is called only during construction of the msys2
object (which happens once during a single execution of direnv).
But you are right, it can be improved by implementing the relevant parts of cygpath itself, directly in Go. If we don't need a fully equivalent cygpath implementation, then we might get away with a heuristic based solution. Here is an example: since the current implementation only cares to know where /
and /tmp
are mapped to, we could potentially resolve those paths without using cygpath, by looking at the TMP
, TMPDIR
or TEMP
environment variable for the /tmp
value and maybe the MSYSTEM_PREFIX
value (or something similar) to derive a value for the /
path.
I don't want to optimize too early though. I'm hoping that user feedback might identify other situations where we might need to use cygpath (or not).
It may even be possible to pull in the actual c source code of cygpath and use it together with CGO for a fully equivalent solution, but this will complicate builds. Maybe one day someone will take a stab at implementing cygpath entirely in Go.
To answer your question more directly: For how this PR uses cygpath, I think we could get away with a very basic version of it implemented in Go using the heuristic method proposed above, but I'm not convinced that the gains are worth the effort, although I am happy to implement a version of it if you prefer to keep cygpath out of the loop.
It is also worth mentioning that the stdlib.sh
already has a reference to cygpath (in the source_env
function) and uses it when it is available. With that in mind, since it will already be called for every source_env
invocation on msys2, the two extra calls made in this PR will not add that much of an overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zimbatm I've made a slight optimization: cygpath is now only called once at direnv startup, specifically to resolve the PATH variable. Maybe we can leave the "implement cygpath logic in Go" for another PR?
test/direnv-test-common.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might have some implications on the direnv reload logic. Can you check if direnv reload
and direnv allow
behave as expected?
19c38cf
to
5886bda
Compare
Thanks @zimbatm. Windows is my primary development environment (has been for a long time and I don't expect that to change anytime soon), so I'd be happy to maintain the relevant portions. |
fe0b7dc
to
f49c709
Compare
@pjeby This PR now has support for Cygwin. It turns out there were a few additional environment variables that needed some special handling for direnv to function properly. Will you be able to review the PR to see if there is anything off about the Cygwin support? @zimbatm I realized there was enough commonality between Cygwin and MSYS2, so I expanded this PR to cover both environments. |
So I'm looking over the code, and maybe I'm reading this wrong due to limited context in-diff, but it looks to me at first glance like there's going to be multiple cygpath calls on every direnv execution, which is likely to visibly slow down prompts, loading new .bashrc, etc. From the code, it looks like it's there to make DIRENV_ stuff show up as unix paths in stdlib.sh, but I'm not sure why? My setup works fine without that. The only variable I do any conversion on is Note that Cygwin commands mostly accept Windows paths, they just mostly don't manipulate them well. And I think that expecting Cygwin and MSYS users to take the time to write their .envrc or scripts to use cygpath when managing those vars is a fine tradeoff, especially since it then only happens when an .envrc is being run. (Or alternately, only running cygpath when calling an .envrc, since that happens much less often.) |
@pjeby Thanks for going through the code. Yes, every invocation of direnv on Cygwin will invoke a few cygpath calls, for converting key environment variables that come in as Unix paths, to Windows paths. Since direnv is a native Windows executable, the Go runtime expects Windows-style paths. An easy way to see the affect of this, is during the LoadConfig phase of direnv. It looks like direnv is succeeding because LoadConfig is designed to "fail" silently. The unit tests helped in identifying this issue, but I got the clearest view of the problem by enabling debug logging, using a Another place where this has an effect is in the diffing algorithm (when watched files are being checked for modtime changes). This becomes clear with you use the direnv hook, not so much if you are manually using direnv export. If Go can't find the files being watched (i.e. if the watched Unix paths are not converted to Windows paths before checking their mod stats), then the diffing algorithm doesn't work.
If by this you mean that users should manage the DIRENV_ stuff (using cygpath) themselves, then I'd disagree on that point: user's shouldn't need to fix (or know) about those direnv internals. For path variables in general, this PR does not call cygpath on your behalf in Cygwin (the stdlib.sh unit tests show they are no-ops), it does however call cygpath for MSYS2 systems when manipulating the PATH using the direnv stdlib. Without this feature, I had to modify my existing .envrc files to be MSYS2 aware, but with that feature, I could just use my existing .envrc files - this is a nice user experience. The cygpath calls inside the stdlib will only be called when the direnv diffing algorithm determines that a .envrc change was made, so I'm not too concerned about the performance impact of that (since that file and its watches don't change that regularly). Although there is probably more room for optimization there (i.e. I see that the stdlib invokes direnv quite a few times for something like If the direnv hook is installed, then every rendering of the prompt will eval a If you have more time available, I'd appreciate it if you can clone this PR and test it on some of your direnv projects. |
I'm a bit confused by some of your statements, see e.g.:
This looks like it's working just fine on Cygwin with Windows paths, without needing any cygpath adjustments? Also, it appears that cygwin's dirname and basename commands work fine on Windows paths, at least for my quick checks. The above are all using direnv 2.32.3, installed via scoop on Windows, under Cygwin bash. (i.e., not using this PR.) To be fair, I'm not making use of the stdlib on my PC, only basic exports, so if there's some issue there I'm not aware. I'm just not clear on how you're getting unix paths for DIRENV_ stuff in the first place. Are you sure you're not running in an MSYS environment rather than a Cygwin one? MSYS is extremely aggressive about converting filename arguments and environment variables, where Cygwin is more laid back and only converts PATH by default. |
Thank you for checking! Your feedback definitely indicates that I'm missing something. Let me give it another go. |
I updated the code with more checks to prevent unnecessary calls to cygpath.
I was wrong about that statement, it was With the added checks, cygpath usage is significantly reduced, practically down to the PATH environment variable. Thanks for highlighting the issue. @pjeby Do you see any other red flags? |
I'm mostly worried that you're changing a lot of existing direnv-on-cygwin semantics in ways that aren't sufficiently conservative. For example, you've got new logic around symlinks that seem they might be based on the assumption that Cygwin doesn't use real symlinks. But my setup uses You've also got a lot of code that appears to check for cygpath as an indication of whether you're running on Cygwin/MSYS, but that's really not the way to go about it in stdlib.sh. So at this point I'm worried that while your tests work in your environment, that even if they were made to work on mine, they don't necessarily match other users' environments, and that I may not even be spotting everything that's an issue. TBH, I ran into similar issues when I was working on this, which is why we ended up splitting the objective into more bite-sized chunks and got some of those in. What I would strongly suggest at this point is first doing a PR that only addresses inbound translation of those variables that direnv directly uses for its own operations (locating configuration/etc), without affecting .envrc handling to start with. Then, once some experience is gained with the limited use of cygpath, and there is feedback available from actual users, the next PR could address dealing with back-translation of specific outputs to be emitted from I'm very concerned that the current approach is conflating these two (very different!) requirements in a way that's likely to lead to lots of bugs and corner cases. direnv should not be translating input vars that aren't for its own immediate use, as that runs the risk of making the environment mangling situation worse, not better. Both msys and cygwin have their own translation layers in each direction with Windows, and injecting another such layer (that AFAICT might be running twice?) doesn't seem like a great plan. For output, this should be fine, since you are dealing with the variables being sent back and reverting the final "unix to windows" translation (if applicable). It's like there's currently this Sorry to be so negative, but I'm trying to do a conservative reading because the diffs don't show a lot of context and it's been years since I last dove deeply enough into direnv's internals to have the full picture in my head. But from my recollection of my previous attempt the only clean way I could see to make this work properly on MSYS at least was to bypass its translation altogether (e.g. by having the .envrc runner output its environment vars to a pipe, and/or having the direnv hook function pipe exports into direnv export directly.) Cygwin OTOH is in a "it works OKish now, please don't break it" state, provided PATH is fixed in the output of direnv export. (Hence my inclination to only do output translation at the final phase of direnv export.) I would actually almost suggest that it's cleaner to add the cygpath calls to the output of But in any case, having a PR that only addresses input conversion of direnv-use-only paths that doesn't affect the actual process environment passed down anywhere else would be a ton easier to read and verify safety of. And then doing the other parts as progressive enhancements would be more reviewable as well. |
The current direnv-on-cygwin experience is very limited and should not be the standard to compare with. The unit tests fail and direnv exec is not supported. With this PR, unit tests succeeds and direnv exec works on Cygwin. In terms of the symlinks, the logic is very minimal (one or two lines) and only used in the unit test to deal with stock Cygwin environments (i.e. a CI-friendly environment).
It's not that much code, maybe you were looking at an older commit. I also don't check for Cygwin/MSYS2 anymore, since that doesn't matter. What does matter is whether cygpath is available and whether the path you want to convert looks like a Windows path or not. To clarify, I do distinguish between Cygwin and MSYS2 inside the unit tests, to show how these environments deal with things differently. However, in the actual stdlib and Go runtime, I don't distinguish between these environments on a logic/functionality level, only for debug logging purposes.
I'd love to know if the unit tests pass in your environment.
I've also wondered how I can simplify this PR and break it into chunks. My goal is to have the unit tests pass on MSYS2/Cygwin, so any bite-sized contribution should satisfy that. If the chunk doesn't enable the tests to pass, then I don't see the value in the contribution. So currently, I think this is the smallest chunk that I could implement to satisfy that goal. I've already removed some stdlib stuff that I added originally, but found is not needed. Another way I could try and simplify is by focusing on MSYS2 as a first PR (as per my initial attempt) and then wait for the feedback cycle and then look into Cygwin. With this approach I would have to re-introduce the code that distinguishes between Cygwin and MSYS2. I'm not too fond of this approach since I'm happy with the results as is, but I'm not against it.
This alone would not make the unit tests pass.
This PR only exports PATH to the environment.
In this PR,
This PR would transform the Cygwin experience from OKish to "as advertised".
I considered OSTYPE, but found discussions online that suggested that it is not as a reliable method as one thinks. For one thing, OSTYPE worked in the stdlib.sh (at least for Cygwin), but for some reason was empty in the Go runtime. Regardless, as indicated earlier, it is no longer necessary to know whether we are using cygpath in an MSYS2 environment or in a Cygwin environment, it is good enough to know if cygpath is available and if the path in question is a Windows path or a Unix path. Adding cygpath calls to the output of
This would not allow the unit tests to pass. |
I'm not saying that it's not a good idea to do more, ever: I'm saying that this PR is currently hard for me to review and validate without re-reading most of the source of direnv, so breaking the PR into smaller pieces/phases would be immensely helpful in me being able to properly review it. As it stands, the only level of review I can give right now is to look at stuff and go, "this looks concerning to me", because I don't have enough contiguous time available to me to load the entire program structure in my head - I'm able to devote minutes to it right now, not hours. If you want me to do an actual review and/or build this patch, it might be several weeks before you get a response. So I've been doing this by skimming and cross-referencing the PR in the interest of getting you faster/more feedback than, "I'll let you know in a few weeks." If you'd rather wait for more detailed feedback or go without it from me, that's fine; I just assumed you'd rather have a low-fi review now than a more thorough one at some unspecified future date. Conversely, if you want a higher fidelity review from me now, that's only going to happen if the PR has a lot smaller scope, hence my suggestion to break it up. (As @zimbatm did with my last attempt at this, for I believe similar reasons.) |
Understood. I was too excited. I think I can submit individual PRs that would introduce things gradually.
And I appreciate your early feedback, it has guided me in several ways already.
I'm not in a rush. If time is the issue, then take your time, I'm going away for a bit of a holiday anyway and won't be back for a week or so. But if the PR is too big, even if you have enough time, then let me know, and I'll do incremental PRs for things like the gitattributes file and other small stuff. |
This PR adds support for two common Linux-on-Windows environments: Cygwin (Bash) and MSYS2 (Git Bash).
These environments have lots in common with a few notable differences. It is good enough for this PR to see MSYS2 as a variant of Cygwin, but with different defaults.
The key difference (relevant to this PR) is in the way that these environments deal with paths in process arguments and environment variables. And the key similarity is that they both offer the cygpath utility to handle path conversions manually.
This PR uses the
cygpath
utility to:stdlib.sh
(Unix-style paths)PATH
environment variable in Unix-stylestdlib.sh
functions (such as PATH_add) for automatic conversions to Unix-style paths (MSYS2)This PR was tested on Cygwin, MSYS2 "Minimal" (Git Bash), MSYS2 "Standard" (MINGW64) and WSL, with all unit tests passing and all direnv commands working as expected. More specifically, I am satisfied that the following direnv commands work well: hook, allow, deny, reload, export, edit, exec and status. I also tested this on an existing direnv project and am happy with the results.
The functionality in this PR is only activated in conventional MSYS2 environments and conventional Cygwin environments. Other environments revert back to the regular direnv behaviour. Care was taken to reduce the performance impact for non-cygpath environments by doing cheap calculations to skip the cygpath stack as early as possible. Relevant checks are in
stdlib.sh
and incygpath.go
.@zimbatm Windows has been my primary development environment for a long time and I don't expect that to change anytime soon, so I'd be happy to maintain this contribution. I also spend a lot of time in the Windows Subsystem for Linux (WSL) and use the Nix Package Manager together with direnv whenever I have the opportunity (and have done so for quite a few projects), so I have a relatively good understanding of the typical direnv build and runtime environments, and it is easy enough for me to test (and maintain) this contribution in different contexts.
Related Issues
This PR will resolve the following issues (when using direnv inside an MSYS2 Git Bash or a Cygwin Bash environment):
@pjeby also has a PR aimed towards Cygwin support: #630 (also see #638).
Future Work
stdlib.sh
we would just be replacing calls to cygpath with calls to direnv (unless we want to maintain two native cygpath implementations - one for Bash in thestdlib.sh
and one for Go).pacman -S direnv
(like fish and make for example).stdlib.sh
. If usage on MSYS2/Cygwin gains traction, it might be helpful to write a stdlib extension with Cygwin/MSYS2 helper functions (e.g.is_msys2
,is_cygwin
,has_cygpath
, etc). If an extension is not desired, then it could be included in the stdlib.cygpath
and then run a few scenarios to verify opportunities for optimization.Implementation Notes
PATH
environment is exported (Go runtime) and manipulated (stdlib.sh
).PATH
export.eval "$(direnv export bash)"
. To export an environment that looks more like your.envrc
file, useeval "$(MSYS2_ENV_CONV_EXCL="*" direnv export bash)"
. Theeval "$(direnv hook bash)"
command performs the export using the latter method, since it provides a familiar default experience to direnv users and is also faster since it skips a chunk of MSYS2's environment processing.make test
target in theGNUmakefile
tests Bash support, but specifically excludes the tests for the other shells supported by direnv, e.g. Elvish, Fish, etc. This is because the conventional MSYS2/Cygwin environments use Bash, and I didn't want to make any guarantees about support for non-conventional environments.GNUmakefile
provides a target to setup a build environment usingscoop
, that can be used for building on MSYS2 (Git Bash) and Cygwin. However, usingscoop
is not the conventional approach on those environments. The minimal MSYS2 environment (Git Bash) has no package manager, so usingscoop
is sensible. The regular MSYS2 environment includes thepacman
package manager for installing packages. The Cygwin environment does not have a package manager, but includes a setup executable that allows you to choose the packages you want in your environment. The difference between these approaches, is thatscoop
will install the native Windows packages as typically made available by the tool authors, whereas MSYS2pacman
and Cygwinsetup.exe
will install versions of the packages that are compiled in those environments with specific patches applied to make Windows tooling morePOSIX-y
(Cygwin) or Unix tooling moreWindows-y
(MSYS2).Developer Notes
PATH
environment variable.pacman
package manager (which is included by default with the standard MSYS2 installation).Comments
@zimbatm @tiymat @joshahussey While testing this PR in WSL, I came across the following code (introduced in #1242):
direnv/test/direnv-test.mx
Line 290 in ff09cc4
When running the unit tests, it looks like a silent failure, in that the tests report success, but looking at the output it seems like it actually fails. I suspected that this was a typo and renamed it to
VIRTUAL_ENV
, however, this caused the unit tests to fail hard, so I decided to leave it as is.@garyo If you are still interested in using direnv in an MSYS2/Cygwin environment, then this PR can help. It should be as simple as cloning the PR, installing go, and running
make
followed bymake install
.