Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add shell-specific actions and use them to implement aliases #464

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

jcpetruzza
Copy link
Contributor

This implements a mechanism to send shell-specific code to the host shell from the .envrc file, along with a simple protocol to register clean-up actions to be performed when unloading the environment. The supported shells at the moment are bash, zsh, tcsh and fish (elvish is missing only because it doesn't have an eval function, or I couldn't find it).

Based on this, we implement the exporting of aliases from .envrc (#73). The same mechanism should let us support loading/unloading shell-completion scripts (#443) and, more in general, run arbitrary commands on entry/exit of the environment.

The core of direnv was left mostly unaffected; this is a rundown of the changes:

  • The first 2 commits add tests for tcsh and zsh, so that we can then test the new functionality on all shells. For the latter, we can simply share the test script with bash as long as we stay within bourne shell code.

  • The next 3 commits fix minor issues in tests.

  • Next, we introduce a quote command to the stdlib. Intuitively, quote fish foo bar will add foo bar; to the output of direnv export fish, etc. This works by accumulating the quotes for shell xxsh on the DIRENV_QUOTES_xxsh environment variable (excluded from DIRENV_DIFF).

  • quote is a low-level operation: it will only send commands when the environment is loaded. The next 5 commits add the higher-level shell_specific command to the stdlib that allows one to also register clean up actions. If an .envrc file were to include the following:

    shell_specific zsh "echo 'rm -f foo'" "touch foo"

    Then loading the environment with direnv export zsh would have, conceptually, the following additional effect:

    export DIRENV_ON_UNLOAD_zsh=$(echo 'rm -f foo')
    touch foo

    DIRENV_ON_UNLOAD_zsh is honoured by direnv export zsh when unloading the environment, which would then include the extra effect:

    rm -f foo
  • The second argument to shell_specific is conceptually a function computing the clean-up action. In the above example, we use echo 'rm -f foo' as a constant function that always returns rm -f foo. For a more realistic example, when implementing aliases for bash, we can exploit that alias foo outputs the full definition of foo (including the alias keyword) and use something like:

    shell_specific bash 'alias foo || echo unalias foo' 'alias foo=ls -lt'
  • In reality, DIRENV_ON_UNLOAD_xxsh is a comma-separated list of gzenv-encoded quotes. In order to be able to add elements to this list from the envrc, we added a direnv gzenv [encode|decode] private command.

  • The last commit adds a use_aliases function to the stdlib, so that the alias-exposing functionality becomes opt-in. The usage is:

    alias hello=world  # hello will not be exported
    
    use aliases
    alias foo=bar  # foo will be exported

We could use the bash tests almost verbatim. The only changes needed
were:

  - Use [[ ... ]] for guards
  - Use `direnv export zsh`

So we can have one common test script and source it from both the
bash and zsh tests
In the hook code, we do essentially `eval "$(direnv export bash)"`,
while on the test code we were doing `eval $(direnv export bash)`.
These are not equivalent, since spaces in the output of `direnv export
bash` will break the output; so that everything that comes after gets
interpreted outside of the `eval`.
Call was failing because the template was missing the
.XXXXX bit.
Make `direnv export xxx` add the contents of the
`DIRENV_QUOTES_XXX` to its output. Any other variable
of the form `DIRENV_QUOTES_YYY` is ignored. This will
be the basis for implementing shell-specific functionality
in library code.
We can use `quote <shell> foo bar` to add the command `foo bar;`
to the output of `direnv export <shell>`. This is the basic building
block for providing shell-specific functionality from within the envrc.
So that we can easily encode/decode strings using this format
from the envrc. We will use this to implement "on_unload" quotes.
The `shell_specific()` function is a higher-level version of
`quote()`: it let's one send to the shell not only a command
to do "on loading" the env, but also a command that can compute
the diff with respect to the current env and record it, in the
`DIRENV_ON_UNLOAD_XXX` variable, to be executed "on unloading".

From the point of view of `direnv export xxx`, the `DIRENV_ON_UNLOAD_XXX`
variables are similar to `DIRENV_DIFF`. The protocol that is
implemented is:

  - When running `direnv export xxx`, if we are unloading, only
    `DIRENV_ON_UNLOAD_xxx` is considered.

  - `DIRENV_ON_UNLOAD_xxx` is a comma-separated list of
    shell-specific commands, encoded using gzenv.

  - When the decoded commands are output, they are separated
    by newlines
At the moment, fish has no equivalent of the `"$(cmd)"` substitution
of bash, zsh, etc. In particular, `(cmd)` will split the output of
`cmd` on newlines. That means that `eval (direnv export fish)`,  will
first run `direnv export fish`, then split it on newlines and finally
call `eval` with one argument per line. The behaviour of `eval` is to
concatenate all its arguments with *spaces* (not newlines) and then
evaluate that.

This is a bug waiting to happen (e.g., if the value of an exported
value contains a `\n` it could be silently replaced by a space), and
would also prevent us from implementing `shell_specific` for `fish`.

In fish, though, `eval` is just a wrapper around `source`, which can
actually take it's input from stdin. So we can simply pipe the output
of `direnv export fish` to `source`.
tcsh is notoriously hard to use as a scripting language
so trying to write actual shell-specific code for this shell will be
"fun". In particular, back-ticks cannot be nested (there is no
equivalent of `$(...)`) and they are not easy to escape.
We introduce the `use aliases` functionality in
the stdlib. Once activated, the `alias` command
will effectively set aliases on bash, zsh, tcsh
and fish by way of shell-specific code. It is
implemented by aliasing "alias" to a function
that mimics the behaviour of "alias" under
"bash" but also emits the corresponding
shell-specific code.

On each shell, a command is run to find out the
current definition for the alias (in case it is
being redefined)  and an "on-unload" action is
registered that will restore the original
definition or remove the alias in case it is
new.

In the case of tcsh, this was tricky to test
since in order to decide if, on unload, we
need to execute a `unalias foo` or an `alias
foo old-value`, we need to run a command that
in some cases will finish with a non-zero
value to be able to perform the "if". However,
`tcsh -e` will fail immediately when this
happens (the alternative would be to use
backticks, but they can't be easily nested),
etc. So we end up adding another test script
for tcsh to be run with tcsh, for this type
of scenarios...
@zimbatm
Copy link
Member

zimbatm commented Mar 8, 2019

thanks @jcpetruzza, I already integrated all your test improvements.

Sorry but the rest of the PR changes direnv too much and I am not comfortable merging this. I can see direnv potentially managing more things than just environment variables but it would have to be done declaratively where each shell hook would be expanded to handle the diffing.

@jcpetruzza
Copy link
Contributor Author

@zimbatm Thanks for your response! I think it is fine not to merge the rest of this PR as it is. I was initially going to submit only the minimal number of changes that would allow people to implement things like aliases or shell-completion on their own (and share the implementations on the wiki), but I felt without a fully-developed example it wouldn't be clear that the idea works, etc, so I ended up adding a use_alias example to the stdlib. I see the code in this PR as a working proof-of-concept that I would like to use to discuss the design of a direnv extension to handle additional effects that is good to be merged in.

The main take-aways, for me, from implementing this prototype are:

  1. From the rc script one needs to be able to signal to the running direnv export xxsh command, (via direnv dump) that there is some xxsh-specific action to perform, so that this can be ultimately sent to the hook. I don't think one can use anything better than a reserved environment variable for this. A design choice here is whether direnv dump interprets it and then outputs something more complex than an Env, or if it just passes it through as an environment variable. In this implementation, I used a DIRENV_QUOTES_xxsh variable, let it pass through direnv dump and the "signal" was just the raw xxsh-command to run. If I understand correctly, you'd prefer, as "signal", some sort of message (e.g., "define this alias...", etc) that then the func (sh xxsh) Export(..) can turn into xxsh-specific code. I think that in any case, the possibility to send raw xxsh-code should be there as well, so people can experiment with ideas without having to modify the direnv implementation; this could be achieved with a message that'd say "run this xxsh-code".
  2. Whatever effect is ultimately executed in the hook, we will normally want to reverse it when unloading the env. The problem is that in general we won't know how to reverse the effect until the code in the hook runs (e.g. one needs to know what aliases are overwritten to be able to restore them). The only way I can see to implement this is by having the hook set an environment variable that will be used by direnv during the unloading of the env. Here I've used DIRENV_ON_UNLOAD_xxsh for this. One would want to use some gzenv-encoding for this (as is done for DIRENV_WATCHES), so an additional direnv command will probably need to be implemented as well.

Do you think this could be implemented in any other way?

I'd propose then the following design for the extension:

  • We add two reserved environment variables: DIRENV_LOAD_ACTIONS and DIRENV_UNLOAD_ACTIONS. They will contain a gzenv-encoded list of "actions".
  • DIRENV_LOAD_ACTIONS is to be set by the rc-script and honored by direnv during an environment load.
  • DIRENV_UNLOAD_ACTIONS will be set by the eval-call installed by the shell-hook, and honored by direnv during an environment unload.
  • An "action" is a pair ("action_type", "action_payload"). Initially, the action type will be one of: quote_bash, quote_zsh, quote_tcsh, quote_fish; and the payload a command for that shell. More higher-level actions can be added in the future as needed.
  • We add a direnv add_action command, analogous to direnv watch. It uses DIRENV_IN_ENVRC to decide whether to target DIRENV_LOAD_ACTIONS or DIRENV_UNLOAD_ACTIONS.
  • A function add_action is added to the stdlib, analogous to watch_file.
  • A function on_direnv_unload is installed by the hook of each shell (similar to add_action), to be used by the code that needs to install an unload-action.

Because this is a cleaned-up version of what is already in this PR, we have a validation that it should work. This also doesn't require many changes to direnv internals, imo. One can then play around, implement something cool like "use direnv to activate a local nix profile and have the shell-completion scripts updated", and share it on the wiki 🙂

Thoughts?

@zimbatm
Copy link
Member

zimbatm commented Mar 14, 2019

sorry I haven't had much time to think and this is a bit of a deep subject.

One of the fundamental difference between environment variables and the rest is that env vars are inherited between processes. So I can cd into a project with my bash shell, start zsh and provided both have direnv installed, everything will keep working as before. I can cd .. in the zsh shell and unload the env, etc... If I exit the zsh shell, I will be back to the bash shell with it's state intact as well.

Now if I think about aliases, what should happen is that the bash hook, somehow keeps track of it's own aliases. If a special DIRENV_ALIASES environment variable is created after an eval, it would have to backup the old aliases and apply the new ones itself. And somehow on every prompt command it would have to check if DIRENV_ALIASES has changed so that it could work when invoked as a sub-shell. Here I am assuming that the aliases are simple and compatible with all of the shells. It also puts the burden on each shell implementation to implement their own aliasing mechanism.

This is the reverse of your LOAD/UNLOAD actions as the responsibility is shifted from the .envrc to the specific shell for doing the backup and restore. But it means that aliases can actually be restored which I don't think your approach handles.

Another reason for my reluctance is that I would prefer to merge all the DIRENV_ environment variables into one DIRENV_STATE at some point, so introducing new variables is not ideal. For that I think your gzenv utility is quite helpful. I realize that it's not directly related to the PR and I would be happy to be working with you on that first if that's alright with you.

@jcpetruzza
Copy link
Contributor Author

The point about subshells is a very good one, and I hadn't thought about that, really. You don't even need to go the bash -> zsh route, already bash -> bash and zsh -> zsh is not working in this prototype, since an alias defined by the envrc will "disappear" from the subshell.

Now if I think about aliases, what should happen is that the bash hook, somehow keeps track of it's own aliases. [...] And somehow on every prompt command it would have to check if DIRENV_ALIASES has changed so that it could work when invoked as a sub-shell.

I think I would do this differently, though. The code emitted by direnv hook could set some random UUID in a DIRENV_HOOK_ID envrironment variable, and the contents of the proposed DIRENV_STATE could contain the hook id of the shell that produced it by calling direnv export xxsh. That way, when you launch a subshell, it will get a different DIRENV_HOOK_ID and the call todirenv bash xxsh associated to the prompt-command can then notice that given there is a DIRENV_STATE for the right envrc file and the hook id is different, the envrc needs to be rerun (or the output re-emitted since we can presumably reproduce it from the state).

Here I am assuming that the aliases are simple and compatible with all of the shells. It also puts the burden on each shell implementation to implement their own aliasing mechanism.

This is the reverse of your LOAD/UNLOAD actions as the responsibility is shifted from the .envrc to the specific shell for doing the backup and restore. But it means that aliases can actually be restored which I don't think your approach handles.

Wait, this prototype is already restoring the aliases! 🙂 Look at the test for the scenario called "alias": it checks that on entering a directory two aliases foo and bar are set according to the definition of the envrc and, on exiting the directory, are restored to their previous value (in the case of foo, by removing it). And works on bash, zsh, fish and tcsh, which required, in each case, shell-specific code handle it.

If I understand correctly, what you are proposing is to make the hook more smart, but also more complex (first is has to know about aliases, and tomorrow about shell-completions, etc), while in my approach the hook remains just as simple as it is right now and users can add new shell-specific extensions without having to touch direnv internals, since they just need to use the envrc to emit the shell-specific code that will eventually run on the hook (not the most beautiful thing to write, but well...)

Another reason for my reluctance is that I would prefer to merge all the DIRENV_ environment variables into one DIRENV_STATE at some point, so introducing new variables is not ideal. For that I think your gzenv utility is quite helpful. I realize that it's not directly related to the PR and I would be happy to be working with you on that first if that's alright with you.

Moving towards a single DIRENV_STATE variable makes sense to me and I'd be happy to help with that. It would be a biggish change, though, so it's probably better to do it incrementally. Do you want to open an issue to lay out the plan and track the progress?

@zimbatm zimbatm force-pushed the master branch 8 times, most recently from a4e430c to 6d567f3 Compare May 21, 2019 10:22
@varac
Copy link

varac commented Sep 5, 2019

What's the state of the PR ? I'd love to define directory specific aliases with direnv !

@seanking2919
Copy link

Any update on this?

@zimbatm
Copy link
Member

zimbatm commented Dec 22, 2020

It's not very likely to happen. @jcpetruzza did a fantastic job, and I don't feel comfortable merging this at the same time. It's too much new surface of untested code for this little one-man project.

If somebody were to abstract the changes record to include aliases and the shell process id, on top of environment variables, then it might be doable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants