Elixir and Erlang RELEASE_COOKIE: let's reach consensus on what to do to fix the current mess #166229
Comments
I do not like the idea of patching that script. If we want to patch it, it would be better to find a way to upstream the change so the script becomes easier to work with. I am trying to understand what the problem is here. So mix release generates a cookie (or hopefully gets one injected, for reproducibility) when it generates the release. We package that and a user downloads it. Is the problem that we expect a different package to use the same cookie? |
That's how upstream However, in our case:
Meaning, when a user tries to launch a binary, the mix-release startup script will fail when trying to load the non-existing Here's a practical example of a CLI utility (
/tmp/tmp.JRFJp10kpj » cat pleroma-without-cookie.nix
let
# Nixpkgs pin **before** the commit in which we wrap pleroma binaries with a dummy COOKIE.
pkgs = import (builtins.fetchTarball {
url = "https://github.com/NixOS/nixpkgs/archive/1098fc92217ac27746ec8004a87ca742b1408795.tar.gz";
sha256 = "sha256:0p6bva4jw7nfr57ds21mi6mj8axx1parpj8xcg1hkk36hhcs8lhj";
}) {};
in pkgs.pleroma
/tmp/tmp.JRFJp10kpj » pleroma=$(nix-build pleroma-without-cookie.nix)
/tmp/tmp.JRFJp10kpj » "${pleroma}"/bin/pleroma_ctl create
cat: /nix/store/5bl9vq8acxf7h6s0rdxwm08g6p32sqs8-pleroma-2.4.2/releases/COOKIE: No such file or directory
[Edit]: I realize my OP wasn't clear at all and leads to some confusion. By:
I basically mean:
|
hey, thanks for starting this! let me try to summarize the problem. I have a couple of questions. |
Individually answering the questions.
👍 It does for the binaries meant to be used as long running services.
I don't see any either for binaries meant to be used as long running services.
👍
Yes. This whole story started out by a confused user pinging me on IRC facing this exact error (not being able to run pleroma_ctl from the CLI), they were rightfully confused by the following error message:
My knowledge of Erlang and Elixir is very limited, so I took this opportunity to dig a bit more into EVM. I now realize this whole story boils down to the threat model you had in mind when writing this. Since I'm a novice here, I'm going to describe what I see as the current threat model. Could you confirm this is what you had in mind, just to make sure we're on the same page here? From what I can tell, elixir uses the Now, let's consider the local users. If we were to store the release cookie in the Nix store, it would mean that any user having access to the machine running the Erlang node could get interactive access to it provided they can find the world-readable cookie in the Nix store. This is the unacceptable part leading us not to store the running cookie in the store. ^ Is this threat model correct or am I (once again) missing something? In the end, I think this situation boils down to a single question: how common is it to find Elixir binaries meant to be both short-lived and used interactively 1.
Footnotes
|
I would say that in general, yes, short-lived Elixir command-line tools are "relatively" rare, except for server-side tooling that manages an Elixir application, and in that case operators are expected to know enough to do the work needed to make it run. Erlang systems are not really meant for short-lived stuff. |
We had a brief discussion offline with NinjaTrappeur. The actionable on this item would be to make a PR to document the threat model and our choice. The additional detail that was new to me is that the commands like The last issue we haven't talked about is getting an Thank you again for starting the discussion! |
It may be worthwhile to add both a defaults:
asserts:
This means that one of the values must be set, and since to my understanding no (without fixups) The dummy cookie would just be a cookie generated in |
Sounds like a sensible different approach. It'd probably mean having to patch mix as described here #166229 (comment) on the Elixir side. (or wrapping all the mix release builds). @lambdadog would you be up to implement that? |
I see no compelling reason why we shouldn't simply wrap all mix release builds. A patch is more maintenance, and wrapping a program is cheap enough on all NixOS targets that we don't need to worry about it. And absolutely, I'll start work on it. |
We potentially could add a CLI flag to Adding a wrapper is also fine by me. |
Okay I haven't commented on this so far. The fact that I think it's fine if we put a well-known default cookie in the Nix store, because that is how the Instead, I think the proper solution here is to completely disable the distribution features of the beam VM by default in all the NixOS modules. This can be done by setting If someone wants to set up an installation with distribution features and do that securely, it's their task to properly secure access to those ports and set a different cookie. |
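For reference, Mix releases document a `RELEASE_DISTRIBUTION` environment variable that accepts `name`, `sname` or `none`, where `none` means distribution is not started automatically. Assuming that is the setting meant in the comment above, a service wrapper could default to `none` and still let operators opt in. This is a minimal sketch; `launch_without_distribution` is a hypothetical helper name, not something that exists in nixpkgs.

```shell
#!/bin/sh
# Sketch: keep BEAM distribution off unless the operator opts in.
# RELEASE_DISTRIBUTION is read by Mix release scripts (name, sname or none).
# launch_without_distribution is a hypothetical helper used for illustration.
launch_without_distribution() {
  # Respect a value the operator already set, otherwise default to "none".
  RELEASE_DISTRIBUTION="${RELEASE_DISTRIBUTION:-none}"
  export RELEASE_DISTRIBUTION
  "$@"
}

# Demo: a child process stands in for the release start script.
unset RELEASE_DISTRIBUTION
default_mode="$(launch_without_distribution sh -c 'printf %s "$RELEASE_DISTRIBUTION"')"
opt_in_mode="$(RELEASE_DISTRIBUTION=name launch_without_distribution sh -c 'printf %s "$RELEASE_DISTRIBUTION"')"
```

Operators who do want a cluster would then set `RELEASE_DISTRIBUTION` and their own cookie explicitly.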
Yeah I'm leaning towards agreeing with @yu-re-ka here as well. That being said... I would like to keep the It's probably more correct to disable distribution stuff by default, and let users explicitly enable them though. Does the BEAM VM require a cookie? if one isn't provided, does that mean anyone can connect? If not, then I would say remove the cookie. |
Any updates or new thoughts on this? |
This seems to be preventing me from running my phoenix app :( What are we supposed to do? |
- Need to provide the RELEASE_COOKIE environment variable when running the app (NixOS/nixpkgs#166229)
- The deploy script has a hardcoded output directory that doesn't play nice with nix. I made the change and generated a patch file with `git diff` in my local copy of the repo. I also had to change the filepaths in the patch file to remove the `assets/` prefix. The contents of this directory must be moved to `priv/static/assets`.
- Have to manually install the phoenix node dependencies (these aren't fetched from npm, but from the repo itself).
I am also having problems with this. Any updates? |
I can say that I'm no longer really available to work on this issue (and apologies for falling through on it previously), but as far as I'm aware the implementation I discussed should still be sound. I am curious if anything has come from mentions of upstreaming some changes. I get the feeling that Elixir and Erlang applications weren't really designed to be distributed via a system package manager and that's the core of the issue, but I'm not sure if the teams upstream would be interested in including and maintaining changes that enable this kind of package manager distribution a bit better. |
Some Context
Since Elixir 1.13, the absence of a release cookie at startup leads to a failure in the `mix-release`-generated start script. We've been hit pretty badly by this issue for the Pleroma package.
First, @kloenk approached this by injecting a random cookie into the service start script in #149368 via a systemd-provided env variable.
While it did fix the Pleroma service startup, it did not fix the situation for the interactive binaries such as `pleroma_ctl`. Trying to fix that issue, I opened #164398, which wraps all the `$out/bin` binaries with the previously mentioned release cookie env variable containing a dummy release cookie.
Sadly, this PR broke some existing setups, which led @yu-re-ka to open #164965.
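To make the failure mode concrete, here is a simplified sketch of the cookie lookup, not the upstream start script verbatim. It assumes the script prefers `RELEASE_COOKIE` from the environment and otherwise reads `releases/COOKIE` under the release root, which matches the `cat: .../releases/COOKIE: No such file or directory` error reported in this thread. `resolve_cookie` is a hypothetical helper name.

```shell
#!/bin/sh
# Simplified sketch of the cookie lookup; not the upstream script verbatim.
# resolve_cookie is a hypothetical helper name used only for illustration.
resolve_cookie() {
  release_root="$1"
  if [ -n "${RELEASE_COOKIE:-}" ]; then
    # An explicitly provided cookie always wins.
    printf '%s\n' "$RELEASE_COOKIE"
  else
    # Otherwise fall back to the file written at build time. If that file
    # was removed from the store path, cat fails and so does the lookup.
    cat "$release_root/releases/COOKIE"
  fi
}

# Demo against a throwaway directory standing in for the release root.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/releases"

# Case 1: no env variable and no COOKIE file, so the lookup fails.
unset RELEASE_COOKIE
if resolve_cookie "$demo_root" 2>/dev/null; then
  missing_status=0
else
  missing_status=1
fi

# Case 2: the env variable is set, so the missing file no longer matters.
from_env="$(RELEASE_COOKIE=dummy resolve_cookie "$demo_root")"

rm -rf "$demo_root"
```

The second case is what both the systemd-provided variable and the `$out/bin` wrappers rely on.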
At this point, @yu-re-ka, @kloenk and I started to discuss how to fix this once and for all on Matrix. During this discussion, we realized that most of the single-node beam packages are likely to suffer from the same issue. We agreed the proper fix shouldn't live in the `pleroma` derivation but rather in the `mix-release` routine in charge of generating the startup scripts.
Status Quo
How could we solve this unfortunate situation?
I personally can see 3 options:
1. Leave things as they are and expect users to set `RELEASE_COOKIE`. In that case, we probably should write some documentation about it and streamline the wrapper introduced at nixos/pleroma: create cookie if not existing #149368.
2. Patch the `mix-release` script generator to use a static cookie. We'd first generate a static release cookie in the Nix store and then patch the script generator to set `RELEASE_COOKIE` to this static value. In practice, it would mean we'd stop deleting the statically generated release cookie in the `mix-release` `postFixup` script. This would solve this release cookie issue for the single node deployments. Of course, we'd provide a way to override this and set a custom cookie in case of a multi-node deployment.
3. Patch the `mix-release` script generator to use a cookie situated at runtime's `$(PWD)/cookie` instead of something located in the `$RELEASE_ROOT`, ie. in the release Nix store path.
I'd be personally in favor of moving forward [1] with solution 2. I'd assume that if a user is advanced enough to set up a multi-node Erlang cluster, they are advanced enough to override the static dummy cookie in their Nix config with something sensible.
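The override behaviour wanted for solution 2 can be sketched in plain shell. This is only an illustration under stated assumptions: `STATIC_COOKIE` and `with_default_cookie` are hypothetical names, and in a real derivation the static value would come from the build rather than being hard-coded.

```shell
#!/bin/sh
# Sketch of solution 2: default to a static dummy cookie, allow an override.
# STATIC_COOKIE and with_default_cookie are hypothetical names; the value
# below is a placeholder, not a cookie used anywhere in nixpkgs.
STATIC_COOKIE="dummy-cookie-for-single-node-use"

with_default_cookie() {
  # Keep a cookie the user already provided (multi-node setups), otherwise
  # fall back to the static one, then run the wrapped command.
  RELEASE_COOKIE="${RELEASE_COOKIE:-$STATIC_COOKIE}"
  export RELEASE_COOKIE
  "$@"
}

# Demo: a child process stands in for the release start script.
unset RELEASE_COOKIE
default_seen="$(with_default_cookie sh -c 'printf %s "$RELEASE_COOKIE"')"
override_seen="$(RELEASE_COOKIE=my-cluster-secret with_default_cookie sh -c 'printf %s "$RELEASE_COOKIE"')"
```

Single-node users get a working binary without any configuration, while a cluster operator only has to export their own `RELEASE_COOKIE` before launching.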
I don't like the idea of leaving things as they are (ie. solution 1): we'll know for sure that any binary produced by `beamPackages.mixRelease` will fail at startup unless it gets patched.
Dear @NixOS/beam maintainers, what do you think? Do you see any option besides the ones listed above? Which one do you personally favor? Why?
Cc: @yu-re-ka @kloenk @happysalada @NixOS/beam
Footnotes
[1] Meaning: I'm up to implement that if we all agree it's the way to go.