makeBinaryWrapper: Variable expansion at runtime #172583
Comments
Well, this definitely looks much better than I thought initially. I think implementing this in |
Also, returning an error code can be thought of as an advantage. We can capture those errors and print a user-friendly message to help with debugging. |
I'm wondering if these errors could be detected or ruled out at build time already (like the string containing an explicit newline), or if some errors can only be detected at runtime, when environment variables are set to something bad.
Another discovery, which seems like it could be very useful when it comes to flags in particular:
// gcc main.c -o main
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <wordexp.h>
int main(int argc, char **argv) {
    const char *str = "$0 -before $@ -after";
    wordexp_t p;
    int err = wordexp(str, &p, 0);
    if (err) {
        printf("wordexp failed with error code %i\n", err);
        return err;
    }
    // we_offs and we_wordc are size_t, so print them with %zu
    printf("p.offset = %zu\np.we_wordc = %zu\n", p.we_offs, p.we_wordc);
    for (size_t i = 0; i < p.we_wordc; i++) {
        printf("p.we_wordv[%zu] = \"%s\"\n", i, p.we_wordv[i]);
    }
    wordfree(&p);
    return 0;
}
Running
|
We can probably just add an obligatory post build phase that runs the binary after wrapping. If something is wrong this will trigger the issue. |
Single quotes also seem to have a special purpose in the string passed to wordexp:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <wordexp.h>
int main(int argc, char **argv) {
    putenv("expandme=xyz");
    const char *str = "$0 '$noexpansion with spaces' $@ -after $expandme";
    wordexp_t p;
    int err = wordexp(str, &p, 0);
    if (err) {
        printf("wordexp failed with error code %i\n", err);
        return err;
    }
    // we_offs and we_wordc are size_t, so print them with %zu
    printf("p.offset = %zu\np.we_wordc = %zu\n", p.we_offs, p.we_wordc);
    for (size_t i = 0; i < p.we_wordc; i++) {
        printf("p.we_wordv[%zu] = \"%s\"\n", i, p.we_wordv[i]);
    }
    wordfree(&p);
    return 0;
}
|
Executing the wrapper will have the side effect of executing the program it wraps though. How would we know if the wrapper crashes or the program itself does? A solution to this might be to replace EXECUTABLE with a dummy shell script and rebuild/run in the post build phase. |
I just tested this on macOS (x86 Big Sur), and it does not behave the same way or give the same output:
// ...
putenv("expandme=xyz");
const char *str = "$0 '$noexpansion with spaces' $@ -after $expandme";
// ...
This is likely a consequence of this quote from the documentation:
So we should probably disallow usage of these since they are not portable. |
Yeah, I think substituting the wrapper with a dummy should be doable during testing. However, we can also ignore this issue for now. I mean, most people should run the binary at least once. The main issue would be during mass migrations though. |
Can you also check that your example with |
The problem with running the wrapper after build is that we probably want to allow command substitution, but we definitely don't want to run the commands during build, and we don't want to treat them as errors either. |
Seems to work on macOS (x86) with
// ...
putenv("y=Hello");
const char *str = "${x:+$y} world";
// ...
Not sure what the offset is for - but it seems to generally be 0 on Linux, and some huge number on macOS. |
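On the offset question: in POSIX, `we_offs` is an input field that wordexp only consults when the `WRDE_DOOFFS` flag is passed, in which case it reserves that many leading NULL slots in `we_wordv`. Without the flag an implementation need not write the field, which would be consistent with seeing 0 on Linux and an arbitrary value on macOS when it is printed from an uninitialized struct. A minimal sketch of how a wrapper might use it to leave room for argv[0] (the function name `expand_with_argv0_slot` is made up for illustration):

```c
#include <stddef.h>
#include <wordexp.h>

/* Expand `words`, reserving one leading NULL slot in out->we_wordv
 * (for example, to be filled with argv[0] later).
 * Returns 0 on success or a WRDE_* error code. On success the caller
 * must release the result with wordfree(out). */
int expand_with_argv0_slot(const char *words, wordexp_t *out) {
    out->we_offs = 1; /* input: number of leading slots to reserve */
    return wordexp(words, out, WRDE_DOOFFS);
}
```

After a successful call the expanded words start at `we_wordv[1]`, and `we_wordc` counts only those words, not the reserved slot.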
What I was thinking is more like, create a binary wrapper for a dummy binary, build, run. If this results in an error bail out, if not continue. I think this should be doable. |
I'm not sure we're talking about the same thing. Command substitution is the |
For testing the validity of a wordexp-string at build-time, how about simply doing something like this when the C code is being generated:
#include <stdio.h>
#include <wordexp.h>
int main(int argc, char **argv) {
    if (argc != 2) {
        printf("Usage: %s INPUT_STRING\n", argv[0]);
        return -1;
    }
    wordexp_t p;
    // The exit status is the wordexp error code (0 means the string is valid)
    return wordexp(argv[1], &p, 0);
}
# This could be packaged in nixpkgs, so that we don't need to compile every time we generate C-code
gcc wordexp-is-valid.c -o wordexp-is-valid
if ./wordexp-is-valid "test-string"; then
    : # print out wordexp-code
else
    : # print out compiler error macro complaining about bad wordexp
fi |
Doesn't solve the command substitution problem. Really we'd need a DRY_RUN flag for wordexp. |
Ah ok. I wasn't thinking of actually covering this case at all with This would be probably the 5% that I said before that still needs |
It does, see the man page. Probably just calls Found a derivation using this:
|
Oh. This is really unexpected.
This looks like it could easily be changed to use But yeah, I understand that other usages may be more legitimate. So for now I am leaning more toward not having any checks at all. We can think about how to do proper checking later on. |
I guess we can look at some of the wordexp implementations to find out how the error checking is done:
And then try to copy over the error checking-part of the implementation - so that we avoid side-effects of command substitution. |
That would pretty much amount to reimplementing wordexp in nixpkgs... |
Yeah, I also think so. I am not completely against implementing some type of build time check to avoid having issues during runtime. Like I said, this will be especially good to have when we do mass migrations (like the one in However, I think for the initial version we can ignore the build checks, because we will probably migrate only a few packages by hand and test them manually. This should already be useful. And then once we have a good idea of what we want, we can create a better way to validate. Otherwise, I think we will bikeshed how to create a proper way to check during builds and will not get anywhere 😅. |
Another option: disallow command substitution altogether - and set the flag WRDE_NOCMD. Then wordexp shouldn't be able to cause any side-effects - and we could use it safely as a build-time check. |
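A minimal sketch of what such a side-effect-free check could look like, assuming the validator simply calls wordexp with `WRDE_NOCMD` and reports the result (the function name `validate_without_cmdsub` is hypothetical). With this flag, a string that requests command substitution makes wordexp fail with `WRDE_CMDSUB` instead of running the command:

```c
#include <wordexp.h>

/* Returns 0 if `words` expands cleanly with command substitution
 * forbidden, otherwise the WRDE_* error code reported by wordexp(). */
int validate_without_cmdsub(const char *words) {
    wordexp_t p;
    int err = wordexp(words, &p, WRDE_NOCMD);
    if (err == 0) {
        wordfree(&p); /* only the status is of interest here */
    }
    return err;
}
```

Note that this only reports the first problem wordexp encounters, and it does not flag constructs whose behaviour is unspecified.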
I think there is still a way this idea can possibly work even with variable substitution. I am not sure if |
In any case the function only returns one error. Also we can't rely on unsetting PATH because the commands might be absolute paths. |
I think inside the build environment it would be very difficult for any program to have access to absolute paths, at least on NixOS, thanks to the sandbox. Could be an issue on non-NixOS Linux/Darwin though. Of course, the program can still link directly to the package itself, but then I don't see that much of an issue (at least in most cases). For the few problematic cases out there we could have an option to disable the check or something. |
Yes, the function only returns one error - and it will give up on the first problem it encounters. Setting |
Okay, I have another (very hacky) idea - we don't need to reimplement wordexp, we just need to recompile a version of it with This means the command substitution code will return early without causing side effects - but with 0 (success), making it look like everything worked as expected - and as a consequence it will keep parsing the input string. https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/posix/wordexp.c#L890 |
This actually doesn't seem like that bad an idea. I mean, I still want to see an actual example of this being useful, and even if we find one, this will probably negate most of the performance advantages that we have with binary wrappers, since we will run a subprocess in those cases. I would be totally fine disabling variable expansion by default and only enabling it with an unsafe flag or something. |
We would also want to catch usage of Maybe writing a basic wordexp parser/validator in bash rather than C would not be too difficult, which could be used for build-time validation. |
Starting point for wordexp input parser/validator written in bash:
validate_wordexp_input() {
    local input
    input=$1
    while [ "${#input}" -gt 0 ]; do
        case "$input" in
            # Unspecified behaviour
            $'$@'*) return 1;; # Unspecified behaviour
            $'$*'*) return 1;; # Unspecified behaviour
            $'$#'*) return 1;; # Unspecified behaviour
            $'$?'*) return 1;; # Unspecified behaviour
            $'$-'*) return 1;; # Unspecified behaviour
            $'$$'*) return 1;; # Unspecified behaviour
            $'$!'*) return 1;; # Unspecified behaviour
            $'$0'*) return 1;; # Unspecified behaviour
            # Bad characters
            $'\n'*) return 2;; # WRDE_BADCHAR
            $'|'*) return 2;; # WRDE_BADCHAR
            $'&'*) return 2;; # WRDE_BADCHAR
            $';'*) return 2;; # WRDE_BADCHAR
            $'<'*) return 2;; # WRDE_BADCHAR
            $'>'*) return 2;; # WRDE_BADCHAR
            $'('*) return 2;; # WRDE_BADCHAR
            $')'*) return 2;; # WRDE_BADCHAR
            $'{'*) return 2;; # WRDE_BADCHAR
            $'}'*) return 2;; # WRDE_BADCHAR
            # More complex parsing
            $'\\'*) input=$(parse_backslash "$input") || return $? ;;
            $'$'*) input=$(parse_dollars "$input") || return $? ;;
            $'`'*) input=$(parse_backtick "$input") || return $? ;;
            $'"'*) input=$(parse_dquote "$input") || return $? ;;
            $'\''*) input=$(parse_squote "$input") || return $? ;;
            $'~'*) input=$(parse_tilde "$input") || return $? ;;
            $'*'*) input=$(parse_glob "$input") || return $? ;;
            $'['*) input=$(parse_glob "$input") || return $? ;;
            $'?'*) input=$(parse_glob "$input") || return $? ;;
            *) input="${input:1}";;
        esac
    done
}
Possible errors from wordexp:
Note that |
I strongly oppose this. Please don't add a shell parser to nixpkgs. |
The only purpose of the shell parser would be to validate the input string sent to wordexp as a build-time check in makeCWrapper. Why are you strongly opposed to this? |
It's a lot of complexity for something which I'm not even sure might help. Wrappers should be tested, and errors caught then. For mass migrations, switching to wordexp won't make existing shell expressions invalid. If you really want to check shell syntax, something like On another note, we'd probably also need to support something like
|
Unspecified behaviour ( "$hello|$world;" # valid in a shell string, not allowed in wordexp, because of `|` and `;` I think the only way to properly avoid these problems is to have a custom validator - or to use wordexp itself for validation, which can only be safely done if we disallow command substitution. But even then, unspecified behavior will not be caught, so we'd need to check this separately. And we can't just match the string against "$@" either, for example, since we'd need to count the number of backslashes in front of the dollar sign to know whether it is escaped or not. I don't think the wordexp input validator would end up being all that complex - and the benefit for portability and resulting ease of use of makeBinaryWrapper could be worth the complexity in my mind. |
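To make the backslash-counting point concrete, here is a rough sketch of a scanner for the special parameters whose expansion is unspecified. It is a simplification under stated assumptions, not a full validator: it tracks only backslash escapes, single quotes and double quotes, and it ignores braced forms such as `${@}` and anything nested inside command substitution. The function name `uses_special_parameter` is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `s` contains an unescaped special parameter
 * ($@ $* $# $? $- $$ $! $0) outside single quotes. */
bool uses_special_parameter(const char *s) {
    bool in_squote = false;
    bool in_dquote = false;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (in_squote) {
            /* Nothing is special inside single quotes except the closing quote. */
            if (s[i] == '\'') in_squote = false;
            continue;
        }
        if (s[i] == '\\' && s[i + 1] != '\0') {
            i++; /* skip the escaped character, so "\$@" is not reported */
            continue;
        }
        if (s[i] == '"') {
            in_dquote = !in_dquote;
            continue;
        }
        if (s[i] == '\'' && !in_dquote) {
            in_squote = true;
            continue;
        }
        if (s[i] == '$' && s[i + 1] != '\0' &&
            strchr("@*#?-$!0", s[i + 1]) != NULL) {
            return true;
        }
    }
    return false;
}
```

Skipping one character after every backslash is what handles the parity problem: in `\\$@` the first backslash consumes the second, so the `$@` that follows is still reported.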
One question: why are we having different behaviors between platforms? Should be because we use different C libs right? Can we just force the binary wrappers on Darwin to use glibc instead of whatever Darwin uses by default? |
Again, I am not against disabling command substitution. I want to see an actual usage of command substitution, because the only example we have right now can be rewritten to not use it anymore ( |
BTW, I think we don't have to worry about command substitution:
So looking at it, there are no cases where command substitution would be useful. What all those cases are doing is calculating something at build time (look at the examples above and it will be clear). Of course, maybe there are some cases using other So my conclusion: let's use |
As the author of the runtime expansion (RTE) used as an example, I'd like to point out that the RTE was only used because it was possible, as a half hack to allow users to choose Wayland vs X for GUI programs without having multiple installs. In this particular case, I believe a shell wrapper is justified and there are no scripts that will use the wrapped GUI programs on shebang lines. If a shell wrapper is not desired, a tiny single-purpose wrapper binary could be written that calls the regular binary wrapper of the GUI program. Furthermore, IMHO the spirit of makeWrapper is to make the wrapped program behave correctly on Nix. I can't think of a use case for RTE for that purpose. So, before trying to add RTE to makeBinaryWrapper, why not look for other examples of it in use? I think you'll not find any that justify adding generic support for RTE. Instead, we can split the wrapper into a static makeBinaryWrapper and a shell makeWrapper that calls it. |
Are there any workarounds for this, as in examples of how to do RTE-like expansions? I need to expand a HOME variable in command-line arguments for a program (otherwise it won't know where to put the files, because unfortunately it doesn't read those variables by itself).
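One possible workaround, along the lines of the tiny single-purpose wrapper binary suggested earlier in the thread, is to build the argument from `getenv("HOME")` in C and then exec the real program. The helper below is only a sketch of the string-building step; the function name `arg_with_home` and any flag or path passed to it are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<prefix><value of $HOME><suffix>" in a freshly malloc'ed string.
 * Returns NULL if HOME is unset or allocation fails; the caller frees it. */
char *arg_with_home(const char *prefix, const char *suffix) {
    const char *home = getenv("HOME");
    if (home == NULL) {
        return NULL;
    }
    size_t len = strlen(prefix) + strlen(home) + strlen(suffix) + 1;
    char *arg = malloc(len);
    if (arg == NULL) {
        return NULL;
    }
    snprintf(arg, len, "%s%s%s", prefix, home, suffix);
    return arg;
}
```

A wrapper's `main` could place the returned string into a new argv array ahead of the user's own arguments and pass that array to `execv` together with the path of the wrapped program.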
makeBinaryWrapper: Variable expansion at runtime
Right now, several packages (typically desktop GUI applications) that are wrapped with makeWrapper (shell version) depend on variable expansion at runtime (as opposed to build time) happening in the bash wrappers. This is currently not possible to do using the existing implementation of binary wrappers.
nixpkgs/pkgs/applications/networking/browsers/chromium/default.nix
Lines 183 to 185 in 64ab981
Above you can see an example of such an application (chromium) being wrapped in a way which currently can't be migrated to using binary wrappers.
wordexp.h
wordexp.h (https://man7.org/linux/man-pages/man3/wordexp.3.html) seems like it could enable this for binary wrappers as well. Below you can see an example using the string from the chromium bash wrapper.
stdout
It seems like wordexp will split words into separate entries in the output array. Note that several characters (like newlines) are unsupported in the string passed to wordexp. They cause wordexp to return an error code, so wordexp is likely not something that can/should be used as a default for --set, --set-default, --add-flags, etc.
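To illustrate the error codes mentioned above, here is a small sketch that maps the return value of wordexp to the name of the corresponding POSIX constant. The helper names `wordexp_error_name` and `wordexp_status` are made up for this example, and `WRDE_NOCMD` is passed as an assumption so that probing a string cannot run any command.

```c
#include <wordexp.h>

/* Map a wordexp() return value to the name of its POSIX constant. */
const char *wordexp_error_name(int err) {
    switch (err) {
    case 0:            return "success";
    case WRDE_BADCHAR: return "WRDE_BADCHAR";
    case WRDE_BADVAL:  return "WRDE_BADVAL";
    case WRDE_CMDSUB:  return "WRDE_CMDSUB";
    case WRDE_NOSPACE: return "WRDE_NOSPACE";
    case WRDE_SYNTAX:  return "WRDE_SYNTAX";
    default:           return "unknown";
    }
}

/* Expand `words` only to learn its status; the result is discarded. */
const char *wordexp_status(const char *words) {
    wordexp_t p;
    int err = wordexp(words, &p, WRDE_NOCMD);
    if (err == 0) {
        wordfree(&p);
    }
    return wordexp_error_name(err);
}
```

An unquoted newline, for instance, is expected to be reported as `WRDE_BADCHAR`, which is the case the paragraph above warns about.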
Thanks to @thiagokokada for the suggestion of using wordexp.h.
Related: