Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run Nix setup.sh? #26

Closed
15 of 19 tasks
andychu opened this issue Aug 7, 2017 · 30 comments
Closed
15 of 19 tasks

Run Nix setup.sh? #26

andychu opened this issue Aug 7, 2017 · 30 comments

Comments

@andychu
Copy link
Contributor

andychu commented Aug 7, 2017

Hm this uses several (ugly) bash features not implemented in Oil:

We should definitely be able to parse it -- running it will take some work.

  • trap builtin
  • compatibility issue: case "$(type -t "$hookName")" in fails in OSH because we respect errexit in command sub, but bash doesn't
  • compatibility issue: for i in "${nativePkgs[@]}"; do -- OSH doesn't allow indexing empty var/string with [@]
  • type builtin (should be easy, only need -t)
  • read flags: -r, -n etc. (but lack of -r not implemented)
  • oldOps="$(shopt -po nounset)"; ...; eval "$oldOpts" (use case for "with" in Oil)
  • setup.sh has same issue as Gentoo in issue Implement [ #19, trying to run [ with empty $PATH
    • test -x
  • content="${content//"$pattern"/$replacement}"
  • ${FUNCNAME[@]}
  • ${!varRef} -- completion scripts also use this
  • shopt -s nullglob
  • local -a times=(...) (unparsed)
  • declare -a (unparsed)
  • append e.g. eval "$var"'+=("$pkg")' (unparsed)
  • might be tricky: if [[ -z "$makeFlags" && ! ( -n "$makefile" || -e Makefile || -e makefile || -e GNUmakefile[[ ) ]]

Deferred:

  • named file descriptors: exec {fd}< "$fn" (unparsed)
  • closing descriptor: exec {fd}<&- (unparsed)
  • printf builtin -- %q is used, but external command should be OK

https://github.com/NixOS/nixpkgs/blob/master/pkgs/stdenv/generic/setup.sh#L318

@andychu
Copy link
Contributor Author

andychu commented Aug 7, 2017

FYI @lheckemann

@lheckemann
Copy link
Contributor

Awesome, thanks! I'll see if I can find the time to contribute some of this functionality myself.

@andychu
Copy link
Contributor Author

andychu commented Aug 7, 2017

OK great I'm going to tackle the parsing stuff first, so:

  • declare -a / local -a
  • +=
  • {fd}<
  • <&-

The type builtin is probably the easiest one ... the command dispatch is in core/cmd_exec.py. Let me know if anything doesn't make sense, or if some more docs on the wiki would help.

andychu pushed a commit that referenced this issue Aug 8, 2017
Note that we accept flags but don't do anything with them yet.  The
variable is known to be an array from the RHS, not LHS.

We might have to parse these dynamically later, as with 'export', but so
far all usages are static.

Addresses issue #26.
andychu pushed a commit that referenced this issue Aug 9, 2017
Extract EvalLhs function and share it.

Still need to parse LHS indexing, e.g. a[x]+=b.

Addresses issue #26.
@andychu
Copy link
Contributor Author

andychu commented Aug 9, 2017

Note on last bullet point. This works fine:

$ bin/osh -n -c '[[ -z "$makeFlags" && ! ( -n "$makefile" || -e Makefile || -e makefile || -e GNUmakefile[[ ) ]]'             

The LST doesn't preserve the grouping parentheses, but that's (maybe) an issue for conversion and not execution.

@andychu
Copy link
Contributor Author

andychu commented Aug 9, 2017

@lheckemann If you pull master, the setup.sh should parse now, with:

osh -n setup.sh

It doesn't run yet though -- failing on the shopt builtin.

Also, I believe these lines in isELF and isScript:

exec {fd}< "$fn"
read -r -n 2 -u "$fd" magic
exec {fd}<&-

should be replaced with:

read -r -n 2 magic < "$fn"

The pattern they're using is "open to fd", then "read -u $fd", then "close $fd". There's no point to this as you can just redirect the stdin of read directly from the file "$fn".

(This similar to the point here [1] I made about exec 6>&1 then exec 1>&6 6>&- in an Apache yetus script; it can be vastly simplified to a simple redirect of a function. I need to write a blog post about this.)

[1] https://www.reddit.com/r/oilshell/comments/6kygto/osh_runs_real_shell_programs/djr8ier/

OSH is still pretty far from running this script, so I wouldn't change it for OSH. But if there is a decent way of testing it, maybe someone wants to change it for readability / simplicity / slight efficiency.

I'm going to put off the exec {fd}< and exec <&- constructs for last, because I still haven't seen a good use of them.

@lheckemann
Copy link
Contributor

Great! I'll take a look at implementing some of the missing functionality as soon as I've managed to get oil building from the repo.

@lheckemann
Copy link
Contributor

wrt printf — I noticed that on my NixOS system (with coreutils 8.26)

$ printf "%q\n" $'\n'
$'\n'
$ $(which printf) "%q\n" $'\n'
''$'\n'

Meanwhile on an old debian system (coreutils 8.23)

linus@soundray:~$ printf "%q\n" $'\n'
$'\n'
linus@soundray:~$ $(which printf) "%q\n" $'\n'
/usr/bin/printf: %q: invalid conversion specification

So depending on the coreutils version, support for %q will either be absent or (as far as I can tell) incorrect… But bash has had it for a long time now. I guess this is another one of those issues to handle if anyone ever needs it.

@andychu
Copy link
Contributor Author

andychu commented Aug 9, 2017

OK good point about %q. I added a note about it above. It looks like it's used mostly for debug output, and it's easy for OSH to print an unambiguous representation of an array of strings, even if it's not through printf %q.

I'm not opposed to a full printf builtin, but it's a fair amount of work that probably doesn't have to block this.

@andychu
Copy link
Contributor Author

andychu commented Aug 9, 2017

Oh also make sure you can run say test/spec.sh builtins or test/spec.sh smoke. That's on the Contributing page. It's pretty essential for implementing say type -t.

test/spec.sh all does all of them in parallel, without showing output.

@Ericson2314
Copy link

Ericson2314 commented Aug 9, 2017

My favorite bash oddity is the hardly-believable local -a 'foo=("$bar")' (single quotes intentional).

andychu pushed a commit that referenced this issue Aug 10, 2017
Details:

- Implement LooksLikeGlob() with unit tests

Other:

- Add a stub/test for 'shopt -s failglob too', but it's not
  implemented.
- Fix up comments in core/glob_.py

Addresses issue #26.
andychu pushed a commit that referenced this issue Aug 10, 2017
Details:

- Implement LooksLikeGlob() with unit tests

Other:

- Add a stub/test for 'shopt -s failglob too', but it's not
  implemented.
- Fix up comments in core/glob_.py

Addresses issue #26.
@andychu
Copy link
Contributor Author

andychu commented Aug 10, 2017

Hm wow I didn't know about local -a 'foo=("$bar")'. It also works with declare.

That's very similar to issue #3, which is runtime parsing of arithmetic expressions inside strings.

Bash is very confused about data vs. code.

I think this is basically because local foo=bar is a hack to begin with. OSH parses it statically, but to bash, local is just a builtin that gets an argv array, and the code is parsed dynamically.

That is, local foo=bar is the same as local 'foo=bar'. And then that makes local 'foo=(1 2 3)' work too.

@andychu
Copy link
Contributor Author

andychu commented Aug 10, 2017

a5c107a

$FUNCNAME tested but not implemented

andychu pushed a commit that referenced this issue Aug 10, 2017
Polish var-ref tests.

Addresses issue #26.
@Ericson2314
Copy link

Ericson2314 commented Aug 10, 2017

@andychu I just learned variable expansion is array + ( specific!

$ declare -a 'foo=("$bar")'; set | grep foo
_='foo=("$bar")'
foo=([0]="")
$ declare -a 'foo="$bar"'; set | grep foo
_='foo="$bar"'
foo=([0]="\"\$bar\"")
$ declare +a 'foo=("$bar")'; set | grep foo
_='foo=("$bar")'
foo='("$bar")'

@andychu
Copy link
Contributor Author

andychu commented Aug 10, 2017

Interesting examples. I haven't encountered stuff like this in the wild, but I think I get what's going on there, if not why.

The declare builtin dynamically parses its first string arg as an assignment in all cases.

  1. -a means array, so it parses everything to the right of = as an array, which has one element here. $bar is expanded to the empty string.
  2. -a means array, but there are no (). So declare turns everything to the right of = into a string, and then coerces it into an array by putting the string in the array. It doesn't expand $bar. This is weird but that's what it's doing.
  3. +a means it's not an array, so it turns everything to the right of = into a string, even though it has the (). It doesn't expand $bar.

This would make a great post for "Bash: the Awful Parts" but I'm intentionally concentrating on OSH/Oil over the blog. But if I get Oil done and need to justify it, there is a great example :)

Right now OSH doesn't have an array attribute of a cell/location -- it only has Str and StrArray values, which go inside cells. So the type is carried along with the value rather than the location, which I think makes more sense. It remains to be seen whether this breaks any bash programs, but I think it will be better if programs settle on a consistent style rather than this craziness.

The problem is that bash has two ways to express an array -- -a vs +a, and (a b c) vs 'a b c'. I don't see why these two ways are necessary. In Oil I just use the presence of array literals to tell me.

That is,

  • declare -a myarray can simply be written declare myarray=().
  • declare +a mystring can simply be written declare mystring=''

This doesn't work in bash but it has no ambiguity. It's a little more Python-like, where objects have types, not variables.

Thanks for the examples!

@Ericson2314
Copy link

Ericson2314 commented Aug 10, 2017

The "type" of an undefined variable is observable here:

$ declare -i a
$ a+='   1'
$ echo "$a"
1
$ a+='   1'
$ echo "$a"
2

I fear bash is poorly designed to the point where almost all internal implementation details are not abstracted way.

@andychu
Copy link
Contributor Author

andychu commented Aug 11, 2017

Hm that's another good one. And note that (( a += ' 1' )) does something different -- it's always arithmetic, even without the -i attribute.

I hope that OSH doesn't have to implement these flags to make real programs run, but I saved them here in case:

https://github.com/oilshell/oil/blob/master/spec/type-compat.test.sh

Nix is definitely using some nooks and crannies of bash, but nothing too insane. The eval / var ref stuff is a little odd but I think it should work in OSH.

I understand {!varRef} but declare -n is pretty wierd:

https://github.com/oilshell/oil/blob/master/spec/var-ref.test.sh

If you have any slightly easier bash scripts to run I would be interested in those too :)

andychu pushed a commit that referenced this issue Aug 11, 2017
Also make note of a var-ref issue where $? is not dynamic.

Addresses issue #26.
@andychu
Copy link
Contributor Author

andychu commented Aug 13, 2017

FYI I turned the exec examples into a blog post, along with the other example I mentioned above:

http://www.oilshell.org/blog/2017/08/12.html

tl;dr isELF and isScript in setup.sh can be written in a simpler fashion, like this:

https://github.com/oilshell/blog-code/blob/master/hard-coded-descriptors/demo.sh#L24

@lheckemann
Copy link
Contributor

Re: printf again, looking at it again I see that the output from coreutils, while awkward, is correct after all. So not implementing it as a builtin should be fine as long as we're ok with depending on a recent version of coreutils.

andychu pushed a commit that referenced this issue Aug 22, 2017
There are two strategies, depending on the pattern.

1) Fixed strings use Python's string methods, e.g.
startswith/endswith/replace/slice.

2) Glob patterns are converted to Python regexes.  (Character classes
aren't currently supported.)

Then we use the regex engine for position information and
greedy/non-greedy matches.

Also:

- Added tests.
- Fix parsing.
- TODO: Unicode

Addresses issue #26.
@Ericson2314
Copy link

Ericson2314 commented Aug 26, 2017

@andychu I rewrote is* that more complicated way as I thought I was gong to call them from a context when stdin was already in use (while read; do .. done < stuff). Ended up not doing that so I suppose I could revert.

@andychu
Copy link
Contributor Author

andychu commented Aug 26, 2017

@Ericson2314 There is no problem with using isElfSimple while stdin is being redirected, because the < redirect implicitly saves and restores process state.

The way read < foo works is:

  1. save process descriptor state, open file foo, make it descriptor 0
  2. execute read builtin, which uses descriptor 0
  3. restore process descriptor state. Now descriptor 0 is whatever it was before.

Nothing but read happens between steps 1 and 3. If there were any concurrency, it would happen in a different process, so it's always safe.

I think this wasn't have been entirely clear to me before writing a shell, but it's clear now.

Demo here:

oilshell/blog-code@f61ff5c

$ ./demo.sh isElfSimpleWithStdin 
1
YES
NO

2
YES
NO

3
YES
NO

EDIT: Rewrite of somewhat related example from Reddit:

afbfd5f

andychu pushed a commit that referenced this issue Aug 28, 2017
I learned that the [ language isn't really specified by a grammar, as [[
is.  Instead we handle it like bash does: by splitting it into 6 cases:
expressions of length 0, 1, 2, 3, 4, and more.

The bash source implies that POSIX specifies this.  And it explains why
-a and -o are deprecated.

I'm not supporting -a as a unary operator, since it's an alias for -e.

I also added a couple test cases to dbracket.test.sh, and fixed some
syntax error handling bugs in [[.

NOTE: There's still a bug where [ abc = 'a*' ] should not support
globbing.

Addresses issue #19 and issue #26.
andychu pushed a commit that referenced this issue Sep 6, 2017
Used by setup.sh in Nix.

Addresses issue #26.
andychu pushed a commit that referenced this issue Sep 6, 2017
andychu pushed a commit that referenced this issue Sep 6, 2017
We can print options in two formats, as well as set the options from the
'set' builtin.  There are still a few cases left to do, like 'set -o'.

- Tests for shopt -p and shopt -po

Addresses issue #26.
@andychu
Copy link
Contributor Author

andychu commented Sep 6, 2017

@Ericson2314 I noticed this commit

NixOS/nixpkgs@81194ee

where you noted:

${foo+"${foo[@]}"} will prevent set -u problems with empty arrays and older without making a single '' in the empty case.

but you say this happens only on old versions of bash? Do you have a new version of bash where it doesn't happen? This came up on the bash mailing list:

http://lists.gnu.org/archive/html/help-bash/2017-09/threads.html

I get this undesired behavior in bash 4.3, and your workaround works. Just curious where you found this or if you came up with it... it is not obvious to me!

$ declare -a empty=()  # empty array, not an unset variable
$ set -u
$ echo "${empty[@]}"
-bash: empty[@]: unbound variable
$ bash --version
GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)

@andychu
Copy link
Contributor Author

andychu commented Sep 14, 2017

I released OSH 0.1 with all these bug fixes:

http://www.oilshell.org/blog/2017/09/09.html

There are a few issues left, as noted at the top, but I'm putting them on the back burner for now. I'd appreciate any more bug reports or feedback on what distros want from a shell. (And patches appreciated too, although I fixed most of the low-hanging fruit)

I know it is slow and unpolished, so I'd like to add something you can't get from bash in the near term. I was thinking of adding of adding better debugging features:

  • a better set -x, in particular one that shows all the exit codes, so you can figure out why a script that uses set -e exited
  • being able to dump all variables on the stack after a failure and inspect them. I'm not going to implement a debugger but something like a 'crashdump' builtin might be useful for debugging.

(The other place it's already better than bash is osh -n, which does static parsing.)

I think the initial use case for OSH is building Linux distros, i.e. large shell programs (as opposed to interactive usage).

I think I will concentrate on Alpine Linux for the next release, because it's very small, but I'm still interested in feedback from Nix maintainers and users.

(BTW I also discovered on the mailing list that bash 4.4 fixed the set -u and empty array bug.)

@andychu
Copy link
Contributor Author

andychu commented Sep 20, 2017

@lheckemann @Ericson2314 Wondering if you guys had any feedback on this?

Sorry if this issue got pinged a lot an overwhelmed your mailbox. I'm not sure if it sends notifications when I edit the top-level comment, which I did a lot.

@lheckemann
Copy link
Contributor

No worries, thanks for keeping me up to date! I've been busy with other stuff lately, but have just tried running it again.

I discovered a bug (#40 includes a test that demonstrates it). This results in the line if [ "${NIX_DEBUG:-}" = 1 ]; then complaining osh error: Expected unary operator, got '='.

The script still fails overall, in a rather odd way… in the runHook function, we have the following lines:

    local hookName="$1"
    shift
    local hooksSlice="${hookName%Hook}Hooks[@]"

This fails after "shift" with the message

Line 0 of '<unknown>'
  <no position info for token>
osh failed: Undefined variable

(it would be useful if code loaded with source could have position information associated with it, I had to use -x and look through the code by hand to debug this). Oddly enough, if I add echo $hookName it works just fine! It works with : instead of echo too. Is this a consequence of some sort of slightly-too-eager optimising out of unused variables or something?

@andychu
Copy link
Contributor Author

andychu commented Sep 22, 2017

Thanks for finding that! I just replied to the pull request.

What's the command I can reproduce this with? I was running osh setup.sh or 'bin/osh -c "source osh.s"' and getting a different error.

$ bin/osh -c 'source setup.sh'                                                                     
osh warning: *** trap not implemented ***
Line 346 of 'setup.sh'
  for i in "${nativePkgs[@]}"; do
            ^~
osh failed: Can't index string with @: (Str s:"")

Also, I should say it might be time to port the spec tests to run in a chroot with busybox, so we can all have the exact same environment. Ironically, the spec tests themselves illustrate a big problem with shell -- Linux distros vary widely.

I don't want you to have to fix up a million issues before even getting started and that looks like where it's going...

Linux Standard Base was supposed to solve this but it failed. I think containers/chroots are the modern answer, and every distro can run them.

I have been playing with Alpine Linux, and I really like the 4 MB chroot ... it seems useful in contexts in addition to the spec tests, like generating docs that include code snippet output and so forth.

@lheckemann
Copy link
Contributor

Sorry, I think the failure wasn't actually on that line after all. My best guess is that it's related to the "Hack around old bash being bad and thinking empty arrays are undefined." on the lines after, which is a bit mind-bending. I'll see if I can figure out a minimal test case for it.

@lheckemann
Copy link
Contributor

Update: I'm just a bit of an idiot and mistook rebuilding a bunch of stuff for everything working. The same error still occurs with the : $hookName.

@andychu
Copy link
Contributor Author

andychu commented Sep 23, 2017

@lheckemann I didn't follow that last part, but please try again after the fix I just made and let me know what the current error is.

  • I can explain the empty arrays hack if you are interested... I just learned about that and it took me awhile to figure out. It is a change in bash 4.4 -- 4.3 treats empty arrays with set -u differently. (So it's not actually a very old bash, 4.4 came out last year)
  • If you are still getting this, file a bug with the script /command that triggers it and I can probably fix it.
Line 0 of '<unknown>'
  <no position info for token>
osh failed: Undefined variable

@andychu
Copy link
Contributor Author

andychu commented Jan 17, 2018

TODO: Try building "hello" package with Nix as described here:

https://www.reddit.com/r/oilshell/comments/7qn0p8/success_with_aboriginal_alpine_and_debian/dsrnp0q/

Now that I look at this bug again, the errexit compatibility issue I mentioned was fixed due to Alpine. As far as I remember, setup.sh used some pretty hairy constructs but it's probably worth another try.

@andychu
Copy link
Contributor Author

andychu commented May 16, 2019

I checked off a few more things on this dormant issue. We're now using this page to prioritize:

https://github.com/oilshell/oil/wiki/Shell-Programs-That-Run-Under-OSH

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants