Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[osh] arithmetic evaluation context in test clause #3

Closed
d630 opened this issue Dec 23, 2016 · 6 comments
Closed

[osh] arithmetic evaluation context in test clause #3

d630 opened this issue Dec 23, 2016 · 6 comments

Comments

@d630
Copy link

d630 commented Dec 23, 2016

If you use the test compound command with arithmetic binary operators, there is also a whole let expression available. In bash you can write stuff like:

[[ ('(x=5, y[10]+=5), x*=y[10]' -eq x) && y[2]==y[10]?1:0 -eq 0 ]];
s=$? declare -p s x y;

osh gives me an AssertionError then.

@andychu
Copy link
Contributor

andychu commented Dec 30, 2016

Wow I didn't know about this. Do you know if it's documented anywhere? I don't see it in "help [[".

Osh actually doesn't parse a top-level "let" yet either. It seems fairly rare -- I think I encountered it exactly once, in Kubernetes, out of hundreds lines of code parsed. Most people use (( )); I think let is basically a ksh-ism.

I noticed that let basically like (( )) except the tokenization is more strict, as in your example.

let i=1+2   # valid
let i=1 + 2   # not valid because of spaces
let 'i=1 + 2'  # adding quotes fixes it

I also noticed that boolean operands can be stuff like:

$ [[ 0x10 -eq 16 ]]; echo $? 
0

So this means additionally that boolean operands have to be parsed as arithmetic expressions. Thanks for pointing it out.

I accept this as a bug for bash compatibility, but I'm prioritizing based on usage in the wild. Have you encountered this in the wild or is it a corner case you noticed?

@andychu
Copy link
Contributor

andychu commented Dec 30, 2016

note: the current error is as follows. In any case it shouldn't be an assertion; it should produce a parse error until it's supported.

osh$ [[ ('(x=5, y[10]+=5), x*=y[10]' -eq x) && y[2]==y[10]?1:0 -eq 0 ]];
Traceback (most recent call last):
  File "/home/andy/git/oil/bin/../core/cmd_exec.py", line 701, in Execute
    raise AssertionError('Error evaluating boolean: %s' % bool_ev.Error())
AssertionError: Error evaluating boolean: ["Invalid integer: invalid literal for int() with base 10: '(x=5, y[10]+=5), x*=y[10]'"]

@d630
Copy link
Author

d630 commented Dec 30, 2016

Ja, it appears not to be documented in bash, probably it's somewhere in
a subtext, in the code or in the mailing list. It says (only) "arg1 OP arg2", and
that these args "may be positive or negative integers".

By the way, ksh considers those arithmetic binary operators as obsolete,
but kindly uses the word "exp" instead "arg". mksh indicates that behavior
in "Arithmetic Expressions":

Integer arithmetic expressions can be used with the let command, inside
$((...)) expressions, inside array references (e.g. name[expr]), as
numeric arguments to the test command, and as the value of an assignment
to an integer parameter.

And comments it while describing the "test" builtin command in "Command
execution":

Note that a number [like in 'number -eq number'; d630] actually may be an
arithmetic expression, such as a mathematical term or the name of an
integer variable:

    x=1; [ "x" -eq 1 ]      evaluates to true

That is, in ksh and mksh that thing is valid in the test builtin command and
in the test/conditional compound command as well; but not in bash's simple
test command, which is here more close to POSIX and dash.

For the test clause in bash, see also its real manual,
which mentions octal and hexadecimal numbers and the "[base#]n" syntax. On top
of that page there is also a reference to the -i flag of the declare builtin
command; once you have declared the integer attribute to a variable (scalar or
vector), you can do arithmetics without using keywords or simple commands:

declare -i s=;
s='(x=5, y[10]+=5), x*=y[10], y[2]==y[10] ? 1 : 0';
declare -p s x y;

I suppose that (( and arithmetic substitution are the "normal" methods to do
arithmetic. let is nice, since each arg of let is equal to an expression, which
must otherwise be separated by a comma.

# My first example can be written as:
let x=5 y[10]+=5 x\*=y[10] y[2]==y[10]\?1:0;
s=$? declare -p s x y;

# And yours also as:
let i=1+2 i;
s=$? declare -p s i;

# But see:
_let () { IFS=,; (($*)); };
_let i=1+2 i;
s=$? declare -p s i;

And there is a special dealing with environment variables in let:

# builtin
i=1+2 let i=i;
s=$? __=$_ declare -p s __ i;

# func
_let () (($*));
i=1+2 _let i=i;
s=$? __=$_ declare -p s __ i;

@andychu
Copy link
Contributor

andychu commented Dec 30, 2016

Yeah I thought about it last night, and I think there is just a function in bash which is basically "try as hard as you can to convert an arbitrary word (string) to an integer, for use in arithmetic". That function handles 0x10, 64#123, and it ALSO tries to parse the word as an arithmetic expression and evaluate it, which is bizarre.

This function is used in lots of places:

  1. each word of let
  2. inside (( ))
  3. inside [[ ]], but ONLY if you're using an arithmetic operator like -eq
  4. inside array indexing

And probably all the other places I mentioned that arithmetic expressions are used [1], except for some reason it doesn't work inside $(( )).

$ (( a=9, a+a+a == 'a*3' )); echo $?
0

$ array=(1 2 3); array[a+a+a == 'a*3']=XXX; echo ${array[@]}
1 XXX 3  

I think this is just sloppiness in bash implementation. The author probably just reused this function in a bunch of places, not quite realizing that it leads to horrible language design. A lot of bash has that flavor where it only considers what you CAN do with the language; it doesn't care what you CAN'T do. The error paths are underspecified.

So yeah the philosophy is if that people are using this "in the wild" we could copy it, but I don't want to cargo cult horrible design mistakes.

[1] http://www.oilshell.org/blog/2016/10/19.html (end of this post)

@andychu
Copy link
Contributor

andychu commented Nov 14, 2018

I'm closing this since I haven't run into any shell scripts that rely on this behavior. It is documented here, and came up a month or so ago on help-bash@ !

https://github.com/oilshell/oil/blob/dev/andy-2/doc/architecture-notes.md#where-strings-are-evaluated-as-code-perhaps-unintentionally

Anyway thanks for the heads up -- that was the first I've heard of it!

@andychu andychu closed this as completed Nov 14, 2018
@d630
Copy link
Author

d630 commented Nov 14, 2018

Info: Documented in bash.1 since http://git.savannah.gnu.org/cgit/bash.git/commit/doc/bash.1?h=bash-5.0-testing&id=9a51695bed07d37086c352372ac69d0a30039a6b

andychu pushed a commit that referenced this issue Jul 15, 2020
Hm this doesn't fix the infinite loop in case #3 of spec/alias?  Still
need to work on that.

mylib: add comments and test cases
andychu pushed a commit that referenced this issue Feb 6, 2021
Why didn't this happen on my own machine?

http://travis-ci.oilshell.org/srht-jobs/2021-02-06__19-34-07.wwz/_tmp/toil/logs/cpp-unit-all.txt

--> COLLECT with 0 roots
i = -2147483648
.AddressSanitizer:DEADLYSIGNAL
=================================================================
==3052==ERROR: AddressSanitizer: SEGV on unknown address 0x7f8666c53034 (pc 0x55618e7f6d5b bp 0x7ffc429f1190 sp 0x7ffc429f1180 T0)
==3052==The signal is caused by a READ memory access.
    #0 0x55618e7f6d5a in gc_heap::str_equals(gc_heap::Str*, gc_heap::Str*) /home/build/oil/mycpp/gc_heap.cc:260
    #1 0x55618e7dafd0 in str_replace_test /home/build/oil/mycpp/my_runtime_test.cc:162
    #2 0x55618e7e7bd6 in main /home/build/oil/mycpp/my_runtime_test.cc:779
    #3 0x7f866924109a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #4 0x55618e7d94b9 in _start (/home/build/oil/mycpp/_bin/my_runtime_test.asan+0xd4b9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/build/oil/mycpp/gc_heap.cc:260 in gc_heap::str_equals(gc_heap::Str*, gc_heap::Str*)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants