Skip to content

Conversation

mbland
Copy link
Owner

@mbland mbland commented Mar 21, 2017

Closes #142.

The process of testing revealed so much more than just early bugs in the new command script itself. See the commit messages for the gory details, but the highlights are:

  • Previously the version check at the very beginning of go-core.bash would report that Bash 4.0 and 4.1 were unsupported. This has been fixed.
  • I uncovered another Bash array initialization bug, this time related to brace expansion, that caused several test cases from tests/new.bats to fail on Bash versions prior to 4.1. As with the other array bug from commits b421c73 and c6bf1cf, the fix was to declare the array on one line and initialize on the next, rather than declaring and initializing it on the same line.
  • Debugging test failures on Ubuntu and Arch Linux led me to discover a silent failure in @go.test_compgen whereby the underlying compgen would, in some cases, happily return a successful status even when no completions were generated. @go.test_compgen now fails loudly in this case, as you practically never want it to return empty results (otherwise just use the empty string as test data directly).
  • The big one: I discovered a subtle bug in several tests/new.bats test cases only because of a difference in array expansion between Bash versions up to and including 4.2.31 and Bash versions from 4.2.32 on—only when a single-item array is expanded such that it results in the empty string. See the commit message for 5abb59b for all the juicy details.
  • Finally, running tests/new.bats on Windows using the MSYS2 Bash from Git for Windows revealed a bug in stub_program_in_path from lib/bats/helpers whereby stubbing a system program like chmod may result in the program stub itself getting executed as part of create_bats_test_script. The fix was easy enough, by calling create_bats_test_script before setting PATH.

mbland added 8 commits March 20, 2017 12:12
Previously it would report that Bash 4.0 and 4.1 were unsupported.

Since Bash always sets `BASH_VERSION`, it's not practical to write an
automated test unless different versions of the Bash binary are
available. I may add one, but since I've tested it manually on my
machine and this condition isn't likely to ever change, it's probably
not worth it.
Under Bash 3.2.57(1)-release, `./go test new` was failing with:

```
 ✗ new: tab complete --internal
   (in test file tests/new.bats, line 76)
     `touch "${internal_modules[@]}"' failed
   touch: (/var/folders/dl/0j29q1wd0w715j4gnz5ry_pc0000gn/T/test rootdir/scripts/lib/foo): No such file or directory
 ✗ new: tab complete --public
   (in test file tests/new.bats, line 94)
     `touch "${public_modules[@]}"' failed
   touch: (/var/folders/dl/0j29q1wd0w715j4gnz5ry_pc0000gn/T/test rootdir/lib/xyzzy): No such file or directory
 ✗ new: tab complete --test
   (in test file tests/new.bats, line 112)
     `touch "${test_files[@]}"' failed
   touch: (/var/folders/dl/0j29q1wd0w715j4gnz5ry_pc0000gn/T/test rootdir/tests/frotz.bats): No such file or directory
 ✗ new: tab complete --type
   (in test file tests/new.bats, line 130)
     `touch "${text_files[@]}"' failed
   touch: (/var/folders/dl/0j29q1wd0w715j4gnz5ry_pc0000gn/T/test rootdir/gue/wizard.txt): No such file or directory
```

At first I thought this was the "declare and initialize an array at the
same time" bug from commit b421c73 and
commit c6bf1cf, as the workaround was
the same: declare the array on one line, and initialize it on another.
After thinking it through, however, I realized this bug was different,
since the earlier bug had to do with exported arrays not getting
initialized, and these arrays were `local`. On top of that, it appeared
that the brace expansion was to blame, since `touch` appeared to see
only the final brace expansion value, and that value was wrapped in
parentheses.

To verify this, I added this line before one of the `touch` calls:

  printf 'ARG: %s\n' "${internal_modules[@]}" >&2

which produced:

 ✗ new: tab complete --internal
   (in test file tests/new.bats, line 77)
     `touch "${internal_modules[@]}"' failed
   ARG: (/var/folders/dl/0j29q1wd0w715j4gnz5ry_pc0000gn/T/test rootdir/scripts/lib/foo)
   touch: (/var/folders/dl/0j29q1wd0w715j4gnz5ry_pc0000gn/T/test rootdir/scripts/lib/foo): No such file or directory

After reviewing https://tiswww.case.edu/php/chet/bash/CHANGES, this
appeared to be the most likely culprit:

  This document details the changes between this version,
  bash-4.1-alpha, and the previous version, bash-4.0-release.

  bb. Fixed a bug that caused brace expansion to take place too soon in
      some compound array assignments.

Using the methodology described in the log message for commit
99ab780, I downloaded the Bash 4.0 and
4.1 sources and patches, and confirmed that the bug manifested under
Bash 4.0.44 (the highest patchlevel for 4.0) and did not manifest under
Bash 4.1.

Also, for future reference, the official Bash git repository is at:

  https://savannah.gnu.org/git/?group=bash
  http://git.savannah.gnu.org/cgit/bash.git

However, neither the patches from
https://mirrors.ocf.berkeley.edu/gnu/bash/ nor the Git repository show a
specific change for the fix, and the diff between 4.0.44 and 4.1 is too
large and difficult to parse to easily identify the fix.
Unsorted `compgen` and unescaped `{` characters in regular expressions strike
again.

Also, though the coverage report only shows 82.9%, that's because `kcov` isn't
recognizing any of the array assignments as executed code. Coverage is actually
100%.
The process of fixing these failures also exposed a silent failure in
`@go.test_compgen` that will be addressed in the next commit.
The previous commit rectified errors in assertions from `tests/new.bats`
whereby assertions that should have failed as written did not. Though
the code under test was sound, these latent test errors were due to the
fact that `@go.test_compgen` would produce an empty result array and
return success if the underlying `@go.compgen` returned empty results.

When this empty array was passed to `assert_success` or `assert_failure`
using "${array[@]}" notation, it expanded to an empty argument list, in
which case these assertions did not check the value of `$output`. This
left the impression that the output matched a nonempty list of values
when the assertions were actually not comparing the output at all.

Now `@go.test_compgen` will fail loudly with `@go.compgen` results are
empty, and any case in which the underlying `@go.compgen` returns an
error.
The previous arguments to `@go.test_compgen` should have ended with a
trailing slash; as it stood, it was returning the directory itself, with
a trailing slash appended. After some experimentation, it emerged that
removing the directory prefix (with trailing slash) from the expected
results (e.g.  `$TESTS_GO_SCRIPTS_DIR/lib/`) resulted in passing the
empty string as the sole argument to `assert_success` through Bash
4.2.31, and in passing no arguments at all to `assert_success` from Bash
4.2.32 on.

Again using the methodology described in the log message for commit
99ab780, I built different Bash
versions and performed a binary search using the following command until
I pinpointed 4.2.32 as the inflection point, replacing `BASH_VERSION` in
`PATH` at each iteration:

  $ PATH=/usr/local/bash-builds/BASH_VERSION/bin:$PATH \
      TEST_FILTER='tab complete --internal' gos test new

I then added the following line to `assert_success` from
`lib/bats/assertions`:

  printf 'ARGC: %d\n' "$#" >&2

and added `return 1` after the `assert_success` call in the `tab
complete --internal` test case to force an error under 4.2.32, which
produced the following results (edited for clarity):

  $ PATH=/usr/local/bash-builds/4.2.31/bin:$PATH \
      TEST_FILTER='tab complete --internal' gos test new

   ✗ new: tab complete --internal
     (in test file tests/new.bats, line 96)
       `assert_success "${expected[@]#$TEST_GO_SCRIPTS_DIR/lib/}"' failed
     ARGC: 1
     ARGC: 1
     output not equal to expected value:
       expected: ''
       actual:   'bar
     baz
     foo'

  $ PATH=/usr/local/bash-builds/4.2.32/bin:$PATH \
      TEST_FILTER='tab complete --internal' gos test new

   ✗ new: tab complete --internal
     (in test file tests/new.bats, line 96)
       `assert_success "${expected[@]#$TEST_GO_SCRIPTS_DIR/lib/}"' failed
     ARGC: 1
     ARGC: 0

Later I reproduced this behavior using a standalone script,
`expand-empty-list-test.bash`:

  #! /usr/bin/env bash

  print_argc() {
    printf 'ARGC: %d\n' "$#"
    for arg in "$@"; do
      printf 'ARG:  "%s"\n' "$arg"
    done
  }

  ARGV=('foo')
  print_argc "${ARGV[@]#foo}"

The last line can also be replaced with the following to achieve the
same effect:

  set - "${ARGV[@]#foo}"
  print_argc "$@"

Running either version of this script produces the output:

  $ /usr/local/bash-builds/4.2.31/bin/bash expand-empty-list-test.bash
  ARGC: 1
  ARG:  ""

  $ /usr/local/bash-builds/4.2.32/bin/bash expand-empty-list-test.bash
  ARGC: 0

Also, this behavior _only_ manifests when an array containing a single
value is expanded such that all of its characters are removed. Various
other values for `ARGV` (including the empty list) that result in more
than one value (even more than one empty string) do not exhibit this
behavior. Any form of expansion that removes all the characters from
every list item (including `##*`, `%%*`, and `//*/`) only exhibit this
behavior when `ARGV` contains a single string.

This is the patch and bug report for Bash 4.2.32:

  https://mirrors.ocf.berkeley.edu/gnu/bash/bash-4.2-patches/bash42-032
  http://lists.gnu.org/archive/html/bug-bash/2012-05/msg00010.html

Which appears to correspond to the following from
https://tiswww.case.edu/php/chet/bash/CHANGES:

  This document details the changes between this version,
  bash-4.3-alpha, and the previous version, bash-4.2-release.

  ggg. Fixed a bug that causes the shell to delete DEL characters in the
       expanded value of variables used in the same quoted string as
       variables that expand to nothing.

This doesn't speak directly to the "single item array expansion to empty
string" behavior difference, but the patch contains the following change
to `expand_word_internal` from `subst.c`:

  *** ../bash-20120427/subst.c  2012-04-22 16:19:10.000000000 -0400
  --- subst.c 2012-05-07 16:06:35.000000000 -0400
  ***************
  *** 8152,8155 ****
  --- 8152,8163 ----
        dispose_word_desc (tword);

  +     /* Kill quoted nulls; we will add them back at the end of
  +        expand_word_internal if nothing else in the string */
  +     if (had_quoted_null && temp && QUOTED_NULL (temp))
  +       {
  +         FREE (temp);
  +         temp = (char *)NULL;
  +       }
  +
        goto add_string;
        break;

`QUOTED_NULL` is defined in `subst.h` as:

  /* Is the first character of STRING a quoted NULL character? */
  #define QUOTED_NULL(string) ((string)[0] == CTLNUL && (string)[1] == '\0')

and `CTLNUL` is defined in `shell.h` as:

  #define CTLNUL '\177'

Back to `expand_word_internal`, the `add_string:` label following the
`Kill quoted nulls` block is followed by:

  add_string:
    if (temp)
      {
        istring = sub_append_string (...);
        temp = (char *)0;
      }

    break;

Finally, the end of the function contains the following comment and
block of code:

```
finished_with_string:
  /* ... */

  /* If we expand to nothing and there were no single or double quotes
     in the word, we throw it away.  Otherwise, we return a NULL word.
     The single exception is for $@ surrounded by double quotes when
     there are no positional parameters.  In that case, we also throw
     the word away. */

  if (*istring == '\0')
    {
      if (quoted_dollar_at == 0 && ...)
        {
          /* ... */
        }
      /* According to sh, ksh, and Posix.2, if a word expands into nothing
         and a double-quoted "$@" appears anywhere in it, then the entire
         word is removed. */
      else  if (quoted_state == UNQUOTED || quoted_dollar_at)
        list = (WORD_LIST *)NULL;
```

As best I can tell, the 4.2.32 patch that kills `QUOTED_NULL` values
prevented the assignment to `istring` (under the `add_string:` label)
that was still performed as of 4.2.31. Somehow expanding a single-item
array such that it results in the empty string produces a `QUOTED_NULL`,
hence the difference in behavior betweeen the two versions.

With this change to the test, the bugs in the tests themselves are
fixed, and the tests all pass under both versions. That said, this is a
rather pernicious kind of bug that seems difficult to detect and defend
against, given that it only manifests when single-item arrays are
expanded as a single empty string. The interface to `assert_success` and
`assert_failure` isn't wrong; it's intended to make it easy to check
command output without requiring two different assertions when a
straight equality check will do—especially when there should be no
output at all. I'll have to think on whether there's an effective
mechanism or idiom that can guard against it; but for now, my guard is
up, at least.
The following test case from `tests/new.bats` was failing on Windows
(using the MSYS2 Bash 4.3.46(2)-release from Git for Windows):

   ✗ new: error if setting permissions fails
     (from function `stub_program_in_path' in file lib/bats/helpers, line 320,
      in test file tests/new.bats, line 268)
       `stub_program_in_path 'chmod' 'printf "ARG: %s\n" "$@"' 'exit 1'' failed
     ARG: 700
     ARG: /tmp/test rootdir/bin/chmod
     ARG: -R
     ARG: u+rwx
     ARG: /tmp/test rootdir

Apparently on this platform, `chmod` wasn't already `hash`-ed, and using
`stub_program_in_path` to stub out `chmod` was causing
`create_bats_test_script` to look up `chmod` in `PATH` andinvoke the
stub script itself.

The fix: call `create_bats_test_script` before setting `PATH`. The
existing `stub_program_in_path for testing external program` test case
from `tests/bats-helpers.bats` now uses `chmod` for its example program
instead of `git`, which reproduced the original failure and verifies its
fix.
@mbland mbland added this to the v1.4.0 milestone Mar 21, 2017
@mbland mbland self-assigned this Mar 21, 2017
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.8%) to 94.477% when pulling f530a60 on generator into bd8994d on master.

@mbland
Copy link
Owner Author

mbland commented Mar 21, 2017

Note that the "decrease" in coverage is because kcov isn't recognizing any of the array assignments in libexec/new as executed code. Coverage of that file is actually 100%, and coverage of the remaining files remains unchanged. Merging.

@mbland mbland merged commit 71d729f into master Mar 21, 2017
@mbland mbland deleted the generator branch March 21, 2017 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create command to generate new commands, modules
2 participants