# Conditional statements


<div class="alert alert-block alert-info">
    You can find all of the scripts in this notebook in the subdirectory containing this notebook:
    <code>./scripts/conditional_statements</code>
</div>

The Bash `if` statement has the following form:

```sh
if test-commands; then
    commands
elif test-commands; then
    commands
else
    commands
fi
```

The semicolon `;` is a standard shell symbol used to separate commands; it is required between the
`test-commands` and the `then` keyword when `then` appears on the same line as the condition. Alternatively,
an `if` statement may be written as:

```sh
if test-commands
then
    commands
elif test-commands
then
    commands
else
    commands
fi
```

The unusual feature of the Bash `if` statement is the use of commands as the test condition.

## Command exit status as a condition

Recall that all commands should produce an integer exit status where a value of `0` indicates success and any
non-zero integer value indicates a failure of some kind. An `if` statement can use the exit status of a
command or commands (called `test-commands` above) as a condition. If `test-commands` has an exit status
of `0` then the condition is deemed to be true, otherwise the condition is deemed to be false.

<div class="alert alert-block alert-info">
    The command <code>true</code> always does nothing, successfully, and sets its exit status to <code>0</code>. The command
    <code>false</code> always does nothing, unsuccessfully, and sets its exit status to <code>1</code>. Thus the statement
    <code>if true; then echo "true"; fi</code> always prints <code>true</code> and the statement
    <code>if false; then echo "false"; fi</code> never prints anything.
</div>
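
Any command can serve as the condition. As a minimal sketch, the hypothetical function `contains` below (not one of the notebook scripts) uses the exit status of `grep -q`, which prints nothing and exits with `0` only when a match is found, to choose a branch:

```sh
# contains TEXT WORD: hypothetical helper used only for illustration.
# grep -q prints nothing; its exit status alone drives the if statement.
contains() {
    if printf '%s\n' "$1" | grep -q -- "$2"; then
        echo "found"
    else
        echo "not found"
    fi
}

contains "hello world" "world"
contains "hello world" "moon"
```

The first call takes the `then` branch because `grep` exits with `0`; the second takes the `else` branch because `grep` exits with a non-zero status.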

Consider the following script, which accepts one command line argument (the path to a directory), attempts to
change to that directory, and then lists its contents:

---

```sh
#!/bin/bash

# cdls.sh

if cd "$1"
then
    # cd successful
    ls
else
    # cd unsuccessful
    echo "error changing to directory $1" >&2
    exit 1
fi
```

---

In the script above, the exit status of the `cd` command is used to determine if the directory specified
by the caller of the script exists and if the user has execute permission. If so, then the script attempts
to list the contents of the directory, otherwise an error message is output and the script exits.

The following cell calls `cdls.sh` using the current directory as a command line argument:

In [None]:
./scripts/conditional_statements/cdls.sh .

The following cell calls `cdls.sh` with a nonexistent directory name:

In [None]:
./scripts/conditional_statements/cdls.sh /bad_dirname

Notice that the error output produced by the `cd` command appears in the script output. It is common when
using the exit status of a command as a condition to not want the output of the command to appear. In this
case, we can redirect the standard error output of `cd` to `/dev/null`.

<div class="alert alert-block alert-info">
    <code>/dev/null</code> on UNIX systems is the <i>null device</i> represented by a file that discards
    everything written to it and reports that the write operation was successful.
</div>

---
```sh
#!/bin/bash

# cdls2.sh

if cd "$1" 2> /dev/null      # discard standard error output of cd command
then
    # cd successful
    ls
else
    # cd unsuccessful
    echo "error changing to directory $1" >&2
    exit 1
fi
```
---

Calling `cdls2.sh` with a nonexistent directory name does not produce the error message from `cd`:

In [None]:
./scripts/conditional_statements/cdls2.sh /bad_dirname
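
The same technique applies to standard output. The sketch below assumes a hypothetical helper named `have_command`; it relies on `command -v`, which prints the location of a command when the command exists. Both output streams are discarded so that only the exit status is used:

```sh
# have_command NAME: hypothetical helper used only for illustration.
have_command() {
    # > /dev/null discards standard output; 2>&1 sends standard error to the same place
    if command -v "$1" > /dev/null 2>&1; then
        echo "$1 is available"
    else
        echo "$1 is not available"
    fi
}

have_command ls
have_command no_such_command_xyz
```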

## The `[[ ]]` construct

To use a logical expression as a condition, we require a command that can evaluate the expression and return
the appropriate exit status (`0` for true and non-zero for false). The standard UNIX command that performs
this task is called `test`. The shorthand for the `test` command is to embed the logical expression inside
single square brackets `[ ]`.
If you need to write a script that should work on a variety of UNIX systems, then you should learn how
to use `test` or `[ ]` for evaluating logical expressions.
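
As a brief sketch of the two portable forms, the following example evaluates the same integer comparison with `test` and with `[ ]`. The variable name `n` is only an illustration:

```sh
n=5

# The test command and its [ ] shorthand are interchangeable.
# The expansion is double quoted because word splitting still applies here.
if test "$n" -gt 3; then
    echo "test: $n is greater than 3"
fi

if [ "$n" -gt 3 ]; then
    echo "[ ]: $n is greater than 3"
fi
```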

Bash and a few other shells (including zsh used as the default shell in macOS) provide a double square bracket
construct `[[ ]]` to evaluate logical expressions. Compared to `test`, `[[ ]]` has fewer
surprises and is considered easier to use. The differences between `test` and `[[ ]]` are well documented
[here](http://mywiki.wooledge.org/BashFAQ/031).

If *expr* is a logical expression then ``[[ `` *expr* `` ]]`` returns an exit status of `0` if *expr* evaluates to
true and `1` if *expr* evaluates to false. Notice the space after the second `[` and before the first `]`.

<div class="alert alert-block alert-warning">
    The space before and after the logical expression is required so that Bash can correctly parse
    the construct. Failure to include either space results in unusual errors.
</div>



### No word splitting or filename expansions occur inside `[[ ]]`

Word splitting and filename expansions are suppressed inside of `[[ ]]`. This means that variable
expansions usually do not need to be double quoted inside of `[[ ]]`.
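
The following sketch illustrates the difference using a value that contains a space. The variable name `name` and its value are assumptions chosen for illustration:

```sh
name="my file.txt"      # the value contains a space

# Inside [[ ]] the unquoted expansion stays a single word.
if [[ $name == "my file.txt" ]]; then
    echo "[[ ]]: equal"
fi

# Inside [ ] the expansion must be double quoted to stay a single word.
if [ "$name" = "my file.txt" ]; then
    echo "[ ]: equal"
fi
```

Leaving out the double quotes in the `[ ]` version would split the value into two words, and the test would no longer compare the strings as intended.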


### Types of expressions

Broadly speaking, there are five different kinds of expressions that `[[ ]]` can use:

1. miscellaneous expressions (those that do not fit in any of the remaining categories)
2. file expressions
3. string expressions
4. integer expressions
5. string regex expressions

A complete list of expressions can be found at https://www.gnu.org/software/bash/manual/html_node/Bash-Conditional-Expressions.html. The table below lists some of the more commonly used expressions:

| Expression                | Evaluates to true if ...                      |
|:--------------------------|:----------------------------------------------|
| **miscellaneous expressions** | |
| `-v` *varname*            | the variable *varname* has been assigned a value           |
| **file expressions** | |
| `-d` *file*               | *file* is a directory                         |
| `-e` *file*               | *file* exists (can be any kind of file)       |
| `-f` *file*               | *file* exists and is a regular file           |
| `-r` *file*               | *file* exists and is readable                 |
| `-w` *file*               | *file* exists and is writable                 |
| `-x` *file*               | *file* exists and is executable               |
| `-s` *file*               | *file* exists and has size greater than zero  |
| *file1* `-nt` *file2*     | *file1* is newer by modification time than *file2*, or *file1* exists and *file2* does not |
| *file1* `-ot` *file2*     | *file1* is older by modification time than *file2*, or *file2* exists and *file1* does not |
| **string expressions** | |
| *string*                  | the length of *string* is not zero            |
| `-n` *string*             | the length of *string* is greater than zero   |
| `-z` *string*             | the length of *string* is equal to zero       |
| *string1* `=` *string2*   | *string1* is equal to *string2*               |
| *string1* `==` *string2*  | *string1* is equal to *string2*               |
| *string1* `=~` *string2*  | *string1* matches the regular expression *string2* |
| *string1* `!=` *string2*  | *string1* is not equal to *string2*           |
| *string1* `>` *string2*   | *string1* comes after *string2* lexicographically     |
| *string1* `<` *string2*   | *string1* comes before *string2* lexicographically    |
| **integer expressions** | |
| *int1* `-eq` *int2*       | *int1* is equal to *int2*                             |
| *int1* `-ne` *int2*       | *int1* is not equal to *int2*                         |
| *int1* `-le` *int2*       | *int1* is less than or equal to *int2*                |
| *int1* `-lt` *int2*       | *int1* is less than *int2*                            |
| *int1* `-ge` *int2*       | *int1* is greater than or equal to *int2*             |
| *int1* `-gt` *int2*       | *int1* is greater than *int2*                         |

### Miscellaneous expression example

When using `-v` *varname* it is important to remember that *varname* is the name of the variable; *do not* use
`$` to get the value of the variable:

In [None]:
# unset unsets or removes a variable
# We use it here in case you have some other cell in this notebook that has created a variable
# named x. You would not normally have to do this in a script.
unset x

if [[ -v x ]]; then
    echo "x assigned"
else 
    echo "x not assigned"
fi

In [None]:
x=anything                       # any value works, even the empty string
if [[ -v x ]]; then
    echo "x assigned"
else 
    echo "x not assigned"
fi
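
To make the distinction between "assigned" and "non-empty" concrete, the sketch below assigns the empty string to a variable and then applies both `-v` and `-n`. The variable name `y` is only an illustration, and note that `-v` requires Bash 4.2 or later:

```sh
unset y
y=""                     # assigned, but empty

if [[ -v y ]]; then      # the name y, without $
    echo "y assigned"
else
    echo "y not assigned"
fi

if [[ -n $y ]]; then     # the value $y
    echo "y not empty"
else
    echo "y empty"
fi
```

Here `-v y` is true because `y` has been assigned, while `-n $y` is false because the assigned value has zero length.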

### File expression examples

Files are fundamental in UNIX, so it is not surprising that there are expressions for testing the attributes
of files. The following script uses several file expression tests to query the status of a specified file:

---

```sh
#!/bin/bash
#
# test_file.sh: Evaluate the status of a file

file=$1
if [[ -e $file ]]; then                # no need to double quote $file
                                       # because word splitting is suppressed
  if [[ -f $file ]]; then
    echo "$file is a regular file"
  fi
  if [[ -d $file ]]; then
    echo "$file is a directory"
  fi
  if [[ -r $file ]]; then
    echo "$file is readable"
  fi
  if [[ -w $file ]]; then
    echo "$file is writable"
  fi
  if [[ -x $file ]]; then
    echo "$file is executable"
  fi
else
  echo "$file does not exist"
  exit 1
fi
```

---

In [None]:
# test a directory
./scripts/conditional_statements/test_file.sh scripts

In [None]:
# test a regular file
./scripts/conditional_statements/test_file.sh conditional_statements.ipynb

In [None]:
# test a nonexistent file
./scripts/conditional_statements/test_file.sh no_such_file
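
The `-s` expression from the table can be explored in the same way. The sketch below assumes a hypothetical helper named `describe_size` and uses `mktemp` to create an empty temporary file:

```sh
# describe_size FILE: hypothetical helper used only for illustration.
describe_size() {
    if [[ -s $1 ]]; then
        echo "$1 has content"
    else
        echo "$1 is empty or missing"
    fi
}

tmp=$(mktemp)                # mktemp creates an empty temporary file
describe_size "$tmp"         # empty file: -s is false

echo "some text" > "$tmp"
describe_size "$tmp"         # file now has size greater than zero: -s is true

rm -f "$tmp"                 # clean up
describe_size "$tmp"         # file no longer exists: -s is false
```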

Using a file expression, we can test if a file exists before overwriting it in our `namedsh.sh` script:

---

```sh
#!/bin/bash

# namedsh.sh

if [[ -e $1 ]]; then
    # oops, specified file already exists
    echo "namedsh.sh: $1 already exists" >&2
    exit 1
fi

# specified file does not exist, go ahead and create it
# (double quotes are needed here because word splitting applies outside of [[ ]])
echo '#!/bin/bash' > "$1"
chmod u+x "$1"
```

---

### String expressions

String expressions allow the programmer to test for conditions involving strings. Regular expression 
matching using `=~` is covered in the Regular Expression notebook.

Testing for a non-empty string can be performed with just the value of the string variable:

In [None]:
s=""                    # the empty string; change this to see the opposite result
if [[ $s ]]; then       # $s because we want the value (string) stored in s
    echo "not empty"
else
    echo "empty"
fi

Use `-z` if you want to test for an empty string:

In [None]:
s=""                    # the empty string; change this to see the opposite result
if [[ -z $s ]]; then
    echo "empty"
else
    echo "not empty"
fi

Lexicographical comparisons of strings (not integers!) are performed using `=` or `==`, `<` and `>`:

---

```sh
#!/bin/bash

# compstr.sh : Java-like compareTo comparison of strings

s1=$1
s2=$2
if [[ $s1 == "$s2" ]]; then       # quote $s2 so that it is not treated as a pattern
    echo 0
elif [[ $s1 < $s2 ]]; then
    echo -1
else
    echo 1
fi
```

---

In [None]:
./scripts/conditional_statements/compstr.sh "cat" "cat"            # equal strings

In [None]:
./scripts/conditional_statements/compstr.sh "antelope" "zebra"     # antelope comes before zebra

In [None]:
./scripts/conditional_statements/compstr.sh "tiger" "poodle"       # tiger comes after poodle

Beware that inside `[[ ]]` the operators `==`, `<`, and `>` are string comparison operators. Integer values
are not compared numerically in such cases (they are compared lexicographically as strings):

In [None]:
./scripts/conditional_statements/compstr.sh 100 2                 # "100" comes before "2" lexicographically
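
The contrast can be seen directly by applying a string operator and an integer operator to the same two values. This is a small sketch; the variable names `a` and `b` are only illustrations:

```sh
a=100
b=2

# < compares the values as strings: "1" sorts before "2"
if [[ $a < $b ]]; then
    echo "string comparison: $a comes before $b"
fi

# -lt compares the values as integers
if [[ $a -lt $b ]]; then
    echo "integer comparison: $a is less than $b"
else
    echo "integer comparison: $a is not less than $b"
fi
```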

#### Pattern matching in string expressions

`==` and `!=` are the equals and not-equals operators. When comparing two strings:

```sh
if [[ $left == $right ]]; then
```

or

```sh
if [[ $left != $right ]]; then
```

the string `$right` is considered to be a pattern as long as `$right` is not quoted. This means that the characters `*`, `?`, `[`, and `]` have special meaning if they
are part of an unquoted string appearing on the right-hand side of `==` or `!=`.
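
The effect of quoting the right-hand side can be sketched as follows. The variable names `left` and `right` follow the snippets above, and the values are assumptions chosen for illustration:

```sh
left="a*"
right="a*"

# Unquoted: $right is a pattern, so it also matches "avocado".
if [[ avocado == $right ]]; then
    echo "unquoted: avocado matches"
fi

# Quoted: "$right" is a literal string, so "avocado" does not match.
if [[ avocado == "$right" ]]; then
    echo "quoted: avocado matches"
else
    echo "quoted: avocado does not match"
fi

# Quoted: only the literal two-character string a* matches.
if [[ $left == "$right" ]]; then
    echo "quoted: a* matches"
fi
```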

Recall that in pattern matching the `*` matches any sequence of characters (including the empty string). We can test
if a string `str` starts with the letter `a` by matching against the pattern `a*`:

In [None]:
str="avocado"
if [[ $str == a* ]]; then
    echo "starts with a"
else
    echo "does not start with a"
fi

Recall that in pattern matching the `?` matches any single character. The pattern `a?a` matches
any three-character string starting and ending with `a`:

In [None]:
str="ada"
if [[ $str == a?a ]]; then
    echo "matches"
else
    echo "does not match"
fi



Recall that in pattern matching `[ ]` allows the programmer to specify a set of characters to match and that
character classes can be used. We can test if a string `str` matches the format of a Canadian postal code like so:

In [None]:
str="K7L3N6"
if [[ $str == [A-Z][0-9][A-Z][0-9][A-Z][0-9] ]]; then
    echo "might be a postal code"
else
    echo "not a postal code"
fi
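
The same test can be written with POSIX character classes such as `[[:upper:]]` and `[[:digit:]]`. This is a sketch that assumes a hypothetical helper named `is_postal_format`; one reason to consider it is that, in some locales and Bash versions, a range such as `[A-Z]` may also match lowercase letters:

```sh
# is_postal_format STRING: hypothetical helper used only for illustration.
# The exit status of the function is the exit status of the [[ ]] test.
is_postal_format() {
    [[ $1 == [[:upper:]][[:digit:]][[:upper:]][[:digit:]][[:upper:]][[:digit:]] ]]
}

if is_postal_format "K7L3N6"; then
    echo "might be a postal code"
fi

if ! is_postal_format "k7l3n6"; then
    echo "lowercase letters do not match [[:upper:]]"
fi
```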

### Integer expressions

Integer values can be compared using an integer expression or using an arithmetic expansion
(see the *Arithmetic* notebook).

A script to perform a Java-like compareTo comparison of integer values might be written like so
(other implementations are possible):

---

```sh
#!/bin/bash

# compint.sh : Java-like compareTo comparison of integers

i1=$1
i2=$2
if [[ $i1 -ne $i2 ]]; then           # not equal
    if [[ $i1 -lt $i2 ]]; then           # less than
        echo -1
    else
        echo 1
    fi
else 
    echo 0
fi
```

---

In [None]:
./scripts/conditional_statements/compint.sh 5 5            # 5 is equal to 5

In [None]:
./scripts/conditional_statements/compint.sh -3 12          # -3 is less than 12

In [None]:
./scripts/conditional_statements/compint.sh 22 21          # 22 is greater than 21

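In most cases, using a non-integer value in an integer expression does not cause an error. Bash evaluates each
operand as an arithmetic expression, so a word such as `hello` is taken to be a variable name, and an unset
variable evaluates to zero: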

In [None]:
./scripts/conditional_statements/compint.sh hello goodbye         # 0 is equal to 0

In [None]:
./scripts/conditional_statements/compint.sh abc 3                 # 0 is less than 3

In [None]:
./scripts/conditional_statements/compint.sh 12 xyz                # 12 is greater than 0
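
A word used as an operand of an integer operator inside `[[ ]]` is evaluated as if it were a variable name. The following sketch makes this visible by testing the same word before and after it is assigned as a variable (the name `hello` is only an illustration):

```sh
unset hello
if [[ hello -eq 0 ]]; then       # hello is unset, so it evaluates to 0
    echo "unset: hello evaluates to 0"
fi

hello=7
if [[ hello -eq 7 ]]; then       # hello now names a variable holding 7
    echo "set: hello evaluates to 7"
fi
```

This behaviour means that a mistyped argument can silently compare as some unrelated variable's value rather than as zero.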

Not all seemingly non-integer strings evaluate to zero because Bash supports expressing integer values
in a specified base (between 2 and 64). For example, the binary (base 2) number `101`, which is equal to
`5` in base 10, can be written as `2#101`:

In [None]:
./scripts/conditional_statements/compint.sh 2#101 1               # 5 is greater than 1

Because of the rules that Bash uses to parse an integer value in an integer expression, it is possible to
produce an error:

In [None]:
./scripts/conditional_statements/compint.sh 2abc 1               # error: 2abc cannot be parsed as an integer
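
One way to avoid such surprises is to validate a value before using it in an integer expression. The following is a minimal sketch, assuming a hypothetical helper named `is_int` that accepts only optionally signed decimal integers; it uses the `=~` regular expression operator listed in the table above:

```sh
# is_int VALUE: hypothetical helper used only for illustration.
# Succeeds if VALUE consists of an optional minus sign followed by decimal digits.
is_int() {
    [[ $1 =~ ^-?[0-9]+$ ]]
}

value="2abc"
if is_int "$value"; then
    echo "$value is an integer"
else
    echo "$value is not an integer"
fi
```

A script such as `compint.sh` could call a check like this on each argument and exit with an error message before performing the comparison.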

### Testing for the correct number of command line arguments

As we have already seen, Bash does not indicate that an error has occurred if an unassigned parameter is used.
A script that requires a specific number of command line arguments should explicitly test that the correct number
of arguments has been provided.

Using an integer expression, we can test whether the correct number of command line arguments has been
provided by the caller in our `namedsh.sh` script:

---

```sh
#!/bin/bash

# namedsh.sh

if [[ $# -ne 1 ]]; then
    # oops, wrong number of arguments (exactly one filename is required)
    echo "namedsh.sh: output filename required" >&2
    exit 2
fi

if [[ -e $1 ]]; then
    # oops, specified file already exists
    echo "namedsh.sh: $1 already exists" >&2
    exit 1
fi

# specified file does not exist, go ahead and create it
# (double quotes are needed here because word splitting applies outside of [[ ]])
echo '#!/bin/bash' > "$1"
chmod u+x "$1"
```

---
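
The same idea extends to scripts that accept a range of argument counts. The sketch below assumes a hypothetical function named `check_arg_count` that accepts one or two arguments; inside a function, `$#` is the number of arguments passed to the function, which makes the logic easy to try in a notebook cell:

```sh
# check_arg_count ARGS...: hypothetical helper used only for illustration.
check_arg_count() {
    if [[ $# -lt 1 ]]; then
        echo "too few arguments"
    elif [[ $# -gt 2 ]]; then
        echo "too many arguments"
    else
        echo "ok"
    fi
}

check_arg_count
check_arg_count a
check_arg_count a b c
```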