# Functions

<div class="alert alert-block alert-info">
    You can find all of the scripts in this notebook in the subdirectory containing this notebook:
    <code>./scripts/functions</code>
</div>

Bash allows the programmer to write functions, in which code is grouped into a single procedure that can be
called by name. The ways in which arguments are passed to a function, values are returned from a function,
and variables are scoped are unusual compared to languages such as Python and Java.

### Declaring a function

A Bash function named *fname* can be written as:

```sh
function fname {
    commands
}
```

or 

```sh
fname() {
    commands
}
```

The first version is legal but generally discouraged because the `function` keyword is not part of the POSIX
shell standard and so is not portable across different shells. The course notebooks use only the second
version shown above.

The space or newline after the opening brace `{` and the newline or semicolon before the closing brace `}` are
important. The braces are reserved words, so the shell recognizes them only where a command name could appear.
If the whitespace after `{` is missing then a syntax error occurs because the shell reads the brace and the text
that follows it as a single word. For example, running the following cell results
in a syntax error (see https://www.shellcheck.net/wiki/SC1054):

In [None]:
fname() {echo "inside function"
}

Similarly, a missing newline or semicolon before the closing brace `}` also causes an error. The following script
fails with a syntax error because the shell treats the closing brace as an argument of `echo` and never finds
the end of the function body (see https://www.shellcheck.net/wiki/SC1056):

```sh
#!/bin/bash

# missing_newline.sh

fname() { 
    echo "inside function" }
```

In [None]:
./scripts/functions/missing_newline.sh
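
When a function is written on a single line, a semicolon can take the place of the newline before the closing brace. The following is a minimal sketch that reuses the placeholder name `fname` from above:

```sh
# the space after '{' and the semicolon before '}' make this a valid
# one-line function definition
fname() { echo "inside function"; }

# call the function
fname
```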

The following script illustrates a simple function having no arguments and (seemingly) returning no value:

---
```sh
#!/bin/bash

# hello.sh

# the function definition
greet() {
    echo "Hello"
}

# calling the function
greet

```
---



In [None]:
./scripts/functions/hello.sh

Calling a function looks exactly like invoking a command. Just like commands, it is possible to pass arguments
to a function. Unlike languages such as Python, the parameters of a Bash function do not have
programmer-defined names. Instead, the positional parameters (`$1`, `$2`, and so on) are temporarily replaced
with the function arguments while the function is running.

The following script illustrates a function having three parameters. The script calls the function with
0, 1, 2, 3, and 4 arguments and then prints the command line arguments passed to the script:

---
```sh
#!/bin/bash

# func_params.sh

test() {
    param1=$1
    param2=$2
    param3=$3
    echo "$# arguments passed to function"
    echo "param1 = ${param1}"
    echo "param2 = ${param2}"
    echo "param3 = ${param3}"
}

# call function with varying number of arguments
test
echo "---"
test a
echo "---"
test a b
echo "---"
test a b c
echo "---"
test a b c d
echo "---"

# check if arguments passed to the script are still intact
n=$#
echo "${n} arguments passed to script"
for (( i = 1; i <= n ; i++ )); do
    echo "\$${i} = $1"
    shift
done
echo "---"
```
---

In [None]:
./scripts/functions/func_params.sh arg1 arg2 arg3 arg4 arg5

Notice that the function behaves much like a script. Calling the function with the wrong number of
arguments is not automatically an error, just like running a script with the wrong number of
arguments is not automatically an error.

Also notice that after the function returns, `$@`, `$#`, and the positional parameters of the script are unchanged.
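
Because a missing argument simply expands to an empty string, a function can supply its own default with the `${1:-default}` parameter expansion. The following is a minimal sketch; the function name `greet_name` and the default value `World` are illustrative assumptions and are not part of the course scripts:

```sh
# greet_name [NAME]
# $1 is unset when no argument is passed, so fall back to "World"
greet_name() {
    echo "Hello, ${1:-World}"
}

greet_name          # prints: Hello, World
greet_name Bash     # prints: Hello, Bash
```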

### Returning a value from a function

The similarity between functions and commands extends to the return value of a function. Just like a script
sets its exit status value at completion, a function sets its exit status on completion. The exit status
of a function is equal to the exit status of the last command executed by the function. The `return` builtin
command will cause a function to stop executing and set the exit status to the specified value.

<div class="alert alert-block alert-info">
    <code>return</code> <i>n</i> always sets the exit status to the integer value formed
    from the least significant 8 bits of <i>n</i>. This means that the exit status is always
    a value between 0 and 255.
</div>
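
The truncation to 8 bits can be observed directly. The following is a minimal sketch using a hypothetical helper `status_of` that simply returns its first argument:

```sh
# returns its first argument as the exit status
status_of() {
    return "$1"
}

status_of 256
echo "return 256 -> $?"    # 256 is 0x100, so the low 8 bits are 0

status_of 300
echo "return 300 -> $?"    # 300 = 256 + 44, so the low 8 bits are 44
```

A caller therefore cannot distinguish `return 256` from `return 0`, which is one reason the exit status is best reserved for signalling success or failure.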

The following cell contains a function that randomly sets its exit status to `0` or `1`:

In [None]:
somefunc() {
    if (( RANDOM % 2 == 0 )); then
        return 0
    else
        return 1
    fi
}
somefunc
echo $?
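
Because a function sets an exit status just like a command, it can be used anywhere a command can, for example as the condition of an `if` statement or with `||`. The following is a minimal sketch using a hypothetical function `is_even`; it has no explicit `return`, so its exit status is that of its last command, the arithmetic test:

```sh
# exits with 0 if the argument is even, 1 otherwise
is_even() {
    (( $1 % 2 == 0 ))
}

if is_even 4; then
    echo "4 is even"
fi

is_even 7 || echo "7 is odd"
```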

If you want to simulate returning a value other than the exit status then have the function print the return
value to standard output and use command substitution to capture the output. The following script
prints the maximum value in a list of integers supplied on the command line:

---
```sh
#!/bin/bash

# max.sh

# prints the maximum of two integer arguments
# prints the first integer if both values are equal
max2() {
    if (( $1 >= $2 )); then
        echo $1
    else
        echo $2
    fi
}

# exits with 0 if argument is an integer, 1 otherwise
is_int() {
    if [[ $1 =~ ^[-+]?[0-9]+$ ]]; then 
        return 0
    else
        return 1
    fi
}


if (( $# == 0 )); then 
    exit 1
fi
if is_int "$1"; then 
    hi=$1
else 
    exit 2
fi
while (( $# > 0 )); do
    if is_int "$1"; then 
        hi=$(max2 $hi $1)
    else
        exit 2
    fi
    shift
done
echo $hi
```
---

In [None]:
./scripts/functions/max.sh 1 5 7 4 

The function `max2()` prints the greater of its two integer arguments to standard output. The script compares
each integer supplied on the command line to the current greatest value `hi`, and uses command substitution to
store the output of `max2()` back into `hi`.

<div class="alert alert-block alert-info">
    The disadvantage of using command substitution to capture the return value is that the command runs in a
    separate subshell. Creating the subshell adds overhead that can dominate the runtime of simple commands.
</div>
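
A second consequence of the subshell is that variable assignments made by the function are lost when the command substitution finishes. The following is a minimal sketch; the names `counter`, `next_id`, and `value` are illustrative assumptions:

```sh
counter=0

# increments counter and prints its new value
next_id() {
    counter=$(( counter + 1 ))
    echo "$counter"
}

value=$(next_id)             # next_id runs in a subshell
echo "value   = $value"      # 1
echo "counter = $counter"    # still 0: the increment happened in the subshell
```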

### Dynamic scoping

Bash uses a technique called *dynamic scoping* to determine visibility of variables. Dynamic scoping is uncommon
in modern languages so most readers of this document are likely unfamiliar with the concept. The Bash documentation
(https://www.gnu.org/software/bash/manual/html_node/Shell-Functions.html#Shell-Functions) says the following: 

> With dynamic scoping, visible variables and their values are a result of the sequence of function calls that
> caused execution to reach the current function.

This means that if a script creates a variable `a` and then calls a function `f()`, then the variable `a` is
visible in the called function. If `f()` changes the value of `a` then the caller will see the changed value
of `a`. The following script illustrates dynamic scoping:

---
```sh
#!/bin/bash

# dynamic_scope.sh

f() {
    echo "  a at beginning of f()   = $a"
    a="f: hello"
    g
    echo "  a at end of f()         = $a"
    echo "  b at end of f()         = $b"
}

g() {
    echo "    a at beginning of g() = $a"
    b="g: goodbye"
    echo "    a at end of g()       = $a"
}

a="hello"
echo "a before calling f()      = $a"
f
echo "a after calling f()       = $a"
echo "b after calling f()       = $b"

```
---

In [None]:
./scripts/functions/dynamic_scope.sh 

We can trace the visibility of the variables `a` and `b`:

1. `a` is assigned the value `hello` in the main body of the script
2. the main body of the script calls the function `f()`. `a` is visible in `f()`
3. the function `f()` assigns `a` the value `f: hello`
4. `f()` calls `g()`. `a` is visible in `g()`
5. `g()` creates the variable `b`
6. after `g()` returns, the variable `b` is visible in `f()`
7. after `f()` returns, the variable `b` is visible in the main body of the script

Dynamic scoping can be used to simulate returning a value or values from a function in two different
ways. In the function:

1. assign the return values to variables that were visible to the caller
    * the caller uses those variables to access the return values
2. create new (non-local) variables and assign the return values to the new variables
    * the caller uses the new variables to access the return values
    
We can re-write the `max.sh` script so that the `max2()` function assigns the greater of its two arguments
to the variable `hi`:

---
```sh
#!/bin/bash

# remax.sh

# assigns the maximum of two integer arguments to the variable hi
# assigns the first integer if both values are equal
max2() {
    if (( $1 >= $2 )); then
        hi=$1
    else
        hi=$2
    fi
}

# exits with 0 if argument is an integer, 1 otherwise
is_int() {
    if [[ $1 =~ ^[-+]?[0-9]+$ ]]; then 
        return 0
    else
        return 1
    fi
}


if (( $# == 0 )); then 
    exit 1
fi
if is_int "$1"; then 
    hi=$1
else 
    exit 2
fi
while (( $# > 0 )); do
    if is_int "$1"; then 
        max2 $hi $1
    else
        exit 2
    fi
    shift
done
echo $hi
```
---

In [None]:
./scripts/functions/remax.sh 5 10 1
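
`remax.sh` follows the first approach because `hi` already exists when `max2()` is called. The second approach, in which the function creates new variables, is a convenient way to simulate returning more than one value. The following is a minimal sketch; the function `divide` and the variable names `quotient` and `remainder` are illustrative assumptions:

```sh
# divide DIVIDEND DIVISOR
# "returns" two values by creating the non-local variables
# quotient and remainder
divide() {
    quotient=$(( $1 / $2 ))
    remainder=$(( $1 % $2 ))
}

divide 17 5
echo "17 / 5 = ${quotient} remainder ${remainder}"
```

The caller has to know which variable names the function sets, so such names are usually documented in a comment above the function.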

### Local variables

To restrict the scope of a variable to a single function (and, unfortunately, to any
function called by that function) use the `local` builtin command. Local variables allow a function
to use variables without accidentally overwriting variables defined in the main script. Consider the
following script that contains a loop that appears as if it should run for 10 iterations:

In [None]:
some_function() {
    i=100
}

for (( i = 0; i < 10; i++ )); do
    echo $i
    some_function
done

`some_function` sets the value of `i` which unfortunately is the name of the loop variable used in the
main script. This causes the loop to iterate only once. If `some_function` declares `i` to be local, then
a new variable named `i` is created which is visible only to `some_function` and any function that 
`some_function` calls:

In [None]:
some_function() {
    local i=100
    # i in main script not affected
}

for (( i = 0; i < 10; i++ )); do
    echo $i
    some_function
done

The local variable `i` is visible to any function that `some_function` calls. For example, `some_other_function`
is able to see (and modify) the local variable `i` that `some_function` declared:

In [None]:
some_function() {
    local i=100
    some_other_function
}

some_other_function() {
    echo "some_other_function i = $i"
}

for (( i = 0; i < 10; i++ )); do
    echo $i
    some_function
done
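
If the called function assigns to `i`, it changes the caller's local variable and not the variable in the main script. The following is a minimal sketch using the hypothetical functions `outer` and `inner`:

```sh
outer() {
    local i=100
    inner
    echo "outer: i = $i"    # 200: inner modified outer's local i
}

inner() {
    i=200                   # assigns to the nearest visible i, which is outer's local
}

i=1
outer
echo "main: i = $i"         # 1: the i in the main script is untouched
```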