## Shell Scripting

So far we've seen the **BASH** shell used as a command interpreter and introduced the concept of variables. BASH is more than just an interpreter, however: it has many of the features of a programming language. It can be used to automate tasks in a reproducible manner by writing a _shell script_, essentially a collection of shell commands stored in a text file with the _executable_ bit set. Shell scripts generally have the `.sh` filename extension.

### The Structure of a Script

A typical shell script has the following format:

```bash
#!/bin/bash
#
# Description of my shell script

echo "Hello from a shell script"
```

This very simple example has three sections:

 * The invocation line, `#!/bin/bash`. This line tells Linux which command interpreter we want to use to execute the following commands. There are other shell interpreters, but BASH is by far the most common.
 * Comments describing what the script does. Comment lines in BASH are prefixed by a `#` symbol.
 * The main body of the script that performs the intended functions. In this case it simply prints a message to the standard output.

In the examples that follow, whenever you want to write a shell script simply use your favourite text editor to create a file. It doesn't matter what the file is called but if the script is part of an exercise it will be convenient to name it `exercise-x.sh`, where `x` is the exercise number. Remember to ensure that the script is marked as executable before you run it. If you haven't already created a directory to contain your work then do so now.
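
As a minimal sketch of that workflow, the commands below create a small script, mark it as executable with `chmod +x`, and run it. The directory name `script-demo` and the file name `hello.sh` are assumptions made purely for illustration, and `printf` stands in for the text editor you would normally use.

```bash
# Create a scratch directory to hold the example script
mkdir -p script-demo

# Write a two-line script to a file (a stand-in for using a text editor)
printf '#!/bin/bash\necho "Hello from a shell script"\n' > script-demo/hello.sh

# Set the executable bit, then run the script by giving its path
chmod +x script-demo/hello.sh
./script-demo/hello.sh
```

Without the `chmod +x` step the final command would normally fail with a "Permission denied" error.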

#### Exercise 11

Create a shell script called `exercise-11.sh`, set it to be executable, and copy the above code snippet into it. Can you run it?

### Conditional Operations

One of the most powerful features of a genuine programming language is the ability to change its function depending on conditions. BASH has this feature in the form of the `if` statement:

```bash
# Set our target date equal to the 19th June 2018
mydate=180619

# Perform a scheduled function on the 19th June 2018
if [ "$(date +%y%m%d)" = "${mydate}" ]
then
  echo "Do something on the 19th"
else
  echo "It's not the 19th, so don't do anything"
fi
```

In this example we:

 * Set a date when we want to perform a task.
 * Test whether the current date is equal to the specified date.
 * If it is, we perform the task. If not, we perform a different task.

The code in `[ square brackets ]` is a _logical comparison_ intended to give either of two answers, **true** or **false**, so in the simplest possible terms our code looks like this:

```bash
if [ TRUE ]
then
  task A
else
  task B
fi
```

If we don't want anything to happen if the comparison is false then we can simplify the code still further:

```bash
if [ TRUE ]
then
  task A
fi
```

Our comparison demonstrates two methods of determining a value. The first, dereferencing the variable `${mydate}`, we're already familiar with. The second, known as _command substitution_, places a command within `(parentheses)` prefixed by a `$`. The shell runs that command and captures its output, which can either be assigned to a variable or, in this case, used directly. We can see what output the `date` command is generating by running it directly:

```bash
date +%y%m%d
```
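
The other option mentioned above, assigning the captured output to a variable, might look like the following sketch. The variable name `today` is an assumption chosen for illustration.

```bash
# Run the date command and store its output in a variable
today=$(date +%y%m%d)
echo "Today is ${today}"

# The captured value can then be used like any other variable
mydate=180619
if [ "${today}" = "${mydate}" ]
then
  echo "Today is the target date"
else
  echo "Today is not the target date"
fi
```

Storing the result is useful when the same value is needed more than once, because the command only has to run a single time.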

#### Types of Comparisons

A curious feature of shell variables is that they don't have a defined type, such as _integer_, _string_, or _floating point number_. Rather, the shell interprets their type depending on the context. In this case we are using the `=` operator to compare the result of the `date` command and `${mydate}` as strings. We can also use the `-eq` operator to compare the two values as numbers. The table below lists the common comparison operators, both for treating values as strings and as integers. Inside `[ square brackets ]` the string operators `<` and `>` must be written as `\<` and `\>`, because otherwise the shell treats them as file redirection.

| Comparison | String | Integer |
|------------|--------|---------|
| Equal to | `=` | `-eq` |
| Not equal to | `!=` | `-ne` |
| Greater than | `>` | `-gt` |
| Greater than or equal to | | `-ge` |
| Less than | `<` | `-lt` |
| Less than or equal to | | `-le` |
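
To illustrate how the operator, rather than the variable, decides the type, here is a small sketch that compares the same two values first as strings and then as integers. The variable names and the values `007` and `7` are assumptions chosen for illustration.

```bash
a="007"
b="7"

# Compared as strings the values differ, because the characters differ
if [ "${a}" = "${b}" ]
then
  string_result="equal"
else
  string_result="not equal"
fi

# Compared as integers they represent the same number
if [ "${a}" -eq "${b}" ]
then
  integer_result="equal"
else
  integer_result="not equal"
fi

echo "As strings:  ${string_result}"
echo "As integers: ${integer_result}"
```

The first comparison reports "not equal" and the second reports "equal", even though neither variable was ever declared as a string or a number.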

#### Exercise 12

A test for whether a number is greater than another is easy to conceptualise, but how would you expect the equivalent string comparison to operate? Either in a script or in a Jupyter notebook test how the comparison operators deal with numbers and strings.

### Loops

Loops are a way of repeating a series of operations, potentially using different data. A simple example of a `for` loop is shown below: 

```bash
for number in {1..10}
do
  echo "I am printing number ${number}"
done
```

#### The FOR loop

The structure of a `for` loop looks like this:

```bash
for VARIABLE in LIST
do
  TASK 1
  TASK 2
  ..
  TASK N
done
```

The LIST is simply a set of values separated by spaces. The `for` loop selects each value in turn, stores it in the variable VARIABLE, and then executes a series of commands. When it reaches the end it returns to the top, selects the next value from LIST, executes the commands again, and keeps doing this until the values in LIST run out. In the above example the structure `{1..10}` expands to `1 2 3 4 5 6 7 8 9 10`, so we could re-write it as:

```bash
for number in 1 2 3 4 5 6 7 8 9 10
do
  echo "I am printing number ${number}"
done
```

...or indeed:

```bash
for number in 1 25 "hello" "This is the BASH shell" 40000
do
  echo "I am printing number ${number}"
done
```

Notice that collections of space-separated elements enclosed by `"quotation marks"` are treated as a single element, and we can mix integers and strings without consequence.

#### Exercise 13

The `for` loop operates on any list of space-separated values. Can you think of any standard Linux tools that return such a list? Try using such a tool in place of the LIST and perform an operation on each of the VARIABLEs that is returned.

#### The WHILE loop

A `while` loop also repeats a series of commands, but unlike the `for` loop, which works through a fixed list of values, it keeps repeating for as long as a condition remains true:

```bash
count=0
while [ "${count}" -ne 10 ]
do
  echo "Count is ${count}"
  count=$((count+1))
done
```

The structure looks like this:

```bash
while CONDITION
do
  TASK 1
  TASK 2
  ..
  TASK N
done
```

The CONDITION is specified in the same format as that provided to the `if` statement, and the loop will keep repeating as long as that condition is true. In our example we increment the value of the variable `count` every time we go through the loop so that eventually the CONDITION becomes false.

This example introduces **integer arithmetic** using the shell. If we want to perform a calculation then it must be enclosed within `((double parentheses))` and prefixed by a `$` sign. For example:

```bash
x=2
y=7
z=$((x*y))
echo "The result is ${z}"
```
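
Because the arithmetic is integer-only, division discards any fractional part. The sketch below shows this using the division (`/`) and remainder (`%`) operators; the variable names are assumptions chosen for illustration.

```bash
x=7
y=2

# Integer division drops the fractional part: 7 / 2 gives 3, not 3.5
quotient=$((x/y))

# The % operator gives the remainder left over after the division
remainder=$((x%y))

echo "${x} divided by ${y} is ${quotient} remainder ${remainder}"
```

This prints `7 divided by 2 is 3 remainder 1`.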

This freedom to operate over a large range of values and to define how `${count}` is incremented on each iteration (known as the _stride_) provides additional flexibility over the `for` loop, but it requires more manual control and leaves open the possibility of loops that never complete.

#### Exercise 14

The above `while` example loop will increment `${count}` by 1 each time through the loop and terminate when `${count}` equals 10. What would happen if you incremented `${count}` by 3 each time? Be very careful if you want to try this! What conditional operators could solve this potential problem? 

#### The UNTIL loop

The `until` loop looks and operates very similarly to the `while` loop:

```bash
count=0
until [ "${count}" -gt 10 ]
do
  echo "Count is ${count}"
  count=$((count+1))
done
```

The main difference is that it operates `until` a condition is met instead of `while` a condition is met.
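
One way to see the relationship is to write the same loop twice with opposite conditions. In this sketch the variable names `while_output` and `until_output` are assumptions used to collect what each loop produces.

```bash
# Count from 0 to 2 with a while loop: repeat while count is less than 3
while_output=""
count=0
while [ "${count}" -lt 3 ]
do
  while_output="${while_output}${count} "
  count=$((count+1))
done

# The same loop with until: repeat until count is greater than or equal to 3
until_output=""
count=0
until [ "${count}" -ge 3 ]
do
  until_output="${until_output}${count} "
  count=$((count+1))
done

echo "while produced: ${while_output}"
echo "until produced: ${until_output}"
```

Both loops produce `0 1 2`, because `-ge` is the logical opposite of `-lt`.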

### Command Line Arguments

If we want to write a script that performs an operation on a set of files, we have two options:

 * Write those filenames directly into the script.
 * Allow the filenames to be passed to the script at the command line.

The former might be suitable for a one-off task, but the latter allows for code re-usability. We don't have to keep editing the script every time we want to change the list of files. For example:

```bash
#!/bin/bash
#
# Print the first two arguments
echo "The first two arguments are $1 and $2"

# Loop over the list of arguments and perform an operation with each
for arg in "$@"
do
  echo "Argument: ${arg}"
done
```

The command line arguments are passed to the script as unusual-looking variables, `$1`, `$2`, and so on, where the number is the position of the argument on the command line. Because these names are not descriptive it is common to assign their values to variables with more memorable names. There are other variables passed along with the command line arguments that may prove useful:

| Argument | Description |
|----------|-------------|
| `$0` | The name of the script that is being run |
| `$1`..`$n` | n numbered variables corresponding to the command line arguments |
| `$@` | The full list of arguments |
| `$#` | The total number of arguments |
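
The following sketch shows one possible way to combine these variables: it checks the number of arguments with `$#` and then copies `$1` and `$2` into descriptively named variables. The names `source_file` and `destination` are hypothetical, and the `set --` line is included only so that the example can be run without typing any arguments.

```bash
#!/bin/bash
#
# Give the command line arguments more memorable names

# For demonstration only: pretend the script was run with two arguments.
# Remove this line in a real script so that the real arguments are used.
set -- notes.txt backup

if [ $# -ne 2 ]
then
  echo "Usage: $0 SOURCE DESTINATION"
else
  source_file=$1
  destination=$2
  echo "Would copy ${source_file} to ${destination}"
fi
```

Checking `$#` first means the script can print a helpful message instead of carrying on with empty variables.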

#### Exercise 15

Copy the above command line argument example into a script and run it as follows:

```bash
./exercise-15.sh first second third
```

Create three files with text in them and print them out, using `sed` to replace any instances of the word _date_ with the actual date.