# Bash programming

<center>
<img src="figs/bash.png" style="width: 500px;"/>
    </center>

## History of Unix shells


  * 1979: Bourne shell (`sh`)
  * 1978: C and TC shell (`csh` and `tcsh`)
  * 1989: Bourne Again shell (`bash`)
  * Other Bourne-style shells:
      * 1983: Korn shell (`ksh`)
      * 1990: Z shell (`zsh`)
      * 2002: Dash (`dash`)

## Why learn Bash? (1)

  * Learning Bash means learning the roots of scripting 
  * Bash and other Bourne-style shells are frequently encountered on Unix systems
  * Bash is the dominant command interpreter and scripting language

## Why learn Bash? (2)

Shell scripts evolve naturally from a workflow: 
  1. A sequence of commands you use often is placed in a file
  2. Command-line options are introduced so that different values can be passed to the commands
  3. Variables, if tests and loops are introduced to enable more complex program flow
  4. At some point pre- and postprocessing becomes too advanced for Bash, at which point (parts of) the script should be ported to Python or other tools


## Installation

* All our examples can be run under Bash, and many in the Bourne shell
* Differences in operating systems:
    * **Mac OSX**: `/bin/sh` is just a link to Bash (`/bin/bash`).
    * **Ubuntu**: `/bin/sh` is a link to Dash, a minimal, but much faster shell than bash.
    * **Windows**: Bash is available through `cygwin` or the Windows Subsystem for Linux (WSL) in Windows 10.

## Some reference material
  
  * Bash reference manual: `www.gnu.org/software/bash/manual/bashref.html`
  * "Advanced Bash-Scripting Guide": `http://www.tldp.org/LDP/abs/html/`

## What Bash is *good* for

* File and directory management

* Systems management (build scripts)

* Combining other scripts and commands

* Rapid prototyping of more advanced scripts

* Very simple output processing, plotting etc.

## What Bash is *not good* for

* Cross-platform portability

* Graphics, GUIs

* Interface with libraries or legacy code

* More advanced post processing and plotting

* Calculations, math etc.

## Some common tasks in Bash

  * file writing
  * for-loops
  * running an application
  * combining applications (pipes)
  * file globbing, testing file types
  * managing files and directories (creation, deletion, renaming)
  * directory tree traversal
  * packing directory trees

# Let's get started!

### Bash example: Hello world

In [6]:
echo "Hello world"  # This is a comment

Hello world


**Note**: You will learn a lot of Bash/Unix commands in this lecture. 
I will highlight each new command with a <font color='red'>⚠️ </font> and explain what it does. 

<font color='red'>⚠️ </font> **echo** prints text to screen.

#### How to run this script? 


**Option 1**: Type the commands directly in the bash shell (only feasible for
small scripts).

**Option 2**: Save the code as `helloworld.sh`, make it executable with 
```bash
chmod a+x helloworld.sh
``` 
and run with:
```bash
./helloworld.sh
```

<font color='red'>⚠️ </font> Use **chmod** to change file permissions.

### How does the OS know that this is a bash script?

Bash uses itself as default interpreter, if not otherwise specified.

**Better**: Add a "magic" line as the first line to tell the operating system which interpreter to use:

```bash
#!/bin/bash

echo "Hello world"
```

### Why do I need to add the "./" when executing "./helloworld.sh"?
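
A short answer, sketched under the assumption that the current directory is not listed in `$PATH` (the usual default): a command name without a slash is only searched for in the `$PATH` directories, while a name containing a slash is used directly as a path. The temporary directory created with `mktemp -d` is only there to keep the demonstration away from your own files.

```bash
#!/bin/bash
# Work in a throwaway directory so that none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '#!/bin/bash\necho "Hello world"\n' > helloworld.sh
chmod a+x helloworld.sh

# A bare name is looked up in the directories listed in $PATH only.
if command -v helloworld.sh > /dev/null; then
    found_by_name=yes
else
    found_by_name=no
fi
echo "Found via PATH: $found_by_name"

# A name that contains a slash is used as a path, so ./ points at this directory.
greeting=$(./helloworld.sh)
echo "$greeting"
```
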

### Bash example: Hello world, v2 (variables)

**Syntax**:
* Assign a variable by `var=value`.
* Retrieve the value of the variable by `${var}` or `$var`

**Example**

In [5]:
#!/bin/bash

cmd=echo
name="Simon"

${cmd} "Hello world, ${name}s!"

Hello world, Simons!
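
The braces matter in the example above. As a small sketch (the variable names `with_braces` and `without_braces` are only for illustration): without braces, Bash reads `$names` as a different, unset variable.

```bash
#!/bin/bash
name="Simon"

with_braces="Hello world, ${name}s!"   # expands the variable "name", then appends "s"
without_braces="Hello world, $names!"  # expands the unset variable "names"

echo "$with_braces"
echo "$without_braces"
```
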


### Bash example: Hello world, v3 (getting output of commands)
**Syntax**: `var=$(cmd)` stores the output of the command `cmd` in the variable `var`.


**Task**: Display the day of the week.

In [6]:
#!/bin/bash

weekday=$(date +"%A")    # date +"%A" is a Unix command that prints the day of the week
echo "Today is $weekday."

Today is Monday.


## Bash is (sometimes surprisingly) strict in its syntax

What happens when running this code?

In [8]:
myvar = 5

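Bash reads `myvar = 5` as the command `myvar` with the arguments `=` and `5`, and reports that the command `myvar` was not found.

**Remember**: Do not use spaces around the "=" in variable assignments!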


Also: Non-existing variables evaluate to an empty string:

In [3]:
echo "Value of new variable is ${never_used}."

Value of new variable is .
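
Two standard Bash features can help catch this kind of silent empty value. The following is a minimal sketch: `${var:-default}` substitutes a default, and `set -u` turns the use of an unset variable into an error. The names `plain`, `with_default` and `strict_mode` are made up for the illustration.

```bash
#!/bin/bash
unset never_used

plain="[${never_used}]"                   # expands to an empty string
with_default="[${never_used:-fallback}]"  # uses "fallback" when unset or empty

echo "$plain"
echo "$with_default"

# With "set -u", expanding an unset variable is an error instead.
# The check runs in a child shell so that this script itself keeps running.
if bash -c 'set -u; echo "$never_used"' 2> /dev/null; then
    strict_mode="no error"
else
    strict_mode="error"
fi
echo "$strict_mode"
```
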


## By default variables are untyped and treated as strings

In [1]:
x=5
x=$x+5
echo $x

5+5


You can explicitly declare the variable type:

In [2]:
declare -i b     # define an integer variable b
a=5
b=$a+5
echo $b

10


In [4]:
b="Hallo"
echo $b

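0

Because `b` is declared as an integer, the string `Hallo` is evaluated as an arithmetic expression: it is taken as the name of an unset variable, so `b` becomes 0 instead of raising an error.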
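
Integer arithmetic is also possible without `declare -i`. A minimal sketch using arithmetic expansion `$(( ))`, which evaluates its content as an integer expression (the variable `y` is only for illustration):

```bash
#!/bin/bash
x=5
x=$((x + 5))      # arithmetic expansion: inside $(( )) x is read as a number
echo "$x"

y=$((x * 3 - 1))  # the usual operators and precedence apply
echo "$y"
```
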


### Other variable types
Read only variables

In [5]:
declare -r r=10            # read only

In [6]:
r=5

bash: r: readonly variable

Arrays

In [15]:
declare -a array=("foo" "bar") # array
echo ${array[0]}  # First array value
echo ${array[@]}  # All array values
echo ${#array[@]}    # Array size

foo
foo bar
2
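
Arrays are usually combined with loops. The sketch below appends an element with `+=` and loops over all elements; quoting `"${array[@]}"` keeps an element that contains a space in one piece. The element `"baz qux"` is made up for the illustration.

```bash
#!/bin/bash
declare -a array=("foo" "bar")
array+=("baz qux")             # append one element that contains a space

for item in "${array[@]}"; do  # quoted: one loop iteration per element
    echo "item: $item"
done

count=${#array[@]}
last=${array[2]}
echo "The array has $count elements, the last one is '$last'"
```
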


# Bash programming, part 2

<center>
<img src="figs/bash.png" style="width: 500px;"/>
    </center>

# Combining bash commands

Core concepts of UNIX:

* The power of Unix lies in combining simple commands into powerful operations.
* Standard bash commands and UNIX applications normally do one simple task (but very well).
* Text is used for input and output - easy to send output from one command as input to another.

### Problem: Write a script that prints today's weather forecast in Oslo

In [3]:
curl -s https://www.yr.no/place/Norway/Oslo/Oslo/Oslo/ | \
grep "temperature" | \
head -n 1 | \
cut -d"\"" -f4 | \
cowsay

 _______________________________________
/ Temperature: 18°. Feels like 18°. For \
\ the period: 19:00                     /
 ---------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


## The in-/outputs of a bash process: `STDIN`, `STDOUT` and `STDERR`

Unix processes use the following three standard streams as preconnected input and output communication channels:

<center>
<!--<img src="figs/bash_process_codes.jpg" style="width: 500px;"/>-->
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/Stdstreams-notitle.svg/646px-Stdstreams-notitle.svg.png" style="width: 500px;"/>
</center>

* user input is passed to the standard input `STDIN` stream
* normal information is passed to the standard output `STDOUT` stream
* error information is passed to the standard error `STDERR` stream.

## Combining bash commands: Pipes

<center>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/f6/Pipeline.svg/560px-Pipeline.svg.png" style="width: 500px;"/>
</center>

The bash pipe `|` connects `STDOUT` of one command to `STDIN` of another. 

### Examples

List all Jupyter Notebook files in the parent directory and its subdirectories:

In [8]:
ls ../* | grep ".ipynb"

../Peer-review information.ipynb
About the course.ipynb
Introduction to git.ipynb
Scripting vs regular programming.ipynb
Bash programming.ipynb
Untitled.ipynb
03_python_summary.ipynb
03_python_summary-short.ipynb
03_python_summary-subclasses.ipynb
04_python_summary2.ipynb
Error handling in Python.ipynb
Essentials for better Python programming.ipynb
How to structure your program.ipynb
Testing.ipynb
05_numerical_python.ipynb
cython_from_crash_course.ipynb
mixed_programming_cython.ipynb
mixed_programming_introduction.ipynb
Numba.ipynb
python_profiling.ipynb
introduction_to_c.ipynb
mixed_programming_ctypes_instant.ipynb
mi

<font color='red'>⚠️ </font> `ls` lists the files in a directory.<br>
<font color='red'>⚠️ </font> `grep` prints the lines that match a pattern

### Examples 

Show options about case-sensitivity in the manual page of `grep`:

In [13]:
man grep  | grep "case"

       -i, --ignore-case
              Ignore  case  distinctions  in  patterns and input data, so that
              characters that differ only in case match each other.
       --no-ignore-case
              Do not ignore case distinctions  in  patterns  and  input  data.


<font color='red'>⚠️ </font> `man` shows the manual of a UNIX/bash command
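
Pipes can chain more than two commands. As a small self-contained sketch (the word list is made up), the pipeline below counts how many different words appear in its input: `sort` brings equal lines together, `uniq` drops repeated neighbours, and `wc -l` counts the remaining lines.

```bash
#!/bin/bash
# tr removes the padding spaces that some versions of wc add
unique_count=$(printf 'pear\napple\npear\nfig\n' | sort | uniq | wc -l | tr -d ' ')
echo "Number of different words: $unique_count"
```
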

## Redirecting streams to files

## `STDOUT` to file

The redirect operator `>` writes `STDOUT` to a file, overwriting the file if it already exists:

```bash
./myscript.sh > myfile.txt   
```
The operator `>>` does the same, but appends the output to an existing file:
```bash
./myscript.sh >> myfile.txt   
```

## File to `STDIN`
Use the `<` redirect to send a file to `STDIN`:

```bash
wc -w < thesis.txt # Count the number of words and print to STDOUT 
wc -w < thesis.txt > word_stat.txt # Same as above, but save STDOUT output to file
```


<font color='red'>⚠️ </font> `wc` prints newline, word, and byte counts for a file 

## Specifying which stream to redirect

You can specify which stream to redirect. 

**Syntax**: `[STREAM]>`. Valid values for `STREAM` are `1` for stdout, `2` for stderr and `&` for both.

**Example**:

```bash
./compile_model.sh                 # stdout and stderr are displayed on the terminal
./compile_model.sh 1> out.txt      # Redirect stdout to file, same as >
./compile_model.sh 2> err.txt      # Redirect stderr to file
./compile_model.sh &> outerr.txt   # Redirect stdout and stderr to file
```
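
To try this without a real `compile_model.sh`, here is a minimal sketch with a stand-in function `report` (hypothetical) that writes one line to each stream. It also shows `> file 2>&1`, the portable spelling of `&> file`.

```bash
#!/bin/bash
workdir=$(mktemp -d)   # throwaway directory for the output files

# Stand-in for a real program: one line on stdout, one on stderr.
report() {
    echo "normal output"
    echo "error output" >&2
}

report 1> "$workdir/out.txt" 2> "$workdir/err.txt"  # one file per stream
report > "$workdir/both.txt" 2>&1                   # both streams in one file
report 2> /dev/null                                 # discard stderr, keep stdout

out=$(cat "$workdir/out.txt")
err=$(cat "$workdir/err.txt")
both_lines=$(wc -l < "$workdir/both.txt" | tr -d ' ')
```
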

# Flow control and command line arguments

## Comparisons and if statements

A simple `if` statement with **string comparison**:

```bash
if [ "$name" == "øyvind" ]; then
    echo "Your name is øyvind"
fi
```
**Note**: You **must** have spaces after `[` and before `]`. 

### `if` statement with integer comparison:

```bash
if [ $variable -eq 10 ]; then 
    echo "The variable is 10"
fi 
```



**Important**: Unless you have declared a variable to be an integer, assume that all variables are strings and use double quotes (strings) when comparing variables:


```bash
if [ "$?" != "0" ]; then  # this is safe
```

```bash
if [  $?  !=  0  ]; then  # might be unsafe
```
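
Why the quotes matter can be sketched with an empty variable: unquoted, the variable disappears from the test and `[` is left with a malformed expression. The variable names below are only for illustration.

```bash
#!/bin/bash
name=""

# Quoted: [ sees an empty string on the left-hand side and the test is simply false.
if [ "$name" == "simon" ]; then
    quoted_status=0
else
    quoted_status=$?
fi

# Unquoted: $name expands to nothing, so [ sees only: == simon
# Bash reports an error ("unary operator expected") instead of a clean false.
if [ $name == "simon" ] 2> /dev/null; then
    unquoted_status=0
else
    unquoted_status=$?
fi

echo "quoted: $quoted_status, unquoted: $unquoted_status"
```
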

## Command line arguments

It is common to pass command line arguments when running a script:


```bash
ls -l -h
```

Bash provides the positional parameters `$0`, `$1`, `$2`, ... for accessing these arguments:

* `$0` contains the name of the command or script, here `ls`
* `$1` contains `-l`
* `$2` contains `-h`
* ...

In addition, you can use:
* `"$@"` to access all command line arguments, each as a separate word
* `$#` to get the number of arguments
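
These parameters can be tried out without writing a separate script: the builtin `set --` replaces the positional parameters of the current shell. A minimal sketch, with made-up arguments, that also shows why `"$@"` should be quoted:

```bash
#!/bin/bash
# Pretend the script was called with three arguments; the last contains a space.
set -- -l -h "my file.txt"

count=$#
first=$1
echo "number of arguments: $count, first argument: $first"

# Quoted "$@": one word per argument.
for arg in "$@"; do
    echo "<$arg>"
done

# Unquoted $@: arguments are split again at spaces.
unquoted_words=0
for arg in $@; do
    unquoted_words=$((unquoted_words + 1))
done
echo "unquoted \$@ gives $unquoted_words words"
```
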


## For-loops


What if we want to run the application for multiple input files?

```bash
./run.sh test1.i test2.i test3.i test4.i
```
or
```bash
./run.sh *.i
```

A for-loop over command line arguments

```bash
for arg in "$@"; do
  simulation_app < "$arg"
done
```

This can be combined with more advanced command line options, output directories, etc.

## For-loops (2)


For loops for file management:

```bash
# loop over the glob directly; parsing the output of ls breaks on file names with spaces
for file in *.tmp
do
  echo "removing $file"
  rm -f "$file"
done
```

## Counters


Declare an integer counter:

In [14]:
declare -i counter
counter=0
# arithmetic expressions must appear inside (( ))

In [15]:
((counter++))
echo $counter  # yields 1

1


For-loop with counter:

```bash
declare -i n; n=1
for arg in "$@"; do
  echo "command-line argument no. $n is <$arg>"
  ((n++))
done
```
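
When the loop itself is driven by a counter, Bash also offers a C-style `for` loop. A minimal sketch that sums the numbers 1 to 3 (the variable names are only for illustration):

```bash
#!/bin/bash
sum=0
for ((i = 1; i <= 3; i++)); do
    echo "iteration $i"
    sum=$((sum + i))
done
echo "sum is $sum"
```
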

## Example: the classical Unix script
A combination of commands, or a single long command, that you use often:

```bash
./pulse_app -cmt WinslowRice -casename ellipsoid < ellipsoid.i | tee main_output
```

In this case, flexibility is often not a high priority. 

However, there is room for improvement:

1. Not possible to change command line options, input and output files
2. Output file `main_output` is overwritten for each run
3. Can we edit the input file for each run?

## Problem 1: changing command line options
In many cases only one parameter is changed frequently:

```bash
CASE='testbox'
CMT='WinslowRice'
if [ $# -gt 0 ]; then
   CMT=$1
fi
INFILE='ellipsoid_test.i'
OUTFILE='main_output'

./pulse_app -cmt $CMT -cname $CASE < $INFILE | tee $OUTFILE
```

Still not very flexible, but in many cases sufficient. More
flexibility requires more advanced parsing of command line options,
which will be introduced later.

## Problem 2: overwriting output file


A simple solution is to add the output file as a command line
  option, but what if we forget to change this from one run to the
  next?

Simple solution to ensure data is never over-written:

```bash
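# timestamp without spaces (YYYY-MM-DD_HH-MM-SS), so it is safe in a path
jobdir="$PWD/$(date +%Y-%m-%d_%H-%M-%S)"
mkdir "$jobdir"
cd "$jobdir"

# the application and the input file are in the parent directory
../pulse_app -cmt $CMT -cname $CASE < "../$INFILE" | tee "$OUTFILE"
cd ..
if [ -L 'latest' ]; then
    rm latest
fi
ln -s "$jobdir" latest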
```

<font color='red'>⚠️ </font> New commands:
* **mkdir** creates a directory
* **-L FILE** checks if `FILE` exists and is a symbolic link (you can use **-f FILE** to check if `FILE` exists and is a regular file)
* **ln** creates links to files; **ln -s** creates a symbolic link.

## Problem 2: overwriting output file (2)
Alternative solutions for creating a unique path:

1. Use process ID of the script (`$$`, not really unique)

2. <font color='red'>⚠️ </font> `mktemp` can create a temporary file with a unique name, for
  use by the script

3. Check if subdirectory exists, exit script if it does:

```bash
dir=$case
# check if $dir is a directory:
if [ -d "$dir" ]
then
  # exit script to avoid overwriting data
  echo "Output directory exists, provide a different name"
  exit 1
fi
mkdir "$dir"   # create new directory $dir
cd "$dir"      # move to $dir
```

<font color='red'>⚠️ </font> **-d DIR** checks if `DIR` exists and is a directory
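
The `mktemp` alternative can be sketched as follows. With `-d` it creates a directory instead of a file and replaces the trailing `XXXXXX` with random characters. The prefix `pulse_run` and the file name `main_output` are only illustrative, and the `trap` line (which removes the directory when the script exits) is an optional extra.

```bash
#!/bin/bash
# Create a uniquely named directory below $TMPDIR (or /tmp if TMPDIR is not set).
jobdir=$(mktemp -d "${TMPDIR:-/tmp}/pulse_run.XXXXXX") || exit 1

# Optional: remove the directory again when the script exits.
trap 'rm -rf "$jobdir"' EXIT

echo "writing results to $jobdir"
echo "result" > "$jobdir/main_output"
```
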

## Problem 3: can we edit the input file at run time?

**Problem:** Some applications do not take command line options; all input must be read from standard input or an input file

A Bash script can be used to equip such programs with basic handling of command line options

**Idea**: We want to grab input from the command line, create the correct input file, and run the application

## File reading and writing


File writing is efficiently done with 'here documents' (on-the-fly documents):

```bash
cat > myfile <<EOF
multi-line text
can now be inserted here,
and variable substitution such as
$myvariable is
supported.
EOF
```

The final EOF must start in column 1 of the script file and be alone on its line, so the redirect `> myfile` goes on the first line.

<font color='red'>⚠️ </font> **cat** prints the content of a file (or `STDIN`) to `STDOUT`.
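
Variable substitution inside a here document can also be switched off by quoting the delimiter. A minimal sketch, using a temporary directory and made-up file names, that writes the same text once with `<<EOF` and once with `<<'EOF'`:

```bash
#!/bin/bash
workdir=$(mktemp -d)
myvariable=42

# Unquoted delimiter: $myvariable is replaced by its value.
cat > "$workdir/expanded.txt" <<EOF
value: $myvariable
EOF

# Quoted delimiter: the text is written exactly as typed.
cat > "$workdir/literal.txt" <<'EOF'
value: $myvariable
EOF

cat "$workdir/expanded.txt" "$workdir/literal.txt"
```
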

## Parsing command-line options

```bash
# We want to support arguments of the form "-m 2 -b 1.5"

# read variables from the command line, one by one:
while [ $# -gt 0 ]
do
    option=$1; # load command-line arg into option
    shift;     # eat currently first command-line arg
    case "$option" in
        -m)
            m=$1; shift; ;;  # ;; indicates end of case
        -b)
            b=$1; shift; ;;
        *)
            echo "$0: invalid option \"$option\"" >&2; exit 1 ;;
    esac
done

echo "Command line arguments:"
[ -n "$m" ] && echo "m=$m"
[ -n "$b" ] && echo "b=$b"
```

<font color='red'>⚠️ </font> **shift** removes the first command line argument; the remaining arguments move one position down (`$2` becomes `$1`, and so on).
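
For single-letter options, the Bash builtin `getopts` can do the same job as the manual `while`/`shift` loop. The following is a sketch, not a replacement for the script above: the function name `parse_options` is hypothetical, and the option string `"m:b:"` says that both `-m` and `-b` take a value, which `getopts` places in `OPTARG`.

```bash
#!/bin/bash
# Parse options of the form "-m 2 -b 1.5" into the global variables m and b.
parse_options() {
    local OPTIND=1 opt   # reset getopts state so the function can be called again
    m=""
    b=""
    while getopts "m:b:" opt; do
        case "$opt" in
            m) m=$OPTARG ;;
            b) b=$OPTARG ;;
            *) return 1 ;;   # getopts has already printed an error message
        esac
    done
}

parse_options -m 2 -b 1.5
echo "m=$m"
echo "b=$b"
```

In a real script the call would typically be `parse_options "$@"`.
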

## Alternative to case: if

`case` is standard when parsing command-line arguments in Bash, but if-tests can also be used. Consider

```bash
if [ "$option" == "-m" ]; then
    m=$1; shift;  
elif [ "$option" == "-b" ]; then
    b=$1; shift;
else
    exit;
fi
```

## After assigning variables, we can write the input file

Write to `$infile` the lines that appear between the EOF symbols:

```bash
cat > $infile <<EOF
        gridfile='test2.grid'
        param_m=$m
        param_b=$b
EOF
```

## Then execute the program as usual


Redirecting input to read from the new input file

```bash
../pulse_app < $infile
```

## Print an error if something goes wrong

We can add a check for successful execution.
The shell variable `$?` is 0 if the last command
was successful; otherwise `$?` is nonzero.

```bash
if [ "$?" != "0" ]; then
  echo "running pulse_app failed"; exit 1
fi
```


<font color='red'>⚠️ </font> **exit n** sets $? to n and exits the script
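
Since `$?` is overwritten by every command, it has to be read immediately after the command of interest. An alternative, sketched here with a stand-in function `run_step` (hypothetical, it always fails with status 3), is to put the command directly into the `if` condition:

```bash
#!/bin/bash
# Stand-in for a real program such as ../pulse_app; always fails with status 3.
run_step() {
    return 3
}

if run_step; then
    status=0
    echo "run_step succeeded"
else
    status=$?   # exit status of the command in the if condition
    echo "running run_step failed with exit status $status" >&2
fi
```
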

## Example: bundle files


Pack a series of files into one file.

Executing this single file as a Bash script unpacks all the individual files again.

Usage:

```bash
bundle file1 file2 file3 > onefile  # pack
bash onefile # unpack
```

Writing `bundle` is easy:

```bash
#!/bin/sh
for i in "$@"; do
    echo "echo unpacking file $i"
    echo "cat > $i <<EOF"
    cat $i
    echo "EOF"
done
```

## The bundle output file


Consider 2 fake files: `file1`:

```bash
Hello, World!
No sine computations today
```

and `file2`:

```bash
1.0 2.0 4.0
0.1 0.2 0.4
```

Running `bundle file1 file2` yields the output

```bash
echo unpacking file file1
cat > file1 <<EOF
Hello, World!
No sine computations today
EOF
echo unpacking file file2
cat > file2 <<EOF
1.0 2.0 4.0
0.1 0.2 0.4
EOF
```
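
The round trip can be checked with a small sketch. Here `bundle` is written as a shell function rather than a separate script, and, as a variation on the version above, it quotes the delimiter (`<<'EOF'`) so that any `$` in the packed files is kept literally. The directory names are made up, and `cmp -s` compares the unpacked copies with the originals.

```bash
#!/bin/bash
# Variation of bundle as a function; the quoted 'EOF' disables variable expansion on unpacking.
bundle() {
    for i in "$@"; do
        echo "echo unpacking file $i"
        echo "cat > $i <<'EOF'"
        cat "$i"
        echo "EOF"
    done
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Hello, World!\nNo sine computations today\n' > file1
printf '1.0 2.0 4.0\n0.1 0.2 0.4\n' > file2

bundle file1 file2 > onefile               # pack

mkdir unpacked
(cd unpacked && bash ../onefile)           # unpack in a separate directory

if cmp -s file1 unpacked/file1 && cmp -s file2 unpacked/file2; then
    roundtrip=ok
else
    roundtrip=failed
fi
echo "round trip: $roundtrip"
```
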

**Note**: In the terminal, you can send a foreground job into the background by pressing `Ctrl-Z` followed by the `bg` command, and retrieve it again with `fg`.

## Functions

```bash
function system {
# Run operating system command and if failure, report and abort

  "$@"
  if [ $? -ne 0 ]; then
    echo "make.sh: unsuccessful command $@"
    echo "abort!"
    exit 1
  fi
}
# function arguments: $1 $2 $3 and so on

# call:
system pdflatex mydoc.tex
system bibtex mydoc
```

How to return a value from a function? Define a new variable within the function: all variables are global unless they are declared with `local`!
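
A minimal sketch of the options, with made-up function names: a function can set a global variable, keep a variable private with `local`, or print its result so that the caller captures it with `$( )`.

```bash
#!/bin/bash
set_global() { answer=42; }              # answer is visible after the call
keep_local() { local hidden="secret"; }  # hidden exists only inside the function
double()     { echo $(( 2 * $1 )); }     # "returns" its result on stdout

set_global
keep_local
result=$(double 21)                      # capture the printed result

echo "answer=$answer hidden=${hidden:-<unset>} result=$result"
```
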

## File globbing, for loop on the command line


List all .ps and .gif files using wildcard notation:

```bash
files=$(ls *.ps *.gif)

# compress and move the files:
gzip $files
for file in $files; do
  mv ${file}.gz $HOME/images
done
```



<font color='red'>⚠️ </font> **gzip** compresses a file,
**mv** moves a file, **$HOME** is a string with the path to your home directory.

## The find command

Very useful command!


<font color='red'>⚠️ </font> `find` visits all files in a directory tree and can execute one or more commands for every file

Basic example: find the `oscillator` codes

```bash
find $scripting/src -name 'oscillator*' -print
```

Or find all log and PDF files

```bash
find $HOME \( -name '*.log' -o -name '*.pdf' \) -print
```

We can also run a command for each file:

```bash
find rootdir -name filenamespec -exec command {} \; -print
# {} is the current filename
```
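
To experiment with `find` safely, the commands can be run on a throwaway directory tree. A minimal sketch with made-up file names: it counts the log and PDF files, then deletes the log files with `-exec` and counts what is left.

```bash
#!/bin/bash
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
touch "$workdir/run.log" "$workdir/a/notes.pdf" \
      "$workdir/a/b/deep.log" "$workdir/a/b/data.txt"

# Count all .log and .pdf files anywhere in the tree.
matches=$(find "$workdir" \( -name '*.log' -o -name '*.pdf' \) -type f | wc -l | tr -d ' ')
echo "log and pdf files: $matches"

# Run a command for every match: {} is replaced by the current file name.
find "$workdir" -name '*.log' -type f -exec rm -f {} \;

remaining=$(find "$workdir" -type f | wc -l | tr -d ' ')
echo "files left: $remaining"
```
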

## Applications of find (1)


Find all files larger than 2000 blocks of 512 bytes (about 1 MB):

```bash
find $HOME -name '*' -type f -size +2000 -exec ls -s {} \;
```

Remove all these files:

```bash
find $HOME -name '*' -type f -size +2000 \
           -exec ls -s {} \; -exec rm -f {} \;
```

or ask the user for permission to remove:

```bash
find $HOME -name '*' -type f -size +2000 \
           -exec ls -s {} \; -ok rm -f {} \;
```

## Applications of find (2)


Find all files that have not been accessed in the last 90 days:

```bash
find $HOME -name '*' -atime +90 -print
```

and move these to /tmp/trash:

```bash
find $HOME -name '*' -atime +90 -print \
           -exec mv -f {} /tmp/trash \;
```

## Tar and gzip


<font color='red'>⚠️ </font> The `tar` command can pack single files or all files in a directory tree into one file, which can be unpacked later

```bash
tar -cvf myfiles.tar mytree file1 file2

# options:
# c: create archive, v: list names of files, f: name of the archive file

# unpack the mytree tree and the files file1 and file2:
tar -xvf myfiles.tar

# options:
# x: extract (unpack)
```

The tarfile can be compressed:

```bash
gzip myfiles.tar
# result: myfiles.tar.gz
```
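
Packing and compressing can also be done in one step with the `z` option of `tar`. A minimal sketch on a throwaway directory (the directory and file names are made up); `-C` tells `tar` in which directory to unpack.

```bash
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir mytree
echo "data" > mytree/file1

tar -czf myfiles.tar.gz mytree        # c: create, z: compress with gzip, f: archive name
tar -tzf myfiles.tar.gz               # t: list the contents without unpacking

mkdir restore
tar -xzf myfiles.tar.gz -C restore    # x: extract into the directory restore

restored=$(cat restore/mytree/file1)
echo "restored content: $restored"
```
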

## A find/tar/gzip example


Pack all PostScript figures:

```bash
tar -cvf ps.tar `find $HOME -name '*.ps' -print`
gzip ps.tar
```

# That is it for today

<center>
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/a/a4/Anatomy_of_a_Sunset-2.jpg/2880px-Anatomy_of_a_Sunset-2.jpg" style="width: 500px;"/>
</center>