# First some organisational stuff

## Assignment 1

* Deadline: Today at midnight
* Make sure that your solution is pushed to 
  
  `https://github.com/UiO-INF3331/INF3331-username`.
* If you have problems accessing your repo, talk to me during the break/after the lecture.

## Assignment 2

* Assignment 2 is online now.
* Topic: Bash scripting (lecture today).

# Bash programming


## Overview of Unix shells

  * The original scripting languages were (extensions of) command interpreters in operating systems

  * Primary example: Unix shells

  * Bourne shell (`sh`) was the first major shell
    * C and TC shell (`csh` and `tcsh`) had improved command interpreters, but were less popular than Bourne shell for programming
    * Bourne Again shell (Bash/`bash`) is a GNU/FSF improvement of Bourne shell
    * Other Bash-like shells: Dash (`dash`), Korn shell (`ksh`), Z shell (`zsh`)

  * Bash is the dominant Unix shell today

## Why learn Bash?

  * Learning Bash means learning Unix

  * Learning Bash means learning the roots of scripting (Bourne shell is a subset of Bash)

  * Shell scripts, especially in Bourne shell and Bash, are frequently encountered on Unix systems

  * Bash is widely available (open source) and is the dominant command interpreter and scripting language on today's Unix systems

## Why learn Bash? (2)

* Shell scripts evolve naturally from a workflow: 
  * A sequence of commands you use often is placed in a file
  * Command-line options are introduced so that different settings can be passed to the commands
  * Introducing variables, if-tests and loops enables more complex program flow
  * At some point pre- and postprocessing becomes too advanced for Bash, at which point (parts of) the script should be ported to Python or other tools
* Shell scripts are often used to glue together more advanced scripts in Perl and Python

## Remark

* We use plain Bourne shell (`/bin/sh`) when special features of Bash (`/bin/bash`) are not needed
* Most of our examples can in fact be run under Bourne shell (and of course also Bash)
* On macOS, the Bourne shell (`/bin/sh`) is provided by Bash
    (`/bin/bash`). On Ubuntu, the Bourne shell (`/bin/sh`) is a link to Dash, a
    minimal but much faster shell than Bash.
* On Windows, Bash is available through Cygwin or the Windows Subsystem for Linux in Windows 10.

## More information

  * `man bash`

In [17]:
man bash

BASH(1)                     General Commands Manual                    BASH(1)

NNAAMMEE
       bash - GNU Bourne-Again SHell

SSYYNNOOPPSSIISS
       bbaasshh [options] [command_string | file]

CCOOPPYYRRIIGGHHTT
       Bash is Copyright (C) 1989-2013 by the Free Software Foundation, Inc.

DDEESSCCRRIIPPTTIIOONN
       BBaasshh  is  an  sshh-compatible  command language interpreter that executes
       commands read from the standard input or from a file.  BBaasshh also incor‐
       porates useful features from the _K_o_r_n and _C shells (kksshh and ccsshh).

       BBaasshh  is  intended  to  be a conformant implementation of the Shell and
       Utilities portion  of  the  IEEE  POSIX  specification  (IEEE  Standard
       1003.1).  BBaasshh can be configured to be POSIX-conformant by default.

OOPPTTIIOONNSS
       All  of  the  single-character shell options documented in the descrip‐
  

  * Bash reference manual: `www.gnu.org/software/bash/manual/bashref.html`

  * "Advanced Bash-Scripting Guide": `http://www.tldp.org/LDP/abs/html/`

## What Bash is *good* for

* File and directory management

* Systems management (build scripts)

* Combining other scripts and commands

* Rapid prototyping of more advanced scripts

* Very simple output processing, plotting etc.

## What Bash is *not good* for

* Cross-platform portability

* Graphics, GUIs

* Interface with libraries or legacy code

* More advanced post processing and plotting

* Calculations, math etc.

## Some common tasks in Bash

  * file writing

  * for-loops

  * running an application

  * pipes

  * writing functions

  * file globbing, testing file types

  * copying and renaming files, creating and moving to directories, creating directory paths, removing files and directories

  * directory tree traversal

  * packing directory trees

### Bash example 1: Hello world

In [1]:
#!/bin/bash
echo "Hello world"

Hello world


Two options to run this script:

1. Type the commands directly in the bash shell (only feasible for
small scripts).
2. Save the code as `helloworld.sh` and run with:
```bash
chmod a+x helloworld.sh # Make script executable
./helloworld.sh
```

### Bash example 2: Hello world, v2

In [7]:
#!/bin/bash
x="world"
echo "Hello ${x}!"

Hello world!


## Bash variables and commands

  * Assign a variable by `x=3.4`, retrieve the value of the variable by `$x` (also called **variable substitution**).

  * Variables passed as command line arguments when running a script are called **positional parameters**.

  * Bash has a number of built-in commands; type `help` or `help | less` to see them all.

In [8]:
help

GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)
These shell commands are defined internally.  Type `help' to see this list.
Type `help name' to find out more about the function `name'.
Use `info bash' to find out more about the shell in general.
Use `man -k' or `info' to find out more about commands not in this list.

A star (*) next to a name means that the command is disabled.

 job_spec [&]                            history [-c] [-d offset] [n] or hist>
 (( expression ))                        if COMMANDS; then COMMANDS; [ elif C>
 . filename [arguments]                  jobs [-lnprs] [jobspec ...] or jobs >
 :                                       kill [-s sigspec | -n signum | -sigs>
 [ arg... ]                              let arg [arg ...]
 [[ expression ]]                        local [option] name[=value] ...
 alias [-p] [name[=value] ... ]          logout [n]
 bg [job_spec ...]                       mapfile [-n count] [-O origin] [-s c>
 bind [-lpsv

  * The real power comes from all the available Unix commands, in addition to your own applications and scripts.

## Quiz: What is the difference?

In [9]:
x=5
echo $x



In [8]:
x = 5
echo $x

x: command not found
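
The difference is the whitespace: `x=5` is an assignment, whereas `x = 5` asks the shell to run a command named `x` with the arguments `=` and `5`. A short sketch of the rule (the variable `greeting` is only for illustration):

```bash
x=5                      # assignment: no spaces around '='
echo "$x"

greeting="Hello world"   # quote the value if it contains spaces
echo "$greeting"

# 'x = 5' would instead try to run a command called 'x'
# with the two arguments '=' and '5'.
```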


## Bash variables (1)


Variables in Bash are untyped!

In [47]:
x=5



Generally treated as character arrays:

In [48]:
x=$x+1
echo $x

5+1


Use the `let` command for simple arithmetic and other operations:

In [50]:
x=5
let "x+=1"   # let evaluates arithmetic expressions
echo $x

6


Variables can be explicitly declared as integer, array or read-only:

In [55]:
declare -i i     # i is an integer
declare -a A     # A is an array
r=10
declare -r r     # r is read only

bash: r: readonly variable


In [56]:
r="a"

bash: r: readonly variable


## Bash variables (2)


The `echo` command is used for writing:
```bash
s=42
echo "The answer is $s"
```
and variables can be inserted in the text string (variable interpolation)

#### Frequently seen variables


Command line arguments:
```bash
$0    # Name of script
$1    # First command line argument
$2    # Second command line argument
# ...
```

All the command line arguments:
```bash
$@
```

Number of command line arguments:
```bash
$#
```

The exit status of the last executed command:
```bash
$?   # 0 if the last command succeeded, nonzero otherwise
```
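
A minimal sketch of how these variables behave. It uses a function, because inside a function `$1`, `$@` and `$#` refer to the function's own arguments in the same way as they refer to the script's arguments at the top level. The name `show_args` is made up for this illustration.

```bash
# show_args is a hypothetical helper used only for this illustration
show_args() {
    echo "number of arguments: $#"
    echo "first argument: $1"
    for arg in "$@"; do          # "$@" keeps each argument intact
        echo "argument: <$arg>"
    done
}

show_args one "two words" three
```

Quoting `"$@"` matters: the second argument stays one argument even though it contains a space.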

### Example


**Idea**: Write a script that takes a command as an argument, runs it and checks if it was successful.

```bash
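#!/bin/bash
# run_and_test.sh: run the command given as arguments and check its exit status
"$@"    # quoted, so arguments containing spaces are passed on unchanged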

if [ "$?" == "0" ]; then
    echo "Hurray, everything went fine."
else
    echo "Oops, there was an error."
fi

```

In [12]:
./run_and_test.sh ls hello-world/*.cpp

hello-world/c++.cpp
Hurray, everything went fine.


## Bash variables (3)


Comparing two integers uses a different syntax than comparing two strings:

```bash
if [ $i -eq 10 ]; then        # integer comparison
    echo "i equals 10"        # an if-body must contain at least one command
fi 

if [ "$name" == "10" ]; then  # string  comparison
    echo "name is the string 10"
fi 
```


Unless you have declared a variable to be an integer, assume that all variables are strings and use double quotes (strings) when comparing variables in an if-test:

```bash
if [ "$?" != "0" ]; then  # this is safe
```

```bash
if [  $?  !=  0  ]; then  # might be unsafe
```
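
To see why the unquoted form can be unsafe, consider an empty variable. The sketch below (with an illustrative variable `name`) records the exit status of both forms. After expansion the unquoted test has lost its left operand, so `[` reports a syntax error instead of a plain "false".

```bash
name=""

quoted_status=0
[ "$name" = "10" ] || quoted_status=$?               # well-formed test, simply false

unquoted_status=0
[ $name = "10" ] 2>/dev/null || unquoted_status=$?   # expands to: [ = "10" ]

echo "quoted: $quoted_status, unquoted: $unquoted_status"
```

A status of 1 means "false", while a status above 1 signals that the test itself was malformed.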

### Executing a command, storing the result as a variable

This can be done in two ways:
```bash 
time=$(date)
time=`date`
```
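
The `$(...)` form is usually preferred because it nests without extra escaping. A small sketch, where the path `/usr/local/bin` is just an example string and does not need to exist:

```bash
# capture the output of a pipeline
upper=$(echo "hello" | tr 'a-z' 'A-Z')

# nested command substitution: name of the parent directory of a path
parent=$(basename "$(dirname /usr/local/bin)")

echo "$upper $parent"
```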

## Convenient debugging tool: -x


Each source code line is printed prior to its execution if you add `-x` as an option to `/bin/sh` or `/bin/bash`.

Either in the header

```bash
#!/bin/bash -x
```

or on the command line:

In [16]:
bash -x hw.sh

+ echo 'Hello World'
Hello World


Very convenient during debugging

## Combining bash commands (1)

  * The power of Unix lies in combining simple commands into powerful operations

  * Standard Bash commands and Unix applications normally do one small task

  * Text is used for input and output - easy to send output from one command as input to another

## Combining bash commands (2)
Two standard ways to combine commands:


The pipe, sends the output of one command as input to the next:

```bash
ls -l | grep 3331
```

This lists all lines of the `ls -l` output that contain `3331`, typically the files having 3331 as part of the name.

## Combining bash commands (3)
#### More useful pipe examples

Send the file list with sizes to `sort -rn` (reverse numerical sort) to get a list
of files sorted by size:


```bash
ls -s | sort -rn
```

Make a new application: sort all files in a directory
tree `assignments`, with the largest files appearing first, and
equip the output with paging functionality:

```bash
du -a assignments | sort -rn | less
```
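
Longer pipelines follow the same pattern. As a sketch, the following finds the most frequent word in a small, made-up word list using only standard tools:

```bash
# sort groups equal words, uniq -c counts them,
# sort -rn puts the highest count first, head keeps that line,
# and awk prints the word (second column)
most_common=$(printf '%s\n' apple pear apple plum apple pear \
    | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

echo "most common word: $most_common"
```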

## Bash redirects

Redirects are used to pass output to either a file or stream.

```bash
echo "Hei verden" > myfile.txt  # Save (stdout) output to file
echo "Hei verden" >> myfile.txt # Append (stdout) output to file
wc -w < myfile.txt              # Use file content as (stdin) command input
```
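
The three redirects can be tried out safely on a temporary file. This is a minimal sketch; `mktemp` picks the file name, and `tr -d ' '` removes the padding that some `wc` implementations add.

```bash
tmpfile=$(mktemp)                  # unique temporary file

echo "Hei verden" > "$tmpfile"     # create/overwrite
echo "Hei igjen" >> "$tmpfile"     # append a second line

nlines=$(wc -l < "$tmpfile" | tr -d ' ')
nwords=$(wc -w < "$tmpfile" | tr -d ' ')
echo "$nlines lines, $nwords words"

rm -f "$tmpfile"                   # clean up
```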

## Quiz: Can you simplify this code?

```bash
ls > files && grep 2017 < files
```

The code above implements an (inefficient) pipe. Better:
```bash
ls | grep 2017
```

## Quiz: Can you simplify this code?

```bash
cat INF3331-$username | ./test
```

The above can also be implemented as a redirect:

```bash
./test < INF3331-$username
```

## The in-/outputs of a shell process: stdin, stdout and stderr

A process takes standard input (STDIN) and returns

* standard output (STDOUT)
* standard error (STDERR)
* return code - 0 on success, a different number otherwise

<img src="figs/bash_process_codes.jpg" style="width: 500px;"/>

## Redirecting process streams

```bash
rm -v *.txt                 # stdout and stderr are displayed on the terminal
rm -v *.txt 1> out.txt      # Redirect stdout to a file, same as >
rm -v *.txt 2> err.txt      # Redirect stderr to a file
rm -v *.txt &> outerr.txt   # Redirect stdout and stderr to file
```

You can print to `stderr` with:

```bash
echo "Wrong arguments" >&2
```

Redirects and pipes can be combined:

```bash   
./compile 2>&1 | less  # View both stdout and stderr in less
```
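
The order of redirects matters when the two streams should be separated. A small sketch with a made-up function `report` that writes one line to each stream:

```bash
# report is a hypothetical function writing to both stdout and stderr
report() {
    echo "result"
    echo "warning" >&2
}

out=$(report 2>/dev/null)        # keep stdout only
err=$(report 2>&1 >/dev/null)    # keep stderr only: 2>&1 must come first
both=$(report 2>&1)              # merge both streams

echo "stdout: $out"
echo "stderr: $err"
```

In `2>&1 >/dev/null`, stderr is first pointed at the current stdout (the capture), and only then is stdout sent to `/dev/null`.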

## Example: the classical Unix script
A combination of commands, or a single long command, that you use often:

```bash
./pulse_app -cmt WinslowRice -casename ellipsoid < ellipsoid.i | tee main_output
```

In this case, flexibility is often not a high priority. However, there
is room for improvement:

  * Not possible to change command line options, input and output files

  * Output file `main_output` is overwritten for each run

  * Can we edit the input file for each run?

## Problem 1: changing application input
In many cases only one parameter is changed frequently:

```bash
CASE='testbox'
CMT='WinslowRice'
if [ $# -gt 0 ]; then
   CMT=$1
fi
INFILE='ellipsoid_test.i'
OUTFILE='main_output'

./pulse_app -cmt $CMT -cname $CASE < $INFILE | tee $OUTFILE
```

Still not very flexible, but in many cases sufficient. More
flexibility requires more advanced parsing of command line options,
which will be introduced later.
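
The "use the first argument if given, otherwise a default" pattern can also be written with parameter expansion. A sketch, wrapped in a hypothetical function `choose_cmt` so that it can be tried interactively (`OtherModel` is a made-up value):

```bash
choose_cmt() {
    # ${1:-WinslowRice} expands to $1 if it is set and non-empty,
    # otherwise to the default value WinslowRice
    CMT=${1:-WinslowRice}
}

choose_cmt
echo "$CMT"              # default is used

choose_cmt OtherModel
echo "$CMT"              # value from the argument
```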

## Problem 2: overwriting output file


A simple solution is to add the output file as a command line
  option, but what if we forget to change this from one run to the
  next?

Simple solution to ensure data is never over-written:

```bash
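# timestamp without spaces, e.g. 2017-09-05_14-30-00
jobdir="$PWD/$(date +%Y-%m-%d_%H-%M-%S)"
mkdir "$jobdir"
cd "$jobdir"

# assumes the application and input file are in the parent directory
../pulse_app -cmt $CMT -cname $CASE  < ../$INFILE | tee $OUTFILE
cd ..
if [ -L 'latest' ]; then
    rm latest
fi
ln -s "$jobdir" latest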
```

## Problem 2: overwriting output file (2)
Alternative solutions:

* Use process ID of the script (`$$`, not really unique)

* `mktemp` can create a temporary file with a unique name, for
  use by the script

* Check if subdirectory exists, exit script if it does:

```bash
dir=$case
# check if $dir is a directory:
if [ -d "$dir" ]
  # exit script to avoid overwriting data
  then
    echo "Output directory exists, provide a different name"
    exit 1       # nonzero exit status signals the failure
fi
mkdir "$dir"   # create new directory $dir
cd "$dir"      # move to $dir
```
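
The `mktemp` alternative can be sketched as follows. With `-d` it creates a new, uniquely named directory, so two runs never share an output location. The prefix `pulse_run` and the file content are made up for the illustration.

```bash
# create a unique directory; the XXXXXX part is replaced by random characters
jobdir=$(mktemp -d "${TMPDIR:-/tmp}/pulse_run.XXXXXX")

# remove the directory again when the script exits
trap 'rm -rf "$jobdir"' EXIT

echo "simulation output" > "$jobdir/main_output"
echo "results stored in $jobdir"
```

For results that should be kept, drop the `trap` line and create the directory somewhere permanent.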

## Alternative `if`-tests
As with everything else in Bash, there are multiple ways to do `if`-tests:

```bash        
# the 'then' statement can also appear on the 1st line:
if [ -d $dir ]; then
  exit
fi

# another form of if-tests:
if test -d $dir; then
  exit
fi

# and a shortcut:
[ -d $dir ] && exit
test -d $dir && exit
```

Note the whitespace in `[ -d $dir ]`: the brackets must be separated from the test by spaces, otherwise it will not work.

## Problem 3: can we edit the input file at run time?

  * Some applications do not take command line options; all input must be read from standard input or an input file

  * A Bash script can be used to equip such programs with basic handling of command line options

**Idea**: We want to grab input from the command line, create the correct input file, and run the application

## File reading and writing


File writing is efficiently done by 'here documents':

```bash
cat > myfile <<EOF
multi-line text
can now be inserted here,
and variable substitution such as
$myvariable is
supported.
EOF
```

The final `EOF` must start in column 1 of the script file.
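
If the text should be written literally, without variable substitution, the delimiter can be quoted. A short sketch comparing the two forms, capturing the text in variables instead of writing to a file:

```bash
myvariable=42

# unquoted delimiter: $myvariable is substituted
expanded=$(cat <<EOF
value: $myvariable
EOF
)

# quoted delimiter: the text is taken literally
literal=$(cat <<'EOF'
value: $myvariable
EOF
)

echo "$expanded"
echo "$literal"
```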

## Parsing command-line options

```bash
# read variables from the command line, one by one:
while [ $# -gt 0 ]
do
    option=$1; # load command-line arg into option
    shift;     # eat currently first command-line arg
    case "$option" in
        -m)
            m=$1; shift; ;;  # ;; indicates end of case
        -b)
            b=$1; shift; ;;
        *)
            echo "$0: invalid option \"$option\""; exit 1 ;;
    esac
done

echo "Command line arguments:"
[ -n "$m" ] && echo "m=$m"
[ -n "$b" ] && echo "b=$b"
```
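
For single-letter options, the shell builtin `getopts` is a common alternative to a hand-written loop. The sketch below handles the same `-m` and `-b` options. The wrapper function `parse_opts` is hypothetical and only there to make the example easy to call.

```bash
parse_opts() {
    OPTIND=1          # reset, so the function can be called more than once
    m=""
    b=""
    # "m:b:" means: options -m and -b, each followed by a value
    while getopts "m:b:" opt; do
        case "$opt" in
            m) m=$OPTARG ;;
            b) b=$OPTARG ;;
            *) echo "usage: [-m value] [-b value]" >&2; return 1 ;;
        esac
    done
}

parse_opts -m 2.5 -b 10
echo "m=$m b=$b"
```

`getopts` reports unknown options itself and sets `opt` to `?`, which is caught by the `*)` branch.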

## Alternative to case: if

`case` is standard when parsing command-line arguments in Bash, but if-tests can also be used. Consider

```bash
if [ "$option" == "-m" ]; then
    m=$1; shift;  
elif [ "$option" == "-b" ]; then
    b=$1; shift;
else
    echo "$0: invalid option \"$option\""; exit 1;
fi
```

## After assigning variables, we can write the input file

Write to `$infile` the lines that appear between the EOF symbols:

```bash
cat > $infile <<EOF
        gridfile='test2.grid'
        param_m=$m
        param_b=$b
EOF
```

## Then execute the program as usual


Redirecting input to read from the new input file

```bash
../pulse_app < $infile
```

We can add a check for successful execution.
The shell variable `$?` is 0 if the last command
was successful, and nonzero otherwise.

```bash
if [ "$?" != "0" ]; then
  echo "running pulse_app failed"; exit 1
fi

# exit n sets $? to n
```
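
Since `if` tests the exit status of a command, the command can also be placed directly in the `if` condition. A sketch with a hypothetical helper `check`, tried on commands with known exit statuses:

```bash
# check is a made-up helper: run a command and report the outcome
check() {
    if "$@"; then
        echo "ok"
    else
        echo "failed with status $?"   # $? still holds the command's status
    fi
}

check true
check sh -c 'exit 3'
```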

## For-loops


What if we want to run the application for multiple input files?

```bash
./run.sh test1.i test2.i test3.i test4.i
```

or

```bash
./run.sh *.i
```

A for-loop over command line arguments

```bash
for arg in "$@"; do
  ../pulse_app < "$arg"
done
```

Can be combined with more advanced command line options, output
  directories, etc...
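
When looping over arguments, `"$@"` (with quotes) is the safer form, because it keeps arguments containing spaces in one piece. A small sketch with two made-up counting functions shows the difference:

```bash
count_quoted() {
    n=0
    for arg in "$@"; do n=$((n + 1)); done
    echo "$n"
}

count_unquoted() {
    n=0
    for arg in $@; do n=$((n + 1)); done   # arguments are split at spaces
    echo "$n"
}

count_quoted   "my file.i" test2.i
count_unquoted "my file.i" test2.i
```

The first call sees two arguments; the second sees three, because `my file.i` is split in two.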

## For-loops (2)


For loops for file management:

```bash
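# loop directly over the glob pattern; unlike `ls *.tmp`,
# this also works for file names containing spaces
for file in *.tmp
do
  [ -e "$file" ] || continue  # no match: the pattern is left unexpanded
  echo "removing $file"
  rm -f "$file"
done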
```

## Counters


Declare an integer counter:

```bash
declare -i counter
counter=0
# arithmetic expressions must appear inside (( ))
((counter++))
echo $counter  # yields 1
```
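
Arithmetic expansion, `$(( ))`, evaluates an expression and substitutes the result, which is handy in assignments. A brief sketch contrasting it with plain string assignment (the variable names are illustrative):

```bash
x=5
as_string=$x+1          # plain assignment: the string "5+1"
as_number=$((x + 1))    # arithmetic expansion: the number 6

echo "$as_string"
echo "$as_number"
```

Unlike `declare -i` and `(( ))`, the `$(( ))` form is also available in plain Bourne-style shells such as Dash.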

For-loop with counter:

```bash
declare -i n; n=1
for arg in "$@"; do
  echo "command-line argument no. $n is <$arg>"
  ((n++))
done
```

## C-style for-loops

```bash
declare -i i
n=3  # number of iterations
for ((i=0; i<$n; i++)); do
  echo $i
```
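
C-style loops are a Bash extension. Where a script should also run under `/bin/sh`, the same counting loop can be sketched with `while` and arithmetic expansion:

```bash
i=0
squares=""
while [ "$i" -lt 4 ]; do
    squares="$squares $((i * i))"   # append the square of i
    i=$((i + 1))
done

echo "squares:$squares"
```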

## Example: bundle files


Pack a series of files into one file

Executing this single file as a Bash script unpacks all the individual files again

Usage:

```bash
bundle file1 file2 file3 > onefile  # pack
bash onefile # unpack
```

Writing `bundle` is easy:

```bash
#!/bin/sh
for i in "$@"; do
    echo "echo unpacking file $i"
    echo "cat > $i <<EOF"
    cat "$i"
    echo "EOF"
done
```

## The bundle output file


Consider 2 fake files: `file1`:

```bash
Hello, World!
No sine computations today
```

and `file2`:

```bash
1.0 2.0 4.0
0.1 0.2 0.4
```

Running `bundle file1 file2` yields the output

```bash
echo unpacking file file1
cat > file1 <<EOF
Hello, World!
No sine computations today
EOF
echo unpacking file file2
cat > file2 <<EOF
1.0 2.0 4.0
0.1 0.2 0.4
EOF
```
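
One caveat: with an unquoted `EOF`, any `$` in the packed files is treated as a variable reference when unpacking. The following sketch is a variant, not the version above. It writes a quoted delimiter instead and runs a full pack/unpack round trip in a temporary directory. The file content is made up, and files containing a line that is exactly `EOF` would still break it.

```bash
bundle() {
    for i in "$@"; do
        echo "echo unpacking file $i"
        echo "cat > $i <<'EOF'"     # quoted: contents are restored literally
        cat "$i"
        echo "EOF"
    done
}

workdir=$(mktemp -d)
printf 'price: $5\n' > "$workdir/file1"

(
    cd "$workdir" || exit 1
    bundle file1 > onefile          # pack
    rm file1
    sh onefile                      # unpack
)

cat "$workdir/file1"
```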

## Running an application


Running in the foreground:

```bash
cmd="myprog -c file.1 -p -f -q";
$cmd < my_input_file

# output is directed to the file res
$cmd < my_input_file > res

# process res file by Sed, Awk, Perl or Python
```

Running in the background:

```bash
myprog -c file.1 -p -f -q < my_input_file &
```

or stop a foreground job with Ctrl-Z and then type `bg`
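
In a script, a background job is usually followed up with `wait`, which blocks until the job has finished and returns its exit status. A sketch, where the function `slow_job` stands in for a long-running application:

```bash
# slow_job is a placeholder for a real, long-running program
slow_job() {
    sleep 1
    return 3
}

slow_job &                    # start in the background
job_pid=$!                    # $! is the process ID of the last background job
echo "started background job $job_pid"

job_status=0
wait "$job_pid" || job_status=$?
echo "job finished with status $job_status"
```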

## Functions

```bash
function system {
# Run operating system command and if failure, report and abort

  "$@"
  if [ $? -ne 0 ]; then
    echo "make.sh: unsuccessful command $*"
    echo "abort!"
    exit 1
  fi
}
# function arguments: $1 $2 $3 and so on
# return value: exit status of the last statement
# call:
name=mydoc
system pdflatex $name
system bibtex $name
```

How to return a value from a function? Assign it to a variable inside the function: all variables are global unless declared with `local`.
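
Two other common ways of getting a result out of a function are to print it and capture the output, or to use the exit status for yes/no answers. A sketch with two made-up functions:

```bash
add() {
    local sum             # local: not visible outside the function
    sum=$(( $1 + $2 ))
    echo "$sum"           # "return" the value by printing it
}

is_even() {
    [ $(( $1 % 2 )) -eq 0 ]   # exit status 0 (true) if $1 is even
}

result=$(add 2 3)
echo "2 + 3 = $result"

if is_even "$result"; then
    echo "$result is even"
else
    echo "$result is odd"
fi
```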

## File globbing, for loop on the command line


List all .ps and .gif files using wildcard notation:

```bash
files=`ls *.ps *.gif`

# or safer, if you have aliased ls:
files=`/bin/ls *.ps *.gif`

# compress and move the files:
gzip $files
for file in $files; do
  mv ${file}.gz $HOME/images
done
```
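
Parsing the output of `ls` breaks for file names with spaces. A more robust sketch loops over the glob patterns directly. The files are created in a temporary directory just for the demonstration.

```bash
workdir=$(mktemp -d)
touch "$workdir/a.ps" "$workdir/b c.gif" "$workdir/notes.txt"

count=0
for file in "$workdir"/*.ps "$workdir"/*.gif; do
    [ -e "$file" ] || continue     # skip patterns that matched nothing
    echo "found: $file"
    count=$((count + 1))
done

echo "$count image files"
rm -rf "$workdir"
```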

## Testing file types

```bash
if [ -f $myfile ]; then
    echo "$myfile is a plain file"
fi

# or equivalently:
if test -f $myfile; then
    echo "$myfile is a plain file"
fi

if [ ! -d $myfile ]; then
    echo "$myfile is NOT a directory"
fi

if [ -x $myfile ]; then
    echo "$myfile is executable"
fi

# -s is true if the file exists and is non-empty (-z would test for an empty string):
[ ! -s "$myfile" ] && echo "empty file $myfile"
```
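
Several tests are often combined into one decision. A sketch of a hypothetical function `describe` that classifies a path using the tests above:

```bash
describe() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ ! -s "$1" ]; then
        echo "empty file"
    elif [ -f "$1" ]; then
        echo "plain file"
    else
        echo "missing or special"
    fi
}

emptyfile=$(mktemp)        # mktemp creates an empty file
describe /
describe "$emptyfile"
rm -f "$emptyfile"
describe "$emptyfile"
```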

## Rename, copy and remove files

```bash
# rename $myfile to tmp.1:
mv $myfile tmp.1

# force renaming:
mv -f $myfile tmp.1

# move a directory tree mytree to $root:
mv mytree $root

# copy myfile to $tmpfile:
cp myfile $tmpfile

# copy a directory tree mytree recursively to $root:
cp -r mytree $root

# remove myfile and all files with suffix .ps:
rm myfile *.ps

# remove a non-empty directory tmp/mydir:
rm -r tmp/mydir
```

## Directory management

```bash
# make directory:
dir="mynewdir"
mkdir $dir
# or set the permissions when creating the directory (choose one):
mkdir -m 0755 $dir  # readable for all
mkdir -m 0700 $dir  # readable for owner only
mkdir -m 0777 $dir  # all rights for all

# move to $dir
cd $dir
# move to $HOME
cd

# create intermediate directories (the whole path):
mkdir -p  $HOME/bash/projects/test1
```

## The find command

Very useful command!


`find` visits all files in a directory tree and can execute one or more commands for every file

Basic example: find the `oscillator` codes

```bash
find $scripting/src -name 'oscillator*' -print
```

Or find all PostScript files

```bash
find $HOME \( -name '*.ps' -o -name '*.eps' \) -print
```

We can also run a command for each file:

```bash
find rootdir -name filenamespec -exec command {} \; -print
# {} is the current filename
```
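
The commands above can be tried out without risk on a small, temporary directory tree. This sketch uses made-up file names. It counts the figure files and then removes them with `-exec ... {} +`, which passes many file names to one `rm` call instead of starting `rm` once per file.

```bash
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub"
touch "$workdir/src/a.ps" "$workdir/src/sub/b.eps" "$workdir/src/notes.txt"

# count the PostScript files in the tree
nfigs=$(find "$workdir" -type f \( -name '*.ps' -o -name '*.eps' \) | wc -l | tr -d ' ')
echo "found $nfigs figure files"

# remove them, then count what is left
find "$workdir" -type f \( -name '*.ps' -o -name '*.eps' \) -exec rm -f {} +
nleft=$(find "$workdir" -type f | wc -l | tr -d ' ')
echo "$nleft file(s) left"

rm -rf "$workdir"
```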

## Applications of find (1)


Find all files larger than 2000 blocks of 512 bytes (about 1 MB):

```bash
find $HOME -name '*' -type f -size +2000 -exec ls -s {} \;
```

Remove all these files:

```bash
find $HOME -name '*' -type f -size +2000 \
           -exec ls -s {} \; -exec rm -f {} \;
```

or ask the user for permission to remove:

```bash
find $HOME -name '*' -type f -size +2000 \
           -exec ls -s {} \; -ok rm -f {} \;
```

## Applications of find (2)


Find all files that have not been accessed for the last 90 days:

```bash
find $HOME -name '*' -atime +90 -print
```

and move these to /tmp/trash:

```bash
find $HOME -name '*' -atime +90 -print \
           -exec mv -f {} /tmp/trash \;
```

## Tar and gzip


The `tar` command can pack single files or all files in a directory tree into one file, which can be unpacked later:

```bash
tar -cvf myfiles.tar mytree file1 file2

# options:
# c: pack, v: list name of files, f: pack into file

# unpack the mytree tree and the files file1 and file2:
tar -xvf myfiles.tar

# options:
# x: extract (unpack)
```

The tarfile can be compressed:

```bash
gzip myfiles.tar
# result: myfiles.tar.gz
```
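
Packing and compressing can also be done in one step with the `z` option, which is supported by the common `tar` implementations (GNU tar and BSD tar). A round-trip sketch in a temporary directory, with made-up file names:

```bash
workdir=$(mktemp -d)
mkdir "$workdir/mytree"
echo "hello" > "$workdir/mytree/file1"

# c: create, z: compress with gzip, f: archive file name
# -C: change to this directory before packing
tar -czf "$workdir/mytree.tar.gz" -C "$workdir" mytree

# t: list the contents without unpacking
tar -tzf "$workdir/mytree.tar.gz"

# x: extract, here into a separate directory
mkdir "$workdir/unpacked"
tar -xzf "$workdir/mytree.tar.gz" -C "$workdir/unpacked"
cat "$workdir/unpacked/mytree/file1"
```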

## Two find/tar/gzip examples


Pack all PostScript figures:

```bash
tar -cvf ps.tar `find $HOME -name '*.ps' -print`
gzip ps.tar
```

Pack a directory, but remove CVS directories and redundant files first:

```bash
# take a copy of the original directory:
cp -r myhacks /tmp/oblig1-hpl
# remove CVS directories
find /tmp/oblig1-hpl -name CVS -print -exec rm -rf {} \;
# remove redundant files:
find /tmp/oblig1-hpl \( -name '*~' -o -name '*.bak' \
 -o -name '*.log' \) -print -exec rm -f {} \;
# pack files:
tar -cf oblig1-hpl.tar /tmp/oblig1-hpl
gzip oblig1-hpl.tar
# send oblig1-hpl.tar.gz as mail attachment
```