# EDAF35: Lecture 3

Contents:
- UNIX Shell Programming
- UNIX Commands

## Why Shell Programming?

- A program written for a shell is called a shell script.
- Shell scripts are (almost always) interpreted 
    - *(there is a company in the USA which sold shell-compilers but they now focus on selling C++ compilers instead)*
    - *see also the [Shell Script Compiler tool](http://www.linux-magazine.com/Online/Features/SHC-Shell-Compiler)*
- Shell programs have some advantages over C programs:
    - More convenient to write when dealing with files and text processing.
    - The building blocks of the shell are of course all the usual UNIX commands.
    - More portable.
- However, the shell is slower than compiled languages.

## Different Shells

- There are a number of shells.
- **Bourne shell** is the original but lacked many features *(e.g. name completion)*.
- The **csh** and **tcsh** have different syntax but were more advanced.
- The **Korn shell** was written at Bell Labs as a superset of Bourne shell but with modern features.
- The GNU program **Bourne Again Shell**, or bash, is similar to Korn shell.
- We will focus on bash.

## Bash as Login Shell

- Every user has a path to the login shell in the password file.
- When you log in with bash as your login shell, bash processes the following files:
    - `/etc/profile`
    - The first one found (in `$HOME`) of `.bash_profile`, `.bash_login`, `.profile`.
- When the login shell terminates, it will read the file `.bash_logout`.

In [1]:
cat /etc/profile

# System-wide .profile for sh(1)

if [ -x /usr/libexec/path_helper ]; then
	eval `/usr/libexec/path_helper -s`
fi

if [ "${BASH-no}" != "no" ]; then
	[ -r /etc/bashrc ] && . /etc/bashrc
fi


## Interactive Non-Login Shell

- An *interactive shell* is, of course, one into which the user types commands.
- A *non-interactive shell* is one which is executing a shell script.
- An interactive shell which is not the login shell executes the file `.bashrc`.
- There is a file `/etc/bashrc`, but it is not automatically read.
- To read it automatically, insert `source /etc/bashrc` in your `.bashrc`.

## Non-Interactive Shell

- Non-interactive shells do not start with reading a specific file.
- If the environment variable `$BASH_ENV` (or `$ENV` if the bash was started as `/bin/sh`) contains a file name, then that file is read.
- The first argument to bash itself contains the program name, so `echo $0` usually prints `bash` (or its path, e.g. `/bin/bash`).

In [3]:
echo $BASH_ENV
echo $ENV
echo $0



/bin/bash


## `Source` Builtin Command

- To make the current shell read and execute the commands in a file, use the `source filename` command.
- You can use `.` instead of `source`.
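
The difference between running a file and sourcing it can be sketched as follows. The settings file and the variable `greeting` are made up for this illustration.

```shell
#!/bin/bash
# Hypothetical settings file, created only for this demonstration.
settings=$(mktemp)
echo 'greeting=hello' > "$settings"

unset greeting
bash "$settings"                  # runs in a child process
after_child=${greeting-unset}     # the assignment did not reach this shell

source "$settings"                # runs in the current shell
after_source=${greeting-unset}    # now the variable is set here

. "$settings"                     # same effect as source

echo "after child: $after_child, after source: $after_source"
rm -f "$settings"
```

This is why start files such as `.bashrc` are sourced rather than executed.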

## Aliases and Noclobber

- UNIX commands perform their tasks without asking the user whether they really meant what they just typed. This is very convenient (most of the time).
- The `rm` command, for instance, has an option `-i` to ask for confirmation before a file is removed.
    - Sometimes people put the command `alias rm='rm -i'` in a bash start file.
- A similar feature is the command `set -o noclobber`, which prevents I/O redirection from overwriting an existing file (e.g. `ls > x`).
- All such features should be avoided (in my, *Jonas's*, opinion) since they just reduce productivity and make people think UNIX is a safe place.
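
For reference, a small sketch of what `noclobber` actually does, using a scratch directory and file names invented for the example:

```shell
#!/bin/bash
dir=$(mktemp -d)
echo first > "$dir/x"

set -o noclobber
# The redirection is refused, so the command fails and x is left untouched.
if { echo second > "$dir/x"; } 2>/dev/null; then
    overwrite=allowed
else
    overwrite=refused
fi
content_after_refusal=$(cat "$dir/x")

# >| overrides noclobber for a single redirection.
echo third >| "$dir/x"
content_after_force=$(cat "$dir/x")
set +o noclobber

echo "$overwrite: $content_after_refusal, then $content_after_force"
rm -rf "$dir"
```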

## I/O Redirection

- `< file` Use file as stdin.
- `> file` Use file as stdout.
- `>> file` Append output to file.
- `2> file` Use file as stderr.
- `2>&1` Close stderr and dup stdout to stderr.
- `cmd1 | cmd2` Use the stdout from `cmd1` as stdin for `cmd2` (aka *pipe*)
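
A minimal sketch of the redirections above. The helper function `both` and the file names are assumptions made for the example.

```shell
#!/bin/bash
dir=$(mktemp -d)

# Hypothetical helper: writes one line to stdout and one to stderr.
both() { echo out; echo err >&2; }

both > "$dir/o" 2> "$dir/e"     # stdout and stderr to separate files
echo more >> "$dir/o"           # append to the stdout file
both > "$dir/all" 2>&1          # both streams into one file

stdout_file=$(cat "$dir/o")
stderr_file=$(cat "$dir/e")
merged_lines=$(wc -l < "$dir/all" | tr -d ' ')
piped=$(both 2>/dev/null | tr 'a-z' 'A-Z')   # only stdout goes through the pipe

echo "$stderr_file / $merged_lines lines merged / $piped"
rm -rf "$dir"
```

Redirections are processed left to right, so `> file 2>&1` and `2>&1 > file` behave differently: in the second form stderr still goes to the original stdout.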

In [2]:
#echo 'Hello' > f1
echo ' world!' >> f1
cat < f1

Hello
 world!
 world!
 world!


## Shell Script Basics

- The first line should contain the line `#!/bin/bash`
- To make the script executable, use `chmod a+x file`.
- A line comment is started with #.
- Commands are separated with newline or semicolon `;`.
- Backslash `\` continues a command on the next line.
- Parentheses `()` group commands and let a new shell (a subshell) execute the group.

## More About Parentheses

- A subshell has its own copy of the shell state, such as shell variables and the current directory.
- The builtin `cd` does not read from stdin, so we can pipe as follows:
    `(cd ; ls) | (cd ~/Desktop; cat > ls-in-home)`
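
A short sketch showing that changes made inside parentheses do not leak into the parent shell. The variable `v` is made up for the example, and `$( )` is used only to capture what the subshell printed.

```shell
#!/bin/bash
start=$PWD
v=outer

# Capture what a subshell sees after changing directory and variable.
in_subshell=$(cd / && v=inner && echo "$PWD $v")

# Same changes in a plain subshell: they are lost when it exits.
(cd / ; v=inner)

echo "subshell saw: $in_subshell"
echo "parent still has: $PWD $v"
```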

In [6]:
(cd ; ls) | (cd ˜/Desktop; cat > ls-in-home)
cat ls-in-home

bash: cd: ˜/Desktop: No such file or directory
Box Sync
Desktop
Documents
Downloads
Dropbox
Library
Movies
Music
Pictures
Privat
Public
Qt
Sites
Terminal Saved Output
bin
gcviewer.properties
git
go
node_modules
target
temp


## Shell Variables
- Shell variables do not have to be declared — just assign to them:
```
$ a=unix
$ echo $a
$ b=wrong rm can have unexpected results
$ c="wrong rm can have unexpected results"
```

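- The difference between the last two assignments is significant: without quotes, `b=wrong` is set only in the environment of the command that follows (`rm can have unexpected results`), so `rm` is actually run. With quotes, the whole string is assigned to `c`.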
- A shell variable is by default local to the shell but can be exported to child processes using: 
```
$ export a
```
- C/C++ programs get the value using `char* value = getenv("VAR");`
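
The effect of `export`, and of an assignment placed in front of a command, can be sketched like this. A child `bash -c` stands in for any child process.

```shell
#!/bin/bash
unset a b
a=unix
not_exported=$(bash -c 'echo "${a-unset}"')   # the child does not see a

export a
exported=$(bash -c 'echo "${a-unset}"')       # now the child sees a

# An assignment placed before a command is passed to that command only.
prefixed=$(b=wrong bash -c 'echo "${b-unset}"')
after=${b-unset}                              # b is not set in this shell

echo "$not_exported $exported $prefixed $after"
```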

In [2]:
a=unix
echo $a
b=wrong rm can have unexpected results
echo $b
c="wrong rm can have unexpected results"
echo $c

unix
rm: can: No such file or directory
rm: have: No such file or directory
rm: unexpected: No such file or directory
rm: results: No such file or directory

wrong rm can have unexpected results


In [16]:
x="once upon" y="a time" bash -c 'echo $x $y'

once upon a time


## Using Shell Variables

- Use a dollar sign before the name to get the value: `$HOME`.
- If you wish to concatenate a shell variable and a string, use `${VAR}suffix`
    - without `{}` the shell looks up the wrong identifier (`$VARsuffix`), which is usually undefined.

In [19]:
b=bumble
echo $b
echo ${b}bee
echo $bbee

bumble
bumblebee



## More about Using Shell Variables

- The value of `${var-thing}` is `$var` if var is defined, otherwise `thing` (which itself undergoes variable expansion). The value of var is unchanged.
- The value of `${var=thing}` is `$var` if var is defined, otherwise `thing`, and var is set to thing.
- The value of `${var+thing}` is thing if `var` is defined, otherwise nothing.
- The value of `${var?message}` is `$var` if var is defined, otherwise the message is printed and a non-interactive shell exits.

In [3]:
echo ${a-something}
echo ${d-nothing}
echo $d
echo ${e=everything}
echo $e
echo ${d?Variable d not defined}

unix
nothing

everything
everything
bash: d: Variable d not defined



## PS1 and PS2
- The prompts `$` and `>` are called the primary and secondary prompts. These are their original default values, and they are stored in `PS1` and `PS2`.
- For the root user, the prompt is `#`.
- It is possible to get a more informative prompt by using escapes, e.g. `PS1="\w "`:
    - `\$` `#` if root, otherwise `$`.
    - `\!` Current history number (see below).
    - `\w` Pathname of working directory.
    - `\W` Basename of working directory.
    - `\h` Hostname.
    - `\H` Hostname including domain.
    - `\u` User.
    - `\t` 24-hour time.
    - `\d` Date.
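
As an illustration only, a prompt combining several of these escapes might be set like this in a start file; the exact layout is a matter of taste.

```shell
# Possible lines for ~/.bashrc: user@host:working-directory, then $ (or # for root).
PS1='\u@\h:\w\$ '
# Secondary prompt, shown while a command continues on the next line.
PS2='> '
```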

## Reexecuting Commands with a Builtin Editor
- To reexecute a command, use the builtin editor (vi or emacs mode) as specified in your `.inputrc` file.
- `.inputrc` can contain e.g. `set editing-mode vi`.
- Using the editor is very convenient since you can change the command if it didn’t work as expected. Simply hit ESC (for vi).
- This is a convenient way to experiment with new commands.

## Reexecuting Commands with an Exclamation
Commands available include:
- `!!` Reexecute most recent command.
- `!n` Reexecute command number n.
- `!-n` Reexecute the nth preceding command.
- `!string` Redo the most recent command starting with string.
- `!?string` Redo the most recent command containing string.
- The last word of the previous command can be referred to as `!$`.

Check also `history`

In [3]:
ls
ls f1
ls -al f1
!!
!557
!-2

EDAF35 Lecture 3.ipynb	ls-in-home		svib
a.c			svi
f1			svia
f1
-rw-r--r--  1 flagr  staff  30 Mar 27 10:29 f1
ls -al f1
-rw-r--r--  1 flagr  staff  30 Mar 27 10:29 f1
bash: !557: event not found
ls -al f1
-rw-r--r--  1 flagr  staff  30 Mar 27 10:29 f1


In [15]:
ls -al ls-in-home
cat !$

-rw-r--r--  1 flagr  staff  176 Mar 26 11:07 ls-in-home
cat ls-in-home
Box Sync
Desktop
Documents
Downloads
Dropbox
Library
Movies
Music
Pictures
Privat
Public
Qt
Sites
Terminal Saved Output
bin
gcviewer.properties
git
go
node_modules
target
temp


In [5]:
history
!39

   21  cd somestupid/
   22  ls
   23  less file5
   24  less file54
   25  cd..
   26  cd ..
   27  ls
   28  man perror
   29  man waitpid
   30  man snprintf
   31  ls
   32  cd ..
   33  ls
   34  cd 2018/
   35  ls
   36  cd labs/
   37  ls
   38  cd lab3_fs/
   39  ls
   40  cd myfs/
   41  ls
   42  less file54 
   43  cd ..
   44  ls -al
   45  less TEST_FS 
   46  ls
   47  cd myfs/
   48  ls
   49  cd ..
   50  ls
   51  man atoi
   52  man itoa
   53  man itoa
   54  ls
   55  cd myfs/
   56  ls
   57  ls
   58  cd..
   59  ls myfs
   60  ls myfs&
   61  ls myfs/
   62  cd ..
   63  ls
   64  cd myfs/
   65  ls
   66  cd ..
   67  man snprintf
   68  cd myfs/
   69  ls
   70  ls
   71  ls
   72  cd ..
   73  ls
   74  cd myfs/
   75  ls
   76  cat 0
   77  cd ..
   78  ls
   79  ls
   80  cd myfs/
   81  ls
   82  cd ..
   83  ls
   84  cd myfs/
   85  ls
   86  ls -al
   87  less 1
   88  ls -al
   89  ls -al
   90  cd ..
   91  ls
   92  cd myfs/
   93  ls
   94  less 3
  

  414  )
  415  echo "$variable"
  416  echo $?
  417  function fun() { echo $1 # echo first argument; echo $2 # echo second argument; }
  418  fun ha hi
  419  fun he ho hu
  420  fun hi
  421  echo $?
  422  for x in *.c; do     lpr $x; done
  423  echo $?
  424  for x in *; do     lpr $x; done
  425  echo $?
  426  for x in *; do     echo $x; done
  427  echo $?
  428  for x in *; do     echo $x; done
  429  echo $?
  430  a="x y z"
  431  for s in $a; do     echo $s; done
  432  echo $?
  433  for s in a b c; do     echo $s; done
  434  echo $?
  435  ls -l | cut -c2-10
  436  echo $?
  437  ls -l | cut -c2-10 -c51-55
  438  echo $?
  439  ls
  440  echo $?
  441  find . -name ’*.ipynb’
  442  echo $?
  443  find . -name '*.ipynb'
  444  echo $?
  445  wc `find . -name '*.ipynb'`
  446  echo $?
  447  man find
  448  echo $?
  449  awk ’{ print $1, $5; }’
  450  echo $?
  451  awk '{ print $1, $5; }'
  452  echo a b c d e | awk '{ print $1, $5; }'
  453  echo $?
  454  echo a b c d

## Quotation Marks
- There are three kinds of quotation marks:
    - in a string enclosed by `"`: variables are expanded.
    - in a string enclosed by `'`: variables are not expanded.
    - the value of `` `string` `` is the stdout from executing string as a command, with all trailing newline characters removed:
```
$ rm `du -ks * | sort -n | awk '{ print $2 }'` # remove big file/directory
```
*Note:* the last form (backquotes) is equivalent to `$(command)`.
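
A small sketch of the three kinds of quotes; the variable names are made up for the example.

```shell
#!/bin/bash
name=world
double="hello $name"              # variables are expanded
single='hello $name'              # taken literally
old_style=`printf 'a b\n\n\n'`    # backquotes: trailing newlines are removed
new_style=$(printf 'a b\n\n\n')   # same result with $( )

echo "$double"
echo "$single"
echo "[$old_style] [$new_style]"
```

The `$( )` form is easier to nest, since it needs no escaping of inner backquotes.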


In [6]:
du -ks * | sort -n | awk '{ print $2 }'

a.c
f1
ls-in-home
svi
svia
svib
EDAF35


In [7]:
echo $(du -ks * | sort -n | awk '{ print $2 }')

a.c f1 ls-in-home svi svia svib EDAF35


In [8]:
echo `du -ks * | sort -n | awk '{ print $2 }'`

a.c f1 ls-in-home svi svia svib EDAF35


## Here Documents
- Sometimes it can be useful to provide the input to a script in the script file. The input is right "here".
```
$ cat phone
grep "$*" <<End
Office 046 222 9484
Mobile 0767 888 124
$X
End
```
- The script above contains both the command and the input.
- The variable `X` is expanded; suppress this behaviour by preceding `End` with a backslash on the first line (`<<\End`).
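
The effect of the backslash can be sketched as follows; `X` and the delimiter `End` follow the example above, the other names are invented.

```shell
#!/bin/bash
X=expanded

# Plain delimiter: $X is expanded inside the here document.
with=$(cat <<End
value: $X
End
)

# Backslashed delimiter: the text is passed through literally.
without=$(cat <<\End
value: $X
End
)

echo "$with"
echo "$without"
```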

In [24]:
variable=$(cat <<SETVAR
This variable
runs over multiple lines.
SETVAR
)

echo "$variable"

This variable
runs over multiple lines.


### broadcast: Sends message to everyone logged in
```
#!/bin/bash

wall <<zzz23EndOfMessagezzz23
E-mail your noontime orders for pizza to the system administrator.
    (Add an extra dollar for anchovy or mushroom topping.)
# Additional message text goes here.
# Note: 'wall' prints comment lines.
zzz23EndOfMessagezzz23

# Could have been done more efficiently by
#         wall <message-file
#  However, embedding the message template in a script
#+ is a quick-and-dirty one-off solution.

exit
```


More about [here documents](http://tldp.org/LDP/abs/html/here-docs.html).

## Functions
```
function fun()
{
echo $1 # echo first argument
echo $2 # echo second argument
}
```
- The keyword `function` is optional.
- A function must be declared before it can be used.
- A function can be used as if it was any other UNIX command, i.e. no parentheses when the function is called 
  (not even for passing arguments).
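
A minimal sketch of defining and calling a function; `join2` is a hypothetical name chosen for the example.

```shell
#!/bin/bash
# Hypothetical function: joins its first two arguments with a dash.
join2() {
    echo "$1-$2"
}

two=$(join2 ha hi)        # called like a command, no parentheses
three=$(join2 he ho hu)   # extra arguments are simply ignored
one=$(join2 hiii)         # a missing argument expands to nothing

echo "$two $three $one"
```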

In [10]:
function fun()
{
echo $1 # echo first argument
echo $2 # echo second argument
echo $0
}

fun ha hi
fun he ho hu
fun hiii

ha
hi
/bin/bash
he
ho
/bin/bash
hiii

/bin/bash


## Simple Shell Syntax
- `a && b` executes `b` only if `a` succeeds (i.e. returns 0).
- `a || b` executes `b` only if `a` fails (i.e. returns nonzero).
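
A tiny sketch using the builtins `true` and `false` as stand-ins for commands that succeed and fail:

```shell
#!/bin/bash
log=""
true  && log="$log and-after-success"
false && log="$log and-after-failure"    # skipped
false || log="$log or-after-failure"
true  || log="$log or-after-success"     # skipped
echo "ran:$log"
```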

The following commands can cause unhappiness if you run out of disk space during tar:
```
$ tar cf dir.tar dir; rm -rf dir; bzip2 -9v dir.tar
```

This is better:
```
$ tar cf dir.tar dir && rm -rf dir && bzip2 -9v dir.tar
```
Edit-compile-run without leaving the keyboard:
```
vi a.c && gcc a.c && ./a.out
```
But it is better to remap e.g. v, V, or t in `vi` to run `make`.

## For Loops
Iterate through certain files in the current directory:
```
for x in *.c
do
    lpr $x # prints them
done
```

or through all arguments passed to a script:
```
for x in $*
do
    lpr $x
done
```
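
When arguments may contain spaces, `$*` splits them into separate words. A common alternative, shown here as a sketch with made-up function names, is the quoted form `"$@"`, which keeps each argument intact.

```shell
#!/bin/bash
# Both helpers count how many times the loop body runs.
count_star() { n=0; for x in $*;   do n=$((n + 1)); done; echo "$n"; }
count_at()   { n=0; for x in "$@"; do n=$((n + 1)); done; echo "$n"; }

star=$(count_star "my file.c" b.c)   # "my file.c" is split in two
at=$(count_at "my file.c" b.c)       # each argument stays whole

echo "\$* gives $star iterations, \"\$@\" gives $at"
```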

In [11]:
for x in *
do
    echo $x
done

EDAF35 Lecture 3.ipynb
a.c
f1
ls-in-home
svi
svia
svib


You can also iterate through a string:

In [12]:
a="x y z v"
for s in $a
do
    echo $s
done

x
y
z
v


Or simply a list:

In [13]:
for s in a b c b
do
    echo $s
done

a
b
c
b


## While and Until
```
while command
do
    body # do body while command returns true
done

until command
do
    body # do body while command returns false
done
```
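
A concrete sketch of both loops, assuming the `[ ]` test command and `$(( ))` arithmetic, which are not introduced in these notes:

```shell
#!/bin/bash
i=0
up=""
while [ "$i" -lt 3 ]      # loop while the test succeeds
do
    up="$up$i"
    i=$((i + 1))
done

j=3
down=""
until [ "$j" -eq 0 ]      # loop until the test succeeds
do
    down="$down$j"
    j=$((j - 1))
done

echo "$up $down"
```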

## If-Then-Else-Fi
```
if command
then
    then-commands
[else
    else-commands]
fi

if ! command
then
    then-commands
[else
    else-commands]
fi
```
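
The condition is any command; its exit status decides the branch. A sketch using `grep -q`, which prints nothing and only reports through its exit status:

```shell
#!/bin/bash
if echo "hay needle hay" | grep -q needle
then
    found=yes
else
    found=no
fi

if ! echo "hay" | grep -q needle
then
    missing=yes
else
    missing=no
fi

echo "found=$found missing=$missing"
```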

## Case
```
case word in
pattern1) commands;;
pattern2) commands;;
*) commands;;
esac
```

- Nothing happens if no pattern matches: putting `*)` last makes a default.
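
A small sketch of `case` with glob patterns and a default; the function `classify` is invented for the example.

```shell
#!/bin/bash
# Hypothetical helper: describes a file by its suffix.
classify() {
    case "$1" in
    *.c) echo "C source";;
    *.h) echo "C header";;
    *)   echo "unknown";;
    esac
}

src=$(classify main.c)
hdr=$(classify util.h)
other=$(classify notes.txt)
echo "$src / $hdr / $other"
```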

### Longer example: 
This is an excerpt of the script that starts Anacron, a daemon that runs commands periodically with a frequency specified in days.
```
case "$1" in
        start)
            start
            ;;
         
        stop)
            stop
            ;;
         
        status)
            status anacron
            ;;
        restart)
            stop
            start
            ;;
        condrestart)
            if test "x`pidof anacron`" != x; then
                stop
                start
            fi
            ;;
         
        *)
            echo $"Usage: $0 {start|stop|restart|condrestart|status}"
            exit 1
 
esac
```

## cmp, diff, and ndiff
- `cmp` reports whether two files are equal.
- `diff` does the same but also shows how they differ.
- `ndiff` is a variant for which one can specify numerical differences which should be ignored.
    - `ndiff` is not standard but easy to find.
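
A sketch comparing small scratch files (names invented for the example). `cmp -s` is silent and reports only through its exit status; `diff` exits nonzero when the files differ, hence the `|| true`.

```shell
#!/bin/bash
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one"
printf 'a\nb\n' > "$dir/same"
printf 'a\nc\n' > "$dir/two"

if cmp -s "$dir/one" "$dir/same"; then equal=yes; else equal=no; fi
if cmp -s "$dir/one" "$dir/two";  then differ=no; else differ=yes; fi

# diff shows which lines changed.
changes=$(diff "$dir/one" "$dir/two" || true)

echo "equal=$equal differ=$differ"
echo "$changes"
rm -rf "$dir"
```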

## cut
- `cut` cuts out characters from each line of stdin.
- `ls -l | cut -c2-10` prints the rwx-flags of the files.
- The first character on a line is c1.
- Multiple ranges can be specified: `ls -l | cut -c2-10 -c51-55` also prints five characters from the file name. Some `cut` implementations accept only one list; there, write the ranges comma-separated as `-c2-10,51-55`.
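
A sketch on a fixed string, which makes the character positions easy to check:

```shell
#!/bin/bash
line="abcdefghij"
first=$(echo "$line" | cut -c1)            # character 1
picked=$(echo "$line" | cut -c2-4,8-9)     # characters 2-4 and 8-9
echo "$first $picked"
```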

In [15]:
ls -l | cut -c2-10
ls -l

otal 152
rw-r--r--
rw-r--r--
rw-r--r--
rw-r--r--
rwxr-xr-x
rwxr-xr-x
rwxr-xr-x
total 152
-rw-r--r--  1 flagr  staff  50934 Mar 27 11:33 EDAF35 Lecture 3.ipynb
-rw-r--r--  1 flagr  staff     14 Mar 26 14:58 a.c
-rw-r--r--  1 flagr  staff     30 Mar 27 10:29 f1
-rw-r--r--  1 flagr  staff    176 Mar 26 11:07 ls-in-home
-rwxr-xr-x  1 flagr  staff     93 Mar 26 15:24 svi
-rwxr-xr-x  1 flagr  staff     81 Mar 26 15:26 svia
-rwxr-xr-x  1 flagr  staff     93 Mar 26 15:26 svib


In [16]:
ls -l | cut -c2-10 -c51-55

otal 152
rw-r--r--F35 L
rw-r--r--
rw-r--r--
rw-r--r--in-ho
rwxr-xr-x
rwxr-xr-xa
rwxr-xr-xb


## find
Example: `find . -name '*.c'`
The output will be a list of files (with their path from the starting directory) with suffix `.c`.

We can feed that list to wc using: `` wc `find . -name '*.java'` ``
The default action is to print the file name.

A number of criteria can be specified, including
- `-anewer filename` selects files accessed more recently than filename was modified (`-newer filename` compares modification times).
- `-type type` selects files of type type, which is one of b, c, d, f, l, p, or s (with the same meaning as printed by `ls -l`: block special file (e.g. disk), character special file (e.g. usb port), directory, ordinary file, symbolic link, named pipe, or socket).
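
A sketch of `-name` and `-type` on a scratch directory tree; all file names are invented for the example.

```shell
#!/bin/bash
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/a.c" "$dir/sub/b.c" "$dir/notes.txt"

c_files=$(cd "$dir" && find . -name '*.c' | sort)
dirs=$(cd "$dir" && find . -type d | sort)
plain=$(cd "$dir" && find . -type f | wc -l | tr -d ' ')

echo "$c_files"
echo "$dirs"
echo "$plain ordinary files"
rm -rf "$dir"
```

The pattern is quoted so that `find`, not the shell, expands it.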

In [17]:
find . -name '*.ipynb'
find . -name '*.c'

./EDAF35 Lecture 3.ipynb
./.ipynb_checkpoints/EDAF35 Lecture 3-checkpoint.ipynb
./a.c


## cleanfiles
```
find . -name '*.tac.???' -exec rm '{}' \;
find . -name '*.pr' -exec rm '{}' \;
find . -name cmd.gdb -exec rm '{}' \;
find . -name '*.ps' -exec rm '{}' \;
find . -name '*.dot' -exec rm '{}' \;
find . -name '*.aux' -exec rm '{}' \;
find . -name '*.o' -exec rm '{}' \;
find . -name out -exec rm '{}' \;
find . -name x -exec rm '{}' \;
find . -name y -exec rm '{}' \;
find . -name a.out -exec rm '{}' \;
find . -name 'cachegrind.out.*' -exec rm '{}' \;
```

Have a look at `man find`

## awk
- Stands for Aho (from the [Dragonbook](https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools)), Weinberger (from hashpjw in the Dragonbook), and Kernighan (K in K&R C).
- Each line of input is separated into fields, which are denoted `$1,$2,...`
  Assume a variable is called `X` and has value `2`. Then `$X` refers to the second field.
- The entire line is `$0`, number of fields on a line is denoted `NF`, and line number is `NR`.
- Each line in an `awk` program has *a pattern* and *an action*.
  If a line in the input matches the pattern, the action is executed.
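
A sketch of pattern-action rules, of `NR` and `NF`, and of `$X` with a variable; the input lines are made up.

```shell
#!/bin/bash
input='3 apples
12 pears
20 plums'

# Pattern $1 > 10 selects lines; the action prints line number and field 2.
big=$(echo "$input" | awk '$1 > 10 { print NR, $2 }')

# With X = 2, $X is the second field.
second=$(echo "$input" | awk 'NR == 1 { X = 2; print $X }')

# NF is the number of fields on the line.
fields=$(echo "$input" | awk 'NR == 1 { print NF }')

echo "$big"
echo "$second $fields"
```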

## Example awk programs
```
$ awk '{ print $1, $5; }' # print first and fifth item.
$ awk '$1 > 10 { print $1, $2; }' # print first two items if $1 is > 10.
$ awk 'NR == 10' # print tenth line.
$ awk 'NF > 4' # print each line with > 4 fields.
$ awk 'NF > 0' # print each nonempty line.
$ awk '$NF > 4' # print each line with last field > 4.
$ awk '/abc/' # print each line containing abc.
$ awk '/abc/ { n = n + 1; }
  END { print n;}' # print number of lines containing abc.
$ awk 'length($0) > 80' # print each line longer than 80 bytes.
```

The `END` pattern matches at `EOF`. There is also a `BEGIN` pattern which is matched before the first character is read.
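
A sketch with all three kinds of rules in one program, summing a column of made-up numbers:

```shell
#!/bin/bash
report=$(printf '1\n2\n3\n' | awk '
    BEGIN { print "start" }      # runs before any input is read
          { sum += $1 }          # no pattern: runs for every line
    END   { print "sum", sum }   # runs after the last line
')
echo "$report"
```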

In [18]:
echo a b c d e | awk '{ print $1, $5; }'


a e


## head and tail
- `head` prints the first 10 lines of a file (or stdin).
- `head -100` prints the first 100 lines of a file (or stdin).
- `tail` prints the last 10 lines of a file (or stdin).
- `tail -100` prints the last 100 lines of a file (or stdin).
- `tail -f file` like normal tail but at EOF waits for more data.
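
A sketch on five generated lines. It uses the `-n N` spelling, which, as far as I know, is the form specified by POSIX for what is written `-N` above.

```shell
#!/bin/bash
numbers=$(printf '%s\n' 1 2 3 4 5)
first_two=$(echo "$numbers" | head -n 2)
last_two=$(echo "$numbers" | tail -n 2)
third=$(echo "$numbers" | head -n 3 | tail -n 1)   # combine both to pick one line
echo "$first_two / $last_two / $third"
```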

## od
- Octal dump
- `od file` dumps the file contents on stdout as octal numbers.
- `od -c file` prints file as characters.
- `od -x file` prints file as hex numbers.

## sed
- stream editor.
- It can be useful for e.g. changing prefixes in a `Yacc` generated parser:
```
sed 's/yydebug/pp_debug/g' y.tab.c > tmp; mv tmp y.tab.c
```

In [19]:
echo a b c d aa | sed 's/a/Hahahah/g' 

Hahahah b c d HahahahHahahah


## grep
- Grep searches for a pattern in files.
- GNU `grep` has the useful `-r` option which traverses directories.
- In *basic regular expressions* ?, +, braces, parentheses and bar (i.e. |) have no special meaning. Backslash them to get that.
- In *extended regular expressions*, enabled with `-E`, above characters are special. More about that on next slide.
```
$ grep abc # matches line with abc.
$ grep -e '[abc]' # matches line containing any of a, b, or c.
$ grep -e '[^abc]' # matches line containing some character other than a, b, or c.
$ grep -e '[^ab-d]' # matches line containing some character other than a, b, c, or d.
$ grep 'ab*c' # matches line with ac, abc, abbbbbc.
```
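
A sketch that counts matching lines with `grep -c` on made-up input. To select lines containing *none* of a set of characters, one option is to invert the match with `-v`.

```shell
#!/bin/bash
input='abc
xyz
axb'

any=$(echo "$input" | grep -c '[abc]')       # abc and axb contain a, b, or c
other=$(echo "$input" | grep -c '[^abc]')    # xyz and axb contain some other character
none=$(echo "$input" | grep -vc '[abc]')     # only xyz contains none of a, b, c
star=$(printf 'ac\nabbbc\nadc\n' | grep -c 'ab*c')   # ac and abbbc match, adc does not

echo "$any $other $none $star"
```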

In [20]:
grep abc EDAF*

      "  466  grep abc EDAF*\n",
      "  468  grep abc EDAF*\n",
      "  470  grep '[^abc]' EDA*\n",
      "  472  grep -e '[^abc]' EDA*\n",
      "  474  grep -e '[^abc]' f1\n",
    "$ awk ’/abc/ ’ # print each line containing abc.\n",
    "$ awk ’/abc/ { n = n + 1; }\\\n",
    "  END { print n;}’ # print number of lines containing abc.\n",
    "$ grep abc # matches line with abc.\n",
    "$ grep -e ’[abc]’ # matches line with any of a, b, or c.\n",
    "$ grep -e ’[^abc]’ # matches line with none of a, b, or c.\n",
    "$ grep ab*c # matches line with ac, abc, abbbbbc.\n",
      "    \"$ awk ’/abc/ ’ # print each line containing abc.\\n\",\n",
      "    \"$ awk ’/abc/ { n = n + 1; }\\\\\\n\",\n",
      "    \"  END { print n;}’ # print number of lines containing abc.\\n\",\n",
      "    \"$ grep abc # matches line with abc.\\n\",\n",
      "    \"$ grep -e ’[abc]’ # matches line with any of a, b, or c.\\n\",\n",
      "    \"$ grep -e ’[^abc]’ # matches line with none of a, b, or

In [21]:
cat f1
echo ----------------------
grep -e '[^leoH]' f1

Hello
 world!
 world!
 world!
----------------------
 world!
 world!
 world!


In [15]:
grep -e '[^leo]' f1

Hello
 world!


## grep -E
```
$ grep -E -e 'a|b' # matches line with a or b.
$ grep -E -e 'a|bc' # matches line with a or bc.
$ grep -E -e '(a|b)c' # matches line with a or b, followed by c.
$ grep -E -e '(a|b)?c' # ? denotes optional item.
$ grep -E -e '(a|b)+c' # + denotes at least once.
$ grep -E -e '(a|b)*c' # * denotes zero or more.
$ grep -E -e '(a|b){4}c' # {4} matches pattern four times.
```

- Without `-E`, use a backslash before the above metacharacters.
- Without the quotes `'`, the shell would interpret `|` as a *pipe*.
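
A sketch counting matches on made-up input. The anchors `^` and `$` (start and end of line) are an addition not covered above; they make the effect of `?`, `+` and `{n}` visible, since without them a pattern may match anywhere in the line.

```shell
#!/bin/bash
input='ac
bc
c
abc
xyz'

alt=$(echo "$input" | grep -E -c '(a|b)c')           # ac, bc, abc
optional=$(echo "$input" | grep -E -c '^(a|b)?c$')   # ac, bc, c
plus=$(echo "$input" | grep -E -c '^(a|b)+c$')       # ac, bc, abc
twice=$(echo "$input" | grep -E -c '^(a|b){2}c$')    # abc

echo "$alt $optional $plus $twice"
```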

## sort and uniq
- `sort file` sorts a file alphabetically.
- `sort -n file` sorts a file numerically.
- `uniq` removes duplicate lines when they are adjacent.
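
Because `uniq` only compares neighbouring lines, input is normally sorted first. A sketch on made-up input:

```shell
#!/bin/bash
words=$(printf 'b\na\nb\na\n')

unsorted=$(echo "$words" | uniq | wc -l | tr -d ' ')         # no adjacent duplicates, nothing removed
sorted=$(echo "$words" | sort | uniq | wc -l | tr -d ' ')    # duplicates are adjacent after sorting

alpha=$(printf '10\n9\n' | sort | head -n 1)        # as text, "10" sorts before "9"
numeric=$(printf '10\n9\n' | sort -n | head -n 1)   # as numbers, 9 comes first

echo "$unsorted $sorted $alpha $numeric"
```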

In [22]:
sort f1

 world!
 world!
 world!
Hello


In [18]:
cat svi

#!/bin/bash
vi -c /$1 `egrep -e $1 *.[ch] */*.[ych] | awk -F: '{ print $1; }' | uniq | sort`


### What does the above script do?