# Shell scripting


## Introduction

Instead of typing all the UNIX commands we need to perform one after the other, we can save them all in a file (a "script") and execute them all at once. Recall from the [UNIX and Linux Chapter](./01-Unix.ipynb#Meet-the-UNIX-shell) that the bash shell (or terminal) is a text command processor that interfaces with the Operating System. The bash shell provides a computer language that can be used to build scripts (AKA shell scripts) that can be run through the terminal.

### What shell scripts are good for

It is possible to write reasonably sophisticated programs with shell scripting, but the bash language does not have enough features to replace a "proper" language like C, Python, or R. However, you will find shell scripting necessary. This is because, as you saw in the previous chapter, UNIX has an incredibly powerful set of tools that can be used through the bash terminal. Shell scripts allow you to automate the usage of these commands and create your own simple utility tools/scripts/programs for tasks such as backups, converting file formats, and handling and manipulating files and directories. This enables you to perform many everyday tasks on your computer without having to invoke another language that might require installation or updating.

  

## Your first shell script

Let's write our first shell script.

$\star$ Write and save a file called `boilerplate.sh` in `CMEECourseWork/week1/code`, and add the following script to it
(type it in your code editor):

```bash
#!/bin/bash
# Author: Your Name your.login@imperial.ac.uk
# Script: boilerplate.sh
# Desc: simple boilerplate for shell scripts
# Arguments: none
# Date: Oct 2019

echo -e "\nThis is a shell script! \n" #what does -e do?

#exit

```

The `.sh` extension is not necessary, but it is useful for you and your programming IDE (e.g., Visual Studio Code, Emacs, etc.) to identify the file type.
* The first line is a "shebang" (or sha-bang or hashbang or pound-bang or hash-exclam or hash-pling! – Wikipedia). It tells the operating system which interpreter to run the script with, in this case bash. It can also be written as `#!/bin/sh`, but `sh` is not guaranteed to be bash on every system, so bash-specific features may then not work.
* The hash marks at the start of the following lines tell the interpreter to ignore the rest of each of those lines. This is how you add script documentation (who wrote the script and when, what the script does, etc.) and comments on particular lines of the script.
* Note that there is a commented-out `exit` command at the end of the script. Uncommenting it will not change the behavior of the script, but it will allow you to generate an error code and, if the command is inserted in the middle of the script, to stop the code at that point. To find out more, see [this](https://bash.cyberciti.biz/guide/The_exit_status_of_a_command) and [this](https://stackoverflow.com/questions/1378274/in-a-bash-script-how-can-i-exit-the-entire-script-if-a-certain-condition-occurs) in particular.
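
As a minimal sketch of how exit codes behave (the commands here are illustrative and not part of `boilerplate.sh`), the special variable `$?` holds the exit status of the most recent command. Here `exit` is run inside a child `bash -c` process so that it does not close your current terminal session:

```bash
#!/bin/bash
# Sketch only: illustrative commands, not part of boilerplate.sh.

# A successful command sets the exit status to 0
true
ok_status=$?

# Run `exit 3` in a child shell so that it does not end the current session
bash -c 'exit 3'
fail_status=$?

echo "Status after 'true': $ok_status"
echo "Status after 'exit 3': $fail_status"

# Commands placed after an exit are never reached
output=$(bash -c 'echo "before exit"; exit 1; echo "after exit"')
echo "$output"
```

By convention, a status of 0 means success and any non-zero value signals an error.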

Next, let's run this script.

## Running shell scripts

There are two ways of running a script:

1. Call the interpreter bash to run the file:

```bash
bash myscript.sh
```

(You can also use ```sh myscript.sh```, although on some systems `sh` is a different shell from bash.)

Use this approach if the script does something specific to a given project.

2.  Make the script executable and execute it:

```bash 
chmod +x myscript.sh
./myscript.sh   # ./ is needed because the current directory is usually not in PATH
```
Use this second approach for a script that does something generic and is likely to be reused again and again (*Can you think of examples?*).

Generic scripts of type (2) can be saved in `/home/username/bin/` (that is, `~/bin`), and made easily accessible by telling UNIX to look in that directory for commands:

```bash
mkdir -p ~/bin   # -p: no error if the directory already exists
PATH=$PATH:$HOME/bin
```
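
The following sketch shows the effect of adding a directory to `PATH`. To avoid touching your real `~/bin`, it assumes a temporary directory as a stand-in, and `hello.sh` is a hypothetical script name:

```bash
#!/bin/bash
# Sketch only: a temporary directory stands in for ~/bin, and hello.sh
# is a hypothetical script created just for this demonstration.

demo_bin=$(mktemp -d)

# Create a tiny script and make it executable
printf '#!/bin/bash\necho "Hello from hello.sh"\n' > "$demo_bin/hello.sh"
chmod +x "$demo_bin/hello.sh"

# Append the directory to PATH (affects the current session only)
PATH=$PATH:$demo_bin

# The script can now be called by name from any directory
greeting=$(hello.sh)
echo "$greeting"

# command -v reports where the shell found the command
command -v hello.sh
```

A `PATH` set this way lasts only for the current terminal session; a common way to make it permanent is to add the same line to `~/.bashrc`.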

So let's run your first shell script.

$\star$ `cd` to your `code` directory, and run it (here I am assuming you are in `sandbox` or `data`, continuing where you [left off](./01-Unix.ipynb#Using-grep) in the Unix and Linux Chapter):

In [1]:
cd ../code
bash boilerplate.sh


This is a shell script! 



## A useful shell-scripting example

Let's write a shell script to transform comma-separated files (csv) to tab-separated files and vice-versa. This can be handy: for example, in certain computer languages (e.g., `C`), it is much easier to read tab- or space-separated files than csv.

To do this in bash, we can use `tr` (abbreviation of `tr`anslate or `tr`ansliterate), which deletes, substitutes, or squeezes repeated characters. Here are some examples.

In [2]:
echo "Remove    excess      spaces." | tr -s " "

Remove excess spaces.


In [3]:
echo "remove all the as" | tr -d "a"

remove ll the s


In [4]:
echo "set to uppercase" | tr "[:lower:]" "[:upper:]"

SET TO UPPERCASE


In [5]:
echo "10.00 only numbers 1.33" | tr -d "[:alpha:]" | tr -s " " ","

10.00,1.33


Now write a shell script called `tabtocsv.sh` in `Week1/Code` that substitutes all tabs with commas:

```bash
#!/bin/bash
# Author: Your name you.login@imperial.ac.uk
# Script: tabtocsv.sh
# Description: substitute the tabs in the files with commas
#
# Saves the output into a .csv file
# Arguments: 1 -> tab delimited file
# Date: Oct 2019

echo "Creating a comma delimited version of $1 ..."
cat "$1" | tr -s "\t" "," >> "$1.csv"
echo "Done!"
exit
```

Now test it (note where the output file gets saved and why). First create a text file with tab-separated text:

In [6]:
echo -e "test \t\t test" >> ../sandbox/test.txt # again, note the relative path!

Now run your script on it

In [7]:
bash tabtocsv.sh ../sandbox/test.txt

Creating a comma delimited version of ../sandbox/test.txt ...
Done!


Note that

* `$1` is the first positional parameter: a placeholder that the shell replaces with the first argument given to the script (in this case the filename). See the next section for more on variables in shell scripts.

* The new file gets saved in the same location as the original (*Why is that?*)

* The file got saved with a `.txt.csv` extension. That's not very nice. Later you will get an opportunity to fix this!
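
To make the placeholder idea concrete, here is a small sketch using a hypothetical function called `show_args`. Functions receive positional parameters in the same way scripts do, so `$1`, `$2`, `$#` (the number of arguments) and `$*` (all arguments) behave the same in both:

```bash
#!/bin/bash
# Sketch only: show_args is a hypothetical function, and output.csv is a
# made-up file name; nothing is read from or written to disk.

show_args() {
    echo "Number of arguments: $#"
    echo "First argument: $1"
    echo "Second argument: $2"
    echo "All arguments: $*"
}

show_args ../sandbox/test.txt output.csv
```

Calling `show_args` with no arguments prints a count of 0 and leaves `$1` and `$2` empty, which is what happens inside a script that is run without inputs.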

## Variables in shell scripts

There are three ways to assign values to variables (note the lack of spaces around `=`!):

1.  Explicit declaration: `MYVAR=myvalue`
2.  Reading from the user: `read MYVAR`
3.  Command substitution: `MYVAR=$(ls | wc -l)`

Here are some examples of assignments (try them out and save as a single  `Week1/Code/variables.sh` script):

```bash
#!/bin/bash

# Shows the use of variables
MyVar='some string'
echo 'the current value of the variable is' "$MyVar"
echo 'Please enter a new string'
read MyVar
echo 'the current value of the variable is' "$MyVar"

## Reading multiple values
echo 'Enter two numbers separated by space(s)'
read a b
echo "you entered $a and $b. Their sum is:"
mysum=$(expr "$a" + "$b")   # command substitution; expr does integer arithmetic
echo "$mysum"
```

And also (save as `Week1/Code/MyExampleScript.sh`):

```bash
#!/bin/bash

msg1="Hello"
msg2=$USER
echo "$msg1 $msg2"
echo "Hello $USER"
echo
```

### Some more examples

Here are a few more illustrative examples (test each one out, save in `week1/code/` with the given name):

#### Count lines in a file

Save this as `CountLines.sh`:

```bash
#!/bin/bash

NumLines=$(wc -l < "$1")
echo "The file $1 has $NumLines lines"
echo
```
The `<` redirects the contents of the file to the stdin ([standard input](https://en.wikipedia.org/wiki/Standard_streams)) of the command `wc -l`. It is needed here because without it, you would not be able to catch *just* the numerical output (number of lines). To see this, try deleting `<` from the script and see what the output looks like (it will also print the name of the input file, which you do not want).
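
The difference can be seen in a short sketch that, as an assumption for illustration, works on a temporary three-line file rather than a real data file:

```bash
#!/bin/bash
# Sketch only: uses a temporary file created for the demonstration.

tmpfile=$(mktemp)
printf 'line one\nline two\nline three\n' > "$tmpfile"

# With a file name argument, wc prints the count followed by the name
with_name=$(wc -l "$tmpfile")

# Reading from stdin, wc has no name to print, so only the count remains
count_only=$(wc -l < "$tmpfile")

echo "With file name argument: $with_name"
echo "With input redirection: $count_only"

rm "$tmpfile"
```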

#### Concatenate the contents of two files

Save this as `ConcatenateTwoFiles.sh`:

```bash
#!/bin/bash

cat "$1" > "$3"     # copy the first file into the output file (overwriting it)
cat "$2" >> "$3"    # append the second file
echo "Merged File is"
cat "$3"
```

#### Convert tiff to png

This assumes you have done `apt install imagemagick` (remember `sudo`!) 

Save this as `tiff2png.sh`:

```bash
#!/bin/bash

for f in *.tif; do
    echo "Converting $f"
    convert "$f" "$(basename "$f" .tif).png"
done
```
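
The output name in the loop is built by stripping the `.tif` suffix with `basename`. The sketch below isolates that step; `photos/image.tif` is a hypothetical name, and no file is read or converted. It also shows shell parameter expansion (`${f%.tif}`) as an alternative that keeps the directory part:

```bash
#!/bin/bash
# Sketch only: photos/image.tif is a hypothetical file name used for
# illustration; no file is read or converted.

f="photos/image.tif"

# basename strips the leading directories and, optionally, a trailing suffix
stem=$(basename "$f" .tif)
echo "$stem"            # image

# Parameter expansion strips the suffix but keeps the directory
echo "${f%.tif}.png"    # photos/image.png
```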

## Practicals

### Instructions

* Along with the completeness of the practicals/exercises themselves, you will be marked on the basis of how complete and well-organized your directory structure and content is.

* Review (especially if you got lost along the way) and make sure all the shell scripts you created in this chapter are functional.

* Make sure you have your weekly directory organized with `data`, `sandbox`, `code` with the necessary files, under `CMEECourseWork/week1`.

* *All scripts should run on any other Unix/Linux machine* — for example, always call data from the `data` directory using relative paths.

* Make sure there is a `readme` file in every week's directory. This file should give an overview of the weekly directory contents, listing all the scripts and what they do. This is different from the `readme` for your overall git repository, of which `Week 1` is a part. You will write a similar `readme` for each subsequent weekly submission.

* Don't put any scripts that are part of the submission in your `~/bin` directory! You can put a copy there, but a working version should be in your repository.

**(a) Improve the existing scripts**

Note that some of the shell scripts that you have created in this chapter require input files. For example, `tabtocsv.sh` needs one input file, and `ConcatenateTwoFiles.sh` needs two. When you run any of these scripts without inputs (e.g., just `bash tabtocsv.sh`), you either get no result, or an error.

* Make each such script robust so that it gives feedback to the user and exits if the right inputs are not provided. 

**(b) A new shell script**

* Write a `csvtospace.sh` shell script that takes a `c`omma `s`eparated `v`alues file and converts it to a space-separated values file. However, it must not change the input file; it should save the result as a differently named file.

* Make this script robust, as you were asked to do in the previous exercise (a).

* Save the script in `CMEECourseWork/Week1/Code`, and run it on the `csv` data files that are in `Temperatures` in the master repository's `Data` directory.




## Readings & Resources

-   There are plenty of shell scripting resources and tutorials out there; in particular, look up [http://www.tutorialspoint.com/unix/unix-using-variables.htm](http://www.tutorialspoint.com/unix/unix-using-variables.htm)