
# Shell scripting <span class="tocSkip"></span><a name="chap:sscripting"></a>

<h1>Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Shell-scripting:-What-and-Why" data-toc-modified-id="Shell-scripting:-What-and-Why-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Shell scripting: What and Why</a></span><ul class="toc-item"><li><span><a href="#Running-shell-scripts" data-toc-modified-id="Running-shell-scripts-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Running shell scripts</a></span></li></ul></li><li><span><a href="#Your-first-shell-script" data-toc-modified-id="Your-first-shell-script-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Your first shell script</a></span></li><li><span><a href="#A-useful-shell-scripting-example" data-toc-modified-id="A-useful-shell-scripting-example-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>A useful shell-scripting example</a></span></li><li><span><a href="#Variables-in-shell-scripts" data-toc-modified-id="Variables-in-shell-scripts-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Variables in shell scripts</a></span><ul class="toc-item"><li><span><a href="#Some-more-examples" data-toc-modified-id="Some-more-examples-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Some more examples</a></span><ul class="toc-item"><li><span><a href="#Count-lines-in-a-file" data-toc-modified-id="Count-lines-in-a-file-4.1.1"><span class="toc-item-num">4.1.1&nbsp;&nbsp;</span>Count lines in a file</a></span></li><li><span><a href="#Concatenate-the-contents-of-two-files" data-toc-modified-id="Concatenate-the-contents-of-two-files-4.1.2"><span class="toc-item-num">4.1.2&nbsp;&nbsp;</span>Concatenate the contents of two files</a></span></li><li><span><a href="#Convert-tiff-to-png" data-toc-modified-id="Convert-tiff-to-png-4.1.3"><span class="toc-item-num">4.1.3&nbsp;&nbsp;</span>Convert tiff to png</a></span></li></ul></li></ul></li><li><span><a href="#Practical" data-toc-modified-id="Practical-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Practical</a></span><ul class="toc-item"><li><span><a 
href="#A-shell-script-exercise" data-toc-modified-id="A-shell-script-exercise-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>A shell script exercise</a></span></li></ul></li><li><span><a href="#Readings-&amp;-Resources" data-toc-modified-id="Readings-&amp;-Resources-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Readings &amp; Resources</a></span></li></ul></div>


## Shell scripting: What and Why

Instead of typing all the UNIX commands we need to perform one after the other, we can save them all in a file (a "script") and execute them all at once.

The `bash` shell we are using provides a proper syntax that can be used to build complex command sequences and scripts.

Scripts can be used to automate repetitive tasks, to do simple data manipulation or to perform maintenance of your computer (e.g., backup). Indeed, most data manipulation can be handled by scripts without the need of writing a proper program.

### Running shell scripts

There are two ways of running a script:

1.  The first is to call the interpreter `bash` to run the file (try this; it will fail because you don't have a `myscript.sh` script yet!):

In [1]:
bash myscript.sh # OR sh myscript.sh

bash: myscript.sh: No such file or directory



(A script that does something specific in a given project)

2.  OR, make the script executable and execute it:

In [None]:
chmod +x myscript.sh
./myscript.sh

(A script that does something generic, and is likely to be reused again and again. Can you think of examples?)

Generic scripts of type (2) can be saved in `~/bin/` (that is, `/home/username/bin/`) and made executable; the `.sh` extension is not necessary.

In [None]:
mkdir ~/bin
PATH=$PATH:$HOME/bin # Tell UNIX to also look in $HOME/bin (i.e., ~/bin) for commands
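
As a minimal sketch of option (2), the following creates a throwaway script in a temporary directory (used here instead of `~/bin` so that nothing on your machine is changed), makes it executable, and adds that directory to `PATH` so the script can be called by name. The script name `hello_demo` and the variable names are made up for illustration.

```bash
#!/bin/bash
# Sketch: run a script by name after adding its directory to PATH.
# A temporary directory stands in for ~/bin; 'hello_demo' is a hypothetical name.
demo_bin=$(mktemp -d)

# Write a tiny two-line script (no .sh extension needed)
printf '#!/bin/bash\necho "hello from hello_demo"\n' > "$demo_bin/hello_demo"

chmod +x "$demo_bin/hello_demo"   # make it executable
PATH="$PATH:$demo_bin"            # tell the shell where else to look for commands

demo_output=$(hello_demo)         # now callable by name, from any directory
echo "$demo_output"

rm -r "$demo_bin"                 # clean up the temporary directory
```

A `PATH` change made this way only lasts for the current terminal session; to keep it, the assignment is usually added to `~/.bashrc`.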

## Your first shell script

Let's write our first shell script! For starters,

$\star$ Write and save `boilerplate.sh` in `CMEECourseWork/Week1/Code`, and add the following script to it
(type it in your code editor):

```bash
#!/bin/bash
# Author: Your Name your.login@imperial.ac.uk
# Script: boilerplate.sh
# Desc: simple boilerplate for shell scripts
# Arguments: none
# Date: Oct 2018

echo -e "\nThis is a shell script! \n" #what does -e do?

#exit

```
The first line is a "shebang" (or sha-bang or hashbang or pound-bang or hash-exclam or hash-pling, according to Wikipedia). It tells the operating system which interpreter should run the script, here `/bin/bash`. It can also be written as `#!/bin/sh`, but on some systems `sh` is a different, more minimal shell than `bash`, so bash-specific features may not work. The hash marks at the start of the following lines tell the interpreter to ignore the rest of each line. That is how you add script documentation (who wrote the script and when, what the script does, etc.) and comments on particular lines of the script.

Note that there is a commented-out `exit` command at the end of the script. Uncommenting it will not change the behavior of this script, but `exit` lets you set the script's exit status (an error code), and, if it is inserted in the middle of a script, stops the script at that point. To find out more, see [this](https://bash.cyberciti.biz/guide/The_exit_status_of_a_command) and [this](https://stackoverflow.com/questions/1378274/in-a-bash-script-how-can-i-exit-the-entire-script-if-a-certain-condition-occurs) in particular.
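
To make the role of `exit` concrete, here is a small sketch. It runs two one-line scripts through `bash -c` (so the example stays self-contained) and reads each exit status back from the special variable `$?`. The variable names `status_ok` and `status_err` are made up for illustration.

```bash
#!/bin/bash
# Sketch: 'exit' stops a script and sets its exit status, which the caller sees in $?.

# 'exit 0' ends the inner script, so the last echo never runs
bash -c 'echo "before exit"; exit 0; echo "never printed"'
status_ok=$?                         # 0 conventionally means success

# '||' runs its right-hand side only when the command on the left fails
bash -c 'exit 3' || status_err=$?    # any non-zero status signals an error

echo "first status: $status_ok, second status: $status_err"
```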

$\star$ Now `cd` to your `Code` directory, and run it:

In [2]:
bash boilerplate.sh


This is a shell script! 



## A useful shell-scripting example

Let's write a shell script to transform comma-separated files (csv) to tab-separated files and vice-versa. This can be handy: for example, in certain computer languages (e.g., `C`), it is much easier to read tab or space separated files than csv.

To do this in bash, we can use `tr` (abbreviation of `tr`anslate or `tr`ansliterate), which deletes or substitutes characters. Here are some examples.

In [3]:
echo "Remove    excess      spaces." | tr -s " "

Remove excess spaces.


In [4]:
echo "remove all the as" | tr -d "a"

remove ll the s


In [5]:
echo "set to uppercase" | tr '[:lower:]' '[:upper:]'

SET TO UPPERCASE


In [6]:
echo "10.00 only numbers 1.33" | tr -d '[:alpha:]' | tr -s " " ","

10.00,1.33


Now write a shell script called `tabtocsv.sh` in `Week1/Code` that substitutes all tabs with commas:

```bash
#!/bin/bash
# Author: Your name your.login@imperial.ac.uk
# Script: tabtocsv.sh
# Desc: substitute the tabs in the files with commas
# saves the output into a .csv file
# Arguments: 1-> tab delimited file
# Date: Oct 2018

echo "Creating a comma delimited version of $1 ..."
cat "$1" | tr -s "\t" "," >> "$1.csv"
echo "Done!"
exit
```

Now test it (note where the output file gets saved and why):

In [None]:
echo -e "test \t\t test" >> ../SandBox/test.txt
bash tabtocsv.sh ../SandBox/test.txt

Note that `$1` refers to the first argument passed to the script on the command line (in this case, the file name). See the next section for more on variables in shell scripts.
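
As a brief illustration of positional arguments, the sketch below sets them by hand with `set --`, which stands in for typing the arguments after the script name. The file names are hypothetical.

```bash
#!/bin/bash
# Sketch: positional arguments. 'set --' assigns them by hand here,
# standing in for: bash somescript.sh first.txt second.txt
set -- first.txt second.txt

echo "Number of arguments: $#"
echo "First argument: $1"
echo "Second argument: $2"

# Quoting "$1" keeps a file name with spaces in it as a single word
outfile="$1.csv"
echo "Output would go to: $outfile"
```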

## Variables in shell scripts

There are three ways to assign values to variables (note the lack of spaces around `=`!):

1.  Explicit declaration: `MYVAR=myvalue`
2.  Reading from the user: `read MYVAR`
3.  Command substitution: `MYVAR=$(ls | wc -l)`
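
The explicit and command-substitution forms can be tried directly with a minimal sketch like the one below (reading from the user is left out so that it runs without any typing). The variable name `NumWords` and the sample text are made up for illustration.

```bash
#!/bin/bash
# Sketch: explicit assignment and command substitution.

MYVAR=myvalue              # no spaces around '='
echo "MYVAR is $MYVAR"     # use $ to read the value back

# Command substitution: run a command and store what it prints
NumWords=$(echo "one two three" | wc -w)
echo "Word count: $NumWords"

# Note: writing 'MYVAR = myvalue' (with spaces) would make the shell
# try to run a command called MYVAR, which fails.
```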

Here are some examples of assignments (try them out and save as a single `Week1/Code/variables.sh` script):

```bash
#!/bin/bash

# Shows the use of variables
MyVar='some string'
echo 'the current value of the variable is' $MyVar
echo 'Please enter a new string'
read MyVar
echo 'the current value of the variable is' $MyVar

## Reading multiple values
echo 'Enter two numbers separated by space(s)'
read a b
echo "you entered $a and $b. Their sum is:"
mysum=$((a + b))
echo $mysum
```

And also (save as `Week1/Code/MyExampleScript.sh`):

```bash
#!/bin/bash

msg1="Hello"
msg2=$USER
echo "$msg1 $msg2"
echo "Hello $USER"
echo
```

### Some more examples

Here are a few more illustrative examples (test each one out, save in `Week1/Code/` with the given name):

#### Count lines in a file

Save this as `CountLines.sh`:

```bash
#!/bin/bash

NumLines=$(wc -l < "$1")
echo "The file $1 has $NumLines lines"
echo
```
The `<` redirects the contents of the file to the stdin ([standard input](https://en.wikipedia.org/wiki/Standard_streams)) of the command `wc -l`. It is needed here because without it, you would not be able to catch *just* the numerical output (number of lines). To see this, try deleting `<` from the script and see what the output looks like (it will also print the name of the input file, which you do not want).
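
To see this difference without editing the script, here is a small sketch that runs `wc -l` both ways on a throwaway three-line file. The variable names are made up for illustration.

```bash
#!/bin/bash
# Sketch: compare 'wc -l FILE' with 'wc -l < FILE' on a temporary file.
tmpfile=$(mktemp)
printf 'line one\nline two\nline three\n' > "$tmpfile"

with_name=$(wc -l "$tmpfile")       # the count followed by the file name
count_only=$(wc -l < "$tmpfile")    # just the count: wc reads stdin, so it has no name to print

echo "Without <: $with_name"
echo "With <:    $count_only"

rm "$tmpfile"                       # clean up
```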

#### Concatenate the contents of two files

Save this as `ConcatenateTwoFiles.sh`:

```bash
#!/bin/bash

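cat "$1" > "$3"     # copy the first file into the output file (overwrites it)
cat "$2" >> "$3"    # append the second file to the output file
echo "Merged File is"
cat "$3"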
```

#### Convert tiff to png

This assumes you have installed ImageMagick with `sudo apt install imagemagick`.

Save this as `tiff2png.sh`:

```bash
#!/bin/bash

for f in *.tif
do
    echo "Converting $f"
    convert "$f" "$(basename "$f" .tif).png"
done
```
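
The output file name in this kind of loop is built by stripping the `.tif` suffix and adding a new extension. The sketch below shows only that step, with no image conversion, using a hypothetical file name; it also shows bash parameter expansion as an alternative to `basename`.

```bash
#!/bin/bash
# Sketch: build an output file name from an input file name.
f="photos/leaf.tif"    # hypothetical input file name

out_basename="$(basename "$f" .tif).png"   # strips the directory and the .tif suffix
out_expansion="${f%.tif}.png"              # parameter expansion: strips only the suffix

echo "$out_basename"     # leaf.png
echo "$out_expansion"    # photos/leaf.png
```

Which form to prefer depends on where the converted files should go: `basename` drops the directory part, so the result lands in the current directory, while `${f%.tif}` keeps the original path.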

## Practical

**Some instructions**:

* Along with the completeness of the practicals/exercises themselves, you will be marked on the basis of how complete and well-organized your directory structure and content is.
* Review (especially if you got lost along the way) and make sure all your shell scripts are functional: `boilerplate.sh`, `ConcatenateTwoFiles.sh`, `CountLines.sh`, `MyExampleScript.sh`, `tabtocsv.sh`, `variables.sh`
* Don't worry about how some of these scripts will run on my computer without explicit inputs (e.g., `ConcatenateTwoFiles.sh` needs two input files); I will run them with my own test files.
* Make sure you have your weekly directory organized with `Data`, `Sandbox`, `Code` with the necessary files, under `CMEECourseWork/Week1`.
* *All scripts should run on any other Unix/Linux machine*; for example, always call data from the `Data` directory using relative paths.
* Make sure there is a `readme` file in every week's directory. This file should give an overview of the weekly directory contents, listing all the scripts and what they do. This is different from the `readme` for your overall git repository, of which `Week1` is a part. You will write a similar `readme` for each subsequent weekly submission.

Don't put any scripts that are part of the submission only in your `~/bin` directory! You can put a copy there, but a working version should be in your repository.

###  A shell script exercise

Write a `csvtospace.sh` shell script that takes a `c`omma `s`eparated `v`alues file and converts it to a space separated values file. However, it must not change the input file; it should save the result as a differently named file.

Save the script in `CMEECourseWork/Week1/Code`, and run it on the `csv` data files that are in
`Temperatures` in the master repository's `Data` directory.

*Don't modify anything (or refer to anything) in your local copy of the master repository. All changes you make in the master repository will be lost. Copy whatever you need from the master repository to your own repository.*

*Commit and push everything by next Wednesday 5 PM.*

This includes `UnixPrac1.txt`! Check the updated instructions from the [Unix Chapter](01-Unix.ipynb) on this practical.

## Readings & Resources

-   There are plenty of shell scripting resources and tutorials out there; in particular, look up
[http://www.tutorialspoint.com/unix/unix-using-variables.htm](http://www.tutorialspoint.com/unix/unix-using-variables.htm)