# Introduction to Unix

## Background

### What is an Operating System

<figure>
<img src="./os-components2.png" width=500">
<figcaption align = "center"> Main components of a Unix operating system.</figcaption>
</figure>

### History and Distribution of Unix

<figure>
<img src="./unix-history.png" width=600">
<figcaption align = "center"> History of the Unix operating system.</figcaption>
</figure>

<figure>
<img src="./android-architecture.jpg" width=500">
<figcaption align = "center"> Android and Unix</figcaption>
</figure>

<figure>
<img src="./device-shipment-2015.png" width=400">
<figcaption align = "center"> Worldwide device shipments by operating system, which includes smartphones, tablets, laptops and PCs together. </figcaption>
</figure>

<figure>
<img src="./desktop-os-statistics.png" width=400">
<figcaption align = "center"> Desktop OS platforms statistics in percentage share worldwide. </figcaption>
</figure>

### Why Unix ?

The Unix "philosophy" emphasizes building simple, compact, clear, modular, and extensible code that can be easily maintained and repurposed by developers other than its creators. The Unix philosophy favors composability as opposed to monolithic design. 

- Write programs that do one thing and do it well.
- Write programs to work together.
- Write programs to handle text streams, because that is a universal interface.

Peter Salus, as quoted in "Basics of the Unix Philosophy" [1]

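As a small illustration of these three rules, the sketch below chains four standard tools with pipes to count how often each word occurs. The sample text in `words` is made up for the example; each tool does one job and hands plain text to the next.

```bash
# Hypothetical sample text, invented for this illustration.
words="unix pipe unix text pipe unix"

# tr   : put each word on its own line
# sort : bring identical words together
# uniq : collapse repeats and count them (-c)
# sort : order by count, largest first (-rn)
counts=$(echo "$words" | tr ' ' '\n' | sort | uniq -c | sort -rn)
echo "$counts"
```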
Unix is widely used in scientific research. Many believe this is because research commonly needs high-performance multi-user systems and large data stores. This is indeed important, but it is also due to the flexibility conferred by the nature of Unix-based systems.

However, the most important, though perhaps least understood, reason is that the Unix "philosophy" is consistent with the scientific method, i.e. it intrinsically enables and supports Replication, Repeatability, Reusability, and Openness.

## Getting Started

Lancaster University has a number of Unix systems. All of them are accessed via an internet connection from another system, such as your laptop or a computer on campus. Some of the Unix systems only provide a terminal; some provide GUI-style applications, but only when accessed from another Unix system; and some provide a full virtual desktop for using GUI-based applications and/or a terminal. It is the latter type of system that will be used in this course.

### Accessing a Remote Unix Desktop 

From a browser follow this link to [Lancaster University MyLab](https://mylab.lancaster.ac.uk/). You will be prompted for a LU username, password, and one time passcode. You will then be presented with something like the following.

<figure>
<img src="./LU-mylab.png" width=600">
<figcaption align = "center"> LU Mylab </figcaption>
</figure>

Now double-click on the **LU Ubuntu Lab** icon and enter your username and password when prompted.

<figure>
<img src="./ubuntu-lab.png" width=200">
<figcaption align = "center"> Ubuntu Lab </figcaption>
</figure>

<figure>
<img src="./ubuntu-lab-username.png" width=300">
<figcaption align = "center"> Ubuntu Lab username entry</figcaption>
</figure>

<figure>
<img src="./ubuntu-lab-password.png" width=300">
<figcaption align = "center"> Ubuntu Lab password entry</figcaption>
</figure>

<figure>
<img src="./ubuntu-lab-desktop.png" width=600">
<figcaption align = "center"> Ubuntu Lab Desktop</figcaption>
</figure>

### Using the Desktop

The Ubuntu desktop you are now using has a number of applications pre-installed. You can access many of them by using the "waffle" and selecting an app to start (or file to open etc).

<figure>
<img src="./waffle-application-launcher.png" width=600">
<figcaption align = "center"> Ubuntu Lab Desktop</figcaption>
</figure>

#### Exercises

Try to find and start RStudio from the app launcher.

Try to find and start a file manager from the desktop. Can you locate your LU h-drive ?

Can you share files between your client system and Ubuntu lab ?

Start a web browser and access these course notes.



## Using a Unix terminal

Start the app launcher (by clicking on the waffle) and type terminal in the search box.

<figure>
<img src="./ubuntu-lab-start-terminal.png" width=600">
<figcaption align = "center"> Starting a Unix Terminal</figcaption>
</figure>

Click on the 

<figure>
<img src="./terminal-icon.png" width=200">
<figcaption align = "center"> Terminal Icon </figcaption>
</figure>

which should launch a Terminal in a new window.

<figure>
<img src="./terminal.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

### Exercise

Type **whoami** in the terminal and press return. What do you see ?

In [1]:
whoami

grosed


Congratulations - you have now been infected by the Unix virus - there is no known cure !!

## Some Basic Unix Commands

Here are some commonly used Unix commands

- **ls** - list files
- **man** - get information about a Unix command
- **pwd** - display present working directory
- **cp** - copy files
- **mv** - move files
- **rm** - remove files
- **cd** - change directory
- **mkdir** - make a directory
- **rmdir** - remove a directory


For example, you can generate a list of files and directories by using the command **ls**.  

In [13]:
ls

 Downloads                                     Music           Pictures
 file.txt                                     'My Desktop'     Videos
'grosedj1 (lancshomes13) (H) - Shortcut.lnk'  'My Documents'   WINDOWS
 install-pyenv-win.ps1                        'My Favorites'


Note that the **ls** output shown above comes from one of my systems, not yours, so what you see will be different.

Most Unix commands accept additional options which modify what the basic command does. For example:

In [3]:
ls -l

total 4700
-rw-rw-r-- 1 grosed grosed 198676 Dec 16 10:22 android-architecture.jpg
-rw-rw-r-- 1 grosed grosed  72518 Dec 16 11:49 computer-hardware.jpg
-rw-rw-r-- 1 grosed grosed 539589 Dec 16 11:53 desktop-applications.jpg
-rw-rw-r-- 1 grosed grosed 202835 Dec 16 10:39 desktop-os-statistics.png
-rw-rw-r-- 1 grosed grosed  29663 Dec 16 10:32 device-shipment-2015.png
-rw-rw-r-- 1 grosed grosed   1255 Dec 20 13:19 file-1453.png
-rw-rw-r-- 1 grosed grosed 236442 Dec 20 13:51 file-stdin-cmd-stdout.png
-rw-rw-r-- 1 grosed grosed 237638 Dec 20 13:48 file-stdin-stdout-file.png
-rw-rw-r-- 1 grosed grosed  62234 Dec 20 11:24 introduction-to-unix.ipynb
-rw-rw-r-- 1 grosed grosed 108310 Dec 16 11:40 linux-kernel-code.png
-rw-rw-r-- 1 grosed grosed  47038 Dec 17 13:02 LU-mylab.png
-rw-rw-r-- 1 grosed grosed 275954 Dec 16 13:06 os-components2.png
-rw-rw-r-- 1 grose

Here, the **-l** option instructs **ls** to output additional information about each file being listed, such as its permissions, owner, size, and modification time.

Unix commands take the general form of

**command [-options] \<arguments\>**

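To make the general form concrete, here is a minimal sketch that runs **ls** with two options and one argument. The scratch directory and the file names in it are made up for the illustration; `mktemp -d` simply creates an empty temporary directory to work in.

```bash
# Work in a throwaway directory so none of your own files are touched.
scratch=$(mktemp -d)
touch "$scratch/notes.txt" "$scratch/.hidden"

# command: ls    options: -a (include dot files), -r (reverse order)
# argument: the directory to list
separate=$(ls -a -r "$scratch")
echo "$separate"

# Single-letter options can usually be combined behind one dash.
combined=$(ls -ar "$scratch")

# Tidy up the scratch directory.
rm -r "$scratch"
```

Both forms should produce the same listing, since `-ar` is shorthand for `-a -r`.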


### man

You can find information about a Unix command using the **man** command. For example:

In [9]:
man ls

LS(1)                            User Commands                           LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...

DESCRIPTION
       List  information  about  the FILEs (the current directory by default).
       Sort entries alphabetically if none of -cftuvSUX nor --sort  is  speci‐
       fied.

       Mandatory  arguments  to  long  options are mandatory for short options
       too.

       -a, --all
              do not ignore entries starting with .

       -A, --almost-all
              do not list implied . and ..

       --author
              with -l, print the author of each file

       -b, --escape
              print C-style escapes for nongraphic characters

       --block-size=SIZE
              with  -l,  scale  sizes  by  SIZE  when  printing  them;   e.g.,
              '--block-size=M'; see SIZE format below

       -B, --ignore-backups
              do not list implied entries ending with ~

       -c     with -lt: s

#### Exercise

- List all of the files as before, but print them in a single column 

- As above, but in date order

- Now do both of the above together.

Like many operating systems, Unix organises files into a hierarchy. Files are organised into directories (sometimes referred to as folders by Windows users), and directories can contain subdirectories.

Each file and directory has a name, and you can navigate the tree-like directory/file structure using Unix shell commands.

### pwd - display present/current/working directory

In [10]:
pwd

/home/grosedj1


### cd - change present/current/working directory

Try running the command

```bash
cd h-drive
```

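The sketch below shows **cd** and **pwd** working together on a small directory tree. The `project` directory is hypothetical and lives inside a temporary scratch directory; `basename` is used only to print the last part of each path.

```bash
# Build a tiny tree to move around in (names are illustrative only).
scratch=$(mktemp -d)
mkdir "$scratch/project"

cd "$scratch/project"          # a full path, starting from the root directory /
here=$(basename "$(pwd)")
echo "now in: $here"

cd ..                          # .. always means the parent directory
parent=$(basename "$(pwd)")
echo "now in: $parent"

cd project                     # a path relative to the current directory
back=$(basename "$(pwd)")
echo "now in: $back"

# Leave the scratch tree before removing it.
cd /
rm -r "$scratch"
```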
### Exercise 

- What is the present working directory ?

- What files are in your h-drive directory ?

- What is the newest file in this directory ?

- What is the largest file in this directory ?

- Are there any directories in your h-drive ?

Your Unix account has a home directory. It has a nickname **~**.



### Exercise 

Change the current directory to your home directory. What is your home directory called ?

### cp - copy files

### mkdir - make a new directory

To create a directory in your current working directory use the command

```
mkdir <directory-name>
```

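As a minimal sketch of **mkdir** in use, the example below creates one directory and then a nested set of directories in a single step. The directory names are invented for the illustration, and everything happens inside a temporary scratch directory.

```bash
scratch=$(mktemp -d)
cd "$scratch"

mkdir data                     # one new directory in the current directory
mkdir -p results/2024/plots    # -p also creates any missing parent directories

# List every directory that now exists below the current one.
tree=$(find . -type d | sort)
echo "$tree"

cd /
rm -r "$scratch"
```

Without `-p`, the second command would fail because `results` and `results/2024` do not exist yet.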


#### Exercise

- cd to your home directory and make a new directory called **work**.

- cd into this new directory.

- Now create a new directory in your home directory called **play**, but without changing directories.

### wget - get remote files 

**wget** is a command that allows you to download a remote file by specifying its URL. Try the following command.

```bash
wget http://www.mathsbox.com/introduction-to-unix/files/examples.zip
```

### Exercise

List the contents of your work directory.

### unzip - decompress files that were compressed using the zip command 

The file you downloaded using **wget** is a compressed file. It can be decompressed using the **unzip** command. 

```bash
unzip examples.zip
```

### Redirection

Set the current directory to your home directory and try this - 

``` bash
ls -1 > files.txt
```

What does it do ? 

Hint - try using **ls** to find out.

The **>** is called the redirection operator. Normally, the output of **ls** will go to the terminal. The **>** operator can be used to send it to other things, such as a file. 

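One detail worth seeing in action is that **>** replaces the contents of an existing file, whereas the related operator **>>** appends to it. The sketch below assumes a hypothetical file `log.txt` in a temporary scratch directory.

```bash
scratch=$(mktemp -d)
log="$scratch/log.txt"

echo "first line"  >  "$log"   # > creates the file (or overwrites it)
echo "second line" >> "$log"   # >> adds to the end of the file
appended=$(cat "$log")
echo "$appended"

echo "replacement" >  "$log"   # > again: the earlier contents are lost
overwritten=$(cat "$log")
echo "$overwritten"

rm -r "$scratch"
```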
### head - display the first few lines

Try this -

```bash
head files.txt
```



#### Exercise

- What did you see ?

- What does the **head** command do ?

- What options does the **head** command have ?

- Try some of these options.

- What does the command **tail** do ?

### echo

#### Example

In [6]:
echo "hello"

hello


#### Exercise

- Make a new file called **messages.txt** with the sentence "hello world" in it.

#### Solution

In [7]:
echo "hello world" > messages.txt

### wc - count lines, words, and characters

Try this

```bash
wc messages.txt
```

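The individual counts can also be requested one at a time. The sketch below recreates a `messages.txt` file in a temporary scratch directory (so it does not depend on the earlier exercise) and uses `<` to feed the file to **wc**, which makes it print the number alone.

```bash
scratch=$(mktemp -d)
echo "hello world" > "$scratch/messages.txt"

lines=$(wc -l < "$scratch/messages.txt")   # -l : number of lines
words=$(wc -w < "$scratch/messages.txt")   # -w : number of words
bytes=$(wc -c < "$scratch/messages.txt")   # -c : number of bytes
echo "lines=$lines words=$words bytes=$bytes"

rm -r "$scratch"
```

The byte count includes the newline character that **echo** adds at the end of the line.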

## Files and File related Commands

Everything in Unix is a file !! 

Well - that is not quite true - but it is a pretty good approximation.

When you ran the **ls** command (yes - run is an appropriate term - each unix command is a small program) you probably saw something like this.

The items in the list which are blue (probably all of them) are directories (which are files, because this is Unix). As such, each directory can have files within it, and other directories (known as subdirectories).

## The Terminal

In Unix everything is a file; however, there are different types of files. A device file corresponds to a physical element of the system such as a mouse, printer, or terminal. These files are usually located in the system's /dev/ directory (try listing its contents). A terminal's device file usually has a name of the form **tty**\<number\>, or **pts/**\<number\> for a terminal window opened from the desktop.

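A device file that exists on practically every Unix system is `/dev/null`, which discards anything written to it. The short sketch below uses it (rather than a terminal, which varies from session to session) to show that a device can be listed and written to like any other file.

```bash
# The first character of the long listing shows the file type:
# c = character device, d = directory, - = ordinary file.
ls -l /dev/null

# Redirect output to the device exactly as you would to a file.
echo "this text is thrown away" > /dev/null

# The shell test -c asks whether a path is a character device.
if [ -c /dev/null ]; then
    is_device=yes
else
    is_device=no
fi
echo "character device: $is_device"
```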


Not convinced - try this on your system (use the output of the first command in the ones that follow).



In [8]:
tty

/dev/pts/10


In [9]:
ls -l /dev/pts/10

crw--w---- 1 grosed tty 136, 10 Dec 20 13:54 /dev/pts/10


Now start a new terminal using Ctrl-Shift-N and type the following, replacing /dev/pts/10 with the output of **tty** from your first terminal,

```bash
cat > /dev/pts/10
```
followed by some input. What happens ?



<figure>
<img src="./stdin-stdout.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

<figure>
<img src="./stdin-stdout-file.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

<figure>
<img src="./file-stdin-stdout.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

<figure>
<img src="./file-stdin-stdout-file.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

<figure>
<img src="./stdin-cmd-file-cmd-stdout.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

<figure>
<img src="./stdin-cmd-pipe-cmd-stdout.png" width=400">
<figcaption align = "center"> Terminal Window </figcaption>
</figure>

In [10]:
#### cd - change directory

Use this link to get an example of how to get file names/info from google

https://developers.google.com/drive/api/quickstart/python


In [11]:
ls -l

total 4684
-rw-rw-r-- 1 grosed grosed 198676 Dec 16 10:22 android-architecture.jpg
-rw-rw-r-- 1 grosed grosed  72518 Dec 16 11:49 computer-hardware.jpg
-rw-rw-r-- 1 grosed grosed 539589 Dec 16 11:53 desktop-applications.jpg
-rw-rw-r-- 1 grosed grosed 202835 Dec 16 10:39 desktop-os-statistics.png
-rw-rw-r-- 1 grosed grosed  29663 Dec 16 10:32 device-shipment-2015.png
-rw-rw-r-- 1 grosed grosed   1255 Dec 20 13:19 file-1453.png
-rw-rw-r-- 1 grosed grosed 236442 Dec 20 13:51 file-stdin-cmd-stdout.png
-rw-rw-r-- 1 grosed grosed 237638 Dec 20 13:48 file-stdin-stdout-file.png
-rw-rw-r-- 1 grosed grosed  43234 Dec 20 13:54 introduction-to-unix.ipynb
-rw-rw-r-- 1 grosed grosed 108310 Dec 16 11:40 linux-kernel-code.png
-rw-rw-r-- 1 grosed grosed  47038 Dec 17 13:02 LU-mylab.png
-rw-rw-r-- 1 grosed grosed     12 Dec 20 13:54 messages.txt
-rw-rw-r-- 1 grosed grosed 275954 De

https://zenodo.org/records/1432702

https://drive.google.com/file/d/13_zvOH2iPweF-Ywgxc7weX10x86NfCD4/view?usp=drive_link

download a file from google drive using curl

In [12]:

filename="### filename ###"
fileid="### file ID ###"
curl -L -o ${filename} "https://drive.google.com/uc?export=download&id=${fileid}"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: filename
curl: (3) URL rejected: No host part in the URL
curl: (3) URL rejected: Malformed input to a URL function


: 3

In [None]:

filename="blob.mat"
fileid="13_zvOH2iPweF-Ywgxc7weX10x86NfCD4"
curl -L -o "${filename}" "https://drive.google.com/uc?export=download&id=${fileid}"

In [None]:
curl https://drive.google.com/file/d/13_zvOH2iPweF-Ywgxc7weX10x86NfCD4/view?usp=drive_link

In [None]:
https://drive.google.com/file/d/13_zvOH2iPweF-Ywgxc7weX10x86NfCD4/view?usp=drive_link

In [None]:
cat | xargs Rscript -e

In [None]:
envi

environmental data

https://data.ceda.ac.uk/

nc format

https://help.ceda.ac.uk/article/106-netcdf

install netcdf (for ncdump)

sudo apt-get install netcdf-bin

example download

 wget https://data.ceda.ac.uk/bodc/BAS210037/JRA55IAF-ORCH0083-LIM3/1978/d01/I/ORCH0083-LIM3_19780101_I_d01.nc

regular expression to match filenames in html

cat blob | grep -Eo '\w+\.(nc)' | uniq

In [None]:
curl https://data.ceda.ac.uk/bodc/BAS210037/JRA55IAF-ORCH0083-LIM3/1978/d01/I | grep -Eo '\w+\.(nc)' | uniq

## Regular Expressions

A regular expression defines a set of one or more strings of characters. Many Unix utilities, including grep, sed, and find, use regular expressions to search for and, in some cases, replace strings. A simple string of characters is a regular expression that defines one string of characters: itself. A more complex regular expression uses letters, numbers, and special characters to define many different strings of characters. A regular expression is said to match any string it defines.

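As a first, minimal sketch, the example below uses **grep** with the simplest kind of regular expression, a plain string. The three lines of sample text are made up for the illustration and are stored in a temporary file.

```bash
scratch=$(mktemp -d)
printf 'spring is here\nnothing to see\nringing bells\n' > "$scratch/lines.txt"

# Print every line that contains the string "ring" anywhere in it.
matches=$(grep "ring" "$scratch/lines.txt")
echo "$matches"

# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -c "ring" "$scratch/lines.txt")
echo "$count"

rm -r "$scratch"
```

The middle line is not printed: "nothing" contains "thing", not "ring".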


### Simple Strings

|Regular <br> Expression|Matches|Examples|
|-|-|-|
| ring | **ring** | **ring**,sp**ring**, <br> **ring**ing |
| or not| **or not** |**or not**, <br> po**or not**hing  |
|Thursday | **Thursday**| **Thursday**,**Thursday**'s  |

#### Exercise

Try to describe what the following command does.

```bash
cat | grep -o "or not"
```

Try using it to test some example input.

Try using it to test the other regular expressions in the preceding table.

### Special Characters

**Note** - in what follows, a regular expression always matches the longest possible string, starting as far toward the beginning of the line as possible.

#### Period

|Regular <br> Expression|Matches|Examples|
|-|-|-|
| / .alk/ | strings containing a space, then any <br> character, followed by **alk** | will **talk** <br> may **balk** |
| /.ing/ | all strings with any character preceding <br> **ing** | **sing**ing, **ping**, <br> before **ing**lenook|



#### Square Brackets

|Regular Expression|Matches|Examples|
|-|-|-|
| [bB]ill | **b** or **B** followed by **ill** | **bill**,**Bill**,**bill**ed|
| t[aeiou].k | **t** followed by lower case <br> vowel, any character, and **k** | **talk**, **talk**ative |
| number [6-9] |**number** followed by a space <br> and a single number in range <br> 6-9 (inclusive)| **number 7**, k**number 6**01|
| [^a-zA-Z] | any character that is not a letter | **1**, pdf**2**tif |

Notice the negating action of ^ when it is inside [].

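The rows of the table can be checked with **grep -o**, which prints only the part of each line that matched. This is a sketch using made-up input text; the patterns themselves are taken from the table.

```bash
# [bB]ill : b or B followed by ill
names=$(echo "bill Bill billed kill" | grep -o '[bB]ill')
echo "$names"

# number [6-9] : "number ", then one digit from 6 to 9
numbers=$(echo "number 7 and number 3" | grep -o 'number [6-9]')
echo "$numbers"

# [^a-zA-Z] : any single character that is not a letter
nonletters=$(echo "pdf2tif" | grep -o '[^a-zA-Z]')
echo "$nonletters"
```

Each match is printed on its own line, so the first command prints three lines (the third comes from "billed").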

#### Asterisk

|Regular Expression|Matches|Examples|
|-|-|-|
| ab*c| **a** followed by zero or more <br> **b**s followed by **c** | **ac**, **abc**, <br> aa**abc**|
| ab.*c| **ab** followed by zero or more <br> other characters followed by **c**| **abc**, **abxc**, <br> x**ab 756.345 x c**at|
| t.*ing | **t** followed by zero or more <br> other characters <br> followed by **ing**| **thing**, **ting**, <br> I **thought of going**|
| [a-zA-Z]*| a string composed of only letters <br> (upper or lower case), possibly empty| **ring**, **Thursday** |
|(.*)  | as long a string as possible <br> between ()| Get **(this (and) that)** done| 
| ([^)]*)| the shortest string possible <br> between ()| **(Get (this or)**|

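The difference between the last two rows of the table is easiest to see by running both patterns on the same input. The sketch below does this with **grep -o**; the input strings are illustrative.

```bash
# ab*c : a, then zero or more b, then c
zero_or_more=$(echo "aaabc ac" | grep -o 'ab*c')
echo "$zero_or_more"

line="Get (this (and) that) done"

# (.*) : .* grabs as much as it can, so the match runs to the LAST )
longest=$(echo "$line" | grep -o '(.*)')
echo "$longest"

# ([^)]*) : [^)]* cannot cross a ), so the match stops at the FIRST )
shortest=$(echo "$line" | grep -o '([^)]*)')
echo "$shortest"
```

With plain **grep** the parentheses are ordinary characters, which is why they match the literal ( and ) in the text.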

#### Caret (^) and Dollar ($)

|Regular Expression|Matches|Examples|
|-|-|-|
| ^T | a **T** at the beginning <br> of a line| **T**ime, In Time|
| ^+[0-9]| a **+** followed by a digit <br> at the beginning of a line | **+5**14, **+3**.14 |
|:$ | a **:** that ends a line| ... following **:**|

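A minimal sketch of the two anchors in use follows. The four sample lines are made up and written to a temporary file; the patterns are the ones from the table.

```bash
scratch=$(mktemp -d)
printf 'Time flies\nIn Time\n+514 555\nas follows:\n' > "$scratch/sample.txt"

# ^T : only lines that BEGIN with T
starts=$(grep '^T' "$scratch/sample.txt")
echo "$starts"

# :$ : only lines that END with a colon
ends=$(grep ':$' "$scratch/sample.txt")
echo "$ends"

# ^+[0-9] : a + and one digit at the start of a line (-o prints just the match)
plus=$(grep -o '^+[0-9]' "$scratch/sample.txt")
echo "$plus"

rm -r "$scratch"
```

"In Time" contains a T but does not start with one, so the first command does not print it.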

#### Quoting Special Characters

|Regular Expression|Matches|Examples|
|-|-|-|
|end\\. | strings containing **end.**| **end.**,end|
| \\$| a dollar sign | lot **$** of money|
| \\[[0-9]\\] | a single digit in [] | **[9]**,[90]|

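To see what the backslash changes, the sketch below runs the same search with and without it. The two sample lines are invented for the illustration. Single quotes are used around each pattern so that the shell passes the backslash through to **grep** untouched.

```bash
scratch=$(mktemp -d)
printf 'the end.\nit ends\n' > "$scratch/sample.txt"

# Unquoted, the . matches ANY character, so "ends" counts as a match too.
unquoted=$(grep -c 'end.' "$scratch/sample.txt")
echo "$unquoted"

# Quoted with a backslash, the . only matches a real full stop.
quoted=$(grep -c 'end\.' "$scratch/sample.txt")
echo "$quoted"

# \$ matches a literal dollar sign rather than the end of the line.
price=$(echo 'lot $ of money' | grep -o '\$')
echo "$price"

rm -r "$scratch"
```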

#### Full Regular Expressions

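To test these examples you will need to use **egrep** (equivalent to **grep -E**) instead of **grep**, replacing the pattern with the one you want to test: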

```bash
cat | egrep -o "or not"
```


|Regular Expression|Matches|Examples|
|-|-|-|
|ab+c| **a** followed by one or <br> more **b**s followed by <br> a **c**| y**abc**w, **abbc**57 |
| ab?c|**a** followed by zero or one **b** <br> followed by a **c** | b**ac**k,**abc**def|
| (ab)+c| one or more **ab** followed <br> by a **c**  | z**abc**d, **abababc** |
| (ab)?c| zero or one **ab** followed <br> by **c**| x**c**, **abc**c |
| ab\|ac | either **ab** or **ac** | **ab**,**ac**,**ab**ac|
| (D\|N)\\.Jones| **D.Jones** or **N.Jones**| **N.Jones**, P.**D.Jones**|

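The rows of this table can be tried out as follows. This is a sketch that uses **grep -E**, which accepts the same full regular expressions as **egrep**; the input strings are taken from the Examples column.

```bash
# ab+c : a, then ONE or more b, then c
plus=$(echo "abbc57" | grep -Eo 'ab+c')
echo "$plus"

# ab?c : a, then zero or ONE b, then c
optional=$(echo "back" | grep -Eo 'ab?c')
echo "$optional"

# (ab)+c : the group ab repeated one or more times, then c
group=$(echo "abababc" | grep -Eo '(ab)+c')
echo "$group"

# (D|N)\.Jones : either D or N, a literal full stop, then Jones
either=$(echo "P.D.Jones" | grep -Eo '(D|N)\.Jones')
echo "$either"
```

Here the parentheses are special: they group part of the pattern instead of matching literal ( and ) characters.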

### Summary

#### Standard Special Characters

These work with **grep**

|Special Character|Function|
|-|-|
| . | Matches any single character|
| [xyz] | Defines a character class that matches **x**,**y**, or **z**|
| [^xyz] | Defines a character class that matches any character except **x**, **y**, or **z**|
| [x-z]| Defines a character class that matches any character **x** through **z** inclusive|
| * |Matches zero or more of the preceding character |
| ^ | Forces a match to the beginning of the line|
| $ | Forces a match to the end of the line|
| \\ | Used to quote special characters|
| \\(xyz\\) | Matches what **xyz** matches (a bracketed regular expression) |
| \\< |Forces match at beginning of word |
| \\> | Forces match at end of word|













#### Full Special Characters 

These work with **egrep**

|Special Character|Function|
|-|-|
| + | Matches one or more occurrences of the preceding character|
| ? | Matches zero or one occurrence of the preceding character|
| (xyz)+ | One or more occurrences of **xyz**|
| (xyz)? |Zero or one occurrence of **xyz** |
| (xyz)* |Zero or more occurrences of **xyz**|
| xyz\|abc |Either **xyz** or **abc** |









## References


[1] Raymond, Eric S. (2004). "Basics of the Unix Philosophy". The Art of Unix Programming. Addison-Wesley Professional (published 2003-09-23). ISBN 0-13-142901-9. Retrieved 2016-11-01.

Available via Lancaster University Library [here](https://learning.oreilly.com/library/view/the-art-of/0131429019/?sso_link=yes&sso_link_from=lancaster-university)
