#  Basic Unix Tutorial

This is a tutorial on basic Unix operations for the [KIPAC computing boot camp](http://kipac.github.io/BootCamp).

Authors: [Yao-Yuan Mao](http://yymao.github.io), [Joe DeRose](https://github.com/j-dr), [Justin Myles](https://github.com/jtmyles)

Before getting started, we need to install a bash kernel for Jupyter notebooks:

```
pip install bash_kernel  
python -m bash_kernel.install
```

## Outline
* [Introduction to Unix](#intro)
* [Navigating the file system](#filesystem)
* [Manipulating the file system and files within it](#file_manip)
* [Executing commands](#exec_commands)
* [Useful utilities](#useful_utils)

## Introduction to Unix<a name="intro"></a>

Unix refers to an operating system paradigm that has become the *de facto* standard in scientific computing, HPC, and software development. Unix was originally a single proprietary operating system, but is now most prevalent in the form of distributions of operating systems based on the open source Linux kernel. In this tutorial, we focus on two central concepts of Unix: the filesystem and the shell. The Unix filesystem is summarized by the maxim [“Everything is a file”](https://en.wikipedia.org/wiki/Everything_is_a_file). In this filesystem, everything from external hardware components to text files is organized in a single [tree](https://en.wikipedia.org/wiki/Tree_%28data_structure%29). The Unix shell is an interactive program (e.g. bash, csh, zsh, tcsh) that takes text commands written by a user as input, sends them to the operating system, and shows text output.

## Navigating the file system<a name="filesystem"></a>

Let’s take a look at `/`, the root of the filesystem. Open a shell and enter `ls /`, then press `return`. An example output is below.

Here `bin` stands for binaries. This is a directory that contains very basic programs like `ls`. 

```bash
-bash-4.1$ ls /
bin   cern    dev    lib    lost+found  media  mnt  nfs  proc  sbin     scs      selinux  sys  u    var
afs   boot  cgroup  etc  home  lib64  lustre      misc   net  opt  root  scratch  scswork  srv      tmp  usr
```

To learn more about these files, we can pass some flags to the `ls` program. Now try `ls -la /`. This tells `ls` that you want to see the output in *l*ong listing format and that you want to see *a*ll files, including files with names that begin with `.` (To learn what flags a program allows, see its manual page with e.g. `man ls`.)

In this long output, you see that in the Unix filesystem, each file has a set of rules governing who can see and use the file. 

The first character specifies the file type (e.g. `-` for normal file, `d` for directory, `l` for link).

The next nine characters specify the permission to read (`r`), write (`w`), and/or execute (`x`) for the Owner, Group, and Other users, respectively. To learn more about the output, see the full documentation with `info coreutils 'ls invocation'`.


```bash
-bash-4.1$ ls -la /
total 1170
dr-xr-xr-x.   33 root root   4096 Sep 11 17:21 .
dr-xr-xr-x.   33 root root   4096 Sep 11 17:21 ..
-rw-r--r--     1 root root      0 Sep 11 17:21 .autofsck
-rw-r--r--     1 root root      0 Sep 13  2013 .autorelabel
drwx------     2 root root   4096 Sep 13  2013 .elinks
-rw-r--r--     1 root bin       0 Nov 12  2013 .readahead_collect
-rw-r--r--.    1 root root      0 Sep 13  2013 Kickstart_end
drwxr-xr-x   401 root root  18432 Dec 31  1969 afs
dr-xr-xr-x.    2 root root   4096 Jun 21 02:23 bin
dr-xr-xr-x.    5 root root   4096 Aug 16 02:05 boot
lrwxrwxrwx     1 root root     48 Sep 13  2013 cern -> /afs/slac.stanford.edu/package/cernlib/@sys/cern
drwxr-xr-x.    2 root root   4096 Nov 28  2017 cgroup
drwxr-xr-x    17 root root   3720 Sep 11 17:22 dev
drwxr-xr-x.  153 root root  12288 Oct  9 08:28 etc
drwxr-xr-x     4 root bin    4096 May  1  2015 gpfs
drwxr-xr-x.    2 root root   4096 Jun 28  2011 home
dr-xr-xr-x.   13 root root   4096 Jul  5 22:58 lib
dr-xr-xr-x.   11 root root  12288 Jun 21 02:24 lib64
drwx------.    2 root root  16384 Sep 13  2013 lost+found
drwxr-xr-x     3 root bin    4096 Sep 18  2013 lustre
drwxr-xr-x.    2 root root   4096 Jun 28  2011 media
drwxr-xr-x.    2 root root   4096 May 16 17:36 misc
drwxr-xr-x.    2 root root   4096 Jun 28  2011 mnt
drwxr-xr-x.    2 root root   4096 May 16 17:36 net
dr-xr-xr-x   112 root bin    4096 Mar 26  2018 nfs
drwxr-xr-x.    8 root root   4096 Jun 24  2015 opt
dr-xr-xr-x   492 root root      0 Sep 11 17:21 proc
drwx------.    4 root root   4096 Jan 13  2017 root
dr-xr-xr-x.    2 root root  12288 Sep 13 01:58 sbin
drwxrwxrwt. 2320 root root  90112 Sep 26 00:28 scratch
drwx------     3 scs  bin    4096 Oct  3 02:00 scs
drwxr-xr-x.    6 root root   4096 Sep 17  2014 scswork
drwxr-xr-x.    2 root root   4096 Sep 13  2013 selinux
drwxr-xr-x.    2 root root   4096 Jun 28  2011 srv
drwxr-xr-x    13 root root      0 Sep 11 17:21 sys
drwxrwxrwt. 2089 root root 933888 Oct  9 08:51 tmp
lrwxrwxrwx     1 root root     24 Sep 13  2013 u -> /afs/slac.stanford.edu/u
drwxr-xr-x.   18 root root   4096 May  1  2015 usr
drwxr-xr-x.   27 root root   4096 May  1  2015 var
```
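
The permission string in the first column can be reproduced on a scratch file. The following is a minimal sketch, assuming a bash shell; the file name `notes.txt` and the temporary directory are hypothetical and exist only for this illustration.

```bash
# Create a scratch file in a temporary directory (hypothetical name).
tmpdir=$(mktemp -d)
touch "$tmpdir/notes.txt"

# 640 = rw- for the owner, r-- for the group, --- for other users.
chmod 640 "$tmpdir/notes.txt"

# Keep only the first ten characters of the long listing:
# one file-type character followed by nine permission characters.
mode=$(ls -l "$tmpdir/notes.txt" | cut -c1-10)
echo "$mode"
```

The leading `-` marks a regular file, and the three `rwx` triplets that follow belong to the Owner, Group, and Other users in that order.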

Unlike in a desktop environment, when you work with a command line interface you always "exist" in a specific place in the filesystem. We call this place the **Working Directory** (WD).

- To print out the working directory, use *command* `pwd`. 
- To change the WD, use command `cd`, followed by an *argument*, the path of the new WD.
    - If the path does NOT start with the file tree root (/), then it's RELATIVE to the current WD
    - To change the WD to the parent directory of the current WD, use `cd ..`
    - To go back to the last WD, use `cd -`
    - To go to your home directory, use `cd ~`
    
Try the following commands out. Use `pwd` to check if you get it right.

In [None]:
pwd

In [None]:
cd ~/Downloads

In [None]:
cd ..

In [None]:
cd -

In [None]:
cd ~
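
The difference between absolute and relative paths is easiest to see in a small scratch tree. This is only a sketch, assuming a bash shell; the directory names `project` and `data` are made up for the illustration.

```bash
# Build a small scratch tree (hypothetical names) to move around in.
base=$(mktemp -d)
mkdir -p "$base/project/data"

cd "$base/project"   # absolute path: starts with /
cd data              # relative path: resolved against the current WD
pwd                  # ends in project/data
cd ..                # parent directory: back in project
cd "$base"           # jump elsewhere with an absolute path
cd -                 # return to the previous WD (project); prints its path
here=$(pwd)
echo "$here"
```

After the last step the working directory is `project` again, because `cd -` undoes only the most recent `cd`.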

To see the files and directories inside a directory, use `ls`, followed by that directory. If no argument is given, it lists the files and directories inside the current WD.

Try the following commands out.

In [None]:
ls

In [None]:
ls ..

In [None]:
ls ~

(Note that when you `ls` a directory, you do not automatically `cd` into that directory. You stay in your original WD. Use `pwd` to be sure.)

Keep trying some more:

In [None]:
ls -a ~

In [None]:
ls -l ~

In [None]:
ls -t ~

Notice that by adding *optional arguments* such as `-a` or `-l` after `ls`, we obtain slightly different outputs. For `ls`, `-a` prints all files including the hidden ones, `-l` prints a long, detailed list, and `-t` orders the list by modification time, newest first.

You can also combine options, like:

In [None]:
ls -alt ~

**You might have noticed**:

- Operations are specified by `<command> <arguments>`
- Optional arguments *usually* start with a dash (-)
- The names of hidden files/directories always start with a dot (.)
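
The effect of the dot prefix can be checked in an empty scratch directory. The sketch below assumes a bash shell; `visible.txt` and `.hidden` are hypothetical file names.

```bash
# Create one ordinary file and one hidden file (name starts with a dot).
demo=$(mktemp -d)
touch "$demo/visible.txt" "$demo/.hidden"

ls "$demo"       # the hidden file is normally left out
ls -a "$demo"    # also lists .hidden, plus the entries . and ..

# Count the entries each form reports (tr strips padding some wc versions add).
n_plain=$(ls "$demo" | wc -l | tr -d ' ')
n_all=$(ls -a "$demo" | wc -l | tr -d ' ')
echo "$n_plain without -a, $n_all with -a"
```

The two extra entries `.` and `..` reported by `-a` are the directory itself and its parent.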

When in doubt, use `man <command>`, which shows the manual page for any command that has one.

##  Manipulating the file system<a name="file_manip"></a>

Now we are going to download some files for this session, so `cd` into a directory of your choice; we will store the files there.

Want to create a new empty directory? `cd` into the directory where you want to create the new directory, and use `mkdir` followed by the name of new directory. Then `cd` into it.

In [None]:
mkdir new_dir
cd new_dir

(Also notice that when you make a new directory, you do not automatically `cd` into it.)

Now we can download the files. I have compressed the files as a single tarball. Run the following commands to download and extract it.

In [None]:
wget https://raw.githubusercontent.com/KIPAC/BootCamp/master/Unix/files_for_practice.tar.gz

Note: If you don't have `wget` on your machine, just download [this file](https://github.com/KIPAC/BootCamp/raw/master/Unix/files_for_practice.tar.gz) and then move it to the new directory you just created.

In [None]:
tar -xzf files_for_practice.tar.gz

### Task 1

- Create a directory called `mp3` under `files_for_practice/random_files`
- Move all the mp3 files under `files_for_practice/random_files` into `files_for_practice/random_files/mp3`

Here `mv` is the command to move or rename files/directories. It should be called as 

    mv [source file] [target file]
    ### OR ###
    mv [files to be moved [...]] [target directory]
    
The star symbol (\*), or asterisk, is a **wildcard character** that matches any sequence of characters, including none.
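
Both forms of `mv`, and the wildcard, can be tried on throwaway files before touching the practice data. This is a sketch assuming a bash shell; the `.log` files and the `logs` directory are hypothetical and unrelated to the tasks.

```bash
# Work in a scratch directory with a few empty files (hypothetical names).
work=$(mktemp -d)
cd "$work"
touch a.log b.log notes.txt
mkdir logs

# Second form: move every file matching *.log into the logs directory.
mv *.log logs/

# First form: with a single source and a new name, mv renames the file.
mv notes.txt readme.txt

ls . logs
```

The shell expands `*.log` into the matching file names before `mv` runs, so `mv` itself never sees the asterisk.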

### Task 2

- Create a directory called `est` under `files_for_practice/random_files`
- **Copy** all files whose filenames _are_ `est`, **regardless of their file extensions**, under `files_for_practice/random_files`, into `files_for_practice/random_files/est`

(What's the command for copy? You guessed it --- `cp`.)

### Task 3

- In `files_for_practice/random_files`, rename the file `architecto.mp3` to `consequuntur.html`

### Task 4

- In `files_for_practice/random_files`, remove the file `molestias.css`

(Yes, the command for remove is `rm`. It is a good idea to add `-i` after it, so that you won't accidentally delete files you don't want to delete.)
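
The behaviour of `cp` with a wildcard and of `rm -i` can be rehearsed on scratch files first. The sketch below assumes a bash shell; all file names are hypothetical, and the answers to the `rm -i` prompt are piped in only so the example runs without a keyboard.

```bash
# Scratch directory with hypothetical files.
work=$(mktemp -d)
cd "$work"
touch report.txt report.pdf summary.txt
mkdir backup

# Copy every file named "report", whatever its extension; originals stay in place.
cp report.* backup/

# rm -i asks before each deletion. Here the answer is supplied on standard input.
printf 'n\n' | rm -i summary.txt   # answer "n": the file is kept
printf 'y\n' | rm -i report.pdf    # answer "y": the file is removed

ls . backup
```

In normal interactive use you would simply run `rm -i <file>` and type the answer yourself.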

## Useful utilities <a name="useful_utils"></a>

### What we have learned so far

- `pwd`: print current working directory
- `cd`: change working directory
- `ls`: list files and directories
- `mkdir`: make new directories
- `mv`: move files and directories
- `cp`: copy files
- `rm`: delete files
- `wget [URL]`: download file
- `tar -xzf [FILE]`: uncompress file

Linux utilities like these are written in highly optimized code and can accomplish complex tasks. It's worth developing familiarity with these programs so you can take advantage of their speed when they apply to the task you need to accomplish.

## Read text files

- Use `more` or `less` to read a text file.
   - Use Page Up/Down to scroll
   - /keyword to search for a keyword, n/N to navigate to next/previous result
   - Type `q` to quit
- Use `cat` to print out the content of a file.
- Use `head` to print out the first few lines of a file.
- Use `tail` to print out the last few lines of a file.
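
By default `head` and `tail` print ten lines; the `-n` option sets a different count. A minimal sketch, assuming a bash shell and a hypothetical five-line file `numbers.txt`:

```bash
# Write a small file with one word per line (hypothetical content).
work=$(mktemp -d)
printf '%s\n' one two three four five > "$work/numbers.txt"

head -n 2 "$work/numbers.txt"   # first two lines
tail -n 2 "$work/numbers.txt"   # last two lines

# The output can be captured in a variable for later use.
first_two=$(head -n 2 "$work/numbers.txt")
last_one=$(tail -n 1 "$work/numbers.txt")
echo "$last_one"
```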

### Task 5

- what's the second to the last line in file `users.txt`?
- what's the third line in file `clients.txt`?

## Execute files <a name="exec_commands"></a>

We have already used many *programs*, like `wget` and `tar`; they *are* programs with command-line interfaces.

Just like on all other systems, these programs live *somewhere* in the system. You can find out where with the `which` command:

In [None]:
which tar

In [None]:
which wget

In [None]:
which which

The system knows where to find these programs by looking into some pre-defined paths, which are stored in an environment variable called `$PATH`. We'll talk more about variables later. For now you can check out what's in your pre-defined paths by printing out this variable:

In [None]:
echo $PATH
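
The value of `$PATH` is a single string of directories separated by colons, which the shell searches from left to right. The sketch below, assuming a bash shell, prints one directory per line; it uses the shell builtin `command -v` in place of `which` only because the builtin is available everywhere.

```bash
# Print each directory in PATH on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# Locate ls and report the directory it lives in.
ls_path=$(command -v ls)
ls_dir=$(dirname "$ls_path")
echo "ls lives in $ls_dir"
```

The directory printed on the last line should normally appear among the `PATH` entries listed above it.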

So what if I have a program also called `wget` in my current WD and I want to run it? 

You then need to specify its path (i.e. at least one slash '/' needs to appear). If the program is in your current WD, you can call it with `./<program name>`, where the dot (.) stands for the current WD.

### Task 6

- change WD to files_for_practice/executables
- try running the fake programs `wget` and `tar`.

You'll notice that you cannot execute `tar`. The reason is that you don't have the permission to execute it. 

You can **change the permission** (read/write/execute) by the command `chmod`. Try:

In [None]:
chmod u+x tar
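
The same permission problem can be reproduced with a script of your own. This is a sketch assuming a bash shell and a temporary directory that allows running programs; the script name `greet` and its message are hypothetical.

```bash
# Write a two-line shell script into a scratch directory (hypothetical name).
work=$(mktemp -d)
cd "$work"
printf '#!/bin/sh\necho "hello from a local script"\n' > greet

# New files are not executable, so this first attempt is refused.
./greet 2>/dev/null || echo "cannot run greet yet"

# Grant execute permission to the user (owner) and try again.
chmod u+x greet
output=$(./greet)
echo "$output"
```

Note the `./` prefix: because the scratch directory is not listed in `$PATH`, the script has to be called by path.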

### Task 7

- Remove the user's read permission from the file `alphabet.txt`
- See if it works by reading the file
- If it works, reinstate the user's read permission

## I/O redirection and piping

**Redirection**:
To read `input_file` as standard input and print standard output to a new file `output_file`:

    command < input_file > output_file
    
You can use only one part of it:

    command < input_file
    
    command > output_file
    
You can also redirect the output and **append** to a file with two arrows:

    command >> output_file


**Piping**:
To use the standard output from `command1` as the standard input of `command2`

    command1 | command2
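
The forms above can be combined in a short session. This is a sketch assuming a bash shell; `fruits.txt` and `sorted.txt` are hypothetical files, and `sort` is used only as a convenient command that reads standard input.

```bash
# Create a small input file (hypothetical content).
work=$(mktemp -d)
cd "$work"
printf 'pear\napple\nfig\n' > fruits.txt

# Redirection: sort reads fruits.txt as standard input and writes sorted.txt.
sort < fruits.txt > sorted.txt

# Appending: >> adds to the end of the file instead of overwriting it.
echo "plum" >> sorted.txt

# Piping: the output of cat becomes the input of wc -l (count lines).
n_lines=$(cat sorted.txt | wc -l | tr -d ' ')
first=$(head -n 1 sorted.txt)
echo "$n_lines lines, first is $first"
```

Using a single `>` in place of `>>` in the append step would have replaced the sorted contents with the one new line.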

### Task 8

- add execute permission to user for the file `reverse`
- run `reverse` with the content of `alphabet.txt` as standard input (try both redirection and piping)

**Hints**:
recall the command to change permission, and the command to print out the whole file.

### Task 9

- go to the directory `files_for_practice/random_files`
- how many files in total are in this directory?
- how many "png" files are in this directory?


**Hints**:

- `ls` has an option `-1`. With this option, each file would be printed as one single line.
- Command `wc` can count the words and lines from standard input

## grep

grep is a powerful program that prints the lines of its input that match a pattern.

In [None]:
grep ^d users.txt

In [None]:
grep ^d *.txt

In [None]:
ls -1 | grep -E 'i.\.mp3'
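
In a pattern, `^` anchors the match to the start of a line, `.` matches any single character, and `\.` matches a literal dot. Putting the pattern in single quotes keeps the shell from altering characters such as `\` before grep sees them. A minimal sketch, assuming a bash shell and hypothetical files `names.txt` and `list.txt`:

```bash
# Hypothetical input files.
work=$(mktemp -d)
cd "$work"
printf 'dana\nbob\ndavid\nAda\n' > names.txt
printf 'ab.mp3\nabxmp3\n' > list.txt

# Lines that start with a lowercase d.
starts_with_d=$(grep '^d' names.txt)
echo "$starts_with_d"

# -c prints the number of matching lines instead of the lines themselves.
count_d=$(grep -c '^d' names.txt)

# The escaped dot matches only a real ".", so abxmp3 is not selected.
literal_dot=$(grep -E 'b\.mp3' list.txt)
echo "$count_d matches; literal dot: $literal_dot"
```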

### Task 10
Search `grepdata.txt` for all lines that do not begin with a capital letter.

## sed
sed is a stream editor whose strength is altering input streams (such as text files).

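To replace the first occurrence of a phrase on each line of a file (the result is printed to standard output; the file itself is not changed), the syntax is: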

```
sed 's/phrase/replacement/' filename
```

To learn more about using `sed`, see the extensive tutorial at [link](http://www.grymoire.com/Unix/Sed.html)

In [None]:
sed 's/\./SPAM/' users.txt

In [None]:
sed -i.bak 's/\./SPAM/' users.txt
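
The two commands above differ in where the result goes: the first prints it, the second rewrites the file and keeps the original under a `.bak` suffix. The sketch below, assuming a bash shell and a hypothetical file `dots.txt`, also shows the `g` flag, which replaces every occurrence on a line instead of only the first.

```bash
# Hypothetical one-line input file.
work=$(mktemp -d)
cd "$work"
printf 'a.b.c\n' > dots.txt

# Without g, only the first dot on the line is replaced.
first_only=$(sed 's/\./-/' dots.txt)

# With g, every dot on the line is replaced.
every=$(sed 's/\./-/g' dots.txt)
echo "$first_only / $every"

# -i.bak edits dots.txt in place and saves the original as dots.txt.bak.
sed -i.bak 's/\./-/g' dots.txt
cat dots.txt dots.txt.bak
```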

## awk
awk is another useful program for selecting parts of text files.

To learn more about using `awk`, see the extensive tutorial at [link](http://www.grymoire.com/Unix/Awk.html)

In [None]:
ls -l | awk '{print $1}'
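
awk splits each input line into whitespace-separated fields named `$1`, `$2`, and so on, and runs the `{ ... }` action on every line that satisfies the optional condition in front of it. A minimal sketch, assuming a bash shell and a hypothetical two-column file `ages.txt`:

```bash
# Hypothetical input: a name and a number on each line.
work=$(mktemp -d)
cd "$work"
printf 'alice 30\nbob 25\ncarol 41\n' > ages.txt

# Print the first field of every line.
awk '{print $1}' ages.txt

# Print the first field only where the second field exceeds 28.
over_28=$(awk '$2 > 28 {print $1}' ages.txt)

# Accumulate the second field and print the sum after the last line.
total=$(awk '{sum += $2} END {print sum}' ages.txt)
echo "total: $total"
```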

## Monitoring and killing processes

You can view all processes currently running in the terminal using `ps`, and kill them using `kill`.

In [None]:
ps
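
To practice on a harmless process, you can start one yourself. The sketch below assumes a bash shell; `sleep 300` is only a stand-in for a real long-running job.

```bash
# Start a long-running process in the background (sleep stands in for real work).
sleep 300 &
pid=$!    # $! holds the process ID of the most recent background job

# Show the process, then ask it to terminate (kill sends SIGTERM by default).
ps -p "$pid" || echo "ps did not report process $pid"
kill "$pid"

# Wait for the shell to collect the job; the status is non-zero because it was killed.
status=0
wait "$pid" 2>/dev/null || status=$?
echo "exit status: $status"
```

`kill` takes the process ID shown in the `PID` column of the `ps` output, not the program name.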

### Task 11
Kill all anaconda processes using ```ps``` and ```kill```.

*Hint: use ```xargs```.*