# Workshop 1 - Jupyter notebooks and the bash kernel

This workshop is designed to introduce you to the two core tools that you will be using for the other workshops.  

Before you begin this workshop you should know some very basic unix commands. These are covered in chapters 1-10 of the [interactive guide](http://bc2023.bioinformatics.guide/lessons/).  Unless you are already familiar with unix, it is essential that you read those chapters before you start (about 30 minutes).

At the end of this workshop you should:

1. Understand what a jupyter notebook is and how it relates to the unix command line
2. Be able to edit text and run unix commands from within a jupyter notebook
3. Know how to assess your learning by using the self-assessment exercises in a jupyter notebook

### Jupyter notebooks

The document you are reading is a jupyter notebook. 

It consists of a series of cells that contain either text or computer code. 

Jupyter notebooks are very useful for bioinformatics because they allow text to be mixed together with code for manipulating data, running programs and creating plots.

### Text cells and Markdown

The cell you are reading is a text cell. Click on it to make it the currently active/selected cell. The active cell will have a thin coloured border around it with a thicker border on the left. If the border is blue the cell is not editable.  

Double click on this cell to make it editable.  

You should see that its border turns green.  You should also see that its content changes to plain text in [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Here-Cheatsheet) format.  Markdown is a way of writing documentation that is very simple but still allows some basic styling (headers, links, images, code, bold, italics, equations, quotes).

### Code cells and the Bash kernel

The text you type into code cells should consist of valid commands that can be interpreted by the notebook's `kernel`. A notebook's `kernel` is the engine it uses to evaluate code cells.  This notebook is running the [Bash](https://en.wikipedia.org/wiki/Bash_%28Unix_shell%29) kernel. This means that when you run code cells they will be interpreted as if you typed the same text at the unix command prompt.
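
As a minimal sketch of what a code cell might contain, the commands below are ordinary Bash and behave the same way at a terminal prompt. The variable name `message` is only an illustration and is not used elsewhere in this notebook.

```bash
# Print the directory the notebook is running in.
pwd

# Store some text in a variable, then print it.
message="Hello from the Bash kernel"
echo "$message"
```

Each line is passed to Bash in order, and anything the commands print appears directly below the cell.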

Jupyter [notebooks support many types of kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels) including `Python`, `R` and `Bash` which are particularly useful for bioinformatics.

**Note:** You can tell which kernel a notebook is running by looking at the kernel indicator in the top right corner. 

### Running cells


The notebook will not actually run your cells until you tell it to.  You can do this by first selecting the cell and then using the menu to select Cell -> Run Cells. 

The cell immediately below this one is a code cell. 

The `ls` command in this cell should be familiar to you. Try running it.

Try double-clicking on a text cell to set it into edit mode.  Then run the text cell.  When text cells are run they aren't evaluated by the `kernel` but are rendered for display in your web browser.

In [1]:
ls

autograded_answer_example.png  E2  jupyter_intro.ipynb  setup.sh


# IMPORTANT

> ## Run the Setup Code 

In order for this notebook to work properly you need to **run the cell below before doing anything else**. This will load custom functions and settings required to make the self-assessment exercises work. 

If you restart your kernel you will also need to rerun the setup code.

> ## Don't use the `cd` command 

The answers to all self-assessment exercises assume that you don't change your directory from the default.  You shouldn't ever need to use the `cd` command to answer an exercise.
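
Most commands accept a path as an argument, so you can usually work on another directory without moving into it. The sketch below illustrates this with a throwaway directory made by `mktemp -d`; the names `demo_dir`, `start_dir` and `end_dir` are assumptions for this example only.

```bash
# Remember where we started.
start_dir="$(pwd)"

# Create a temporary directory with two empty files in it.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

# List the other directory by giving ls its path; no cd is needed.
ls "$demo_dir"

# The current directory has not changed.
end_dir="$(pwd)"
echo "$end_dir"

# Clean up the temporary directory.
rm -r "$demo_dir"
```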


In [2]:
# Essential Setup Code : Must be run first.
wget "https://www.dropbox.com/s/zqgacjshllprdcc/setup.sh?dl=0" -O setup.sh
source ./setup.sh


--2017-06-28 06:58:59--  https://www.dropbox.com/s/zqgacjshllprdcc/setup.sh?dl=0
Resolving www.dropbox.com (www.dropbox.com)... 162.125.82.1
Connecting to www.dropbox.com (www.dropbox.com)|162.125.82.1|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://dl.dropboxusercontent.com/content_link/w2958rgSXwoFZSm5xESk6MpKuxMUEqfeGDYcwcIDxkTifYLJKt7lPVytttatfCKe/file [following]
--2017-06-28 06:59:02--  https://dl.dropboxusercontent.com/content_link/w2958rgSXwoFZSm5xESk6MpKuxMUEqfeGDYcwcIDxkTifYLJKt7lPVytttatfCKe/file
Resolving dl.dropboxusercontent.com (dl.dropboxusercontent.com)... 162.125.7.6
Connecting to dl.dropboxusercontent.com (dl.dropboxusercontent.com)|162.125.7.6|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1534 (1.5K) [text/x-sh]
Saving to: ‘setup.sh’


2017-06-28 06:59:03 (82.4 MB/s) - ‘setup.sh’ saved [1534/1534]

Setup Done
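
The setup cell uses `source` rather than running the script as a separate program. The contents of `setup.sh` are not shown here, so the sketch below uses a hypothetical stand-in file, `demo_setup.sh`, defining a hypothetical function, `demo_hello`. It suggests why sourced functions are available to later cells and why they would be lost when the kernel (the shell) restarts.

```bash
# Create a temporary directory holding a stand-in for setup.sh.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/demo_setup.sh" <<'EOF'
demo_hello(){
  echo "hello from demo_setup"
}
EOF

# Running the file in a separate bash process defines the function there,
# not in this shell, so demo_hello would still be "command not found" here.
bash "$demo_dir/demo_setup.sh"

# Sourcing runs the file inside the current shell, so the function is kept.
source "$demo_dir/demo_setup.sh"
demo_hello

# Clean up the temporary directory.
rm -r "$demo_dir"
```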


### Keyboard shortcuts

Using the mouse all the time to run cells can be very tedious.  To save time, select the cell and press (command)-(enter). Depending on your keyboard this combination may be slightly different (e.g. (control)-(return) on a Mac).

## Exercise 1

**Your task:** Write a command to list the contents of the current directory.

This is deliberately easy (the answer is `ls`) so that you can focus on understanding the self-assessment mechanism. 

Follow these steps for every exercise:

1. Read the text describing the problem and figure out your answer.  Feel free to create new cells to experiment with commands until you get things right. You might also want to use the terminal on [the tutorial site](http://bc2023.bioinformatics.guide/lessons/).
2. Enter your code into the answer cell. The answer cell contains a blank space for your answer, but it is important that you don't change the other code in the cell. For example:
![autograded_example](autograded_answer_example.png)
3. Be sure to run your answer cell.  This will make your answer accessible to the test cell.
4. Run the test cell to check your answer. The test cell is locked and always comes immediately after the answer cell. 
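
The steps above rely on the answer cell defining a shell function that the test cell can call afterwards. The real tests are loaded by `setup.sh` and are not shown here, so the following is only a sketch of the idea under that assumption. The names `demo_answer` and `demo_test` are hypothetical.

```bash
# Hypothetical answer cell: the answer is wrapped in a shell function.
# Running this cell defines the function but does not call it.
demo_answer(){
echo "hello"
}

# Hypothetical test: call the answer and compare its output with the expected text.
demo_test(){
  if [ "$(demo_answer)" = "hello" ]; then
    echo "Your answer is correct"
  else
    echo "Your answer is incorrect"
  fi
}

demo_test
```

If the answer cell has not been run, the function does not exist yet and a test like this cannot succeed, which is why step 3 matters.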

In [3]:
e1_answer(){
### BEGIN SOLUTION
ls
### END SOLUTION
}

In [4]:
test_e1

Your answer is correct


In [5]:
# This code cell is for you to experiment with the ls command (see exercise below)
ls -aFl

total 76
drwxrwxr-x 4 iracooke iracooke  4096 Jun 28 06:59 ./
drwx------ 5 iracooke iracooke  4096 Jun 27 03:24 ../
-rw-rw-r-- 1 iracooke iracooke 26858 Jun 27 06:06 autograded_answer_example.png
drwxrwxr-x 2 iracooke iracooke  4096 Jun 28 06:59 E2/
drwxr-xr-x 2 iracooke iracooke  4096 Jun 27 03:24 .ipynb_checkpoints/
-rw-rw-r-- 1 iracooke iracooke 25768 Jun 28 06:56 jupyter_intro.ipynb
-rw-rw-r-- 1 iracooke iracooke  1534 Jun 28 06:59 setup.sh


### Extending the `ls` command

Use the code cell above and try various optional arguments to the `ls` command. For example:

```bash
ls -F
ls -1
ls -a
ls -R
ls -S
```

Now try printing the *help* text for the `ls` command

```bash
ls --help
```

Search through the help and look for each of the options in the commands above. Use the description for each option to understand the output you see when you run each command.

**Note:** Another way to bring up the *help* is the `man` command, but unfortunately this doesn't work well in a jupyter notebook.
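
Single-letter options can be given separately or joined behind one dash, as in the `ls -aFl` cell above. The sketch below illustrates this on a throwaway directory so that it does not depend on the workshop files; the names `demo_dir`, `plain`, `separate` and `combined` are assumptions for this example only.

```bash
# Create a temporary directory with a normal file, a hidden file and a subdirectory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"
mkdir "$demo_dir/subdir"

# Without -a, names starting with a dot are left out.
plain="$(ls "$demo_dir")"
echo "$plain"

# -a includes hidden entries; -F marks directories with a trailing slash.
separate="$(ls -a -F "$demo_dir")"
combined="$(ls -aF "$demo_dir")"
echo "$combined"

# Clean up the temporary directory.
rm -r "$demo_dir"
```

The two forms `ls -a -F` and `ls -aF` should give identical listings.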



## Exercise 2

**Your task:** Write a command to list the contents of your current directory (not including hidden files) in reverse order.



In [6]:
e2_answer(){
### BEGIN SOLUTION
ls -r
### END SOLUTION
}

In [7]:
test_e2

Your answer is correct


## Exercise 3

**Your task:** Write a command to list the contents of the `E2` directory



In [8]:
e3_answer(){
### BEGIN SOLUTION
ls E2
### END SOLUTION
}

In [9]:
test_e3

Your answer is correct


## Exercise 4

**Your task:** Write a command to list the contents of the E2 directory with one item per line, sorted by size in reverse order (smallest first).


In [10]:
e4_answer(){
### BEGIN SOLUTION
ls -1 -Sr E2
### END SOLUTION
}

In [11]:
test_e4

Your answer is correct


## Exercise 5

**Your task:** Write a command to list the contents of the E2 directory, one item per line, ordered so that the letters in the file names spell HELLO.  Your output should look like the text below:

```bash
E2/5_H.txt
E2/2_E.txt
E2/3_L.txt
E2/4_L.txt
E2/1_O.txt
```

**Hint 1:** You will need to use the wild-card character `*`. See chapter 13 of the guide for examples.  
**Hint 2:** Look at the sizes of files using `ls -l`
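
As a hedged illustration of Hint 1, the sketch below shows how `*` behaves on a throwaway directory. The shell replaces the pattern with the matching file names before `ls` runs, so each name is printed with the directory part you typed. The names `demo_dir` and `matches`, and the files inside the directory, are assumptions for this example only.

```bash
# Create a temporary directory with two .txt files and one other file.
demo_dir="$(mktemp -d)"
touch "$demo_dir/one.txt" "$demo_dir/two.txt" "$demo_dir/notes.md"

# echo shows what the pattern expands to: only the .txt files.
echo "$demo_dir"/*.txt

# ls receives those expanded paths and prints them one per line.
matches="$(ls -1 "$demo_dir"/*.txt)"
echo "$matches"

# Clean up the temporary directory.
rm -r "$demo_dir"
```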

In [12]:
e5_answer(){
### BEGIN SOLUTION
ls -1 -Sr E2/*.txt
### END SOLUTION
}

In [13]:
test_e5

Your answer is correct


In [14]:
ls --help

Usage: ls [OPTION]... [FILE]...
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.

Mandatory arguments to long options are mandatory for short options too.
  -a, --all                  do not ignore entries starting with .
  -A, --almost-all           do not list implied . and ..
      --author               with -l, print the author of each file
  -b, --escape               print C-style escapes for nongraphic characters
      --block-size=SIZE      scale sizes by SIZE before printing them; e.g.,
                               '--block-size=M' prints sizes in units of
                               1,048,576 bytes; see SIZE format below
  -B, --ignore-backups       do not list implied entries ending with ~
  -c                         with -lt: sort by, and show, ctime (time of last
                               modification of file status information);
                               with -l:

# Optional

Playing with the command line is the best way to learn.  

1. Try the `fortune` command.
```
    fortune
```
Run it a few times
2. Try the `cowsay` command like this
```
    cowsay "keyboard good, mouse bad"
```
3. To be a bit more faithful to [the original](https://en.wikipedia.org/wiki/Animal_Farm) we need to make the following change
```
    cowsay -f sheep "keyboard good, mouse bad" 
```
4. Now try combining the two commands  
```
    fortune | cowsay
```
    > This introduces a new concept, the pipe operator, `|`.
    > A pipe allows the output of one command to be used as input for another.
    > We will cover pipes in more detail in workshop 2.
5. Try out various cows.  You can find more inside the directory `/usr/share/cowsay/cows`.  
6. Read the [cowsay man page](https://linux.die.net/man/1/cowsay) to see if you can change the appearance of cows in other ways. 
7. If you are truly unsatisfied with the default cows you can find more [here](https://github.com/paulkaefer/cowsay-files/tree/master/cows)
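
The pipe operator from step 4 works with any pair of commands, not only `fortune` and `cowsay`. Since those two may not be installed on every system, here is a minimal sketch using the standard `printf` and `sort` commands instead; the variable name `sorted` is an assumption for this example only.

```bash
# printf writes three lines; the pipe sends them to sort as its input.
printf 'banana\napple\ncherry\n' | sort

# The same pipeline, with its output captured in a variable.
sorted="$(printf 'banana\napple\ncherry\n' | sort)"
echo "$sorted"
```

Here `sort` never reads a file: it receives the three lines directly from `printf` and prints them in alphabetical order.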