# Lab 1 - Getting to know bash

## Part 1 - Let's get into the bash stuff
### Navigation

Each user is given a home directory on the course server under `/local-scratch/localhome/`. For example, the instructor's is `/local-scratch/localhome/sasneddo/` and the TA's is `/local-scratch/localhome/whatever/`. Using the Terminal window, you can find your current working directory with the bash `pwd` command (print working directory). The table below provides a few basic commands for navigating the file system in bash.

| Bash | Effect |
| :-: | - |
| `pwd` | Print/return the current working directory |
| `cd DIR` | Change to directory specified by DIR |
| `ls DIR` | Get a list of all the files in DIR |
| `ls` | Get a list of all the files in your working directory |
| `mkdir DIR` | Create a new directory named DIR (can be full or relative) | 
| `echo text` | Print text to the standard output (screen)|

**TASK:** Let's see where we are right now. In the code chunk below, type the command to print the current working directory, then run the code block using the Run button at the top of the page.

In [None]:
# write your command in the line below


You may have noticed that I used a hashtag (`#`) in the code chunk above. Hashtags aren't just cool ways to highlight interesting themes on Twitter or Instagram; they have a purpose in programming, too. A hashtag lets us write comments: descriptive lines within your code that explain something. In bash, Python and R, any line that starts with a hashtag is a comment and will not be run. Comments are a way to annotate your code.

```bash
# this is a comment
echo This is actually code as there is no hashtag
```

In the code block below, try writing a comment, then listing all of the files in the current directory (note: the current directory can be accessed using `.`)

I mentioned that using `.` allows you to refer to the current working directory (the one you are in right now). You can also access other directories using similar references. To access the directory above the one you are in now, you can use `..`.

You should see that we have some folders, one called `folder01` and the other called `folder02`. How do we know they are folders? I made it kind of obvious for you. But there are other folders there, too. We can modify the use of the `ls` command with a flag - an option that tells us how we want to run the command. The `-l` flag will list the files and folders in the long format, giving you much more information.

```bash
ls -l
```

Run the code chunk below to see this in action.

In [None]:
ls -l

Do you notice that some of the lines begin with the letter `d`? This shows us that they are directories. Anything without a d is a file or a program that you can run. More on that later.
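
As a quick illustration of reading that first character, the sketch below builds a throwaway directory containing one subfolder and one empty file, then lists them in long format. The names `example_dir` and `example_file.txt` are made up for this example, and `mktemp`, `touch` and `rm` are commands that are not otherwise covered in this lab.

```bash
# make a scratch directory so nothing in your lab folders is touched
scratch=$(mktemp -d)
mkdir "$scratch/example_dir"
touch "$scratch/example_file.txt"   # touch creates an empty file

# long listing: the line for example_dir starts with d, the line for the file starts with -
ls -l "$scratch"

# pull out just the first character of each entry's long listing
dir_type=$(ls -ld "$scratch/example_dir" | cut -c 1)
file_type=$(ls -ld "$scratch/example_file.txt" | cut -c 1)
echo "example_dir starts with: $dir_type"
echo "example_file.txt starts with: $file_type"

# clean up the scratch directory (rm -r removes a folder and everything in it)
rm -r "$scratch"
```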

Using the reference table above that showed common bash commands for navigation, create a line of code in the next block that will move you into the folder `folder01` and list the files contained there.

In [None]:
# write your code in the next line


When you change directory using the command you used in the code block above, that's where you stay until you specify otherwise. Use the next code block to get back to the directory above (remember what I said about using `..` earlier). 

In [None]:
# write your code in the next line


Now you can check where you are by using the `pwd` command.

In [None]:
pwd

__NOTE:__ If you ever want to get back to your base home directory, you can use the symbol `~`. For instance, you can use the command:

```bash
cd ~
```

which will take you back to your home directory (that's what the `~` means). You can test this by adding a code block below, or just keep this in mind for later. `~` always refers to your base home directory, and will come in handy many, many times during your programming career.
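
The three shortcuts `.`, `..` and `~` can be seen side by side in the following sketch. It assumes nothing about the course server: it works inside a temporary directory made with `mktemp -d` (a command not covered in this lab), and the folder name `inner` is invented for the example.

```bash
# remember where we started so we can return at the end
start=$(pwd)

scratch=$(mktemp -d)
mkdir "$scratch/inner"
cd "$scratch/inner"

# . is the directory we are in, .. is the one above it
here=$(cd . && pwd)
parent=$(cd .. && pwd)
echo "current directory: $here"
echo "parent directory:  $parent"

# ~ is replaced by the path of your home directory before the command runs
home_dir=$(echo ~)
echo "home directory:    $home_dir"

# go back to where we started and clean up
cd "$start"
rm -r "$scratch"
```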

### Viewing the contents of files

There are many ways to view the contents of files in bash. The following table summarizes them.

|Bash Command|Effect|
|-|-|
|`cat FILE1 FILEN` |print contents of one or more files to STDOUT (i.e. to the terminal)|
|`head FILE` | print the first few (10 by default) lines of a file to STDOUT |
|`tail FILE` | print the last few (10 by default) lines of a file to STDOUT |
|`tail -n NUM FILE` | print the last NUM lines of a file to STDOUT (use `tail -n +NUM FILE` to start at line NUM instead) |
|`mv FILE NEWNAME`|rename or move FILE to NEWNAME|
|`less FILE`|open file in a scrollable text viewer (q to exit)|
|`wc FILE`|print the number of lines, words and bytes in a file to STDOUT|
|`cut -f 1,2,N FILE`| Take columns 1,2,N from a delimited file and pass to STDOUT|
|`grep PATTERN FILE`| print the lines of FILE that match PATTERN to STDOUT (`grep` handles simple patterns; `egrep` or `grep -E` handles more complex ones) |
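
To make a few rows of the table concrete, here is a small sketch that runs them on a made-up, tab-separated file. The file name `samples.tsv` and its contents are invented for this example, and the file is written into a temporary directory from `mktemp -d` rather than anywhere in the lab folders.

```bash
scratch=$(mktemp -d)

# build a 5-line, tab-separated file: printf repeats the pattern for every three words,
# and > saves the result in a file (redirection is covered later in this lab)
printf '%s\t%s\t%s\n' \
    id name genotype \
    1 sampleA AA \
    2 sampleB AG \
    3 sampleC GG \
    4 sampleD AG > "$scratch/samples.tsv"

# first two lines (the header and sample 1)
first_two=$(head -n 2 "$scratch/samples.tsv")
# last two lines (samples 3 and 4)
last_two=$(tail -n 2 "$scratch/samples.tsv")
# everything from line 2 onward, i.e. the file without its header
no_header=$(tail -n +2 "$scratch/samples.tsv")
# only columns 1 and 3 (cut splits on tabs by default)
two_cols=$(cut -f 1,3 "$scratch/samples.tsv")
# only the lines that contain the text AG
ag_lines=$(grep AG "$scratch/samples.tsv")

echo "$first_two"
echo "$last_two"
echo "$no_header"
echo "$two_cols"
echo "$ag_lines"

rm -r "$scratch"
```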

In the next code chunk, write code that will navigate to `folder01` and print out the entire contents of the file `file01.txt`. Remember - if you don't know where you are in the file system, you can always use `pwd` to figure that out first.

In [None]:
# write your code after this comment



In the next code block, write code that will determine how many lines are in the file. __Note:__ while `wc` will print the characters, words and lines to the standard output/screen (STDOUT), you can take a shortcut here by using a flag: `wc -l` (this specifies to print out the line count only). 

#### Wait, what on earth is a flag?

Flags are a way to customize what a command does or prints. Most commands have them, and you can look them up in the command's manual page by typing `man` followed by the command's name:

```bash
man command
```

Run the next code block and you can see all the flags for the `wc` command.

In [None]:
man wc

__TASK:__ Using what you have learned from the `man` page for `wc`, write a line of code in the next code block that will display the number of words in the file `file01.txt`.

In [None]:
# write your code in the next line


### Creating and modifying files

Obviously, a file system would be nothing if we could only view files. Thankfully, we have the ability to create and edit files. 

Here, I need to introduce you to a couple of useful commands that will allow you to print things to files. These commands are summarized below:

|Bash Command|Effect|
|-|-|
|command `>` filename| print the output of the left side of `>` to a brand new filename specified on the right side|
|command `>>` filename| append output of the left side of `>>` to a brand new or existing filename specified on the right side |

You can try it using the following code. 

```bash
echo "Hello terminal!" > EMPTY.txt
echo $(date) >> EMPTY.txt
```

In [None]:
echo "Hello terminal!" > EMPTY.txt
echo $(date) >> EMPTY.txt

In [None]:
# check your output using cat
cat EMPTY.txt

If you tried the above commands using only the `>` you will notice a different output. Try the following commands instead:

```bash
echo "Hello terminal!" > EMPTY.txt
echo $(date) > EMPTY.txt
```

In [None]:
echo "Hello terminal!" > EMPTY.txt
echo $(date) > EMPTY.txt
cat EMPTY.txt

Why did it do this? The `>` operator __always__ creates a brand new file, so it overwrites whatever was already in your existing `EMPTY.txt` file. If you want to __add__ to a file, always use the `>>` operator. If the file doesn't exist when you use `>>`, it will create it for you.

__TASK:__ Fix the code block above so that both lines are written to the file. The resulting file should only have the two lines.

__TASK:__ We want to set up a file that will load every time you open up your bash terminal (we haven't done that but we will get there). We want to create a file called `~/.bash_profile` that contains two lines:

```bash
source ~/.bashrc
DATA=/local-scratch/course_files/MBB110/
```

__Hint:__ Use the `echo` command on the left side, and the filename `~/.bash_profile` on the right side. Be specific as to whether you use `>` or `>>` with each line. Write your code in the code block below and run it. Check the output using the `cat` command on the new file you have created.

In [None]:
# Write your code here



### Exploring some other commands

Let's revisit some of the commands you saw used in class individually. The code below runs the same command and either prints the result out or stores it in a file. You can view the contents of the new file either in the file explorer to the left or by using a bash command (e.g. `cat`). 

In [None]:
head -n 25 /local-scratch/course_files/MBB110/Sneddon_genotypes.txt
# translation:
# head: take the first N lines (default is 10 if not specified)
# -n 25: specifies to take the first 25 lines instead
# the last bit is the full path to the file we want to use as input for the head command
# You can save the output you see by "redirecting" STDOUT to a file. 

head -n 25 /local-scratch/course_files/MBB110/Sneddon_genotypes.txt > genotype_head.txt

You should now have the first 25 lines of genotype information in a new file named `genotype_head.txt`. Do you notice any genotypes in there that look "off"? Assuming we wanted to sanity check this whole file for what genotypes exist, we can use a combination of the `cut`, `sort` and `uniq` commands (`uniq` only removes duplicate lines that are directly next to each other, so its input needs to be sorted first). The code chunk below does this but uses our smaller file as input for efficiency. The result should be all unique genotypes in that file.

In [None]:
cut -f 4 genotype_head.txt | sort | uniq
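
To see why the `sort` step matters, the following sketch runs `uniq` with and without it on four made-up genotype values (these are not taken from the course data file).

```bash
# four values where the repeated entries are never next to each other
values=$(printf 'AG\nAA\nAG\nAA\n')

# without sorting, no two identical lines are adjacent, so nothing is removed
unsorted_result=$(echo "$values" | uniq)

# after sorting, identical lines sit together and uniq collapses them
sorted_result=$(echo "$values" | sort | uniq)

echo "uniq alone:"
echo "$unsorted_result"
echo "sort then uniq:"
echo "$sorted_result"
```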

We can also set variables in bash. Basic variables store a single value that can be changed over time within your program. Creating (or "declaring") a new variable is usually coupled with setting its value. It is common to use a variable to capture or store the output of some other code or a function. Anything that is printed to your screen in RStudio or shown in the output of your markdown can instead be stored in a variable. Why might we want to do this? Here's an example that we'll use later. The data files that we will use in each lab are all being put in one shared directory on this server:

```bash
/local-scratch/course_files/MBB110
```

Do you want to type that whole directory or paste it into your code every time you need to refer to a file there? Me neither! We can instead create a variable that stores that information and use it as a shortcut to represent that path. In bash, here's how we can create a variable named `DATA` and store this information in it. The next few lines show how it can be put to use in various ways. This is a bit redundant for this directory, since we created a symbolic link to it above that acts as a shortcut, but the link isn't as versatile. For example, from any working directory you can use `$DATA` to refer to that file path, but to use the symbolic link you would need to be in your home directory (`files`) or specify the path to the link (`~/files`).

In [None]:
DATA="/local-scratch/course_files/MBB110/" 

# print the contents of our new variable with the echo command
# IMPORTANT: when retrieving the contents of a variable we need to preface it with $
echo data path is $DATA
echo another way to refer to it is ~/files
# how about using it in combination with ls to view what's in the directory?
echo contents of the directory:
ls -l $DATA
echo contents of symbolic link to directory:
ls -l ~/files/
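
The same ideas can be tried without the course directory. In this sketch the variable names `WORKDIR` and `N_FILES` and the file names are made up, and the directory comes from `mktemp -d` (a command not covered in this lab). It also uses the `$( )` syntax, which runs a command and stores what it would have printed in a variable; this is the same trick used with `$(date)` earlier.

```bash
# store a path in a variable (note: no spaces around the = sign)
WORKDIR=$(mktemp -d)
touch "$WORKDIR/a.txt" "$WORKDIR/b.txt"   # create two empty files

# use the variable anywhere you would have typed the path
# the double quotes keep the path in one piece even if it contains spaces
ls -l "$WORKDIR"

# capture the output of a command in a second variable
N_FILES=$(ls "$WORKDIR" | wc -l)
echo "number of files in $WORKDIR: $N_FILES"

# clean up
rm -r "$WORKDIR"
```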

## Part 2 - Let's go into the real world and do a little treasure hunt!

So far, we have learned about a few bash commands that are going to make our lives easier when it comes to later labs. However, we have been using the notebook, which isn't really how we would do it in the real world. What we would normally do is use the command line: something we are going to do now.

In the top menu bar, click __File__ then choose __New Launcher__. This provides us with access to all of the other good stuff on this server. What we are interested in is found on the bottom of the launcher page: the __Terminal__. Click on the icon shown below to open it up in a new window.


![terminal](images/03_terminal.png)

When you click to the new window, you will see a prompt with a flashing cursor. This is your command line. The flashing cursor is waiting for you to enter commands. Here's what it might look like:

![cmd](images/04_cmd.png)

What is the prompt telling you? You will notice that the first part is your computing ID. The `@` symbol is separating your userid from the server that you are currently on. Mine says that I am working on the `mbb-test` server (yours will say `mbb-bioinf`). Then, after a space, it shows `~`. Do you remember what this means? We talked about it earlier. This part of the prompt will always show you what folder you are currently working within. Keep this in mind, as being in the wrong folder is often the cause of many problems in our labs.

__TASK:__ Every command that we have practiced in this notebook can be performed on the command line. In your **Terminal** tab, try a few things out:
- print your current working directory
- list the files in your current directory
- change into a subfolder (such as `mbb110`, `lab01` or `folder01` depending on where you are in the file system)
- echo some commands to a file

Remember that you can switch back to this page for reference.

### What's this about a treasure hunt?

As a homework task, I have included a little game that will test your skills in navigating the file system in bash. You should not do this today unless you finish everything else first - you can do this from home later.

First, you must understand a couple more concepts that will come in handy:

#### Executing (running) files

To execute (run) a file that sits in your current directory, type `./` followed by its name, as in `./filename`. You may find executable files within the game that you need to run, and they might offer you some kind of reward. You may wish to run these using the code:

```bash
./treasure
```

and then follow the instructions to collect your wealth.
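
If you are curious what makes a file runnable, here is a minimal sketch that is separate from the game. It writes a two-line script called `hello.sh` (an invented name) into a temporary directory, marks it as executable with `chmod +x` (a command not otherwise covered in this lab), and runs it with `./`. It assumes bash is installed at `/bin/bash`, which is common but not guaranteed on every system.

```bash
start=$(pwd)
scratch=$(mktemp -d)
cd "$scratch"

# create a tiny script: its first line names the program that should interpret it
echo '#!/bin/bash' > hello.sh
echo 'echo "hello from a script"' >> hello.sh

# give the file execute permission, then run it from the current directory
chmod +x hello.sh
output=$(./hello.sh)
echo "$output"

# go back to where we started and clean up
cd "$start"
rm -r "$scratch"
```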

#### Setting and accessing variables

You will also receive instructions on how to save your wealth and health points to variables that you can access throughout the minigame. Don't worry too much about what a variable is right now, but for reference, it is simply an identifier that points to a value. You will be asked to use commands such as `export` (all lowercase, since bash is case-sensitive) to put values into variables. You can find out the contents of any variable using the following code:

```bash
echo $VARIABLE
```

In this game, your inventory is stored in a variable called `I`, and your health in a variable called `HP`. In your __Terminal__ window (not this window) you can find out your current inventory and health info using the following commands:

```bash
echo $I
echo $HP
```
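
What `export` changes can be seen in a short sketch. The variable names `PLAIN` and `SHARED` are invented for this example and have nothing to do with the game. A plain assignment is only visible in the current shell, whereas an exported variable is also passed on to programs started from that shell, which is presumably why the game asks you to use `export`.

```bash
# a plain variable: visible in this shell only
PLAIN="only here"
# an exported variable: also visible to programs started from this shell
export SHARED="passed along"

# both are visible in the current shell
echo "PLAIN is: $PLAIN"
echo "SHARED is: $SHARED"

# start a new bash process and ask it to print both variables
# (the single quotes stop the current shell from filling in the values itself)
child_plain=$(bash -c 'echo "$PLAIN"')
child_shared=$(bash -c 'echo "$SHARED"')
echo "child sees PLAIN as: [$child_plain]"
echo "child sees SHARED as: [$child_shared]"
```

The square brackets in the last two lines are only there to make an empty value easy to spot.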

#### Now, onto the game!

To start the game, navigate in the __Terminal__ window to `~/mbb110/lab01/entrance` and read the scroll that you find there. 

Play the game for as long or as little as you like until you get a feel for working with bash on the command line. Once you are finished, run the above commands to show your inventory and HP in your __Terminal__ and paste them into the Markdown block below. I will create a leaderboard and post it on Canvas to show the results.

Your final stats:

