## Introduction to Unix and Bash


### What is Unix?

Unix is a widely used multiuser operating system. It was invented in the late 1960s and predates the “graphical user interface” (GUI) that we use regularly on our modern computers. Whereas a GUI allows for easy point-and-click, icon-based interactions with our computers, Unix is operated by typing and issuing commands in an environment known as a “command-line interface.” We will use an interpreter/command language called Bash to operate Unix.

There are many variations of Unix, which we collectively call Unix-based operating systems (e.g., macOS, Linux), though the basic functions are similar between them.

### The Terminal


The terminal is a text-only environment on your computer that allows you to type "input" using your keyboard and read the resulting "outputs."

Before we can see exactly what I'm talking about, we need to open a terminal window on your computer. Let's also try logging onto TSCC to do some of these exercises.

#### For Mac:

Navigate to: Applications - Utilities - Terminal

In your terminal window, type:

`ssh ucsd-train##@tscc-login.sdsc.edu`

Replace the ## with your own training account number.

#### For Windows:

Make sure that you have [PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html) installed. The installation steps are in your pre-class instructions ([Day_0_Setup](https://github.com/jvtalwar/2021-MSTP-Bioinformatics-Bootcamp/blob/master/Day_0_Setup/Generate_Keys/Generate_Public_Private_key_windows.ipynb)).

In the "HostName" field enter ucsd-train##@tscc-login.sdsc.edu (remember to put your specific number in place of ##).

In the "Saved Sessions" field, type tscc and click Save.

In the left-hand panel, click through Connection - SSH - Auth.

In the "Private key file for authentication:" field, use the Browse button to find the private key file you saved in step 2 of the installation instructions. You might have to change the file type to "all files" in the bottom right corner to see the files you have saved. Click on "private.ppk" and select Open.

Now you are back in the PuTTY window. Click "Open".

Enter the password that you chose when you generated your key to get us on TSCC!


You should have something like this on the left-hand side:

`[ucsd-train40@tscc-login2 ~]$`


If all looks good, we can now move on to executing some commands (!)

While we will cover some of the essential commands, there are others which are also very useful to know. A cheat sheet of common commands can be found [here](https://www.linuxtrainingacademy.com/linux-commands-cheat-sheet/).

### Basic commands

*Where are you currently?*

When using a terminal, you will always be inside a folder, or directory. At any given time, though, you may lose track of which directory you are currently in. To find out, use "print working directory", or pwd:

`pwd`

What does your output look like? This is mine:

`/home/ucsd-train48`

This is my current, or working, directory. It also happens to be my home directory, or starting point when I log into TSCC. Luckily, we have a shortcut to get back to this point, which is marked by a tilde (~). So, if we ever wanted to get back to this directory from any point on the computer, we would simply type:

`cd ~`

`pwd`
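As a small sketch of the tilde shortcut in action, the commands below first move somewhere else and then jump back home. The `/tmp` directory is only an assumed example of "somewhere else" (it exists on most Unix systems), and `current_dir` is a made-up variable name used to hold the output of pwd.

```shell
# Move somewhere else first; /tmp is just an assumed example location.
cd /tmp
pwd                    # prints the directory we just moved to

# Jump back home with the tilde shortcut and confirm where we are.
cd ~
current_dir=$(pwd)
echo "$current_dir"    # your home directory, something like /home/ucsd-train48
```

Typing `cd` with nothing after it also takes you to your home directory.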

*Where can we go?*

Now that we know where we are, we should try and see what files and folders are in our working directory:

`ls`

What do you see?

If you began and/or completed your installations (**and hopefully you did!**), your accounts likely have 2 things in them:

 1) The miniconda installation script - Miniconda3-py37_4.8.3-Linux-x86_64.sh <br>
 2) Your miniconda installation - miniconda3 <br>

If you didn't begin installations or ran into issues, your accounts will probably be empty.

Regardless of whether your home directory is empty or not, let's start making things in our working directories!

#### Making and navigating directories

Let's make a new directory in our own home directory called "bootcamp" using the command mkdir:

`mkdir bootcamp`

Let's use ls to see what our home directory looks like now:

`ls`

`bootcamp`

You have now created a new directory, or folder, on your TSCC account. Let's navigate to another directory on the command-line by using the command cd, or change directory:

`cd bootcamp`

Try pwd again. Your absolute path will be different than before.

`pwd`

`/home/ucsd-train48/bootcamp`

You are now in the directory, or folder, that you just created. This is how you will make new directories to properly organize your own workspace. When naming a new directory, it is always a good idea to separate words with an underscore ( _ ) rather than a space, because the command line treats a space as a separator between arguments. For instance:

`mkdir new_directory`

As opposed to:

`mkdir new directory`
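To see why the space matters, here is a small sketch that runs both commands inside a throwaway directory. The sandbox created with `mktemp -d` and the variable names `sandbox` and `listing` are assumptions of this example, not part of the class setup. Because the shell splits arguments on spaces, the second command creates two separate directories, `new` and `directory`.

```shell
# Work in a throwaway sandbox so your real directories stay untouched.
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir new_directory      # creates one directory: new_directory
mkdir new directory      # creates two directories: "new" and "directory"

listing=$(ls)
echo "$listing"          # three directories in total

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```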

You can also make many levels of subdirectories at once using the mkdir command, but you need to add an additional option, or flag (here -p), to do it in one command:

`mkdir -p test1/test2/test3`

This can be a fast way of creating new directories several levels lower without having to first navigate to the level immediately below the one you started in.
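The sketch below contrasts mkdir with and without the -p flag. It runs in a throwaway sandbox made with `mktemp -d`, which is an assumption of this example, as are the variable names `sandbox` and `deepest`.

```shell
# Work in a throwaway sandbox so your real directories stay untouched.
sandbox=$(mktemp -d)
cd "$sandbox"

# Without -p, mkdir stops with an error because test1/ does not exist yet.
mkdir test1/test2/test3 || echo "mkdir without -p failed, as expected"

# With -p, every missing parent directory is created along the way.
mkdir -p test1/test2/test3

# Step into the deepest directory to confirm that all three levels exist.
cd test1/test2/test3
deepest=$(pwd)
echo "$deepest"          # ends in test1/test2/test3

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```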

To move back up a level in your directory hierarchy, use two periods (".."):

`cd ..`

This will move you up one level. To move up two levels, simply use ".." twice, separated by a "/":

`cd ../..`

These commands use a relative path, or location relative to where we are now. Compare this to an absolute path, where the entire path is spelled out to identify a location exactly. We can change location to any other directory from wherever we currently are (in the following example, to the directory "test2" from our home directory ~), so long as we provide an absolute path.

`cd ~`

`cd ~/bootcamp/test1/test2/`

`pwd`

`/home/ucsd-train48/bootcamp/test1/test2`
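Here is a minimal sketch that puts the two kinds of path side by side. It rebuilds the same bootcamp/test1/test2 layout inside a throwaway sandbox (made with `mktemp -d`, an assumption of this example) so that it can be run without touching your home directory. The variable names are made up for illustration.

```shell
# Build the example layout inside a throwaway sandbox.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/bootcamp/test1/test2"

# Absolute path: spelled out from the root (/), so it works from any location.
cd "$sandbox/bootcamp/test1/test2"

# Relative path: ".." is resolved from wherever we are right now.
cd ../..
after_up=$(pwd)
echo "$after_up"         # ends in /bootcamp

# Another relative path, this time going back down two levels.
cd test1/test2
after_down=$(pwd)
echo "$after_down"       # ends in /bootcamp/test1/test2

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```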

#### Making a new file with text editor 

Now that we're in our new directory, let's make a blank text file here. To do so, we will use vi, one of many text editor applications. Others include Emacs and nano (mentioned in this [Unix tutorial](http://korflab.ucdavis.edu/bootcamp.html)), and each of these editors has its own unique feel. For this module, however, we will stick with using vi.

![image.png](attachment:image.png)

`cd ~/bootcamp/`

`vi test_file.txt`

This will open a blank screen with several ~ on the left-hand side:

`
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
~                                                                                                                  
"test_file.txt" [New File]  `

This is the text file (test_file.txt) that you just created. The filename will be displayed in the bottom left-hand corner with a [New File] label. You can use the arrow keys to move the cursor around the file. To edit the contents of your document, enter Insert mode by pressing "i", then type whatever you would like. To leave, press esc, followed by :wq to save and quit (w to write, or save; q to quit). You can also use :x to save and quit. Adding ! to either command (as in :wq! or :x!) forces it. Let's try:

`i`

`I am writing in a new text file`

`esc`

`:wq!` or `:x!`

This will take you back to your bootcamp directory. Did it work? Let's check by looking at the contents of our directory using:

`ls`

Mine says:

`test_file.txt`

#### Viewing files

Let's take a look at the contents of our text file.

`cat test_file.txt`

It should look something like

`I am writing in a new text file`

**cat** is the simplest command to view a file in Linux. It simply prints the contents of a file. This is fine in this situation, where our file is only 1 line long, but imagine a file with millions of lines: your screen would be flooded.

Another popular command to view files is **less**. The less command shows the file one page at a time, and you can exit by pressing q. Unlike cat, it typically leaves none of the file's lines on your screen once you exit.

`less test_file.txt`

`q`

What if you want to print just the first or last 10 lines of a file? **head** and **tail** are good Linux commands for this purpose. Head prints the first **n** lines of a file and tail prints the last **n** lines of a file.

`head test_file.txt`

`tail test_file.txt`

By default, 10 lines are printed, but you can change the number of lines printed with a flag. A flag is a - (or --) followed by a letter or word that affects how your command is run. Flags are specific to a command, and so need to be looked up in order to be used to their full potential. To find out more about a specific program's flags, use the man (manual) command in front of the command name, and press q to exit the manual page when you are done. For instance:

`man head`

How can we change the number of lines printed?

`head -n 20 test_file.txt`

`tail -n 20 test_file.txt`
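Since test_file.txt only has one line, head and tail are easier to appreciate on a longer file. The sketch below is one way to try them out, under a few assumptions: it builds a 30-line practice file called `numbers.txt` (a made-up name) using `seq`, which prints a sequence of numbers, and `>`, which writes a command's output into a file. It runs in a throwaway sandbox made with `mktemp -d`.

```shell
# Work in a throwaway sandbox so your real directories stay untouched.
sandbox=$(mktemp -d)
cd "$sandbox"

# Build a 30-line practice file: seq prints 1 through 30, one number per line.
seq 1 30 > numbers.txt

head numbers.txt                        # lines 1-10 (the default)
tail numbers.txt                        # lines 21-30 (the default)

# The -n flag changes how many lines are printed.
first_three=$(head -n 3 numbers.txt)
last_two=$(tail -n 2 numbers.txt)
echo "$first_three"                     # prints 1, 2 and 3 on separate lines
echo "$last_two"                        # prints 29 and 30 on separate lines

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```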

#### Editing a file

Let's go back and edit the text file we just made:

`vi test_file.txt`

`i`

Use your arrow keys to go to the end of the line, delete what we wrote previously with backspace, and write:

`I am editing an existing text file`

`esc`

`:wq!`

Let's look at our file now using either less, cat, head, or tail.

You can see that we overwrote our original contents. This shows us that vi is a means to not only make new files, but to edit pre-existing ones too.

Alternatively, if after making changes you decide to quit without saving them, you can press esc, then follow with :q! (q to quit, ! to force):

`vi test_file.txt`

`i`

`<make changes to file>`

`esc`

`:q!`

#### Use tabs for auto-completion

_Many mistakes can be introduced due to typos_. This is especially notable when receiving error messages that are a direct result of providing the wrong location for input files. Thus, we try to rely on the tab key to fill in what is already known by the computer. This will be essential when filling out absolute paths, or the series of directories that must be followed to get to a particular file. As an example, let's go from our home directory to our newly created file:

`cd ~`

Next we want to navigate to our newly made folder. If we begin typing "bootcamp", but press tab part-way through, our computer can fill out the rest if the word is unique:

`cd boot<tab>`

`cd bootcamp/`

If there are multiple files or directories that have the same prefix, you can press tab twice to see all the objects that fit this description. Then, you can add additional characters and use tab completion to go to the file or directory of interest.

`less test<tab><tab>`

`test1/         test_file.txt `

`less test_<tab>`

`less test_file.txt`

This is a very useful way to advance without making any careless typos. We can also use tabs to list the contents of a directory we are interested in looking into next:

`cd /home/ucsd-train48/<tab><tab>`

This will display all the files in your home directory, even the hidden ones (those whose names begin with a ".").

**I promise that tab completion will make your work (and thus your life) much, much easier!**

#### Copying a file

Let's make a copy of our text file that we can call test_file_2.txt. We can do this using the following syntax:

`cd ~/bootcamp`

`cp test_file.txt test_file_2.txt`

We should now have two identical copies of the same file in the same directory. Copying will follow the general structure of:

`cp source_file_name destination_file_name`
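As a quick sketch of how you might check that a copy really is identical, the example below uses `cmp`, which compares two files and prints nothing when they match. Everything here runs in a throwaway sandbox made with `mktemp -d`. The sandbox, the use of `echo ... >` to create the file without opening vi, and the variable `copies_match` are all assumptions of this example.

```shell
# Work in a throwaway sandbox so your real directories stay untouched.
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a one-line file; ">" writes the output of echo into the file.
echo "I am writing in a new text file" > test_file.txt

# cp source_file_name destination_file_name
cp test_file.txt test_file_2.txt

# cmp succeeds (and prints nothing) when the two files are identical.
if cmp test_file.txt test_file_2.txt; then
    copies_match=yes
else
    copies_match=no
fi
echo "copies match: $copies_match"

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```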

#### Moving/renaming a file/directory

If we want to move a file, we can do so using the mv command:

`mv file.txt new_destination/`

Let's try this with one of our text files:

`mkdir move_file`<br>
`mv test_file.txt move_file/`<br>
`cd move_file/`<br>
`ls`<br>

The "mv" command also works for renaming files. For instance, if we wanted to rename test_file.txt to really_interesting_file.txt, we would just enter:

`mv test_file.txt really_interesting_file.txt`

The contents of the file will be exactly the same, but its name will now be different. These principles also apply for moving and renaming directories.
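The moving and renaming steps above can be sketched end to end as follows. The example runs in a throwaway sandbox made with `mktemp -d`, and the directory name `renamed_dir` and the variable `final_listing` are made up for illustration.

```shell
# Work in a throwaway sandbox so your real directories stay untouched.
sandbox=$(mktemp -d)
cd "$sandbox"
echo "some text" > test_file.txt

# Move: the destination is a directory, so the file keeps its name.
mkdir move_file
mv test_file.txt move_file/

# Rename: the destination is a new name in the same directory.
cd move_file/
mv test_file.txt really_interesting_file.txt

# The same command renames a directory.
cd ..
mv move_file renamed_dir

final_listing=$(ls renamed_dir)
echo "$final_listing"    # prints really_interesting_file.txt

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```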

#### Deleting a file

To remove a file, we use a command known as rm. Simply make sure that you are in the directory that houses the file that you wish to delete, and perform the following:

`rm really_interesting_file.txt`

To remove a directory, you may still use rm, though you will need to provide the -r flag to remove recursively:

`cd ~`

`rm -r bootcamp`

**Be ABSOLUTELY sure that you are prepared to lose this file or directory in question as it will be impossible to recover after deletion.**
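A safe way to get a feel for rm is to practice on files you do not care about. The sketch below does that in a throwaway sandbox made with `mktemp -d`. The sandbox and the names `practice`, `scrap.txt` and `lone_file.txt` are all made up for this example.

```shell
# Work in a throwaway sandbox so nothing important can be deleted.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir -p practice/inner
echo "temporary" > practice/inner/scrap.txt
echo "temporary" > lone_file.txt

ls                       # check what is here before deleting anything

rm lone_file.txt         # removes a single file

# Plain rm refuses to remove a directory.
rm practice || echo "rm without -r will not remove a directory"

rm -r practice           # removes practice/ and everything beneath it

remaining=$(ls)
echo "remaining: '$remaining'"    # nothing is left

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```

Running ls right before rm, as above, is a cheap habit that helps confirm you are deleting what you think you are deleting.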

### Organize your home directory

Organization is a really difficult thing in computational biology, and everyone has their own preferences on how to organize files. I recommend making at least three folders in your home directory, in addition to sub-folders within your projects directory as we add new projects. Really, it doesn't matter how you do this, as long as you are organized and understand your own setup. For the purposes of this class, it is easiest for discussion if we are all operating under the same setup.

Make **3 directories in your home directory named scripts, projects and raw_data**

`mkdir ~/projects`<br>
`mkdir ~/scripts`<br>
`mkdir ~/raw_data`<br>

#### Making softlinks

Softlinks are a great way to easily access files without copying the entire thing into a new directory. Copying files uses a lot of unnecessary space, but it is annoying to have to give the full path of a file every time you want to use it. To get around this, we make a softlink (also called a symbolic link), which is a pointer to the real file or directory that you can put wherever you want and that doesn't require the space of the full file. Since we will be using scratch a lot, we are going to make a softlink to that directory in our home directory.

To make a softlink:


`ln -s sourcefilename destination`

Now let's make a softlink to our scratch directories in our home directory

`ln -s /oasis/tscc/scratch/ucsd-train## ~/scratch`

Check the softlink worked properly:

`ls -l scratch`

My output looks like this:

`lrwxrwxrwx 1 ucsd-train48 biom262-group 33 Jul  2 08:25 scratch -> /oasis/tscc/scratch/ucsd-train48/`
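If you would like to experiment with softlinks before touching your scratch directory, here is a minimal sketch in a throwaway sandbox (made with `mktemp -d`). The directory `far/away/scratch_area`, the link name `shortcut` and the variable names are all hypothetical stand-ins. The `readlink` command prints the location a softlink points to.

```shell
# Work in a throwaway sandbox so your real directories stay untouched.
sandbox=$(mktemp -d)
cd "$sandbox"

# A stand-in for a directory that lives somewhere inconvenient.
mkdir -p far/away/scratch_area
echo "large data would live here" > far/away/scratch_area/data.txt

# ln -s source destination; an absolute source keeps the link valid from anywhere.
ln -s "$sandbox/far/away/scratch_area" shortcut

ls -l shortcut           # the leading "l" and the "->" mark a softlink

link_target=$(readlink shortcut)
echo "$link_target"      # the full path of scratch_area

# Files are reachable through the link as if they were stored here.
via_link=$(cat shortcut/data.txt)
echo "$via_link"

# Clean up the sandbox when finished.
cd /
rm -r "$sandbox"
```

Removing a softlink with rm deletes only the pointer, not the file or directory it points to.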

This notebook only scratches the surface of what we can do with Unix/bash. For some more helpful hints, try looking [here](https://learncodethehardway.org/unix/bash_cheat_sheet.pdf) for some useful tricks. To get a better sense of other Unix-based things you can do that we were unable to cover completely in class, check out UC Davis' Command-line Bootcamp page [here](http://korflab.ucdavis.edu/bootcamp.html). 