# 1.1 Unix Basics#

We'll start by going through some basic Unix commands. "Unix" is a term for a family of operating systems (just like "Windows"). Apple's Mac operating system (macOS, formerly OS X) is part of the Unix family. You will also hear the term "Linux" a lot: Linux refers to a series of operating systems that are also part of the Unix family. Unix operating systems are very popular for running servers. However, while you may be used to interacting with your laptop using a graphical interface, these servers typically do not provide graphical interfaces (as graphical interfaces are a LOT of work to build and are less flexible). Instead, you need to interact with them through the command line. Don't worry, it's easy once you get the hang of it, and it looks really cool to people who don't know what you are doing!

## How commands are understood##

Let's clarify a little about how Unix commands are understood. The program that understands your Unix commands is called a "shell". If you hear the term "bash" get thrown around, just know that this is the name of a shell. There are many different kinds of shells, and some commands behave slightly differently depending on which shell is being run. For now, we will focus on the bash shell. To use the bash shell through an IPython notebook, let the first line of the cell be %%bash as illustrated below (run the code in the following cell).

In [7]:
%%bash
#lines that begin with a hashtag are comments; they are ignored
#by the shell. Let us double check that the bash shell is being
#run. To do this, we will use the command "echo $SHELL" illustrated
#below:
echo $SHELL

/bin/bash


Let's look in detail at how the shell interpreted the command above.

Commands tend to have the format:<br />
[name of the program] [one or more arguments to the program...]<br />
("arguments" just refers to all the terms that control the behaviour of the program).

In the example above, "echo" is the name of the program. The echo program prints the value of its arguments to the screen.
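As a small illustration of the [program] [arguments] format, here is a sketch that runs echo with different arguments. The words used are arbitrary examples and the variable name `greeting` is made up for this illustration; in the notebook, these lines would go in a cell that starts with %%bash.

```shell
# "echo" is the program; "hello" and "world" are its two arguments.
echo hello world

# Quoting makes the shell pass the text as a single argument,
# so the run of spaces inside the quotes is preserved.
echo "hello    world"

# $( ... ) captures what a command prints so it can be stored in a variable.
greeting=$(echo hello world)
echo "captured: $greeting"
```

Notice that the shell, not echo, is what splits the line into separate arguments at the spaces.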

There is also a concept of an "environment variable". A variable is something that stores information, and an "environment variable" is a variable that is available to the shell and to the programs it starts (i.e. it pertains to the "environment" that commands are run in). The value of a variable is accessed by putting "\$" in front of its name (so \$SHELL produces the value of the SHELL variable). In the example above, \$SHELL gives the location of your default (login) shell program. On my Mac, this location happens to be /bin/bash. It may be different when you run this notebook: if your default shell is bash it should still end in "bash", but it will name another shell (such as zsh) if that is your default.
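To make the idea concrete, here is a minimal sketch of creating an environment variable and reading it back. The names `MY_GREETING` and `child_view` are invented for this example and are not used elsewhere in the notebook.

```shell
# "export" marks the variable as part of the environment, so that
# programs started from this shell can see it.
export MY_GREETING="hello from the environment"

# Read the value back by putting "$" in front of the name.
echo "$MY_GREETING"

# A program started from this shell (here, a second shell run with sh)
# inherits exported variables. The single quotes stop the outer shell
# from expanding the variable itself.
child_view=$(sh -c 'echo "$MY_GREETING"')
echo "the child process saw: $child_view"
```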

How do we read a path like "/bin/bash"? Files in a Unix system are organized into folders (also called "directories"). "/" refers to the topmost level. "/bin" is the "bin" folder ("bin" is an abbreviation for "binaries"; "binary" files refer to the form that runnable programs often take). So "/bin/bash" refers to the "bash" program stored in the "bin" folder.
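The two standard programs dirname and basename split a path in exactly this way. The sketch below only manipulates the text of the path, so it works even if /bin/bash does not exist on your machine; the variable names are made up for illustration.

```shell
# The path from the example above, stored in a variable.
program_path="/bin/bash"

# dirname keeps everything before the last "/": the containing folder.
folder=$(dirname "$program_path")

# basename keeps what comes after the last "/": the file name.
name=$(basename "$program_path")

echo "folder: $folder"
echo "file:   $name"
```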

When the shell is told to run a program (like "echo"), how does the shell know where to find it? This is where the PATH environment variable comes in. The PATH variable stores the names of a number of directories, each separated by a colon. The shell looks at each of these directories in turn and sees if a runnable file (also called an "executable") with the appropriate name exists in any of those directories. Once it finds such an executable, it stops looking and executes it.

<b> Exercise 1.1.1 </b><br />
Display the contents of your PATH environment variable below:

In [10]:
%%bash
##enter the command to print out the value of PATH below

The "which" program will tell you the exact location of the file that would be used to execute a particular program. For example, we can find the location of the "echo" program as shown below:

In [11]:
%%bash
which echo

/bin/echo


We can even find the location of the "which" program:

In [12]:
%%bash
which which

/usr/bin/which
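The kind of lookup that "which" reports can be sketched in a few lines of shell. This is a simplified illustration, not how "which" or bash is actually implemented: `find_in_path` is a hypothetical helper written for this example, and it ignores details such as empty entries in PATH.

```shell
# find_in_path NAME
# Walk the directories listed in PATH from left to right and print the
# first executable file called NAME. The function body is wrapped in
# ( ... ) so it runs in a subshell: the change to IFS and the calls to
# "exit" affect only that subshell, not the shell calling the function.
find_in_path() (
    IFS=:                      # split $PATH on colons instead of spaces
    for dir in $PATH; do
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            echo "$dir/$1"     # found it: report the location and stop
            exit 0
        fi
    done
    exit 1                     # no directory contained the program
)

find_in_path ls
```

Because the loop stops at the first match, the order of the directories in PATH matters.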


<b> Exercise 1.1.2 </b><br />
A colleague of yours has installed one version of a program. However, when they try to launch the program, the shell keeps launching a different version from the one they installed. What might the problem be? How could you check whether this is the problem?

## Navigating the file system, creating and editing files##

Here are a number of handy commands used to navigate the filesystem:

In [17]:
%%bash
#Find out the directory you are in with the pwd command:
pwd

/Users/avantishrikumar/Research/training_camp/workflow_notebooks


In [20]:
%%bash
#Display the contents of the directory with ls
#note that the ls command can be used to reveal a lot of additional information about the files,
#such as file permissions, modification date and file size. You can read more about that
#here: http://www.tutorialspoint.com/unix/unix-file-management.htm
#and: http://www.tutorialspoint.com/unix/unix-file-permission.htm
ls

1.1 Unix Basics.ipynb
1.3 Getting ready to run code on the cluster.ipynb
2.0_Sequencing_Data_Analysis.ipynb
2.4 Creating count coverage tracks.ipynb
3.1 Clustering analysis and PCA.ipynb
3.2 Calling differentially expressed peaks.ipynb
3.3 GO Term Enrichment.ipynb
3.4 Finding TF motifs.ipynb
exercise
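Here is a small sketch of the extra information ls can show. To avoid touching your real files it works in a temporary directory; the directory and file names are invented for this example.

```shell
# Create a throwaway directory containing one ordinary file and one
# file whose name starts with "." (such files are hidden by default).
scratch=$(mktemp -d "${TMPDIR:-/tmp}/ls_demo.XXXXXX")
touch "$scratch/visible.txt" "$scratch/.hidden"

# Plain ls prints names only and skips names that start with ".".
plain=$(ls "$scratch")
echo "$plain"

# -l adds permissions, owner, size and modification time for each entry;
# -a also includes the names that start with ".".
ls -la "$scratch"
```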


In [18]:
%%bash
#Create a new directory with "mkdir"
mkdir exercise

In [21]:
%%bash
#Change into that directory with cd
cd exercise
pwd #confirm you are in the right directory

/Users/avantishrikumar/Research/training_camp/workflow_notebooks/exercise
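The sketch below combines mkdir, cd and pwd, and also shows relative paths. It runs inside a temporary directory so that it does not change your working folder; the directory names are made up for illustration. As far as I know, each %%bash cell is run as its own shell process, so a cd in one cell does not carry over to the next cell. That is why cd and pwd appear together in the same cell above.

```shell
# Start in a fresh, empty temporary directory.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/nav_demo.XXXXXX")
cd "$scratch"

# -p tells mkdir to create any missing parent directories as well,
# and not to complain if the directory already exists.
mkdir -p exercise/notes

cd exercise/notes   # a relative path: it is resolved from the current directory
cd ..               # ".." means the parent of the current directory

# Keep only the last part of the current directory's path.
here=$(basename "$(pwd)")
echo "now in: $here"
```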


In [None]:
%%bash
#Make a 

## References##

Here is the tutorial that I (Avanti) used to learn Unix: http://www.ee.surrey.ac.uk/Teaching/Unix/

Here's a more detailed tutorial from tutorialspoint:
http://www.tutorialspoint.com/unix/index.htm

Another resource geared towards bioinformatics: http://manuals.bioinformatics.ucr.edu/home/linux-basics

Reference for commonly useful commands: https://sites.google.com/site/anshulkundaje/inotes/programming/shell-scripts

Learning shell programming: http://www.learnshell.org/

Debugging shell scripts: http://www.shellcheck.net/