# Intro To Unix Basics

# Day 2 Slides + Exercises

![title](images/slide6.png)

### Exercises

1. Use `gzip` and `gunzip` to compress and decompress a few files. With `ls -l`, check that the size of the file (in bytes) is smaller when it is compressed than when it is not. What kinds of files shrink more/less when you compress them? Try it out on text/bed files as well as PNG files in the images directory.


2. Compress the bed file from the previous exercises section, and then use `zcat [gzipped file] | head` and confirm that the output is the same as when you run `head` on the non-gzipped file. Try out the `grep`, `sed`, and `awk` commands we ran on the bed file in the previous section, but using `zcat` and the pipe so that you avoid decompressing the bed file.


3. Try out interesting combinations of commands using piping. Can you combine `head` and `tail` to extract lines 101 to 201 from the bed file? Can you string together multiple `sed` commands to change all I numerals to "-one", all Vs to "-five", and all Xs to "-ten"?


4. Which chromosome is left after this series of grep filter commands: `grep "X" [bed_file] | grep "V" | grep -v "I"`? (Think about it before running to check your guess!)
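
As a warm-up for the exercises above, here is a minimal sketch of the compress-then-pipe pattern on a small generated file rather than the course bed file. The file name `demo.txt`, its contents, and the line ranges are made up for illustration; `gunzip -c` is used as an equivalent of `zcat`.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
demo="$workdir/demo.txt"          # hypothetical stand-in for the bed file

# Build a 300-line file: line1, line2, ..., line300.
seq 1 300 | sed 's/^/line/' > "$demo"

# Compare the size in bytes before and after compression.
size_plain=$(wc -c < "$demo")
gzip "$demo"                      # replaces demo.txt with demo.txt.gz
size_gz=$(wc -c < "$demo.gz")
echo "plain: $size_plain bytes, gzipped: $size_gz bytes"

# Read the compressed file through a pipe without unpacking it on disk.
gunzip -c "$demo.gz" | head -n 3

# head keeps the first 20 lines, tail keeps the last 11 of those,
# which leaves lines 10 through 20.
middle=$(gunzip -c "$demo.gz" | head -n 20 | tail -n 11)
echo "$middle"
```

The same idea scales to any range: the number given to `tail` is the count of lines you want to keep, not the starting line number.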

![title](images/slide7.png)

Let's clarify a little about how Unix commands are understood. The program that interprets your Unix commands is called a "shell", and "bash" is the name of one such shell. There are many different shells, and some commands behave slightly differently depending on which shell is running, but bash is pretty standard.

As an exercise, double check that the bash shell is being run in your terminal. To do this, we can look at what is stored in the variable `$SHELL` by running `echo $SHELL`.

How do we interpret `/bin/bash`? We can see that it is an absolute path (because it starts with `/`), and that `bash` is located inside the folder `bin`, which is under the root directory. "bin" is an abbreviation for "binaries"; a "binary" file is the form that runnable programs often take. So "/bin/bash" refers to the "bash" program stored in the "bin" folder.
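
One caveat worth knowing: `$SHELL` typically records your login shell, which is not necessarily the shell that is interpreting your commands right now (for example, if you started a different shell by hand). The sketch below shows one way to check both, assuming that bash sets the variable `$BASH_VERSION` and other shells do not; the variable name `current` is just for illustration.

```shell
# $SHELL usually records your login shell (set when you log in).
echo "login shell: ${SHELL:-unset}"

# $BASH_VERSION is set by bash itself, so it reveals whether the
# shell interpreting these lines is bash.
if [ -n "${BASH_VERSION:-}" ]; then
    current="bash $BASH_VERSION"
else
    current="not bash"
fi
echo "running shell: $current"
```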



When the shell is told to run a program (like `ls`), how does the shell know where to find it? This is where the `$PATH` environment variable comes in. You can look at what's in your `$PATH` using `echo $PATH`.

The `$PATH` variable stores the names of a number of directories, each separated by a colon. When you enter a command in the terminal, the shell looks at each of these directories in `$PATH` in turn, checking if a runnable file (also called an "executable") exists in any of those directories and has the same name as the command you typed. Once it finds such an executable, it stops looking and executes it.
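
The lookup described above can be imitated with a short bash function. This is only a sketch of the idea, not how bash is actually implemented; `find_in_path` is a hypothetical helper name, not a standard command.

```shell
# find_in_path NAME: print the first executable called NAME found in $PATH.
# (Hypothetical helper for illustration only.)
find_in_path() {
    local cmd=$1 dir
    local IFS=:                 # split $PATH on colons, only inside this function
    for dir in $PATH; do
        dir=${dir:-.}           # an empty entry in $PATH means the current directory
        # The first directory holding an executable regular file wins.
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    return 1                    # nothing found in any directory
}

find_in_path ls
```

Because the loop returns at the first match, directories listed earlier in `$PATH` take priority over later ones.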

This is true for most of the commands we have learned to run so far, such as `ls` and `cat`. (A few, such as `cd`, are built into the shell itself rather than stored as separate files.) You can look inside `/bin` and the other directories in `$PATH` to find where the file for each of these commands lives.

If you ever aren't sure where a particular command lives, you can retrieve the absolute path for it using `which`. Try it out with `which ls` and other commands. You can even run `which which`!
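
Two related commands are worth trying alongside `which`. The sketch below assumes you are running bash: `command -v` is specified by POSIX, while `type -a` is a bash builtin. The variable name `ls_path` is just for illustration.

```shell
# command -v prints the path of the executable that would run.
ls_path=$(command -v ls)
echo "ls lives at: $ls_path"

# For a command built into the shell, it prints just the name.
command -v cd

# type -a lists every match for a name, in the order the shell would use them.
type -a ls
```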

### Exercise

A colleague of yours has installed one version of a program. However, when they try to launch the program, the shell keeps launching a different version than the one they installed. What might the problem be? How could you check whether this is the problem?

## Helpful References ##

Recommended Unix tutorial: http://www.ee.surrey.ac.uk/Teaching/Unix/

Here's a more detailed tutorial from tutorialspoint:
http://www.tutorialspoint.com/unix/index.htm

Another resource geared towards bioinformatics: http://manuals.bioinformatics.ucr.edu/home/linux-basics

Reference for commonly useful commands: https://sites.google.com/site/anshulkundaje/inotes/programming/shell-scripts

Learning shell programming: http://www.learnshell.org/

Debugging shell scripts: http://www.shellcheck.net/