## Command line basics

### WHY?

- because when you work remotely on a server, that's all you have
- because most of the bioinformatics programs are command-line driven
- because when used properly it's so much faster than any GUI
- because DRY (don't repeat yourself!)

### UNIX philosophy

- write small programs that do _one_ thing and do it well
- write programs to communicate with each other so that output of one program is the input for another
- write programs to communicate in plain text (because it's the only universal interface)

Common paradigm:

`<verb> [modifiers] <subject>`

for example:

* `ls` - list directory (current by default)
* `ls dir` - list directory `dir`
* `ls -lah` - list the current directory with details (`-l`), including hidden files (`-a`), and with human-readable sizes (`-h`)

Up arrow and down arrow scroll through command history.

*Very important!*

* `<Ctrl-R>` - reverse history search
* `<Tab>` - completion: never have to type the whole thing
* `<Alt-.>` - inserts the last argument of the previous command from the history

#### Aside:

Why whitespace in file/directory names is a bad idea:
- every space needs to be escaped with a backslash _or_ the entire argument needs to be quoted
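
A minimal sketch of what goes wrong, run in a scratch directory created with `mktemp -d`. The file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"

# Unquoted: the shell splits on the space, so touch gets TWO arguments
# and creates two files, "my" and "file.txt".
touch my file.txt

# Quoting the argument or escaping the space keeps the name in one piece.
touch "my data.txt"
touch my\ notes.txt

# One name per line makes the difference easy to see.
ls -1
```

The same splitting happens with every command, which is why names without spaces save a lot of typing.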

## Getting around

Path: `/<dir1>/<dir2>/<dir3>/foo.bar`
for example: `/home/ilya/src`

Absolute path:

* starts with `/`, aka `root`. Location relative to filesystem `root`.

Relative path:

* doesn't start with `/`. Location relative to the current directory.

### Aside

### Commands

* `pwd` - print working directory
* `cd <dir>` - change directory
* `mkdir [options] <dir>` - make new directory
* `ls [options] [<dir>...]` - list directory

Examples:
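
A short session tying these commands together, as a sketch. The directory names (`reads`, `raw`) are hypothetical and everything happens inside a scratch directory from `mktemp -d`.

```shell
# Scratch directory; $demo holds its absolute path.
demo=$(mktemp -d)

cd "$demo"            # absolute path: starts with /
pwd                   # prints where we are now

mkdir reads           # relative path: created inside the current directory
mkdir reads/raw
cd reads/raw          # relative path again
pwd                   # now ends in reads/raw

ls "$demo"            # list a directory by absolute path
ls -lah "$demo/reads" # the same with details, hidden files, readable sizes
```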

### Shortcuts

These are huge time savers. But they are nothing more than aliases.

* `~` - current user's home directory (`/home/<user>/` or `/Users/<user>/` on macOS)
* `.` - current directory
* `..` - parent directory
* `-` - previous directory, as in `cd -` (although in most other contexts it means `stdin`)
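
A quick sketch of the shortcuts in action. The `project/data` layout is invented for the example.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/project/data"
cd "$demo/project/data"

cd ..     # parent directory: we are now in project
pwd

ls .      # "." is the current directory, so this is the same as plain ls

cd -      # jump back to the previous directory (data) and print its path
pwd

echo ~    # the shell replaces ~ with the home directory before echo runs
```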

Other useful things:

- `pushd <dir>` - changes to `<dir>` and remembers the directory you came from on a stack
- `popd` - returns to the most recently remembered directory and removes it from the stack
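
A small sketch of the directory stack, assuming `bash` (or `zsh`), where `pushd`, `popd` and `dirs` are shell builtins. The directory names are hypothetical.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/project/data" "$demo/scratch"
cd "$demo/project"

pushd "$demo/scratch"        # go to scratch; project is remembered
pushd "$demo/project/data"   # go to data; scratch is remembered too
dirs -v                      # show the stack, current directory on top

popd                         # back to scratch
popd                         # back to project
pwd
```

Unlike `cd -`, which only remembers one previous directory, the stack can hold as many as you push.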

### Path and executables

Files that have the `x` bit set in their permissions are executable. These can be executed by typing their path at the prompt:

    $ /home/vasyapupkin/myprog1
    $ ./myprog1
    $ /bin/myprog1
    
or they can be executed by typing just their name at the prompt _if_ their location is listed in the `PATH` variable:

    $ echo $PATH
    $ myprog1
    
if unsure, use the `which` program to find the executable (if it exists!):

    $ which python
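
To see both ways of running a program, here is a sketch that builds a tiny script called `myprog1` in a scratch directory. The script body is made up, `chmod +x` is what sets the `x` bit, and `command -v` is the shell's built-in counterpart of `which`.

```shell
demo=$(mktemp -d)
cd "$demo"

# A two-line shell script; without the x bit it is just a text file.
printf '#!/bin/sh\necho "hello from myprog1"\n' > myprog1
chmod +x myprog1

# Run it by path.
./myprog1

# Put its directory at the front of PATH; now the bare name is enough.
PATH="$demo:$PATH"
myprog1

# Ask the shell where it found the program.
command -v myprog1
```

The `PATH` change only lasts for the current shell session.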

## Looking at things (well, files)

- `cat` outputs the contents of the files given as its arguments to `stdout`
- `less` does the same but in a humane way (pagination, search, scrolling, etc.)
- `man` displays the manual page for a given command
- `head` outputs the first n lines of a file (10 by default)
- `tail` outputs the last n lines of a file (10 by default)
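
A sketch of the non-interactive ones on a made-up 20-line file (`less` and `man` are interactive, so try those by hand):

```shell
demo=$(mktemp -d)
cd "$demo"

# Build a file with 20 numbered lines.
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > lines.txt

cat lines.txt          # everything at once
head lines.txt         # first 10 lines (the default)
head -n 3 lines.txt    # first 3 lines
tail -n 2 lines.txt    # last 2 lines
```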

## Creating, copying and moving stuff

### Create

Create an empty file (strictly speaking, `touch` updates the file's timestamps and creates the file only if it doesn't exist yet):

    touch <filename>

Create a directory:

    mkdir <dirname>

The usual path rules apply (see absolute vs relative paths). The handy `-p` switch creates any missing parent directories along the way:

    mkdir -p path/to/my/new/dir
    mkdir -p path/to/{one,two,three}
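
The `{one,two,three}` part is brace expansion, done by the shell before `mkdir` runs. A sketch assuming `bash` (plain `sh` may not expand braces); prefixing the command with `echo` shows what `mkdir` actually receives.

```shell
demo=$(mktemp -d)
cd "$demo"

# Show the expanded command without running it.
echo mkdir -p path/to/{one,two,three}

# Create all three directories, plus the missing parents path/ and path/to/.
mkdir -p path/to/{one,two,three}
ls path/to
```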

### Copy

Copying stuff:

    cp <source> <destination>

by default, `cp` only copies regular files and skips directories. To copy directories use the `-r` (recursive) option:

    cp -r <source_dir> <destination>

but watch for that trailing slash:

    cp -r <source>/ <destination>
    
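may behave differently. Why? BSD `cp` (the one on macOS) then copies the _contents_ of `<source>` rather than the directory itself, while GNU `cp` treats both forms the same.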

Globbing works as one would expect:

    cp <source>/*.txt <destination>
    
will copy all files ending with `.txt` to `<destination>`
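
A sketch of both cases with invented file names, in a scratch directory:

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir src dest
touch src/a.txt src/b.txt src/c.csv

# The shell expands src/*.txt to src/a.txt src/b.txt before cp runs,
# so c.csv is left alone.
cp src/*.txt dest
ls dest

# Copy the whole directory; "backup" does not exist yet,
# so it becomes a copy of src.
cp -r src backup
ls backup
```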

### Move

How to move stuff?

    mv <source> <destination>

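But what if we want to move a bunch of stuff?
Surely this should work:

    mv <source>/*.txt <destination>

and it does, as long as `<destination>` is an existing directory. What doesn't work is _renaming_ a bunch of files with wildcards (say, `mv *.txt *.md`): the shell expands the globs before `mv` ever sees them. WTF?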

Cheating way: install the `rename` program.
Won't work if you don't have admin rights.

Proper way: loop

    for f in *.txt; do mv "$f" <destination>; done

_HINT_: for a dry run replace `mv` with `echo`
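
A sketch of the loop, first as a dry run and then for real, using made-up file names. The last loop is an assumption about what you might want next: it renames the extension with the shell's `${f%.txt}` expansion, which strips a trailing `.txt` from the value of `f`.

```shell
demo=$(mktemp -d)
cd "$demo"
touch sample1.txt sample2.txt "sample 3.txt"
mkdir archive

# Dry run: echo prints each command instead of running it.
for f in *.txt; do echo mv "$f" archive/; done

# Real run: the quotes keep "sample 3.txt" in one piece.
for f in *.txt; do mv "$f" archive/; done

# Bulk rename inside archive: .txt becomes .tsv.
for f in archive/*.txt; do mv "$f" "${f%.txt}.tsv"; done

ls archive
```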

## Selecting what to show (filtering)

### Globbing (aka wildcards)

- `?` matches one (any) character
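- `*` matches any number of any characters _except_ the path separator `/` (and, by default, a leading `.`, so hidden files are skipped)
- `**` matches _any_ number of _any_ characters, `/` included, so it descends into subdirectories (in `bash` it has to be enabled with `shopt -s globstar`)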
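
Since the shell expands wildcards before the command runs, `echo` is a safe way to see what a pattern matches. A sketch with invented file names; the last line assumes `bash` 4 or newer and is silently skipped elsewhere.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p sub/deeper
touch a1.txt a2.txt a10.txt .hidden.txt sub/b.txt sub/deeper/c.txt

echo a?.txt      # exactly one character after "a": a1.txt and a2.txt
echo *.txt       # visible .txt files here; .hidden.txt and sub/ are skipped
echo .*.txt      # an explicit leading dot picks up .hidden.txt
echo */*.txt     # exactly one directory level down: sub/b.txt

# ** is an extension that has to be switched on first.
shopt -s globstar 2>/dev/null && echo **/*.txt
```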

### grep

`grep` stands for Global Regular Expression Print (from the `ed` command `g/re/p`). Regular expressions (regex) are an advanced and powerful way to match patterns.
`grep` can be thought of as a very versatile and efficient filter that can be configured to pass through only the results you want.
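
A few common ways to configure that filter, sketched on a made-up `genes.txt` table:

```shell
demo=$(mktemp -d)
cd "$demo"
printf '%s\n' 'gene1 chr1 100' 'gene2 chr2 250' 'GENE3 chr1 300' 'rRNA chr2 50' > genes.txt

grep chr1 genes.txt              # lines containing "chr1"
grep -i '^gene' genes.txt        # lines starting with "gene", ignoring case
grep -v chr1 genes.txt           # invert: lines NOT containing "chr1"
grep -c chr2 genes.txt           # count matching lines instead of printing them
grep -E '[0-9]{3}$' genes.txt    # extended regex: lines ending in three digits
```

Quoting the pattern keeps the shell from interpreting characters such as `*`, `$` or `{` before `grep` sees them.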

## Some plumbing: pipes, redirects and tee

* `|` (aka pipe) - sends the output of the left program to the input of the right program
* `tee` - used after a pipe: passes its input on to `stdout` while also saving a copy to a file
* `>` - redirects the output of the program to a file (overwriting the file if it exists)
* `>>` - same as `>` but _appends_ to the file if it exists
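
A sketch chaining these together on a made-up `fruit.txt`. `sort`, `uniq` and `wc` are standard tools used here only as something to pipe through.

```shell
demo=$(mktemp -d)
cd "$demo"
printf '%s\n' banana apple cherry apple > fruit.txt

# Pipe: sort the lines, then count how often each one occurs.
sort fruit.txt | uniq -c

# Redirect: save the de-duplicated list to a file (overwrites unique.txt).
sort fruit.txt | uniq > unique.txt

# Append: add one more line without touching what is already there.
echo durian >> unique.txt

# tee: keep a copy of the sorted lines in sorted.txt AND pass them on to wc.
sort fruit.txt | tee sorted.txt | wc -l
```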

## Text processing and analysis

## Practical things

### Downloading stuff from Internet

`wget` - loads of options and protocols supported. Read manpages for all options.

Let's use it to download the E. coli `.gff` file from NCBI (http://www.ncbi.nlm.nih.gov/genome/167):

    wget ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.gff.gz
    
and make sure it's where you expect it to be:

    ls -lah *.gff.gz

### Working with compressed files

Most NGS data formats are text based and, therefore, are highly compressible. For instance, a gzipped `.fastq` file can take 10-20% of the original space.

* `gzip` - compresses the file
* `gunzip` - uncompresses the file

By default both `gzip` and `gunzip` delete the original. To keep the original file use `zcat` or the `-c` flag for `gzip`/`gunzip`, which writes the result to `stdout` instead.

It's a perfect use case for `pipes`, so let's dig right in.
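
A minimal sketch on a tiny made-up `reads.fastq` (one read), so it runs without downloading anything. It uses `gunzip -c`, which does the same job as `zcat`.

```shell
demo=$(mktemp -d)
cd "$demo"
printf '%s\n' '@read1' 'ACGT' '+' 'IIII' > reads.fastq

# Compress in place: reads.fastq is replaced by reads.fastq.gz.
gzip reads.fastq
ls

# Peek at the first lines without creating an uncompressed copy on disk.
gunzip -c reads.fastq.gz | head -n 2

# Count lines the same way; the .gz file stays untouched.
gunzip -c reads.fastq.gz | wc -l
```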


### Working with compressed directories

`tar` is all you'll ever need
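
The three invocations you are most likely to need, sketched on a made-up `results` directory: `-c` creates, `-t` lists, `-x` extracts, `-z` runs the archive through gzip, and `-f` names the archive file.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir results
echo "sample1" > results/a.txt

# Pack the directory into a gzip-compressed archive.
tar -czf results.tar.gz results

# List what is inside without unpacking.
tar -tzf results.tar.gz

# Unpack into another directory (-C changes into it first).
mkdir unpacked
tar -xzf results.tar.gz -C unpacked
ls unpacked/results
```

Unlike `gzip`, `tar` leaves the original directory in place.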

## Putting it all together

## Permissions

### Optional: editing files (`nano` and `vim`)