# Basic Linux

## The Commandline

The commandline or 'terminal' is a text-based interface you can use to run programs and analyse your data. If this is your first time using one, it may seem daunting, but with just a few commands you'll start to see how it helps you get things done more efficiently.

## Getting started

When working on the commandline you will always work in a specific location or directory in the filesystem. So let's check that you're in the right place. Type the command below in the terminal window followed by the `Enter` key:

In [None]:
pwd

The `pwd` command stands for print working directory. It should display something similar to:

`/home/manager/course_data/linux/data`

This means you are working in the `data` directory found inside the `linux` directory, which is inside the `course_data` directory, and so on. There will be more on the Linux filesystem structure shortly.

Now continue through this tutorial, entering any commands that you encounter (highlighted in a grey box with a keyboard symbol) into your terminal window followed by the `Enter` key. Let's start by listing the contents of the current working directory:

In [None]:
ls

You should see 6 items:

![Output of the list command](images/ls_output.png)

## Commands, options and arguments

A command consists of three parts: the _command_, _options_, and _arguments_. Of these, the command is required while options and arguments are optional. Options are used to modify the default behavior of the command. Arguments are used to provide input to the command. 

`command -options arguments`

In the previous call to `ls`, the command is `ls` and there are no options or arguments passed to the `ls` command.

Now let's add some options. Here the command is `ls`, and `-a` and `-l` are options:

In [None]:
ls -a -l

**Note** that there is a space between the command `ls` and the options `-a` and `-l`. There is no space between the dashes and the letters `a` and `l`.

The `-a` option will list _all_ contents of a directory, including hidden files and directories. Hidden files and directories are not shown by default, which helps to prevent important data from being deleted by accident. The `-l` option will print additional information about each item in the directory (size, owner, date changed etc.).

You can also combine the `-a` and `-l` options into what appears to be a single `-al` option. It's almost always ok to do this for options which are made up of a single dash followed by a single letter.

In [None]:
ls -al
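As an optional check, the sketch below compares the separate and combined forms on a scratch directory. The variable `demo` and the file names are hypothetical, and `mktemp`, `touch` and `rm` are used only for setup and cleanup; they are not covered in this part of the tutorial.

```shell
# Scratch directory so nothing in the course data is touched (hypothetical names).
demo=$(mktemp -d)
touch "$demo/visible.txt" "$demo/.hidden.txt"

# Capture the listing produced by each form of the options.
separate=$(ls -a -l "$demo")
combined=$(ls -al "$demo")

# Show the listing, then report whether the two forms agree.
echo "$combined"
if [ "$separate" = "$combined" ]; then
    echo "Both forms give the same output"
fi

# Remove the scratch directory again.
rm -r "$demo"
```

The hidden file `.hidden.txt` appears in the listing only because `-a` is present.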

Now let's add an argument. Here the command is `ls`, the options are `-a` and `-l`, and `basic` is the argument:

In [None]:
ls -al basic

The argument instructs the `ls` command to display the content of the specified directory, in this case the `basic` directory instead of the current directory.
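A minimal sketch of the same idea on a scratch directory, assuming hypothetical names (`demo`, `example.txt`) that are not part of the course data:

```shell
# Build a small scratch directory containing a single file.
demo=$(mktemp -d)
mkdir "$demo/basic"
touch "$demo/basic/example.txt"

# With an argument, ls lists that directory rather than the current one.
listing=$(ls "$demo/basic")
echo "$listing"

# Remove the scratch directory again.
rm -r "$demo"
```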

There are some general points to remember that will make your life easier:

* Linux is case sensitive - typing `ls` is not the same as typing `LS`.
* `ls` is the letter `l` followed by the letter `s` (not the number one).
* Often when you have problems with Linux, it is due to a spelling mistake. Check that you have not misspelled a command or missed a space between the command and the options and arguments. Pay careful attention if typing commands across multiple lines.

## Files and directories

_Directories_ are the Linux equivalent of folders on a PC or Mac. They are organised in a hierarchy, so directories can have sub-directories and so on. Directories are very useful for organising your work and keeping your account tidy - for example, if you have more than one project, you can organise the files for each project into different directories to keep them separate. 

The start of the filesystem is known as the root directory and is denoted with the `/` symbol. The location or directory that you are in at any given time is referred to as the current working directory and is denoted with the `.` symbol.

Every file and directory on the computer is found in a specific location. The location is specified as a path to that file or directory and can be expressed as either the _relative path_ or the _absolute path_.

The _relative path_ is the location of a file or directory from the current working directory (.).

The _absolute path_ is the location of a file or directory from the start of the file system or root directory (/).

For example, for the file called `directory_structure.png` in the `basic` directory, the location or _relative path_ can be expressed as:

In [None]:
ls ./basic/directory_structure.png

The location of this file can also be expressed using the _absolute path_ as:

In [None]:
ls /home/manager/course_data/linux/data/basic/directory_structure.png

![Directory structure](images/directory_structure.png)
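The difference between the two kinds of path can be sketched on a scratch directory. The names `demo` and `example.png` are hypothetical. The sketch also uses `cd`, which is introduced later in this tutorial, inside `$( ... )` so that your own working directory is left unchanged.

```shell
# Build a small scratch hierarchy (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/basic"
touch "$demo/basic/example.png"

# Absolute path: starts at the root directory (/), so it works from anywhere.
absolute=$(ls "$demo/basic/example.png")

# Relative path: starts at the current working directory (.),
# so it only works after moving into the scratch directory.
relative=$(cd "$demo" && ls ./basic/example.png)

echo "absolute: $absolute"
echo "relative: $relative"

# Remove the scratch directory again.
rm -r "$demo"
```

Both commands refer to the same file; only the starting point of the path differs.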

## tree - display the directory hierarchy

The command `tree` can be used to recursively list or display the content of a directory in a tree-like format. It outputs the directory paths and files in each sub-directory and a summary of a total number of sub-directories and files.

To display the contents of the current directory, type:

In [None]:
tree

To display the contents of a specific directory, give its path as an argument:

In [None]:
tree /home/manager/course_data/linux

## cd - change directory

The command `cd` stands for change directory. The `cd` command will move you from the current working directory to another directory.

To move into the `basic` directory type the following command. Note, you'll remember this more easily if you type this rather than copying and pasting.

In [None]:
cd basic

Now look at the directory hierarchy of this directory:

In [None]:
tree
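To see how `cd` changes what `pwd` reports, here is an optional sketch on a scratch directory. The names `demo`, `before` and `after` are hypothetical. The commands run inside `$( ... )`, which captures their output in a subshell, so running this does not move your own terminal session out of the `basic` directory.

```shell
# Build a scratch directory with one sub-directory.
demo=$(mktemp -d)
mkdir "$demo/basic"

# Working directory after moving into the scratch directory.
before=$(cd "$demo" && pwd)

# Working directory after a further cd into the sub-directory.
after=$(cd "$demo" && cd basic && pwd)

echo "before: $before"
echo "after:  $after"

# Remove the scratch directory again.
rm -r "$demo"
```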

## man - getting help

It is not possible to remember all the options and arguments for all Linux commands! A useful command to obtain further information on any Linux command is the `man` command. For example, to get a full description and examples of how to use the `find` command type the following command in a terminal window.

In [None]:
man find

To exit the manual page, type `q`.

## find - find a file

The `find` command can be used to find files in a directory hierarchy that match specific criteria. For example, it can be used to recursively search the directory tree for files and directories whose names match a given name or pattern.

To find all files in the current directory (.) and all its subdirectories named `Pfalciparum.fa`:

In [None]:
find . -name "Pfalciparum.fa"

To find all files in the current directory (.) and all its subdirectories that have the suffix `.bed`:

In [None]:
find . -name "*.bed"

How many bed files did you find?
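The recursive behaviour of `find` can be sketched on a scratch hierarchy. The directory and file names below are hypothetical and are not the course data, so the result does not answer the question above.

```shell
# Build a scratch hierarchy with .bed files at three different depths.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/top.bed" "$demo/a/middle.bed" "$demo/a/b/deep.bed" "$demo/a/notes.txt"

# find searches the starting directory and every directory beneath it.
# Quoting the pattern stops the shell from expanding * before find sees it.
matches=$(find "$demo" -name "*.bed")
echo "$matches"

# Remove the scratch directory again.
rm -r "$demo"
```

All three `.bed` files are reported, whatever their depth, while `notes.txt` is skipped because it does not match the pattern.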

Can you construct a command to find all the subdirectories contained in the current directory? **Hint:** You may need to go back to reading the manual page for the `find` command and search for the `-type` option. Don't spend too much time on this as the solution will be provided and explained later. 

How many subdirectories did you find?

## Tab completion

Typing out directory or file names is really boring and you're likely to make typos which will at best make your command fail with a strange error and at worst overwrite some of your carefully crafted analysis.  _Tab completion_ is a trick which normally reduces this risk significantly.

List the contents of the Styphi directory:

In [None]:
ls Styphi

Now instead of typing out `ls Styphi/`, try typing `ls S` and then press the `tab` key (instead of `Enter`). The rest of the folder name should just appear.  

If you have two files or directories with similar names (e.g. `directory_structure.png` and `directory_structure2.png`) then you might need to give your terminal a bit of a hand to work out which one you want. Type the following and then press the `tab` key (instead of `Enter`).

In [None]:
ls -l d

In this case, when you press `tab` the terminal reads `ls -l directory_structure`. You then need to type `2` followed by another `tab`, and it works out that you meant `directory_structure2.png`.

## Tips

There are some short cuts for referring to directories:

* . Current directory (one full stop)
* .. Directory above in the hierarchy (two full stops)
* ~ Home directory (tilde)
* / Root of the file system (like C:\ in Windows)

Try the following commands. What do they do?

In [None]:
ls .

In [None]:
ls ..

In [None]:
ls ~
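If you want to compare your answers, the sketch below shows `.` and `..` on a scratch hierarchy. The names (`demo`, `parent`, `child` and the files) are hypothetical. The `cd` calls run inside `$( ... )`, so your own working directory is not changed.

```shell
# Build a scratch hierarchy: parent contains child and sibling.txt.
demo=$(mktemp -d)
mkdir "$demo/parent" "$demo/parent/child"
touch "$demo/parent/sibling.txt" "$demo/parent/child/inside.txt"

# From inside child, . is child itself and .. is parent.
here=$(cd "$demo/parent/child" && ls .)
above=$(cd "$demo/parent/child" && ls ..)

echo "ls . lists:"
echo "$here"
echo "ls .. lists:"
echo "$above"

# The shell replaces ~ with the path of your home directory.
echo ~

# Remove the scratch directory again.
rm -r "$demo"
```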

## Exercises

Many people panic when they are confronted with a Linux prompt! Don’t! All the commands you need to solve these exercises are provided above, so don't be afraid to make a mistake. If you get lost, ask an instructor. If you are already skilled at Linux, be patient: this is only a short exercise.

To begin, open a terminal window and navigate to the `basic` directory under the `linux` directory (remember to use the command `cd`) and then complete the exercises below.

1. List _all_ the contents of the `basic` directory. How many files did you find?
2. How many files are there in the `Styphi` directory?
3. How many files are there in the `Pfalciparum` directory?
4. Use the `find` command to find all gff files in the `linux` directory. How many did you find?
5. Use the `find` command to find all fasta files in the `linux` directory. How many did you find?

When you have completed these exercises move on to the next part of the tutorial, [grep and awk](grep_and_awk.ipynb). 