# HPC intro

## Files and directories

High-Performance Computing (HPC) systems typically run a flavor of the Linux operating system.  Although you can interact with these systems via a GUI up to a point, it is still very useful (and likely more productive) to know your way around the terminal.

In this tutorial, you will learn the basics of how to work with files and directories in the terminal, and hence also in Bash scripts.  Indeed, the commands that you execute in the terminal are the same as you will use to write job scripts that run on the compute nodes of the HPC clusters you have access to.

### Your home directory

When you log in to our HPC system, or to any Linux system for that matter, your session starts in your home directory.  However, since this tutorial is hosted by Open OnDemand, you will start this session in your data directory.  Execute the following command to change to your home directory.  Details will be discussed later.

In [None]:
cd

### ls: directory listings

You can check the contents of a directory with the `ls` (list) command.

In [None]:
ls

You likely don't see any output, but that is normal.  When you just start out as a new user, you don't have any files or directories.  Well, actually, you do.  Some files and directories are hidden because their names start with a '.'.  You can show these using the `-a` (all) option of the `ls` command.

In [None]:
ls -a

Please don't modify or delete these files (unless you *really, really* know what you are doing).  If you make a mistake, you may not be able to log in anymore.  Support staff will pretend to be sympathetic, but they will make fun of you behind your back.

`ls` and other Bash commands often have many options to fine-tune their behaviour.  As mentioned, `-a` will reveal hidden files, while `-l` (long) will show more information on the files and directories such as the permissions, the owner and group, the file size and the last modification date.

You can combine those options, e.g., `-a -l`.

In [None]:
ls -a -l
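
Short options can usually be bundled behind a single dash, so `ls -al` does the same as `ls -a -l`.  The sketch below illustrates this in a throwaway directory; the `sandbox` variable and the file names are made up for the example, and nothing in your own directories is touched.

```bash
# Hypothetical throwaway directory, so your own files are left alone.
sandbox=$(mktemp -d)
touch "$sandbox/visible.txt" "$sandbox/.hidden"

plain=$(ls "$sandbox")       # lists only visible.txt
all=$(ls -a "$sandbox")      # also lists ., .. and .hidden
echo "$plain"
echo "$all"

# Bundled short options: equivalent to ls -a -l.
ls -al "$sandbox"
```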

### man and --help: getting help

Although you will soon have memorized the most commonly used options for various commands, you may still need information on them.  For most Linux commands, this is straightforward: you can use the `--help` option for many, the `-h` option for others.

In [None]:
ls --help

The Linux system comes with documentation as well.  You can access it using the `man` (manual) command.

In [None]:
man bash

For many commands there will even be a list of examples to get you started.
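
Commands that are built into Bash itself, such as `cd`, are usually documented by Bash's own `help` builtin rather than by a separate manual page.  A minimal sketch, assuming `bash` is available on your search path:

```bash
# `type` reports whether a command is a shell builtin or a program on disk.
bash -c 'type cd'

# Builtins are documented by the `help` builtin; keep only the synopsis line.
first_line=$(bash -c 'help cd' | head -n 1)
echo "$first_line"
```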

### Editing files

There are many ways to create files.  You can edit files directly on the system itself, or you can do that on your own laptop and desktop, and then transfer them to the HPC infrastructure.  In this section, we will concentrate on the first option, creating a file on the HPC system.

Again, there is a plethora of options of which we will mention only three:
  * `nano`: a quite friendly editor that is straightforward to use;
  * `vim` or `emacs`: veritable powertools with many features, but a considerable learning curve.

For now, we will start with `nano` as it is the most accessible.  In the screencast below you will see how to do that.

Click to watch the [**video**](https://youtu.be/UZefby-10u4).

When you've watched the screencast, open a terminal and try it yourself.  If you don't know how to open a terminal, watch the video below.

Click to watch the [**video**](https://youtu.be/MzmxjQ6tT9E).

Use `nano` to create a Bash script `hello.sh` that writes "hello world!" to standard output.  When you are done and run `ls`, you should now see the script.

In [None]:
ls
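
For reference, the finished script might look like the sketch below.  It assumes that the first line is the `#!/usr/bin/env bash` line, and it writes the file into a throwaway directory (the hypothetical `sandbox` variable) rather than with `nano`, so your own `hello.sh` is not overwritten.

```bash
# Hypothetical throwaway directory, so your own hello.sh is left alone.
sandbox=$(mktemp -d)

# Write the two-line script; the quoted 'EOF' keeps the text literal.
cat > "$sandbox/hello.sh" <<'EOF'
#!/usr/bin/env bash
echo "hello world!"
EOF

# Run the script with the Bash interpreter and capture what it prints.
output=$(bash "$sandbox/hello.sh")
echo "$output"
```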

Of course, you may wonder why you can't simply use the editor that comes with JupyterLab to create and edit your files.  In theory, you could.  However, this is not the ideal way to access the infrastructure: it is pretty expensive, so we only use it for this tutorial.

### less: viewing files

Often, you quickly want to view the contents of a file without modifying it.  Here, the `less` command is your friend.  Technically, it is called a pager, since it allows you to view a (long) file one page at a time.  Watch the video below to see it in action.

Click to watch the [**video**](https://youtu.be/_lsUZc8Qb3k).

Remember, to exit less, just press the 'q' key.  Head over to the terminal and try it yourself to view the contents of the `hello.sh` file.

### mkdir: create a directory

To organize your files and your work, you typically use folders, also called directories.  It is trivial to create a new one using the `mkdir` (make directory) command.

In [None]:
mkdir my_project

When you list the contents of your home directory, you will now see the directory `my_project` as well.

In [None]:
ls

Note that you will get an error when you try to create a directory that already exists.

In [None]:
mkdir my_project
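
If you would rather not get that error, the `-p` (parents) option makes `mkdir` succeed silently when the directory already exists, and it also creates any missing parent directories.  A small sketch in a throwaway directory; `sandbox` and the `data/raw` subdirectories are made up for the example.

```bash
# Hypothetical throwaway directory, so your own files are left alone.
sandbox=$(mktemp -d)

mkdir "$sandbox/my_project"

# No error this time, even though the directory already exists.
mkdir -p "$sandbox/my_project"

# Creates both data and data/raw in one go.
mkdir -p "$sandbox/my_project/data/raw"
ls "$sandbox/my_project/data"
```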

### mv: move and rename

You are probably used to moving files around your computer by dragging them from one directory to another using the mouse.  In a terminal, you can use the `mv` (move) command for that purpose.

For instance, suppose that the `hello.sh` script is part of the project for which you created the `my_project` directory.  You can move it there easily by giving `mv` the file name `hello.sh` and the destination directory `my_project`.

In [None]:
mv hello.sh my_project

When you list the contents of the directory, you see that the `hello.sh` file is no longer in the current directory.

In [None]:
ls

Let's verify that it is indeed in the `my_project` directory by giving an argument to `ls`.

In [None]:
ls my_project

Perhaps the name `my_project` for our directory is a bit too generic, and you would prefer a different name.  This is easy: you can rename files and directories using `mv` as well.  Again, specify the current name `my_project` and the new name `hello_project` as arguments to `mv`.

In [None]:
mv my_project hello_project

Checking the content of the directory shows that the name was indeed changed to `hello_project`.

In [None]:
ls
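
`mv` also accepts several source files at once, as long as the last argument is an existing directory.  The sketch below repeats the move and the rename in a throwaway directory; `sandbox` and `notes.txt` are made up for the example.

```bash
# Hypothetical throwaway directory, so your own files are left alone.
sandbox=$(mktemp -d)
mkdir "$sandbox/my_project"
touch "$sandbox/hello.sh" "$sandbox/notes.txt"

# Move two files at once: the last argument is the destination directory.
mv "$sandbox/hello.sh" "$sandbox/notes.txt" "$sandbox/my_project"

# Rename the directory: the destination name does not exist yet.
mv "$sandbox/my_project" "$sandbox/hello_project"

contents=$(ls "$sandbox/hello_project")
echo "$contents"
```

Be aware that `mv` normally replaces an existing destination file without asking; the `-i` (interactive) option makes it ask for confirmation first.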

### pwd: where am I?

Sometimes you want to verify which directory you are in, and that is what the `pwd` (print working directory) command will tell you.

In [None]:
pwd

As you can see, you are in your home directory.

### cd: navigating directories

To move to another directory, you can use the `cd` (change directory) command.  For instance, to go into the `hello_project` directory, you simply give the destination as an argument to `cd`.

In [None]:
cd hello_project

Use `pwd` to verify that you are indeed in that directory.

In [None]:
pwd

Now, head over to the terminal.  Take a close look at your current prompt.  This is the string of characters after which you type commands.  You will notice that it shows a `~`.  This is an abbreviation for your home directory.  Now execute the `cd hello_project` command in the terminal and note the change to the prompt.  It will now show `~/hello_project`, i.e., your present working directory.


So, just by looking at the prompt in a terminal, you can always see where you are.

You can view the file system as a (mathematical) tree.  The `hello_project` directory is in your home directory, so the latter is the parent directory of `hello_project`.  This induces the notion of "up" and "down": you go up to the parent directory, or from your home directory down into `hello_project`.  From that perspective, it makes sense to call `hello_project` a subdirectory of your home directory.

To move up to the parent directory, you use `..` as a destination.

In [None]:
cd ..

Confirm that you are back in your home directory.

In [None]:
pwd

A useful shortcut: using `cd` without any arguments will always bring you to your home directory, regardless of what your current working directory is.
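
Another shortcut that is not covered above is `cd -`, which jumps back to the directory you were in before the last `cd`.  The sketch below walks down, up and back again in a throwaway directory (the hypothetical `sandbox` variable) and finally returns to where it started.

```bash
start=$PWD                        # remember where you are now
sandbox=$(mktemp -d)              # hypothetical throwaway directory
mkdir "$sandbox/hello_project"

cd "$sandbox/hello_project"       # go down into the subdirectory
cd ..                             # go up to the parent directory
parent=$PWD

cd -                              # back to the previous directory; prints its name
previous=$PWD

cd "$start"                       # return to where you started
```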

### Home, data and scratch directory

Besides your home directory, you have access to two more personal directories: the data and scratch directories.

| directory | environment variable | quota | purpose |
|-----------|----------------------|-------|---------|
| home      | VSC_HOME             | 3 GB  | configuration files |
| data      | VSC_DATA             | 75 GB | scripts, data       |
| scratch   | VSC_SCRATCH          | 500 GB | temporary data/intensive I/O |

For a thorough discussion of the purpose, strengths and weaknesses of these various directories, be sure to read the [documentation](https://docs.vscentrum.be/en/latest/access/where_can_i_store_what_kind_of_data.html).

The environment variables mentioned in the table allow you to easily navigate to these directories using the `cd` command.

In [None]:
cd $VSC_DATA

Note the `$` prefixed to the variable name: this lets you use the *value* of the variable named `VSC_DATA`, so don't forget to type that.

Also note that Bash is case sensitive, so upper and lower case matter!
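
The difference between a variable's name and its value can be sketched as follows.  `MY_DIR` is a made-up variable for the example, and the `${NAME:-fallback}` form, which is not used elsewhere in this tutorial, prints a fallback text when a variable is not set.

```bash
# MY_DIR is a hypothetical variable that holds a throwaway directory.
MY_DIR=$(mktemp -d)

literal=$(echo MY_DIR)       # without $: just the text MY_DIR
value=$(echo "$MY_DIR")      # with $: the directory stored in the variable
echo "$literal"
echo "$value"

# Prints the value of VSC_DATA, or the fallback text if it is not set.
echo "${VSC_DATA:-VSC_DATA is not set on this system}"

# A different name: lower case vsc_data is unset unless you defined it yourself.
echo "${vsc_data:-vsc_data is not set}"
```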

Verify that you are in the `VSC_DATA` directory using `pwd` and list the files.

In [None]:
pwd

In [None]:
ls

You will see a directory "ondemand" as well as this tutorial.

Similarly, you can change to the scratch directory.

In [None]:
cd $VSC_SCRATCH

In [None]:
pwd

In [None]:
ls

This directory will likely be empty, so you won't see any output for the command above.

Change back to the home directory, and move the `hello_project` directory to the data directory.  After that, change to that directory.

In [None]:
cd

In [None]:
mv hello_project $VSC_DATA

In [None]:
cd $VSC_DATA/hello_project

In [None]:
pwd

### cp: copy files and directories

To copy files and directories, you can use the `cp` (copy) command.  Simply specify the name of the file you want to copy and the directory and/or name you want to copy it to.  For instance, you can copy the script `hello.sh` to `bye.sh`.

In [None]:
cp hello.sh bye.sh

If you feel like practising your `nano` skills a bit more, you can now modify `bye.sh` so that it writes "bye world!" to the screen, rather than "hello world!".

You can also copy an entire directory by adding the `-r` (recursive) flag.
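
A recursive copy can be sketched as follows, again in a throwaway directory; `sandbox` and `hello_project_copy` are made up for the example.

```bash
# Hypothetical throwaway directory, so your own files are left alone.
sandbox=$(mktemp -d)
mkdir "$sandbox/hello_project"
echo 'echo "hello world!"' > "$sandbox/hello_project/hello.sh"

# Copy the directory and everything in it; the original stays in place.
cp -r "$sandbox/hello_project" "$sandbox/hello_project_copy"

ls "$sandbox" "$sandbox/hello_project_copy"
```

Without `-r`, `cp` refuses to copy a directory and reports an error instead.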

### Running your own script

You created the `hello.sh` and `bye.sh` scripts, but you didn't do anything with them yet.  Of course, you would want to run them.  Since they are Bash scripts, you would use the Bash interpreter to run them.

In [None]:
bash hello.sh

### chmod: change permissions

It would be more convenient not to have to type `bash` each time you want to execute your script.  Remember the first line?  That was:
```bash
#!/usr/bin/env bash
```
That line tells the system that the script `hello.sh` should be executed by the `bash` interpreter... if the file were executable.  Let's have a look at the file permissions.

In [None]:
ls -l *.sh

Note: `*.sh` is called a "glob".  It matches all files in this directory that have names ending in `.sh`.

As you can see, the output of the `ls` command starts with `-rw-r--r--`.  These are the permissions of the file.  Actually, it consists of four parts:
  * `-`: this means that `hello.sh` is a file, if it were a directory there would be a `d`;
  * `rw-`: these are the permissions of the owner of the file;
  * `r--`: these are the permissions for members of the group of the file (ignore this for now);
  * `r--`: these are the permissions for everyone else on the system.

Roughly speaking, there are three permissions:
  * `r`: read permission, i.e., permission to view the contents of the file;
  * `w`: write permission, i.e., permission to modify the file;
  * `x`: execute permission, i.e., permission to run the file.

You can now give execute permission for the user and the group to the Bash scripts.

In [None]:
chmod ug+x *.sh

You can check that the permissions have changed.

In [None]:
ls -l 
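
Since the scripts are now executable, you can run them directly.  The `./` prefix tells Bash to look for the script in the current directory, which is normally not on the search path for commands.

In [None]:
./bye.sh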
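
The whole cycle of creating a script, making it executable and running it can be sketched in a throwaway directory.  The `sandbox` variable is made up for the example, and the sketch assumes that the temporary directory allows running programs.

```bash
# Hypothetical throwaway directory, so your own scripts are left alone.
sandbox=$(mktemp -d)

cat > "$sandbox/hello.sh" <<'EOF'
#!/usr/bin/env bash
echo "hello world!"
EOF

ls -l "$sandbox/hello.sh"        # permissions before the change

# Give the owner (u) execute permission (x).
chmod u+x "$sandbox/hello.sh"
ls -l "$sandbox/hello.sh"        # the owner part now contains an x

# Run the script by its path, without typing bash in front of it.
output=$("$sandbox/hello.sh")
echo "$output"
```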

### rm: removing files and directories

You perhaps noticed the backup file `bye.sh~`, and maybe you want to get rid of it at some point.  You can remove a file using the `rm` (remove) command.

In [None]:
rm bye.sh~

Note that removing files and directories is typically **irreversible** on a Linux system, so make sure that you don't lose any work when removing something.
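
One way to be careful, especially when a glob is involved, is to list what the pattern matches before you remove anything.  A small sketch in a throwaway directory; `sandbox` and the file names are made up for the example.

```bash
# Hypothetical throwaway directory, so your own files are left alone.
sandbox=$(mktemp -d)
touch "$sandbox/hello.sh" "$sandbox/bye.sh" "$sandbox/bye.sh~"

# Preview: list the files that match the pattern (names ending in ~).
ls "$sandbox"/*~

# Only then remove them, using exactly the same pattern.
rm "$sandbox"/*~

remaining=$(ls "$sandbox")
echo "$remaining"
```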

Just to practice, you can make a copy of the `hello_project` directory, remove the original, and rename the backup.

First change to the parent directory of the `hello_project` directory.

In [None]:
cd ..

List the files and directories.

In [None]:
ls

Recursively copy the `hello_project` directory to a directory called "hello_project_bak".

In [None]:
cp -r hello_project hello_project_bak

Now you can recursively remove the original directory.

In [None]:
rm -r hello_project

Since this was just an exercise, rename the `hello_project_bak` directory back to "hello_project".

In [None]:
mv hello_project_bak hello_project

## Summary

At this point, you have mastered the basics of directory navigation and file management.  You have already encountered:

  * `ls`: showing the contents of your directories;
  * `man`: get information on Linux commands;
  * `nano` and `less`: edit and view your files;
  * `pwd`: show the present working directory;
  * `cd`: change directory, i.e., navigate the file system;
  * `mkdir`: create directories;
  * `mv`: move and rename files and directories;
  * `cp`: copy files and directories;
  * `chmod`: change permissions on files and directories;
  * `rm`: remove files and directories.
  
You also learned about the following concepts:

  * your home, data and scratch directories;
  * how to run your scripts in the terminal.

## Where to go from here?

Of course, there is a lot more to learn, but what exactly depends a little bit on your domain and interests:

  * you want to [run Python scripts](005_running_python.ipynb);
  * you want to [run R scripts](running_r_scripts.ipynb);
  * you want to use [software modules](003_software_modules.ipynb).