# Storage

## Overview:
- **Teaching:** 10 min
- **Exercises:** 0 min

**Questions**
- What storage is available?
- How does this affect my workflow?
- Where should I keep my data?


**Objectives**
- Understand the storage structure of the new system.
- Know where to keep your data, and how to stage data for runs.
- Understand the costs associated with storage.


## Filesystem on Phase 1

The filesystem on Janus, the Phase 1 cloud HPC system, has the following layout:

![storage](../images/storage.png)

The areas of the filesystem you should know about are:

* ```/shared/home```: 2TB Azure Managed Disk. Each user is given a home folder with a strict quota of 5GB. You can check your quota usage with ```quota -s```. This area can be accessed from both the login and compute nodes. This disk is backed up daily, and is suitable for storing, for example:
    - Code/scripts for your calculations
    - Template jobscripts

* ```/campaign/```: 16TB Azure NetApp Files Standard. This is a mid-performance filesystem which should be used for files required during your current project workflow, for example any datasets you need for the part of the project you are currently working on. It is <font size="3">**not backed up**</font>, and should not be used for storing important results; make sure you have capacity elsewhere to back up any important data. Use this space to stage data to ```/scratch/``` before a calculation, and move the results off ```/scratch/``` and back to ```/campaign/``` once the run has completed. This space can be accessed from both the login and compute nodes.

* ```/scratch/```: 16TB Azure NetApp Files Ultra. This is a high-performance filesystem which can be used during your calculation runs. There is no quota; **however**, because this storage type is expensive, space is limited and shared between all users. You should stage the data required for a run from ```/campaign/``` into this space. Data should only be left on ```/scratch/``` during an active run, and should be copied back to ```/campaign/``` following its successful completion. It is <font size="3">**not backed up**</font> and should not be used for storing data. This space can be accessed from both the login and compute nodes.

* ```/apps/```: 1TB Azure Managed Disk. This space contains centrally compiled software available to all users.

* ```/u/```: The University `H` drive is mounted and can be accessed from the login nodes only.

* ```/mnt/resource/```: Some compute instances have fast local storage in this area, accessible only from the compute node itself, e.g. HBv3 (450GB), HBv2 (450GB), HB (450GB), HC44 (650GB).


## Exercise: Quota and disk usage

We saw earlier that you have a 5GB quota for your home area. Check your home area quota using the following commands:

```bash
$ quota
$ quota -s
```

You can also use the `df` command we saw earlier to check partition size and usage information:
```bash
$ df -h
```
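`df` reports on whole partitions. To see how much space a single directory takes up, `du` can help. The following is a minimal sketch, not part of the system's tooling: the function name `usage_percent` is hypothetical, and the default of 5 simply mirrors the 5GB home quota mentioned above. The figure `du` reports may differ slightly from what `quota` accounts for.

```bash
# Hypothetical helper: print the percentage of a quota (in GB) that a
# directory is using, rounded down to a whole number.
usage_percent() {
    local dir="$1"
    local quota_gb="${2:-5}"   # assumed default: the 5GB home quota
    local used_kb
    # du -sk prints "<kilobytes><TAB><path>"; keep only the first field
    used_kb=$(du -sk "$dir" | cut -f1)
    echo $(( used_kb * 100 / (quota_gb * 1024 * 1024) ))
}
```

For example, `usage_percent "$HOME"` prints your home directory usage as a percentage of 5GB.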

### Environment Variables

There are several environment variables to help you navigate the filesystem.

* ```$HOME```: ```/shared/home/<username>```, your home directory

* ```$SCRATCH```: ```/scratch/<username>```, your scratch directory

* ```$CAMPAIGN```: ```/campaign/<username>```, your campaign directory

* ```$BUCSHOME```: ```/u/u/<username>```, your University `H` folder
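These variables make it easier to script the staging pattern described earlier: copy inputs from campaign to scratch before a run, then copy results back and clear scratch afterwards. The sketch below is one possible way to do this. The function names `stage_in` and `stage_out` and the `input`/`output` directory layout are assumptions made for illustration, not something the system provides.

```bash
# Hypothetical helpers; assumes each project keeps its data under
# $CAMPAIGN/<project>/input and writes results to $SCRATCH/<project>/output.

# Copy a project's input directory from campaign storage to scratch.
stage_in() {
    local project="$1"
    mkdir -p "$SCRATCH/$project"
    cp -r "$CAMPAIGN/$project/input" "$SCRATCH/$project/"
}

# Copy results back to campaign storage and, only if the copy succeeded,
# remove the project's scratch directory.
stage_out() {
    local project="$1"
    mkdir -p "$CAMPAIGN/$project"
    cp -r "$SCRATCH/$project/output" "$CAMPAIGN/$project/" \
        && rm -rf "${SCRATCH:?}/${project:?}"
}
```

In a jobscript you might call `stage_in my_project` before the calculation and `stage_out my_project` after it, where `my_project` is a placeholder name. The `${VAR:?}` form makes the `rm` fail rather than run if either variable is empty.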

## Exercise: Go home

How can we use the environment variables we've used above to return to our `home` directory?

[Solution]()

## Solution: Go home

We can use the `$HOME` variable in exactly the same way as we used `$SCRATCH`:

```bash
cd $HOME
pwd
```