# Running on NERSC

Let's try to run `displayRank.jl` at NERSC. You should log into `cori.nersc.gov` now and we'll set things up.

## One time setup

Perform the following steps once to set things up.

### Clone this git repository
In your NERSC home area, do 
```bash
git clone https://github.com/HighVelocityJuliaAnalysis/TryMPI.git
```

That will download this repository to your home area.

### Prepare Julia

We need to prepare Julia to install the packages we'll use. Do the following...

```bash
cd TryMPI/notebooks
julia --project
```

Once Julia starts, do the following in package mode (the `]` puts Julia in package mode).
```julia
]
instantiate
build --verbose
```

The `build --verbose` command will produce a bit of output. You should see a line with `Info: using system MPI`. If you don't see that line, then something is wrong.

Exit Julia with Ctrl-d.
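
If you would rather not scan the build output by eye, you can save it to a file and search for that line. The following is only a sketch: `check_system_mpi` and `build.log` are hypothetical names made up for this example, and it assumes the build output contains the text `using system MPI` exactly as described above.

```bash
# Hypothetical helper: succeeds if the given log file mentions the system MPI.
check_system_mpi() {
    grep -q "using system MPI" "$1"
}

# Possible usage from ~/TryMPI/notebooks (commented out so nothing runs by accident):
#   julia --project -e 'using Pkg; Pkg.instantiate(); Pkg.build(verbose=true)' 2>&1 | tee build.log
#   check_system_mpi build.log && echo "system MPI found" || echo "system MPI NOT found"
```

The `julia -e` line is the non-interactive equivalent of the `instantiate` and `build --verbose` package-mode commands.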

## Running interactively

`cd` back to `~/TryMPI/notebooks` if you aren't already there.

On your laptop, we used `mpiexecjl` to run Julia scripts with MPI. On Cori, we instead use a command called `srun`. 

You are logged into a Cori *login* node. The login nodes are used to write code and prepare software. MPI programs will not run on the login node. You can see for yourself with,

```bash
srun -n 4 julia --project displayRank.jl
```

You'll likely get an error like,

```
srun: error: No architecture specified, cannot estimate job costs.
srun: error: Unable to allocate resources: Unspecified error
```

To run Julia with MPI interactively, we need to request an *interactive worker node*. The worker node is a Cori node used for running jobs and can run MPI. You can request a node with,

```bash
salloc --qos=interactive -C haswell --time=10 --nodes=1
```

You are requesting one *Haswell* node for 10 minutes of interactive use. Cori has two types of worker node, *Haswell* and *KNL*. The Haswell nodes are faster and have more memory, so we'll use those.

It can take a while for the `salloc` command to run. You should eventually see something like,

```
salloc: Pending job allocation 61073043
salloc: job 61073043 queued and waiting for resources
salloc: job 61073043 has been allocated resources
salloc: Granted job allocation 61073043
salloc: Waiting for resource configuration
salloc: Nodes nid00284 are ready for job
lyon@nid00284:~/TryMPI/notebooks> 
```

Note that your command prompt has changed. It doesn't say `cori` anymore. It will now say the name of the worker node (`nid00284` in this case). 
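
Besides looking at the prompt, you can ask the shell whether it is running inside a Slurm allocation. This is a small sketch, assuming that Slurm sets the `SLURM_JOB_ID` environment variable in the shell that `salloc` gives you; `on_worker_node` is a hypothetical helper name, not something provided by NERSC.

```bash
# Hypothetical helper: succeeds if this shell is inside a Slurm allocation.
# Assumption: SLURM_JOB_ID is set inside the allocation and unset on a login node.
on_worker_node() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if on_worker_node; then
    echo "Inside allocation ${SLURM_JOB_ID} on $(hostname)"
else
    echo "No allocation found; run salloc first"
fi
```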

You may now try,

```bash
srun -n 4 julia --project displayRank.jl
```

You may request up to 64 ranks with the `-n` option. Try `calc_e_gather.jl` and `calc_e_reduce.jl` too. 
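
To run all three scripts with the same number of ranks, a small shell function can save some typing. Treat this as a sketch rather than a recommended tool: `run_examples`, `MAX_RANKS`, and `DRY_RUN` are hypothetical names invented for this example, and the limit of 64 is simply the number quoted above.

```bash
MAX_RANKS=64   # rank limit quoted in the text above

# Hypothetical helper: run each example script with the given number of ranks.
# Set DRY_RUN=1 to print the srun commands instead of executing them.
run_examples() {
    local ranks="$1"
    if [ "$ranks" -lt 1 ] || [ "$ranks" -gt "$MAX_RANKS" ]; then
        echo "ranks must be between 1 and $MAX_RANKS" >&2
        return 1
    fi
    local script
    for script in displayRank.jl calc_e_gather.jl calc_e_reduce.jl; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "srun -n $ranks julia --project $script"
        else
            srun -n "$ranks" julia --project "$script"
        fi
    done
}
```

For example, `DRY_RUN=1 run_examples 8` prints the three commands without running them, and `run_examples 8` runs them on the worker node.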

When you are done with the interactive worker node, you may type `exit` to go back to the Cori login node.

Or, when your 10 minutes are up, you will be logged out of the worker node automatically. You'll see a message like,

```
salloc: Job 61073043 has exceeded its time limit and its allocation has been revoked
```

If you want more time, issue the `salloc` command again. 
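
The `--time` value is in minutes, so you can also ask for a longer session when you issue the command again. Here is a sketch of a helper that builds the command line for a given number of minutes; `salloc_cmd` is a hypothetical name, and the other options are copied from the `salloc` command used earlier. It only prints the command, so you can check it before running it.

```bash
# Hypothetical helper: print the salloc command for a given number of minutes
# (defaults to 10, the value used earlier in this document).
salloc_cmd() {
    local minutes="${1:-10}"
    echo "salloc --qos=interactive -C haswell --time=${minutes} --nodes=1"
}

salloc_cmd 30   # prints the command for a 30-minute request
```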





## Running batch jobs

For running big programs, we'll use batch jobs instead of interactive nodes. This section has not been written yet.