# Workshop on interactive analyses on NESH

## Jupyter on HPC

### Setup of Jupyter env

Follow [https://git.geomar.de/python/jupyter_on_HPC_setup_guide](https://git.geomar.de/python/jupyter_on_HPC_setup_guide#install-the-base-environment-and-jupyterlab) to install a Conda-based environment with JupyterLab.
Once `conda` is installed, the essential step is:

```shell
# on the HPC system
conda install -n base jupyterlab nb_conda_kernels
```

Then, add a conda environment containing the software used in your analysis:
```shell
# on the HPC system
conda create -n python3_env -c conda-forge python=3.7 dask distributed dask-jobqueue matplotlib numpy scipy zarr ipykernel
```
_(**Note** the `ipykernel` package which allows for using this env as a Jupyter kernel.)_
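
To check that the new environment really works as a kernel, it can help to run a small import test in a notebook that uses the `python3_env` kernel. The following is only a sketch: the helper name `find_missing_modules` and the list `EXPECTED_MODULES` are made up for this example, and the list simply mirrors the packages from the `conda create` command above.

```python
import importlib.util

# Import names of the packages installed into `python3_env` above.
# (Note that the import name of dask-jobqueue is `dask_jobqueue`.)
EXPECTED_MODULES = [
    "dask", "distributed", "dask_jobqueue",
    "matplotlib", "numpy", "scipy", "zarr", "ipykernel",
]


def find_missing_modules(module_names):
    """Return the names from `module_names` that cannot be found."""
    return [
        name for name in module_names
        # find_spec() returns None if a top-level module is not installed.
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = find_missing_modules(EXPECTED_MODULES)
    print("missing modules:", missing or "none")
```

If the list of missing modules is not empty, the notebook is probably running in a different kernel than intended, or the environment was created without some of the packages.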

### Test by starting JupyterLab on frontend

_(**Don't** work like this in production: the frontend is a login node shared with all other users.)_


On the frontend, start JupyterLab from the `base` environment:
```shell
# on the HPC system
conda activate base
jupyter lab --ip 127.0.0.1
```

Then, on your laptop, open a web browser that proxies through an SSH tunnel to the same frontend.
[There's a script for this,](https://git.geomar.de/python/jupyter_on_HPC_setup_guide#wrapped-in-a-script) but the essentials are:

```shell
# on your laptop
ssh -f -D localhost:54321 user@host.example.com sleep 15
chromium-browser --proxy-server="socks5://localhost:54321"
```

### Start Jupyter on a compute node

[Check the guide for the job script.](https://git.geomar.de/python/jupyter_on_HPC_setup_guide#start-jupyterlab-on-a-compute-node-of-an-hpc-centre)

On the HPC frontend, submit a job (noting the job-id):

```shell
# on the NESH frontend
qsub nesh-linux-cluster-jupyterlab.sh \
    -l elapstim_req=<hh:mm:ss> \
    -b <node-no> \
    -l cpunum_job=<cpu-no> \
    -l memsz_job=<mem-size> \
    -q <batch-class>
```

Then check whether JupyterLab is ready and find out which URL to connect to:

```shell
# on the NESH frontend
bash nesh-linux-cluster-jupyterlab.sh <jobid>
```

As soon as you have the JupyterLab URL, connect to it via the tunneled browser:
```shell
# on your laptop
bash run_chromium_through_ssh_tunnel.sh user@host.example.com <URL-from-JupyterLab>
```


## Example Problem for the rest of the first session

Calculate the number $\pi$ using a Monte Carlo method.
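
One common formulation, assumed here because the text above does not fix a particular method, is to draw $N$ points uniformly at random in the unit square and count the number $N_\text{inside}$ of points that fall inside the quarter circle of radius 1:

$$
\frac{N_\text{inside}}{N} \approx \frac{\text{area of quarter circle}}{\text{area of unit square}} = \frac{\pi}{4}
\quad\Rightarrow\quad
\pi \approx 4 \, \frac{N_\text{inside}}{N}
$$

The statistical error of this estimate only shrinks like $1/\sqrt{N}$, so many samples are needed. This is what makes the problem a convenient test case for parallelization.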

We'll tackle the problem in different ways:

- using a single CPU with NumPy

- with Dask, parallelized on a single node or on the frontend

- with Dask on many compute nodes
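
As a starting point for the first variant (single CPU, NumPy), here is a minimal sketch. It assumes the usual "points in the unit square" formulation of the Monte Carlo method. The function name `estimate_pi` and its parameters are chosen for this example and are not taken from the workshop material.

```python
import numpy as np


def estimate_pi(num_samples, seed=None):
    """Estimate pi from uniformly random points in the unit square."""
    rng = np.random.default_rng(seed)
    # Draw x and y coordinates uniformly from [0, 1).
    xy = rng.random((num_samples, 2))
    # A point lies inside the quarter circle if x^2 + y^2 <= 1.
    inside = (xy ** 2).sum(axis=1) <= 1.0
    # The fraction of points inside approximates pi / 4.
    return 4.0 * inside.mean()


if __name__ == "__main__":
    print(estimate_pi(1_000_000, seed=42))
```

All samples are held in memory at once here, so `num_samples` is limited by the memory of a single machine. Splitting the samples into independent chunks is the natural step towards the Dask variants.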