![logo](../../_static/images/NCI_logo.png)

-------

# Setup Pangeo Environment


In this notebook:

- Load the Pangeo module
- Activate the Pangeo environment
- Run a Jupyter notebook
- Open the port on your local host to work on your notebook remotely
- Visualise the job dashboard

The following material uses Coupled Model Intercomparison Project (CMIP6) collections. The CMIP6 terms of use are found [here](https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html). For more information on the collection, please [click here](https://geonetwork.nci.org.au/geonetwork/srv/eng/catalog.search#/metadata/f6600_2266_8675_3563).


**Pangeo** is a community platform for big data in geoscience, funded by the US NSF. The Pangeo project serves as a coordination point between scientists, software, and computing infrastructure. The Pangeo software ecosystem involves open source tools such as xarray, iris, dask, jupyter, and many other packages. [This site](http://pangeo.io) provides guidance for accessing data and performing analysis using these tools. NCI has installed the Pangeo environment on Raijin by following the instructions [here](http://pangeo.io/setup_guides/hpc.html). Please note that Pangeo will be transferred to Gadi when the major Raijin/Gadi transition happens in Nov/Dec, and it will not be available on Raijin from the transition period onwards. This notebook provides instructions on how to use the Pangeo environment to open Jupyter notebooks on your local machine while the work runs remotely on Raijin.

### Load the Pangeo module on Raijin and activate the Pangeo environment

```
$ module load pangeo/2019.10
$ source ${PANGEO_ROOT}/etc/profile.d/conda.sh
$ conda activate pangeo
```
You will see `pangeo` appear in brackets in front of the prompt sign. You can quit the environment using **conda deactivate**:

```
$ conda deactivate
```

![1](images/pangeo_setup1.png)

If you ask where your Python command lives (for example with `which python`), it should point to where Pangeo was installed on Raijin.

![2](images/pangeo_setup2.png)
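
The same check can be made from inside Python. The sketch below is only an illustration: it assumes that `PANGEO_ROOT` (the variable used in the `source` command above) is still defined in your environment, and the helper name `python_is_under` is hypothetical.

```python
import os
import sys
from pathlib import Path


def python_is_under(root, executable=sys.executable):
    """Return True if the interpreter path lies inside the directory `root`."""
    root_path = Path(root).resolve()
    exe_path = Path(executable).resolve()
    return root_path == exe_path or root_path in exe_path.parents


if __name__ == "__main__":
    # PANGEO_ROOT is assumed to be set by the pangeo module; it may be absent elsewhere.
    pangeo_root = os.environ.get("PANGEO_ROOT")
    print("Python executable:", sys.executable)
    if pangeo_root is None:
        print("PANGEO_ROOT is not set; load the pangeo module first.")
    else:
        print("Inside Pangeo install:", python_is_under(pangeo_root))
```

If the last line prints `False`, the environment is probably not activated in the current shell.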



### Configure Jupyter

Run the following two commands.

```
$ jupyter notebook --generate-config
$ jupyter notebook password
```
It will prompt you to enter a password, which you will use later to open the Jupyter notebook from your local machine. Choose any password, but make sure you remember it!

If the commands do not work (often the case with older versions of Jupyter), these [instructions](http://pangeo.io/setup_guides/hpc.html) explain how to set things up step by step.

### Start a Jupyter Notebook Server

First, create a directory where you will run the Jupyter notebook; let's call it `<home_dir>/tutorial`.

Next, submit a job to get the server going. Create a shell script by copying the following commands into a file named `run_ipynb_job.sh`. Or you can download the example script here. In this instance we request 2 nodes with 32 CPUs and 64GB of memory. Further instructions about job submission and running jobs on Raijin can be found [here](https://opus.nci.org.au/display/Help/Running+Jobs).

**Modify the job name (`-N`) and project code (`-P`) in the first two `#PBS` lines as needed.**

```
#!/bin/bash
#PBS -N pangeo_test
#PBS -P fp0
#PBS -q express
#PBS -l walltime=5:00:00
#PBS -l ncpus=32
#PBS -l mem=64GB
#PBS -l jobfs=100GB
module load pangeo/2019.10
pangeo.ini.all.sh
sleep infinity
```

![3](images/pangeo_setup3.png)
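
If you prefer to generate the script rather than edit it by hand, something like the following could work. It is a minimal sketch that simply reproduces the job script above as a template; the function names `render_job_script` and `write_job_script` are hypothetical and not part of the Pangeo module.

```python
from pathlib import Path

# Template mirroring the job script above; only the job name and project code vary.
JOB_TEMPLATE = """#!/bin/bash
#PBS -N {job_name}
#PBS -P {project}
#PBS -q express
#PBS -l walltime=5:00:00
#PBS -l ncpus=32
#PBS -l mem=64GB
#PBS -l jobfs=100GB
module load pangeo/2019.10
pangeo.ini.all.sh
sleep infinity
"""


def render_job_script(job_name, project):
    """Return the job script text with the given job name and project code."""
    return JOB_TEMPLATE.format(job_name=job_name, project=project)


def write_job_script(path, job_name, project):
    """Write the rendered job script to `path` and return the path."""
    script_path = Path(path)
    script_path.write_text(render_job_script(job_name, project))
    return script_path


if __name__ == "__main__":
    # "fp0" is the project code from the example above; replace it with your own.
    print(write_job_script("run_ipynb_job.sh", "pangeo_test", "fp0"))
```

The resulting file can then be submitted with the standard PBS command `qsub run_ipynb_job.sh`.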


Once the job has started running, two files appear in your current directory:

* client_cmd
* scheduler.json

![4](images/pangeo_setup4.png)

Note that the port number underlined in the screenshot will be needed later when interacting with Raijin from your local computer.
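
Rather than reading the port numbers off the screen, you could extract them with a few lines of Python. The sketch below rests on an assumption that is not confirmed here: that `client_cmd` contains `ssh` commands using local port forwarding of the form `-L <local_port>:<host>:<remote_port>`. Check your own file first; the function name `find_forwarded_ports` is hypothetical.

```python
import re

# Matches an ssh local-forward option such as "-L 8343:somehost:8343" (assumed format).
FORWARD_PATTERN = re.compile(r"-L\s*(\d+):([^\s:]+):(\d+)")


def find_forwarded_ports(text):
    """Return the local port of every ssh -L forward found in `text`, in order."""
    return [int(match.group(1)) for match in FORWARD_PATTERN.finditer(text)]


if __name__ == "__main__":
    try:
        with open("client_cmd") as handle:
            print(find_forwarded_ports(handle.read()))
    except FileNotFoundError:
        print("client_cmd not found in the current directory")
```

An empty list means the file does not follow the assumed format, in which case read the ports from the file directly.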

### Launch JupyterLab on your local computer

Open a terminal on your local computer. Copy and paste the content of the `client_cmd` file into the command line. It consists of two commands that log in to Raijin from your local computer.

![5](images/pangeo_setup5.png)
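
Before opening the browser, it can help to confirm that the forwarded port is actually accepting connections on your local machine. The following is a small sketch using only the Python standard library; the function name `port_is_open` is hypothetical, and 8343 is just the example port used in this tutorial.

```python
import socket


def port_is_open(port, host="localhost", timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Replace 8343 with the port number from your own client_cmd file.
    print("Port reachable:", port_is_open(8343))
```

If this reports `False`, the commands from `client_cmd` are probably not running, or a different port is in use.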

Open a web browser, type the following address, and press Enter. The Jupyter notebook port number is 8343 in this example.

**Don't copy this number as it might be different in your case!**

```
localhost:8343
```

![6](images/pangeo_setup6.png)

You will then be prompted for a password. Type the password that you set in the second step of this tutorial (Configure Jupyter).

![7](images/pangeo_setup7.png)

Once you are authenticated, the JupyterLab interface will launch within a few seconds.

![8](images/pangeo_setup8.png)

Now you are ready to run your own notebooks.

## IMPORTANT NOTES

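Please make sure the following lines are added to your notebook: the Dask client at the beginning and `pangeo.end.sh` at the end.

```
# At the beginning of the notebook: start the dask client
from dask.distributed import Client
client = Client(scheduler_file='scheduler.json')

# At the end of the notebook: stop the PBS job
! pangeo.end.sh
```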
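
If the client fails to connect, a quick look at `scheduler.json` can help. The sketch below assumes the file follows the usual Dask scheduler-file layout, a JSON object with an `address` entry; the function name `read_scheduler_address` is hypothetical.

```python
import json
from pathlib import Path


def read_scheduler_address(path="scheduler.json"):
    """Return the scheduler address stored in a Dask scheduler file, or None if absent."""
    info = json.loads(Path(path).read_text())
    # "address" is the key assumed to hold the scheduler address.
    return info.get("address")


if __name__ == "__main__":
    if Path("scheduler.json").exists():
        print("Scheduler address:", read_scheduler_address())
    else:
        print("scheduler.json not found; has the PBS job started?")
```

A missing file usually means the job has not started yet, so `Client(scheduler_file='scheduler.json')` has nothing to connect to.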


### Let's import a notebook example

You can drag and drop a notebook from your local computer into JupyterLab. The file will then also appear in your working directory on Raijin.

![9](images/pangeo_setup9.png)

The screenshot above shows:

- left: the Jupyter notebook interface
- top right: the local directory from which a notebook is dragged and dropped into JupyterLab
- bottom right: the Raijin command window, showing that the notebook appears instantly

### View the Dask job dashboard

Open a new tab in the web browser and type the following address, using the second port listed in the `client_cmd` file (8890 in this example).
Once the job starts running, you should be able to see the resource usage of the processing update dynamically.

```
localhost:8890
```
![10](images/pangeo_setup10.png)


### Reference

- http://pangeo.io