# Distributed computing on HPC

To scale up computation, it can be useful to offload the computational load
to a remote server such as a high-performance computing (HPC) cluster.

## Installation
1. Connect to the remote server with ssh from your local machine:
```
ssh <REMOTE_USER>@<REMOTE_HOST>
```
2. Install Miniconda:
```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
conda init tcsh
```
3. Create a conda environment and install Jupyter Lab:
```
conda create -n dasktest
conda activate dasktest
conda install jupyterlab nodejs ipywidgets -c conda-forge -y
# optional packages:
# conda install scikit-image matplotlib pandas -y
```
4. Register the Jupyter kernel:
```
python -m ipykernel install --user --name dasktest
```

## Connecting to a Jupyter notebook running on a remote server
We want to run Jupyter on the server and use it from a local computer. To do so,
we first need to configure a Python environment on the remote server, then we can:

1. Open an SSH tunnel using the same port as the notebook by running on the local
machine:
```
ssh -L 8080:localhost:8080 <REMOTE_USER>@<REMOTE_HOST>
```
This command opens an interactive session on the cluster.
2. Start a Jupyter Lab server from the environment:
```
conda activate dasktest
jupyter lab --no-browser --ip="*" --port 8080
```
3. Connect to the notebook by opening a browser and navigating to
http://localhost:8080, or use the link printed by Jupyter Lab, http://localhost:8080/lab?token=
Alternatively, you can connect to the remote server from Visual Studio Code by clicking on
Jupyter Server: Local in the status bar and pasting the link http://localhost:8080/lab?token=
when prompted. The new kernel will then be visible.
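
The tunnel in step 1 only works if the chosen port is free on the local machine and matches the port passed to `jupyter lab`. As a minimal sketch, the following Python helpers check a local port and build the matching `ssh` command. The names `port_is_free` and `tunnel_command` are hypothetical and not part of any tool used here:

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is listening on host:port on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 when a connection succeeds, i.e. the port is in use
        return s.connect_ex((host, port)) != 0


def tunnel_command(user, host, port):
    """Build the ssh command forwarding the same port on both ends."""
    return f"ssh -L {port}:localhost:{port} {user}@{host}"


port = 8080
if not port_is_free(port):
    print(f"Port {port} is already in use locally, pick another one")
print(tunnel_command("<REMOTE_USER>", "<REMOTE_HOST>", port))
```

If the port is taken, the same alternative port has to be used in both the `ssh -L` command and the `--port` option of `jupyter lab`.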

At this point, we have an open terminal connected to the cluster with
Jupyter Lab running. We also have either a web browser tab or a Visual Studio Code window displaying a notebook.

At the end of the session, we need to stop Jupyter Lab by pressing CTRL-C in the
terminal running Jupyter Lab or by using File > Shut Down in the Jupyter Lab
interface. Then log out from the terminal to close the session.

## Using Dask distributed
Dask makes it possible to perform parallel and distributed computing in Python using
well-known data structures such as NumPy's ndarray and pandas' DataFrame. Additionally, we can
use [dask-jobqueue](https://jobqueue.dask.org/) to manage the connection to a job
scheduler such as SLURM.

We need to install Dask on the remote computer, along with dask-jobqueue and the extensions for Jupyter Lab:
```
conda activate dasktest
conda install dask distributed dask-jobqueue -c conda-forge
pip install dask_labextension
jupyter labextension install dask-labextension
jupyter labextension install @jupyter-widgets/jupyterlab-manager
```
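
Before opening a notebook, it can help to verify that the packages are visible from the `dasktest` environment. The sketch below uses only the standard library. The helper name `missing_packages` is an assumption made for illustration, and the list of module names reflects the imports used later in this document:

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# top-level modules imported later in this document
required = ["dask", "distributed", "dask_jobqueue"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed")
```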

Open a notebook on the remote computer and create a cluster scheduler:

```
from dask_jobqueue import SLURMCluster
from dask.distributed import Client, progress

cluster = SLURMCluster(
    cores=1,
    memory='64GB',
    shebang='#!/usr/bin/env tcsh',
    processes=1,
    local_directory='/ssd',
    walltime='02:00:00',
)
cluster.adapt(maximum_jobs=20)
```
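
With `cluster.adapt(maximum_jobs=20)`, the number of SLURM jobs grows and shrinks with the workload, up to 20 jobs. As a rough sanity check before submitting, the upper bound of what this configuration can request, if all 20 jobs run once for their full walltime, can be worked out in plain Python from the values in the configuration:

```python
# values taken from the SLURMCluster configuration
cores_per_job = 1
memory_per_job_gb = 64
walltime = "02:00:00"
maximum_jobs = 20

# convert the HH:MM:SS walltime to hours
hours, minutes, seconds = (int(part) for part in walltime.split(":"))
walltime_hours = hours + minutes / 60 + seconds / 3600

max_cores = cores_per_job * maximum_jobs
max_memory_gb = memory_per_job_gb * maximum_jobs
max_core_hours = max_cores * walltime_hours

print(f"up to {max_cores} cores, {max_memory_gb} GB, {max_core_hours:.0f} core-hours")
```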

Create a client to connect to the scheduler and display it. This will
print a link that you can copy and paste into the Dask extension tab of Jupyter Lab in
order to monitor the active processes.

```
client = Client(cluster)
client
```
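
The same `Client` API also works with a cluster running on a single machine, which can be convenient for testing a workflow before submitting jobs to SLURM. A minimal sketch, assuming `distributed` is installed: the `square` function is a placeholder for real work, and `processes=False` keeps the workers in threads so the example runs without a job scheduler.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    """Placeholder for a real processing function."""
    return x * x


# local stand-in for SLURMCluster; dashboard disabled to avoid port conflicts
with LocalCluster(n_workers=2, threads_per_worker=1, processes=False,
                  dashboard_address=None) as cluster:
    with Client(cluster) as client:
        futures = client.map(square, range(5))  # one task per input
        results = client.gather(futures)        # collect results in input order

print(results)
```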

One typical example is to load the list of files to process into a data frame.

```
import pandas as pd
from pathlib import Path

folder = Path('../data')
# load a pandas data frame listing the files and additional information
exp = pd.read_csv(folder/'experiment.csv')
exp
```

|   | filename  | condition |
|---|-----------|-----------|
| 0 | file1.tif | WT        |
| 1 | file2.tif | WT        |
| 2 | file3.tif | Treated   |
| 3 | file4.tif | Treated   |


Using Dask, we can then process each entry in parallel.

Note that wrapping a function that itself loads an array with Dask in
`dask.delayed` will load the whole file each time it is called. Here, we instead
load the images lazily beforehand and process them one by one.

```
import nd2
import dask

# load all images lazily as dask arrays
imgs = [nd2.imread(folder/f, dask=True) for f in exp['filename']]

# process each image
def process_image(img):
    return img.mean(), img.std()

# create one task per file
tsk = [dask.delayed(process_image)(img) for img in imgs]

# run the tasks; unpacking the list returns one (mean, std) pair per file
result = dask.compute(*tsk)
```
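
To try this pattern without image files, the sketch below replaces `nd2.imread` with small synthetic dask arrays (an assumption for illustration only). It also shows that `dask.compute(*tasks)` returns one result per task, in the order of the task list:

```python
import dask
import dask.array as da

# synthetic stand-ins for lazily loaded images: constant 4x4 arrays
imgs = [da.full((4, 4), float(i), chunks=(2, 2)) for i in range(3)]


def process_image(img):
    """Return the mean and standard deviation of one image."""
    return float(img.mean()), float(img.std())


tasks = [dask.delayed(process_image)(img) for img in imgs]

# unpack the list so that each task yields one entry of the result tuple
results = dask.compute(*tasks, scheduler="synchronous")
print(results)
```

The `scheduler="synchronous"` option runs everything in the current thread, which is handy for debugging. Omitting it lets Dask use the active client or its default scheduler.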


If we want to store the results in a pandas data frame, it can be convenient to
map a function over the input list of files.

```
import dask.dataframe as dd

# define the function to process blocks of the data frame
def process_rows(df):
    """Process rows of the data frame"""
    result = []
    for x in df.itertuples():
        # retrieve the row of the input data frame
        # for example we could open a file and process it
        fname = x.filename
        m = 1
        # create a data frame, note that values must be lists or you need
        # to pass an index
        result.append(pd.DataFrame({'filename':[fname], 'mean':[m]}))
    return pd.concat(result, ignore_index=True)


# schedule the computations, one row per partition
ddf = dd.from_pandas(exp, chunksize=1).map_partitions(process_rows,
                            meta={'filename':'object', 'mean':'f'})

# compute the values
res = ddf.compute()
# merge the new columns into the original table
exp.merge(res, on='filename')
```

|   | filename  | condition | mean |
|---|-----------|-----------|------|
| 0 | file1.tif | WT        | 1    |
| 1 | file2.tif | WT        | 1    |
| 2 | file3.tif | Treated   | 1    |
| 3 | file4.tif | Treated   | 1    |
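
The `meta` argument tells Dask which columns and dtypes `process_rows` returns, without running it. One way to check that the two agree is to run the function on a small pandas sample first. A minimal sketch, with a simplified `process_rows` and an in-memory table standing in for `experiment.csv` (both are assumptions for illustration):

```python
import pandas as pd

# in-memory stand-in for the experiment table
exp = pd.DataFrame({
    "filename": ["file1.tif", "file2.tif"],
    "condition": ["WT", "Treated"],
})


def process_rows(df):
    """Return one row per input file with a placeholder measurement."""
    rows = [{"filename": x.filename, "mean": 1.0} for x in df.itertuples()]
    return pd.DataFrame(rows, columns=["filename", "mean"])


meta = {"filename": "object", "mean": "f8"}

# run on a single row with plain pandas and compare with the declared meta
sample = process_rows(exp.head(1))
assert list(sample.columns) == list(meta)
print(sample.dtypes)
```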


How to read a TIFF file as a lazy dask array:

```
import tifffile
import dask.array

class imgtiffdask():
    def __init__(self, fname):
        # open the file as a zarr store so that frames are read on demand
        self.store = tifffile.imread(fname, aszarr=True)
        self.array = dask.array.from_zarr(self.store)
    def __del__(self):
        self.store.close()
    def __getitem__(self, pos):
        return self.array[pos]

img = imgtiffdask('../scratch/tmp.tif')
img.array
```

|       | Array          | Chunk         |
|-------|----------------|---------------|
| Bytes | 6.10 MiB       | 312.50 kiB    |
| Shape | (20, 200, 200) | (1, 200, 200) |
| Count | 21 Tasks       | 20 Chunks     |
| Type  | float64        | numpy.ndarray |
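
The sizes reported in this summary follow directly from the shape and dtype: each `float64` value takes 8 bytes, and the array is split into one chunk per frame. The arithmetic can be checked in plain Python:

```python
shape = (20, 200, 200)       # full array: 20 frames of 200 x 200 pixels
chunk_shape = (1, 200, 200)  # one frame per chunk
itemsize = 8                 # bytes per float64 value

n_chunks = shape[0] // chunk_shape[0]
chunk_bytes = chunk_shape[0] * chunk_shape[1] * chunk_shape[2] * itemsize
array_bytes = shape[0] * shape[1] * shape[2] * itemsize

print(f"{n_chunks} chunks")                         # 20 chunks
print(f"{chunk_bytes / 1024:.2f} kiB per chunk")    # 312.50 kiB per chunk
print(f"{array_bytes / 1024**2:.2f} MiB in total")  # 6.10 MiB in total
```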


```
img = imgtiffdask('../scratch/tmp.tif')
img[:]
```

|       | Array          | Chunk         |
|-------|----------------|---------------|
| Bytes | 6.10 MiB       | 312.50 kiB    |
| Shape | (20, 200, 200) | (1, 200, 200) |
| Count | 21 Tasks       | 20 Chunks     |
| Type  | float64        | numpy.ndarray |
