## Run a Jupyter Notebook on NYU Cluster: pdf_combine

The purpose of this notebook is to make it easier to use [High Performance Computing at NYU](https://wikis.nyu.edu/display/NYUHPC/High+Performance+Computing+at+NYU). Use this Jupyter Notebook to:
- Log in to the bastion host __gw.hpc.nyu.edu__ [[Link](https://wikis.nyu.edu/display/NYUHPC/Logging+in+to+the+NYU+HPC+Clusters)]
- Log in to the HPC cluster Prince __prince.hpc.nyu.edu__ [[Link](https://wikis.nyu.edu/display/NYUHPC/Clusters)]
- Sync the `/home/$USER` and `/beegfs/$USER` file systems [[Link](https://wikis.nyu.edu/display/NYUHPC/Storage)]
    - My virtual environment is stored in `/home/$USER/pyenv/epdf_combine`
    - My notebooks are stored in `/home/$USER/notebooks/pdf_combine`
- Load the Python 3.6.3 module [[Link](https://wikis.nyu.edu/display/NYUHPC/Software+and+Environment+Modules)]
- Create or activate the `epdf_combine` environment
- Configure and submit a bash script [[Link](https://wikis.nyu.edu/display/NYUHPC/Submitting+jobs+with+sbatch)]
- Check the status of the job [[Link](https://wikis.nyu.edu/display/NYUHPC/Slurm+Tutorial)]
- Concatenate `slurm_<job_number>.out`
    - Get a simple terminal command to log in to Prince
    - Get the URL to access Jupyter Notebook
- Cancel the job and delete `slurm_<job_number>.out`
- Sync the `/beegfs/$USER` and `/home/$USER` file systems again

### Imports

In [None]:
# pip install paramiko

In [None]:
import os, paramiko, re, time, webbrowser
import ipywidgets as widgets
from IPython.display import clear_output, display, HTML

### Bastion, Cluster, User, Password

In [None]:
bastion = 'gw.hpc.nyu.edu'
cluster = 'prince.hpc.nyu.edu'

In [None]:
def on_value_change(change=None):
    """Display the NetID and password boxes, keeping any values already typed."""
    global widgets_user, widgets_password
    # Reuse the earlier values if this cell has been run before
    try:
        chosen1 = widgets_user.value
    except NameError:
        chosen1 = ""
    try:
        chosen2 = widgets_password.value
    except NameError:
        chosen2 = ""

    widgets_user = widgets.Text(value=chosen1, placeholder='Type your NetID', description='NetID:', disabled=False)
    widgets_password = widgets.Password(value=chosen2, placeholder='Type your password', description='Password:', disabled=False)
    clear_output()
    display(widgets_user, widgets_password)
on_value_change()

In [None]:
user = widgets_user.value
passwd = widgets_password.value

### Function that executes SSH commands using [Paramiko](http://www.paramiko.org/)

In [None]:
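def ssh_comm(cluster, user, passwd, command=""):
    """Run `command` on `cluster` over SSH and return its standard output."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(cluster, username=user, password=passwd)
    try:
        stdin, stdout, stderr = client.exec_command(command)
        # Reading to the end blocks until the remote command has finished
        output = stdout.read().decode()
        errors = stderr.read().decode()
        if errors:
            print(errors)
        return output
    finally:
        # Close the connection even if the command fails
        client.close()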

### Log in to bastion host

In [None]:
ssh_comm(bastion, user, passwd)

### Sync between `/home/$USER` and `/beegfs/$USER` 
Sync the virtual env and the notebooks. The destination (`/beegfs/$USER`) will mirror the source directory (`/home/$USER`). __Extraneous files in the destination directory will be deleted.__ Enter "Yes" to continue.
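Because `--delete` removes files from the destination, it can help to preview a mirror operation before running it. The sketch below builds the `rsync` command as a string and adds `--dry-run` by default, so that `rsync` only lists what it would copy or delete. The helper name `mirror_command` and the NetID `ab1234` are hypothetical; the paths follow the layout used in this notebook.

```python
import shlex

def mirror_command(src, dst, dry_run=True):
    """Build an rsync command that makes dst mirror src (hypothetical helper)."""
    # A trailing slash on the source copies its contents rather than the directory itself
    src = src.rstrip("/") + "/"
    parts = ["rsync", "-av", "--delete"]
    if dry_run:
        # --dry-run reports the changes without touching the destination
        parts.append("--dry-run")
    parts += [src, dst]
    # Quote every part so that unusual characters in a path cannot break the command
    return " ".join(shlex.quote(p) for p in parts)

# Example with a made-up NetID
print(mirror_command("/home/ab1234/notebooks/pdf_combine",
                     "/beegfs/ab1234/notebooks/pdf_combine"))
```

Note that `shlex.quote` would also protect a literal `$USER` from expansion, so this sketch expects the NetID to be written out in the paths.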

In [None]:
person = input('Be careful! Do you want to sync /home and /beegfs? ')
if person == "Yes":
    # --rsync-path only applies to remote transfers, so create the destination folders explicitly
    ssh_comm(cluster, user, passwd, """mkdir -p /beegfs/$USER/pyenv/epdf_combine /beegfs/$USER/notebooks/pdf_combine
    rsync -av --delete /home/$USER/pyenv/epdf_combine/ /beegfs/$USER/pyenv/epdf_combine
    rsync -av --delete /home/$USER/notebooks/pdf_combine/ /beegfs/$USER/notebooks/pdf_combine""")

### Create or activate the `epdf_combine` environment in `/beegfs/$USER`

In [None]:
person = input('Do you want to create a new virtual env?: ')
if person == "Yes":
    ssh_comm(cluster, user, passwd, """
        module purge
        module load python3/intel/3.6.3
        cd /beegfs/$USER/pyenv
        virtualenv --system-site-packages epdf_combine
        source /beegfs/$USER/pyenv/epdf_combine/bin/activate
        pip3 install -I jupyter

    """)
else:
    ssh_comm(cluster, user, passwd, """module purge \n
        module load python3/intel/3.6.3 \n
        source /beegfs/$USER/pyenv/epdf_combine/bin/activate \n""")

### Create bash script file
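Slurm parses `#SBATCH` lines itself rather than through the shell, so a variable such as `$USER` in those lines is most likely taken literally. One way around this is to render the header in Python with the values already filled in. The following is a minimal sketch under that assumption; the helper name `sbatch_header` and the NetID `ab1234` are hypothetical, and the defaults simply repeat the values used in this notebook's script.

```python
def sbatch_header(job_name, netid, cpus=4, mem="64GB", time="48:00:00"):
    """Render #SBATCH lines with concrete values (hypothetical helper)."""
    options = {
        "job-name": job_name,
        "nodes": 1,
        "ntasks-per-node": 1,
        "cpus-per-task": cpus,
        "mem": mem,
        "time": time,
        "mail-type": "END",
        # The address is written out in full because #SBATCH lines are not shell-expanded
        "mail-user": netid + "@nyu.edu",
        "output": "slurm_%j.out",
    }
    lines = ["#!/bin/bash"]
    lines += ["#SBATCH --{}={}".format(key, value) for key, value in options.items()]
    return "\n".join(lines)

# Example with a made-up NetID
print(sbatch_header("pdf_combine", "ab1234"))
```

The rendered header could then be placed in front of the rest of the script body before it is written to the cluster.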

In [None]:
def new_bashscript(cluster, user, passwd, command):
    """Run `command` on the cluster and print anything it writes to stderr."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(cluster, port=22, username=user, password=passwd)
    try:
        stdin, stdout, stderr = client.exec_command(command)
        # An empty list means the script file was written without errors
        print(stderr.readlines())
    finally:
        client.close()
new_bashscript(cluster, user, passwd, r"""echo '#!/bin/bash
 
#SBATCH --job-name=pdf_combine
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=64GB
 
## For a gpu card:
##SBATCH --gres=gpu:1
## For a specific card:
##SBATCH  --gres=gpu:v100:1
 
#SBATCH --time=48:00:00
#SBATCH --mail-type=END
#SBATCH --mail-user=$USER@nyu.edu
#SBATCH --output=slurm_%j.out
 
module purge
module load python3/intel/3.6.3
source /beegfs/$USER/pyenv/epdf_combine/bin/activate
 
port=$(shuf -i 10000-65500 -n 1)
 
/usr/bin/ssh -N -f -R $port:localhost:$port log-0
/usr/bin/ssh -N -f -R $port:localhost:$port log-1
 
cat<<EOF
 
ssh -L $port:localhost:$port $USER@prince.hpc.nyu.edu

$(hostname)
 
EOF
 
unset XDG_RUNTIME_DIR
if [ "$SLURM_JOBTMP" != "" ]; then
    export XDG_RUNTIME_DIR=$SLURM_JOBTMP
fi
 
jupyter notebook --no-browser --port $port --notebook-dir=$(pwd)' > /beegfs/$USER/notebooks/pdf_combine/pdf_combine.sh""")

### Run bash script
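When `sbatch` accepts a script, it prints a confirmation line of the form `Submitted batch job <id>`. If the output of the submission is captured, the job number can be read from that line directly. This is a minimal sketch, not part of the original workflow; the helper name `submitted_job_id` and the job number in the example are made up.

```python
import re

def submitted_job_id(sbatch_output):
    """Return the job ID from sbatch's confirmation line, or None if it is absent."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return match.group(1) if match else None

# Illustrative output; the job number is made up
print(submitted_job_id("Submitted batch job 1234567\n"))
```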

In [None]:
ssh_comm(cluster, user, passwd, """
    cd /beegfs/$USER/notebooks/pdf_combine
    sbatch pdf_combine.sh""")

### Check status of job in Slurm
Function that queries `squeue` for the submitted job, which shows an R ("Running") status once it has started. It returns the number of the job.
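To tell a running job from a pending one, the `ST` column of the `squeue` listing can be read alongside the job number. The sketch below parses a listing with the default column headers and maps each job ID to its state code. The sample text, the partition and node names, and the helper name `job_states` are illustrative assumptions, and splitting on whitespace would not cope with a job name that contains spaces.

```python
def job_states(squeue_output):
    """Map each job ID to its state code, using the header row to find the columns."""
    lines = [line.split() for line in squeue_output.strip().splitlines()]
    if not lines:
        return {}
    header, rows = lines[0], lines[1:]
    id_col, state_col = header.index("JOBID"), header.index("ST")
    return {row[id_col]: row[state_col] for row in rows if len(row) > state_col}

# Made-up listing in the default squeue layout
sample = """\
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           1234567       c32 pdf_comb   ab1234  R       0:42      1 c32-01
           1234568       c32 pdf_comb   ab1234 PD       0:00      1 (Priority)
"""
states = job_states(sample)
# Keep only the jobs whose state is R ("Running")
running = [job for job, state in states.items() if state == "R"]
print(running)
```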

In [None]:
def status_check(cluster, user, passwd, max_attempts=5, wait=8):
    """Poll `squeue` until a job of `user` is listed and return its job number."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(cluster, username=user, password=passwd)
    job = ""
    try:
        for attempt in range(1, max_attempts + 1):
            # Query the queue on every attempt so that the status is up to date
            stdin, stdout, stderr = client.exec_command("squeue -u " + user)
            status = stdout.read().decode()
            # Job numbers are assumed to have seven digits; the first match is used
            found = re.findall(r'(\d{7})', status)
            if found:
                job = found[0]
                break
            print(attempt)
            time.sleep(wait)
    finally:
        client.close()
    return job

##### It may take some time...

In [None]:
job_number = str(status_check(cluster, user, passwd)); job_number

### Concatenate `slurm_<job_number>.out`
NB: You may need to rerun the cell until it works.

In [None]:
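# The job writes its log to slurm_<job_number>.out in the notebook folder
concatenate = "cat /beegfs/$USER/notebooks/pdf_combine/slurm_" + job_number + ".out"
ssh_comm(cluster, user, passwd, concatenate)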

### Jupyter and ssh
Function that returns an SSH command and displays a link to the Jupyter Notebook
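The Slurm output file contains two pieces of information needed here: the `ssh -L` line printed by the job script and the `http://localhost:...` URL that Jupyter logs once it has started. The sketch below extracts both from the file contents with regular expressions and returns `None` for anything that has not appeared yet. The sample log, its port, its token, and the helper name `tunnel_and_url` are illustrative assumptions.

```python
import re

def tunnel_and_url(slurm_output):
    """Return (ssh command, Jupyter URL); either is None until it appears in the log."""
    ssh = re.search(r"^ssh -L \S+ \S+$", slurm_output, flags=re.MULTILINE)
    url = re.search(r"http://localhost:\d+/\S*", slurm_output)
    return (ssh.group(0) if ssh else None, url.group(0) if url else None)

# Made-up log in the shape the job script and Jupyter are expected to produce
sample = """
ssh -L 23456:localhost:23456 ab1234@prince.hpc.nyu.edu

c32-01

[I 10:00:00.000 NotebookApp] The Jupyter Notebook is running at:
[I 10:00:00.000 NotebookApp] http://localhost:23456/?token=0123abcd
"""
print(tunnel_and_url(sample))
```

Returning `None` instead of raising lets the caller decide whether to wait and read the file again.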

In [None]:
def port_and_ssh(cluster, user, passwd, x=0):
    """Return an sshpass command for the tunnel and display a link to Jupyter."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(cluster, port=22, username=user, password=passwd)
    try:
        stdin, stdout, stderr = client.exec_command("cat /beegfs/$USER/notebooks/pdf_combine/slurm_"+job_number+".out")
        output = stdout.read().decode()
    finally:
        client.close()
    # The job script prints the ssh command first; Jupyter logs its URL once it has started
    ssh_line = re.findall(r'ssh (.*?edu)\s*\n', output)
    url = re.findall(r'(http://localhost\S*)', output)
    if ssh_line and url:
        sshpass = "sshpass -p '"+passwd+"' ssh -o StrictHostKeyChecking=no " + ssh_line[0]
        display(HTML('<a href="'+url[0]+'">Click to run Jupyter</a>'))
        return sshpass, url[0]
    if x >= 4:
        # Stop with a clear message instead of failing on an empty match
        raise RuntimeError("No ssh command or Jupyter URL found in the Slurm output after 5 attempts")
    print("Attempt %d out of 5" % (x+1))
    time.sleep(5)
    return port_and_ssh(cluster, user, passwd, x+1)

##### Run the line below in bash. Clear the output when you finish, because it contains your password!
_Depending on the queue of the cluster, it may take some time..._

In [None]:
secureshell = port_and_ssh(cluster, user, passwd)[0]; secureshell

### Cancel job and delete `slurm_<job_number>.out`

Type "Yes" if you have finished your work.

In [None]:
person = input('Do you want to cancel your job and delete the slurm_<job_number>.out file?: ')
if person == "Yes":
    ssh_comm(cluster, user, passwd, 'scancel '+job_number)
    ssh_comm(cluster, user, passwd, 'rm /beegfs/$USER/notebooks/pdf_combine/slurm_'+job_number+'.out')

### Sync between `/beegfs/$USER` and `/home/$USER` 
Sync the virtual env and the notebooks. The destination (`/home/$USER`) will mirror the source directory (`/beegfs/$USER`). __Extraneous files in the destination directory will be deleted.__ Enter "Yes" to continue.

In [None]:
person = input('Be careful! Do you want to sync /beegfs and /home? ')
if person == "Yes":
    ssh_comm(cluster, user, passwd, """rsync -av --delete /beegfs/$USER/pyenv/epdf_combine/ /home/$USER/pyenv/epdf_combine
    rsync -av --delete /beegfs/$USER/notebooks/pdf_combine/ /home/$USER/notebooks/pdf_combine""")

---