<h2>Setting Stuff Up</h2>

Here we import the packages used throughout this notebook and load the variables that were set during configuration.

In [19]:
!mkdir -p ~/agave

%cd ~/agave

!pip3 install --upgrade setvar

import re
import os
import sys
from setvar import *
from time import sleep

# This cell enables inline plotting in the notebook
%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as plt
loadvar()

/home/jovyan/agave
Requirement already up-to-date: setvar in /opt/conda/lib/python3.6/site-packages
AGAVE_JSON_PARSER=jq
AGAVE_PASSWD=**HIDDEN**
AGAVE_TENANTS_API_BASEURL=https://agave-auth.solveij.com/tenants
AGAVE_USERNAME=dooley
APP_NAME=funwave-tvd-nectar-dooley
DEPLOYMENT_PATH=agave-deployment
DOCKERHUB_NAME=dooley
DOMAIN=nectar.org
EMAIL=deardooley@gmail.com
EXEC_MACHINE=nectar-exec-dooley
HOME_DIR=/home/jovyan
MACHINE_IP=204.90.47.30
MACHINE_NAME=nectar
MACHINE_USERNAME=jovyan
PBTOK=**HIDDEN**
PORT=10022
REQUESTBIN_URL=https://requestbin.agaveapi.co/voup3yvo
SCRATCH_DIR=/home/jovyan
STORAGE_MACHINE=nectar-storage-dooley
WORK_DIR=/home/jovyan


## Creating the Storage Machine   

Agave needs to know where to store the data associated with your jobs, so we start by defining a storage machine. Authentication to the storage machine uses SSH keys. The private and public key files contain newlines, which cannot appear unescaped inside a JSON string (JSON is the data format used by Agave). We therefore run the jsonpki command on each file to encode it, then store the encoded contents in the environment for use by setvar.

In [7]:
!jsonpki --public ~/.ssh/id_rsa.pub > ~/.ssh/id_rsa.pub.txt
!jsonpki --private ~/.ssh/id_rsa > ~/.ssh/id_rsa.txt

In [8]:
os.environ["PUB_KEY"]=readfile("${HOME}/.ssh/id_rsa.pub.txt").strip()
os.environ["PRIV_KEY"]=readfile("${HOME}/.ssh/id_rsa.txt").strip()

Reading file `/home/jovyan/.ssh/id_rsa.pub.txt'
Reading file `/home/jovyan/.ssh/id_rsa.txt'
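
The reason for the encoding step can be illustrated in plain Python. The sketch below is not the jsonpki implementation; it only assumes that the goal is a string that can sit between double quotes in a JSON document. The helper name `json_escape` and the sample key text are made up for illustration.

```python
import json

def json_escape(text):
    """Return text escaped so it can be placed between double quotes in JSON."""
    # json.dumps adds the surrounding quotes; strip them to keep only the body.
    return json.dumps(text)[1:-1]

# A fake, shortened key used only for illustration.
fake_key = "-----BEGIN FAKE KEY-----\nabc123\n-----END FAKE KEY-----\n"

escaped = json_escape(fake_key)

# The escaped form contains no raw newlines, so it fits inside a JSON string.
document = '{"privateKey": "' + escaped + '"}'
assert json.loads(document)["privateKey"] == fake_key
```

Parsing the document gives back the original multi-line key, which is what the remote side needs.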


In this next cell, we create the JSON file that describes the storage machine.

In [9]:
writefile("${STORAGE_MACHINE}.txt","""{
    "id": "${STORAGE_MACHINE}",
    "name": "${MACHINE_NAME} storage (${MACHINE_USERNAME})",
    "description": "The ${MACHINE_NAME} computer",
    "site": "${DOMAIN}",
    "type": "STORAGE",
    "storage": {
        "host": "${MACHINE_IP}",
        "port": ${PORT},
        "protocol": "SFTP",
        "rootDir": "/",
        "homeDir": "${HOME_DIR}",
        "auth": {
          "username" : "${MACHINE_USERNAME}",
          "publicKey" : "${PUB_KEY}",
          "privateKey" : "${PRIV_KEY}",
          "type" : "SSHKEYS"
        }
    }
}
""")

Writing file `nectar-storage-dooley.txt'
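
Before sending a description to Agave, it can help to confirm that the rendered text is valid JSON. The following is a minimal sketch that uses `string.Template` from the standard library as a stand-in for the `${VAR}` substitution done by writefile; it is not how setvar works internally, and the shortened template is only an excerpt of the real description.

```python
import json
from string import Template

# Shortened excerpt of the storage description, for illustration only.
template = Template("""{
    "id": "${STORAGE_MACHINE}",
    "type": "STORAGE",
    "storage": {
        "host": "${MACHINE_IP}",
        "port": ${PORT},
        "protocol": "SFTP"
    }
}""")

values = {
    "STORAGE_MACHINE": "nectar-storage-dooley",
    "MACHINE_IP": "204.90.47.30",
    "PORT": "10022",
}

# substitute() raises KeyError if a variable is missing from values.
rendered = template.substitute(values)

# json.loads() raises ValueError if the rendered text is not valid JSON.
system = json.loads(rendered)
print(system["id"], system["storage"]["port"])
```

Note that `${PORT}` is deliberately left unquoted in the template, so the port is parsed as a number rather than a string.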


Here, we tell Agave about the machine. You can re-run the previous cell and the next one if you want to change the definition of your storage machine.

In [10]:
!systems-addupdate -F ${STORAGE_MACHINE}.txt

Successfully added system nectar-storage-dooley


Next we run the Agave command `files-list`. This provides a check that we've set up the storage machine correctly.

In [11]:
!files-list -S ${STORAGE_MACHINE} ./ | head -5

.
.bash_logout
.bashrc
.cache
.docker


<h2>Setting up the Execution Machine</h2>

You may not always wish to store your data on the same machine you run your jobs on. In this tutorial, however, we assume that you do. The description of the execution machine is much like that of the storage machine, with a few more pieces of information to provide. In this example, we call commands directly on the host rather than going through a batch queue scheduler, which is slightly simpler.

In [12]:
# Edit any parts of this file that you know need to be changed for your machine.
writefile("${EXEC_MACHINE}.txt","""
{
    "id": "${EXEC_MACHINE}",
    "name": "${MACHINE_NAME} (${MACHINE_USERNAME})",
    "description": "The ${MACHINE_NAME} computer",
    "site": "${DOMAIN}",
    "public": false,
    "status": "UP",
    "type": "EXECUTION",
    "executionType": "CLI",
    "scheduler" : "FORK",
    "environment": null,
    "scratchDir" : "${SCRATCH_DIR}",
    "queues": [
        {
            "name": "none",
            "default": true,
            "maxJobs": 10,
            "maxUserJobs": 10,
            "maxNodes": 6,
            "maxProcessorsPerNode": 6,
            "minProcessorsPerNode": 1,
            "maxRequestedTime": "00:30:00"
        }
    ],
    "login": {
        "auth": {
          "username" : "${MACHINE_USERNAME}",
          "publicKey" : "${PUB_KEY}",
          "privateKey" : "${PRIV_KEY}",
          "type" : "SSHKEYS"
        },
        "host": "${MACHINE_IP}",
        "port": ${PORT},
        "protocol": "SSH"
    },
    "maxSystemJobs": 50,
    "maxSystemJobsPerUser": 50,
    "storage": {
        "host": "${MACHINE_IP}",
        "port": ${PORT},
        "protocol": "SFTP",
        "rootDir": "/",
        "homeDir": "${HOME_DIR}",
        "auth": {
          "username" : "${MACHINE_USERNAME}",
          "publicKey" : "${PUB_KEY}",
          "privateKey" : "${PRIV_KEY}",
          "type" : "SSHKEYS"
        }
    },
    "workDir": "${WORK_DIR}"
}""")

Writing file `nectar-exec-dooley.txt'
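
The queue in the description above limits run time with `"maxRequestedTime": "00:30:00"`. As a small illustration, assuming the value is read as hours, minutes, and seconds, the sketch below converts such a string to seconds and checks whether a requested run time fits the queue. The helper names `hms_to_seconds` and `fits_queue` are hypothetical and not part of Agave or setvar.

```python
def hms_to_seconds(text):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def fits_queue(requested, queue):
    """Return True if the requested run time is within the queue limit."""
    return hms_to_seconds(requested) <= hms_to_seconds(queue["maxRequestedTime"])

# Only the fields needed for this check are reproduced here.
queue = {"name": "none", "maxRequestedTime": "00:30:00"}

print(fits_queue("00:10:00", queue))  # ten minutes is within the limit
print(fits_queue("01:00:00", queue))  # one hour exceeds the limit
```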


In [13]:
!systems-addupdate -F ${EXEC_MACHINE}.txt

Successfully added system nectar-exec-dooley


In [14]:
# Test to see if this worked...
!files-list -S ${EXEC_MACHINE} ./ | head -5

.
.bash_logout
.bashrc
.cache
.docker


<h3>Create the Application</h3>
Agave allows us to describe custom allocations, limiting users to running a specific job. In this case, we create a simple "fork" application that takes the command we want to run as a job parameter. The wrapper file is a shell script that runs on the execution machine. If we were using a batch scheduler, this would be our batch file.

In [15]:
writefile("fork-wrapper.txt","""#!/bin/bash
\${command}
""")

Writing file `fork-wrapper.txt'
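
The backslash in `\${command}` matters: it keeps the placeholder from being expanded when the file is written, so that a literal `${command}` reaches Agave and can be filled in when a job runs. The sketch below illustrates this two-stage idea with a hypothetical `expand` function. It is not setvar's actual code, and `APP_NAME` with the value `demo` is an arbitrary example variable.

```python
import re

# Matches ${NAME}, optionally preceded by a single backslash.
PLACEHOLDER = re.compile(r"(\\?)\$\{(\w+)\}")

def expand(text, env):
    """Expand ${NAME} from env; keep a backslash-escaped placeholder literal."""
    def replace(match):
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # Drop the backslash and leave the placeholder for a later stage.
            return "${" + name + "}"
        return env[name]
    return PLACEHOLDER.sub(replace, text)

template = "#!/bin/bash\n# app: ${APP_NAME}\n\\${command}\n"
print(expand(template, {"APP_NAME": "demo"}))
```

After expansion, `${APP_NAME}` has been replaced while `${command}` survives unchanged for the second stage.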


Using Agave commands, we make a directory on the storage server and deploy our wrapper file there.

In [16]:
!files-mkdir -S ${STORAGE_MACHINE} -N ${DEPLOYMENT_PATH}
!files-upload -F fork-wrapper.txt -S ${STORAGE_MACHINE} ${DEPLOYMENT_PATH}/

Successfully created folder agave-deployment
Uploading fork-wrapper.txt...
######################################################################## 100.0%


All Agave applications require a test file. The test file is a free-form text file in which you can specify the resources you might need to test your application.

In [17]:
writefile("fork-test.txt","""
command=date
fork-wrapper.txt
""")

Writing file `fork-test.txt'


In [18]:
!files-mkdir -S ${STORAGE_MACHINE} -N ${DEPLOYMENT_PATH}
!files-upload -F fork-test.txt -S ${STORAGE_MACHINE} ${DEPLOYMENT_PATH}/

Successfully created folder agave-deployment
Uploading fork-test.txt...
######################################################################## 100.0%


Like everything else in Agave, we describe our application with JSON. We specify which machines the application will use, what method it will use for submitting jobs, the job parameters and files, and so on.

In [20]:
writefile("fork-app.txt","""
{  
   "name":"${AGAVE_USERNAME}-${MACHINE_NAME}-fork",
   "version":"1.0",
   "label":"Runs a command",
   "shortDescription":"Runs a command",
   "longDescription":"",
   "deploymentSystem":"${STORAGE_MACHINE}",
   "deploymentPath":"${DEPLOYMENT_PATH}",
   "templatePath":"fork-wrapper.txt",
   "testPath":"fork-test.txt",
   "executionSystem":"${EXEC_MACHINE}",
   "executionType":"CLI",
   "parallelism":"SERIAL",
   "modules":[],
   "inputs":[
         {   
         "id":"datafile",
         "details":{  
            "label":"Data file",
            "description":"",
            "argument":null,
            "showArgument":false
         },
         "value":{  
            "default":"/dev/null",
            "order":0,
            "required":false,
            "validator":"",
            "visible":true
         }
      }   
   ],
   "parameters":[{
     "id" : "command",
     "value" : {
       "visible":true,
       "required":true,
       "type":"string",
       "order":0,
       "enquote":false,
       "default":"/bin/date",
       "validator":null
     },
     "details":{
         "label": "Command to run",
         "description": "This is the actual command you want to run. ex. df -h -d 1",
         "argument": null,
         "showArgument": false,
         "repeatArgument": false
     },
     "semantics":{
         "label": "Command to run",
         "description": "This is the actual command you want to run. ex. df -h -d 1",
         "argument": null,
         "showArgument": false,
         "repeatArgument": false
     }
   }],
   "outputs":[]
}
""")

Writing file `fork-app.txt'
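
An application description can get long, so a quick summary of what it expects from a job is sometimes handy. The following sketch parses a trimmed-down copy of the description and lists each input and parameter with its `required` flag. The `summarize` helper is hypothetical, and the trimmed JSON keeps only the fields this sketch reads.

```python
import json

# Trimmed copy of the app description: only ids and value blocks are kept.
app_json = """{
   "inputs": [
      {"id": "datafile", "value": {"default": "/dev/null", "required": false}}
   ],
   "parameters": [
      {"id": "command", "value": {"type": "string", "required": true, "default": "/bin/date"}}
   ]
}"""

def summarize(app):
    """Return (kind, id, required) for every input and parameter."""
    rows = []
    for kind in ("inputs", "parameters"):
        for item in app.get(kind, []):
            rows.append((kind, item["id"], bool(item["value"].get("required"))))
    return rows

for kind, ident, required in summarize(json.loads(app_json)):
    print(kind, ident, "required" if required else "optional")
```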


In [21]:
!apps-addupdate -F fork-app.txt

Successfully added app dooley-nectar-fork-1.0


<h2>Running Jobs</h2>
Now that we have specified our application using Agave, it is time to try running jobs. To start a job, we once again create a JSON file. The JSON file describes the app, which resource to run on, and how and when to send notifications. Notifications are delivered to a callback URL. Email is the easiest type to configure, but we also show how to send webhook notifications to the popular [RequestBin](https://requestb.in/) service.

As configured, this job sends email notifications only for the FINISHED and FAILED events. The requestbin receives a notification for every job event until the job reaches a terminal state. Other statuses exist as well; they are listed at http://docs.agaveplatform.org/#job-monitoring

In [22]:
writefile("job.txt","""
 {
   "name":"fork-command-1",
   "appId": "${AGAVE_USERNAME}-${MACHINE_NAME}-fork-1.0",
   "executionSystem": "${EXEC_MACHINE}",
   "archive": false,
   "notifications": [
    {
      "url":"${EMAIL}",
      "event":"FINISHED",
      "persistent":false
    },
    {
      "url":"${EMAIL}",
      "event":"FAILED",
      "persistent":false
    },
    {
      "url":"${REQUESTBIN_URL}?event=\${EVENT}&jobid=\${JOB_ID}",
      "event":"*",
      "persistent":true
    }
   ],
   "parameters": {
     "command":"echo hello"
   }
 }
""")

Writing file `job.txt'
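
The effect of the `event` field in each notification can be modeled locally. The sketch below is only an illustration of the filtering described above, with `*` treated as a wildcard; it is not Agave's matching code, and the function name `urls_for_event` and the example addresses are made up.

```python
def urls_for_event(notifications, event):
    """Return the callback targets whose event filter matches the given event."""
    return [n["url"] for n in notifications if n["event"] in ("*", event)]

# Same shape as the notifications list in the job file, with example targets.
notifications = [
    {"url": "user@example.com", "event": "FINISHED", "persistent": False},
    {"url": "user@example.com", "event": "FAILED", "persistent": False},
    {"url": "https://example.com/hook", "event": "*", "persistent": True},
]

# An intermediate event reaches only the webhook.
print(urls_for_event(notifications, "RUNNING"))

# A FINISHED event reaches both the email address and the webhook.
print(urls_for_event(notifications, "FINISHED"))
```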


Because the setvar() command can evaluate `$()`-style bash shell substitutions, we use it to submit our job. This captures the output of the submit command and allows us to parse it for the JOB_ID. We'll use the JOB_ID in several subsequent steps.

In [23]:
setvar("""
# Capture the output of the job submit command
OUTPUT=$(jobs-submit -F job.txt)
# Parse out the job id from the output
JOB_ID=$(echo $OUTPUT | cut -d' ' -f4)
""")

OUTPUT=Successfully submitted job 8508968551844286951-242ac114-0001-007
JOB_ID=8508968551844286951-242ac114-0001-007
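
The `cut -d' ' -f4` step takes the fourth space-separated word of the output, whatever that output is. If the submission fails, the fourth word of an error message would silently end up in JOB_ID. A slightly more defensive version in Python might look like the sketch below. The function name `parse_job_id` is hypothetical, and the expected prefix is taken from the success message shown above.

```python
def parse_job_id(output):
    """Extract the job id from a successful jobs-submit message."""
    prefix = "Successfully submitted job "
    text = output.strip()
    if not text.startswith(prefix):
        # Anything else is treated as a failed submission.
        raise ValueError("unexpected jobs-submit output: " + text)
    return text[len(prefix):].split()[0]

output = "Successfully submitted job 8508968551844286951-242ac114-0001-007"
print(parse_job_id(output))
```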


<h2>Job Monitoring and Output</h2>

While the job is running, the requestbin you registered will receive webhooks from Agave every time a job event occurs. To monitor this in real time, evaluate the next cell and visit the printed URL in your browser:

In [37]:
!echo ${REQUESTBIN_URL}?inspect

https://requestbin.agaveapi.co/voup3yvo?inspect


You can also monitor the job status by polling. Notifications by email or webhook use fewer resources, but polling is shown here for completeness.

In [24]:
for _ in range(20):
    setvar("STAT=$(jobs-status $JOB_ID)")
    stat = os.environ["STAT"]
    # Stop as soon as the job reaches a terminal state.
    if stat == "FINISHED" or stat == "FAILED":
        break
    sleep(5.0)

STAT=STAGED
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=FINISHED
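
The polling pattern can be factored into a reusable function that is easy to try out without contacting Agave. In the sketch below, the status lookup is passed in as a callable, so a fake sequence of statuses can stand in for `jobs-status`. The name `wait_for_job` and its parameters are assumptions of this sketch, and the set of terminal statuses mirrors the two checked in the loop above; add any others you care about.

```python
import time

TERMINAL = {"FINISHED", "FAILED"}

def wait_for_job(get_status, attempts=20, delay=5.0, sleep=time.sleep):
    """Poll get_status() until a terminal status or until attempts run out."""
    status = None
    for _ in range(attempts):
        status = get_status()
        if status in TERMINAL:
            break
        sleep(delay)
    # If attempts ran out, this is the last non-terminal status seen.
    return status

# A fake status source standing in for the real jobs-status call.
fake = iter(["STAGED", "SUBMITTING", "RUNNING", "FINISHED"])
print(wait_for_job(lambda: next(fake), delay=0))  # prints FINISHED
```

In the notebook, `get_status` could wrap the existing `setvar("STAT=$(jobs-status $JOB_ID)")` call and return `os.environ["STAT"]`.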


The jobs-history command provides a record of each step your job went through. If your job fails, this is your best diagnostic.

In [27]:
!echo jobs-history ${JOB_ID}
!jobs-history ${JOB_ID}

jobs-history 2784026591666368025-242ac11b-0001-007
Job accepted and queued for submission.
Skipping staging. No input data associated with this job.
Preparing job for submission.
Attempt 1 to submit job
Fetching app assets from agave://jetstream-storage-stevenrbrandt/agave-deployment
Staging runtime assets to agave://jetstream-exec-stevenrbrandt//home/jovyan/stevenrbrandt/job-2784026591666368025-242ac11b-0001-007-fork-command-1
CLI job successfully forked as process id 15413
CLI job successfully forked as process id 15413
Job receieved duplicate RUNNING notification
Job completed execution
Job completed. Skipping archiving at user request.


This command shows the job IDs and statuses of the last 5 jobs you ran.

In [29]:
!jobs-list -l 5

2784026591666368025-242ac11b-0001-007 FINISHED
7195273770654297625-242ac11b-0001-007 FINISHED
2035779517029608985-242ac11b-0001-007 FINISHED
7419306274200808985-242ac11b-0001-007 FINISHED
1021257183143390745-242ac11b-0001-007 FINISHED


This next command provides you with a list of all the files generated by your job. You can use it to figure out which files you want to retrieve with jobs-output-get.

In [25]:
!jobs-output-list --rich --filter=type,length,name ${JOB_ID}

.agave.archive
.agave.log
fork-command-1.err
fork-command-1.ipcexe
fork-command-1.out
fork-command-1.pid
fork-test.txt
fork-wrapper.txt


Retrieve the standard output.

In [30]:
!jobs-output-get ${JOB_ID} fork-command-1.out
!cat fork-command-1.out

######################################################################## 100.0%
hello


Retrieve the standard error output.

In [31]:
!jobs-output-get ${JOB_ID} fork-command-1.err
!cat fork-command-1.err




<h3>Automating</h3>
Because we're working in Python, we can simply glue the above steps together and create a script to run jobs for us and fetch the standard output. Let's do that next.

In [33]:
%%writefile runagavecmd.py
import os
from setvar import *

from time import sleep

def runagavecmd(cmd,infile=None):
    setvar("REMOTE_COMMAND="+cmd)
    # The input file is an optional parameter, both
    # to our function and to the Agave application.
    if infile is None:
        setvar("INPUTS={}")
    else:
        setvar('INPUTS={"datafile":"'+infile+'"}')
    setvar("JOB_FILE=job-remote-$PID.txt")
    # Create the Json for the job file.
    writefile("$JOB_FILE","""
 {
   "name":"fork-command-1",
   "appId": "${AGAVE_USERNAME}-${MACHINE_NAME}-fork-1.0",
   "executionSystem": "${EXEC_MACHINE}",
   "archive": false,
   "notifications": [
    {
      "url":"${REQUESTBIN_URL}?event=\${EVENT}&jobid=\${JOB_ID}",
      "event":"*",
      "persistent":true
    }
   ],
   "parameters": {
     "command":"${REMOTE_COMMAND}"
   },
   "inputs":${INPUTS}
 }""")
    # Run the job and capture the output.
    setvar("""
# Capture the output of the job submit command
OUTPUT=$(jobs-submit -F $JOB_FILE)
# Parse out the job id from the output
JOB_ID=$(echo $OUTPUT | cut -d' ' -f4)
""")
    # Poll and wait for the job to finish.
    for _ in range(80): # Excessively generous
        setvar("STAT=$(jobs-status $JOB_ID)")
        stat = os.environ["STAT"]
        # Stop as soon as the job reaches a terminal state.
        if stat == "FINISHED" or stat == "FAILED":
            break
        sleep(5.0)
    # Fetch the job output from the remote machine
    setvar("CMD=jobs-output-get ${JOB_ID} fork-command-1.out")
    os.system(os.environ["CMD"])
    print("All done! Output follows.")
    # Load the output into memory
    output=readfile("fork-command-1.out")
    print("=" * 70)
    print(output)

Writing runagavecmd.py
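
One limitation of assembling the job file by string substitution is that a command containing a double quote would produce invalid JSON. A possible alternative, sketched below, builds the job as a Python dictionary and lets `json.dumps` handle the escaping. The function name `build_job` and its arguments are assumptions of this sketch; the field names follow the job file above, with notifications left out for brevity.

```python
import json

def build_job(app_id, exec_system, command, infile=None):
    """Return the JSON text of a job request for the fork app."""
    job = {
        "name": "fork-command-1",
        "appId": app_id,
        "executionSystem": exec_system,
        "archive": False,
        "parameters": {"command": command},
        # The input file is optional, as in runagavecmd().
        "inputs": {} if infile is None else {"datafile": infile},
    }
    return json.dumps(job, indent=2)

# A command with embedded quotes is escaped correctly.
text = build_job("dooley-nectar-fork-1.0", "nectar-exec-dooley", 'echo "hello world"')
print(text)
```

The resulting text could be written to the job file in place of the template before calling `jobs-submit`.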


In [34]:
import runagavecmd as r
# Reload so that edits to runagavecmd.py take effect without restarting the kernel.
import importlib
importlib.reload(r)

<module 'runagavecmd' from '/home/jovyan/agave/runagavecmd.py'>

In [37]:
r.runagavecmd("lscpu")

REMOTE_COMMAND=lscpu
INPUTS={}
JOB_FILE=job-remote-11198.txt
Writing file `job-remote-11198.txt'
OUTPUT=Successfully submitted job 7682515682713136665-242ac11b-0001-007
JOB_ID=7682515682713136665-242ac11b-0001-007
STAT=PENDING
STAT=PENDING
STAT=STAGED
STAT=STAGED
STAT=STAGED
STAT=SUBMITTING
STAT=RUNNING
STAT=FINISHED
CMD=jobs-output-get 7682515682713136665-242ac11b-0001-007 fork-command-1.out
All done! Output follows.
Reading file `fork-command-1.out'
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                6
On-line CPU(s) list:   0-5
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             6
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Stepping:              2
CPU MHz:               2494.224
BogoMIPS:              4988.44
Virtualization:        VT-x
Hypervisor vendor: 

<h2>Permissions and Sharing</h2>

List the users and the permissions they have on the given job.

In [27]:
!jobs-pems-list ${JOB_ID}

dooley READ WRITE


Grant read permission to the user ktraxler.

In [28]:
# permissions: READ, WRITE, READ_WRITE, ALL, NONE
!jobs-pems-update -u ktraxler -p READ ${JOB_ID}

Successfully updated permission for ktraxler


Run the list again and see the modified result.

In [29]:
!jobs-pems-list ${JOB_ID}

dooley READ WRITE
ktraxler READ
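
If you want to check permissions from Python rather than by eye, the listing is simple to parse: each line holds a username followed by the permissions granted. The sketch below assumes exactly that line format, as seen in the output above. The function name `parse_permissions` is hypothetical.

```python
def parse_permissions(text):
    """Map each username to the set of permissions listed for it."""
    pems = {}
    for line in text.splitlines():
        parts = line.split()
        if parts:
            # Merge repeated lines for the same user into one set.
            pems.setdefault(parts[0], set()).update(parts[1:])
    return pems

listing = "dooley READ WRITE\nktraxler READ\n"
pems = parse_permissions(listing)
print("READ" in pems["ktraxler"])   # ktraxler can read the job
print("WRITE" in pems["ktraxler"])  # but cannot modify it
```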


Now do the same thing for the application itself, starting with a listing of its current permissions.

In [30]:
!apps-pems-list ${AGAVE_USERNAME}-${MACHINE_NAME}-fork-1.0

dooley READ WRITE EXECUTE
dooley READ WRITE EXECUTE

In [31]:
# permissions: READ, WRITE, EXECUTE, READ_WRITE, READ_EXECUTE, WRITE_EXECUTE, ALL, and NONE
!apps-pems-update -u ktraxler -p READ_EXECUTE ${AGAVE_USERNAME}-${MACHINE_NAME}-fork-1.0

Successfully updated permission for ktraxler


In [32]:
!apps-pems-list ${AGAVE_USERNAME}-${MACHINE_NAME}-fork-1.0

dooley READ WRITE EXECUTE
ktraxler READ EXECUTE
dooley READ WRITE EXECUTE


## Using the TOGO web portal  

Follow the link below to run your job from a web portal.

In [36]:
!echo http://togo.solveij.com/app/#/apps/${AGAVE_USERNAME}-${MACHINE_NAME}-fork-1.0/run

http://togo.solveij.com/app/#/apps/dooley-nectar-fork-1.0/run
