In [7]:
!mkdir -p ~/agave/funwave-tvd-docker

%cd ~/agave

!pip3 install setvar

import re
import os
import sys
from setvar import *
from time import sleep

# This cell enables inline plotting in the notebook
%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as plt
loadvar()
!auth-tokens-refresh || auth-tokens-create

/home/jovyan/agave
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
AGAVE_APP_DEPLOYMENT_PATH=agave-deploy
AGAVE_APP_NAME=training-dooley
AGAVE_EXECUTION_SYSTEM_ID=sandbox-exec-dooley
AGAVE_STORAGE_HOME_DIR=/home/jovyan
AGAVE_STORAGE_SYSTEM_ID=sandbox-storage-dooley
AGAVE_STORAGE_WORK_DIR=/home/jovyan
AGAVE_SYSTEM_SITE_DOMAIN=localdomain
MACHINE_NAME=sandbox
MACHINE_USERNAME=jovyan
PBTOK=asdfasd
SCRATCH_DIR=/home/jovyan
VM_IPADDRESS=18.216.53.253
Token for sandbox:dooley successfully refreshed and cached for 14400 seconds
352d5452a851e255c8456e663a16bd2


## Creating the Docker Image
To start, we need a Dockerfile, which consists of a number of simple commands.
It starts with "FROM", which can specify any Docker image available from Docker Hub. That includes not only basic operating systems such as "ubuntu", "fedora", "centos", etc., but also specialized containers made by anyone with a Docker Hub account. I've provided "science-base", which has OpenMPI 2.1.1 and some standard compilers, i.e. gfortran, gcc, and g++.

MAINTAINER is a bit of metadata that (hopefully) will allow you to contact the container's creator, if need be.

WORKDIR is the Dockerfile equivalent of the "cd" command. Note that running "cd" inside a RUN command will not change the directory for later instructions, because each RUN starts in a fresh shell.

RUN simply runs the command that follows. Because the image is saved as a new layer after each step, we want to avoid creating files that we don't want to keep (we want images to be as small as possible).

USER specifies the user id for running subsequent RUN commands.

COPY can be used to copy files into the container from the build directory.

ENTRYPOINT is a script that runs when the container starts up. What our script does is create a new user on the docker image with a user id and name that is convenient.

In [8]:
writefile("funwave-tvd-docker/Dockerfile","""
FROM stevenrbrandt/science-base

LABEL baseImage="stevenrbrandt/science-base:latest"
LABEL version="3"
LABEL software="FUNWAVE-TVD"
LABEL softwareVersion="v3.2-beta"
LABEL description="FUNWAVE–TVD is the TVD version of the fully nonlinear Boussinesq wave model (FUNWAVE) initially developed by Kirby et al. (1998)"
LABEL website="https://fengyanshi.github.io/build/html/index.html"
LABEL documentation="https://fengyanshi.github.io/build/html/index.html"
LABEL license="BSD 3-Clause"
LABEL tags="crc,fortran,tvd"

MAINTAINER Steven R. Brandt <sbrandt@cct.lsu.edu>

USER root
RUN mkdir -p /home/install
RUN chown jovyan /home/install
USER jovyan

RUN cd /home/install && \
    git clone https://github.com/fengyanshi/FUNWAVE-TVD && \
    cd FUNWAVE-TVD/src && \
    perl -p -i -e 's/FLAG_8 = -DCOUPLING/#$&/' Makefile && \
    make

WORKDIR /home/install/FUNWAVE-TVD/src
RUN mkdir -p /home/jovyan/rundir
WORKDIR /home/jovyan/rundir
""")

Writing file `funwave-tvd-docker/Dockerfile'
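Since every instruction is saved as its own build step, it can help to see how many steps a Dockerfile will produce. The following is a minimal sketch, not part of the original workflow: `dockerfile_instructions` is a hypothetical helper that treats a backslash-continued instruction as a single step and ignores blank lines and comments. It is demonstrated on a short sample rather than the full file.

```python
def dockerfile_instructions(text):
    """Return the instruction keywords of a Dockerfile, one per build step."""
    instructions = []
    continuing = False  # True while inside a backslash-continued instruction
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are not build steps
        if not continuing:
            instructions.append(line.split()[0].upper())
        continuing = line.endswith("\\")
    return instructions

# A short sample; the chained RUN counts as one step, not two.
sample = """
FROM stevenrbrandt/science-base
USER root
RUN mkdir -p /home/install && \\
    chown jovyan /home/install
USER jovyan
WORKDIR /home/jovyan/rundir
"""

steps = dockerfile_instructions(sample)
print(len(steps), steps)
```

Chaining commands with `&&` inside one RUN, as the FUNWAVE-TVD build step does, keeps the step count (and the number of saved layers) down. Applied to the Dockerfile above, the same count should come out as 19, in line with the `Step 1/19` reported in the build output further down.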


Now that we've created our Dockerfile, bundle it up in a tarball and send it somewhere that Agave can access it.
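The bundling step can also be done from Python with the standard `tarfile` module. This is only a sketch of an alternative to the `tar -czf ... -C funwave-tvd-docker Dockerfile` command in the next cell; `bundle` is a hypothetical helper, and the example works in a temporary directory with a one-line stand-in Dockerfile so that it does not touch the real build directory.

```python
import os
import tarfile
import tempfile

def bundle(src_dir, names, tar_path):
    """Write the named files from src_dir into a gzipped tarball.

    arcname drops the directory prefix, like tar's -C option.
    """
    with tarfile.open(tar_path, "w:gz") as tar:
        for name in names:
            tar.add(os.path.join(src_dir, name), arcname=name)

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in build directory with a one-line Dockerfile
    src = os.path.join(tmp, "funwave-tvd-docker")
    os.makedirs(src)
    with open(os.path.join(src, "Dockerfile"), "w") as f:
        f.write("FROM stevenrbrandt/science-base\n")

    tar_path = os.path.join(tmp, "dockerjob.tgz")
    bundle(src, ["Dockerfile"], tar_path)

    # List the archive contents to confirm the layout
    with tarfile.open(tar_path) as tar:
        members = tar.getnames()

print(members)
```

Storing the Dockerfile at the top level of the archive matters because the remote build command unpacks the tarball and then runs `docker build` in the current directory.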

In [9]:
!tar -czf dockerjob.tgz -C funwave-tvd-docker Dockerfile
!files-mkdir -S ${AGAVE_STORAGE_SYSTEM_ID} -N funwave-tvd-docker
!files-upload -F dockerjob.tgz -S ${AGAVE_STORAGE_SYSTEM_ID} funwave-tvd-docker/

Failed to connect to the remote system. Connection refused: Unable to contact SFTP server at 0.tcp.ngrok.io:16798

Uploading dockerjob.tgz...
######################################################################## 100.0%
Failed to connect to the remote system. Connection refused: Unable to contact SFTP server at 0.tcp.ngrok.io:16798



In [5]:
import runagavecmd as r
# importlib replaces the deprecated imp module for reloading
import importlib
importlib.reload(r)

<module 'runagavecmd' from '/home/jovyan/agave/runagavecmd.py'>

Run the docker build command. We will "tag" this build with the name "stevenrbrandt/funwave-tvd-2:latest" when it is complete.

In [54]:
r.runagavecmd(
    "tar xzf dockerjob.tgz && sudo docker build --rm -t stevenrbrandt/funwave-tvd-2:latest .",
    "agave://${AGAVE_STORAGE_SYSTEM_ID}/funwave-tvd-docker/dockerjob.tgz"
)

REMOTE_COMMAND=tar xzf dockerjob.tgz && sudo docker build --rm -t stevenrbrandt/funwave-tvd-2:latest .
REQUESTBIN_URL=https://requestbin.agaveapi.co/1n9zzow1

 ** QUERY STRING FOR REQUESTBIN **
https://requestbin.agaveapi.co/1n9zzow1?inspect

INPUTS={"datafile":"agave://sandbox-storage-dooley/funwave-tvd-docker/dockerjob.tgz"}
JOB_FILE=job-remote-19.txt
Writing file `job-remote-19.txt'
OUTPUT=Successfully submitted job 3418047882601894376-242ac114-0001-007
JOB_ID=3418047882601894376-242ac114-0001-007
STAT=PENDING
STAT=PROCESSING_INPUTS
STAT=STAGING_INPUTS
STAT=STAGED
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=FINISHED
CMD=jobs-output-get 3418047882601894376-242ac114-0001-007 fork-command-1.out
All done! Output follows.
Reading file `fork-command-1.out'
Sending build context to Docker daemon  13.82kB

Step 1/19 : FROM stevenrbrandt/science-base
 ---> e93db47f971c
Step 2/19 : LABEL baseImage "stevenrbrandt/science-base:latest"
 ---> Using cache
 

In [55]:
!jobs-output-get -P ${JOB_ID} fork-command-1.err
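The `STAT=` lines printed while the job runs suggest that the helper polls the job status until it reaches an end state. The internals of `runagavecmd` are not shown here, so the following is only a sketch of that general pattern: `wait_for_job` and `get_status` are hypothetical names, and treating FAILED and STOPPED as end states alongside FINISHED is an assumption.

```python
import time

# FINISHED appears in the output above; FAILED and STOPPED are assumed
# here to be the other end states.
TERMINAL = {"FINISHED", "FAILED", "STOPPED"}

def wait_for_job(get_status, interval=5.0, max_polls=1000, sleep=time.sleep):
    """Call get_status() until an end state is reached; return the history."""
    history = []
    for _ in range(max_polls):
        status = get_status()
        history.append(status)
        print("STAT=" + status)
        if status in TERMINAL:
            return history
        sleep(interval)
    raise TimeoutError("job did not reach an end state")

# Stand-in for a real status query: replays a canned sequence.
canned = iter(["PENDING", "STAGED", "SUBMITTING", "RUNNING", "FINISHED"])
history = wait_for_job(lambda: next(canned), interval=0.0)
```

Capping the number of polls keeps a stuck job from hanging the notebook indefinitely.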

## Running the Docker Image
It is possible to run Docker interactively, but that isn't convenient inside scripts. So instead, we start the container in detached mode, with the -d flag.

Because a Docker container has its own internal file system, it can't see files on the host machine. You can, however, transfer them using the "docker cp" command.

Running Docker is slightly tricky. When a container starts up, you can execute any command you want, but because we start it with --rm, all the changes you've made to its file system vanish when it exits.

The traditional solution would be to create a volume on the host. We don't do that here; therefore it's necessary to copy the output files out before the Docker container stops.

In [56]:
writefile("rundock.sh","""
rm -fr cid.txt out.tgz

# Start a docker image running in detached mode, write the container id to cid.txt
sudo docker run -d -it --rm --cidfile cid.txt stevenrbrandt/funwave-tvd-2:latest bash

# Store the container id in CID for convenience
CID=\$(cat cid.txt)

# Copy the input.txt file into the running image
sudo docker cp input.txt \$CID:/home/jovyan/rundir/

# Run funwave on the image
sudo docker exec --user jovyan \$CID mpirun -np 2 /home/install/FUNWAVE-TVD/src/funwave_vessel

# Extract the output files from the running image
# Having them in a tgz makes it more convenient to fetch them with jobs-output-get
sudo docker exec --user jovyan \$CID tar czf - output > out.tgz

# Stop the image
sudo docker stop \$CID

# List the output files
tar tzf out.tgz
""")

Writing file `rundock.sh'
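For readers who prefer to drive Docker from Python, the main steps of rundock.sh can be sketched as a list of argument vectors. This is a dry run that only prints the commands; `docker_steps` is a hypothetical helper, the container id is a placeholder, and the tar extraction step is left out for brevity.

```python
import shlex

def docker_steps(image, cid, input_file, rundir, command):
    """Build the argv lists for the run / cp / exec / stop sequence."""
    return [
        # Start a detached container and record its id in cid.txt
        ["docker", "run", "-d", "-it", "--rm", "--cidfile", "cid.txt", image, "bash"],
        # Copy the input file into the running container
        ["docker", "cp", input_file, f"{cid}:{rundir}/"],
        # Run the command inside the container as user jovyan
        ["docker", "exec", "--user", "jovyan", cid] + command,
        # Stop the container (with --rm this also removes it)
        ["docker", "stop", cid],
    ]

steps = docker_steps(
    image="stevenrbrandt/funwave-tvd-2:latest",
    cid="CONTAINER_ID",  # placeholder; the real id is read from cid.txt
    input_file="input.txt",
    rundir="/home/jovyan/rundir",
    command=["mpirun", "-np", "2", "/home/install/FUNWAVE-TVD/src/funwave_vessel"],
)

for argv in steps:
    print(shlex.join(argv))
```

To execute rather than print, each list could be passed to `subprocess.run(argv, check=True)`, prefixed with `sudo` where the host requires it.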


Upload the input.txt file and the rundock.sh script.

In [57]:
!tar czf rundock.tgz rundock.sh input.txt
!files-upload -F rundock.tgz -S ${AGAVE_STORAGE_SYSTEM_ID} funwave-tvd-docker/

Uploading rundock.tgz...
######################################################################## 100.0%


Execute the rundock.sh script

In [58]:
r.runagavecmd(
    "tar xzf rundock.tgz && bash rundock.sh",
    "agave://${AGAVE_STORAGE_SYSTEM_ID}/funwave-tvd-docker/rundock.tgz")

REMOTE_COMMAND=tar xzf rundock.tgz && bash rundock.sh
REQUESTBIN_URL=https://requestbin.agaveapi.co/1g4ynr91

 ** QUERY STRING FOR REQUESTBIN **
https://requestbin.agaveapi.co/1g4ynr91?inspect

INPUTS={"datafile":"agave://sandbox-storage-dooley/funwave-tvd-docker/rundock.tgz"}
JOB_FILE=job-remote-19.txt
Writing file `job-remote-19.txt'
OUTPUT=Successfully submitted job 7059965401245094376-242ac114-0001-007
JOB_ID=7059965401245094376-242ac114-0001-007
STAT=PENDING
STAT=PROCESSING_INPUTS
STAT=STAGING_INPUTS
STAT=STAGED
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=FINISHED
CMD=jobs-output-get 7059965401245094376-242ac114-0001-007 fork-command-1.out
All done! Output follows.
Reading

Get the output of the job back to our local machine

In [59]:
!jobs-output-list ${JOB_ID}
!jobs-output-get ${JOB_ID} out.tgz
!tar xzf out.tgz

.agave.archive
.agave.log
cid.txt
fork-command-1.err
fork-command-1.ipcexe
fork-command-1.out
fork-command-1.pid
fork-test.txt
fork-wrapper.txt
input.txt
out.tgz
rundock.sh
rundock.tgz
######################################################################## 100.0%


In [60]:
!head output/eta_00001

   -0.208179E-06    0.580880E-08    0.108330E-06   -0.999326E-07    0.430505E-07    0.431768E-07    0.432850E-07    0.433777E-07    0.434572E-07    0.435252E-07    0.435835E-07    0.436334E-07    0.436761E-07    0.437127E-07    0.437441E-07    0.437710E-07    0.437940E-07    0.438137E-07    0.438306E-07    0.438450E-07    0.438574E-07    0.438680E-07    0.438771E-07    0.438849E-07    0.438915E-07    0.438972E-07    0.439021E-07    0.439063E-07    0.398649E-07   -0.108576E-06   -0.112619E-06   -0.112617E-06   -0.112615E-06   -0.112613E-06   -0.112612E-06   -0.112610E-06   -0.112609E-06   -0.112608E-06   -0.112608E-06   -0.105228E-06   -0.112606E-06   -0.119985E-06   -0.112605E-06   -0.112605E-06   -0.112605E-06   -0.112604E-06   -0.112604E-06   -0.112604E-06   -0.112604E-06   -0.112603E-06   -0.112603E-06   -0.112603E-06   -0.186395E-06   -0.112602E-06   -0.388096E-07   -0.112602E-06   -0.112601E-06   -0.112601E-06   -0.112601E-06   -0.112600E-06   -0.112600E-06   -0.112599E-06   -0.38
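The output above looks like a whitespace-separated grid of floating point numbers, so it should be straightforward to load for inspection or plotting. The sketch below assumes that layout; `read_grid` is a hypothetical helper, and it is demonstrated on a tiny stand-in file built from the first few values shown above rather than on real output.

```python
import os
import tempfile

def read_grid(path):
    """Read a whitespace-separated ASCII grid into a list of rows of floats."""
    with open(path) as f:
        return [[float(tok) for tok in line.split()] for line in f if line.strip()]

# Demonstrate on a small stand-in file rather than real FUNWAVE output.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "eta_sample")
    with open(sample, "w") as f:
        f.write("  -0.208179E-06    0.580880E-08\n")
        f.write("   0.108330E-06   -0.999326E-07\n")
    grid = read_grid(sample)

nrows, ncols = len(grid), len(grid[0])
peak = max(abs(v) for row in grid for v in row)  # largest magnitude in the grid
print(nrows, ncols, peak)
```

With the real file, `read_grid("output/eta_00001")` would return the full grid, which could then be handed to `plt.imshow` from the matplotlib import at the top of the notebook.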

## Running with Singularity
If we have a public Docker image, we can run it directly with Singularity. Singularity is designed to be more HPC friendly than Docker. First, because it doesn't allow the running user to access any user id but their own inside the container, and second, because Singularity images can be run through MPI, making it easier to scale up to a distributed cluster.

In this first step, we build the singularity installation. Because the result of this job is intended to be an installation for subsequent jobs, we install it to a hard-coded directory rather than using the normal Agave job directory.

In [61]:
!files-mkdir -S ${AGAVE_STORAGE_SYSTEM_ID} -N sing
!files-upload -F input.txt -S ${AGAVE_STORAGE_SYSTEM_ID} sing/
r.runagavecmd(
            "mkdir -p ~/singu && "+
            "cd ~/singu && "+
            "rm -f funwave-tvd.img && "+
            "sudo singularity build funwave-tvd.img docker://stevenrbrandt/funwave-tvd-2:latest")

Successfully created folder sing
Uploading input.txt...
######################################################################## 100.0%
REMOTE_COMMAND=mkdir -p ~/singu && cd ~/singu && rm -f funwave-tvd.img && sudo singularity build funwave-tvd.img docker://stevenrbrandt/funwave-tvd-2:latest
REQUESTBIN_URL=https://requestbin.agaveapi.co/1f6assl1

 ** QUERY STRING FOR REQUESTBIN **
https://requestbin.agaveapi.co/1f6assl1?inspect

INPUTS={}
JOB_FILE=job-remote-19.txt
Writing file `job-remote-19.txt'
OUTPUT=Successfully submitted job 5121744173497848296-242ac114-0001-007
JOB_ID=5121744173497848296-242ac114-0001-007
STAT=PENDING
STAT=STAGED
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=FINISHED
CMD=jobs-output-get 5121744173497848296-242ac114-0001-007 fork-command-1.out
All done! Output follows.
Re

Now that the Singularity image is built, we can run it with MPI. Notice that mpirun executes the singularity command, rather than the other way around. The tricky part here is to make sure you've got the same version of MPI running inside and outside the container.
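One way to check the version match is to compare the text reported by `mpirun --version` on the host with the same command run inside the image. The sketch below only covers the comparison; the two strings are hard-coded examples in the Open MPI style, and the rule that matching major and minor numbers is good enough is a simplifying assumption, not a guarantee.

```python
import re

def mpi_version(version_output):
    """Extract (major, minor, patch) from `mpirun --version` style text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.groups())

def compatible(host_output, container_output):
    """Assumed rule: the installs match when major and minor agree."""
    return mpi_version(host_output)[:2] == mpi_version(container_output)[:2]

# Hard-coded examples; real strings would come from running mpirun
# on the host and inside the container.
host = "mpirun (Open MPI) 2.1.1"
container = "mpirun (Open MPI) 2.1.1"
print(compatible(host, container))
```

In practice the container string could be collected with something like `singularity exec ~/singu/funwave-tvd.img mpirun --version`.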

In [62]:
!files-upload -F input.txt -S ${AGAVE_STORAGE_SYSTEM_ID} ./
r.runagavecmd(
    "export LD_LIBRARY_PATH=/usr/local/lib && "+
    "mpirun -np 2 singularity exec ~/singu/funwave-tvd.img /home/install/FUNWAVE-TVD/src/funwave_vessel && "+
    "tar cvzf singout.tgz output",
    "agave://${AGAVE_STORAGE_SYSTEM_ID}/input.txt"
)

Uploading input.txt...
######################################################################## 100.0%
REMOTE_COMMAND=export LD_LIBRARY_PATH=/usr/local/lib && mpirun -np 2 singularity exec ~/singu/funwave-tvd.img /home/install/FUNWAVE-TVD/src/funwave_vessel && tar cvzf singout.tgz output
REQUESTBIN_URL=https://requestbin.agaveapi.co/tetpcbte

 ** QUERY STRING FOR REQUESTBIN **
https://requestbin.agaveapi.co/tetpcbte?inspect

INPUTS={"datafile":"agave://sandbox-storage-dooley/input.txt"}
JOB_FILE=job-remote-19.txt
Writing file `job-remote-19.txt'
OUTPUT=Successfully submitted job 646700512368202216-242ac114-0001-007
JOB_ID=646700512368202216-242ac114-0001-007
STAT=PENDING
STAT=PROCESSING_INPUTS
STAT=STAGING_INPUTS
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=FINISHED
CMD=jobs-output-get 646700512368202216-242ac114-0001-007 fork-command-1.out
All done! Out

In [68]:
!jobs-output-list --rich ${JOB_ID}
!jobs-output-get ${JOB_ID}  singout.tgz
!rm -fr output
!tar xzf singout.tgz

| name                  | length  | permission | type | lastModified           |
| ----                  | ------  | ---------- | ---- | ------------           |
| .agave.archive        | 68      | READ_WRITE | file | Jul 19, 2018   9:32 pm |
| .agave.log            | 395     | READ_WRITE | file | Jul 19, 2018   9:33 pm |
| fork-command-1.err    | 136     | READ_WRITE | file | Jul 19, 2018   9:32 pm |
| fork-command-1.ipcexe | 2651    | READ_WRITE | file | Jul 19, 2018   9:32 pm |
| fork-command-1.out    | 5850    | READ_WRITE | file | Jul 19, 2018   9:33 pm |
| fork-command-1.pid    | 5       | READ_WRITE | file | Jul 19, 2018   9:32 pm |
| fork-test.txt         | 29      | READ_WRITE | file | Jul 19, 2018   6:24 pm |
| fork-wrapper.txt      | 22      | READ_WRITE | file | Jul 19, 2018   6:24 pm |
| Grid_Range.out        | 108     | READ_WRITE | file | Jul 19, 2018   9:32 pm |
| input.txt             | 1753    | READ_WRITE | file | Jul 19, 2018   9:32 pm |
| LOG.txt             

In [69]:
!head output/v_00003

   -0.370381E-07   -0.264216E-07   -0.453174E-07   -0.615901E-07   -0.867047E-07   -0.916501E-07   -0.873622E-07   -0.295986E-07    0.365446E-08   -0.118685E-07   -0.359654E-07   -0.657907E-07   -0.491518E-07   -0.167981E-07   -0.261493E-07   -0.331495E-07   -0.531462E-07   -0.380234E-07   -0.715021E-08   -0.531698E-07   -0.288188E-07   -0.224122E-07   -0.117357E-07   -0.334322E-07   -0.607799E-07   -0.793287E-07   -0.801748E-07   -0.278102E-07    0.820207E-08   -0.176744E-07   -0.376689E-07   -0.505262E-07   -0.474473E-07   -0.120814E-07    0.521564E-08   -0.202734E-07   -0.502107E-07   -0.495059E-07   -0.170843E-07   -0.794791E-09    0.287557E-07    0.286107E-08   -0.407979E-07   -0.571593E-07   -0.166976E-07   -0.917486E-08   -0.159806E-08   -0.293239E-07   -0.487221E-07   -0.363046E-07   -0.312466E-07   -0.454842E-07   -0.737199E-07   -0.372320E-07   -0.263822E-07   -0.606216E-07   -0.521675E-07   -0.414295E-08   -0.285707E-08   -0.387530E-07   -0.759180E-07   -0.594871E-07   -0.38

In this next and final singularity example, we get around the problem of needing to port MPI by using the same MPI that's in the container to launch the containers. The trick is this code, which comes at the end of the .bashrc. What it does is effectively replace our shell on the machine with an execution of bash inside the singularity image.

```
# Put the full path to a singularity image in the file $HOME/work/sing.txt.
if [ -r $HOME/work/sing.txt ]
then
    IMAGE=$(cat $HOME/work/sing.txt)
fi
if [ "$IMAGE" != "" ]
then
    if [ -r "$IMAGE" ]
    then
        # If the SINGULARITY_CONTAINER variable is set,
        # then we are already in the container
        if [ "$SINGULARITY_CONTAINER" = "" ]
        then
            # Switch to running inside singularity
            exec singularity exec $IMAGE bash --login
        fi
    else
        echo Could not read image file $IMAGE
    fi
fi
```
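The snippet has four possible outcomes, which are easier to see when written as a pure function. The following Python mirror is only an illustration of the decision logic: `bashrc_action` is a hypothetical name, and it assumes IMAGE is not already set in the environment before .bashrc runs.

```python
import os
import tempfile

def bashrc_action(sing_txt, env):
    """Return what the .bashrc snippet would do for a given sing.txt and environment."""
    if not os.access(sing_txt, os.R_OK):
        return "normal shell"  # no sing.txt, so IMAGE stays empty
    with open(sing_txt) as f:
        image = f.read().strip()
    if image == "":
        return "normal shell"  # sing.txt exists but names no image
    if not os.access(image, os.R_OK):
        return "error: could not read image"
    if env.get("SINGULARITY_CONTAINER", "") != "":
        return "normal shell"  # already inside the container
    return "exec singularity"

with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "funwave-tvd.img")
    sing_txt = os.path.join(tmp, "sing.txt")
    open(image, "w").close()  # empty stand-in for the image file
    with open(sing_txt, "w") as f:
        f.write(image + "\n")

    outside = bashrc_action(sing_txt, {})
    inside = bashrc_action(sing_txt, {"SINGULARITY_CONTAINER": image})
    missing = bashrc_action(os.path.join(tmp, "absent.txt"), {})

print(outside, inside, missing, sep=" | ")
```

The check on SINGULARITY_CONTAINER is what prevents an endless loop, since the `bash --login` started inside the container reads .bashrc again.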

In [72]:
!echo /home/jovyan/singu/funwave-tvd.img > ~/work/sing.txt

There's no need to call singularity explicitly, as it's called by each invocation of .bashrc. Note that the funwave-tvd we are executing is the one from inside the image.

In [73]:
!files-upload -F input.txt -S ${AGAVE_STORAGE_SYSTEM_ID} ./
r.runagavecmd(
    "rm -fr output && "+
    "mpirun -np 2 /home/install/FUNWAVE-TVD/src/funwave_vessel && "+
    "tar cvzf singout.tgz output",
    "agave://${AGAVE_STORAGE_SYSTEM_ID}/input.txt"
)

Uploading input.txt...
######################################################################## 100.0%
REMOTE_COMMAND=rm -fr output && mpirun -np 2 /home/install/FUNWAVE-TVD/src/funwave_vessel && tar cvzf singout.tgz output
REQUESTBIN_URL=https://requestbin.agaveapi.co/1j34nqr1

 ** QUERY STRING FOR REQUESTBIN **
https://requestbin.agaveapi.co/1j34nqr1?inspect

INPUTS={"datafile":"agave://sandbox-storage-dooley/input.txt"}
JOB_FILE=job-remote-19.txt
Writing file `job-remote-19.txt'
OUTPUT=Successfully submitted job 3604058311987171816-242ac114-0001-007
JOB_ID=3604058311987171816-242ac114-0001-007
STAT=PENDING
STAT=PROCESSING_INPUTS
STAT=STAGING_INPUTS
STAT=STAGED
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=SUBMITTING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=RUNNING
STAT=FINISHED
CMD=jobs-output-get 3604058311987171816-242ac114-0001-007 fork-command-1.out
All done! Output follows.
Reading fil

In [None]:
!rm -fr output singout.tgz
!jobs-output-get ${JOB_ID} singout.tgz
!tar xzf singout.tgz
!ls output

In [None]:
# Clean up so that we don't boot into the singularity image without intending to
!rm -f ~/work/sing.txt