# Tapis v3 Hands-on

In this notebook, you will use Tapis v3 to create two systems and an application, which you will then use to run
an MPM job on an HPC-like VM.

To execute each `In[#]` cell, click inside the cell and press `Shift + Enter`.

## Install the Tapipy Python SDK

In [37]:
pip install tapipy

Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


## Enter training account information

To get started, run the following cell and enter the training account information provided to you. The username and password are the same, in the form trainingXX/trainingXX.

In [54]:
import getpass

tenant = 'training'
base_url = 'https://' + tenant + '.tapis.io'

username = input('Username: ')
password = getpass.getpass(prompt='Password: ', stream=None)


## Authenticate and initialize Tapis v3 client

Using this information, you can now use `tapipy` to authenticate in the tenant and initialize the
Tapis v3 client. You should see your token information displayed. This may take a while to run but should take
no more than 30 seconds.

In [75]:
from tapipy.tapis import Tapis
#Create python Tapis client for user
client = Tapis(base_url= base_url, username=username, password=password)
# *** Tapis v3: Call to Tokens API
client.get_tokens()
# Print Tapis v3 token
client.access_token


access_token: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJqdGkiOiJmZWUxNzE5Ny1iNmMyLTQ3NjQtYTc3Yi1mYTAzNmViNTY4NjEiLCJpc3MiOiJodHRwczovL3RyYWluaW5nLnRhcGlzLmlvL3YzL3Rva2VucyIsInN1YiI6InRyYWluaW5nMUB0cmFpbmluZyIsInRhcGlzL3RlbmFudF9pZCI6InRyYWluaW5nIiwidGFwaXMvdG9rZW5fdHlwZSI6ImFjY2VzcyIsInRhcGlzL2RlbGVnYXRpb24iOmZhbHNlLCJ0YXBpcy9kZWxlZ2F0aW9uX3N1YiI6bnVsbCwidGFwaXMvdXNlcm5hbWUiOiJ0cmFpbmluZzEiLCJ0YXBpcy9hY2NvdW50X3R5cGUiOiJ1c2VyIiwiZXhwIjoxNjg5NzExMDQ2LCJ0YXBpcy9jbGllbnRfaWQiOm51bGwsInRhcGlzL2dyYW50X3R5cGUiOiJwYXNzd29yZCJ9.R4NjUA7ltP-YGOPxh2PwYG8P-LYuR4ab3Eea0bIVCEk0A3UWvMlzqRf1XXmVQt4-pxyQtUCPdpdwUlwshdfkHUVZCiU6AlSakJ9bizKYH223RCjZwjbjFpeE-bbmBkNt7PSmVjKdzaaSbyeKR8H1yCUKREEvzbKDYCcgxrLiTDYr5RJxUNyZnZdRkEykQZIV9adYdpOLWyh6l9bLdW48GsohvMcPysw2GNk7C13cI8Of5RPxHpMXDFVBF7XcV0DBtGTC_UoxsReQGmEJuET-8OaEQejTjDDYiJoNkz0Aop2l-BP354VYs7oxOjrPpbegH1At8iVSThJS6bH-EjVKrA
claims: {'jti': 'fee17197-b6c2-4764-a77b-fa036eb56861', 'iss': 'https://training.tapis.io/v3/tokens', 'sub': 'training1@training', '
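
The access token printed above is a JSON Web Token (JWT): three base64url-encoded segments separated by dots. As a minimal sketch, its claims can be inspected locally with the standard library alone. The helper names and the demo token below are illustrative and not part of tapipy, and the signature is not verified, so this is suitable for inspection only.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo only)."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Synthetic token for demonstration; not a real Tapis token.
demo_token = ".".join([
    _segment({"typ": "JWT", "alg": "none"}),
    _segment({"sub": "training1@training", "tapis/tenant_id": "training"}),
    "signature-placeholder",
])
claims = decode_jwt_claims(demo_token)
print(claims["sub"])
```

In the notebook, the raw token string appears to be available as `client.access_token.access_token`, judging by the fields printed above.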

In order to create Tapis systems, we need an actual user on the VM. For simplicity, we have created the same trainingXX user on the host. The password is the alphanumeric password provided to you.

In [55]:
password_vm = getpass.getpass(prompt='Password for VM: ', stream=None)
host = input('Host: ')

## Systems

In this section we create two Tapis systems: one for running on a VM host using FORK and one for running on an HPC-type host using BATCH.

Note that although it is possible, we have not provided any login credentials in the system definitions.
Well-crafted system definitions are likely to be copied and re-used, so, for security reasons, it is recommended that
login credentials be registered using separate API calls as discussed below.

### Create a system for the VM host

In [56]:
user_id = username
system_id_vm = "tapis-vm-" + user_id + "-test2"

# Create the system definition
exec_system_vm = {
  "id": system_id_vm,
  "description": "Test system",
  "systemType": "LINUX",
  "host": host,
  "defaultAuthnMethod": "PASSWORD",
  "rootDir": "/home/"+user_id,
  "canExec": True,
  "jobRuntimes": [ { "runtimeType": "SINGULARITY" } ],
  "jobWorkingDir": "workdir",
}

# Use the client to create the system in Tapis
print("****************************************************")
print("Create system: " + system_id_vm)
print("****************************************************")
client.systems.createSystem(**exec_system_vm)


# If you need to update the system, you can modify the original definition and use the putSystem call.
# - modify the above definition as needed
# - comment out the above line with the call to createSystem()
# - uncomment the below line with the call to putSystem()
# - re-run the cell
# Note that not all attributes may be updated.
#client.systems.putSystem(**exec_system_vm, systemId=system_id_vm)

****************************************************
Create system: tapis-vm-training1-test2
****************************************************



url: http://training.tapis.io/v3/systems/tapis-vm-training1-test2

In [43]:
# You can also update just a few attributes using the patchSystem call.
# Note that not all attributes may be updated and some attributes, such as *enabled*,
#   may only be updated using a specific call.
# For example, to update the description, first define the json to be used:
patch_system_vm = {
  "description": "System for testing jobs on a VM for Tapis tutorial"
}

# Then use the client to make the update:
client.systems.patchSystem(**patch_system_vm, systemId=system_id_vm)


url: http://training.tapis.io/v3/systems/tapis-vm-training1-test1
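
A patch body only needs the attributes that changed. As a rough sketch, the helper below (a hypothetical name, `build_patch`, not part of tapipy) compares an edited definition with the original and keeps only the differing keys. It does not know which attributes Tapis allows to be patched, so the result should still be checked against the Systems documentation. The definitions are stand-ins with a made-up id.

```python
def build_patch(original: dict, edited: dict) -> dict:
    """Return the keys of `edited` whose values differ from `original`."""
    return {key: value for key, value in edited.items()
            if original.get(key) != value}


# Stand-in definitions mirroring the tutorial's VM system.
original_def = {"id": "tapis-vm-demo", "description": "Test system", "canExec": True}
edited_def = dict(original_def,
                  description="System for testing jobs on a VM for Tapis tutorial")

patch = build_patch(original_def, edited_def)
print(patch)
```

The resulting dictionary could then be passed to the same call as in the cell above, for example `client.systems.patchSystem(**patch, systemId=system_id_vm)`.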

In [57]:
# List all systems available to you
print("****************************************************")
print("List all systems")
print("****************************************************")
client.systems.getSystems()

****************************************************
List all systems
****************************************************


[
 canExec: true
 defaultAuthnMethod: PASSWORD
 effectiveUserId: training1
 host: 129.114.35.143
 id: tapis-vm-training1-test1
 owner: training1
 parentId: None
 systemType: LINUX,
 
 canExec: true
 defaultAuthnMethod: PASSWORD
 effectiveUserId: training1
 host: 129.114.35.143
 id: tapis-vm-training1-test2
 owner: training1
 parentId: None
 systemType: LINUX,
 
 canExec: true
 defaultAuthnMethod: PASSWORD
 effectiveUserId: training1
 host: 129.114.35.184
 id: tapis-vm-scblacktraining1
 owner: training1
 parentId: None
 systemType: LINUX,
 
 canExec: true
 defaultAuthnMethod: PASSWORD
 effectiveUserId: training1
 host: 129.114.35.184
 id: tapis-hpc-scblack-training1
 owner: training1
 parentId: None
 systemType: LINUX,
 
 canExec: true
 defaultAuthnMethod: PASSWORD
 effectiveUserId: training1
 host: 129.114.35.143
 id: tapis-vm-training1tsi123
 owner: training1
 parentId: None
 systemType: LINUX]

In [101]:
#client.systems.deleteSystem(systemId='tapis-vm-training1')


changes: 1

In [58]:
# Get details for the system you created
print("****************************************************")
print("Fetch system: " + system_id_vm)
print("****************************************************")
client.systems.getSystem(systemId=system_id_vm)

****************************************************
Fetch system: tapis-vm-training1-test2
****************************************************



allowChildren: False
authnCredential: None
batchDefaultLogicalQueue: None
batchLogicalQueues: []
batchScheduler: None
batchSchedulerProfile: None
bucketName: None
canExec: True
canRunBatch: False
created: 2023-07-17T23:24:40.491839Z
defaultAuthnMethod: PASSWORD
deleted: False
description: Test system
dtnMountPoint: None
dtnMountSourcePath: None
dtnSystemId: None
effectiveUserId: training1
enableCmdPrefix: False
enabled: True
host: 129.114.35.143
id: tapis-vm-training1-test2
importRefId: None
isDtn: False
isDynamicEffectiveUser: True
isPublic: False
jobCapabilities: []
jobEnvVariables: []
jobMaxJobs: 2147483647
jobMaxJobsPerUser: 2147483647
jobRuntimes: [
runtimeType: SINGULARITY
version: None]
jobWorkingDir: workdir
mpiCmd: None
notes: 

owner: training1
parentId: None
port: -1
proxyHost: None
proxyPort: -1
rootDir: /home/training1
systemType: LINUX
tags: []
tenant: training
updated: 2023-07-17T23:24:40.491839Z
useProxy: False
uuid: 6793cee5-05fa-47e4-b897-00b12d313fd3

### Register Credentials for the VM system

After creating the system, you will need to register credentials for your username. These will be used by Tapis to
access the host. Various authentication methods can be used to access a system, such as PASSWORD and PKI_KEYS. For the
VM a password is used.

In [59]:
# Register credentials
client.systems.createUserCredential(systemId=system_id_vm, userName=user_id, password=password_vm)

{'result': None,
 'status': 'success',
 'message': 'SYSAPI_CRED_UPDATED Credential updated. jwtTenant: training jwtUser: training1 OboTenant: training OboUser: training1 System: tapis-vm-training1-test2 User: training1',
 'version': '1.3.3',
 'commit': '8c66599c',
 'build': '2023-06-01T13:10:24Z',
 'metadata': None}

Now you can use the client to list files on the system. This will confirm that the credentials are valid.

In [60]:
# List files at the rootDir for the system
client.files.listFiles(systemId=system_id_vm, path="/")

[
 group: 1052
 lastModified: 2023-06-28T22:08:46Z
 mimeType: None
 name: .bash_history
 nativePermissions: rw-------
 owner: 1052
 path: .bash_history
 size: 2039
 type: file
 url: tapis://tapis-vm-training1-test2/.bash_history,
 
 group: 1052
 lastModified: 2022-08-02T07:41:39Z
 mimeType: None
 name: .bash_logout
 nativePermissions: rw-r--r--
 owner: 1052
 path: .bash_logout
 size: 18
 type: file
 url: tapis://tapis-vm-training1-test2/.bash_logout,
 
 group: 1052
 lastModified: 2022-08-02T07:41:39Z
 mimeType: None
 name: .bash_profile
 nativePermissions: rw-r--r--
 owner: 1052
 path: .bash_profile
 size: 141
 type: file
 url: tapis://tapis-vm-training1-test2/.bash_profile,
 
 group: 1052
 lastModified: 2022-08-02T07:41:39Z
 mimeType: None
 name: .bashrc
 nativePermissions: rw-r--r--
 owner: 1052
 path: .bashrc
 size: 376
 type: file
 url: tapis://tapis-vm-training1-test2/.bashrc,
 
 group: 1052
 lastModified: 2023-06-28T17:54:21Z
 mimeType: None
 name: .viminfo
 nativePermissions: rw
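
The listing is returned as a list of objects whose fields (`name`, `size`, `type`, and so on) appear in the output above. The following sketch shows how such a listing might be filtered and totalled. `SimpleNamespace` objects stand in for the real results so the example runs without a Tapis connection; `summarize_listing` is a hypothetical helper, and the `workdir` directory entry with type `dir` is made up for illustration.

```python
from types import SimpleNamespace


def summarize_listing(entries, suffix=""):
    """Return (names, total_size) for file entries whose name ends with `suffix`."""
    files = [e for e in entries if e.type == "file" and e.name.endswith(suffix)]
    return [e.name for e in files], sum(e.size for e in files)


# Stand-ins for the objects returned by client.files.listFiles(...)
listing = [
    SimpleNamespace(name=".bash_logout", size=18, type="file"),
    SimpleNamespace(name=".bash_profile", size=141, type="file"),
    SimpleNamespace(name=".bashrc", size=376, type="file"),
    SimpleNamespace(name="workdir", size=4096, type="dir"),  # hypothetical entry
]

names, total = summarize_listing(listing)
print(names, total)
```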

### Create a system for the HPC cluster

With just a few changes to the system definition you can create a second system that can be used to run the
same application on an HPC type host. Note the minimal changes:

* **id** - A unique id is required
* **host** - Main hostname for the HPC system.
* **rootDir** - Using the root directory of the host gives us flexibility in setting **jobWorkingDir**.
  Note that you still need LINUX permissions.
* **jobWorkingDir** - Now determined dynamically using the Tapis v3 function HOST_EVAL()
* **jobRuntimes** - Most HPC systems support Singularity and not Docker.
* **batchDefaultLogicalQueue** - Logical queue to use by default.
* **batchLogicalQueues** - HPC queue definitions for this HPC system.
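
Because the HPC definition differs from the VM definition in only a few keys, one way to make the differences explicit is to derive it by merging overrides into a copy. This is a sketch with placeholder values (the host and ids are made up), not the cell this tutorial uses.

```python
vm_def = {
    "id": "tapis-vm-demo",
    "systemType": "LINUX",
    "host": "vm.example.org",  # placeholder host
    "defaultAuthnMethod": "PASSWORD",
    "rootDir": "/home/demo",
    "canExec": True,
    "jobRuntimes": [{"runtimeType": "SINGULARITY"}],
    "jobWorkingDir": "workdir",
}

hpc_overrides = {
    "id": "tapis-hpc-demo",
    "canRunBatch": True,
    "batchScheduler": "SLURM",
    "batchDefaultLogicalQueue": "tapisNormal",
    "batchLogicalQueues": [{"name": "tapisNormal", "hpcQueueName": "normal"}],
}

# Later keys win, so the overrides replace or extend the VM settings.
hpc_def = {**vm_def, **hpc_overrides}

# Keys that are new or different in the HPC definition
changed = sorted(k for k in hpc_def if vm_def.get(k) != hpc_def[k])
print(changed)
```

The merge is shallow, so nested values such as the `jobRuntimes` list are shared between the two dictionaries. That is harmless here because neither is modified afterwards.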

In [72]:
user_id = username
system_id_hpc = "tapis-hpc-" + user_id 

# Create the system definition
exec_system_hpc = {
  "id": system_id_hpc,
  "description": "System for testing jobs on an HPC type host for tapis tutorial",
  "systemType": "LINUX",
  "host": host,
  "defaultAuthnMethod": "PASSWORD",
  "rootDir": "/home/"+user_id,
  "canExec": True,
  "jobRuntimes": [ { "runtimeType": "SINGULARITY" } ],
  "jobWorkingDir": "workdir",
  "canRunBatch": True,
  "batchScheduler": "SLURM",
  "batchSchedulerProfile": "tacc",
  "batchDefaultLogicalQueue": "tapisNormal",
  "batchLogicalQueues": [
    {
      "name": "tapisNormal",
      "hpcQueueName": "normal",
      "maxJobs": 50,
      "maxJobsPerUser": 10,
      "minNodeCount": 1,
      "maxNodeCount": 16,
      "minCoresPerNode": 1,
      "maxCoresPerNode": 68,
      "minMemoryMB": 1,
      "maxMemoryMB": 16384,
      "minMinutes": 1,
      "maxMinutes": 60
    }
  ]
}

# Use the client to create the system in Tapis
print("****************************************************")
print("Create system: " + system_id_hpc)
print("****************************************************")
client.systems.createSystem(**exec_system_hpc)

# If you need to update the system,
# - modify the above definition as needed
# - comment out the above line
# - uncomment the below line
# - re-run the cell
#client.systems.putSystem(**exec_system_hpc, systemId=system_id_hpc)


****************************************************
Create system: tapis-hpc-training1
****************************************************



url: http://training.tapis.io/v3/systems/tapis-hpc-training1
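
The `batchLogicalQueues` entry in the definition above bounds what a job may request. The sketch below checks a request against those bounds before submission. `fits_queue` is a hypothetical helper, and treating each min/max pair as an inclusive range is an assumption rather than something stated in this notebook.

```python
queue = {
    "name": "tapisNormal",
    "minNodeCount": 1, "maxNodeCount": 16,
    "minCoresPerNode": 1, "maxCoresPerNode": 68,
    "minMemoryMB": 1, "maxMemoryMB": 16384,
    "minMinutes": 1, "maxMinutes": 60,
}


def fits_queue(queue, node_count, cores_per_node, memory_mb, minutes):
    """Return a list of limit violations (empty if the request fits)."""
    requested = {
        "NodeCount": node_count,
        "CoresPerNode": cores_per_node,
        "MemoryMB": memory_mb,
        "Minutes": minutes,
    }
    problems = []
    for suffix, value in requested.items():
        low, high = queue["min" + suffix], queue["max" + suffix]
        if not low <= value <= high:
            problems.append(f"{suffix}={value} outside [{low}, {high}]")
    return problems


ok = fits_queue(queue, node_count=2, cores_per_node=16, memory_mb=4096, minutes=30)
too_long = fits_queue(queue, node_count=2, cores_per_node=16, memory_mb=4096, minutes=120)
print(ok, too_long)
```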

In [109]:
# List all systems available to you
print("****************************************************")
print("List all systems")
print("****************************************************")
client.systems.getSystems()

****************************************************
List all systems
****************************************************


[]

In [110]:
# Get details for the system you created
print("****************************************************")
print("Fetch system: " + system_id_hpc)
print("****************************************************")
client.systems.getSystem(systemId=system_id_hpc)

****************************************************
Fetch system: tapis-hpc-training1
****************************************************


### Register Credentials for the HPC system

As before, now you will need to register credentials for your username. These will be used by Tapis to
access the host.

In [75]:
password_hpc = password_vm
# Register credentials
client.systems.createUserCredential(systemId=system_id_hpc, userName=user_id, password=password_hpc)

{'result': None,
 'status': 'success',
 'message': 'SYSAPI_CRED_UPDATED Credential updated. jwtTenant: training jwtUser: training1 OboTenant: training OboUser: training1 System: tapis-hpc-training1 User: training1',
 'version': '1.3.3',
 'commit': '8c66599c',
 'build': '2023-06-01T13:10:24Z',
 'metadata': None}

Now you can use the client to list files on the system. This will confirm that the credentials are valid.

In [76]:
# List files at the rootDir for the system
path_to_list = "/"
client.files.listFiles(systemId=system_id_hpc, path=path_to_list)

[
 group: 1052
 lastModified: 2023-06-28T22:08:46Z
 mimeType: None
 name: .bash_history
 nativePermissions: rw-------
 owner: 1052
 path: .bash_history
 size: 2039
 type: file
 url: tapis://tapis-hpc-training1/.bash_history,
 
 group: 1052
 lastModified: 2022-08-02T07:41:39Z
 mimeType: None
 name: .bash_logout
 nativePermissions: rw-r--r--
 owner: 1052
 path: .bash_logout
 size: 18
 type: file
 url: tapis://tapis-hpc-training1/.bash_logout,
 
 group: 1052
 lastModified: 2022-08-02T07:41:39Z
 mimeType: None
 name: .bash_profile
 nativePermissions: rw-r--r--
 owner: 1052
 path: .bash_profile
 size: 141
 type: file
 url: tapis://tapis-hpc-training1/.bash_profile,
 
 group: 1052
 lastModified: 2022-08-02T07:41:39Z
 mimeType: None
 name: .bashrc
 nativePermissions: rw-r--r--
 owner: 1052
 path: .bashrc
 size: 376
 type: file
 url: tapis://tapis-hpc-training1/.bashrc,
 
 group: 1052
 lastModified: 2023-06-28T17:54:21Z
 mimeType: None
 name: .viminfo
 nativePermissions: rw-------
 owner: 1052

## Application

In order to run a job on a system you will need to create a Tapis application.

### Create an application that can be run on the VM host or the HPC cluster

In [62]:
user_id = username
app_id = "mpm-docker-" + user_id + "-test2"

# Create the application definition
app_def = {
    "id": app_id,
    "version": "dev",
    "jobType": "FORK",
    "runtime": "DOCKER",
    "description": "High-Performance Material Point Method (CB-Geo mpm) DEVELOPMENT version.",
    "containerImage": "tapis/mpm:dev",
    "jobAttributes": {
        "isMpi": False,
        "parameterSet": {
            "appArgs": [
                {"name": "directoryInputFlag", "arg": "-f", "inputMode": "FIXED"},
                {"name": "directoryInput", "arg": "/home/cbgeo/research/mpm-benchmarks/2d/uniaxial_stress/", "inputMode": "REQUIRED"}
            ] 
        },
        "fileInputs": [
            {
                "name": "directoryInput",
                "inputMode": "OPTIONAL",
                "targetPath": ".",
                "description": "Input directory that contains the MPM configuration file as well as any other required files. Note that to utilize this attribute one must also set the directoryInput parameter to be the value of the name of the directory. Also note that if this directory is not provided, a default (included in the application container image) will be used."
            }
        ]
    }

}

# Use the client to create the application in Tapis
print("****************************************************")
print("Create application: " + app_id)
print("****************************************************")
client.apps.createAppVersion(**app_def)

# If you need to update the application,
# - modify the above definition as needed
# - comment out the above line
# - uncomment the below line
# - re-run the cell
#client.apps.putApp(**app_def, appId=app_id, appVersion="dev")

****************************************************
Create application: mpm-docker-training1-test2
****************************************************



url: http://training.tapis.io/v3/apps/mpm-docker-training1-test2

In [63]:
# List all applications available to you
print("****************************************************")
print("List all applications")
print("****************************************************")
client.apps.getApps()

****************************************************
List all applications
****************************************************


[
 id: mpm-docker-scblack-training1
 owner: training1
 version: dev,
 
 id: mpm-docker-training1-test2
 owner: training1
 version: dev,
 
 id: mpm-docker-training1-tsi123
 owner: training1
 version: dev,
 
 id: mpm-docker-training1
 owner: training1
 version: dev]

In [64]:
# Get details for the application you created
print("****************************************************")
print("Fetch application: " + app_id)
print("****************************************************")
client.apps.getAppLatestVersion(appId=app_id)

****************************************************
Fetch application: mpm-docker-training1-test2
****************************************************



containerImage: tapis/mpm:dev
created: 2023-07-17T23:26:17.608201Z
deleted: False
description: High-Performance Material Point Method (CB-Geo mpm) DEVELOPMENT version.
enabled: True
id: mpm-docker-training1-test2
isPublic: False
jobAttributes: 
archiveOnAppError: False
archiveSystemDir: None
archiveSystemId: None
cmdPrefix: None
coresPerNode: 1
description: None
dynamicExecSystem: False
execSystemConstraints: None
execSystemExecDir: None
execSystemId: None
execSystemInputDir: None
execSystemLogicalQueue: None
execSystemOutputDir: None
fileInputArrays: []
fileInputs: [
autoMountLocal: True
description: Input directory that contains the MPM configuration file as well as any other required files. Note that to utilize this attribute one must also set the directoryInput parameter to be the value of the name of the directory. Also note that if this directory is not provided, a default (included in the application container image) will be used.
inputMode: OPTIONAL
name: directoryInput
source

## Jobs

We will run two jobs, one on the VM host using FORK and one on the HPC type host using BATCH.

We will use the same Tapis application to run both jobs.

### Part 1: Run Material Point Method (MPM) app on a Virtual Machine.


In [65]:
# Run MPM app on a Virtual Machine

# Submit a job
job_response_vm=client.jobs.submitJob(name='mpm-job-vm',description='material point method',appId=app_id,execSystemId=system_id_vm,appVersion= 'dev')

### Get Job submission response


In [67]:
# Get Job submission response
print("****************************************************")
print("Job Submitted: " + app_id)
print("****************************************************")
print(job_response_vm)

****************************************************
Job Submitted: mpm-docker-training1-test2
****************************************************

_fileInputsSpec: None
_parameterSetModel: None
appId: mpm-docker-training1-test2
appVersion: dev
archiveCorrelationId: None
archiveOnAppError: False
archiveSystemDir: /workdir/jobs/a3ebb02f-fefe-405b-af09-5680f01f1a34-007/output
archiveSystemId: tapis-vm-training1-test2
archiveTransactionId: None
blockedCount: 0
cmdPrefix: None
coresPerNode: 1
created: 2023-07-17T23:26:42.967785771Z
createdby: training1
createdbyTenant: training
description: material point method
dtnMountPoint: None
dtnMountSourcePath: None
dtnSystemId: None
dynamicExecSystem: False
ended: None
execSystemConstraints: None
execSystemExecDir: /workdir/jobs/a3ebb02f-fefe-405b-af09-5680f01f1a34-007
execSystemId: tapis-vm-training1-test2
execSystemInputDir: /workdir/jobs/a3ebb02f-fefe-405b-af09-5680f01f1a34-007
execSystemLogicalQueue: None
execSystemOutputDir: /workdir/jobs/a3e

### Get Jobs Listings


In [68]:
# Get Jobs listings
client.jobs.getJobList()

[
 appId: mpm-docker-training1
 appVersion: dev
 archiveSystemId: tapis-vm-training1
 created: 2023-06-29T18:57:53.292621Z
 ended: 2023-06-29T18:58:25.074876Z
 execSystemId: tapis-vm-training1
 lastUpdated: 2023-06-29T18:58:25.074876Z
 name: mpm-job-vm
 owner: training1
 remoteStarted: 2023-06-29T18:58:07.696322Z
 status: FINISHED
 tenant: training
 uuid: e0ba3f5e-6238-416f-8a51-41f454ca78a0-007,
 
 appId: mpm-docker-training1
 appVersion: dev
 archiveSystemId: tapis-hpc-training1
 created: 2023-06-29T19:00:01.063780Z
 ended: 2023-06-29T19:00:32.304812Z
 execSystemId: tapis-hpc-training1
 lastUpdated: 2023-06-29T19:00:32.304812Z
 name: mpm-hpc
 owner: training1
 remoteStarted: 2023-06-29T19:00:15.075089Z
 status: FINISHED
 tenant: training
 uuid: 8fc4cee0-5b45-48ac-ba0a-79efcb9ea700-007,
 
 appId: sgx3-simple-sentiment-analysis-2
 appVersion: 0.1.0
 archiveSystemId: stampede2.sgx3.nathandf.test2
 created: 2023-06-06T20:16:30.975407Z
 ended: 2023-06-06T20:17:09.338452Z
 execSystemId: st
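
With many jobs in the listing, a quick tally by status can be easier to read than the raw output. The sketch below uses stand-in objects with the `name`, `status`, and `created` fields shown above. The third stand-in job and its FAILED status are made up for illustration.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-ins for the objects returned by client.jobs.getJobList()
jobs = [
    SimpleNamespace(name="mpm-job-vm", status="FINISHED",
                    created="2023-06-29T18:57:53.292621Z"),
    SimpleNamespace(name="mpm-hpc", status="FINISHED",
                    created="2023-06-29T19:00:01.063780Z"),
    SimpleNamespace(name="demo-job", status="FAILED",  # hypothetical job
                    created="2023-06-30T08:00:00.000000Z"),
]

status_counts = Counter(job.status for job in jobs)
# UTC timestamps in this fixed ISO 8601 format sort chronologically as strings.
latest = max(jobs, key=lambda job: job.created)
print(dict(status_counts), latest.name)
```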

### Get Job UUID from the submission response


In [69]:
# Get job uuid from the job submission response
print("****************************************************")
job_uuid_vm=job_response_vm.uuid
print("Job UUID: " + job_uuid_vm)
print("****************************************************")

****************************************************
Job UUID: a3ebb02f-fefe-405b-af09-5680f01f1a34-007
****************************************************


### Check the status of the job


In [70]:
# Check the status of the job
print("****************************************************")
print(client.jobs.getJobStatus(jobUuid=job_uuid_vm))
print("****************************************************")

****************************************************

status: FINISHED
****************************************************
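
Rather than re-running the status cell by hand, the check can be wrapped in a polling loop. The sketch below takes the status lookup as a callable so it can be demonstrated offline. `wait_for_status` is a hypothetical helper, and the set of terminal statuses (FINISHED, FAILED, CANCELLED) is an assumption that should be checked against the Tapis Jobs documentation.

```python
import time


def wait_for_status(get_status, terminal=("FINISHED", "FAILED", "CANCELLED"),
                    interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout} seconds")
        sleep(interval)


# Offline demonstration with a scripted sequence of statuses.
scripted = iter(["PENDING", "RUNNING", "FINISHED"])
final_status = wait_for_status(lambda: next(scripted), interval=0,
                               sleep=lambda seconds: None)
print(final_status)
```

In the notebook, the callable might be `lambda: client.jobs.getJobStatus(jobUuid=job_uuid_vm).status`, based on the `status` field printed above.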


### Download output of the job


In [71]:
# Once the job is in the FINISHED state, you can download output of the job
print("Job Output file:")

print("****************************************************")
jobs_output_vm= client.jobs.getJobOutputDownload(jobUuid=job_uuid_vm,outputPath='stdout')
print(jobs_output_vm)
print("****************************************************")

Job Output file:
****************************************************
****************************************************
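
The download call returns the file contents in memory. As a sketch, assuming the return value is either `bytes` or `str` (this notebook does not show which), it can be normalised to text and written to a local file. `save_job_output`, the sample content, and the file name are all hypothetical.

```python
import tempfile
from pathlib import Path


def save_job_output(data, path):
    """Write job output to `path` as UTF-8 text and return the text."""
    if isinstance(data, bytes):
        text = data.decode("utf-8", errors="replace")
    else:
        text = str(data)
    Path(path).write_text(text, encoding="utf-8")
    return text


# Stand-in for the value returned by client.jobs.getJobOutputDownload(...)
sample = b"MPM run complete\n"
out_path = Path(tempfile.gettempdir()) / "mpm-job-vm.stdout.txt"
saved = save_job_output(sample, out_path)
print(out_path.read_text(encoding="utf-8") == saved)
```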


### Cancel a job


In [None]:
# If necessary, you can cancel a long running job.
# To cancel a running job
# client.jobs.cancelJob(jobUuid=job_uuid_vm)

### Part 2: Run a batch job on an HPC-type host

Using the same Tapis application, we can also run the MPM app as a batch job on an HPC-type host.


In [87]:
# Run MPM app on the HPC Machine

# Submit a job
job_response_hpc=client.jobs.submitJob(name='mpm-hpc',description='mpm',appId=app_id,execSystemId=system_id_hpc,appVersion= 'dev')

### Get Job submission response


In [88]:
print("****************************************************")
print("Job Submitted: " + app_id)
print("****************************************************")
print(job_response_hpc)

****************************************************
Job Submitted: mpm-docker-training1
****************************************************

_fileInputsSpec: None
_parameterSetModel: None
appId: mpm-docker-training1
appVersion: dev
archiveCorrelationId: None
archiveOnAppError: False
archiveSystemDir: /workdir/jobs/8fc4cee0-5b45-48ac-ba0a-79efcb9ea700-007/output
archiveSystemId: tapis-hpc-training1
archiveTransactionId: None
blockedCount: 0
cmdPrefix: None
coresPerNode: 1
created: 2023-06-29T19:00:01.063780167Z
createdby: training1
createdbyTenant: training
description: mpm
dtnMountPoint: None
dtnMountSourcePath: None
dtnSystemId: None
dynamicExecSystem: False
ended: None
execSystemConstraints: None
execSystemExecDir: /workdir/jobs/8fc4cee0-5b45-48ac-ba0a-79efcb9ea700-007
execSystemId: tapis-hpc-training1
execSystemInputDir: /workdir/jobs/8fc4cee0-5b45-48ac-ba0a-79efcb9ea700-007
execSystemLogicalQueue: None
execSystemOutputDir: /workdir/jobs/8fc4cee0-5b45-48ac-ba0a-79efcb9ea700-007/ou

### Check job status


In [93]:
# Check the status of the job
print("****************************************************")
job_uuid_hpc=job_response_hpc.uuid
print(client.jobs.getJobStatus(jobUuid=job_uuid_hpc))
print("****************************************************")

****************************************************

status: FINISHED
****************************************************


### Download output of the HPC job


In [94]:
# Download output of the job
print("Job Output file:")

print("****************************************************")
jobs_output_hpc= client.jobs.getJobOutputDownload(jobUuid=job_uuid_hpc,outputPath='stdout')
print(jobs_output_hpc)
print("****************************************************")

Job Output file:
****************************************************
****************************************************


## Workflows

In this section, we are going to use tapipy to construct a pipeline that builds an HPC application container image, pushes it to a remote image registry, and then runs some tests in a container using the HPC application.

### Dockerhub Credentials

First we need to set our Dockerhub credentials. These will be used to give the image builder permission to push to your Dockerhub account.

#### NOTE:
Your Dockerhub credentials will be encrypted and safely stored in the Tapis Security Kernel (backed by HashiCorp Vault).

In [50]:
dockerhub_username = input('Dockerhub username: ')
dockerhub_personal_access_token = getpass.getpass(prompt='Dockerhub Access Token: ', stream=None)

### Create a Group
All workflow resources must exist within a group. A group is a collection of users that have access to workflow resources such as Pipelines and Tasks. Anyone who belongs to a group can create their own pipelines and run pipelines owned by that group.

In [72]:
# Create the group
group_id = "test1-pearc23-group-" + username

print("****************************************************")
create_group_resp = client.workflows.createGroup(id=group_id)
print(create_group_resp)
print("****************************************************")

****************************************************


BaseTapyException: message: A Group already exists with the id 'test1-pearc23-group-training1'
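
The error above is what a second run of the cell produces once the group exists. One way to make such a cell safe to re-run is to treat "already exists" as a non-error. The sketch below is generic and runs offline: `create_if_missing` is a hypothetical helper, and the fake create function only imitates the message shown above.

```python
def create_if_missing(create, exists_error=Exception, **kwargs):
    """Call create(**kwargs); return None if the resource already exists."""
    try:
        return create(**kwargs)
    except exists_error as err:
        # Only swallow the "already exists" case; re-raise anything else.
        if "already exists" in str(err):
            return None
        raise


# Offline stand-in that behaves like a repeated createGroup call.
def fake_create_group(id):
    raise RuntimeError(f"A Group already exists with the id '{id}'")


result = create_if_missing(fake_create_group, exists_error=RuntimeError,
                           id="demo-group")
print(result)
```

With tapipy, one would presumably pass `client.workflows.createGroup` as `create` and the exception class named in the error above as `exists_error`. Matching on the message text is fragile and is used here only because the notebook shows no structured error code.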

### Create a Pipeline
Pipelines are simply collections of tasks. Tasks can be added to a pipeline after it is created or directly in the pipeline definition itself. For this demonstration we will be creating everything at once.

The first task in this pipeline is an image build task. Image build tasks require a "context", which is the source control repository which contains the Dockerfile we want to build from.

The next two tasks run jobs on an HPC system to ensure that there are no errors with the image. The first test ensures that MPM was compiled correctly and the second runs a test script called uniaxial traction.

In [73]:
# Create the pipeline
pipeline_id = "test1-pearc23-pipeline-" + username

print("****************************************************")
create_pipeline_resp = client.workflows.createPipeline(**{
    "id": pipeline_id,
    "group_id": group_id,
    "type": "workflow",
    "execution_profile": {
        "max_retries": 0,
        "invocation_mode": "async",
        "duplicate_submission_policy": "terminate", # Terminates the current running pipeline if another is submitted
        "max_exec_time": 3600 # in seconds
    },
    "tasks": [
        {
            "id": "build-mpm-image",
            "pipeline_id": pipeline_id,
            "group_id": group_id,
            "type": "image_build",
            "builder": "kaniko", # Alternative to docker that allows you to build containers in containers
            "context": {
                "type": "github",
                "branch": "main",
                "url": "tapis-project/application-repository",
                "build_file_path": "Dockerfile",
                "sub_path": "/material-point-method/mpm-dummy-src/docker_build",
                "visibility": "public"
            },
            "destination": {
                "type": "dockerhub",
                "url": "nathandf/dummy-mpm",
                "tag": "pearc-test",
                "credentials": {
                    "username": dockerhub_username,
                    "token": dockerhub_personal_access_token
                }
            }
        },
        {
            "id": "test-mpm-compiled",
            "type": "tapis_job",
            "tapis_job_def": {
                "name": 'mpm-compiled-correctly',
                "description": 'material point method',
                "appId": app_id,
                "execSystemId": system_id_vm,
                "appVersion": 'dev'
            },
            "depends_on": [
                {"id": "build-mpm-image"}
            ]
        },
        {
            "id": "test-mpm-uniaxial-traction",
            "type": "tapis_job",
            "tapis_job_def": {
                "name": "mpm-uniaxial-traction-test",
                "appId": app_id,
                "appVersion": "dev",
                "execSystemId": system_id_vm,
                "appArgs": {
                    "directoryInput": "./benchmarks/2d/uniaxial_traction/"
                }
            },
            "depends_on": [
                {"id": "build-mpm-image"}
            ]
        }
    ]
})
print(create_pipeline_resp)
print("****************************************************")

****************************************************

url: https://training.tapis.io/v3/workflows/groups/test1-pearc23-group-training1/pipelines/test1-pearc23-pipeline-training1
****************************************************
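
The `depends_on` entries in the pipeline definition form a small dependency graph: both test tasks wait for the image build. As an illustration of the ordering this implies (not of how the workflow engine schedules tasks internally, which this notebook does not show), the graph can be sorted with the standard library's `graphlib`, available in Python 3.9 and later. The task list below is trimmed to the `id` and `depends_on` fields.

```python
from graphlib import TopologicalSorter  # Python 3.9+

tasks = [
    {"id": "build-mpm-image"},
    {"id": "test-mpm-compiled",
     "depends_on": [{"id": "build-mpm-image"}]},
    {"id": "test-mpm-uniaxial-traction",
     "depends_on": [{"id": "build-mpm-image"}]},
]

# Map each task id to the set of task ids it depends on.
graph = {
    task["id"]: {dep["id"] for dep in task.get("depends_on", [])}
    for task in tasks
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The two test tasks have no dependency on each other, so their relative order in the result is not significant.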


### Running a pipeline

Once a pipeline has been defined, it can be run with a simple call to the runPipeline endpoint.

#### NOTE:

In the execution profile of our pipeline definition, we set `duplicate_submission_policy` to `terminate`. This means that each new submission of the pipeline to the workflow engine terminates the run already in progress, so run the cell below only once; otherwise the next run of the pipeline may be delayed.


In [139]:
print("****************************************************")
runs = client.workflows.listPipelineRuns(group_id=group_id, pipeline_id=pipeline_id)
sorted_runs = sorted(runs, key=lambda run: run.started_at, reverse=True)
# Submit when there are no previous runs, or when the latest run is no longer pending or active
if len(sorted_runs) == 0 or sorted_runs[0].status not in ["pending", "active"]:
    run_pipeline_resp = client.workflows.runPipeline(group_id=group_id, pipeline_id=pipeline_id)
    print(run_pipeline_resp)
else:
    print(f"Pipeline currently {sorted_runs[0].status}")
print("****************************************************")

****************************************************
Pipeline currently pending
****************************************************


### Checking the PipelineRun Status

Your pipeline is now running. It will take some time for the HPC image to build. In the meantime, you can check the status of the run by running the cell below.

In [124]:
print("****************************************************")
pipeline_runs = client.workflows.listPipelineRuns(group_id=group_id, pipeline_id=pipeline_id)
last = sorted(pipeline_runs, key=lambda run: run.started_at, reverse=True)[0]
print("Current Pipeline Run")
print("****************************************************")
print(last)
print("****************************************************")
print("Task Executions")
print("****************************************************")
task_executions = client.workflows.listTaskExecutions(group_id=group_id, pipeline_id=pipeline_id, pipeline_run_uuid=last.uuid)
for i, execution in enumerate(task_executions):
    print(f"[{i}]", execution.task_id, execution.status)
print("****************************************************")

****************************************************
Current Pipeline Run
****************************************************

last_modified: 2023-07-18 17:30:25
pipeline: ad2dd92b-ee51-4f7e-b4b5-f83b413335e9
started_at: 2023-07-18 17:30:19
status: active
uuid: 76e56560-0f7f-455a-a545-f5d2a90ef828
****************************************************
Task Executions
****************************************************
****************************************************
