# TACCSTER 2022 Hands-on

In this notebook, you will use Tapis v3 to create two systems and one application. You will then use them to run
an image classification job on both a VM and an HPC type host. Both system definitions and the application in this notebook specify the Singularity runtime.

To execute each `In[#]` cell, click inside the cell and press `Shift + Enter`.

Install the Tapis Python SDK. After running the code below, you need to restart the runtime: go to the menu and select Runtime -> Restart runtime, or use CTRL+M on the keyboard. You can then execute the code in the notebook and follow the rest of the tutorial.

pip install tapipy

## Enter training account information

To get things started, please run the following and enter the training account information provided to you:

import getpass

tenant = 'training'
base_url = 'https://' + tenant + '.tapis.io'

# Enter Tapis Username. Example: trainingXX
username = input('Username: ')
# Enter Tapis password. Example: trainingXX
password = getpass.getpass(prompt='Password: ', stream=None)
# Enter the VM password shared with you by email
password_vm = getpass.getpass(prompt='Password for VM: ', stream=None)
# IP address of VM 
host = input('Host: ')

## Authenticate and initialize Tapis v3 client

Using this information, you can now use `tapipy` to authenticate in the tenant and initialize the
Tapis v3 client. You should see your token information displayed. This may take a while to run but should take
no more than 30 seconds.

from tapipy.tapis import Tapis
# Create Python Tapis client for user
client = Tapis(base_url=base_url, username=username, password=password)
# *** Tapis v3: Call to Tokens API
client.get_tokens()
# Print Tapis v3 token
client.access_token

## Systems

In this section we create two Tapis systems, one for running on a VM host using FORK and one for running on an HPC type host using BATCH.

Note that although it is possible, we have not provided any login credentials in the system definitions.
Well-crafted system definitions are likely to be copied and re-used, so, for security reasons, it is recommended that
login credentials be registered using separate API calls as discussed below.
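
One way to act on this advice is to scan a definition for credential-like keys before it is copied or shared. The sketch below is a minimal illustration: `find_credential_fields` is a hypothetical helper, not part of `tapipy`, and the key names in `SUSPECT_KEYS` are assumptions chosen for the example.

```python
# Hypothetical helper, not part of tapipy: scan a definition dict for keys
# that look like login credentials before the definition is shared or reused.
# The key names below are assumptions for illustration only.
SUSPECT_KEYS = {"password", "privateKey", "accessSecret", "authnCredential"}

def find_credential_fields(definition, prefix=""):
    """Return dotted paths of keys in `definition` whose name is in SUSPECT_KEYS."""
    found = []
    if isinstance(definition, dict):
        for key, value in definition.items():
            path = prefix + "." + key if prefix else key
            if key in SUSPECT_KEYS:
                found.append(path)
            # Recurse so nested dicts and lists are checked too
            found.extend(find_credential_fields(value, path))
    elif isinstance(definition, list):
        for index, item in enumerate(definition):
            found.extend(find_credential_fields(item, prefix + "[" + str(index) + "]"))
    return found

# Made-up definitions used only to exercise the helper
clean_definition = {"id": "example-system", "defaultAuthnMethod": "PASSWORD"}
leaky_definition = {"id": "example-system", "authnCredential": {"password": "secret"}}

print(find_credential_fields(clean_definition))
print(find_credential_fields(leaky_definition))
```

An empty list means no credential-like keys were found under the assumed names. It does not prove that the definition is free of secrets.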

### Create a system for the VM host

user_id = username
system_id_vm = "taccster-vm-" + user_id

# Create the system definition
exec_system_vm = {
  "id": system_id_vm,
  "description": "Test system",
  "systemType": "LINUX",
  "host": host,
  "effectiveUserId": user_id,
  "defaultAuthnMethod": "PASSWORD",
  "rootDir": "/home/"+user_id,
  "canExec": True,
  "jobRuntimes": [ { "runtimeType": "SINGULARITY" } ],
  "jobWorkingDir": "workdir",
}

# Use the client to create the system in Tapis
print("****************************************************")
print("Create system: " + system_id_vm)
print("****************************************************")
client.systems.createSystem(**exec_system_vm)


# If you need to update the system, you can modify the original definition and use the putSystem call.
# - modify the above definition as needed
# - comment out the above line with the call to createSystem()
# - uncomment the below line with the call to putSystem()
# - re-run the cell
# Note that not all attributes may be updated.
#client.systems.putSystem(**exec_system_vm, systemId=system_id_vm)

# You can also update just a few attributes using the patchSystem call.
# Note that not all attributes may be updated and some attributes, such as *enabled*,
#   may only be updated using a specific call.
# For example, to update the description, first define the json to be used:
patch_system_vm = {
  "description": "System for testing jobs on a VM for TACCSTER"
}

# Then use the client to make the update:
client.systems.patchSystem(**patch_system_vm, systemId=system_id_vm)

# List all systems available to you
print("****************************************************")
print("List all systems")
print("****************************************************")
client.systems.getSystems()

# Get details for the system you created
print("****************************************************")
print("Fetch system: " + system_id_vm)
print("****************************************************")
client.systems.getSystem(systemId=system_id_vm)

### Register Credentials for the VM system

After creating the system, you will need to register credentials for your username. These will be used by Tapis to
access the host. Various authentication methods can be used to access a system, such as PASSWORD and PKI_KEYS. For the
VM a password is used.

# Register credentials
client.systems.createUserCredential(systemId=system_id_vm, userName=user_id, password=password_vm)

Now you can use the client to list files on the system. This will confirm that the credentials are valid.

# List files at the rootDir for the system
client.files.listFiles(systemId=system_id_vm, path="/")

### Create a system for the HPC cluster

With just a few changes to the system definition you can create a second system that can be used to run the
same application on an HPC type host. Note the minimal changes:

* **id** - A unique id is required.
* **host** - Main hostname for the HPC system.
* **rootDir** - Using the root directory of the host gives us flexibility in setting **jobWorkingDir**.
  Note that you still need LINUX permissions. The definition below keeps the home directory used for the VM system.
* **jobWorkingDir** - Can be determined dynamically using the Tapis v3 function HOST_EVAL(). The definition below keeps the relative path `workdir`.
* **jobRuntimes** - Most HPC systems support singularity and not docker.
* **canRunBatch**, **batchScheduler**, **batchSchedulerProfile** - Batch settings that are not present in the VM definition.
* **batchDefaultLogicalQueue** - Logical queue to use by default.
* **batchLogicalQueues** - Queue definitions for this HPC system. Each entry gives a logical queue **name**, the **hpcQueueName** it refers to, and its limits.
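
To see exactly which attributes differ between two definitions, you can compare them as plain Python dictionaries. The sketch below uses a hypothetical helper, `diff_definitions`, and two small made-up definitions. It is not a Tapis call and only compares top-level keys.

```python
def diff_definitions(first, second):
    """Return {key: (first_value, second_value)} for top-level keys that differ."""
    missing = object()  # sentinel meaning "key not present"
    changes = {}
    for key in sorted(set(first) | set(second)):
        a = first.get(key, missing)
        b = second.get(key, missing)
        if a != b:
            # Report absent keys as None to keep the output readable
            changes[key] = (None if a is missing else a, None if b is missing else b)
    return changes

# Illustrative definitions with made-up values
vm_like = {"id": "example-vm", "canExec": True, "jobWorkingDir": "workdir"}
hpc_like = {"id": "example-hpc", "canExec": True, "jobWorkingDir": "workdir",
            "canRunBatch": True, "batchScheduler": "SLURM"}

for key, (old, new) in diff_definitions(vm_like, hpc_like).items():
    print(key, ":", old, "->", new)
```

Once both system definitions exist in the notebook, the same helper could be called on them directly.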

user_id = username
system_id_hpc = "taccster-hpc-" + user_id

# Create the system definition
exec_system_hpc = {
  "id": system_id_hpc,
  "description": "System for testing jobs on an HPC type host for TACCSTER",
  "systemType": "LINUX",
  "host": host,
  "defaultAuthnMethod": "PASSWORD",
  "effectiveUserId": user_id,
  "rootDir": "/home/"+user_id,
  "canExec": True,
  "jobRuntimes": [ { "runtimeType": "SINGULARITY" } ],
  "jobWorkingDir": "workdir",
  "canRunBatch": True,
  "batchScheduler": "SLURM",
  "batchSchedulerProfile": "tacc",
  "batchDefaultLogicalQueue": "tapisNormal",
  "batchLogicalQueues": [
    {
      "name": "tapisNormal",
      "hpcQueueName": "normal",
      "maxJobs": 50,
      "maxJobsPerUser": 10,
      "minNodeCount": 1,
      "maxNodeCount": 16,
      "minCoresPerNode": 1,
      "maxCoresPerNode": 68,
      "minMemoryMB": 1,
      "maxMemoryMB": 16384,
      "minMinutes": 1,
      "maxMinutes": 60
    }
  ]
}

# Use the client to create the system in Tapis
print("****************************************************")
print("Create system: " + system_id_hpc)
print("****************************************************")
client.systems.createSystem(**exec_system_hpc)

# If you need to update the system,
# - modify the above definition as needed
# - comment out the above line
# - uncomment the below line
# - re-run the cell
#client.systems.putSystem(**exec_system_hpc, systemId=system_id_hpc)


# List all systems available to you
print("****************************************************")
print("List all systems")
print("****************************************************")
client.systems.getSystems()

# Get details for the system you created
print("****************************************************")
print("Fetch system: " + system_id_hpc)
print("****************************************************")
client.systems.getSystem(systemId=system_id_hpc)
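
The logical queue in the definition above sets minimum and maximum values for node count, cores per node, memory, and run time. A local pre-check along the following lines can flag a request that falls outside those limits before you submit it. This is only a sketch: `check_against_queue` is a hypothetical helper, not a Tapis call, and the pairing of job attributes with queue limits is an assumption based on the attribute names. It does not replace whatever validation the Jobs service applies on submission.

```python
def check_against_queue(job_attrs, queue):
    """Return a list of human-readable violations; empty if the request fits."""
    # (job attribute, queue minimum key, queue maximum key) - assumed pairing
    rules = [
        ("nodeCount", "minNodeCount", "maxNodeCount"),
        ("coresPerNode", "minCoresPerNode", "maxCoresPerNode"),
        ("memoryMB", "minMemoryMB", "maxMemoryMB"),
        ("maxMinutes", "minMinutes", "maxMinutes"),
    ]
    problems = []
    for attr, min_key, max_key in rules:
        if attr not in job_attrs:
            continue  # attribute not requested, nothing to check
        value = job_attrs[attr]
        low, high = queue.get(min_key), queue.get(max_key)
        if low is not None and value < low:
            problems.append(f"{attr}={value} is below {min_key}={low}")
        if high is not None and value > high:
            problems.append(f"{attr}={value} is above {max_key}={high}")
    return problems

# Limits copied from the tapisNormal queue in the system definition
tapis_normal = {
    "minNodeCount": 1, "maxNodeCount": 16,
    "minCoresPerNode": 1, "maxCoresPerNode": 68,
    "minMemoryMB": 1, "maxMemoryMB": 16384,
    "minMinutes": 1, "maxMinutes": 60,
}

small_request = {"memoryMB": 1, "nodeCount": 1, "coresPerNode": 1, "maxMinutes": 10}
large_request = {"nodeCount": 32, "maxMinutes": 120}

print(check_against_queue(small_request, tapis_normal))
print(check_against_queue(large_request, tapis_normal))
```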

### Register Credentials for the HPC system

As before, now you will need to register credentials for your username. These will be used by Tapis to
access the host.

password_hpc = password_vm
# Register credentials
client.systems.createUserCredential(systemId=system_id_hpc, userName=user_id, password=password_hpc)

Now you can use the client to list files on the system. This will confirm that the credentials are valid.

# List files at the rootDir for the system
path_to_list = "/"
client.files.listFiles(systemId=system_id_hpc, path=path_to_list)

## Application

In order to run a job on a system you will need to create a Tapis application.

### Create an application that can be run on the VM host or the HPC cluster

user_id = username
app_id = "taccster-img-classify-" + user_id

# Create the application definition
app_def = {
  "id": app_id,
  "version": "0.0.1",
  "description": "Image classifier application",
  "runtime": "SINGULARITY",
  "runtimeOptions": ["SINGULARITY_RUN"],
  "containerImage": "/tmp/img-classify_03.sif",
  "jobAttributes": {
    "parameterSet": {
      "archiveFilter": { "includeLaunchFiles": False }
    },
    "memoryMB": 1,
    "nodeCount": 1,
    "coresPerNode": 1,
    "maxMinutes": 10
  }
}

# Use the client to create the application in Tapis
print("****************************************************")
print("Create application: " + app_id)
print("****************************************************")
client.apps.createAppVersion(**app_def)

# If you need to update the application,
# - modify the above definition as needed
# - comment out the above line
# - uncomment the below line
# - re-run the cell
#client.apps.putApp(**app_def, appId=app_id, appVersion="0.0.1")

# List all applications available to you
print("****************************************************")
print("List all applications")
print("****************************************************")
client.apps.getApps()

# Get details for the application you created
print("****************************************************")
print("Fetch application: " + app_id)
print("****************************************************")
client.apps.getAppLatestVersion(appId=app_id)

## Jobs

We will run two jobs, one on the VM host using FORK and one on the HPC type host using BATCH.

We will use the same Tapis application to run both jobs.

### Part 1: Run Image classifier app on a Virtual Machine.


# Run Image classifier app on a Virtual Machine
# In the arguments pass a url of the image you would like to classify
pa = {
 "jobType": "FORK",
 "parameterSet": {
   "appArgs": [
     {"arg": "--image_file"},
     {"arg": "https://s3.amazonaws.com/cdn-origin-etr.akc.org/wp-content/uploads/2017/11/12231410/Labrador-Retriever-On-White-01.jpg"}
   ]
 }
}
# Submit a job
job_response_vm = client.jobs.submitJob(
    name='img-classifier-job-vm',
    description='image classifier',
    appId=app_id,
    execSystemId=system_id_vm,
    appVersion='0.0.1',
    **pa)

### Get Job submission response


# Get Job submission response
print("****************************************************")
print("Job Submitted: " + app_id)
print("****************************************************")
print(job_response_vm)

### Get Jobs Listings


# Get Jobs listings
client.jobs.getJobList()

### Get Job UUID from the submission response


# Get job uuid from the job submission response
print("****************************************************")
job_uuid_vm=job_response_vm.uuid
print("Job UUID: " + job_uuid_vm)
print("****************************************************")

### Check the status of the job


# Check the status of the job
print("****************************************************")
print(client.jobs.getJobStatus(jobUuid=job_uuid_vm))
print("****************************************************")
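
Rather than re-running the status cell by hand, you can poll until the job reaches a terminal state. The sketch below is a minimal polling loop. `wait_for_job` is a hypothetical helper, and it takes any callable that returns the current status string, so it can be tried offline with a fake status source. The set of terminal states is an assumption you should check against the Jobs documentation.

```python
import time

# Assumed terminal job states
TERMINAL_STATES = {"FINISHED", "FAILED", "CANCELLED"}

def wait_for_job(get_status, poll_seconds=10, timeout_seconds=600, sleep=time.sleep):
    """Call get_status() until it returns a terminal state or the timeout passes.

    Returns the last status seen.
    """
    waited = 0
    status = get_status()
    while status not in TERMINAL_STATES and waited < timeout_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
        status = get_status()
    return status

# Offline demonstration with a fake status source and no real sleeping
fake_statuses = iter(["PENDING", "STAGING_JOB", "RUNNING", "FINISHED"])
final = wait_for_job(lambda: next(fake_statuses), poll_seconds=0, sleep=lambda s: None)
print(final)
```

In the notebook, the callable could be something like `lambda: client.jobs.getJobStatus(jobUuid=job_uuid_vm).status`, assuming the status response exposes a `status` attribute.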

### Download output of the job


# Once the job is in the FINISHED state, you can download output of the job
print("Job Output file:")

print("****************************************************")
jobs_output_vm= client.jobs.getJobOutputDownload(jobUuid=job_uuid_vm,outputPath='output.txt')
print(jobs_output_vm)
print("****************************************************")

### Setting Notifications on Job events

# Run Image classifier app on a Virtual Machine
# In the arguments pass a url of the image you would like to classify
pa = {
 "jobType": "FORK",
 "parameterSet": {
   "appArgs": [
     {"arg": "--image_file"},
     {"arg": "https://s3.amazonaws.com/cdn-origin-etr.akc.org/wp-content/uploads/2017/11/12231410/Labrador-Retriever-On-White-01.jpg"}
   ]
 }
}

# Submit a job with a subscription that sends an email for all job events.
# Replace deliveryAddress with the email address that should receive the notifications.
job_response_vm_email = client.jobs.submitJob(
    name='img-classifier-job-vm',
    description='image classifier',
    appId=app_id,
    execSystemId=system_id_vm,
    appVersion='0.0.1',
    subscriptions=[{
        "description": "Test subscriptions",
        "eventCategoryFilter": "ALL",
        "deliveryTargets": [
            {"deliveryMethod": "EMAIL", "deliveryAddress": "ajamthe@tacc.utexas.edu"}
        ]
    }],
    **pa)

# Get Job submission response
print("****************************************************")
print("Job Submitted: " + app_id)
print("****************************************************")
print(job_response_vm_email)

# Get job uuid from the job submission response
print("****************************************************")
job_uuid_vm_email=job_response_vm_email.uuid
print("Job UUID: " + job_uuid_vm_email)
print("****************************************************")

# Check the status of the job
print("****************************************************")
print(client.jobs.getJobStatus(jobUuid=job_uuid_vm_email))
print("****************************************************")
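
The subscription passed to the job request above is a nested structure that is easy to mistype. A small builder can keep it readable. The sketch below assumes only the fields already used in this notebook. `email_subscription` is a hypothetical helper and `user@example.org` is a placeholder address.

```python
def email_subscription(address, description="Job notifications", category="ALL"):
    """Build one subscription entry in the same shape used in this notebook."""
    # Very light sanity check; this is not full email validation
    if "@" not in address:
        raise ValueError("deliveryAddress does not look like an email address: " + address)
    return {
        "description": description,
        "eventCategoryFilter": category,
        "deliveryTargets": [
            {"deliveryMethod": "EMAIL", "deliveryAddress": address}
        ],
    }

# Placeholder address for illustration
subscriptions = [email_subscription("user@example.org")]
print(subscriptions)
```

The resulting list could then be passed as the `subscriptions` argument when submitting a job.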

### Cancel a job


# If necessary, you can cancel a long running job.
# To cancel a running job
# client.jobs.cancelJob(jobUuid=job_uuid_vm)

### Part 2: Run a Batch Job on HPC type host

Using the same Tapis application we can also run the image classifier as a batch job on an HPC type host.


# Run Image classifier app on the HPC Machine
# In the arg pass a url of the image you would like to classify
pa = {
 "parameterSet": {
      "appArgs": [
          {"arg": "--image_file"},
          {"arg": "'https://s3.amazonaws.com/cdn-origin-etr.akc.org/wp-content/uploads/2017/11/12231410/Labrador-Retriever-On-White-01.jpg'"}
      ]
 }
}
# Submit a job
job_response_hpc = client.jobs.submitJob(
    name='img-classifier-job-hpc',
    description='image classifier',
    appId=app_id,
    execSystemId=system_id_hpc,
    appVersion='0.0.1',
    **pa)
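
The VM and HPC requests in this notebook differ in only a few fields: the execution system, whether `jobType` is given, and how the image URL argument is quoted. A small builder makes the shared shape explicit. This is a sketch, and `classify_request` is a hypothetical helper. It only assembles the dictionary and leaves `jobType` out when none is passed, as the HPC request above does.

```python
def classify_request(image_url, job_type=None):
    """Build the extra job request fields for the image classifier app."""
    request = {
        "parameterSet": {
            "appArgs": [
                {"arg": "--image_file"},
                {"arg": image_url},
            ]
        }
    }
    # Only include jobType when the caller asks for one explicitly
    if job_type is not None:
        request["jobType"] = job_type
    return request

# Placeholder URL for illustration
example_url = "https://example.org/dog.jpg"
print(classify_request(example_url, job_type="FORK"))
print(classify_request(example_url))
```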

### Get Job submission response


print("****************************************************")
print("Job Submitted: " + app_id)
print("****************************************************")
print(job_response_hpc)

### Check job status


# Check the status of the job
print("****************************************************")
job_uuid_hpc=job_response_hpc.uuid
print(client.jobs.getJobStatus(jobUuid=job_uuid_hpc))
print("****************************************************")

### Download output of the HPC job


# Download output of the job
print("Job Output file:")

print("****************************************************")
jobs_output_hpc= client.jobs.getJobOutputDownload(jobUuid=job_uuid_hpc,outputPath='output.txt')
print(jobs_output_hpc)
print("****************************************************")

## Analyzing jobs output

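print("==============Image Classifier Scores ============================")
# The job output is returned as bytes; split it into lines
s = jobs_output_vm.split(b'\n')
# To analyze the HPC job output instead, uncomment the line below and comment out the line above
# s = jobs_output_hpc.split(b'\n')
# Reverse the list so that the last lines of the output come first
s.reverse()
scores = []
# Index 0 is skipped; if the output ends with a newline, that entry is empty
for i in range(1, 6):
    scores.append(s[i])
    print(s[i])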
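
The loop above relies on the output having at least six lines and a trailing newline. A slightly more defensive variant decodes the bytes and keeps only non-empty lines. The sketch below is one way to do that. `last_lines` is a hypothetical helper, and the sample bytes are made up rather than real classifier output.

```python
def last_lines(raw_output, count=5):
    """Decode job output bytes and return the last `count` non-empty lines, newest first."""
    text = raw_output.decode("utf-8", errors="replace")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return list(reversed(lines[-count:]))

# Made-up bytes standing in for a downloaded output file
sample_output = b"loading model\nlabel a: 0.1\nlabel b: 0.9\n\n"
for line in last_lines(sample_output, count=2):
    print(line)
```

In the notebook, the helper could be applied to `jobs_output_vm` or `jobs_output_hpc` once the corresponding job has finished.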

## Sharing Results
You can share your job results with collaborators in the same tenant by sharing the system and the output files.


### Grant Permissions on System


client.systems.grantUserPerms(systemId=system_id_vm,userName='training**',permissions=['READ'])

### Grant Permissions on a specific folder


client.files.grantPermissions(systemId=system_id_vm, path='/workdir', username='training**', permission='READ')

### Once you have shared the system and files with a training account, that user can perform the steps below
* Add credentials on the system so they can access the shared files
* Get the contents of the shared files


# Get system details
client.systems.getSystem(systemId=system_id_vm)

client.systems.createUserCredential(systemId=system_id_vm, userName="training**", password="")

# Download the file as training**
client.files.getContents(systemId=system_id_vm, path='/workdir/jobs/<jobUuid>/output/output.txt')

## Sharing Application
You can share your application with a group of users or share it publicly with all the users in a tenant.


# Sharing application with specific users
client.apps.shareApp(appId=app_id, users=["trainingXX"])

# Get Share info on the app
client.apps.getShareInfo(appId=app_id)

Users with whom you have shared the app should be able to submit jobs using that app_id and download the job output.

# Unshare app with specific users
client.apps.unShareApp(appId=app_id, users=["trainingXX"])

# Get Share info on the app
client.apps.getShareInfo(appId=app_id)

# Making the app public
client.apps.shareAppPublic(appId=app_id)

# Get Share info on the app
client.apps.getShareInfo(appId=app_id)
# Now any user in the tenant should be able to run your application

# Unsharing public app
client.apps.unShareAppPublic(appId=app_id)