## Monitoring and Accessing topsStack jobs 

## Introduction

In this notebook we will show how to:
 - **Monitor jobs running in SDS**
 - **Find all the completed, failed, running or queued jobs**
 - **Download generated products from the cloud**

## Setup configuration for interaction with on-demand SDS
In this section, we set up the Python libraries and API endpoints needed to interact with the **on-demand SDS**.

The `otello` Python library provides high-level access to operations on the on-demand SDS. For this demo, it gives us information about the job types registered on the SDS, and lets us submit jobs, check job statuses, and query for the products that a job generated.


### Establish an otello `Mozart` instance to communicate with the HySDS cluster controller.
#### It will be necessary to provide credentials the first time you initialise otello.

#### When prompted for "HySDS cluster authenticated", enter 'y' if the cluster requires a password to access.

In [None]:
import json
import os
import otello
import re
import shutil

from pathlib import Path
from pprint import pprint

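# Run the interactive otello setup only if no configuration file exists yet
if not os.path.exists(f"{Path.home()}/.config/otello/config.yml"):
    otello.client.initialize()

# Mozart instance used for all job queries below
m = otello.mozart.Mozart()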



### Use Otello to get job information 


#### Get status of all the jobs

In [None]:
js = m.get_jobs()
# Uncomment to list every job with its status:
# for j in js:
#     print("{} : {}".format(j.job_id, j.get_status()))
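
Printing one line per job becomes hard to read once the cluster holds many jobs, so a per-status tally can be a useful first look. The following is a minimal sketch, not part of otello: `tally_statuses` is a hypothetical helper, and the status strings in `sample_statuses` are illustrative values. On a live cluster one could pass `(j.get_status() for j in js)` instead, assuming `get_status()` returns a plain string.

```python
from collections import Counter


def tally_statuses(statuses):
    """Count how many jobs are in each state (hypothetical helper)."""
    return Counter(statuses)


# Illustrative status strings; these are assumptions, not output from a real cluster.
sample_statuses = ["job-completed", "job-failed", "job-completed", "job-queued"]

counts = tally_statuses(sample_statuses)
for status, n in sorted(counts.items()):
    print(f"{status}: {n}")
```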

#### Get information about a particular job type

In [None]:

job_type = "job-xing-topsstack_hamsar_pge:devel"
#topsStack = m.get_job_types()[job_type]
#topsStack.initialize()

print("FAILED TOPSSTACK JOBS")
js = m.get_failed_jobs(job_type=job_type)
print(len(js))
for j in js:
    #print(j)
    print("{} ({})".format(j.job_id["id"], j.job_id["tags"]))
    
print("\nCOMPLETED TOPSSTACK JOBS")
js = m.get_completed_jobs(job_type=job_type)
print(len(js))

for j in js:
    #print(j)
    print("{} ({})".format(j.job_id["id"], j.job_id["tags"]))
    #print("{} ({})".format(j.job_id["id"], j.get_status("")))

## Accessing the generated products
### Get information about the products generated by the topsStack jobs
#### The generated topsStack products are stored in the cloud next to the on-demand SDS

In [None]:
job_set = m.get_completed_jobs(job_type=job_type)

prods = []
for job in job_set:
    try:
        prod = job.get_generated_products()
        print(json.dumps(prod, indent=2, sort_keys=True))
        prods.append(prod)
    except Exception as e:
        print(e)
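
The download step in this notebook reads `prod[0]["urls"][-1]`, which implies that each entry in `prods` is a list of records carrying a `urls` list. The sketch below shows one way to guard that lookup against empty results. `last_url` is a hypothetical helper, and the record in `sample_prod` is made-up data that only mirrors the shape assumed here.

```python
def last_url(prod):
    """Return the last URL of the first product record, or None if absent."""
    if not prod:
        return None
    urls = prod[0].get("urls") or []
    return urls[-1] if urls else None


# Made-up record shaped like the structure the download cell relies on (an assumption).
sample_prod = [
    {
        "id": "example-product",
        "urls": [
            "http://example.com/products/example-product",
            "s3://s3.example.com:80/example-bucket/products/example-product",
        ],
    }
]

print(last_url(sample_prod))
print(last_url([]))
```

Returning `None` rather than raising lets a loop skip jobs that produced nothing without relying on a broad `except`.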

### Download the generated standard topsStack products from the cloud into this notebook
#### Here we use the AWS CLI to download the generated datasets.

In [None]:
local_dirs = []
for prod in prods:
    try:
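        # Strip the first component after s3:// to get the S3 URL passed to the AWS CLI
        prod_url = re.sub(r'^s3://.+?/(.+)$', r's3://\1', prod[0]["urls"][-1])
        local_dir = os.path.basename(prod_url)
        # Remove any earlier copy so that the sync starts from a clean directory
        if os.path.isdir(local_dir):
            shutil.rmtree(local_dir)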
        !aws s3 sync $prod_url $local_dir
        local_dirs.append(local_dir)
    except Exception as e:
        print(e)
for local_dir in local_dirs:
    !ls $local_dir
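
To see what the `re.sub` call in the download cell does, it can help to run it on a single URL outside the loop. The sketch below reuses the same pattern. `to_bucket_url` is a hypothetical wrapper, and `example_url` is an invented URL that assumes the first component after `s3://` is an endpoint host followed by the bucket name.

```python
import os
import re


def to_bucket_url(url):
    """Drop the first component after s3://, keeping the rest of the path."""
    return re.sub(r'^s3://.+?/(.+)$', r's3://\1', url)


# Invented URL; the host, bucket, and key names are assumptions for illustration.
example_url = "s3://s3-us-west-2.amazonaws.com:80/example-bucket/products/example-product"

bucket_url = to_bucket_url(example_url)
local_dir = os.path.basename(bucket_url)

print(bucket_url)  # s3://example-bucket/products/example-product
print(local_dir)   # example-product
```

The lazy `.+?` stops at the first `/`, so only the leading component is removed, and the final path component becomes the local directory name.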