# EPMT Query API

This workbook illustrates how to use the EPMT Query API. It assumes you have `EPMT`
installed.


## Table of Contents

 * [Import data for the study](#import-data)
 * [Import module](#import-module)
 * [API function categories](#api-categories)
 * [Getting documentation](#getting-docs)
 * [Job Queries](#job-query)
   * [output format and converting between formats](#output-formats)
   * [working with ORM objects (ADVANCED TOPIC)](#orm-objects)
   * [job tags](#job-tags)
   * [ordering and filtering jobs](#jobs-order-filter)
   * [failed jobs](#failed-jobs)
   * [process sums (ADVANCED TOPIC)](#proc-sums-field)
   
 * [Operations](#ops)
   * [select op processes](#select-op-procs)
   * [the Operation primitive](#operation-primitive)
   * [`get_ops` API call](#get-ops)
   * [aggregating operation metrics](#op-metrics)
   * [data-movement v. useful work](#dm-ops)
   * [op_metrics grouped by tag](#group-by-tag)
   * [cpu-time v. duration](#cpu-time-v-duration)
   * [operation costs v. total time](#ops-costs)

 * [Process Queries](#process-query)
   * [process tags](#process-tags)
     * [unique process tags in job (ADVANCED TOPIC)](#job-proc-tags)
   * [filter and ordering](#filter-processes)
   * [thread metrics aggregation (ADVANCED TOPIC)](#thread-metrics-aggregation)
   * [process tree and depth (ADVANCED TOPIC)](#proc-tree)
   
 * [Thread Query](#thread-query)
 
 * [Useful Queries](#useful-queries)
   * [process tree walk](#process-tree-walk)
   * [failed processes](#failed-procs)
   * [all process tags for job](#job-proc-tags)
   * [root process](#root-process)
   * [timeline](#timeline)
 * [Useful Attributes of Job/Process/Threads](#useful-attributes)
 * [Misc. queries](#misc-queries)
 * [Modifying Job Metadata](#modify-job-metadata)
   * [annotate jobs](#annotate-jobs)
   * [analyze jobs](#analyze-jobs)
   
 * [Deleting Jobs](#delete-jobs)
   * [delete jobs based on their start/finish times](#delete-jobs-time-based)


## <a name="import-data">Import the data for this study</a>

This workbook relies on importing the following data. We use a SQLite database 
in this study, but you can use another database such as `postgresql`.
See the `preset_settings` folder to pick a template of your choice and edit it
if needed. Save the template in the `epmt` folder as `settings.py`.

While not required, it is recommended that you start with a fresh database
so as not to affect your existing data. The SQLite database path is controlled
in `settings.py`, and is typically a file in the user's `HOME` directory.

```
# pick the database settings file of your choice
$ cp ../preset_settings/settings_sqlite_localfile.py settings.py

# backup your existing database
# The path might vary depending on your settings.py
# mv ~/EPMT_DB.sqlite ~/EPMT_DB.sqlite.backup

# now import the data
$ epmt submit test/data/query_notebook/*.tgz

# check the list of imported jobs
$ epmt list
['625172', '627922', '629337', '633144', '676007', '680181', '685000', '685016', '692544', '693147', '696127', '802954', '804285']
```

<a name="import-module"></a>

In [52]:
# import the query api module
import epmt_query as eq

import pandas as pd
import numpy as np

### <a name="api-categories">API Function Categories</a>

The API functions can be broadly put into a few categories based on the hierarchy of data they operate on: 

**Job-level queries**

These functions operate on individual jobs or a collection of jobs. Examples: `get_jobs`, `get_job_tags`, `are_jobs_comparable`, and `comparable_job_partitions`.


**Operation-level queries**

These functions operate at the level of operations. An operation can be considered a collection of processes that share some common tags. Examples: `get_ops` and `get_op_metrics`.


**Process-level queries**

These functions operate on individual processes. They tend to be more time-consuming, as the number of processes is many orders of magnitude larger than the number of jobs. Examples: `get_procs`, `job_proc_tags`, `conv_procs`, `rank_proc_tags_keys` and `timeline`.


**Thread-level queries**

At present we have only a single API call for this category: `get_thread_metrics`.


**Reference model calls**

These operate on user-defined models on jobs and operations. Examples: `create_refmodel`, `delete_refmodels` and `get_refmodels`.


### <a name="getting-docs">Getting to the docs</a>

The module functions have embedded documentation in the form of docstrings. You can access it 
as you would for any Python module or function.

To get help for all functions in the module, do `help(<module-name>)`:
```
help(eq)
```

To get documentation for a specific function, do something like:
```
help(eq.get_jobs)
```

On the command line, you can get a listing of API functions by doing:
```
$ epmt help api
```

And you can get details on an individual function by doing:
```
$ epmt help api <function-name>   # for example "epmt help api get_jobs"
```

### <a name="job-query">Job Query</a>

This is perhaps the function that you will use most often! The job query usually takes a `tags` argument and returns a collection of jobs in the format specified by `fmt`. The returned list can be pruned and/or ordered using `fltr`, `limit` and `order`.

You can also pass in one or more jobs as a `jobs` parameter, most often for format conversion.

Let's get started!

In [2]:
# let's get jobs, we use the job tag to select the jobs
jobs = eq.get_jobs(tags='exp_name:ESM4_historical_D151;exp_component:ocean_month_rho2_1x1deg',fmt='terse')
jobs

['625172',
 '627922',
 '629337',
 '633144',
 '676007',
 '680181',
 '685016',
 '692544',
 '693147',
 '696127',
 '802954',
 '804285']

<a name="output-formats"></a>`fmt` can take one of the following values:
 * `terse` -- this returns a list of job ids
 * `pandas` -- this returns a pandas dataframe
 * `dict` -- for a list of python dictionaries
 * `orm` -- ORM object for maximum flexibility and speediest queries.

In [3]:
# above we got a list of job ids. sometimes we want to see more details
# than just the job id. We can use `conv_jobs` to convert between formats
# Or use `get_jobs` to get the specified format directly.
jobs_df = eq.conv_jobs(jobs, fmt='pandas')
display(jobs_df.columns.values)
jobs_df

array(['duration', 'updated_at', 'tags', 'info_dict', 'env_dict',
       'cpu_time', 'annotations', 'env_changes_dict', 'analyses',
       'submit', 'start', 'jobid', 'end', 'jobname', 'created_at',
       'exitcode', 'user', 'rchar', 'syscr', 'syscw', 'wchar', 'majflt',
       'minflt', 'rssmax', 'inblock', 'outblock', 'usertime', 'num_procs',
       'processor', 'vol_ctxsw', 'guest_time', 'read_bytes', 'systemtime',
       'time_oncpu', 'timeslices', 'invol_ctxsw', 'num_threads',
       'write_bytes', 'time_waiting', 'all_proc_tags', 'rdtsc_duration',
       'delayacct_blkio_time', 'cancelled_write_bytes',
       'PERF_COUNT_SW_CPU_CLOCK'], dtype=object)

Unnamed: 0,duration,updated_at,tags,info_dict,env_dict,cpu_time,annotations,env_changes_dict,analyses,submit,...,timeslices,invol_ctxsw,num_threads,write_bytes,time_waiting,all_proc_tags,rdtsc_duration,delayacct_blkio_time,cancelled_write_bytes,PERF_COUNT_SW_CPU_CLOCK
0,12630660000.0,2020-04-23 13:55:26.514946,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job625172', '...",959968389.0,{},{},{},2019-06-09 18:53:22.574059,...,3175850,30049,13471,134643470336,48105200287,"[{'op': 'cp', 'op_instance': '1', 'op_sequence...",99908669360503,0,2448543744,866184046169
1,6532174000.0,2020-04-23 13:55:28.794444,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job627922', '...",679701225.0,{},{},{},2019-06-10 06:23:14.388744,...,824685,11945,3596,72541306880,13802065001,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",52576073050078,0,3044032512,649725614194
2,6696039000.0,2020-04-23 13:55:31.016861,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job629337', '...",623964730.0,{},{},{},2019-06-10 09:59:22.043793,...,805888,9577,3596,70252703744,19476877777,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",54486853149126,0,1998835712,576733745205
3,6625638000.0,2020-04-23 13:55:33.252995,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job633144', '...",621768978.0,{},{},{},2019-06-10 16:49:06.802212,...,826722,29372,3596,70780125184,24198886180,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",53485319494154,0,1385582592,582075035986
4,10080730000.0,2020-04-23 13:55:35.473007,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job676007', '...",640906121.0,{},{},{},2019-06-14 08:30:37.421228,...,849400,12819,3596,73864601600,23712986693,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",90008938460454,0,1465577472,605349533801
5,6009934000.0,2020-04-23 13:55:37.682140,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job680181', '...",571561896.0,{},{},{},2019-06-14 16:34:15.052476,...,818718,21774,3596,70259195904,32443024755,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",40101203774504,0,1998835712,522980958427
6,7005619000.0,2020-04-23 13:55:39.903867,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job685016', '...",427082965.0,{},{},{},2019-06-15 07:52:38.592038,...,812300,9667,3596,70246367232,11304090572,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",49705960766145,0,3392450560,403054040862
7,709300900.0,2020-04-23 13:55:42.121584,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job692544', '...",593701277.0,{},{},{},2019-06-16 13:54:28.828890,...,797027,10348,3596,70606073856,18759054582,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",12080626593893,0,3392417792,557685977561
8,3340305000.0,2020-04-23 13:55:44.295646,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job693147', '...",594222175.0,{},{},{},2019-06-16 16:20:31.601990,...,801396,14401,3614,70251761664,21984544439,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",-28389475590895298,0,1998848000,553117186630
9,3676905000.0,2020-04-23 13:55:46.492903,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...","{'tz': 'US/Eastern', 'status': {'exit_code': 0...","{'PWD': '/vftmp/Jeffrey.Durachta/job696127', '...",607235263.0,{},{},{},2019-06-17 06:20:59.842457,...,813225,11854,3596,70476115968,17253128153,"[{'op': 'cp', 'op_instance': '11', 'op_sequenc...",33167362344222,0,2347225088,574686010894


In [4]:
# if you prefer dealing with python lists and dictionaries,
# you can set fmt='dict'. Here we get a list of dictionaries
# Notice we can use `get_jobs` to convert formats
eq.get_jobs(jobs = jobs, fmt='dict')

[{'duration': 12630660818.0,
  'updated_at': datetime.datetime(2020, 4, 23, 13, 55, 26, 514946),
  'tags': {'atm_res': 'c96l49',
   'ocn_res': '0.5l75',
   'exp_name': 'ESM4_historical_D151',
   'exp_time': '18540101',
   'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18540101',
   'exp_component': 'ocean_month_rho2_1x1deg'},
  'info_dict': {'tz': 'US/Eastern',
   'status': {'exit_code': 0,
    'exit_reason': 'none',
    'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18540101',
    'script_path': '/home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18540101.tags'},
   'post_processed': 1},
  'env_dict': {'PWD': '/vftmp/Jeffrey.Durachta/job625172',
   'TMP': '/vftmp/Jeffrey.Durachta/job625172',
   'EPMT': '/home/Jeffrey.Durachta/workflowDB/build//epmt/epmt',
   'HOME': '/home/Jeffrey.Durachta',
   'HOST': 'pp301',
   'LANG': 'en_US',
   'PATH': '/home/gfdl/bin2:/

<a name="orm-objects"></a>
The `orm` format is particularly useful: it optimizes queries
and gives you direct access to the underlying Job (or Process) objects.

In [5]:
jobs_orm = eq.get_jobs(jobs, fmt='orm')
jobs_orm.count(), type(jobs_orm)

(12, sqlalchemy.orm.query.Query)

`jobs_orm` above is a `Query` object. The `Query` object can be iterated
over (like a Python list). You can convert it to a list by using the slice
operator -- `[:]`.

The ORM format is powerful as it minimizes the number of SQL queries and
lazy-evaluates queries where possible.

#### <a name="job-tags">Job Tags</a>

Each job has a `tags` field that is set at import time. The job tag is stored
as a dictionary of key/value pairs. The most common use of the job tag is for selecting
jobs. You can specify the tag either as a dictionary or as a string, with each key/value
pair separated by semicolons. All the key/value pairs must match for a job to be considered
a match.
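
The string and dictionary forms of a tag, and the all-pairs-must-match rule, can be sketched in plain Python. The helpers `parse_tag` and `tag_matches` below are hypothetical names written for illustration only; they are not part of the EPMT API, and the real parsing may differ in details such as whitespace handling.

```python
def parse_tag(tag):
    """Turn 'k1:v1;k2:v2' into {'k1': 'v1', 'k2': 'v2'}; dicts pass through."""
    if isinstance(tag, dict):
        return dict(tag)
    pairs = (item.split(":", 1) for item in tag.split(";") if item)
    return {key.strip(): value.strip() for key, value in pairs}


def tag_matches(job_tags, wanted):
    """True only if every key/value pair in `wanted` is present in `job_tags`."""
    wanted = parse_tag(wanted)
    return all(job_tags.get(key) == value for key, value in wanted.items())


# tags of one of the jobs in this study (abridged)
job_tags = {
    "atm_res": "c96l49",
    "ocn_res": "0.5l75",
    "exp_name": "ESM4_historical_D151",
    "exp_component": "ocean_month_rho2_1x1deg",
}

string_form = "exp_name:ESM4_historical_D151;exp_component:ocean_month_rho2_1x1deg"
dict_form = {
    "exp_name": "ESM4_historical_D151",
    "exp_component": "ocean_month_rho2_1x1deg",
}

same = parse_tag(string_form) == dict_form
full_match = tag_matches(job_tags, string_form)
# one pair matches, the other does not, so the job is not a match
partial_match = tag_matches(job_tags, "exp_name:ESM4_historical_D151;ocn_res:1l50")
```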

In [6]:
exp_ESM4_historical_D151_jobs = eq.get_jobs(tags='exp_name:ESM4_historical_D151', fmt='orm')

In [7]:
exp_ESM4_historical_D151_jobs.count()

12

In [8]:
for j in exp_ESM4_historical_D151_jobs:
    print(j.jobid, j.tags)

625172 {'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'exp_name': 'ESM4_historical_D151', 'exp_time': '18540101', 'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18540101', 'exp_component': 'ocean_month_rho2_1x1deg'}
627922 {'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'exp_name': 'ESM4_historical_D151', 'exp_time': '18590101', 'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18590101', 'exp_component': 'ocean_month_rho2_1x1deg'}
629337 {'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'exp_name': 'ESM4_historical_D151', 'exp_time': '18640101', 'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18640101', 'exp_component': 'ocean_month_rho2_1x1deg'}
633144 {'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'exp_name': 'ESM4_historical_D151', 'exp_time': '18690101', 'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18690101', 'exp_component': 'ocean_month_rho2_1x1deg'}
676007 {'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'exp_name': 'ESM4_historical_D151', 'exp_time'

Sometimes you would like to know the job tags for a collection of jobs, and see the range of
values different fields in the job tags take. 

In [9]:
eq.get_job_tags(exp_ESM4_historical_D151_jobs)

{'atm_res': 'c96l49',
 'ocn_res': '0.5l75',
 'exp_name': 'ESM4_historical_D151',
 'exp_time': {'18540101',
  '18590101',
  '18640101',
  '18690101',
  '18740101',
  '18790101',
  '18840101',
  '18890101',
  '18940101',
  '18990101',
  '19040101',
  '19090101'},
 'script_name': {'ESM4_historical_D151_ocean_month_rho2_1x1deg_18540101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18590101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18640101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18690101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18790101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18840101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18890101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18940101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_18990101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_19040101',
  'ESM4_historical_D151_ocean_month_rho2_1x1deg_19090101'},
 'exp_componen

The output says that across all the `exp_ESM4_historical_D151_jobs`, `ocn_res` had the single value `0.5l75` and `exp_name` had the single value `ESM4_historical_D151`, but `exp_time` ranged over a set of values -- `{'18540101','18590101','18640101','18690101','18740101','18790101','18840101','18890101','18940101','18990101','19040101','19090101'}`. The jobs shared the same value for `exp_component` -- `ocean_month_rho2_1x1deg`.
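
The shape of that result (a plain value where all jobs agree, a set where they differ) can be reproduced with a few lines of plain Python. This is only a sketch of the idea: `merge_tags` is a hypothetical helper, not the implementation behind `get_job_tags`.

```python
def merge_tags(tag_dicts):
    """Collect the values seen for each key; collapse single-valued keys to a scalar."""
    seen = {}
    for tags in tag_dicts:
        for key, value in tags.items():
            seen.setdefault(key, set()).add(value)
    return {
        key: (next(iter(values)) if len(values) == 1 else values)
        for key, values in seen.items()
    }


# abridged tags of the first two jobs in this study
job_tags = [
    {"ocn_res": "0.5l75", "exp_name": "ESM4_historical_D151", "exp_time": "18540101"},
    {"ocn_res": "0.5l75", "exp_name": "ESM4_historical_D151", "exp_time": "18590101"},
]

merged = merge_tags(job_tags)
```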

#### <a name="jobs-order-filter">Ordering and Filtering Jobs</a>

You can use the `order`, `limit`, and `fltr` options with `get_jobs` to sort and filter the job list.
It is advisable to use `limit` when possible, as it adds a `LIMIT` clause to the SQL query
and reduces the time spent loading data from the database.

In [10]:
# another useful query is to order the jobs by duration
# (pass `limit` as well if you only want the top few)
df = eq.get_jobs(jobs, order=eq.desc(eq.Job.duration), fmt="pandas")
df[['jobid', 'tags', 'duration', 'exitcode']]

Unnamed: 0,jobid,tags,duration,exitcode
0,625172,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",12630660000.0,0
1,676007,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",10080730000.0,0
2,685016,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",7005619000.0,0
3,629337,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",6696039000.0,0
4,633144,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",6625638000.0,0
5,627922,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",6532174000.0,0
6,680181,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",6009934000.0,0
7,802954,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",3879024000.0,0
8,696127,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",3676905000.0,0
9,693147,"{'atm_res': 'c96l49', 'ocn_res': '0.5l75', 'ex...",3340305000.0,0


<a name="failed-jobs"></a>Let's figure out which jobs, if any, failed.

In [11]:
eq.get_jobs(jobs_orm, fltr=(eq.Job.exitcode != 0), fmt='terse')

[]

So, none of the jobs had a non-zero exit code.
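
To see why ordering, limiting and filtering are cheap when pushed down to the database, here is a self-contained sketch using Python's built-in `sqlite3` module. The `jobs` table below is a hypothetical, heavily simplified stand-in (it is not the EPMT schema); the job ids and durations are copied from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# hypothetical, simplified table; the real EPMT schema has many more columns
conn.execute("CREATE TABLE jobs (jobid TEXT PRIMARY KEY, duration REAL, exitcode INTEGER)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        ("625172", 12630660000.0, 0),
        ("627922", 6532174000.0, 0),
        ("629337", 6696039000.0, 0),
        ("633144", 6625638000.0, 0),
        ("676007", 10080730000.0, 0),
        ("680181", 6009934000.0, 0),
        ("685016", 7005619000.0, 0),
    ],
)

# ORDER BY + LIMIT: the database returns only the three longest jobs
top3 = [row[0] for row in conn.execute(
    "SELECT jobid FROM jobs ORDER BY duration DESC LIMIT 3")]

# WHERE: the equivalent of filtering on a non-zero exit code
failed = [row[0] for row in conn.execute(
    "SELECT jobid FROM jobs WHERE exitcode != 0")]

conn.close()
```

Only the rows that survive the `WHERE` and `LIMIT` clauses are transferred to Python, which is the saving the `limit` and `fltr` options aim for.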

#### <a name="proc-sums-field">Aggregation across job processes (ADVANCED TOPIC)</a>

Each job object has a `proc_sums` field that aggregates metrics across the 
processes of the job. The field is a dictionary of key/value pairs and is an
attribute of the Job object. When converting from `orm` to the other formats,
the key/value pairs of the dictionary are made available as top-level fields
of the `dict` or `pandas` dataframe:

In [12]:
j = jobs_orm.first()
sorted(j.proc_sums.keys())

['PERF_COUNT_SW_CPU_CLOCK',
 'all_proc_tags',
 'cancelled_write_bytes',
 'delayacct_blkio_time',
 'guest_time',
 'inblock',
 'invol_ctxsw',
 'majflt',
 'minflt',
 'num_procs',
 'num_threads',
 'outblock',
 'processor',
 'rchar',
 'rdtsc_duration',
 'read_bytes',
 'rssmax',
 'syscr',
 'syscw',
 'systemtime',
 'time_oncpu',
 'time_waiting',
 'timeslices',
 'usertime',
 'vol_ctxsw',
 'wchar',
 'write_bytes']

Now, the fields shown above become available in other formats (`dict` and `pandas`) as top-level fields, while the `proc_sums`
field itself is masked.

In [13]:
j_df = eq.get_jobs(j, fmt='pandas')
sorted(j_df.columns.values)

['PERF_COUNT_SW_CPU_CLOCK',
 'all_proc_tags',
 'analyses',
 'annotations',
 'cancelled_write_bytes',
 'cpu_time',
 'created_at',
 'delayacct_blkio_time',
 'duration',
 'end',
 'env_changes_dict',
 'env_dict',
 'exitcode',
 'guest_time',
 'inblock',
 'info_dict',
 'invol_ctxsw',
 'jobid',
 'jobname',
 'majflt',
 'minflt',
 'num_procs',
 'num_threads',
 'outblock',
 'processor',
 'rchar',
 'rdtsc_duration',
 'read_bytes',
 'rssmax',
 'start',
 'submit',
 'syscr',
 'syscw',
 'systemtime',
 'tags',
 'time_oncpu',
 'time_waiting',
 'timeslices',
 'updated_at',
 'user',
 'usertime',
 'vol_ctxsw',
 'wchar',
 'write_bytes']
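
The promotion of `proc_sums` keys to top-level fields amounts to a dictionary flattening. The sketch below shows the idea with a hypothetical `flatten_job` helper and an abridged job record (the metric values are taken from job `625172` above); it is not the conversion code EPMT itself uses.

```python
def flatten_job(job):
    """Copy the job, drop 'proc_sums', and promote its keys to the top level."""
    flat = {key: value for key, value in job.items() if key != "proc_sums"}
    flat.update(job.get("proc_sums", {}))
    return flat


# abridged record: a few regular fields plus a nested proc_sums dictionary
job = {
    "jobid": "625172",
    "exitcode": 0,
    "proc_sums": {"num_threads": 13471, "invol_ctxsw": 30049, "timeslices": 3175850},
}

flat_job = flatten_job(job)
columns = sorted(flat_job)
```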

## <a name="ops">Operations</a>

An operation is simply a collection of processes that share a tag. A tag, you will recall, is
simply a dictionary of key/value pairs (it may be expressed in string form).
The collection of processes forms a **forest**. In the degenerate case of a single root process, the forest will contain a single tree.

### <a name="select-op-procs">Selecting processes in an operation</a>

We can select the processes in an operation by passing a tag to `get_procs`.
You may limit the selection to a single job or multiple jobs using the
`jobs` parameter to `get_procs`.

In [14]:
# below we use the ORM format as we just want a count on the number of processes in the operation
hsmget_op_procs = eq.get_procs(jobs, tags='op:hsmget', fmt='orm')
hsmget_op_procs.count()

27720

### <a name="operation-primitive">The Operation primitive</a>

Using `get_procs` with a tag to select processes in an operation is somewhat
clumsy. The EPMT Query API defines an **Operation** primitive. The `Operation`
API call is passed one or more jobs, and a `tag`. Internally, it calls `get_procs`.
By using the `Operation` primitive, you get aggregated metrics across the
processes constituting the operation in a `proc_sums` attribute. You can specify a granular
tag such as `{'op': 'timavg', 'op_instance': 100, 'op_sequence': 5 }`, or a
coarser tag, such as `{'op': 'timavg'}`. The important thing to understand is that
all the processes that constitute the operation will share *ALL* the key/value pairs of the tag.
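
The difference between a coarse and a granular tag, and the idea of summing metrics over the selected processes, can be sketched as follows. The process records and metric values are made up for illustration, and `select_procs` and `sum_metrics` are hypothetical helpers rather than EPMT functions. The tag values are written as strings here because that is how they appear in the outputs elsewhere in this workbook.

```python
from collections import Counter


def select_procs(procs, op_tag):
    """Keep the processes whose tags contain every key/value pair of op_tag."""
    return [
        p for p in procs
        if all(p["tags"].get(key) == value for key, value in op_tag.items())
    ]


def sum_metrics(procs, metrics=("cpu_time", "rchar")):
    """Sum the chosen metrics over the processes, like a tiny proc_sums."""
    totals = Counter()
    for p in procs:
        for metric in metrics:
            totals[metric] += p[metric]
    return dict(totals)


# made-up process records
procs = [
    {"tags": {"op": "timavg", "op_instance": "100", "op_sequence": "5"}, "cpu_time": 10, "rchar": 100},
    {"tags": {"op": "timavg", "op_instance": "100", "op_sequence": "6"}, "cpu_time": 20, "rchar": 200},
    {"tags": {"op": "hsmget", "op_instance": "1", "op_sequence": "1"}, "cpu_time": 5, "rchar": 50},
]

coarse = select_procs(procs, {"op": "timavg"})
granular = select_procs(procs, {"op": "timavg", "op_instance": "100", "op_sequence": "5"})
coarse_sums = sum_metrics(coarse)
```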

In [15]:
op = eq.Operation(jobs, {'op': 'hsmget'})
(op.tags, op.processes.count(), op.proc_sums)

({'op': 'hsmget'},
 27720,
 {'syscr': 2446648,
  'cancelled_write_bytes': 1290240,
  'rchar': 6068618970,
  'wchar': 317449816715,
  'inblock': 197032,
  'write_bytes': 317461139456,
  'cpu_time': 2540237921.0,
  'timeslices': 11577749,
  'systemtime': 531837341,
  'syscw': 1674748,
  'processor': 0,
  'numtids': 30212,
  'num_procs': 27720,
  'outblock': 620041288,
  'minflt': 39641382,
  'PERF_COUNT_SW_CPU_CLOCK': 2242055598562,
  'vol_ctxsw': 11465035,
  'time_waiting': 172721616270,
  'usertime': 2008400580,
  'read_bytes': 100880384,
  'rdtsc_duration': -28107424259990747,
  'time_oncpu': 2556135725081,
  'guest_time': 0,
  'rssmax': 198999756,
  'majflt': 245,
  'invol_ctxsw': 82453,
  'delayacct_blkio_time': 0,
  'duration': 87996944703.0})

### <a name='get-ops'>`get_ops` API call</a>
We also have an API call `get_ops` (much like `get_jobs` and `get_procs`) that supports querying for a collection of operations. It supports multiple output formats, selected with the `fmt` option.

In [16]:
dm_ops = eq.get_ops(jobs, tags = ['op:hsmget', 'op:dmput', 'op:cp', 'op:rm', 'op:mv', 'op:untar'], fmt='pandas')
pd.set_option('display.max_colwidth', 150)
dm_ops[['tags','proc_sums', 'start', 'finish', 'duration']]

Unnamed: 0,tags,proc_sums,start,finish,duration
0,{'op': 'hsmget'},"{'syscr': 2446648, 'cancelled_write_bytes': 1290240, 'rchar': 6068618970, 'wchar': 317449816715, 'inblock': 197032, 'write_bytes': 317461139456, '...",2019-06-09 18:53:24.294728,2019-06-22 17:43:35.235059,87996940000.0
1,{'op': 'dmput'},"{'syscr': 55456, 'cancelled_write_bytes': 3903488, 'rchar': 68898988, 'wchar': 1209464, 'inblock': 136640, 'write_bytes': 76611584, 'cpu_time': 30...",2019-06-09 18:53:22.610123,2019-06-22 17:50:01.516486,70347800000.0
2,{'op': 'cp'},"{'syscr': 871999, 'cancelled_write_bytes': 52948992, 'rchar': 1583622456, 'wchar': 1295256442, 'inblock': 56976, 'write_bytes': 1298894848, 'cpu_t...",2019-06-09 22:17:47.314092,2019-06-22 17:48:32.888042,553131000.0
3,{'op': 'rm'},"{'syscr': 204650, 'cancelled_write_bytes': 22139564032, 'rchar': 119740938, 'wchar': 390348, 'inblock': 112, 'write_bytes': 3985408, 'cpu_time': 2...",2019-06-09 22:18:09.621773,2019-06-22 17:49:59.287992,47021610.0
4,{'op': 'mv'},"{'syscr': 134937, 'cancelled_write_bytes': 286720, 'rchar': 13787477891, 'wchar': 589442682, 'inblock': 729, 'write_bytes': 1404928, 'cpu_time': 1...",2019-06-09 22:18:43.539051,2019-06-22 17:49:59.331484,992812400.0
5,{'op': 'untar'},"{'syscr': 1592602, 'cancelled_write_bytes': 573440, 'rchar': 14556202373, 'wchar': 13279211324, 'inblock': 3565344, 'write_bytes': 13293514752, 'c...",2019-06-09 22:15:34.974112,2019-06-22 17:48:29.733792,99932220.0


The dataframe contains one row per `tag` (or operation). `proc_sums` contains aggregate metrics across the underlying processes constituting the operation.

`get_ops` has a `full` option that adds extra fields to each row, such as `intervals`, which lists the intervals during which a discontiguous operation ran. It also includes the processes constituting the operation. 

In [17]:
eq.get_ops(jobs, tags = ['op:hsmget'], full=True, fmt='pandas')

Unnamed: 0,jobs,tags,exact_tag_only,op_duration_method,duration,proc_sums,start,finish,intervals,contiguous,num_runs,processes
0,"[625172, 627922, 629337, 633144, 676007, 680181, 685016, 692544, 693147, 696127, 802954, 804285]",{'op': 'hsmget'},False,sum,87996940000.0,"{'syscr': 2446648, 'cancelled_write_bytes': 1290240, 'rchar': 6068618970, 'wchar': 317449816715, 'inblock': 197032, 'write_bytes': 317461139456, '...",2019-06-09 18:53:24.294728,2019-06-22 17:43:35.235059,"((2019-06-09 18:53:24.294728, 2019-06-09 19:04:32.832405), (2019-06-09 19:04:32.843397, 2019-06-09 19:04:32.846370), (2019-06-09 19:04:32.859709, ...",False,1337,"[{'updated_at': None, 'tags': {'op': 'hsmget', 'op_instance': '4', 'op_sequence': '20'}, 'exename': 'perl', 'exitcode': 0, 'info_dict': None, 'pat..."


As you can see, the `hsmget` operation contains `1337` discontiguous intervals!
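
One way to picture how a set of process runs collapses into discontiguous intervals is the classic interval-merging algorithm sketched below. This is an assumption about the general idea, not EPMT's actual code: the rule it uses to decide whether two runs are contiguous may well differ. The intervals are made-up numbers, and `merge_intervals` is a hypothetical helper.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) pairs into disjoint intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # overlaps (or touches) the previous interval: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


# made-up (start, end) times of the processes in an operation
proc_intervals = [(0, 10), (5, 12), (20, 25), (25, 30), (40, 41)]

intervals = merge_intervals(proc_intervals)
num_runs = len(intervals)
contiguous = num_runs == 1
```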

### <a name="op-metrics">Aggregating operation metrics</a>

The `Operation` primitive provides an easy way to obtain aggregates on metrics across
processes in an operation. Before `Operation`, the way to obtain metrics was to
use the `op_metrics` API call:

In [18]:
# widen the column display width to show the full tag
pd.set_option('display.max_colwidth', 200)

# get the operations with the top cpu_time summed across all processes.
# Note: cpu_time is a better measure of time spent in an operation than
# 'duration', which might end up double-counting, as in a
# parent-child process scenario where the parent waits on the child.
ops_df = eq.op_metrics(['629337', '680181'], fmt='pandas').sort_values(by='cpu_time', ascending=False)
ops_df[['jobid', 'tags', 'duration', 'cpu_time']][:10]

Unnamed: 0,jobid,tags,duration,cpu_time
24,629337,"{'op': 'fregrid', 'op_instance': '7', 'op_sequence': '80'}",69376250.0,68409594.0
25,680181,"{'op': 'fregrid', 'op_instance': '7', 'op_sequence': '80'}",55815480.0,55867500.0
29,680181,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '3'}",1348386000.0,53789735.0
31,680181,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '5'}",1039550000.0,49622360.0
116,629337,"{'op': 'ncrcat', 'op_instance': '13', 'op_sequence': '76'}",48194520.0,48163676.0
117,680181,"{'op': 'ncrcat', 'op_instance': '13', 'op_sequence': '76'}",46517320.0,44601217.0
26,629337,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '1'}",695662000.0,39960829.0
34,629337,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '9'}",383194400.0,36986292.0
28,629337,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '3'}",2833971000.0,35084576.0
32,629337,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '7'}",280321800.0,34164711.0


#### <a name="dm-ops">Data movement operations</a>
The above call was slow to execute and returned a large number of operations. If you know which operations
you care about, `op_metrics` accepts a list of tags, and pruning with the `tags` argument speeds up
the call significantly. Let's figure out the time spent
in data-movement operations v. useful work.
In the call to `op_metrics` below, we pass in the *list of tags* that
represent the data-movement operations. Because it is a list of tags, the tags
are combined like an OR operation.

In [19]:
dm_tags = ['op:hsmget', 'op:cp', 'op:dmget', 'op:gcp', 'op:mv', 'op:untar', 'op:tar', 'op:rm']
dm_ops_df = eq.op_metrics(jobs, tags = dm_tags)
dm_ops_df[['jobid', 'tags', 'cpu_time', 'duration', 'num_procs']]

Unnamed: 0,jobid,tags,cpu_time,duration,num_procs
0,625172,{'op': 'hsmget'},525588783.0,16145110000.0,8860
1,627922,{'op': 'hsmget'},151229296.0,7160026000.0,1713
2,629337,{'op': 'hsmget'},221437661.0,7716459000.0,1713
3,633144,{'op': 'hsmget'},207422750.0,7509555000.0,1713
4,676007,{'op': 'hsmget'},187083822.0,14508170000.0,1713
5,680181,{'op': 'hsmget'},230162385.0,7836627000.0,1713
6,685016,{'op': 'hsmget'},123305670.0,7585973000.0,1713
7,692544,{'op': 'hsmget'},190831238.0,1526509000.0,1713
8,693147,{'op': 'hsmget'},199442956.0,5148361000.0,1730
9,696127,{'op': 'hsmget'},201617665.0,4595266000.0,1713


While the query above helps, we would like it to aggregate across jobs by tag. This
is easily accomplished by passing the <a name="group-by-tag">`group_by_tag`</a> 
argument to `op_metrics`:

In [20]:
dm_ops_df_grouped = eq.op_metrics(jobs, tags = dm_tags, group_by_tag = True)
dm_ops_df_grouped[['tags', 'cpu_time', 'duration', 'num_procs']]

Unnamed: 0,tags,cpu_time,duration,num_procs
0,{'op': 'cp'},125672200.0,553131000.0,12827
1,{'op': 'hsmget'},2540238000.0,87996940000.0,27720
2,{'op': 'mv'},142701400.0,992812400.0,900
3,{'op': 'rm'},26662950.0,47021610.0,2940
4,{'op': 'untar'},45750640.0,99932220.0,2513
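
Grouping by tag is, in essence, a sum over jobs for each distinct tag. The plain-Python sketch below shows the idea on four per-job rows for jobs `625172` and `627922` (values as reported per job by the API for the `hsmget` and `cp` operations). `group_rows_by_tag` is a hypothetical helper, not how `op_metrics` is implemented.

```python
from collections import defaultdict


def group_rows_by_tag(rows, metrics=("cpu_time", "num_procs")):
    """Sum the metrics of all rows that carry the same tag dictionary."""
    totals = defaultdict(lambda: dict.fromkeys(metrics, 0))
    for row in rows:
        # dicts are not hashable, so use a sorted tuple of items as the group key
        key = tuple(sorted(row["tags"].items()))
        for metric in metrics:
            totals[key][metric] += row[metric]
    return [{"tags": dict(key), **sums} for key, sums in sorted(totals.items())]


rows = [
    {"jobid": "625172", "tags": {"op": "hsmget"}, "cpu_time": 525588783.0, "num_procs": 8860},
    {"jobid": "627922", "tags": {"op": "hsmget"}, "cpu_time": 151229296.0, "num_procs": 1713},
    {"jobid": "625172", "tags": {"op": "cp"}, "cpu_time": 10866143.0, "num_procs": 1343},
    {"jobid": "627922", "tags": {"op": "cp"}, "cpu_time": 13160937.0, "num_procs": 1044},
]

grouped = group_rows_by_tag(rows)
```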


So, the total time spent in all data-movement operations can be calculated easily.

In [21]:
dm_ops_df_grouped['cpu_time'].sum()/1e6

2881.025086

In [22]:
# total time spent in the jobs
s = 0
for j in jobs_orm: s += j.cpu_time
s/1e6

7351.686315

In [23]:
# data-movement as a percentage of total time
# (in IPython/Jupyter, `_` is the last cell output and `__` the one before it)
round((100*__/_), 2)

39.19
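
Because that cell depends on the notebook's output history, it only works when run immediately after the two cells before it. A sketch of the same arithmetic with the two values written out explicitly (the variable names are illustrative):

```python
# values copied from the two cell outputs above (cpu_time divided by 1e6)
dm_cpu_time = 2881.025086      # data-movement operations
total_cpu_time = 7351.686315   # all processes of all jobs

dm_percent = round(100 * dm_cpu_time / total_cpu_time, 2)
```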

#### <a name="cpu-time-v-duration">cpu time v. duration</a>
So, the data-movement operations take about `39%` of the total CPU time across our jobs.
There is a reason we used `cpu_time` (a.k.a. exclusive CPU time) rather than `duration`
for our calculation: `duration` can get counted multiple
times if a process forks and waits for a child to terminate. The `duration`, or wall-clock
time, ends up being counted both for the parent process and the child process. 
`cpu_time`, on the other hand, is the actual time spent on the CPU, and cannot be counted twice 
in such a scenario.
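
A small worked example makes the double counting concrete. The numbers are made up: a single-threaded parent runs for 10 seconds of wall-clock time, of which it spends 8 seconds waiting on a child it forked.

```python
# made-up numbers, in seconds
processes = [
    {"name": "parent", "duration": 10.0, "cpu_time": 2.0},  # waits 8 s on the child
    {"name": "child", "duration": 8.0, "cpu_time": 7.0},
]
elapsed = 10.0  # real wall-clock time from parent start to parent exit

summed_duration = sum(p["duration"] for p in processes)
summed_cpu_time = sum(p["cpu_time"] for p in processes)

# the child's 8 s are counted once for the child and again inside the parent
double_counted = summed_duration - elapsed
```

Summing `duration` reports 18 seconds for a workload that took 10 seconds of wall-clock time, while summing `cpu_time` reports the 9 seconds actually spent on the CPU.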

### <a name="ops-costs">Operation costs as a fraction of total time</a>

You might be wondering if there's a quicker way to compute the data-movement cost above, and there is. The `ops_costs` API call determines the fractional cost of operations, where the cost can be computed on a metric such as `duration` or `cpu_time`.

In [24]:
(dm_percent, dm_ops_df, jobs_cpu_time, dm_agg_df_by_job) = eq.ops_costs(jobs, tags = dm_tags, features = ['cpu_time'])
dm_percent

39.189

In [25]:
dm_ops_df[['jobid', 'tags', 'num_procs', 'cpu_time', 'duration']]

Unnamed: 0,jobid,tags,num_procs,cpu_time,duration
0,625172,{'op': 'cp'},1343,10866143.0,57597100.0
1,625172,{'op': 'hsmget'},8860,525588783.0,16145110000.0
2,625172,{'op': 'mv'},75,6685913.0,61464740.0
3,625172,{'op': 'rm'},245,1931483.0,3206230.0
4,625172,{'op': 'untar'},1523,15601214.0,63272750.0
5,627922,{'op': 'cp'},1044,13160937.0,49746430.0
6,627922,{'op': 'hsmget'},1713,151229296.0,7160026000.0
7,627922,{'op': 'mv'},75,16288451.0,99449600.0
8,627922,{'op': 'rm'},245,2634351.0,3820788.0
9,627922,{'op': 'untar'},90,3668353.0,4424118.0


In [26]:
# this shows the costs aggregated by jobs (across tags)
dm_agg_df_by_job

Unnamed: 0,jobid,job_cpu_time,dm_cpu_time,dm_cpu_time%
0,625172,959968389.0,560673536.0,58.0
1,627922,679701225.0,186981388.0,28.0
2,629337,623964730.0,257558676.0,41.0
3,633144,621768978.0,239827284.0,39.0
4,676007,640906121.0,209620927.0,33.0
5,680181,571561896.0,261270304.0,46.0
6,685016,427082965.0,140949699.0,33.0
7,692544,593701277.0,223632776.0,38.0
8,693147,594222175.0,233669236.0,39.0
9,696127,607235263.0,226836335.0,37.0


The table above shows the percentage of `cpu_time` spent in data-movement operations by job. While most of the jobs are in the `30-40%` range, the first job is an outlier. This might be because, being the first in the series, it needs its data to be fetched from long-term storage, while the jobs that follow it find their data in the cache.
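
The per-job percentages can be checked by hand from the two `cpu_time` columns. The sketch below does so for the first three rows of the table; the rounding to whole percent is an assumption inferred from the values shown, not something read from the `ops_costs` source.

```python
# (jobid, job_cpu_time, dm_cpu_time) copied from the first three rows above
rows = [
    ("625172", 959968389.0, 560673536.0),
    ("627922", 679701225.0, 186981388.0),
    ("629337", 623964730.0, 257558676.0),
]

# percentage of each job's cpu_time spent in data movement, in whole percent
dm_percent_by_job = {
    jobid: round(100 * dm_cpu_time / job_cpu_time)
    for jobid, job_cpu_time, dm_cpu_time in rows
}

# the job with the largest data-movement share
outlier = max(dm_percent_by_job, key=dm_percent_by_job.get)
```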

### <a name="process-query">Process Query</a>

A process query returns a collection of one or more processes. The query is
passed a `jobs` parameter to restrict the process set to those that belong to a
specified set of `jobs`. 

Like the job query, the process query can take `tags`, `fmt`, 
`fltr`, `order` and `limit` to filter and format the output. `order` and `limit` become
particularly important in process queries, as each job can have thousands of processes
that take time to load from the database. In the same vein, using `fmt='orm'` is a good
idea in process queries, as it minimizes the database overhead in certain cases.

In [56]:
# Get the processes belonging to a job.
# Each row in the pandas dataframe contains one process of the job.
# As with jobs, you can use fmt='terse' to get just the list of database ids of the processes.
procs = eq.get_procs(['629337'], fmt='pandas')
display(sorted(procs.columns.values))
# remove columns with only None values and see the first 10 rows
procs.replace(to_replace=[None], value=np.nan).dropna(axis=1,how="all")[:10]

['PERF_COUNT_SW_CPU_CLOCK',
 'args',
 'cancelled_write_bytes',
 'cpu_time',
 'created_at',
 'delayacct_blkio_time',
 'depth',
 'duration',
 'end',
 'exename',
 'exitcode',
 'gen',
 'guest_time',
 'host',
 'id',
 'inblock',
 'inclusive_cpu_time',
 'info_dict',
 'invol_ctxsw',
 'job',
 'jobid',
 'majflt',
 'minflt',
 'numtids',
 'outblock',
 'parent',
 'path',
 'pgid',
 'pid',
 'ppid',
 'processor',
 'rchar',
 'rdtsc_duration',
 'read_bytes',
 'rssmax',
 'sid',
 'start',
 'syscr',
 'syscw',
 'systemtime',
 'tags',
 'time_oncpu',
 'time_waiting',
 'timeslices',
 'updated_at',
 'user',
 'usertime',
 'vol_ctxsw',
 'wchar',
 'write_bytes']

Unnamed: 0,tags,exename,exitcode,path,id,args,pid,jobid,numtids,ppid,...,systemtime,time_oncpu,timeslices,invol_ctxsw,write_bytes,time_waiting,rdtsc_duration,delayacct_blkio_time,cancelled_write_bytes,PERF_COUNT_SW_CPU_CLOCK
0,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",tcsh,0,/bin/tcsh,19355,-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18640101.tags,16269,629337,1,16268,...,352946,808668773,3804,4,5451776,427323246,23153886255396,0,4096,755006778
1,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",perl,0,/home/fms/local/perlbrew/perls/perl-5.24.0/bin/perl,19354,/home/fms/local/opt/fre-commands/bronx-15/bin/frepp -A -x /home/Jeffrey.Durachta/ncrc/CMIP6/xml/ESM4/DECK/production/ESM4_historical_D151.xml -t 18640101 -s -v --platform gfdl.ncrc4-intel16 -T pro...,31308,629337,1,16269,...,66989,1852939885,426,73,69632,1398209,7579221013,0,4096,1825774202
2,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",bash,0,/bin/bash,19353,-c echo torque/6.0.2:moab/9.0.2:slurm/18.08:globus/system-or-6.0:hdf5/1.8.8:netcdf/4.2:nco/4.5.4:nccmp/1.8.2.0:hsm/1.2.2:perlbrew/5.24.0:gcp/2.3:ncarg/6.2.1:python/2.7.3:fre/bronx-15:fre-analysis/...,31347,629337,1,31308,...,2999,7958715,10,1,0,56548,63903043,0,0,1270150
3,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",grep,0,/bin/grep,19352,^fre/.+,31350,629337,1,31347,...,1999,5971828,6,0,0,4517378,17320923,0,0,288875
4,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",tr,0,/usr/bin/tr,19351,: n,31349,629337,1,31347,...,1999,4687213,6,1,0,8550701,265116,0,0,77395
5,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",bash,0,/bin/bash,19350,-c echo torque/6.0.2:moab/9.0.2:slurm/18.08:globus/system-or-6.0:hdf5/1.8.8:netcdf/4.2:nco/4.5.4:nccmp/1.8.2.0:hsm/1.2.2:perlbrew/5.24.0:gcp/2.3:ncarg/6.2.1:python/2.7.3:fre/bronx-15:fre-analysis/...,31348,629337,1,31347,...,0,1110693,1,0,0,4971699,361030,0,0,105049
6,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",which,0,/usr/bin/which,19349,fredb,31346,629337,1,31308,...,2999,7385108,9,2,0,88913,1085226,0,0,102192
7,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",id,0,/usr/bin/id,19348,-Gn,31345,629337,1,31308,...,2999,7249342,9,2,0,101952,1675261,0,0,299989
8,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",id,0,/usr/bin/id,19347,-Gn,31344,629337,1,31308,...,3999,7739709,9,2,0,152639,2394410,0,0,393361
9,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '89'}",id,0,/usr/bin/id,19346,-Gn,31343,629337,1,31308,...,3999,7273913,9,2,0,96956,2119768,0,0,319789


You can also pass in more than one job, in which case the returned processes
are the union of the processes of all the listed jobs. Here we use the `orm` format
to speed up the query, since we just want a `count` of processes.

In [58]:
procs = eq.get_procs(['629337', '625172'], fmt='orm')
procs.count()

15943

#### <a name="process-tags">Process Tags</a>

Each process saves a dictionary of key/value pairs, such as:
`{'op': "ncatted", 'op_instance': 12, 'op_sequence': 159}`

Process tags are commonly used, via the `tags` option, to select the processes that constitute an **operation**.

In [59]:
# below we get the processes in an operation that is identified by 'op_sequence=66'
op = eq.get_procs(jobs, tags='op:cp;op_instance:11;op_sequence:66', fmt='pandas')
len(op)

1914
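
The query above passes the tag as a string of `key:value` pairs separated by `;`, while other calls in this workbook pass a dictionary. The sketch below illustrates, in plain Python, how the two forms correspond and what it means for a process to match a tag: every key/value pair in the query must be present in the process tag. The helpers `parse_tag` and `tag_matches` are hypothetical illustrations, not EPMT functions, and EPMT's actual matching rules may differ in detail.

```python
def parse_tag(tag):
    """Turn 'k1:v1;k2:v2' into {'k1': 'v1', 'k2': 'v2'}; dicts are normalized to strings."""
    if isinstance(tag, dict):
        return {str(k): str(v) for k, v in tag.items()}
    pairs = (item.split(":", 1) for item in tag.split(";") if item)
    return {key.strip(): value.strip() for key, value in pairs}

def tag_matches(proc_tag, query):
    """True if every key/value pair of the query appears in the process tag."""
    wanted = parse_tag(query)
    return all(k in proc_tag and str(proc_tag[k]) == v for k, v in wanted.items())

proc_tag = {"op": "cp", "op_instance": "11", "op_sequence": "66"}

# the string form and the dict form describe the same tag
print(parse_tag("op:cp;op_instance:11;op_sequence:66") == proc_tag)
# a partial tag still selects the process
print(tag_matches(proc_tag, {"op": "cp"}))
# one differing pair is enough to exclude it
print(tag_matches(proc_tag, "op:cp;op_sequence:67"))
```

A less specific tag therefore selects more processes, which is why `{'op': 'ncatted'}` can be narrowed further by adding `op_sequence`.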

##### <a name="job-proc-tags">Unique process tags in a job (ADVANCED TOPIC)</a>

For a job, we can determine the unique set of process tags across all its processes using the
`job_proc_tags` API call.

In [60]:
# suppose you want to figure out the unique set of operations
# across all the jobs of interest. We would pass in our list of
# jobs
eq.job_proc_tags(jobs_orm)

[{'op': 'cp', 'op_instance': '1', 'op_sequence': '119'},
 {'op': 'cp', 'op_instance': '1', 'op_sequence': '122'},
 {'op': 'cp', 'op_instance': '1', 'op_sequence': '123'},
 {'op': 'cp', 'op_instance': '11', 'op_sequence': '167'},
 {'op': 'cp', 'op_instance': '15', 'op_sequence': '180'},
 {'op': 'cp', 'op_instance': '3', 'op_sequence': '131'},
 {'op': 'cp', 'op_instance': '5', 'op_sequence': '140'},
 {'op': 'cp', 'op_instance': '7', 'op_sequence': '149'},
 {'op': 'cp', 'op_instance': '9', 'op_sequence': '158'},
 {'op': 'dmput', 'op_instance': '1', 'op_sequence': '126'},
 {'op': 'dmput', 'op_instance': '2', 'op_sequence': '190'},
 {'op': 'fregrid', 'op_instance': '1', 'op_sequence': '117'},
 {'op': 'fregrid', 'op_instance': '1', 'op_sequence': '121'},
 {'op': 'fregrid', 'op_instance': '2', 'op_sequence': '132'},
 {'op': 'fregrid', 'op_instance': '3', 'op_sequence': '141'},
 {'op': 'fregrid', 'op_instance': '4', 'op_sequence': '150'},
 {'op': 'fregrid', 'op_instance': '5', 'op_sequence': '

#### <a name="filter-processes">Filtering and Ordering Processes</a>

`order`, `limit` and `fltr` should be used where possible to reduce query time.

In [61]:
# now let's say we care about a particular operation. 
# Let's find the processes in the operation, and
# sort them by the cpu_time, and then see the top offenders
ncatted_procs = eq.get_procs(jobs_orm,
                             tags={'op': 'ncatted'},
                             order=eq.desc(eq.Process.cpu_time),
                             limit=10,
                             fmt='pandas')
ncatted_procs[['jobid', 'tags', 'exename', 'duration', 'cpu_time']]

Unnamed: 0,jobid,tags,exename,duration,cpu_time
0,680181,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1256.0,58990.0
1,680181,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncdump,1112.0,53991.0
2,693147,"{'op': 'ncatted', 'op_instance': '5', 'op_sequence': '41'}",ncatted,1118.0,48992.0
3,629337,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1143.0,48992.0
4,629337,"{'op': 'ncatted', 'op_instance': '3', 'op_sequence': '32'}",ncatted,1119.0,48991.0
5,627922,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1037.0,47992.0
6,696127,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1082.0,47992.0
7,692544,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1053.0,47991.0
8,633144,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1085.0,47991.0
9,693147,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1042.0,46991.0


We could have used a more precise tag, such as `{'op': 'ncatted', 'op_sequence': 85}`,
for a more granular selection, and additionally filtered on the executable name, such as `ncatted`.

In [62]:
procs = eq.get_procs(jobs_orm, tags='op:ncatted;op_sequence:85',
                     fltr=(eq.Process.exename == "ncatted"),
                     order=eq.desc(eq.Process.duration),
                     fmt='pandas')
procs[['jobid', 'tags', 'exename', 'duration', 'cpu_time', 'exitcode']]

Unnamed: 0,jobid,tags,exename,duration,cpu_time,exitcode
0,680181,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1256.0,58990.0,0
1,629337,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1143.0,48992.0,0
2,633144,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1085.0,47991.0,0
3,696127,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1082.0,47992.0,0
4,692544,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1053.0,47991.0,0
5,693147,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1042.0,46991.0,0
6,627922,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,1037.0,47992.0,0
7,804285,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,588.0,22995.0,0
8,676007,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,569.0,23995.0,0
9,802954,"{'op': 'ncatted', 'op_instance': '15', 'op_sequence': '85'}",ncatted,536.0,21996.0,0


#### <a name="thread-metrics-aggregation">Process contains aggregated thread metrics (ADVANCED TOPIC)</a>

The `pandas` and `dict` formats, in addition to having process-level data in each row, also have fields that represent metrics aggregated across the underlying threads of the process, such as
`rssmax`, `cpu_time`, and `rchar`. The ORM `Process` object instead has a `threads_sums` attribute, 
which is a dictionary containing such fields.

In [33]:
procs.columns.values

array(['updated_at', 'tags', 'exename', 'exitcode', 'info_dict', 'path',
       'id', 'args', 'depth', 'pid', 'jobid', 'numtids', 'ppid', 'start',
       'cpu_time', 'pgid', 'created_at', 'end', 'inclusive_cpu_time',
       'sid', 'duration', 'gen', 'job', 'host', 'parent', 'user', 'rchar',
       'syscr', 'syscw', 'wchar', 'majflt', 'minflt', 'rssmax', 'inblock',
       'outblock', 'usertime', 'processor', 'vol_ctxsw', 'guest_time',
       'read_bytes', 'systemtime', 'time_oncpu', 'timeslices',
       'invol_ctxsw', 'write_bytes', 'time_waiting', 'rdtsc_duration',
       'delayacct_blkio_time', 'cancelled_write_bytes',
       'PERF_COUNT_SW_CPU_CLOCK'], dtype=object)

#### <a name="proc-tree">Process tree and depth (ADVANCED TOPIC)</a>

Every process in a job has a `depth` parameter that denotes its depth
in the process tree, with the root process having a `depth` of `0`.

As constructing the process tree is expensive, we have disabled
automatic creation of the process tree during ingestion by setting `lazy_compute_process_tree`
to `True` in `settings.py`. This means that the `depth` parameter is
ordinarily left unset in the process ORM objects, dataframes or dictionaries. 
The process tree is computed automatically when it is needed, for example
to determine the root process(es) or operation roots.

If for any reason you would like to have the `depth` parameter populated,
you can use the `compute_process_trees` API call:

In [65]:
eq.compute_process_trees(['629337', '625172'])
procs = eq.get_procs(['629337', '625172'], fmt='pandas', order=eq.desc(eq.Process.depth))
procs[['exename', 'pid', 'depth']][:5]

Unnamed: 0,exename,pid,depth
0,sleep,75597,7
1,sleep,59905,7
2,sleep,74417,7
3,sleep,94149,7
4,globus-url-copy,59577,7


Above we compute the process trees for two jobs, and then select their processes ordered by
decreasing depth. As you can see the process trees have a maximum depth of `7`.
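
To make the notion of `depth` concrete, here is a small stand-alone sketch that derives depths from `(pid, ppid)` pairs, using a few of the processes of job `629337` listed earlier in this section. It only illustrates the idea; the function `compute_depths` is hypothetical and says nothing about how `compute_process_trees` is implemented.

```python
# pid -> ppid for a handful of processes from job 629337
parents = {
    16269: 16268,  # tcsh, the root: its parent is outside the job
    31308: 16269,  # perl
    31347: 31308,  # bash
    31350: 31347,  # grep
    31346: 31308,  # which
}

def compute_depths(parents):
    """Return {pid: depth}; a process whose parent is not in the job has depth 0."""
    depths = {}

    def depth_of(pid):
        if pid not in depths:
            ppid = parents[pid]
            depths[pid] = 0 if ppid not in parents else depth_of(ppid) + 1
        return depths[pid]

    for pid in parents:
        depth_of(pid)
    return depths

depths = compute_depths(parents)
print(depths)
```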

## <a name="thread-query">Thread Query</a>

The thread query requires passing one or more *process primary keys* or `Process`
objects to `get_thread_metrics`. Let's illustrate this with an example, where
we first obtain the <a name="root-process">root process</a> of a job:

In [66]:
# let's find the root process for a particular job
root = eq.root('629337', fmt='orm')
root.pid

16269

In [67]:
root_threads_df = eq.get_thread_metrics(root)
display(root_threads_df.columns.values)
root_threads_df[['process_pk', 'tid', 'usertime', 'systemtime', 'rssmax']]

array(['end', 'pid', 'sid', 'tid', 'args', 'path', 'pgid', 'ppid', 'tags',
       'rchar', 'start', 'syscr', 'syscw', 'wchar', 'majflt', 'minflt',
       'rssmax', 'exename', 'inblock', 'numtids', 'exitcode', 'hostname',
       'outblock', 'usertime', 'processor', 'starttime', 'vol_ctxsw',
       'generation', 'guest_time', 'read_bytes', 'systemtime',
       'time_oncpu', 'timeslices', 'invol_ctxsw', 'num_threads',
       'write_bytes', 'time_waiting', 'rdtsc_duration',
       'delayacct_blkio_time', 'cancelled_write_bytes',
       'PERF_COUNT_SW_CPU_CLOCK', 'process_pk'], dtype=object)

Unnamed: 0,process_pk,tid,usertime,systemtime,rssmax
0,19355,16269,454930,352946,5516


## <a name="useful-attributes">Useful attributes in Job, Process and Thread objects</a>

The following are some useful attributes of the job, process and thread objects.
They are accessible when using the `orm` format. They are also available in the 
`pandas` and `dict` formats, with two things to note:

The `proc_sums` field of the Job object is masked in the `pandas` and `dict` formats,
and the underlying keys of the dictionary are exposed at the top level.

The `threads_sums` field of the Process object is masked in the `pandas` and `dict` formats,
and the underlying keys of the dictionary are exposed at the top level.
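
The effect of this masking can be pictured with a short sketch. The `flatten` helper and the record layout are hypothetical; the metric values are those of the single-threaded root process of job `629337` shown in the thread query above.

```python
# A process record as the ORM might expose it: thread aggregates are nested
orm_like = {
    "exename": "tcsh",
    "pid": 16269,
    "threads_sums": {"usertime": 454930, "systemtime": 352946, "rssmax": 5516},
}

def flatten(record, nested_key):
    """Copy a record, lifting the keys of record[nested_key] to the top level."""
    flat = {k: v for k, v in record.items() if k != nested_key}
    flat.update(record.get(nested_key, {}))
    return flat

# In the dict/pandas view the nested field disappears and its keys become columns
row = flatten(orm_like, "threads_sums")
print(sorted(row))
```

The same picture applies to `proc_sums` on a job: the nested dictionary is hidden and its keys appear as ordinary columns.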

### Job Attributes
 - duration: this is the wallclock time in microseconds
 - cpu_time: user+system time aggregated across all processes of the job
 - start:    start time in microseconds since epoch
 - end:      end time in microseconds since epoch
 - jobid:    database id for job (unique)
 - exitcode: return code from job
 - tags:     dict of key/value pairs
 - processes: list of processes belonging to job
 - proc_sums: aggregates across processes of a job
 

### Process Attributes
 - duration: this is the wallclock time in microseconds
 - cpu_time: exclusive user+system time for process (aggregated across its threads)
 - inclusive_cpu_time: user+system time for the process and *all its descendants*
 - start:    start time in microseconds since epoch
 - end:      end time in microseconds since epoch
 - tags:     dict of key/value pairs
 - threads_df: json serialized dataframe of process threads (ADVANCED)
 - threads_sums: key/value pairs consisting of sums of thread metrics (ADVANCED)
 - numtids:  number of threads
 - exename
 - args
 - pid
 - ppid
 - id:       database ID for process
 - exitcode
 - parent
 - children
 - ancestors
 - descendants
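
The difference between `cpu_time` and `inclusive_cpu_time` is easiest to see on a toy tree. The sketch below uses made-up numbers and a hypothetical `inclusive_cpu_time` function; it mirrors the definitions in the list above and is not EPMT's implementation.

```python
# Toy process tree: name -> (exclusive cpu_time, list of children). Values are made up.
tree = {
    "tcsh": (100, ["perl", "cp"]),
    "perl": (400, ["grep"]),
    "grep": (50, []),
    "cp": (250, []),
}

def inclusive_cpu_time(name):
    """Exclusive cpu_time of the process plus that of all its descendants."""
    own, children = tree[name]
    return own + sum(inclusive_cpu_time(child) for child in children)

for name in tree:
    exclusive = tree[name][0]
    print(name, exclusive, inclusive_cpu_time(name))
```

For a leaf process the two values coincide, while for the root the inclusive value covers the whole tree.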
 
 
### Thread Attributes
 - usertime
 - systemtime
 - user+system
 - rssmax
 - majflt
 - read_bytes
 - write_bytes

## <a name="misc-queries">Miscellaneous queries</a>

Below are some more queries to give you a flavor of how to use the API.

In [36]:
# ordinarily we would first find the job and then probe downwards
# You can use tags or fltr arguments to find the job
# As we did not include job tags in this script, let's just find the job using
# its job id
job = eq.get_jobs('676007', fmt='dict')[0]
job

{'duration': 10080732883.0,
 'updated_at': datetime.datetime(2020, 4, 23, 13, 55, 35, 473007),
 'tags': {'atm_res': 'c96l49',
  'ocn_res': '0.5l75',
  'exp_name': 'ESM4_historical_D151',
  'exp_time': '18740101',
  'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101',
  'exp_component': 'ocean_month_rho2_1x1deg'},
 'info_dict': {'tz': 'US/Eastern',
  'status': {'exit_code': 0,
   'exit_reason': 'none',
   'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101',
   'script_path': '/home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags'},
  'post_processed': 1},
 'env_dict': {'PWD': '/vftmp/Jeffrey.Durachta/job676007',
  'TMP': '/vftmp/Jeffrey.Durachta/job676007',
  'EPMT': '/home/Jeffrey.Durachta/workflowDB/build//epmt/epmt',
  'HOME': '/home/Jeffrey.Durachta',
  'HOST': 'pp015',
  'LANG': 'en_US',
  'PATH': '/home/gfdl/bin2:/usr/local/bin:/bin:/u

In [37]:
# now get the processes that are part of this job, sorted by inclusive cpu time
# we need to pass in the job id to restrict the query to a particular job
# inclusive_cpu_time sums the cpu times of the process and all its descendants
# note: inclusive_cpu_time is empty in the output below, presumably because the
# process tree had not yet been computed for this job (see compute_process_trees)
procs = eq.get_procs('676007', order = (eq.desc(eq.Process.inclusive_cpu_time)), fmt='pandas', limit=10)
procs[['exename', 'duration', 'inclusive_cpu_time', 'exitcode']]

Unnamed: 0,exename,duration,inclusive_cpu_time,exitcode
0,perl,2928445.0,,0
1,bash,39033.0,,0
2,grep,9136.0,,0
3,tr,91.0,,0
4,bash,135.0,,0
5,which,4753.0,,0
6,id,708.0,,0
7,id,609.0,,0
8,id,653.0,,0
9,tcsh,10080580000.0,,0


<a name="process-tree-walk"></a>Let's do a walk through the process tree.

In [68]:
# now let's walk through the process tree. To make this easy, we use the 'orm' format
# first we compute the process tree as we intend to walk down the tree
eq.compute_process_trees(['676007'])
# let's sort the processes by exclusive cpu time
# You will get a sorted list of ORM objects, let's see the top 10
procs = eq.get_procs('676007', order = (eq.desc(eq.Process.cpu_time)), fmt='orm')[:10]
[p.pid for p in procs]

[5488, 5218, 3238, 4196, 4560, 4036, 4837, 13027, 29936, 3809]

In [69]:
# let's pick the first one
p = procs[0]

In [70]:
p.exename

'fregrid'

In [71]:
p.exename, p.args, p.duration, len(p.children), p.numtids

('fregrid',
 '--standard_dimension --input_mosaic ocean_mosaic.nc --input_file all --interp_method conserve_order1 --remap_file .fregrid_remap_file_360_by_180.nc --nlon 360 --nlat 180 --scalar_field volcello,thkcello,vo,vmo,vhGM,vhml --output_file out.nc',
 72611586.0,
 0,
 1)

In [73]:
parent = p.parent
parent

Process[26179]

In [74]:
parent.exename, parent.args, parent.pid, len(parent.children), len(parent.descendants)

('tcsh',
 '-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags',
 27339,
 735,
 3380)

In [75]:
# let's see the aggregate thread metrics for this process
p.threads_sums

{'rchar': 15140041925,
 'syscr': 1741380,
 'syscw': 33346,
 'wchar': 2185067697,
 'majflt': 3,
 'minflt': 9946,
 'rssmax': 58112,
 'inblock': 12968064,
 'outblock': 2141984,
 'usertime': 62258535,
 'processor': 0,
 'vol_ctxsw': 355,
 'guest_time': 0,
 'read_bytes': 6639648768,
 'systemtime': 5995088,
 'time_oncpu': 68265620909,
 'timeslices': 536,
 'invol_ctxsw': 180,
 'write_bytes': 1096695808,
 'time_waiting': 15819937,
 'rdtsc_duration': 251077780608,
 'delayacct_blkio_time': 0,
 'cancelled_write_bytes': 0,
 'PERF_COUNT_SW_CPU_CLOCK': 68221670024}

In [76]:
# let's get the thread dataframes for p
eq.get_thread_metrics(p)

Unnamed: 0,end,pid,sid,tid,args,path,pgid,ppid,tags,rchar,...,timeslices,invol_ctxsw,num_threads,write_bytes,time_waiting,rdtsc_duration,delayacct_blkio_time,cancelled_write_bytes,PERF_COUNT_SW_CPU_CLOCK,process_pk
0,1560525488864590,5488,27282,5488,"--standard_dimension --input_mosaic ocean_mosaic.nc --input_file all --interp_method conserve_order1 --remap_file .fregrid_remap_file_360_by_180.nc --nlon 360 --nlat 180 --scalar_field volcello,th...",/home/fms/local/opt/fre-nctools/bronx-14/gfdl/bin/fregrid,27303,27339,op:fregrid;op_instance:7;op_sequence:80,15140041925,...,536,180,0,1096695808,15819937,251077780608,0,0,68221670024,26052


In [77]:
# Let's explore a particular operation in a job, and see which processes took the 
# top *inclusive* cpu time.
# Let's limit the output to the top 5 results
# and let's get a pandas dataframe
procs = eq.get_procs(jobs, tags = 'op_sequence:159', order=eq.desc(eq.Process.inclusive_cpu_time), limit=5, fmt='pandas')
procs[['exename', 'args', 'cpu_time', 'inclusive_cpu_time', 'duration']]

Unnamed: 0,exename,args,cpu_time,inclusive_cpu_time,duration
0,fregrid,--standard_dimension --input_mosaic ocean_mosaic.nc --input_file annual --interp_method conserve_order1 --remap_file .fregrid_remap_file_360_by_180.nc --nlon 360 --nlat 180 --scalar_field volcello...,10237442.0,10237442.0,10219947.0
1,mv,out.nc annual.nc,462929.0,462929.0,456714.0
2,mv,annual.nc ocean_month_rho2_1x1deg.1851.ann.nc,43992.0,43992.0,36877.0


<a name="failed-procs"></a>Let's see if there are any failed processes in our job selection.

In [78]:
# Let's find the failed processes across our jobs subset
failed_procs = eq.get_procs(jobs_orm, fltr=(eq.Process.exitcode > 0), fmt='pandas')
failed_procs[['jobid', 'exename', 'args', 'exitcode', 'tags']]

Unnamed: 0,jobid,exename,args,exitcode,tags
0,676007,tcsh,-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags,1,"{'op': 'hsmget', 'op_instance': '7', 'op_sequence': '25'}"
1,676007,tcsh,-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags,1,"{'op': 'rm', 'op_instance': '16', 'op_sequence': '75'}"
2,676007,tcsh,-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags,1,"{'op': 'rm', 'op_instance': '16', 'op_sequence': '75'}"
3,676007,tcsh,-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags,1,"{'op': 'rm', 'op_instance': '16', 'op_sequence': '75'}"
4,676007,tcsh,-f /home/Jeffrey.Durachta/ESM4/DECK/ESM4_historical_D151/gfdl.ncrc4-intel16-prod-openmp/scripts/postProcess/ESM4_historical_D151_ocean_month_rho2_1x1deg_18740101.tags,1,"{'op': 'rm', 'op_instance': '16', 'op_sequence': '75'}"
...,...,...,...,...,...
1537,629337,which,globus-ftp-client-delete-test,1,"{'op': 'hsmget', 'op_instance': '1', 'op_sequence': '9'}"
1538,629337,which,globus-ftp-client-cksm-test,1,"{'op': 'hsmget', 'op_instance': '3', 'op_sequence': '10'}"
1539,629337,which,globus-ftp-client-mlst-test,1,"{'op': 'hsmget', 'op_instance': '3', 'op_sequence': '10'}"
1540,629337,which,globus-ftp-client-ascii-verbose-list-test,1,"{'op': 'hsmget', 'op_instance': '3', 'op_sequence': '10'}"


Let's focus on a particular operation to prune the list a bit.

In [79]:
failed_procs = eq.get_procs(jobs, tags='op_sequence:79', fltr=(eq.Process.exitcode > 0), fmt='pandas')
failed_procs[['jobid', 'id', 'exename', 'args', 'exitcode']]

Unnamed: 0,jobid,id,exename,args,exitcode
0,676007,25890,which,globus-ftp-client-cksm-test,1
1,676007,25891,which,globus-ftp-client-mlst-test,1
2,676007,25892,which,globus-ftp-client-ascii-verbose-list-test,1
3,676007,25893,which,globus-ftp-client-delete-test,1
4,627922,15654,which,globus-ftp-client-cksm-test,1
5,627922,15655,which,globus-ftp-client-mlst-test,1
6,627922,15656,which,globus-ftp-client-ascii-verbose-list-test,1
7,627922,15657,which,globus-ftp-client-delete-test,1
8,633144,22478,which,globus-ftp-client-cksm-test,1
9,633144,22479,which,globus-ftp-client-mlst-test,1


In [80]:
# let's explore one of the failed processes
p = eq.Process[int(failed_procs.loc[0,'id'])]
p.exename, p.exitcode, p.args, p.duration, p.parent.pid

('which', 1, 'globus-ftp-client-cksm-test', 7951.0, 5308)

### <a name="timeline">Timeline</a>
Sometimes you want a timeline of the processes in the order in which they were spawned.

In [81]:
procs = eq.timeline(jobs, fmt='pandas', limit=25)
procs[['exename', 'tags', 'start', 'duration']]

Unnamed: 0,exename,tags,start,duration
0,tcsh,"{'op': 'dmput', 'op_instance': '2', 'op_sequence': '190'}",2019-06-09 18:53:22.610123,12630590000.0
1,tcsh,{},2019-06-09 18:53:22.614091,113.0
2,mkdir,{},2019-06-09 18:53:22.623899,131.0
3,modulecmd,{},2019-06-09 18:53:22.664680,3656.0
4,test,{},2019-06-09 18:53:22.678745,54.0
5,modulecmd,{},2019-06-09 18:53:22.689498,1551.0
6,test,{},2019-06-09 18:53:22.701312,41.0
7,modulecmd,{},2019-06-09 18:53:22.711901,358694.0
8,perl,{},2019-06-09 18:53:22.745150,15821.0
9,perl,{},2019-06-09 18:53:22.770251,4346.0


In [82]:
# The orm also gives an easy way to navigate the process hierarchy
# Let's use the ORM directly to walk through the job
j = eq.get_jobs('629337', fmt='orm').first()
j

Job['629337']

In [83]:
# Notice we have a Job object. The processes in the job
# are available as j.processes
j.duration, j.cpu_time, j.exitcode, j.tags

(6696039124.0,
 623964730.0,
 0,
 {'atm_res': 'c96l49',
  'ocn_res': '0.5l75',
  'exp_name': 'ESM4_historical_D151',
  'exp_time': '18640101',
  'script_name': 'ESM4_historical_D151_ocean_month_rho2_1x1deg_18640101',
  'exp_component': 'ocean_month_rho2_1x1deg'})

In [84]:
# first we ask for the aggregate metrics for single job
# Here, we don't specify any tags. For single jobs, when
# we don't specify the operation/tags, they are queried from the job
op_sums = eq.op_metrics(jobs='629337', fmt='pandas')
display(op_sums.columns.values)
op_sums[['jobid', 'tags', 'duration', 'cpu_time']]

array(['syscr', 'write_bytes', 'outblock', 'minflt',
       'PERF_COUNT_SW_CPU_CLOCK', 'vol_ctxsw', 'read_bytes',
       'rdtsc_duration', 'time_oncpu', 'guest_time', 'rssmax',
       'delayacct_blkio_time', 'cancelled_write_bytes', 'rchar', 'wchar',
       'inblock', 'timeslices', 'systemtime', 'syscw', 'processor',
       'time_waiting', 'usertime', 'majflt', 'invol_ctxsw', 'job',
       'jobid', 'tags', 'num_procs', 'numtids', 'cpu_time', 'duration'],
      dtype=object)

Unnamed: 0,jobid,tags,duration,cpu_time
0,629337,"{'op': 'cp', 'op_instance': '11', 'op_sequence': '66'}",8055334.0,2078506.0
1,629337,"{'op': 'cp', 'op_instance': '15', 'op_sequence': '79'}",7426402.0,1699557.0
2,629337,"{'op': 'cp', 'op_instance': '3', 'op_sequence': '30'}",8307672.0,2191485.0
3,629337,"{'op': 'cp', 'op_instance': '5', 'op_sequence': '39'}",7696808.0,2229484.0
4,629337,"{'op': 'cp', 'op_instance': '7', 'op_sequence': '48'}",7868858.0,2439456.0
...,...,...,...,...
84,629337,"{'op': 'untar', 'op_instance': '3', 'op_sequence': '38'}",702927.0,623891.0
85,629337,"{'op': 'untar', 'op_instance': '4', 'op_sequence': '47'}",691543.0,620889.0
86,629337,"{'op': 'untar', 'op_instance': '5', 'op_sequence': '56'}",708131.0,629884.0
87,629337,"{'op': 'untar', 'op_instance': '6', 'op_sequence': '65'}",832244.0,629890.0


## <a name="modify-job-metadata">Modifying Job Metadata</a>

Every job stores metadata such as the job name, `jobid`, and `tags`. Most metadata
fields are not designed to be mutable. However, some fields, such
as `annotations`, may be modified using the API.

### <a name="annotate-jobs">Annotate Jobs</a>

Job annotations allow you to store arbitrary key/value pairs
in a persistent manner. They may be retrieved later, modified, and
saved again. There is no semantic meaning associated with annotations
other than what the user ascribes to them. 

In [85]:
eq.get_job_annotations('629337')

{}

In [86]:
eq.annotate_job('629337', {'abc': 100})

{'abc': 100}

In [87]:
eq.get_job_annotations('629337')

{'abc': 100}

In [88]:
eq.annotate_job('629337', {'def': 200})

{'abc': 100, 'def': 200}

In [89]:
eq.get_job_annotations('629337')

{'abc': 100, 'def': 200}

You will notice that `annotate_job` *merges* (or *updates*) the dictionary. 
It doesn't remove existing keys; an existing value changes only when you pass the same key with a new value.

In [90]:
eq.annotate_job('629337', {'abc': 500})

{'abc': 500, 'def': 200}

If you wish to replace the existing annotations completely,
pass `replace=True` to `annotate_job`.

In [91]:
eq.annotate_job('629337', {'my_key': "something new"}, replace = True)

{'my_key': 'something new'}

In [92]:
eq.get_job_annotations('629337')

{'my_key': 'something new'}
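
The merge and replace behaviours shown above are those of an ordinary dictionary update versus an assignment. The following stand-alone sketch models them with a hypothetical `annotate` function operating on a plain dict; it does not touch the database and is not EPMT's implementation.

```python
def annotate(existing, new, replace=False):
    """Return the annotations that result from applying `new` to `existing`."""
    if replace:
        return dict(new)          # discard everything that was there before
    merged = dict(existing)
    merged.update(new)            # add new keys, overwrite keys that already exist
    return merged

ann = annotate({}, {"abc": 100})
ann = annotate(ann, {"def": 200})
merged = annotate(ann, {"abc": 500})   # 'abc' is overwritten, 'def' is kept
replaced = annotate(merged, {"my_key": "something new"}, replace=True)

print(merged)
print(replaced)
```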

To remove all annotations:

In [93]:
eq.remove_job_annotations('629337')

{}

In [94]:
eq.get_job_annotations('629337')

{}

### <a name="analyze-jobs">Analyze Jobs</a>

We have developed API calls to run simple analysis pipelines, such
as outlier detection on jobs. The output from such analyses is 
persisted as a dictionary along with the job in the database, and
can be retrieved later. 

Please note that the current analysis
pipeline is simplistic and will be improved. It is meant solely
for illustrative purposes and to facilitate feedback to spur
subsequent development.

In [None]:
# request list of *all* unanalyzed jobs
eq.get_unanalyzed_jobs()

In [None]:
# usually we care about a subset of recent jobs rather than
# querying the whole database
eq.get_unanalyzed_jobs(['625172', '627922','629337','633144'])

You may also care about a specific analysis, such as outlier detection,
in which case you can pass an `analyses_filter` to define what an 
"analyzed" job means. We will cover this later in an example once we
have some jobs that we have analyzed.

First let's run an analysis pipeline on some jobs:

In [None]:
# here we pass a list of jobs we want to analyze. If you don't pass
# an argument (or pass an empty list), all pending jobs will be analyzed.
# That could take very long, so here we restrict the active set.
num_filters_executed = eq.analyze_pending_jobs(['625172', '627922', '629337', '633144'])

`analyze_pending_jobs` returns the number of analysis methods executed on the
jobs. At present we only have `outlier_detection` in our pipeline, so
`num_filters_executed` will be `1`. If you run the same call again, the 
outlier detection algorithm will not be run again. Now, let's see what 
results came from our analysis.

In [None]:
eq.get_job_analyses('625172')

Since we ran outlier detection on 4 jobs, each of the jobs has the same
analyses saved. We queried one of the jobs, but we could have queried any
of the other jobs and got the same output.

In [None]:
eq.get_job_analyses('627922')

The results suggest that based on each of the features -- `num_procs`,
`duration`, and `cpu_time` -- job `625172` is an outlier, while
the other three jobs `['633144', '629337', '627922']` form an 
equivalent set.

Now if we query the list of unanalyzed jobs, these 4 jobs will be
absent.

In [None]:
eq.get_unanalyzed_jobs()

### <a name="delete-jobs">Deleting Jobs</a>

In [None]:
len(eq.get_jobs(fmt='terse'))

In [None]:
# to delete a single job
# the function returns the number of jobs deleted
eq.delete_jobs('804285')

In [None]:
# to delete multiple jobs, you also need to pass force=True
eq.delete_jobs(['693147','696127'], force=True)

#### <a name="delete-jobs-time-based">Time-based Job Deletion</a>

In [None]:
# to delete all jobs older than 30 days
# Very useful in a cron job!
eq.delete_jobs([], force=True, before=-30)

In [None]:
# to delete jobs that were executed in the last week
eq.delete_jobs([], force=True, after=-7)

In [None]:
# to delete jobs older than a specific date
# delete jobs before Jan 21, 2018 09:55
eq.delete_jobs([], force=True, before='01/21/2018 09:55')
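
The `before` and `after` arguments above take either a negative number of days relative to now or a date string. As a rough mental model only, the sketch below resolves both forms to a `datetime`. The function `resolve_cutoff` and the `%m/%d/%Y %H:%M` format are assumptions inferred from the examples, not EPMT's actual parsing code.

```python
from datetime import datetime, timedelta

def resolve_cutoff(value, now):
    """Map a relative day offset (e.g. -30) or a date string to a datetime."""
    if isinstance(value, (int, float)):
        return now + timedelta(days=value)   # negative values point into the past
    return datetime.strptime(value, "%m/%d/%Y %H:%M")

# a fixed "now" keeps the example reproducible
now = datetime(2020, 4, 23, 12, 0)
print(resolve_cutoff(-30, now))
print(resolve_cutoff("01/21/2018 09:55", now))
```

Under this reading, `before=-30` selects jobs that finished before the first cutoff, and `after=-7` selects jobs newer than a cutoff one week back.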