REST API

Marina Golosova edited this page May 19, 2021 · 26 revisions

The DKB REST API is available (from the CERN network) at: http://api.atlas-dkb.cern.ch:5080/

Methods

Common parameters (for all methods):

  • ?rtype=... -- response format. Allowed values: json (default), img (only for /task/hist);
  • ?pretty -- pretty-print JSON.
JSON response format:
{
  "status": "OK"|"Fail",
  "took_total_ms": <ms>,
  "took_storage_ms": <ms>,
  "data": <method response>,
  "error": <error details>,
  "warning": <warning or list of warnings>,
  "errors": <warning or list of warnings>,
  "exception": <exception name>,
  "details": <exception details>
}

NOTE: not all fields are present at the same time; e.g. if "status" is "Fail", the "data" field is omitted. Conversely, if "status" is "OK", no "error" details are provided (although "warning" or "errors" may still be present).

NOTE: "errors" field will eventually be renamed to "warning".

NOTE: the "exception" field and its "details" appear for unhandled exceptions; some of them are expected (such as "MethodNotFound" here), but if you see a native Python exception, please open an issue.
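The envelope rules above can be handled in one place on the client side. The following is a minimal Python sketch, not part of DKB itself: the names `DKBError` and `unwrap` are hypothetical, and it assumes that a failed response carries its explanation in "error", "exception" and/or "details".

```python
import json


class DKBError(Exception):
    """Raised when the envelope reports a status other than "OK" (hypothetical helper)."""


def unwrap(envelope):
    """Return (data, warnings) from a decoded DKB response envelope.

    Warnings are collected from both "warning" and "errors", because the
    page notes that "errors" will eventually be renamed to "warning".
    """
    warnings = []
    for key in ("warning", "errors"):
        value = envelope.get(key)
        if value is None:
            continue
        # The field holds either a single warning or a list of warnings.
        warnings.extend(value if isinstance(value, list) else [value])

    if envelope.get("status") != "OK":
        # On failure "data" is absent; report whatever details are present.
        parts = [envelope.get(k) for k in ("exception", "error", "details")]
        raise DKBError("; ".join(str(p) for p in parts if p is not None))

    return envelope.get("data"), warnings


# Made-up sample envelope, decoded the same way a real response body would be.
ok = json.loads(
    '{"status": "OK", "took_total_ms": 0, '
    '"data": {"name": "DKB API server"}, "errors": "w1"}'
)
data, warnings = unwrap(ok)
```

Raising on failure keeps the calling code free of repeated status checks, while warnings are returned rather than raised because they accompany otherwise valid data.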

Server info

Method name: /server_info

URL: http://api.atlas-dkb.cern.ch:5080/server_info?pretty

URL (nested demo): http://api.atlas-dkb.cern.ch:5080/nested/server_info?pretty

Parameters: none

Response format:

{
  "status": "OK",
  "took_total_ms": 0,
  "data": {
    "version": "0.3.1",
    "name": "DKB API server"
  }
}

Method/category info

NOTE: currently available for all categories, but not for methods.

Method name: [/path/to/category]/info

URL: http://api.atlas-dkb.cern.ch:5080/info?pretty

URL: http://api.atlas-dkb.cern.ch:5080/task/info?pretty

URL (nested demo): http://api.atlas-dkb.cern.ch:5080/nested/task/info?pretty

Parameters: none

Response format:

{
  "status": "OK",
  "took_total_ms": 0,
  "data": {
    "path": <path/to/category>,
    "methods": ["method_name1", ...],
    "categories": ["subcategory1", ...]
  }
}
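Because every category reports its own methods and subcategories, the whole API tree can be listed by following "categories" recursively. The sketch below is illustrative only: `walk_categories` and `fetch_info` are hypothetical names, the HTTP call is replaced by a made-up in-memory stand-in, and it assumes that a subcategory is reachable at "<parent path>/<subcategory name>".

```python
def walk_categories(fetch_info, path=""):
    """Yield (category_path, method_names) for a category and all subcategories.

    `fetch_info` is a caller-supplied function (hypothetical) that takes a
    category path such as "" or "/task" and returns the "data" part of the
    corresponding ".../info" response.
    """
    info = fetch_info(path)
    yield path or "/", info.get("methods", [])
    for sub in info.get("categories", []):
        # Assumption: a subcategory lives at "<parent path>/<subcategory name>".
        yield from walk_categories(fetch_info, f"{path}/{sub}")


# Made-up stand-in for real HTTP calls, so the sketch runs offline.
fake_server = {
    "": {"path": "/", "methods": ["server_info"], "categories": ["task"]},
    "/task": {"path": "/task", "methods": ["kwsearch", "chain"], "categories": []},
}
listing = list(walk_categories(fake_server.__getitem__))
```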

Tasks keyword search

Method name: /task/kwsearch

URL: http://api.atlas-dkb.cern.ch:5080/task/kwsearch?production=true&analysis=false&kw=data16_13tev&kw=r11969&kw=32501&pretty

URL (nested demo): http://api.atlas-dkb.cern.ch:5080/nested/task/kwsearch?production=true&analysis=false&kw=data16_13tev&kw=r11969&kw=32501&pretty

Parameters:

  • ?kw=...&kw=... -- one or multiple keywords that must be present in the task's metadata:
    • NOTE: if a keyword contains ES wildcard symbols (*, ?), it is checked only against the "taskname" field.
    • NOTE: for task names, plain keywords are split on dots and underscores and each piece is checked against the corresponding pieces of the name; in wildcard keywords the underscore is an ordinary character, not a separator. So ?kw=data16.r11784 is equivalent to ?kw=data16&kw=r11784, and both match the taskname data16_13TeV.00310691.physics_Main.merge.r11969_r11784_p4072, whereas ?kw=data16.*_r11784 matches only tasks whose tags part of the name ends with r11784.
  • ?analysis -- whether analysis tasks should be included (default: true; to exclude: ?analysis=false);
  • ?production -- whether production tasks should be included (default: true; to exclude: ?production=false);
  • ?size=2000 -- number of tasks in response (default: 2000);
  • ?ds_size=20 -- number of datasets in response (default: 20):
    • NOTE: ignored by the /nested variant of the method;
  • ?timeout=120 -- ES query timeout (sec) (default: 120).
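The parameters above, including the repeated kw=... pairs, can be assembled with the Python standard library. This is a sketch under assumptions: `kwsearch_url` is a hypothetical helper, not part of DKB, and its defaults simply mirror the documented ones.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.atlas-dkb.cern.ch:5080"


def kwsearch_url(keywords, production=True, analysis=True, size=2000,
                 ds_size=20, timeout=120, nested=False):
    """Build a /task/kwsearch URL; defaults mirror the documented ones."""
    params = [("kw", kw) for kw in keywords]  # one kw=... pair per keyword
    params += [
        ("production", str(production).lower()),
        ("analysis", str(analysis).lower()),
        ("size", size),
        ("timeout", timeout),
    ]
    if not nested:
        # ds_size is documented as ignored by the /nested variant.
        params.append(("ds_size", ds_size))
    prefix = "/nested" if nested else ""
    return f"{BASE_URL}{prefix}/task/kwsearch?{urlencode(params)}"


url = kwsearch_url(["data16_13tev", "r11969"], analysis=False)
```

Passing a list of pairs to `urlencode` keeps repeated keys and also percent-encodes wildcard symbols such as * and ? inside keywords.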
Response format:
{
  "status": "OK",
  "took_total_ms": 94, 
  "took_storage_ms": 53, 
  "total": 159, 
  "data": [
    {
      "taskname": <task_name>, 
      ...,
      "output_dataset": [
        {"name": <ds_name>, ...},
        ...
      ]
    },
    ...
]}
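For the response layout shown above, a short loop is enough to collect output dataset names per task. This is a sketch with made-up sample values; `dataset_names_by_task` is a hypothetical helper, and it assumes each entry of "data" has a "taskname" and, optionally, an "output_dataset" list whose items have a "name".

```python
def dataset_names_by_task(response):
    """Map each task name to the names of its output datasets."""
    result = {}
    for task in response.get("data", []):
        datasets = task.get("output_dataset", [])
        result[task["taskname"]] = [ds["name"] for ds in datasets]
    return result


# Made-up response fragment following the documented layout.
sample = {
    "status": "OK",
    "total": 1,
    "data": [
        {"taskname": "task_a",
         "output_dataset": [{"name": "ds_1"}, {"name": "ds_2"}]},
    ],
}
names = dataset_names_by_task(sample)
```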

Task chain reconstruction

Method name: /task/chain

URL: http://api.atlas-dkb.cern.ch:5080/task/chain?tid=21774573&pretty

URL (nested demo): http://api.atlas-dkb.cern.ch:5080/nested/task/chain?tid=21774573&pretty

Parameters:

  • ?tid=... -- task ID for which chain will be reconstructed.
Response format:
{
  "status": "OK", 
  "took_total_ms": 5653, 
  "data": {
    <p_tid1>: [<child_tid1>, ...], 
    ...
}}

where the p_tidX keys are IDs of tasks that belong to the chain, and the child_tidY values are the IDs of the given task's successors.

NOTE: the took_storage_ms field has not been left out of this example by mistake; it is not yet provided for this method.
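The "data" object of this response is effectively an adjacency map (task ID to successor IDs), so ordinary graph traversal applies. The sketch below uses made-up IDs; `chain_order` and `chain_roots` are hypothetical helpers. IDs are normalised to strings because JSON object keys are always strings, while the listed successors may be numbers.

```python
from collections import deque


def chain_order(chain, root):
    """Return the task IDs reachable from `root`, breadth-first."""
    graph = {str(tid): [str(c) for c in kids] for tid, kids in chain.items()}
    seen, order, queue = {str(root)}, [], deque([str(root)])
    while queue:
        tid = queue.popleft()
        order.append(tid)
        for child in graph.get(tid, []):
            if child not in seen:  # guard against visiting a task twice
                seen.add(child)
                queue.append(child)
    return order


def chain_roots(chain):
    """Task IDs that never appear as anyone's successor."""
    children = {str(c) for kids in chain.values() for c in kids}
    return sorted(str(t) for t in chain if str(t) not in children)


sample = {"1": [2, 3], "2": [4], "3": [], "4": []}  # made-up IDs
order = chain_order(sample, 1)
roots = chain_roots(sample)
```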

Tasks distribution over time (by steps)

Method name: /task/hist

URL: http://api.atlas-dkb.cern.ch:5080/task/hist?htags=mc16e_cp&pretty

URL (img): http://api.atlas-dkb.cern.ch:5080/task/hist?htags=mc16e_cp&rtype=img

URL (demo nested): http://api.atlas-dkb.cern.ch:5080/nested/task/hist?htags=returnofrpvllreprocessingdata16

Parameters:

  • ?rtype=... -- response format: JSON (json) or PNG image (img);
  • ?start=... -- left border of time interval (supported formats: YYYY-MM-DD, YYYY-MM-DDTHH:MM:SS);
  • ?end=... -- right border of time interval (supported formats: YYYY-MM-DD, YYYY-MM-DDTHH:MM:SS);
  • ?detailed -- do not join * Merge steps into a single Merge step (default: false);
  • ?bins=... -- number of bins in histogram (default: number of data points, but not more than 400).
Response format:
{
  "status": "OK",
  "took_total_ms": 22, 
  "took_storage_ms": 12, 
  "total": 159, 
  "data": {
    "data": {
      "x": [
        [<x1_1_YYYY-MM-DD>, ...],
        [<x2_1_YYYY-MM-DD>, ...],
        ...
      ],
      "y": [
        [<y1_1_NN>, ...],
        [<y2_1_NN>, ...],
        ...
      ]
    },
    "legend": [<series1_name>, <series2_name>, ...]
}}
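The parallel "x", "y" and "legend" arrays are easier to work with once they are regrouped per series. A minimal sketch with made-up values follows; `hist_series` is a hypothetical helper, and it assumes that legend[i] names the series stored in x[i] and y[i].

```python
def hist_series(response):
    """Return {series_name: [(x, y), ...]} from a /task/hist JSON response."""
    payload = response["data"]
    xs, ys = payload["data"]["x"], payload["data"]["y"]
    # Assumption: legend[i] names the series stored in x[i] and y[i].
    return {
        name: list(zip(x, y))
        for name, x, y in zip(payload["legend"], xs, ys)
    }


# Made-up response fragment following the documented layout.
sample = {
    "status": "OK",
    "data": {
        "data": {
            "x": [["2021-01-01", "2021-01-02"], ["2021-01-01"]],
            "y": [[3, 5], [7]],
        },
        "legend": ["Simul", "Merge"],
    },
}
series = hist_series(sample)
```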

Derivation efficiency

Method name: /task/deriv

URL: http://api.atlas-dkb.cern.ch:5080/task/deriv?project=mc16_13TeV&amitag=r11748&pretty

URL (demo nested): http://api.atlas-dkb.cern.ch:5080/nested/task/deriv?project=mc16_13TeV&amitag=r11748&pretty

Parameters:

  • ?amitag=...&amitag=... -- tasks selection parameter: one or multiple current (last) AMI tag(s) (ctag field value);
  • ?project=... -- tasks selection parameter: project name.
NOTE: only tasks with primary_input's data format AOD and RPVLL are selected for statistics calculation.

Response format:

{
  "status": "OK", 
  "took_total_ms": 240, 
  "data": [
    {
      "task_ids": [<tid_1>, ...], 
      "output": <output data format>, 
      "tasks": <N tasks>, 
      "ratio": <output/input datasets size ratio>, 
      "events_ratio": <output/input datasets events number ratio>
    }, 
    ...
]}

NOTE: the took_storage_ms field has not been left out of this example by mistake; it is not yet provided for this method.
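One way to read this response is to rank output formats by their size ratio. The sketch below uses made-up rows; `rank_by_size_ratio` is a hypothetical helper that takes the "data" list, and it assumes "ratio" is numeric, skipping rows where it is missing or null.

```python
def rank_by_size_ratio(rows):
    """Return (output format, size ratio) pairs, smallest ratio first."""
    usable = [r for r in rows if r.get("ratio") is not None]
    return [(r["output"], r["ratio"])
            for r in sorted(usable, key=lambda r: r["ratio"])]


# Made-up rows following the documented layout of the "data" list.
sample_rows = [
    {"output": "FMT_A", "tasks": 3, "ratio": 0.5},
    {"output": "FMT_B", "tasks": 1, "ratio": 0.1},
    {"output": "FMT_C", "tasks": 2, "ratio": None},
]
ranking = rank_by_size_ratio(sample_rows)
```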

Campaign statistics

Method name: /campaign/stat

URL: http://api.atlas-dkb.cern.ch:5080/campaign/stat?pretty&htag=returnofrpvllreprocessingdata16&step_type=ctag_format&events_src=task

URL (demo nested): http://api.atlas-dkb.cern.ch:5080/nested/campaign/stat?pretty&htag=returnofrpvllreprocessingdata16&step_type=ctag_format&events_src=task

Parameters:

  • ?step_type=step -- step definition type (default: step):
    • step for MC steps,
    • ctag_format for steps defined by task's ctag field (current AMI tag) plus output dataset data format;
  • ?events_src=ds -- way to calculate number of output events (default: ds):
    • ds -- number of events in output dataset,
    • task -- number of processed events of tasks in state: done, finished,
    • all -- provide all possible values as hash (output/input events ratio in response will be null);
  • ?<param>=<val>&<param>=... -- task selection parameters:
    • <param> -- task attribute name according to mapping (project, ctag, taskid, pr_id, ...),
    • <val> -- parameter value:
      • exact field value;
      • exact field value with logical prefix:
        • & -- field must have this value (use %26 or urlencode to pass this symbol) (not implemented);
        • | -- field must have one of values marked with this prefix (default);
        • ! -- field must not have this value.
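The selection parameters and their logical prefixes can be encoded safely with `urllib.parse.urlencode`, which percent-encodes "|", "!" and "&" (the latter as %26, as required above). This is a sketch only: `selection_query` is a hypothetical helper, and the tag values are examples rather than a recommended query.

```python
from urllib.parse import urlencode


def selection_query(**selection):
    """Encode task selection parameters, one <param>=<val> pair per value.

    A value may be a single string or a list of strings, each optionally
    starting with one of the documented logical prefixes.
    """
    pairs = []
    for param, values in selection.items():
        if isinstance(values, str):
            values = [values]
        pairs.extend((param, v) for v in values)
    return urlencode(pairs)


query = selection_query(project="mc16_13TeV", ctag=["|r11748", "!r11969"])
```

The resulting string can be appended to the method URL after "?", alongside parameters such as step_type.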
Response format:
{
  "status": "OK", 
  "took_total_ms": 182, 
  "took_storage_ms": 171,
  "total": <number of matched documents>, 
  "data": {
    "last_update": <last registered task timestamp>,
    "date_format": <datetime format>,
    "tasks_processing_summary": {
      <step>: {
        <status>: <N tasks>, ...,
        "start": <earliest start time>,
        "end": <latest end time>
      },
      ...
    },
    "overall_events_processing_summary": {
      <step>: {
        "input": <N events>,
        "output": <N events>,
        "ratio": <output>/<input> /* null if 'events_src' is 'all' */
      },
      ...
    },
    "tasks_updated_24h": {
      <step>: {
        <status>: {
          "total": <N tasks>,
          "updated": <N tasks>
        },
        ...
      },
      ...
    },
    "events_daily_progress": {
      <step>: {
        <date>: <N events processed by tasks finished at the <date> day>,
        ...
      },
      ...
}}}
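As an example of reading this structure, the per-status counts in "tasks_processing_summary" can be summed per step. The sketch uses made-up values; `tasks_per_step` is a hypothetical helper that takes the "data" object and skips "start" and "end", which hold timestamps rather than counts.

```python
def tasks_per_step(stat_data):
    """Sum task counts over all statuses for every step."""
    totals = {}
    for step, summary in stat_data["tasks_processing_summary"].items():
        totals[step] = sum(
            count for status, count in summary.items()
            if status not in ("start", "end")  # timestamps, not counts
        )
    return totals


# Made-up "data" fragment following the documented layout.
sample_data = {
    "tasks_processing_summary": {
        "Simul": {"done": 5, "running": 2,
                  "start": "2021-01-01", "end": "2021-02-01"},
        "Merge": {"finished": 1, "start": None, "end": None},
    }
}
totals = tasks_per_step(sample_data)
```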

Steps statistics

Method name: /step/stat

URL: http://api.atlas-dkb.cern.ch:5080/step/stat?pretty&htag=returnofrpvllreprocessingdata16&step_type=ctag_format

URL (demo nested): http://api.atlas-dkb.cern.ch:5080/nested/step/stat?pretty&htag=returnofrpvllreprocessingdata16&step_type=ctag_format

Parameters:

  • ?step_type=step -- step definition type (default: step):
    • step for MC steps,
    • ctag_format for steps defined by task's ctag field (current AMI tag) plus output dataset data format;
  • ?<param>=<val>&<param>=... -- task selection parameters:
    • <param> -- task attribute name according to mapping (project, ctag, taskid, pr_id, ...),
    • <val> -- parameter value:
      • exact field value;
      • exact field value with logical prefix:
        • & -- field must have this value (use %26 or urlencode to pass this symbol) (not implemented);
        • | -- field must have one of values marked with this prefix (default);
        • ! -- field must not have this value.
Response format:
{
  "status": "OK", 
  "warning": "Formats 'DAOD', 'DRAW' are excluded from the statistics.", 
  "took_total_ms": 3706, 
  "took_storage_ms": 3592, 
  "total": <number of matched documents>, 
  "data": [
    {
      "name": <step name (MC step or "FORMAT:ctag")>,
      "step_status": <"Unknown"|"StepDone"|"StepProgressing"|"StepNotStarted">,
      "percent_done": <% of already processed events (in <input_events>); 100% only if _all_ events are processed>,
      "percent_running": <% of not yet processed events for tasks in state: "running" (in <input_events>)>,
      "percent_pending": <% of not processed and not running events (in <input_events>)>,
      "duration": <N days>,
      "total_tasks": <N tasks>,
      "finished_tasks": <N tasks in state: "done", "finished">,
      "total_events": <sum of total_events field>,
      "processed_events": <N processed events (total_events for MC steps, processed_events for ctag_format steps)>,
      "input_bytes": <size of primary input datasets>,
      "input_events": <N input_events>,
      "finished_bytes": <total size of primary input datasets for tasks in state: "done", "finished">,
      "output_bytes": <size of output datasets>,
      "hs06": <CPU usage>,
      "cpu_failed": <hs06_failed>,
      "input_not_removed_tasks": <N tasks whose primary input has not yet been removed (indicates how precise the numbers based on DS metadata are)>,
      "output_not_removed_tasks": <N tasks whose output dataset has not yet been removed (indicates how precise the numbers based on DS metadata are)>
    },
    ...
]}
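Each entry of "data" can be condensed into a one-line progress summary. The sketch below uses made-up values; `summarize_step` is a hypothetical helper that relies only on the "name", "step_status", "total_tasks" and "finished_tasks" fields described above.

```python
def summarize_step(step):
    """Return a short human-readable progress line for one step entry."""
    total = step["total_tasks"]
    # Avoid division by zero for steps without any tasks.
    fraction = step["finished_tasks"] / total if total else 0.0
    return (f'{step["name"]}: {step["step_status"]}, '
            f'{step["finished_tasks"]}/{total} tasks finished ({fraction:.0%})')


# Made-up step entry following the documented layout.
sample_step = {
    "name": "Simul",
    "step_status": "StepProgressing",
    "total_tasks": 4,
    "finished_tasks": 1,
}
line = summarize_step(sample_step)
```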
