Auto Queue Configuration with AGIS
Docs: https://docs.google.com/document/d/1E97LoQNWgukV0dUatCbIOxnZROJjMPV36LLwt6t0-90/edit?usp=sharing
Slides: https://docs.google.com/presentation/d/1dt2Fe2pkN-3F3xYJJ-HBVZyCCrvmKzOGFiXVky86VFc/edit?usp=sharing
See docs.
Configuration on Harvester
To make harvester work with auto queue configuration, one needs some lines in harvester.cfg:
[qconf]
configFromCacher = True
queueList =
    DYNAMIC
resolverModule = pandaharvester.harvestermisc.info_utils
resolverClass = PandaQueuesDict

[cacher]
data =
    ...
    panda_queues.json||(URL of remote schedconfig JSON)
    queues_config_file||(URL of remote queueconfig JSON)
On CERN_central_A,B these URLs are:
- http://atlas-agis-api.cern.ch/request/pandaqueue/query/list/?json&preset=schedconf.all&vo_name=atlas
- https://raw.githubusercontent.com/PanDAWMS/harvester_configurations/master/GRID/common_grid_queueconfig_template.json
One can skip the "queues_config_file" line if no remote queueconfig is needed.
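As a rough illustration of how these settings are structured, the sketch below parses the same fragment with Python's standard configparser. It is not Harvester's own loading code. It assumes that the continuation lines are indented and that each cacher entry has the form "main key|sub key|URL", so that "||" means an empty sub key. The URLs are the CERN_central ones listed above.

```python
import configparser

# Sample harvester.cfg fragment. Continuation lines are indented so that
# configparser reads them as part of a multi-line value.
SAMPLE_CFG = """
[qconf]
configFromCacher = True
queueList =
    DYNAMIC
resolverModule = pandaharvester.harvestermisc.info_utils
resolverClass = PandaQueuesDict

[cacher]
data =
    panda_queues.json||http://atlas-agis-api.cern.ch/request/pandaqueue/query/list/?json&preset=schedconf.all&vo_name=atlas
    queues_config_file||https://raw.githubusercontent.com/PanDAWMS/harvester_configurations/master/GRID/common_grid_queueconfig_template.json
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CFG)

use_cacher = parser.getboolean("qconf", "configFromCacher")
queue_list = parser.get("qconf", "queueList").split()

# Assumption: each entry is "<main key>|<sub key>|<URL>".
cacher_entries = {}
for line in parser.get("cacher", "data").splitlines():
    line = line.strip()
    if not line:
        continue
    main_key, sub_key, url = line.split("|", 2)
    cacher_entries[main_key] = url

print(use_cacher, queue_list, sorted(cacher_entries))
```

Dropping the queues_config_file line from the sample leaves only the schedconfig entry in cacher_entries, which corresponds to the case where no remote queueconfig is used.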
In Harvester queue configurations, an object (~JSON object) can be either a queue or a template.
- Queue: A queue (or configuration of a queue) corresponds to the name of a real PanDA queue that Harvester works for. One can set a template of the queue in order to inherit all attributes (parameters) and values written in the template.
- Template: An abstract template of queue configuration meant to be reused in queues. Harvester does not store a template in DB and does not submit workers for a template.
- The object is a template if its name (key) ends with "_TEMPLATE" or it has the attribute isTemplateQueue set to True. Otherwise, the object is a queue.
- A queue written in a local or remote file takes a template by setting its attribute templateQueueName to the name of the template.
- A queue set on AGIS takes a template whose name comes from the PQ field harvester_template, or a default template name <type>.<workflow> (e.g. production.push) if the field harvester_template is blank.
- Queue and template are mutually exclusive. A queue cannot be a template at the same time, and vice versa.
- A queue is invalid if its templateQueueName is set to a non-existing template or to another queue. Harvester ignores invalid queues.
- Nested templates are not allowed (and not possible) under the rules above.
There are two types of attributes in harvester queue configurations:
- Generic attributes: for general setup of the PQ, e.g. maxWorkers, mapType, etc.
- Plugin attributes: for a particular harvester plugin of the PQ. They contain sub-keys and sub-values, e.g. monitor, submitter, etc., and common (which applies to all plugins).
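To make the distinction concrete, here is a small sketch of a queue configuration as a Python dict, split into the two attribute types. The sub-keys and values are placeholders rather than real plugin settings. Treating every dict-valued attribute as a plugin attribute is an assumption made for this illustration.

```python
# Hypothetical queue configuration; all values are placeholders.
queue_conf = {
    "maxWorkers": 10,
    "mapType": "NoJob",
    "common": {"exampleCommonKey": "value"},
    "monitor": {"name": "ExampleMonitor", "module": "example.monitor_module"},
    "submitter": {"name": "ExampleSubmitter", "module": "example.submitter_module"},
}

# Assumption: attributes holding sub-keys (dicts) are the plugin attributes,
# everything else is a generic attribute.
generic_attrs = {k: v for k, v in queue_conf.items() if not isinstance(v, dict)}
plugin_attrs = {k: v for k, v in queue_conf.items() if isinstance(v, dict)}

print("generic:", sorted(generic_attrs))
print("plugin:", sorted(plugin_attrs))
```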
Queue configurations can come from three kinds of sources:
- Local: Static JSON describing queue and/or template in a local file on harvester, say panda_queueconfig.json (filename defined in harvester.cfg)
- Remote: Static JSON describing queue and/or template in a remote file shared with an HTTP URL (URL and related setup defined in harvester.cfg; see Configuration on Harvester)
- Dynamic: Queues (only queues, no templates) generated according to information on AGIS (related setup defined in harvester.cfg; see Configuration on Harvester)
Abbreviations used below:
- LT: Local template, written in local queueconfig file (panda_queueconfig.json)
- RT: Remote template, on http source fetched by cacher (e.g. on GitHub)
- FT: Final template derived from RT and LT.
- LQ: Local queue configuration, written in local queueconfig file (panda_queueconfig.json)
- RQ: Remote queue configuration, on http source fetched by cacher, static (e.g. on GitHub)
- DQ: Dynamic queue configuration, configured with information from resolver (e.g. coming from AGIS)
- FQ: Final queue configuration of a PanDA queue derived from RQ, DQ, and LQ.
Priority among sources (highest first):
- Templates: LT > RT
- Queues: LQ > DQ > RQ
This priority rule for templates/queues with the same name from multiple sources is applied in the steps below.
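The priority order can be written down as a simple lookup. The following is a minimal sketch with hypothetical names. It only picks the highest-priority source present for one name, which matches how a template is chosen. Queue configurations from several sources are combined rather than picked.

```python
# Highest priority first, as listed above.
TEMPLATE_PRIORITY = ("LT", "RT")
QUEUE_PRIORITY = ("LQ", "DQ", "RQ")


def highest_priority(candidates, order):
    """Return (source, config) for the first source in `order` that is
    present in `candidates`, or None if no source provides the name."""
    for source in order:
        if source in candidates:
            return source, candidates[source]
    return None


# A template defined both remotely and locally: the local one wins.
same_name_templates = {"RT": {"maxWorkers": 10}, "LT": {"maxWorkers": 5}}
print(highest_priority(same_name_templates, TEMPLATE_PRIORITY))

# A queue known only from the remote file and from AGIS: DQ outranks RQ.
same_name_queues = {"RQ": {"maxWorkers": 100}, "DQ": {"maxWorkers": 50}}
print(highest_priority(same_name_queues, QUEUE_PRIORITY))
```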
...to be continued...
The following rules explain how a configuration A is "updated" with another configuration B:
- For generic attributes in B but not in A: Add the attribute/value of B to A
- For generic attributes in both: Take the value of the same attribute in B
- For plugin attributes in B but not in A: Add the attribute and all keys/values of this attribute of B to A
- For plugin attributes in both: "Update" the attribute with B. That is, for all keys/values in the attribute of B, add the key/value to the attributes of A if the key does not exist in A's, or take the value of B's for the key if the key exists in A's.
- Some special attributes (say isTemplateQueue, templateQueueName) are handled separately and are not included in the update process (i.e. skipped).
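The update rules above amount to a merge that goes one level deep. A minimal sketch follows, assuming that plugin attributes are the dict-valued ones and that the special attributes are just the two named above (the real list may be longer). The function name update_config and the sub-keys in the example are hypothetical.

```python
import copy

# Assumption: only these two attributes are special; the source names them as examples.
SPECIAL_ATTRIBUTES = {"isTemplateQueue", "templateQueueName"}


def update_config(a, b):
    """Return a copy of configuration `a` updated with configuration `b`."""
    result = copy.deepcopy(a)
    for key, value in b.items():
        if key in SPECIAL_ATTRIBUTES:
            continue  # handled separately, not part of the update
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Plugin attribute present in both: merge key by key, B wins.
            result[key].update(copy.deepcopy(value))
        else:
            # Generic attribute, or plugin attribute only in B: take B's value.
            result[key] = copy.deepcopy(value)
    return result


a = {"maxWorkers": 10, "submitter": {"name": "ExampleSubmitter", "exampleOption": 1}}
b = {
    "maxWorkers": 20,
    "mapType": "NoJob",
    "submitter": {"exampleOption": 8},
    "monitor": {"name": "ExampleMonitor"},
    "templateQueueName": "SOME_TEMPLATE",
}
merged = update_config(a, b)
print(merged)
```

In this example the submitter keeps its name from A and takes exampleOption from B, while templateQueueName from B does not appear in the result.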
1. Collect configurations from all sources:
   - Get RTs and RQs from the remote resource (e.g. GitHub, HTTP URL)
   - Get LTs and LQs from the local queueconfig file
   - Get DQs (only the queue name, its template, and associated parameters) from AGIS
2. Generate final templates (FTs) by the rules:
   - If an RT and an LT have the same name, only the LT is added to FTs (following the priority rule).
   - Otherwise, all RTs and LTs whose names are not duplicated are added to FTs.
   - That is, for any specific template name, FT = LT if LT exists else RT.
3. Determine the template of each queue among all queues (RQ, DQ, LQ). Rules:
   - The template name for a queue is taken from the highest-priority queue, among the RQ, DQ, and LQ with the same name, that takes a template (following the priority rule).
   - If none of the queues with the same name takes a template, there is no template for this queue name.
   - That is, for any specific queue name, its template name is defined by: LQ if LQ exists and LQ takes a template, else (DQ if DQ exists and DQ takes a template, else RQ).
4. Generate the configuration of each queue via these steps:
   0. Start from an empty configuration object (say a JSON object {}).
   1. If the queue takes a template (decided in 3. above), update the configuration object with the configuration of the template, following the update rules. If the queue takes an invalid template (not in FTs), the queue is skipped/unavailable in harvester. If no template is taken, skip this step.
   2. If RQ exists, update the configuration object with RQ.
   3. If DQ exists, update the configuration object with DQ (only the associated parameters count here).
   4. If LQ exists, update the configuration object with LQ.
   5. The configuration object is now the FQ. In short, FQ = (template defined among RQ, DQ, LQ) updated with RQ, next updated with DQ, then updated with LQ.
   6. Run sanity checks and additional adjustments on the FQ. If the FQ is found invalid (e.g. missing mandatory attributes like submitter), the queue is skipped/unavailable in harvester.
   7. If the FQ survives, it is written to the harvester DB and harvester submits workers for it.
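The whole procedure can be sketched end to end in a few dozen lines of Python. This is an illustrative outline rather than Harvester's actual code. It assumes that every source is already loaded as a dict mapping names to configurations, that a DQ expresses its template through templateQueueName just like the other queues, and that the only sanity check is the presence of submitter. All function names, queue names and values are hypothetical.

```python
import copy

SPECIAL_ATTRIBUTES = {"isTemplateQueue", "templateQueueName"}  # assumed list


def update(target, source):
    """Update `target` in place with `source`, following the update rules."""
    for key, value in source.items():
        if key in SPECIAL_ATTRIBUTES:
            continue
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            target[key].update(copy.deepcopy(value))
        else:
            target[key] = copy.deepcopy(value)


def build_final_queues(rt, lt, rq, dq, lq):
    # Step 2: FT = LT if LT exists else RT.
    ft = dict(rt)
    ft.update(lt)

    final_queues = {}
    for name in sorted(set(rq) | set(dq) | set(lq)):
        # Step 3: template name from the highest-priority queue taking one.
        template_name = None
        for source in (lq, dq, rq):
            conf = source.get(name)
            if conf and conf.get("templateQueueName"):
                template_name = conf["templateQueueName"]
                break

        # Step 4: template first, then RQ, DQ, LQ in increasing priority.
        fq = {}
        if template_name is not None:
            if template_name not in ft:
                continue  # invalid template: queue is skipped
            update(fq, ft[template_name])
        for source in (rq, dq, lq):
            if name in source:
                update(fq, source[name])

        # Placeholder sanity check: a mandatory attribute must be present.
        if "submitter" not in fq:
            continue
        final_queues[name] = fq
    return final_queues


rt = {"PROD_TEMPLATE": {"maxWorkers": 10, "submitter": {"name": "RemoteSubmitter"}}}
lt = {"PROD_TEMPLATE": {"maxWorkers": 5, "submitter": {"name": "LocalSubmitter"}}}
rq = {"SITE_A": {"maxWorkers": 100}}
dq = {
    "SITE_A": {"templateQueueName": "PROD_TEMPLATE", "maxWorkers": 50},
    "SITE_B": {"templateQueueName": "MISSING_TEMPLATE"},
}
lq = {"SITE_A": {"submitter": {"exampleOption": 1}}}

final = build_final_queues(rt, lt, rq, dq, lq)
print(final)
```

In this example SITE_A inherits from the local version of PROD_TEMPLATE, takes maxWorkers from the DQ (which overrides the RQ), and gains an extra submitter key from the LQ. SITE_B is dropped because its template is not among the final templates.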
See slides for some examples.
See docs.
...to be continued