WMAgent: continuous update of the pileup availability #11619
Comments
@amaltaro I would like to work on this issue, but I need more information about MSPileup. I read the wiki page on MSPileup, but I still don't really understand what MSPileup does. Is there another page where I can get more information? Would implementing this be another MSPileup task?
I am inclined to say that this ticket is no longer relevant. Instead of writing a new algorithm to resolve the data location through Rucio, I think we should rely solely on MSPileup information to keep the pileup location up to date across the WM system.
Among the more detailed tasks to be performed and/or implemented in this component are:
Observation:
The combinatorics of this process is a function of: one MSPileup call + number of pileups
UPDATE: other questions that were asked/answered during our conversation (Valentin and Alan):
The location of the workflow sandboxes is defined by the WorkQueueManager configuration, e.g.:
Module that creates the sandbox:
Apparently, the sandbox location can be defined as:
Impact of the new feature
MSPileup
Is your feature request related to a problem? Please describe.
The library implemented in this ticket will have to be incorporated into this new WMComponent: #11733
Due to the huge size of the Premix pileup datasets held on disk, we are forced to start supporting a mechanism for partial data placement for them. Currently they are held mostly at CERN and FNAL. A typical size is of the order of 0.5-1 PB.
In order to implement this, we need to allow MSPileup to be configured to place only a fraction of the full dataset on disk. That means removing the constraint that a full container is locked under the same RSE, and supporting Rucio dataset-based data placement. This fraction depends only on the minimally allowed ratio (R) between the number of unique events in the Premix pileup and the number of events processed by the workflow. Currently the commonly accepted ratio is 0.5, and this number should be provided by PPD. It should also be dynamic, meaning we should be able to alter it during a campaign's lifetime.
Describe the solution you'd like
UPDATE: The new WMAgent component (WorkflowUpdater) will contact MSPileup to fetch an up-to-date location for each pileup that is currently active in the agent (aka, which is requested by an active workflow).
If there are differences between the pileup configuration in the sandbox and the pileup configuration/location in MSPileup (which is live-ish data), then this component is supposed to update the workflow sandbox (including the sandbox tarball).
More in-depth details are provided in the following comment: #11619 (comment)
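The comparison between the sandbox and MSPileup described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the WorkflowUpdater implementation: the function name `pileups_needing_update`, the shape of `sandbox_locations` (pileup name mapped to a list of RSEs) and the document keys `pileupName` and `currentRSEs` are assumed here for the example.

```python
def pileups_needing_update(sandbox_locations, mspileup_docs):
    """
    Compare the pileup locations stored in a sandbox against MSPileup data.

    sandbox_locations: dict of pileup name -> list of RSEs (assumed shape)
    mspileup_docs: list of dicts with "pileupName" and "currentRSEs" keys
                   (assumed shape of the MSPileup response)
    Returns a dict of pileup name -> sorted list of RSEs, containing only
    the pileups whose location differs from what the sandbox holds.
    """
    changed = {}
    for doc in mspileup_docs:
        name = doc["pileupName"]
        if name not in sandbox_locations:
            # pileup is not requested by any active workflow in this agent
            continue
        live_rses = set(doc["currentRSEs"])
        if live_rses != set(sandbox_locations[name]):
            changed[name] = sorted(live_rses)
    return changed


# Hypothetical data, for illustration only
sandbox = {
    "/A/B/PREMIX": ["T1_US_FNAL_Disk", "T2_CH_CERN"],
    "/C/D/PREMIX": ["T2_CH_CERN"],
}
docs = [
    {"pileupName": "/A/B/PREMIX", "currentRSEs": ["T2_CH_CERN"]},
    {"pileupName": "/C/D/PREMIX", "currentRSEs": ["T2_CH_CERN"]},
    {"pileupName": "/E/F/PREMIX", "currentRSEs": ["T1_US_FNAL_Disk"]},
]
updates = pileups_needing_update(sandbox, docs)
```

Only the pileups returned by such a comparison would require the sandbox (and its tarball) to be rewritten, so unchanged workflows are left untouched.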
OLD description: Create a module/function that will resolve the pileup data location based on the Rucio dataset names. The high-level logic is:
OK
need to be logged
Potential input parameters: pileup name (str), rucio account (str), rucio auth url (str), rucio url (str)
Given that this will have to resolve every single Rucio dataset in a container, we need to be able to run concurrent requests so that the resolution completes in a timely manner.
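One possible way to run those per-dataset lookups concurrently is a thread pool, sketched below. This is a minimal sketch, not the implementation chosen in this ticket: `resolve_locations` and the `resolve_one` callable are hypothetical names, and `resolve_one` stands in for whatever function actually queries Rucio for a single dataset.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def resolve_locations(dataset_names, resolve_one, max_workers=10):
    """
    Resolve the location of many datasets concurrently.

    dataset_names: iterable of Rucio dataset names
    resolve_one: callable taking one dataset name and returning its
                 list of RSEs (hypothetical; it would wrap the Rucio call)
    Returns (locations, failures): dataset -> RSE list, and
    dataset -> error message for the lookups that raised.
    """
    locations, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(resolve_one, name): name for name in dataset_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                locations[name] = future.result()
            except Exception as exc:
                # failures are collected so that the caller can log them
                failures[name] = str(exc)
    return locations, failures


# Stand-in resolver used only to exercise the sketch
def fake_resolver(name):
    if name.endswith("#bad"):
        raise RuntimeError("lookup failed")
    return ["T2_CH_CERN"]


found, failed = resolve_locations(["/A/B/C#1", "/A/B/C#2", "/A/B/C#bad"], fake_resolver)
```

Threads fit this case because each lookup is dominated by network wait; `max_workers` would need tuning against whatever rate the Rucio server tolerates.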
Describe alternatives you've considered
None
Additional context
This is part of the meta issue: #11537