Split Deletable Blocks checks in two stages. #11127
Conversation
Jenkins results:
And here is the output from the
As of the output from the FYI @germanfgv @amaltaro @hufnagel @drkovalskyi [1]
[2]
[3]
[4]
[5]
Force-pushed from 929ce18 to 8cc2ac6.
Jenkins results:
And here is the new state:
Full set of completed blocks:
Fully completed workflows with dependent workflows NOT completed:
As expected, both Express and Repack are present. Full set of truly Deletable Workflows:
Only
FYI @amaltaro @germanfgv
Force-pushed from 8cc2ac6 to 933fb2d.
Jenkins results:
We now have the following replay status:
Unfortunately, we found the Repack workflow being archived much earlier than we thought, because we were running with those two patches in place: one targeting early archival and late CouchCleanup, while the other targeted late archival and early block-level deletions.
But this affected only the visibility. Full set of completed workflows:
All Completed blocks + workflows producing them - ready for deletion:
And the final list of
@todor-ivanov some of this review was done live on Slack, but please clean those pylint and pep8/pycodestyle issues in the new modules. You can also find a few comments along the code. Thanks
@todor-ivanov As you may remember, we need a separate config parameter to control how long we keep the block on disk. I think this should be a straightforward change. Let me know if you have any issues with this.
Force-pushed from a1929e8 to 2c9f770.
Force-pushed from 2c9f770 to 4cfc184.
Jenkins results:
Hi @amaltaro, I requested your review yet again, but it was kind of early. I am also working on the pylint tests, and one last configuration parameter @germanfgv was mentioning is still needed. But please take a quick look and check if what I have done for changing the DAO parsing method looks sound.
Jenkins results:
Force-pushed from b5fbac0 to bc00367.
Jenkins results:
@amaltaro please go ahead and proceed with your review.
Jenkins results:
This is looking good, Todor. However, I have many comments and questions that might need further follow-up. Please also have a look at the usual Jenkins report; there are a few minor things that you should take into account for the new modules.
AND dbsbuffer_dataset_subscription.subscribed = 1
AND dbsbuffer_block.status = 'Closed'
AND dbsbuffer_block.deleted = 0
GROUP BY dbsbuffer_block.blockname,
Do we need to group the output for faster post-processing? Or is it just a live visualization enhancement?
No visualization is involved at all.
The GROUP BY clause ensures no duplicate records are returned for any of the grouping columns. That is needed for the HAVING clause below.
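As a toy illustration of that interplay (a sqlite3 sketch; the block_workflow table and its columns are invented stand-ins for the joined dbsbuffer tables):

```python
import sqlite3

# One row per (block, workflow) association, with a completed flag per workflow,
# mimicking the rows the dbsbuffer join produces before grouping.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE block_workflow (blockname TEXT, workflow TEXT, completed INTEGER);
    INSERT INTO block_workflow VALUES
        ('blockA', 'wf1', 1),
        ('blockA', 'wf2', 1),
        ('blockB', 'wf3', 1),
        ('blockB', 'wf4', 0);
""")

# GROUP BY collapses the per-workflow rows into one row per block; the
# HAVING clause can then compare per-group aggregates, keeping only the
# blocks for which every contributing workflow is completed.
deletable = conn.execute("""
    SELECT blockname FROM block_workflow
    GROUP BY blockname
    HAVING COUNT(*) = SUM(completed)
""").fetchall()
print(deletable)  # [('blockA',)] - blockB has one incomplete workflow
```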
By construction, I would say we will never have a duplicate row for a given blockname + pnn. I am not an SQL expert though and I could be wrong. But if I am right, this would make this query faster and cleaner.
No. Removing the GROUP BY statement would make that query dangerous!
dbsbuffer_dataset_subscription.site,
dbsbuffer_workflow.name,
dbsbuffer_block.create_time
HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed)
Can you please educate me on why we need this check as well? If I understand it properly, it says: the amount of dbsbuffer blocks matching all these constraints must be equal to the amount of completed workflows in dbsbuffer. Is that correct?
And a completed workflow in dbsbuffer comes from the fact that all work units (wmbs records) have been processed in that workflow, right?
the amount of dbsbuffer blocks matching all these constraints must be equal to the amount of completed workflows in dbsbuffer. Is that correct?
Yes
And a completed workflow in dbsbuffer comes from the fact that all work units (wmbs records) have been processed in that workflow, right?
Again, Yes.
My understanding here is: this basically assures that all the records returned include only blocks related to workflows marked as completed in the dbsbuffer_workflow table. This one I did not change; it is from the old DAO. So, just to be 100% sure we are not messing things up here, I'd like to hear @hufnagel's opinion as well.
Yes, I think you are right! But given that we already join dbsbuffer_workflow, I wonder why not have an extra AND clause in the WHERE block with this constraint: dbsbuffer_workflow.completed = 1 (or whatever value it's supposed to have).
Thanks for reiterating through this once again, @amaltaro.
Let's talk with examples here.
Here is what this query returns, as it is right now, for a completely finished replay:
SELECT
    count(*),
    SUM(dbsbuffer_workflow.completed),
    dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
    AND dbsbuffer_dataset_subscription.subscribed = 1
    AND dbsbuffer_block.status = 'Closed'
    AND dbsbuffer_block.deleted = 0
GROUP BY dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name
HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed);
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
1 1 /TestEnablesEcalHcal/Tier0_REPLAY_2022-Express-v425/RAW#c0ee5da6-3616-419b-81ba- T0_CH_CERN_Disk /TestEnablesEcalHcal/Tier0_REPLAY_2022-E T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripPCLHistos-Express-v425/ALCARECO#e T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /MinimumBias/Tier0_REPLAY_2022-v425/RAW#961b6dcb-7514-4adc-b26c-372792c977fc T0_CH_CERN_Disk /MinimumBias/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /L1Accept/Tier0_REPLAY_2022-v425/RAW#575557fc-e92c-45c6-9103-c93572d3538b T0_CH_CERN_Disk /L1Accept/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamNanoDST_
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiPixelCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HcalNZS/Tier0_REPLAY_2022-v425/RAW#32396414-57a4-4188-a6f6-75a208bd4c8f T0_CH_CERN_Disk /HcalNZS/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /NoBPTX/Tier0_REPLAY_2022-v425/RAW#0bbb5f07-c8c2-4725-8251-79c89354e272 T0_CH_CERN_Disk /NoBPTX/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-PromptCalibProdEcalPedestals-Express-v425/A T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Pro T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-PromptCalibProdSiStrip-Express-v425/ALCA T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HLTPhysics/Tier0_REPLAY_2022-v425/RAW#0f75ea20-d421-4946-b366-63655c5bb3a3 T0_CH_CERN_Disk /HLTPhysics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-Express-v425/DQMIO#6082423a-08ad-44b3-8f9e- T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Exp T2_CH_CERN Express_Run351572_StreamCalibra
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
4 4 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#bfdf5c4d-003a-41a1-bd T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamCalibration/Tier0_REPLAY_2022-EcalTestPulsesRaw-Express-v425/ALCARECO#184 T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Eca T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-TkAlCosmics0T-Express-v425/ALCARECO#880f T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
5 5 /ExpressCosmics/Tier0_REPLAY_2022-Express-v425/FEVT#4a678f2c-3945-41cf-b06c-faef T0_CH_CERN_Disk /ExpressCosmics/Tier0_REPLAY_2022-Expres T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#cec89541-78f7-44a4-aa T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /Cosmics/Tier0_REPLAY_2022-v425/RAW#9d334e74-6687-4670-8f38-f95c720f8287 T0_CH_CERN_Disk /Cosmics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /RPCMonitor/Tier0_REPLAY_2022-v425/RAW#c0abae4e-71a4-4b43-8043-e5e5ee1d6a90 T0_CH_CERN_Disk /RPCMonitor/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamRPCMON_T
19 rows selected.
And if I understand your suggestion correctly, the change should be something like:
SELECT
    count(*),
    SUM(dbsbuffer_workflow.completed),
    dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
    AND dbsbuffer_dataset_subscription.subscribed = 1
    AND dbsbuffer_block.status = 'Closed'
    AND dbsbuffer_block.deleted = 0
    AND dbsbuffer_workflow.completed = 1
GROUP BY dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name;
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
1 1 /TestEnablesEcalHcal/Tier0_REPLAY_2022-Express-v425/RAW#c0ee5da6-3616-419b-81ba- T0_CH_CERN_Disk /TestEnablesEcalHcal/Tier0_REPLAY_2022-E T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripPCLHistos-Express-v425/ALCARECO#e T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /MinimumBias/Tier0_REPLAY_2022-v425/RAW#961b6dcb-7514-4adc-b26c-372792c977fc T0_CH_CERN_Disk /MinimumBias/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /L1Accept/Tier0_REPLAY_2022-v425/RAW#575557fc-e92c-45c6-9103-c93572d3538b T0_CH_CERN_Disk /L1Accept/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamNanoDST_
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiPixelCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HcalNZS/Tier0_REPLAY_2022-v425/RAW#32396414-57a4-4188-a6f6-75a208bd4c8f T0_CH_CERN_Disk /HcalNZS/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /NoBPTX/Tier0_REPLAY_2022-v425/RAW#0bbb5f07-c8c2-4725-8251-79c89354e272 T0_CH_CERN_Disk /NoBPTX/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-PromptCalibProdEcalPedestals-Express-v425/A T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Pro T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-PromptCalibProdSiStrip-Express-v425/ALCA T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HLTPhysics/Tier0_REPLAY_2022-v425/RAW#0f75ea20-d421-4946-b366-63655c5bb3a3 T0_CH_CERN_Disk /HLTPhysics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-Express-v425/DQMIO#6082423a-08ad-44b3-8f9e- T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Exp T2_CH_CERN Express_Run351572_StreamCalibra
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
4 4 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#bfdf5c4d-003a-41a1-bd T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamCalibration/Tier0_REPLAY_2022-EcalTestPulsesRaw-Express-v425/ALCARECO#184 T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Eca T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-TkAlCosmics0T-Express-v425/ALCARECO#880f T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
5 5 /ExpressCosmics/Tier0_REPLAY_2022-Express-v425/FEVT#4a678f2c-3945-41cf-b06c-faef T0_CH_CERN_Disk /ExpressCosmics/Tier0_REPLAY_2022-Expres T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#cec89541-78f7-44a4-aa T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /Cosmics/Tier0_REPLAY_2022-v425/RAW#9d334e74-6687-4670-8f38-f95c720f8287 T0_CH_CERN_Disk /Cosmics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /RPCMonitor/Tier0_REPLAY_2022-v425/RAW#c0abae4e-71a4-4b43-8043-e5e5ee1d6a90 T0_CH_CERN_Disk /RPCMonitor/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamRPCMON_T
19 rows selected.
The output for a fully completed replay does look equivalent, indeed. It needs to be tested for a running one, with a PromptReco paused, though.
The difference between those two queries, to me, is the following:
- in the former, the completed = 1 requirement on the rows returned is applied upon aggregation, on all of the groups generated by the GROUP BY statement,
- while in the latter, the condition is applied during the selection, before the grouping and aggregation.
This could matter for a running workflow.
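The difference can be demonstrated with a toy example (a minimal sqlite3 sketch; the block_workflow table and its columns are invented stand-ins for the joined dbsbuffer tables):

```python
import sqlite3

# blockB mimics a still-running workflow: one of its rows has completed = 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE block_workflow (blockname TEXT, workflow TEXT, completed INTEGER);
    INSERT INTO block_workflow VALUES
        ('blockA', 'wf1', 1),
        ('blockB', 'wf2', 1),
        ('blockB', 'wf3', 0);
""")

# Former query: the completed check happens upon aggregation. blockB is
# excluded because its group has COUNT(*) = 2 but SUM(completed) = 1.
having = conn.execute("""
    SELECT blockname FROM block_workflow
    GROUP BY blockname
    HAVING COUNT(*) = SUM(completed)
    ORDER BY blockname
""").fetchall()

# Latter query: the incomplete wf3 row is dropped before grouping, so
# blockB's surviving rows all have completed = 1 and the block is
# returned even though one of its workflows is still running.
where = conn.execute("""
    SELECT blockname FROM block_workflow
    WHERE completed = 1
    GROUP BY blockname
    ORDER BY blockname
""").fetchall()

print(having)  # [('blockA',)]
print(where)   # [('blockA',), ('blockB',)]
```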
from __future__ import division
same comment here
done
I guess you missed this (please remove both future imports).
ok
another ping
Thanks @amaltaro for your review. I think I have addressed all your comments. Please take another look.
Jenkins results:
Todor, please remove the future imports (comments along the code).
I also have further comments/questions to properly understand it.
from __future__ import print_function
This should have been deleted as well. In short, we no longer need to import anything from __future__ or future.
class GetCompletedBlocks(DBFormatter):
    """
    Retrieves a list of blocks that are closed but NOT sure yet if they are deletedable:
Typo here (and a few misspellings in the lines below).
Jenkins results:
Force-pushed from 0d29758 to 4ef1e42.
Jenkins results:
Thanks @amaltaro. Other than the general change of the SQL query structure (which I think needs further discussion), I did follow your comments. Please take another look.
I left another comment or two along the code for your consideration. It looks like you haven't looked at the pycodestyle report yet; please fix this:
src/python/WMComponent/RucioInjector/Database/MySQL/GetCompletedBlocks.py
Line 88, E225 missing whitespace around operator
Regarding the SQL query: it's been tested, and we do not know how much an improvement would buy us. So let's leave that open for a possible future discussion (use of the GROUP BY and HAVING clauses).
Once you make further changes, if you want to get this merged, then please:
- squash your commits accordingly
- remove those labels.
@@ -82,6 +82,8 @@ def __init__(self, config):
        self.useDsetReplicaDeep = getattr(config.RucioInjector, "useDsetReplicaDeep", False)
        self.delBlockSlicesize = getattr(config.RucioInjector, "delBlockSlicesize", 100)
        self.blockDeletionDelayHours = getattr(config.RucioInjector, "blockDeletionDelayHours", 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, you don't need to update the code, but given that you don't use this variable (blockDeletionDelayHours) anywhere in the class, you could simply remove the self (not making it an object instance attribute)
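The reviewer's suggestion can be sketched like this (a hypothetical, stripped-down class; SimpleNamespace stands in for the agent configuration object, and blockDeletionDelaySeconds is an invented example of consuming the value locally):

```python
from types import SimpleNamespace

class RucioInjectorPoller:
    """Stripped-down sketch: a config value the class never reads again
    does not need to be stored as an instance attribute."""
    def __init__(self, config):
        # used elsewhere in the class, so kept on self
        self.delBlockSlicesize = getattr(config.RucioInjector, "delBlockSlicesize", 100)
        # read into a plain local variable and consumed right here,
        # instead of becoming self.blockDeletionDelayHours
        blockDeletionDelayHours = getattr(config.RucioInjector, "blockDeletionDelayHours", 0)
        self.blockDeletionDelaySeconds = blockDeletionDelayHours * 3600

# stand-in for the agent configuration; delBlockSlicesize falls back to its default
config = SimpleNamespace(RucioInjector=SimpleNamespace(blockDeletionDelayHours=2))
poller = RucioInjectorPoller(config)
print(poller.delBlockSlicesize, poller.blockDeletionDelaySeconds)  # 100 7200
```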
Fix log message. Typo. Change DAO parsing method. Include files left behind from previous commit. Fix docstring. Add blockDeletionDelayHours. Typo. Pylint changes. Bugfix - missed plural in variable name mapping. Review comments. Review comments 2. Review comments 2.
Force-pushed from 2c4c5e7 to 51306b0.
Hi @amaltaro, the changes are finalized and the commits squashed. Please go ahead and merge at your convenience.
Jenkins results:
Fixes #11042
Status
ready
Description
With the current PR we split the deletable-blocks checks from RucioInjector into two steps. First we fetch all the blocks in the Closed state as usual, but we do not require the whole workflow information to be cleaned from WMBS. In the following step we fetch another list of workflows which are suitable for deletion, but will eventually fall under the conditions ruled by the archiveDelayHours configuration parameter. Upon that, we check which of the blocks so found have actually been produced by an already 'deletable' workflow. Once the intersection between those two lists is made, the final set of blocks to be deleted is provided to the rest of the code as before.
During the above check we also apply another requirement for the block: before we add it for deletion, we check whether its lifetime is bigger than a certain configurable value. We set that from the agent configuration with config.RucioInjector.blockDeletionDelayHours, and the clock for measuring the block lifetime starts at blockCreateTime. It would have been better to start it at workflow completion instead, but there is no cheap way of fetching that information from RucioInjector, or from the agent as a whole.
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
None
We do have a new configuration variable introduced with this PR: config.RucioInjector.blockDeletionDelayHours. But this one is to be provided from the agent configuration, so no service_config PR has been created for it.
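The lifetime requirement described above can be sketched as follows (a minimal illustration only; the isBlockDeletable helper is hypothetical, while blockDeletionDelayHours and blockCreateTime are the names used in this PR):

```python
import time

def isBlockDeletable(blockCreateTime, blockDeletionDelayHours, now=None):
    """
    Hypothetical sketch: a closed block only becomes eligible for deletion
    once its lifetime, measured from blockCreateTime (epoch seconds),
    exceeds blockDeletionDelayHours.
    """
    now = time.time() if now is None else now
    return (now - blockCreateTime) > blockDeletionDelayHours * 3600

# with a 24h delay, a block created 2 hours ago is not yet deletable
print(isBlockDeletable(blockCreateTime=0, blockDeletionDelayHours=24, now=2 * 3600))   # False
# while a block created 48 hours ago is
print(isBlockDeletable(blockCreateTime=0, blockDeletionDelayHours=24, now=48 * 3600))  # True
```

With the default blockDeletionDelayHours = 0 shown in the diff above, any closed block older than its creation instant passes the check, preserving the previous behavior.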
External dependencies / deployment changes
None