Split Deletable Blocks checks in two stages. #11127
Conversation
Jenkins results:
And here is the output from the
As of the output from the FYI @germanfgv @amaltaro @hufnagel @drkovalskyi [1]
[2]
[3]
[4]
[5]
Force-pushed from 929ce18 to 8cc2ac6.
Jenkins results:
And here is the new state:
Full set of completed blocks:
Fully completed workflows with dependent workflows NOT completed:
As expected, both Express and Repack are present. Full set of truly Deletable Workflows:
Only
FYI @amaltaro @germanfgv
Force-pushed from 8cc2ac6 to 933fb2d.
Jenkins results:
We now have the following replay status:
Unfortunately, we found the Repack workflow being archived much earlier than we thought, because we were running with those two patches in place: one targeting early archival and late CouchCleanup, while the other targeted late archival and early block-level deletions.
But this affected only the visibility. Full set of completed workflows:
All Completed blocks + workflows producing them - ready for deletion:
And the final list of
@todor-ivanov some of this review was done live on Slack, but please clean those pylint and pep8/pycodestyle issues in the new modules. You can also find a few comments along the code. Thanks
@todor-ivanov As you may remember, we need a separate config parameter to control how long we keep the block on disk. I think this should be a straightforward change. Let me know if you have any issues with this.
Force-pushed from a1929e8 to 2c9f770.
Force-pushed from 2c9f770 to 4cfc184.
Jenkins results:
Hi @amaltaro, I requested your review yet again, but it was kind of early. I am also working on the pylint tests, and one last configuration parameter @germanfgv was mentioning is still needed. But please take a quick look and check if what I have done for changing the DAO parsing method looks sound.
Jenkins results:
Force-pushed from b5fbac0 to bc00367.
Jenkins results:
@amaltaro please go ahead and proceed with your review.
Jenkins results:
This is looking good, Todor. However, I have many comments and questions that might need further follow-up. Please also have a look at the usual Jenkins report; there are a few minor things that you should take into account for the new modules.
AND dbsbuffer_dataset_subscription.subscribed = 1
AND dbsbuffer_block.status = 'Closed'
AND dbsbuffer_block.deleted = 0
GROUP BY dbsbuffer_block.blockname,
Do we need to group the output for faster post-processing? Or is it just a live visualization enhancement?
No visualization is involved at all.
The GROUP BY clause ensures no duplicate records are returned for any of the grouping columns. That is needed for the HAVING clause below.
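As a toy illustration of that interplay (a sqlite3 sketch; the block_workflow table and its columns are invented stand-ins for the joined dbsbuffer tables):

```python
import sqlite3

# One row per (block, workflow) association, with a completed flag per workflow,
# mimicking the rows the dbsbuffer join produces before grouping.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE block_workflow (blockname TEXT, workflow TEXT, completed INTEGER);
    INSERT INTO block_workflow VALUES
        ('blockA', 'wf1', 1),
        ('blockA', 'wf2', 1),
        ('blockB', 'wf3', 1),
        ('blockB', 'wf4', 0);
""")

# GROUP BY collapses the per-workflow rows into one row per block; the
# HAVING clause can then compare per-group aggregates, keeping only the
# blocks for which every contributing workflow is completed.
deletable = conn.execute("""
    SELECT blockname FROM block_workflow
    GROUP BY blockname
    HAVING COUNT(*) = SUM(completed)
""").fetchall()
print(deletable)  # [('blockA',)] - blockB has one incomplete workflow
```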
By construction, I would say we will never have a duplicate row for a given blockname + pnn. I am not an SQL expert though and I could be wrong. But if I am right, this would make this query faster and cleaner.
No. Removing the GROUP BY statement would make that query dangerous!
dbsbuffer_dataset_subscription.site,
dbsbuffer_workflow.name,
dbsbuffer_block.create_time
HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed)
Can you please educate me on why we need this check as well? If I understand it properly, it says: the amount of dbsbuffer blocks matching all these constraints must be equal to the amount of completed workflows in dbsbuffer. Is that correct?
And a completed workflow in dbsbuffer comes from the fact that all work units (wmbs records) have been processed in that workflow, right?
the amount of dbsbuffer blocks matching all these constraints must be equal to the amount of completed workflows in dbsbuffer. Is that correct?
Yes
And a completed workflow in dbsbuffer comes from the fact that all work units (wmbs records) have been processed in that workflow, right?
Again, Yes.
My understanding here is: this basically assures that all the records returned include only blocks related to workflows marked as completed in the dbsbuffer_workflow table. This one I did not change; it is from the old DAO. So, just to be 100% sure we are not messing things up here, I'd like to hear @hufnagel's opinion as well.
Yes, I think you are right! But given that we already join dbsbuffer_workflow, I wonder why not have an extra AND clause in the WHERE block with this constraint: dbsbuffer_workflow.completed = 1 (or whatever value it's supposed to have).
Thanks for reiterating through this once again, @amaltaro.
Let's talk with examples here.
Here is what this query returns, as it is right now, for a completely finished replay:
SELECT
    count(*),
    SUM(dbsbuffer_workflow.completed),
    dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
    AND dbsbuffer_dataset_subscription.subscribed = 1
    AND dbsbuffer_block.status = 'Closed'
    AND dbsbuffer_block.deleted = 0
GROUP BY dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name
HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed);
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
1 1 /TestEnablesEcalHcal/Tier0_REPLAY_2022-Express-v425/RAW#c0ee5da6-3616-419b-81ba- T0_CH_CERN_Disk /TestEnablesEcalHcal/Tier0_REPLAY_2022-E T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripPCLHistos-Express-v425/ALCARECO#e T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /MinimumBias/Tier0_REPLAY_2022-v425/RAW#961b6dcb-7514-4adc-b26c-372792c977fc T0_CH_CERN_Disk /MinimumBias/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /L1Accept/Tier0_REPLAY_2022-v425/RAW#575557fc-e92c-45c6-9103-c93572d3538b T0_CH_CERN_Disk /L1Accept/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamNanoDST_
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiPixelCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HcalNZS/Tier0_REPLAY_2022-v425/RAW#32396414-57a4-4188-a6f6-75a208bd4c8f T0_CH_CERN_Disk /HcalNZS/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /NoBPTX/Tier0_REPLAY_2022-v425/RAW#0bbb5f07-c8c2-4725-8251-79c89354e272 T0_CH_CERN_Disk /NoBPTX/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-PromptCalibProdEcalPedestals-Express-v425/A T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Pro T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-PromptCalibProdSiStrip-Express-v425/ALCA T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HLTPhysics/Tier0_REPLAY_2022-v425/RAW#0f75ea20-d421-4946-b366-63655c5bb3a3 T0_CH_CERN_Disk /HLTPhysics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-Express-v425/DQMIO#6082423a-08ad-44b3-8f9e- T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Exp T2_CH_CERN Express_Run351572_StreamCalibra
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
4 4 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#bfdf5c4d-003a-41a1-bd T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamCalibration/Tier0_REPLAY_2022-EcalTestPulsesRaw-Express-v425/ALCARECO#184 T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Eca T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-TkAlCosmics0T-Express-v425/ALCARECO#880f T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
5 5 /ExpressCosmics/Tier0_REPLAY_2022-Express-v425/FEVT#4a678f2c-3945-41cf-b06c-faef T0_CH_CERN_Disk /ExpressCosmics/Tier0_REPLAY_2022-Expres T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#cec89541-78f7-44a4-aa T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /Cosmics/Tier0_REPLAY_2022-v425/RAW#9d334e74-6687-4670-8f38-f95c720f8287 T0_CH_CERN_Disk /Cosmics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /RPCMonitor/Tier0_REPLAY_2022-v425/RAW#c0abae4e-71a4-4b43-8043-e5e5ee1d6a90 T0_CH_CERN_Disk /RPCMonitor/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamRPCMON_T
19 rows selected.
And if I understand your suggestion correctly, the change should be something like:
SELECT
    count(*),
    SUM(dbsbuffer_workflow.completed),
    dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
    AND dbsbuffer_dataset_subscription.subscribed = 1
    AND dbsbuffer_block.status = 'Closed'
    AND dbsbuffer_block.deleted = 0
    AND dbsbuffer_workflow.completed = 1
GROUP BY dbsbuffer_block.blockname,
    dbsbuffer_location.pnn,
    dbsbuffer_dataset.path,
    dbsbuffer_dataset_subscription.site,
    dbsbuffer_workflow.name;
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
1 1 /TestEnablesEcalHcal/Tier0_REPLAY_2022-Express-v425/RAW#c0ee5da6-3616-419b-81ba- T0_CH_CERN_Disk /TestEnablesEcalHcal/Tier0_REPLAY_2022-E T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripPCLHistos-Express-v425/ALCARECO#e T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /MinimumBias/Tier0_REPLAY_2022-v425/RAW#961b6dcb-7514-4adc-b26c-372792c977fc T0_CH_CERN_Disk /MinimumBias/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /L1Accept/Tier0_REPLAY_2022-v425/RAW#575557fc-e92c-45c6-9103-c93572d3538b T0_CH_CERN_Disk /L1Accept/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamNanoDST_
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiPixelCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HcalNZS/Tier0_REPLAY_2022-v425/RAW#32396414-57a4-4188-a6f6-75a208bd4c8f T0_CH_CERN_Disk /HcalNZS/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /NoBPTX/Tier0_REPLAY_2022-v425/RAW#0bbb5f07-c8c2-4725-8251-79c89354e272 T0_CH_CERN_Disk /NoBPTX/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-PromptCalibProdEcalPedestals-Express-v425/A T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Pro T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-PromptCalibProdSiStrip-Express-v425/ALCA T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /HLTPhysics/Tier0_REPLAY_2022-v425/RAW#0f75ea20-d421-4946-b366-63655c5bb3a3 T0_CH_CERN_Disk /HLTPhysics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /StreamCalibration/Tier0_REPLAY_2022-Express-v425/DQMIO#6082423a-08ad-44b3-8f9e- T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Exp T2_CH_CERN Express_Run351572_StreamCalibra
COUNT(*) SUM(DBSBUFFER_WORKFLOW.COMPLETED) BLOCKNAME PNN PATH SITE NAME
---------- --------------------------------- -------------------------------------------------------------------------------- --------------- ---------------------------------------- --------------- -------------------------------
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-SiStripCalZeroBias-Express-v425/ALCARECO T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
4 4 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#bfdf5c4d-003a-41a1-bd T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamCalibration/Tier0_REPLAY_2022-EcalTestPulsesRaw-Express-v425/ALCARECO#184 T0_CH_CERN_Disk /StreamCalibration/Tier0_REPLAY_2022-Eca T2_CH_CERN Express_Run351572_StreamCalibra
5 5 /StreamExpressCosmics/Tier0_REPLAY_2022-TkAlCosmics0T-Express-v425/ALCARECO#880f T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
5 5 /ExpressCosmics/Tier0_REPLAY_2022-Express-v425/FEVT#4a678f2c-3945-41cf-b06c-faef T0_CH_CERN_Disk /ExpressCosmics/Tier0_REPLAY_2022-Expres T2_CH_CERN Express_Run351572_StreamExpress
1 1 /StreamExpressCosmics/Tier0_REPLAY_2022-Express-v425/DQMIO#cec89541-78f7-44a4-aa T0_CH_CERN_Disk /StreamExpressCosmics/Tier0_REPLAY_2022- T2_CH_CERN Express_Run351572_StreamExpress
1 1 /Cosmics/Tier0_REPLAY_2022-v425/RAW#9d334e74-6687-4670-8f38-f95c720f8287 T0_CH_CERN_Disk /Cosmics/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamPhysics_
1 1 /RPCMonitor/Tier0_REPLAY_2022-v425/RAW#c0abae4e-71a4-4b43-8043-e5e5ee1d6a90 T0_CH_CERN_Disk /RPCMonitor/Tier0_REPLAY_2022-v425/RAW T2_CH_CERN Repack_Run351572_StreamRPCMON_T
19 rows selected.
The output for a fully completed replay does look equivalent, indeed. It needs to be tested for a running one, with a PromptReco paused, though.
The difference between those two queries, to me, is the following:
- in the former, the completed = 1 requirement on the rows returned is applied upon aggregation, on all of the groups generated by the GROUP BY statement,
- while in the latter, the condition is applied during the selection, before the grouping and aggregation.
This could matter for a running workflow.
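The difference can be demonstrated with a toy example (a minimal sqlite3 sketch; the block_workflow table and its columns are invented stand-ins for the joined dbsbuffer tables):

```python
import sqlite3

# blockB mimics a still-running workflow: one of its rows has completed = 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE block_workflow (blockname TEXT, workflow TEXT, completed INTEGER);
    INSERT INTO block_workflow VALUES
        ('blockA', 'wf1', 1),
        ('blockB', 'wf2', 1),
        ('blockB', 'wf3', 0);
""")

# Former query: the completed check happens upon aggregation. blockB is
# excluded because its group has COUNT(*) = 2 but SUM(completed) = 1.
having = conn.execute("""
    SELECT blockname FROM block_workflow
    GROUP BY blockname
    HAVING COUNT(*) = SUM(completed)
    ORDER BY blockname
""").fetchall()

# Latter query: the incomplete wf3 row is dropped before grouping, so
# blockB's surviving rows all have completed = 1 and the block is
# returned even though one of its workflows is still running.
where = conn.execute("""
    SELECT blockname FROM block_workflow
    WHERE completed = 1
    GROUP BY blockname
    ORDER BY blockname
""").fetchall()

print(having)  # [('blockA',)]
print(where)   # [('blockA',), ('blockB',)]
```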
from __future__ import division
same comment here
done
I guess you missed this (please remove both future imports).
ok
another ping
Thanks @amaltaro for your review. I think I have addressed all your comments. Please take another look.
Jenkins results:
Todor, please remove the future imports (comments along the code).
I also have further comments/questions to properly understand it.
from __future__ import print_function
This should have been deleted as well. In short, we no longer need to import anything from __future__ or future.
class GetCompletedBlocks(DBFormatter):
    """
    Retrieves a list of blocks that are closed but NOT sure yet if they are deletedable:
Typo here (and a few misspellings in the lines below).
Jenkins results:
Force-pushed from 0d29758 to 4ef1e42.
Jenkins results:
Thanks @amaltaro. Other than the general change of the SQL query structure (which I think needs further discussion), I did follow your comments. Please take another look.
I left another comment or two along the code for your consideration. It looks like you haven't looked at the pycodestyle report yet; please fix this:
src/python/WMComponent/RucioInjector/Database/MySQL/GetCompletedBlocks.py
Line 88, E225 missing whitespace around operator
Regarding the SQL query: it's been tested, and we do not know how much an improvement would buy us. So let's leave that open for a possible future discussion (use of the GROUP BY and HAVING clauses).
Once you make further changes, if you want to get this merged, then please:
- squash your commits accordingly
- remove those labels.
@@ -82,6 +82,8 @@ def __init__(self, config):
        self.useDsetReplicaDeep = getattr(config.RucioInjector, "useDsetReplicaDeep", False)
        self.delBlockSlicesize = getattr(config.RucioInjector, "delBlockSlicesize", 100)
        self.blockDeletionDelayHours = getattr(config.RucioInjector, "blockDeletionDelayHours", 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, you don't need to update the code, but given that you don't use this variable (blockDeletionDelayHours) anywhere in the class, you could simply remove the self (not making it an object instance attribute)
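The reviewer's suggestion can be sketched like this (a hypothetical, stripped-down class; SimpleNamespace stands in for the agent configuration object, and blockDeletionDelaySeconds is an invented example of consuming the value locally):

```python
from types import SimpleNamespace

class RucioInjectorPoller:
    """Stripped-down sketch: a config value the class never reads again
    does not need to be stored as an instance attribute."""
    def __init__(self, config):
        # used elsewhere in the class, so kept on self
        self.delBlockSlicesize = getattr(config.RucioInjector, "delBlockSlicesize", 100)
        # read into a plain local variable and consumed right here,
        # instead of becoming self.blockDeletionDelayHours
        blockDeletionDelayHours = getattr(config.RucioInjector, "blockDeletionDelayHours", 0)
        self.blockDeletionDelaySeconds = blockDeletionDelayHours * 3600

# stand-in for the agent configuration; delBlockSlicesize falls back to its default
config = SimpleNamespace(RucioInjector=SimpleNamespace(blockDeletionDelayHours=2))
poller = RucioInjectorPoller(config)
print(poller.delBlockSlicesize, poller.blockDeletionDelaySeconds)  # 100 7200
```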
Fix log message. Typo. Change DAO parsing method. Include files left behind from previous commit. Fix docstring. Add blockDeletionDelayHours. Typo. Pylint changes. Bugfix - missed plural in variable name mapping. Review comments. Review comments 2. Review comments 2.
Force-pushed from 2c4c5e7 to 51306b0.
Hi @amaltaro, the changes are finalized and the commits squashed. Please go ahead and merge at your convenience.
Jenkins results:
Fixes #11042
Status
ready
Description
With the current PR we split the deletable-blocks checks from RucioInjector into two steps. First we fetch all the blocks in the Closed state as usual, but we do not require the whole workflow information to be cleaned from WMBS. In the following step we fetch another list of workflows which are suitable for deletion, but will eventually fall under the conditions ruled by the archiveDelayHours configuration parameter. Upon that, we check which of the blocks so found have actually been produced by an already 'deletable' workflow. Once the intersection between those two lists is made, the final set of blocks to be deleted is provided to the rest of the code as before.
During the above check we also apply another requirement for the block: before we add it for deletion, we check whether its lifetime is bigger than a certain configurable value. We set that from the agent configuration with config.RucioInjector.blockDeletionDelayHours, and the clock for measuring the block lifetime starts at blockCreateTime. It would have been better to start it at workflow completion instead, but there is no cheap way of fetching that information from RucioInjector, or from the agent as a whole.
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
None
We do have a new configuration variable introduced with this PR: config.RucioInjector.blockDeletionDelayHours. But this one is to be provided from the agent configuration, so no service_config PR has been created for it.
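The lifetime requirement described above can be sketched as follows (a minimal illustration only; the isBlockDeletable helper is hypothetical, while blockDeletionDelayHours and blockCreateTime are the names used in this PR):

```python
import time

def isBlockDeletable(blockCreateTime, blockDeletionDelayHours, now=None):
    """
    Hypothetical sketch: a closed block only becomes eligible for deletion
    once its lifetime, measured from blockCreateTime (epoch seconds),
    exceeds blockDeletionDelayHours.
    """
    now = time.time() if now is None else now
    return (now - blockCreateTime) > blockDeletionDelayHours * 3600

# with a 24h delay, a block created 2 hours ago is not yet deletable
print(isBlockDeletable(blockCreateTime=0, blockDeletionDelayHours=24, now=2 * 3600))   # False
# while a block created 48 hours ago is
print(isBlockDeletable(blockCreateTime=0, blockDeletionDelayHours=24, now=48 * 3600))  # True
```

With the default blockDeletionDelayHours = 0 shown in the diff above, any closed block older than its creation instant passes the check, preserving the previous behavior.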
External dependencies / deployment changes
None