New DBSConcurrency module for concurrent execution of HTTP queries to DBS via pycurl manager #11913

vkuznet · 2024-02-26T18:17:19Z

Fixes #11899

Status

ready

Description

New DBSConcurrency module to hold helper functions to DBS via concurrent execution of HTTP calls.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

#11884

External dependencies / deployment changes

It requires pycurl module

cmsdmwmbot · 2024-02-26T18:30:03Z

Jenkins results:

Python3 Unit tests: succeeded
- 1 tests added
- 3 changes in unstable tests
Python3 Pylint check: failed
- 2 warnings and errors that must be fixed
- 3 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded
- 3 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14915/artifact/artifacts/PullRequestReport.html

cmsdmwmbot · 2024-02-26T19:05:06Z

Jenkins results:

Python3 Unit tests: failed
- 1 new failures
- 1 tests added
- 1 changes in unstable tests
Python3 Pylint check: failed
- 2 warnings and errors that must be fixed
- 3 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded
- 3 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14916/artifact/artifacts/PullRequestReport.html

vkuznet · 2024-02-26T21:03:34Z

Alan, this is a first draft, feel free to provide your feedback. If you satisfied with this code I can import this module in #11884 and use it there after the merge of the latter.

amaltaro

Valentin, I have two general comments for this PR.

Ideally, we should make this pycurl-based API as generic as possible - hence accepting a kwargs as input parameter, matching the actual files DBS API. The problem with that is that we have to keep all the data returned by DBS in memory, pass it upstream and let the upstream code to parse and use the data as it needs.
Another approach would be to create a specific function for the pileup use case, similar to what is written here. But then we better clarify it in the code not to mislead people.
For the unit test, we have two options. a) Either we mark it as integration, not to run it in Jenkins all the time; or b) we use a different dataset that has only a handful of blocks. That Neutrino dataset is huge and we should not add relevant load into the production system by running unit tests. In addition, let us please use a dataset from cmsweb-testbed instead, not to make any requests to the production system.

vkuznet · 2024-02-27T12:49:32Z

Alan, the second issue is addressed now. Regarding the first one. I think you mix two different concepts here:

API calls where you want generalization
optimization (e.g. memory reduction) of the function

They have nothing to do with each other. For the first one, I can put kwargs into function parameters but do we need it here? First, we must have details since only with this parameter we get file list and nevents. Without it DBS will only return list of files. Second, we need to use validFileOnly because I doubt we ever need invalid files in WMA. Therefore, my question is: is it worth to delegate to upstream code this? In other words upstream code must always specify these parameters.

For the second, (memory) optimization part. The code generalization can be done via converting it to generator which will yield individual records, like {block: {'FilesList':..., 'NumberOfEvents':...}} but it will not solve memory concern since upstream code uses entire dictionary in its operations. Therefore, without re-factoring upstream code (which we just finished) I doubt any optimization here will make any good. The entire problem is dictionary usage in many parts of WMCore, instead of list of records. The dictionary forces entire structure to be loaded into memory and causing large memory footprint. Until the usage of these dict structures will be changed literally little you can do in context of memory optimization.

cmsdmwmbot · 2024-02-27T13:10:18Z

Jenkins results:

Python3 Unit tests: failed
- 2 new failures
- 1 tests no longer failing
- 1 tests added
- 1 changes in unstable tests
Python3 Pylint check: succeeded
- 2 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14922/artifact/artifacts/PullRequestReport.html

cmsdmwmbot · 2024-02-27T13:28:29Z

Jenkins results:

Python3 Unit tests: succeeded
- 1 tests no longer failing
- 1 tests added
Python3 Pylint check: failed
- 4 warnings and errors that must be fixed
- 1 warnings
- 7 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14923/artifact/artifacts/PullRequestReport.html

cmsdmwmbot · 2024-02-27T17:48:21Z

Jenkins results:

Python3 Unit tests: failed
- 1 new failures
- 1 tests added
- 2 changes in unstable tests
Python3 Pylint check: failed
- 4 warnings and errors that must be fixed
- 1 warnings
- 7 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14924/artifact/artifacts/PullRequestReport.html

vkuznet · 2024-02-27T18:06:47Z

Alan, I added memory measurement into my gist, and measured memory allocation before, after multi_getdata and after iteration over generator results. Here are results:

https://cmsweb-prod.cern.ch/dbs/prod/global/DBSReader/blocks?dataset=/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX
before: usertime=0.055698 systime=0.018106999999999998 mem=24.84375 (MB)
after: usertime=0.055705 systime=0.018109 mem=24.84375 (MB)
for /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX get 463
after gen iteration: usertime=0.071628 systime=0.022612999999999998 mem=30.25 (MB)
before getBlockInfo: usertime=0.071657 systime=0.022626 mem=30.25 (MB)
before getBlockInfo:multi_getdata: usertime=0.07818399999999999 systime=0.022737 mem=30.375 (MB)
before getBlockInfo:multi_getdata: usertime=0.078222 systime=0.022764 mem=30.375 (MB)
after getBlockInfo: usertime=2.442248 systime=1.257647 mem=97.234375 (MB)
Elapsed time 12.484441995620728

Here are few observations:

usertime, systeim and memory are the same before and after multi_getdata generator call
for multi_getdata calls within getBlockInfo function I see the same pattern
but once we iterate over generator we see jump from 24MB to 97MB (for 463 blocks info we get from DBS)

This confirms my observation that multi_getdata (generator) function by itself does not lead to any memory increase, and only once we start iterating we get appropriate bump based on number of results we get from generator function. Therefore, usage of multi_getdata ONLY lead to speed up of HTTP calls and total time is equal to average single HTTP request.

cmsdmwmbot · 2024-02-27T18:20:22Z

Jenkins results:

Python3 Unit tests: failed
- 1 new failures
- 1 tests added
Python3 Pylint check: failed
- 4 warnings and errors that must be fixed
- 1 warnings
- 7 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14925/artifact/artifacts/PullRequestReport.html

cmsdmwmbot · 2024-02-27T18:38:03Z

Jenkins results:

Python3 Unit tests: succeeded
- 1 tests added
- 2 changes in unstable tests
Python3 Pylint check: failed
- 2 warnings and errors that must be fixed
- 1 warnings
- 7 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14926/artifact/artifacts/PullRequestReport.html

vkuznet · 2024-02-27T18:45:22Z

Now, it is ready for another round of review.

amaltaro

Thank you for confirming the behavior of multi_getdata() Valentin.
These changes look good to me and I think this closes the loop of partial pileup support in the agent.

vkuznet added DBS Concurrency labels Feb 26, 2024

vkuznet self-assigned this Feb 26, 2024

vkuznet changed the title ~~Fix issue 11899~~ New DBSConcurrency module for concurrent execution of HTTP queries to DBS via pycurl manager Feb 26, 2024

vkuznet force-pushed the fix-issue-11899 branch from fc5494c to a894981 Compare February 26, 2024 18:54

vkuznet requested a review from amaltaro February 26, 2024 21:02

amaltaro reviewed Feb 26, 2024

View reviewed changes

vkuznet force-pushed the fix-issue-11899 branch from bb78c39 to 5612c2a Compare February 27, 2024 13:18

vkuznet requested a review from amaltaro February 27, 2024 13:36

vkuznet added 2 commits February 27, 2024 13:29

Initial implementation of DBSConcurrency module

e16fc94

New unit test for DBSConcurrency module

6a6d5ae

vkuznet force-pushed the fix-issue-11899 branch from 6904a3d to 6a6d5ae Compare February 27, 2024 18:29

amaltaro approved these changes Feb 27, 2024

View reviewed changes

amaltaro merged commit 9a703d6 into dmwm:master Feb 27, 2024
3 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New DBSConcurrency module for concurrent execution of HTTP queries to DBS via pycurl manager #11913

New DBSConcurrency module for concurrent execution of HTTP queries to DBS via pycurl manager #11913

vkuznet commented Feb 26, 2024

cmsdmwmbot commented Feb 26, 2024

cmsdmwmbot commented Feb 26, 2024

vkuznet commented Feb 26, 2024

amaltaro left a comment

vkuznet commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

vkuznet commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

vkuznet commented Feb 27, 2024

amaltaro left a comment

New DBSConcurrency module for concurrent execution of HTTP queries to DBS via pycurl manager #11913

New DBSConcurrency module for concurrent execution of HTTP queries to DBS via pycurl manager #11913

Conversation

vkuznet commented Feb 26, 2024

Status

Description

Is it backward compatible (if not, which system it affects?)

Related PRs

External dependencies / deployment changes

cmsdmwmbot commented Feb 26, 2024

cmsdmwmbot commented Feb 26, 2024

vkuznet commented Feb 26, 2024

amaltaro left a comment

Choose a reason for hiding this comment

vkuznet commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

vkuznet commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

cmsdmwmbot commented Feb 27, 2024

vkuznet commented Feb 27, 2024

amaltaro left a comment

Choose a reason for hiding this comment