Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[py2py3] fix unittests from WMCore/Datastructs in py3 #10562

Merged
merged 2 commits into from
Jun 3, 2021

Conversation

mapellidario
Copy link
Member

@mapellidario mapellidario commented Jun 1, 2021

Fixes #10531

Status

In development

Description

Changes

  • Again, another adjustment to how WMCore.DataStructs.Run.Runs are compared (i know that the motivations behind these changes may not be self-evident, I made a comment below with some investigations to justify these changes)
  • pickle.dump() and pickle.load() require the file object to be opened in bytes mode also in py3. (JobPackage.py)
  • minor changes to the unittests (usual assertItemsEqual -> assertCountEqual))

Is it backward compatible (if not, which system it affects?)

yes

Related PRs

This is a follow-up of #10012

External dependencies / deployment changes

nope

@cmsdmwmbot

This comment has been minimized.

@mapellidario
Copy link
Member Author

mapellidario commented Jun 2, 2021

Making Run.__lt__() compatible with py3 required some exploration of how dictionary comparison works in py2 and replicating it explicitly in py3.

proposed changes

This is the implementation of Run sorting (-: py2-only, +: is the py3 compatible version prposed in this PR)

[EDIT] : reflects py2py3-compatible-v1, as discussed in later comments)

    def __lt__(self, rhs):
        """
        Compare on run # first, then by lumis as a list is compared
        """
        if self.run != rhs.run:
            return self.run < rhs.run
        if sorted(self.eventsPerLumi.keys()) != sorted(rhs.eventsPerLumi.keys()):
            return sorted(self.eventsPerLumi.keys()) < sorted(rhs.eventsPerLumi.keys())
-        return self.eventsPerLumi < rhs.eventsPerLumi # this line breaks in py3
+        for self_key, rhs_key in zip(sorted(self.eventsPerLumi), sorted(rhs.eventsPerLumi)):
+            if self.eventsPerLumi[self_key] == rhs.eventsPerLumi[rhs_key]:
+                continue
+            else: 
+                return self.eventsPerLumi[self_key] < rhs.eventsPerLumi[rhs_key]
+        return False

py2 dictionaries comparisons

I paste here my findings for future reference

import sys
print sys.version_info # sys.version_info(major=2, minor=7, micro=16, releaselevel='final', serial=0)

d0 = {1: 11, 2: 22, 3: 33}

# equal
d1 = {1: 11, 2: 22, 3: 33}

# same keys, diff content
d2 =  {1: 11, 2: 22, 3: 32}
d3 =  {1: 11, 2: 22, 3: 34}
d4 = {1: 11, 2: 21, 3: 33}
d5 = {1: 11, 2: 21, 3: 34}
d6 = {1: 11, 2: 23, 3: 32}
d7 = {1: 11, 2: 23, 3: 34}

# one added key OR one missing key
d8 = {1: 11, 2: 22, 3: 33, 4: 44}
d9 = {1: 11, 2: 21, 3: 33, 4: 44}
d10 = {1: 11, 2: 23, 3: 33, 4: 44}
d11 = {1: 11, 2: 22}
d12 = {1: 11, 2: 21}
d13 = {1: 11, 2: 23}

# two missing keys AND one added key
d14 = {1: 11, 4: 22}
d15 = {1: 11, 4: 21}
d16 = {1: 11, 4: 23}

# one missing key AND one added key
d17 = {1: 11, 2: 22, 4: 32}
d18 = {1: 11, 2: 22, 4: 33}
d19 = {1: 11, 2: 22, 4: 44}
d20 = {1: 11, 2: 21, 4: 44}
d21 = {1: 11, 2: 23, 4: 44}

def cmp_py2(d1):
    cmppy2 = [d0 < d1, d0 <= d1, d0 == d1, d0 != d1, d0 > d1, d0 >= d1]
    return cmppy2 == cmp_key(d1), cmppy2, cmp_key(d1)

def cmp_key(d1):
    k0 = sorted(d0.keys())
    k1 = sorted(d1.keys())
    return [k0 < k1, k0 <= k1, k0 == k1, k0 != k1, k0 > k1, k0 >= k1]

print 
print cmp_py2(d1)   # (True, [False, True, True, False, False, True], [False, True, True, False, False, True])
print 
print cmp_py2(d2)   # (False, [False, False, False, True, True, True], [False, True, True, False, False, True])
print cmp_py2(d3)   # (False, [True, True, False, True, False, False], [False, True, True, False, False, True])
print cmp_py2(d4)   # (False, [False, False, False, True, True, True], [False, True, True, False, False, True])
print cmp_py2(d5)   # (False, [False, False, False, True, True, True], [False, True, True, False, False, True])
print cmp_py2(d6)   # (False, [True, True, False, True, False, False], [False, True, True, False, False, True])
print cmp_py2(d7)   # (False, [True, True, False, True, False, False], [False, True, True, False, False, True])
print 
print cmp_py2(d8)   # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])
print cmp_py2(d9)   # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])
print cmp_py2(d10)  # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])
print cmp_py2(d11)  # (True, [False, False, False, True, True, True], [False, False, False, True, True, True])
print cmp_py2(d12)  # (True, [False, False, False, True, True, True], [False, False, False, True, True, True])
print cmp_py2(d13)  # (True, [False, False, False, True, True, True], [False, False, False, True, True, True])
print  
print cmp_py2(d14)  # (False, [False, False, False, True, True, True], [True, True, False, True, False, False])
print cmp_py2(d15)  # (False, [False, False, False, True, True, True], [True, True, False, True, False, False])
print cmp_py2(d16)  # (False, [False, False, False, True, True, True], [True, True, False, True, False, False])
print  
print cmp_py2(d17)  # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])
print cmp_py2(d18)  # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])
print cmp_py2(d19)  # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])
print cmp_py2(d20)  # (False, [False, False, False, True, True, True], [True, True, False, True, False, False])
print cmp_py2(d21)  # (True, [True, True, False, True, False, False], [True, True, False, True, False, False])

from d2 to d7

froom d2 to d7 are what we should focus on, because in py2 we got to return self.eventsPerLumi < rhs.eventsPerLumi if and only if the two runs have the same list of lumi-sections. This is what does not work in py3 and that we have to replicate in py3.

In particular, we should notice that cmp_py2(d4) gives the same result as cmp_py2(d5), while cmp_py2(d6) gives the same result as cmp_py2(d7). This means that py2 sorts the keys, then compares the values key by key, starting from the lower key.

This is tested in Run_t.py:RunTest.testC with runs from run20 to run26.

from d8 to d21

I explored what happens to dictionary sorting in py2 when the dictionaries have different keys. Some weird stuff happen, for example with d0 and d16.

However, if we consider only what happens when comparing the keys of the dictionaries, then nothing out of the ordinary comes up. This is tested in Run_t.py:RunTest.testC with runs from run7 to run15.

@cmsdmwmbot

This comment has been minimized.

@cmsdmwmbot

This comment has been minimized.

@vkuznet
Copy link
Contributor

vkuznet commented Jun 2, 2021

@mapellidario two issues here:

  • performance impact when dicts are large, did you benchmark how much time such comparison will take if dicts has lots of keys and the size of dicts is large?
  • another way to comparing dicts (or its parts) is to take a hash from dict representation (but it requires an ordered list of keys anyway). Here is again the performance matter.

@mapellidario
Copy link
Member Author

mapellidario commented Jun 2, 2021

@vkuznet I did not make any benchmark for a couple of reasons.

1: is this really a problem?

These are the changes to be applied here.

      def __lt__(self, rhs):
          if self.run != rhs.run:
              return self.run < rhs.run
          if sorted(self.eventsPerLumi.keys()) != sorted(rhs.eventsPerLumi.keys()):
              return sorted(self.eventsPerLumi.keys()) < sorted(rhs.eventsPerLumi.keys())
-        return self.eventsPerLumi < rhs.eventsPerLumi
+        for (_, self_events), (_, rhs_events) in zip(sorted(viewitems(self.eventsPerLumi)), sorted(viewitems(rhs.eventsPerLumi))):
+            if self_events == rhs_events:
+                continue
+            else: 
+                return self_events < rhs_events
+        return False

What I am adding, is

  • sorting two dictionaries by values. This means getting two lists of (key, value) tuples. Since in a python dictionary the keys are unique, we only access the first item of each tuple for sorting the list of items of the dictionaries. In any case, value is an int (if i am not mistaken), nothing huge that we move around.
  • for every element, there is a O(1) operation

I do not consider this to be terrible, since a couple of lines above we already sort the keys of the dictionaries. Twice per each dictionary, without even caching the result. If it were paramount to keep the cpu time low, i would expect that we at least cache the list of sorted keys.

In any case, if we hit the part that I added, we already sorted each dictionary twice, so in the worst case scenario it would be a +100% [edited] in cpu time, assuming that py2 implementation of dict1 < dict2 is instantaneous (hardly to be the case). I can make a quick benchmark to make sure that my approach is not terribly slower than dict1 < dict2 to improve this worst case scenario estimate.

However, how often does it happen that we compare two Runs with the same list of lumi sections? This is the real question that we should should ask ourselves and I feel like I am not in the best place to answer this question. Every synthetic benchmark that I can write will never be production-like.

2: dmwm/WMCore priorities

Our priority is to make dmwm/WMCore work in py3, even if this comes with performance drawbacks. If you are worried about the performances, we can open an issue and have a second look at this very changes at a later stage.

@cmsdmwmbot

This comment has been minimized.

@mapellidario
Copy link
Member Author

Valentin, you planted a seed in the back of my mind and by the end of the workday it grew into a full plant. I had to make a synthetic benchmark to measure this.

The benchmark is in my branch 10531-fix-benchmark, which contains a couple of commits more than the branch used in this PR (with both what we have in master and my proposed changes, and with the benchmark itself): mapellidario/WMCore@10531-fix...10531-fix-benchmark .

Comparing ten times two Runs with 300_000 lumi sections (each with a random number of events between 1 and 1e9) with the comparison we have now (py2-compatible) and my current approach (py2py3-compatible-v0) takes

(results in seconds) python2 python3
py2-compatible 4.5 -
py2py3-compatible-v0 20 10

You were right, my approach is significantly slower 😬 . I will go back to the drawing board and think about something new.

Thanks for pointing this out!

@cmsdmwmbot

This comment has been minimized.

@vkuznet
Copy link
Contributor

vkuznet commented Jun 2, 2021

I doubt that you'll achieve decent performance with python itself, what you'll realize at the end is necessity to write C wrapper to battle such problems. I dealt with many dict issues many years ago when DAS was written in python. I even wrote few C-wrappers which you may find here and associated C-wrapper. At the end, I gave up with python and non-structured (no schema) CMS data and switched to GoLang for everything. Since WMCore deals with many dicts and there are a lot of non-structured data (where quite often there is no schema in dicts since keys are not defined but dynamically assigned) you'll always struggle with this and other issues because of quite large size of dicts. And, the only solution for you would be to start learning and writing C-wrappers for Python functions (which by itself far from trivial).

My suggestion still stays, try to create a hash of the dict and compare the hashes. To speed things up you may look at DAS code which provides genkey and das_hash C-wrapper. You may generate a hash of the dict using this method and compare the two.

@mapellidario
Copy link
Member Author

With the new approach [1] (let's call it py2py3-compatible-v1) I get

(results in seconds, slower = better) python2 python3
py2-compatible 4.5 -
py2py3-compatible-v1 8.5 3.7

This is in line with the +100% for the worst case scenario I had in mind earlier. ( The problem was that sorted(dict.values()) is way slower than sorted(dict.keys()), shame on me that I assumed they were similar in our case. ).

However, the performance drop appears in py2 only! The new approach in py3 is faster than the old approach in py2, so I would thank the python core developers and call this a win.

[1]

    def __lt__(self, rhs):
        ....
        for self_key, rhs_key in zip(sorted(self.eventsPerLumi), sorted(rhs.eventsPerLumi)):
            if self.eventsPerLumi[self_key] == rhs.eventsPerLumi[rhs_key]:
                continue
            else: 
                return self.eventsPerLumi[self_key] < rhs.eventsPerLumi[rhs_key]
        ...

@mapellidario
Copy link
Member Author

My suggestion still stays, try to create a hash of the dict and compare the hashes. To speed things up you may look at DAS code which provides genkey and das_hash C-wrapper. You may generate a hash of the dict using this method and compare the two.

Thank you Valentin for the suggestion! I do agree with everything you just said, however, with the time scale that we are dealing with at the moment, I think we just have to leave with this at the moment. I would leave to Kevin or Alan the decision of whether we should invest time in improving the performances of Run.py. Perfect topic for discussion for the monday meeting.

@klannon
Copy link

klannon commented Jun 2, 2021

@mapellidario, I would agree with both your focus on completing an implementation before optimizing, and also on your conclusion that since the Py3 version is comparable (even slightly better) than the existing Py2 version, it's good enough to move forward and FINISH the Py3 migration. Once we complete the Py3 migration, we can add to the brainstorming list an analysis of code performance to identify in which components optimization the benefits of optimization would outweigh the costs.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python2 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Unit tests: failed
    • 1 new failures
    • 9 tests no longer failing
    • 6 changes in unstable tests
  • Python2 Pylint check: failed
    • 13 warnings and errors that must be fixed
    • 2 warnings
    • 30 comments to review
  • Python3 Pylint check: failed
    • 13 warnings and errors that must be fixed
    • 2 warnings
    • 34 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 13 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/11870/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the very detailed description and (performance) checks, Dario. This code looks good to me and it's ready to go. But I left a couple of comments along the code in case you want to have one last look into it.
I'm happy to merge it and move forward as well, just let me know.

test/python/WMCore_t/DataStructs_t/File_t.py Outdated Show resolved Hide resolved
src/python/WMCore/DataStructs/Run.py Outdated Show resolved Hide resolved
@mapellidario
Copy link
Member Author

I updated the code with @amaltaro's suggestions. I already squashed the commits so that this is ready to be merged (I promise I haven't added anything that wasn't' suggested/approved 😏 )

With Alan's suggestion (py2py3-compatible-v2), we got a further boost

(results in seconds, slower = better) python2 python3
py2-compatible 4.5 -
py2py3-compatible-v2 6.0 2.7
py2py3-compatible-v1 8.5 3.7
py2py3-compatible-v0 20 10

Thanks @vkuznet for raising the concern in the first place and thanks Alan for providing the final implementation!

@cmsdmwmbot
Copy link

Jenkins results:

  • Python2 Unit tests: succeeded
  • Python3 Unit tests: succeeded
    • 5 tests no longer failing
    • 4 changes in unstable tests
  • Python2 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 2 warnings
    • 20 comments to review
  • Python3 Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 2 warnings
    • 20 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 11 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/11872/artifact/artifacts/PullRequestReport.html

@amaltaro
Copy link
Contributor

amaltaro commented Jun 3, 2021

Nicely done, Dario! Thanks

@amaltaro amaltaro merged commit 81c13a6 into dmwm:master Jun 3, 2021
@mapellidario mapellidario deleted the 10531-fix branch March 10, 2023 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[py2py3] Check unittests from WMCore/Datastructs in py3
5 participants