Dictionary __getitem__ optimization #899

Merged
merged 3 commits into pybee:master on Aug 9, 2018

2 participants
@patiences
Contributor

patiences commented Aug 8, 2018

Currently, Dict.get(x) calls __hash__() only to determine whether x is hashable; the returned value itself is never used. Instead of using __hash__() as a hashability test, we could use a separate method that avoids creating the hashcode objects. Because hashcodes are large integers, they don't get cached and have to be created every time __hash__() is called.
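
To illustrate the idea in plain Python (this is not the code changed in this PR, just a sketch of the same trade-off): hashability can usually be decided from the object's type, without computing a hash. The names `CountingKey`, `is_hashable_by_hashing`, and `is_hashable_by_type` are hypothetical and exist only for this example.

```python
from collections.abc import Hashable


class CountingKey:
    """Hypothetical key type that counts how often __hash__ is invoked."""

    hash_calls = 0

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.name)


def is_hashable_by_hashing(obj):
    """Test hashability by computing the hash and discarding the result."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


def is_hashable_by_type(obj):
    """Test hashability by checking that the type defines __hash__.

    No hash value is computed, so no hashcode object is created.
    """
    return isinstance(obj, Hashable)


key = CountingKey("a")
is_hashable_by_hashing(key)   # invokes CountingKey.__hash__ once
is_hashable_by_type(key)      # does not invoke __hash__ at all
print(CountingKey.hash_calls)
```

One caveat of the type-based check: it only looks at whether the type provides `__hash__`, so an object such as a tuple containing a list passes the check even though hashing it would fail. Whether that matters depends on where the real hash is computed afterwards.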

Performance results:

On Pystone:

Without optimization (current master)

test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 18.2472
This machine benchmarks at 2740.15 pystones/second
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 18.7644
This machine benchmarks at 2664.63 pystones/second
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 18.2185
This machine benchmarks at 2744.46 pystones/second

With optimization

test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 10.9498
This machine benchmarks at 4566.29 pystones/second
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 11.2300
This machine benchmarks at 4452.37 pystones/second
test_pystone (tests.test_pystone.PystoneTest) ... Pystone(1.2) time for 50000 passes = 10.9833
This machine benchmarks at 4552.36 pystones/second

That's roughly a 67% increase in pystones/second (about 40% less time per run)! Apparently pystone does quite a lot of dictionary access.

On benchmarking test:

Without optimization

Running test_dictionary_get
  Elapsed time:  55.72767485608347  sec
  CPU process time:  0.00013799999999997148  sec
Running test_dictionary_get
  Elapsed time:  55.894008523086086  sec
  CPU process time:  0.00011600000000000499  sec
Running test_dictionary_get
  Elapsed time:  56.72291784896515  sec
  CPU process time:  0.0011000000000000454  sec

With optimization

Running test_dictionary_get
  Elapsed time:  47.00173879181966  sec
  CPU process time:  0.00012499999999998623  sec
Running test_dictionary_get
  Elapsed time:  47.65469679213129  sec
  CPU process time:  0.00012100000000003774  sec
Running test_dictionary_get
  Elapsed time:  48.37633503694087  sec
  CPU process time:  0.00014999999999998348  sec

About a 15% reduction in elapsed time.
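
For anyone who wants to check the percentages, here is a small sketch that recomputes them from the runs listed above. The helper name `percent_change` is made up for this example; it compares the mean of the three runs before and after the change.

```python
from statistics import mean

# Pystone throughput (pystones/second), three runs each, copied from above.
pystone_before = [2740.15, 2664.63, 2744.46]
pystone_after = [4566.29, 4452.37, 4552.36]

# test_dictionary_get elapsed time (seconds), three runs each, copied from above.
get_before = [55.72767485608347, 55.894008523086086, 56.72291784896515]
get_after = [47.00173879181966, 47.65469679213129, 48.37633503694087]


def percent_change(before, after):
    """Relative change of the mean, in percent (positive means an increase)."""
    base = mean(before)
    return (mean(after) - base) / base * 100


pystone_gain = percent_change(pystone_before, pystone_after)
elapsed_change = percent_change(get_before, get_after)

print(f"pystones/second: {pystone_gain:+.1f}%")
print(f"test_dictionary_get elapsed time: {elapsed_change:+.1f}%")
```

Note the two figures measure different things: the pystone number is a throughput increase, while the benchmark number is a reduction in elapsed time.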

@patiences patiences changed the title from [WIP] Dictionary __getitem__ optimization to Dictionary __getitem__ optimization Aug 8, 2018

@freakboy3742

👍

@freakboy3742 freakboy3742 merged commit 7a9824e into pybee:master Aug 9, 2018

5 checks passed

beekeeper:0/beefore:javacheckstyle Java lint checks passed.
beekeeper:0/beefore:pycodestyle Python lint checks passed.
beekeeper:1/smoke-test Smoke build (Python 3.4) passed.
beekeeper:2/full-test:py3.5 Python 3.5 tests passed.
beekeeper:2/full-test:py3.6 Python 3.6 tests passed.