add performance tests #9
I could imagine implementing these tests by using a profiler to count the total number of bytecode instructions required when using bidict to accomplish some basic tasks, and then comparing this to using two dicts kept in sync manually (the baseline). (Wall clock times could be measured too.) The tests could then either pass or fail depending on whether bidict's results are within some threshold of the baseline results. Running these tests could also generate a more detailed report showing how much a change to the code improves or degrades performance on the tests. Any suggestions on how to add something like this to our test suite, @tomviner?
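For illustration, here's a minimal sketch of the wall-clock side of this comparison, using `timeit` from the standard library. The `TwoWayDict` class and all function names here are hypothetical stand-ins, not bidict's actual code; the baseline is the "two dicts kept in sync manually" approach described above.

```python
import timeit

# Baseline: two plain dicts kept in sync manually, inline.
def baseline_insert(n):
    fwd, inv = {}, {}
    for i in range(n):
        fwd[i] = str(i)
        inv[str(i)] = i
    return fwd, inv

# Hypothetical stand-in for a bidirectional mapping behind a method
# call (the real bidict API differs).
class TwoWayDict:
    def __init__(self):
        self.fwd = {}
        self.inv = {}

    def put(self, key, val):
        self.fwd[key] = val
        self.inv[val] = key

def wrapped_insert(n):
    d = TwoWayDict()
    for i in range(n):
        d.put(i, str(i))
    return d

# Compare total wall-clock time over repeated runs.
baseline_t = timeit.timeit(lambda: baseline_insert(1000), number=100)
wrapped_t = timeit.timeit(lambda: wrapped_insert(1000), number=100)
print(f"wrapped/baseline time ratio: {wrapped_t / baseline_t:.2f}")
```

A pass/fail test could then assert that the ratio stays under some agreed threshold.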
Would also be cool to see how performance on PyPy compares to CPython.
Put up a general note about performance at https://bidict.readthedocs.org/en/master/performance.html (in part to address some more or less well-founded concerns from potential users -- see here and #1 + #1 (comment)). Once we have performance tests running as part of CI, I can link to our results from those places.
You can just use pytest-benchmark for performance tests.
Thanks for the tip, @thedrow. Looking forward to checking that out.
You can see an example of the results I'm getting e.g. here. /cc @lordmauve in case this is of interest
Yes. Looks correct.
Thanks for taking a look @thedrow. Before I profile, I'd first like to adapt the current benchmarks to demonstrate the performance not just of a 10-item bidict, but rather the asymptotic behavior as the number of elements increases. I'd like to confirm that bidict's space and time complexity is on the same order as that of manually keeping two inverse dicts in sync. Please let me know if you have any other suggestions, and thanks again!
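A rough way to eyeball that asymptotic behavior is to time inserts at doubling sizes and check that the time roughly doubles too. This sketch (all names hypothetical) does that for the two-dict baseline; the same harness could be pointed at a bidict by swapping in a different factory and put function.

```python
import timeit

def time_inserts(n, factory, put):
    # Time inserting n items into a fresh mapping, averaged over runs.
    def run():
        m = factory()
        for i in range(n):
            put(m, i, str(i))
    return timeit.timeit(run, number=5)

def two_dict_factory():
    # Baseline mapping: a (forward, inverse) pair of plain dicts.
    return ({}, {})

def two_dict_put(m, k, v):
    fwd, inv = m
    fwd[k] = v
    inv[v] = k

for n in (1_000, 2_000, 4_000, 8_000):
    t = time_inserts(n, two_dict_factory, two_dict_put)
    print(f"n={n:>5}  t={t:.4f}s")
# If per-item cost is O(1), each doubling of n should roughly
# double t; superlinear growth would show up as a growing ratio.
```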
You can check out the latest work on this here: check out the "benchmark" branch on Travis for the latest benchmark results. With these, I was able to make some micro-optimizations. The benchmarks are currently pretty crude, and I have doubts about how representative they are of real-world workloads (not to mention pytest-benchmark advises against running in a VM, which there's no way around on Travis). I'd love to work together with someone to check over and improve on this work. @lordmauve, would love your take on this. Anyone interested, please ping me on Gitter and I'll look forward to getting these landed.
Function call overhead is pretty high in Python. Maintaining two dicts manually, inline in your own code, would be pretty hard to beat due to the elimination of function call overhead. We're unlikely to get down to comparable performance without a C implementation. However, that's not to say that big improvements can't be made. The overhead of bidict for reads could be minimal. Inlining _put would help for writes.
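The function-call overhead point is easy to demonstrate directly. In this sketch (names are illustrative), the same two-dict writes are done once through a helper function and once inlined in the loop; the only difference between the two timings is the per-item call.

```python
import timeit

fwd, inv = {}, {}

def put(key, val):
    fwd[key] = val
    inv[val] = key

def via_call(n):
    # One Python function call per item.
    for i in range(n):
        put(i, i)

def inlined(n):
    # Same dict writes with no per-item function call.
    for i in range(n):
        fwd[i] = i
        inv[i] = i

t_call = timeit.timeit(lambda: via_call(10_000), number=20)
t_inline = timeit.timeit(lambda: inlined(10_000), number=20)
# Call overhead usually makes this ratio noticeably greater than 1.
print(f"call/inline time ratio: {t_call / t_inline:.2f}")
```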
Thanks for taking a look, @lordmauve. I think the read performance is already good enough. As for the write performance, inlining _put would help. We could also consider adding a way to perform a bulk update that ignores new mappings that would clobber existing keys or values, rather than either allowing them to succeed or causing the entire update to fail. Does anyone think that would be worth it? Could you please take a look at #33 and let me know what you think? I also left some questions in the benchmark tests that I'd love some comments on. Thanks in advance, and hope you find these latest changes helpful.
I just switched from the "overwrite" flags to the "collision behavior" abstraction, allowing for a third option besides "overwrite" or "raise": "ignore". PTAL! a039c42
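To illustrate the "collision behavior" idea, here is a minimal sketch of a two-way mapping whose put accepts overwrite, raise, or ignore behavior. The class, enum, and method names are hypothetical; bidict's real API differs.

```python
from enum import Enum

class OnCollision(Enum):
    OVERWRITE = "overwrite"
    RAISE = "raise"
    IGNORE = "ignore"

class TinyBidict:
    def __init__(self):
        self.fwd = {}
        self.inv = {}

    def put(self, key, val, on_collision=OnCollision.RAISE):
        # A collision is any new mapping that would clobber an
        # existing key or value.
        if key in self.fwd or val in self.inv:
            if on_collision is OnCollision.RAISE:
                raise KeyError(f"({key!r}, {val!r}) collides")
            if on_collision is OnCollision.IGNORE:
                return
            # OVERWRITE: remove any mappings the new one displaces,
            # keeping fwd and inv consistent with each other.
            if key in self.fwd:
                del self.inv[self.fwd.pop(key)]
            if val in self.inv:
                del self.fwd[self.inv.pop(val)]
        self.fwd[key] = val
        self.inv[val] = key
```

A bulk update could then thread the same `on_collision` argument through each per-item put (with extra care to keep the whole update atomic on RAISE).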
As of f6227e9, got the test_bidict_init benchmark down to only ~3.5x slower on average than the test_invdict_init benchmark -- down from ~13x slower before. This is the fastest it's ever been, even compared to before the new atomicity guarantees were added. ✨ 🚀 ✨
Fixed in 0.12.0. |
...to verify that bidict performance is comparable to manually managing two dicts and catch any performance regressions.