Verify the etree_parse and etree_iterparse benchmarks are working appropriately #69824
Comments
If you look at bit.ly/pycon-ca-keynote you will notice that the etree_parse and etree_iterparse benchmarks were horrible for everyone. Because of how badly everyone seemed to do, I think the benchmarks should be verified to be doing reasonable things on implementations other than CPython 2.7. |
Would you have a quick summary for those not willing to watch a whole keynote? |
That link is to a Jupyter notebook so you don't have to watch anything. Plus the video is not even up yet so you can't skip the keynote even if you wanted to since you can't watch it yet. :) |
Ok, so when you say "horrible for everyone", this is really IronPython and Jython, right? :-) Other runtimes seem to do ok (perhaps not stellar, but ok). |
Well, Jython and IronPython obviously did the worst, but even Python 3 didn't do as well as I would have expected, so I still want to double-check the benchmarks to see if it's obvious why CPython 2.7 beats out everyone. |
I think these histograms would look better with logarithmic scale. |
Let's not pollute the issue with a critique of my notebook. You can feel free to email me personally to discuss it if you want, including why I purposefully didn't use a logarithmic scale. |
Sorry Brett.
How were the tests run? There are two implementations of ElementTree, accelerated and non-accelerated. xml.etree.ElementTree is accelerated by default in Python 3, but non-accelerated in Python 2.
$ python2.7 bm_elementtree.py -n 7 --take_geo_mean
0.463665158795
$ python2.7 bm_elementtree.py -n 7 --take_geo_mean --etree-module=xml.etree.ElementTree
5.46309932568
$ python3.4 bm_elementtree.py -n 7 --take_geo_mean --etree-module=xml.etree.ElementTree
0.813397633467649
$ python3.4 bm_elementtree.py -n 7 --take_geo_mean --etree-module=xml.etree.ElementTree --no-accelerator
5.31174765817514
If you run the test with the same option --etree-module=xml.etree.ElementTree, it uses the accelerated implementation in Python 3 and the non-accelerated one in Python 2. |
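To see which implementation a given interpreter actually picks up, a minimal sketch like the following may help. It assumes CPython, where the C accelerator lives in a private module named `_elementtree`; the helper name `is_accelerated` is hypothetical and not part of the benchmark suite.

```python
import xml.etree.ElementTree as ET


def is_accelerated():
    """Return True if xml.etree.ElementTree re-exports the C parser.

    Assumption: CPython ships the accelerator as the private module
    `_elementtree`; other implementations may not have it at all.
    """
    try:
        import _elementtree  # private C accelerator module, may be absent
    except ImportError:
        return False
    # In Python 3 the pure-Python module replaces its own classes with the
    # C ones when the import succeeds, so the two names are the same object.
    return ET.XMLParser is _elementtree.XMLParser


print("accelerated:", is_accelerated())
```

On Python 2 the same check would be expected to report False for xml.etree.ElementTree, since the accelerated code there is exposed separately as xml.etree.cElementTree.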
The commands I used are in the notebook for each implementation and you can get the same result with |
The slowdown in Python 3 may be related to the addition of XMLPullParser (bpo-17741). |
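For readers unfamiliar with XMLPullParser, here is a small sketch of its non-blocking feed/read_events interface (available in Python 3.4 and later). The helper name `pull_events` and the sample chunks are illustrative assumptions; the sketch shows the public API only, not how iterparse uses it internally.

```python
from xml.etree.ElementTree import XMLPullParser


def pull_events(chunks, events=("start", "end")):
    """Feed XML text in chunks and collect (event, tag) pairs."""
    parser = XMLPullParser(events=events)
    collected = []
    for chunk in chunks:
        parser.feed(chunk)
        # read_events() yields whatever events are ready so far.
        for event, elem in parser.read_events():
            collected.append((event, elem.tag))
    parser.close()
    # Drain anything that only became available once the parser was closed.
    for event, elem in parser.read_events():
        collected.append((event, elem.tag))
    return collected


# The input is deliberately split in the middle of a tag.
result = pull_events(["<root><item", " id='1'/>", "</root>"])
print(result)
```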
The proposed patch optimizes iterparse(). It is now only 33% slower than in 2.7 (it was 2.6 times slower). |
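A rough way to compare parse() and iterparse() on one interpreter, independent of the benchmark suite, is sketched below. The document shape, the sizes, and the helper names (`build_xml`, `run_parse`, `run_iterparse`) are assumptions for illustration, and the timings are not comparable with the bm_elementtree.py numbers.

```python
import io
import timeit
import xml.etree.ElementTree as ET


def build_xml(n_items=1000):
    """Build a small synthetic document and return it serialized."""
    root = ET.Element("root")
    for i in range(n_items):
        item = ET.SubElement(root, "item", {"id": str(i)})
        item.text = "value %d" % i
    return ET.tostring(root)


def run_parse(data):
    """Parse the whole document at once and return the root element."""
    return ET.parse(io.BytesIO(data)).getroot()


def run_iterparse(data):
    """Parse incrementally and count the 'end' events seen."""
    count = 0
    for _event, _elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        count += 1
    return count


data = build_xml()
# Take the best of a few repeats to reduce noise from the machine.
t_parse = min(timeit.repeat(lambda: run_parse(data), number=20, repeat=3))
t_iter = min(timeit.repeat(lambda: run_iterparse(data), number=20, repeat=3))
print("parse:     %.4f s" % t_parse)
print("iterparse: %.4f s" % t_iter)
```

Running the same script under different interpreters gives a quick sanity check of the relative cost of the two code paths.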
Updated to tip. |
Serhiy's latest patch LGTM. |
Thank you for your review, Brett. Before applying this optimization I want to fix the error propagation issue (bpo-25814). The patch for it is mostly a simplified part of the patch for this issue. |
New changeset dd67c8c53aea by Serhiy Storchaka in branch 'default': |
The iterparse benchmark in 3.6 is still 30% slower than in 2.7. The parse benchmark is 70% slower. Hence there are other causes of the slowdown. One of them is that in 3.x an empty dict, instead of None, is passed to the start handler as the attrib parameter if the start tag has no attributes. This makes parsing about 10% slower. |
The following patch speeds up ElementTree parsing (the result of the etree parse benchmark improves by 10%). It actually restores the 2.7 code and avoids creating an empty dict for attributes when it is not needed. |
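The cost of allocating a fresh empty dict per start tag can be illustrated in isolation. The sketch below is only a micro-benchmark under stated assumptions: the `start` function is a hypothetical stand-in for a start handler, not the real parser code, and the measured difference says nothing precise about the 10% figure quoted above.

```python
import timeit
import xml.etree.ElementTree as ET

# Elements without attributes still expose an (empty) dict through the API.
elem = ET.fromstring("<a><b/></a>")
print(elem.attrib, elem[0].attrib)


def start(tag, attrib):
    """Hypothetical stand-in for a start handler; it ignores attrib."""
    return tag


# Each call in the first timing allocates a new empty dict;
# the second one passes the None singleton instead.
t_dict = timeit.timeit(lambda: start("item", {}), number=200000)
t_none = timeit.timeit(lambda: start("item", None), number=200000)
print("empty dict: %.4f s" % t_dict)
print("None:       %.4f s" % t_none)
```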
New changeset 1fe904420c20 by Serhiy Storchaka in branch 'default': |
Thank you for your review, Brett. Now the parse benchmark in 3.6 is only 50% slower than in 2.7. I will keep looking for bottlenecks. |
I am not able to find the cause of the remaining slowdown. I think this issue can be closed now. The etree_parse and etree_iterparse benchmarks are working appropriately and show a real regression in CPython 3.x. The cause of the regression is not known. |