Inline cache for slots #87093
Comments
I've been thinking about Python performance improvements, and I played around with an inline cache enhancement that supports slots. The results on a very simple benchmark look promising (30% speedup), but I'm terrible with our benchmarking tools, and this should be considered *very* carefully before we go ahead with it, since it could potentially pessimize the inline cache for standard attributes. I'll attach a PR in a minute.
"Slot" means so many different things in Python. Here it refers to the data descriptors created when you set __slots__ in the class definition. It is amazing that such a large speedup can be achieved for such a basic operation.
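To make the meaning of "slot" here concrete, the following is a minimal sketch of what setting __slots__ does on CPython. The `Point` class and the variable names are made up for illustration and do not come from the PR.

```python
import types


class Point:
    # Hypothetical example class. Declaring __slots__ makes the class
    # create one descriptor per listed name and leaves instances
    # without a per-instance __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1.0, 2.0)

# The slot lives in the class namespace as a descriptor object.
slot_descriptor = Point.__dict__["x"]

# On CPython this is a member descriptor ...
is_member_descriptor = isinstance(slot_descriptor, types.MemberDescriptorType)

# ... and it is a *data* descriptor because it defines both __get__ and __set__.
is_data_descriptor = (
    hasattr(slot_descriptor, "__get__") and hasattr(slot_descriptor, "__set__")
)

# Instances of a fully slotted class have no __dict__.
has_instance_dict = hasattr(p, "__dict__")

# Reading p.x is routed through the descriptor's __get__.
value_via_descriptor = slot_descriptor.__get__(p, Point)

print(is_member_descriptor, is_data_descriptor, has_instance_dict, value_via_descriptor)
```

Because the attribute is reached through a descriptor on the type rather than through an instance dictionary, a cache designed around dictionary lookups would not cover this access path by default.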
I will try to run the pyperformance test suite today to see whether there is any impact on standard attribute access.
These are the results:
venv ❯ python -m pyperf compare_to json_old/* -G --min-speed=2 --table
Benchmark hidden because not significant (50): sympy_sum, sqlalchemy_imperative, sympy_str, sympy_integrate, dulwich_log, scimark_fft, genshi_text, tornado_http, regex_dna, sqlalchemy_declarative, mako, meteor_contest, unpickle_pure_python, xml_etree_parse, genshi_xml, scimark_sor, sqlite_synth, pickle_pure_python, nbody, pickle_dict, pyflate, regex_effbot, xml_etree_process, logging_simple, python_startup, 2to3, fannkuch, python_startup_no_site, raytrace, go, hexiom, scimark_lu, json_dumps, richards, logging_format, xml_etree_generate, chaos, telco, pickle_list, unpack_sequence, regex_compile, django_template, json_loads, crypto_pyaes, xml_etree_iterparse, unpickle_list, nqueens, chameleon, scimark_sparse_mat_mult, pickle
I uploaded the pyperf result data to bpo as well.
So it seems that everything is within the noise range except the "float" benchmark, which is 1.11x faster.
Some microbenchmarks: CURRENT MASTER: Variable and attribute read access: Variable and attribute read access:
Yeah, this is why. https://github.com/python/pyperformance/blob/master/pyperformance/benchmarks/bm_float.py#L12 This is a great result, IMO. I'm +1 to merge this.
Can you add a new one
Same here. It would be good if Inada-san could confirm the benchmarks.
That should be "read_instancevar_slots". Or are you referring to some other check?
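For readers who want to try a comparable measurement themselves, here is a rough sketch of a slot-read microbenchmark using only the standard library `timeit` module. It is not the script used in this thread; the class names and the `time_attribute_read` helper are invented for illustration, and the numbers will vary by machine and interpreter version.

```python
import timeit


class WithSlots:
    # Hypothetical class whose attribute is stored in a slot.
    __slots__ = ("x",)

    def __init__(self):
        self.x = 1


class WithDict:
    # Hypothetical class whose attribute is stored in the instance __dict__.
    def __init__(self):
        self.x = 1


def time_attribute_read(obj, number=200_000, repeat=5):
    """Return the best observed time in seconds for a single `obj.x` read."""
    timer = timeit.Timer("obj.x", globals={"obj": obj})
    # Take the minimum over several runs to reduce scheduling noise.
    return min(timer.repeat(repeat=repeat, number=number)) / number


slots_time = time_attribute_read(WithSlots())
dict_time = time_attribute_read(WithDict())

print(f"slot read:          {slots_time * 1e9:.1f} ns")
print(f"instance dict read: {dict_time * 1e9:.1f} ns")
```

Running this on an interpreter built with and without the patch would give a quick first impression, but a `timeit` loop like this is noisy, so the pyperformance results remain the better basis for a decision.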
+1 Thanks for this.
Thanks for all the positive feedback! What is the next step?
I would say just finishing the PR (making sure that we do not miss some arcane edge case) and updating the What's New for 3.10 :)
I created a benchmark for this, see python/pyperformance#86. Next I'll add a blurb and then it's off to reviewers.