-
Notifications
You must be signed in to change notification settings - Fork 225
chore: Replace isinstance(obj, T) with type(obj) is T comparisons #1292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
/ok to test |
|
/ok to test 0db38d0 |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great work. I downloaded your script and was able to reproduce a similar result (5.23x faster) on my laptop. The math looks sound. Just to make sure, I replaced your manual calculations with Python's builtin timeit.timeit and pyperf (the latter being the sort of "gold standard" for accurate perf timings in Python). But the result is all roughly the same, and 5x is large enough that that level of accuracy doesn't really matter -- it's an obvious big win.
My only concern with this PR is backward compatibility. It is technically possible to subclass either a numpy or ctypes datatype right now and it would be accepted and work here with the isinstance check but would no longer be accepted after this change. I don't know how often that actually happens in practice, and our test suite obviously doesn't do that. I'm not sure how to assess how much we care about this -- it seems hard to do a GitHub code search for, for example. @leofang, thoughts?
If we determine we do want to be strict about backward compatibility, we could probably do:
if arg_type is ctypes_bool:
...
elif ...
...
else:
# If no exact types are found, fallback to slower `isinstance` check
if isinstance(arg_type, ctypes_bool):
...
elif:
...
else:
return 1
Note that the fallback cases are in a separate if/elif/else block so that Cython can still optimize the outer one to a C switch statement.
I suspect that would not have a significant impact on the benchmark (which doesn't exercise subclasses). If we go this route, we should also add a test that creates a subclass of a ctype and numpy type and confirms that it works and does the right thing.
Description
closes #1282
Replace
isinstance(obj, T)checks withtype(obj) is Tto optimizecuda.core.launch()Additional Notes
I made a benchmarking script in Cython to prove the speedup of using
type()in place ofisinstance()checks since the original issue requested profiling which resulted in an ~5x speedup.Appreciate some guidance to know if I've done the profiling right
Created a file
benchmark_isinstance_cython.pyx:I mainly used the compiler flags
-O3and-march=nativeand compiled and ran the above benchmark via this setup scriptsetup_benchmark.py:The script was then run with
python setup_benchmark.py build_ext --inplaceChecklist