wraparound option not recognized as valid keyword option for jit on CPU target #1533
Comments
We actually removed the |
@pitrou I also found that … What is the new status for negative index wraparound? Does the compiler always add the check (with a slight performance overhead), or does it no longer perform it at all? |
The "wraparound" flag to the jit() decorator was removed some time ago, but traces of it were left especially in the docstring.
Thanks! I've pushed a commit to fix this.
Yes, the compiler always adds it. Note that, in some cases, LLVM will be able to optimize away the sign check (for example if the index comes from a |
Thanks! |
Am I right in thinking that bounds checking is now off (and cannot be turned on) but negative indexing is now on (and cannot be turned off)? That's kind of strange... At the very least this should be made clear in the intro docs (if it isn't already). Do you have any sense of how performance varies across each of the 2**2=4 combinations of checks? Ideally it would be nice to be able to control the state of those checks on a per-variable basis, but that's just being greedy. |
Yes, you are right. Negative indexing is a regular Python feature while off-bounds access is a bug which shouldn't happen in correct code; it makes more sense to enable the former even at the possible cost of a small runtime hit.
I haven't made any measurement for bounds checking. Negative indexes can be totally costless in many cases where the compiler can infer that an index is always positive (or always negative). In some more complicated cases the test will remain at runtime and add a bit of overhead.
Actually we would rather not provide a ton of low-level knobs to influence code generation; besides being costly maintenance-wise, it also leads users to focus on getting those knobs "right" even though they may not matter at all. |
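To make the combination described above concrete (wraparound always on, bounds checking off), here is a minimal pure-Python sketch of what the index handling amounts to conceptually. It is not Numba's generated code; `normalize_index` is a hypothetical name used only for illustration.

```python
def normalize_index(i, n):
    """Map a possibly negative index the way Python's wraparound does.

    This models the single sign check that wraparound adds per access.
    No bounds check follows, so an out-of-range result is returned as-is.
    """
    return i + n if i < 0 else i


n = 4
print(normalize_index(-1, n))  # 3: the last element
print(normalize_index(2, n))   # 2: non-negative indices pass through
print(normalize_index(7, n))   # 7: past the end, and nothing rejects it
```

When the compiler can prove `i >= 0` (or `i < 0`), the conditional collapses to one of its two arms, which is the "costless" case mentioned above.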
I take your point about too many low-level knobs. I could be completely off the mark here, but it may well be that even if there are a few extra machine instructions, they play to the strengths of modern processors (branch prediction, cache prefetching, out-of-order execution, etc.) and thus don't end up causing any performance hit at all.
Intuitively, though, it feels like this would be more true for bounds checking than for wraparound indexing. For example, the compiler might emit predicated gather instructions asking the CPU to fetch both the negative and positive versions of the index, resulting in a polluted cache. Alternatively (and this may need to be the case in order to prevent segfaults?) the compiler may emit true branching instructions, which would be zero-overhead if all indices are non-negative but cause terrible performance if negative indices are randomly interspersed among positive ones.
What I'm trying to say is that maybe bounds checking should be standard (if it is indeed near-free), and if wraparound is relatively expensive (which may or may not be the case?) then it would be nice to be able to turn it off. Do you have benchmarks from before you removed the |
I am rather skeptical it would be near-free. Bounds checks should be harder for the compiler to elide than wraparound is (whether an index is always positive is often easy for the compiler to determine, for example if iterating on …). It's true that we should measure at some point how much adding bounds checking slows down our benchmark suite. |
Why wouldn't bounds checking be near-free? The index data is already in a register and the branch predictor will be 100% convinced that the index is going to be within bounds (because if it's not, we bail out of the whole function). The actual test can hopefully be done in parallel with other instructions. |
This is true (except if you declare the array as unsigned, in which case the compiler knows the indices are positive).
The same is true of wraparound: if the index is never negative, the branch predictor will always be right as well. |
Ok, I hadn't thought about passing unsigned arrays in this context; that's a good solution. Regarding branch prediction for wraparound, that's the point I was trying to make earlier, but as I said it depends on whether the compiler is emitting predicated instructions or full-on branches. |
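The two code shapes being contrasted here can be sketched in plain Python. This is only an illustration of the idea, not of what LLVM actually emits for Numba; `wrap_branch` and `wrap_select` are hypothetical names.

```python
def wrap_branch(i, n):
    # Explicit branch: cheap when the predictor guesses the sign correctly,
    # potentially costly when negative and positive indices are interleaved.
    if i < 0:
        return i + n
    return i


def wrap_select(i, n):
    # Branch-free form: (i < 0) evaluates to 0 or 1, so n is added only for
    # negative indices. This is roughly the computation a conditional move
    # or predicated instruction would perform.
    return i + n * (i < 0)


# Both forms agree on every index, negative or not.
for i in range(-4, 4):
    assert wrap_branch(i, 4) == wrap_select(i, 4)
```

Which shape performs better on a given index pattern is exactly the open question in this exchange; the sketch only shows that the two are semantically equivalent.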
I'm trying to use the `wraparound` keyword argument for `jit` to disable negative index wrap-around, but this generates a `NameError` for me. I made a toy example below and show the code pasted into IPython, followed by the traceback. I'm using Anaconda 2.7.10, and Numba 0.22.1.

I did some digging in the code and I can't find a reference to `no_wraparound` other than the piece of code that handles these keyword arguments.

In particular, you can see for the target I am working with (a CPU), that the `OPTIONS` `dict` is set to include the keyword `wraparound`, whereas the piece of code that attempts to set the flag seems to expect this to be `no_wraparound`.

I think from what code I read that `flags.set` is referring to the `ConfigOptions.set` method defined in `utils.py`. But this is actually hard to reason about because it's not clear why some of the other options map to alternative names, such as `nopython` becoming `enable_pyobject` or `looplift` becoming `enable_looplift` when the flags are set ... these names (`enable_pyobject` and `enable_looplift`) don't appear to be part of the target `OPTIONS` `dict` either, so it's unclear why they do not also raise a `NameError`. I see where these option names exist in `compiler.py`, but it's not clear how one goes from the `OPTIONS` `dict` of the target to the one defined in the `Flags` class. Also, `wraparound` and/or `no_wraparound` seem to have been missing from that `OPTIONS` `dict` in `compiler.py` at least for several versions going back in the commit history. I didn't bisect to see if I could find precisely where it diverges from the documentation, but it might give you some idea of where to start looking. Some clarification about how exactly that bit of code works would be very helpful. Thanks!

You may also want to elaborate in the documentation if the issue is related to the name of the keyword arg, because in the docstring it makes it seem that `wraparound` should be OK.
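The mismatch described in this report can be modelled with a small, self-contained sketch. Everything here is a simplified, hypothetical stand-in written for illustration: the class and function bodies are not Numba's source, and the exact translation logic (for example, that `nopython=False` is what turns on `enable_pyobject`) is an assumption. Only the names `ConfigOptions`, `Flags`, `OPTIONS`, `set`, and the option/flag names come from the report.

```python
class ConfigOptions:
    """Simplified stand-in for a flags container (not Numba's source)."""

    OPTIONS = frozenset()

    def __init__(self):
        self._enabled = set()

    def set(self, name):
        # Unknown flag names are rejected; this models the reported failure.
        if name not in self.OPTIONS:
            raise NameError("Invalid flag: %s" % name)
        self._enabled.add(name)

    def is_set(self, name):
        return name in self._enabled


class Flags(ConfigOptions):
    # Hypothetical flag list that, as in the report, lacks 'no_wraparound'.
    OPTIONS = frozenset(["enable_pyobject", "enable_looplift"])


def apply_target_options(flags, options):
    """Translate user-facing keywords into compiler flag names (assumed logic)."""
    if not options.get("nopython", False):
        flags.set("enable_pyobject")
    if options.get("looplift", True):
        flags.set("enable_looplift")
    if not options.get("wraparound", True):
        flags.set("no_wraparound")  # absent from Flags.OPTIONS -> NameError


flags = Flags()
apply_target_options(flags, {"nopython": True})  # fine: known flags only
try:
    apply_target_options(Flags(), {"wraparound": False})
except NameError as exc:
    print("rejected:", exc)
```

Under this model, `enable_pyobject` and `enable_looplift` never raise because they are validated against the flag class's own list rather than the target's keyword list, while a keyword whose translated flag name was dropped from that list fails as soon as it is used.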