New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jit produces wrong code with version 0.49.0 #5618
Comments
I can add the following observations (with slightly more simplified code (no numpy)) from numba import njit
@njit
def foo1():
res = 0
tmp = 10
if tmp:
res = tmp
return res
print(foo1())
#Returns 1
@njit
def foo2():
res = 0
tmp = 10
if tmp != 0:
res = tmp
return res
print(foo2())
#Returns 10
@njit
def foo3(tmp):
res = 0
if tmp:
res = tmp
return res
print(foo3(10))
#Returns 10 So there is definitely an issue that only arises when using a constant value and skipping the explicit comparison in the if statement. I do not know whether that is the root cause of the original problem in @99991 's project though. |
Thanks for the report. This is a bug. |
Will patch it for 0.49.1. |
This is the first time I look into the numba src, but it looks like the conditions are added to the list of nullified conditions after pruning Lines 443 to 447 in 334e65c
and subsequently replaced by 0 or 1? Lines 464 to 475 in 334e65c
|
@HPLegion correct, this is trying to reuse the rewrite that was written for the case of conditionals. I've got a patch to fix it. |
Further... if I do this: import numpy as np
import llvmlite
import numba
import platform
from numba import njit, gdb
print("System:", platform.uname())
print("llvmlite version:", llvmlite.__version__)
print("numba version:", numba.__version__)
print("numpy version:", np.__version__)
# If this should crash, try without this line
@njit("i8(i8[:], i8[:], i8[:], i8[:], i8[:], f4[:, :, :], f4[:], f4[:, :], i8[:], i8)",
boundscheck=True, debug=True)
def _make_tree(
i0_inds,
i1_inds,
less_inds,
more_inds,
split_dims,
bounds,
split_values,
points,
indices,
min_leaf_size,
):
dimension = points.shape[1]
# Expect log2(len(points) / min_leaf_size) depth, 1000 should be plenty
stack = np.empty(1000, np.int64)
stack_size = 0
n_nodes = 0
# min_leaf_size <= leaf_node_size < max_leaf_size
max_leaf_size = 2 * min_leaf_size
# Push i0, i1, i_node
stack[stack_size] = 0
stack_size += 1
stack[stack_size] = points.shape[0]
stack_size += 1
stack[stack_size] = n_nodes
n_nodes += 1
stack_size += 1
# While there are more tree nodes to process recursively
while stack_size > 0:
stack_size -= 1
i_node = stack[stack_size]
stack_size -= 1
i1 = stack[stack_size]
stack_size -= 1
i0 = stack[stack_size]
lo = bounds[i_node, 0]
hi = bounds[i_node, 1]
for d in range(dimension):
lo[d] = points[i0, d]
hi[d] = points[i0, d]
# Find lower and upper bounds of points for each dimension
for i in range(i0 + 1, i1):
for d in range(dimension):
lo[d] = min(lo[d], points[i, d])
hi[d] = max(hi[d], points[i, d])
# Done if node is small
if i1 - i0 <= max_leaf_size:
i0_inds[i_node] = i0
i1_inds[i_node] = i1
less_inds[i_node] = -1
more_inds[i_node] = -1
split_dims[i_node] = -1
split_values[i_node] = 0.0
else:
# Split on largest dimension
lengths = hi - lo
split_dim = np.argmax(lengths)
split_value = lo[split_dim] + 0.5 * lengths[split_dim]
# Partition i0:i1 range into points where points[i, split_dim] < split_value
i = i0
j = i1 - 1
while i < j:
while i < j and points[i, split_dim] < split_value:
i += 1
while i < j and points[j, split_dim] >= split_value:
j -= 1
# Swap points
if i < j:
for d in range(dimension):
temp = points[i, d]
points[i, d] = points[j, d]
points[j, d] = temp
temp_i_node = indices[i]
indices[i] = indices[j]
indices[j] = temp_i_node
if points[i, split_dim] < split_value:
i += 1
i_split = i
# Now it holds that:
# for i in range(i0, i_split): assert(points[i, split_dim] < split_value)
# for i in range(i_split, i1): assert(points[i, split_dim] >= split_value)
# Ensure that each node has at least min_leaf_size children
i_split = max(i0 + min_leaf_size, min(i1 - min_leaf_size, i_split))
less = n_nodes
n_nodes += 1
more = n_nodes
n_nodes += 1
# push i0, i_split, less
stack[stack_size] = i0
stack_size += 1
stack[stack_size] = i_split
stack_size += 1
stack[stack_size] = less
stack_size += 1
# push i_split, i1, more
stack[stack_size] = i_split
stack_size += 1
stack[stack_size] = i1
stack_size += 1
stack[stack_size] = more
stack_size += 1
i0_inds[i_node] = i0
i1_inds[i_node] = i1
less_inds[i_node] = less
more_inds[i_node] = more
split_dims[i_node] = split_dim
split_values[i_node] = split_value
return n_nodes
def main():
k = 20
n_data = 100_000
n_query = n_data
dimension = 5
data_points = np.random.rand(n_data, dimension).astype(np.float32)
min_leaf_size = 8
n_data, dimension = data_points.shape
max_nodes = 2 * ((n_data + min_leaf_size - 1) // min_leaf_size)
i0_inds = np.empty(max_nodes, np.int64)
i1_inds = np.empty(max_nodes, np.int64)
less_inds = np.empty(max_nodes, np.int64)
more_inds = np.empty(max_nodes, np.int64)
split_dims = np.empty(max_nodes, np.int64)
bounds = np.empty((max_nodes, 2, dimension), np.float32)
split_values = np.empty(max_nodes, np.float32)
shuffled_data_points = data_points.copy()
shuffled_indices = np.arange(n_data).astype(np.int64)
_make_tree(
i0_inds,
i1_inds,
less_inds,
more_inds,
split_dims,
bounds,
split_values,
shuffled_data_points,
shuffled_indices,
min_leaf_size,
)
main() I get this:
which suggests an out of bounds access is probably involved. |
Master branch passes this test now. |
@99991 many thanks for confirming, will close this now. |
Our library's tests fail on all platforms when upgrading from numba version
0.48.0
to0.49.0
.The smallest test case that I could come up with that still reproduces the issue is the following:
It prints
666.0
with Python and1.0
when jitted.the change log (https://github.com/numba/numba/blob/master/CHANGE_LOG).
to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).
The text was updated successfully, but these errors were encountered: