Structured dtypes with multi-dim fields: numba either throws or crashes #3158

Closed
aldanor opened this issue Jul 25, 2018 · 6 comments · Fixed by #8120
Labels
bug - typing (Bugs: occurring at typing time)

Comments

@aldanor

aldanor commented Jul 25, 2018

Given this (quite trivial) example:

import numba
import numpy as np

dtype1 = np.dtype([('a', 'i8'), ('b', 'i4')])
dtype2 = np.dtype((dtype1, (2, 2)))  # 2x2 sub-array of dtype1 records
dtype3 = np.dtype([('x', '?'), ('y', dtype2)])
arr = np.zeros(2, dtype3)

@numba.jit(nopython=True)
def foo(arr):
    return arr[0]

foo(arr)

numba will most likely throw like so:

.../numba/types/npytypes.py in __init__(self, dtype, shape)
    419 
    420     def __init__(self, dtype, shape):
--> 421         assert dtype.bitwidth % 8 == 0, \
    422             "Dtype bitwidth must be a multiple of bytes"
    423         self._shape = shape

AttributeError: 'Record' object has no attribute 'bitwidth'

... or you could also get

SystemError: 
CPUDispatcher(<function foo at 0x7f7741382d08>) returned NULL without setting an error
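
For readers less familiar with sub-array dtypes, here is a minimal NumPy-only sketch of what the dtypes in the example above look like. It does not touch numba at all; it only inspects the structure that numba is asked to type. The variable names `field_dtype` and `field_offset` are introduced here purely for illustration.

```python
import numpy as np

dtype1 = np.dtype([('a', 'i8'), ('b', 'i4')])
dtype2 = np.dtype((dtype1, (2, 2)))      # sub-array dtype: 2x2 block of dtype1 records
dtype3 = np.dtype([('x', '?'), ('y', dtype2)])
arr = np.zeros(2, dtype3)

# dtype.fields maps each field name to (dtype, byte offset[, title]).
field_dtype, field_offset = dtype3.fields['y'][:2]

print(field_dtype.shape)   # shape of the sub-array stored in field 'y'
print(field_dtype.base)    # element dtype of that sub-array (a record)
print(arr[0]['y'].shape)   # one record's 'y' field, viewed as an array
print(arr['y'].shape)      # the whole column: outer shape + sub-array shape
```

So the element type of the nested array is itself a record rather than a plain scalar, which is the case the assertion in the traceback does not handle.
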

@stuartarchibald
Contributor

Thanks for the report. I can reproduce the raised AttributeError, no luck with reproducing the SystemError as yet. This is clearly a bug.

@kazhoyan

This bug occurs in much simpler cases:

import numpy as np

Item = np.dtype([('some_field', np.int32)])
SomeList = np.dtype([('items', Item, 10)])

Accessing the items field of an instance of SomeList fails with this message:
'Record' object has no attribute 'bitwidth'

A quick fix is to replace NestedArray.size's body inside npytypes.py with something like this:

        if isinstance(self.dtype, Record):
            return self.dtype.size
        else:
            return self.dtype.bitwidth // 8
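
To illustrate the idea behind that branch without depending on numba internals, here is a rough NumPy-only analogue. The helper `element_nbytes` is hypothetical (it is not part of NumPy or numba); it only mirrors the "record vs. scalar" split: a record reports its size in bytes directly, while a scalar's size is derived from a bit width.

```python
import numpy as np

Item = np.dtype([('some_field', np.int32)])
SomeList = np.dtype([('items', Item, 10)])

# The 'items' field is a sub-array dtype: 10 elements of the Item record.
items_dtype = SomeList.fields['items'][0]
elem_dtype = items_dtype.base


def element_nbytes(dtype):
    """Hypothetical helper: bytes per element for records and scalars."""
    if dtype.names is not None:
        # Structured (record) dtype: the byte size is available directly.
        return dtype.itemsize
    # Scalar dtype: go through a bit width, as the original code path does.
    bitwidth = dtype.itemsize * 8
    return bitwidth // 8


print(element_nbytes(elem_dtype))            # bytes per Item record
print(element_nbytes(np.dtype(np.int32)))    # bytes per int32 scalar
print(items_dtype.itemsize)                  # bytes for the whole 'items' field
```

Both branches agree on the byte count, so the total size of the field is the number of elements times `element_nbytes(elem_dtype)` in either case.
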

P.S. I would increase this bug's priority - it seems like a fairly common situation.

@stuartarchibald
Contributor

@kazhoyan thanks, this is an exact reproducer:

from numba import njit
import numpy as np

Item = np.dtype([('some_field', np.int32)])
SomeList = np.dtype([('items', Item, 10)])

arr = np.zeros((2,), SomeList)

@njit
def foo(x):
    x[0]

foo.py_func(arr)
foo(arr)

Pull requests to fix such issues, or to promote discussion around them, are welcome.

@shash29-dev

This still open?
Any solution till now?

@stuartarchibald
Contributor

> This still open?

Yes, still reproduces on 0.54.

> Any solution till now?

Not that I'm aware of, I'm afraid.

@gmarkall
Member

gmarkall commented Dec 7, 2021

Possible starter patch for this and #1469:

diff --git a/numba/core/types/npytypes.py b/numba/core/types/npytypes.py
index 25050f30c..e9da0249d 100644
--- a/numba/core/types/npytypes.py
+++ b/numba/core/types/npytypes.py
@@ -209,6 +209,10 @@ class Record(Type):
 
         return as_struct_dtype(self)
 
+    @property
+    def bitwidth(self):
+        return self.dtype.itemsize * 8
+
     def can_convert_to(self, typingctx, other):
         """
         Convert this Record to the *other*.
@@ -427,6 +431,10 @@ class Array(Buffer):
         if (not aligned or
             (isinstance(dtype, Record) and not dtype.aligned)):
             self.aligned = False
+        if isinstance(dtype, NestedArray):
+            tmp = Array(dtype.dtype, dtype.ndim, 'C')
+            ndim += tmp.ndim
+            dtype = tmp.dtype
         if name is None:
             type_name = "array"
             if not self.mutable:
@@ -553,6 +561,10 @@ class NestedArray(Array):
     """
 
     def __init__(self, dtype, shape):
+        if isinstance(dtype, NestedArray):
+            tmp = Array(dtype.dtype, dtype.ndim, 'C')
+            shape += dtype.shape
+            dtype = tmp.dtype
         assert dtype.bitwidth % 8 == 0, \
             "Dtype bitwidth must be a multiple of bytes"
         self._shape = shape

This works around the issue, but the approach might need to be more cleanly thought-out (and tested).

With a modified version of @stuartarchibald's reproducer above (to print out return values):

from numba import njit
import numpy as np

Item = np.dtype([('some_field', np.int32)])
SomeList = np.dtype([('items', Item, 10)])

arr = np.zeros((2,), SomeList)

@njit
def foo(x):
    return x[0]

print(foo.py_func(arr))
print(foo(arr))

I get

([(0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,)],)
([(0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,)],)

with this patch.
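
For anyone reading the diff above, here is a rough pure-Python sketch of the two ideas in it, as I understand them: a record exposes a `bitwidth` derived from its item size, and a nested array whose element type is itself a nested array gets flattened by appending the inner shape to the outer one. The classes `ScalarT`, `RecordT`, `NestedT` and the function `flatten_nested` are hypothetical stand-ins written for this illustration; they are not numba's `Record`, `NestedArray` or `Array` types.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ScalarT:
    """Stand-in for a scalar type described by a bit width."""
    bitwidth: int


@dataclass(frozen=True)
class RecordT:
    """Stand-in for a record type described by a size in bytes."""
    itemsize: int

    @property
    def bitwidth(self) -> int:
        # Same idea as the bitwidth property added in the patch.
        return self.itemsize * 8


@dataclass(frozen=True)
class NestedT:
    """Stand-in for a nested array: an element type plus a fixed shape."""
    dtype: Union["NestedT", ScalarT, RecordT]
    shape: Tuple[int, ...]


def flatten_nested(dtype, shape):
    """Peel off nested-array layers, appending each inner shape."""
    while isinstance(dtype, NestedT):
        shape = shape + dtype.shape
        dtype = dtype.dtype
    assert dtype.bitwidth % 8 == 0, "Dtype bitwidth must be a multiple of bytes"
    return dtype, shape


# A (2,)-shaped nested array whose elements are (10,)-shaped arrays of records.
leaf, shape = flatten_nested(NestedT(RecordT(itemsize=4), (10,)), (2,))
print(leaf, shape)
```

With the `bitwidth` property in place the assertion passes for record elements, which is the part that raised the `AttributeError` in the original report.
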

stuartarchibald added the "bug - typing" label (Bugs: occurring at typing time) and removed the "bug" label on Dec 15, 2021
gmarkall added a commit to gmarkall/numba that referenced this issue Nov 29, 2022
gmarkall added a commit to gmarkall/numba that referenced this issue Nov 29, 2022
gmarkall added a commit to gmarkall/numba that referenced this issue Jan 25, 2023
esc added a commit to esc/numba that referenced this issue Jan 27, 2023