Slicing a string from a Numpy structured array throws an error #5802

marnixhoh · 2020-06-02T15:11:57Z

This works:

from numba import njit
import numpy as np

values_dtype = np.dtype([
    ('one', 'U25'),
    ('two', 'f8')
])

def my_test():
    values = np.zeros(2, dtype=values_dtype)
    values['one'][0] = '2020-03-03 00:01:00'
    stuff = values['one'][0][0:10]
    return stuff
result = my_test()
print(result)

But when jitting it, it throws an error:

from numba import njit
import numpy as np

values_dtype = np.dtype([
    ('one', 'U25'),
    ('two', 'f8')
])

@njit
def my_test():
    values = np.zeros(2, dtype=values_dtype)
    values['one'][0] = '2020-03-03 00:01:00'
    stuff = values['one'][0][0:10]
    return stuff
result = my_test()
print(result)

Invalid use of Function() with argument(s) of type(s): ([unichr x 25], slice)

numba: 0.49.1
numpy: 1.18.4
python: 3.7.1

I am using the latest released version of Numba (the most recent is visible in
the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG).
I have included above a minimal working reproducer (if you are unsure how
to write one, see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).


marnixhoh · 2020-06-02T15:18:22Z

I just found a workaround. If you convert the string from the structured array using str() before slicing, it works. However, I am not sure what the performance penalty is for doing so.

from numba import njit
import numpy as np

values_dtype = np.dtype([
    ('one', 'U25'),
    ('two', 'f8')
])

@njit
def my_test():
    values = np.zeros(2, dtype=values_dtype)
    values['one'][0] = '2020-03-03 00:01:00'
    stuff = str(values['one'][0])[0:10]
    return stuff
result = my_test()
print(result)

marnixhoh · 2020-06-02T15:20:08Z

I also wanted to point out that some string methods do work as expected. E.g. .split():

from numba import njit
import numpy as np

values_dtype = np.dtype([
    ('one', 'U25'),
    ('two', 'f8')
])

@njit
def my_test():
    values = np.zeros(2, dtype=values_dtype)
    values['one'][0] = '2020-03-03 00:01:00'
    stuff = values['one'][0].split(':')
    return stuff
result = my_test()
print(result)

I hope this information is helpful.

stuartarchibald · 2020-06-09T09:37:08Z

Thanks for the report. Regarding the first post (#5802 (comment)), I think this won't work because the type of the returned value would depend on the values in the slice: e.g. [1:3] would produce a [unichr x 2], whereas [1:4] would produce a [unichr x 3].
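As a rough illustration of why the slice bounds matter, here is a NumPy-only sketch. It is not Numba's implementation; `fixed_width_slice` is a hypothetical helper written for this example. It emulates a slice whose result keeps a fixed-width character type, and shows that the width, and therefore the dtype, changes with the bounds.

```python
import numpy as np

# A single fixed-width unicode value, as in the 'one' field above.
text = np.array(['2020-03-03 00:01:00'], dtype='U25')

def fixed_width_slice(arr, start, stop):
    # Hypothetical helper: the result dtype has to encode the slice width,
    # so it is only known once start and stop are known.
    width = stop - start
    return np.array([s[start:stop] for s in arr], dtype=f'U{width}')

a = fixed_width_slice(text, 1, 3)
b = fixed_width_slice(text, 1, 4)

# Same input, different bounds, different result types.
print(a.dtype == np.dtype('U2'), b.dtype == np.dtype('U3'))
print(a.dtype == b.dtype)
```

Because a jitted function needs one fixed return type, a result type that varies with runtime slice values is a problem in a way that it is not for the interpreter.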

With respect to your workaround (#5802 (comment)), the performance penalty is likely that a string buffer has to be allocated and the characters copied in. Without looking at the machine code, I'd guess that it cannot be optimised away, as it's a change of internal representation.
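To make the "change of representation" a little more concrete, the following NumPy-only sketch (run in the interpreter, not under @njit) shows that the 'one' field is a fixed-size, padded buffer, while str() yields a separate variable-length string holding only the actual characters. This is an analogy at the NumPy level, under the assumption that Numba's conversion involves a comparable copy; it says nothing about the generated machine code.

```python
import numpy as np

values_dtype = np.dtype([
    ('one', 'U25'),
    ('two', 'f8')
])

values = np.zeros(2, dtype=values_dtype)
values['one'][0] = '2020-03-03 00:01:00'

# The field always occupies 25 code points of 4 bytes each,
# regardless of how many characters are actually stored.
field_itemsize = values_dtype['one'].itemsize

# str() produces a new variable-length string with the padding dropped.
as_str = str(values['one'][0])
prefix = as_str[0:10]

print(field_itemsize, len(as_str), prefix)
```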

As to some methods working, it's because the character array will "look like" a string for the method call if the types can be statically determined.

stuartarchibald · 2020-09-11T18:33:31Z

Closing this question as it seems to be resolved. Numba now has a Discourse forum, https://numba.discourse.group/, which is great for questions like this; please do consider posting there in future :) Thanks!

gmarkall added the needtriage label Jun 2, 2020

stuartarchibald added the question label and removed the needtriage label Jun 9, 2020

stuartarchibald closed this as completed Sep 11, 2020

JacobHayes mentioned this issue May 8, 2022

Trouble converting between list and array with string values #4505

Open
