ak.Array and ak.from_numpy should not accept zero-dimensional arrays #1057

Open
jpivarski opened this issue Aug 10, 2021 · 3 comments · May be fixed by #3161
Labels: policy (Choice of behavior)

Comments

@jpivarski
Member

... because they're not iterable. This was raised in an issue by @bfis:

In case that there is supposed to be support for shapeless arrays, this illustrates the inconsistency:

a = np.array(0) # value: array(0)
b = ak.Array(a) # value: <Array [0] type='1 * int64'>
c = ak.to_numpy(b) # value: array([0])
assert a.shape == c.shape # this fails

Elsewhere, we consider NumPy scalars (e.g. np.float32(3.14)) to be equivalent to zero-dimensional arrays (e.g. np.array(3.14, np.float32)), but here the zero-dimensional array is being interpreted as a length-1 array.
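
To make the equivalence concrete, here is a minimal NumPy-only sketch (the helper name is_iterable is made up for illustration, it is not part of NumPy or Awkward). It shows that a zero-dimensional array and a NumPy scalar both have an empty shape and that neither can be iterated over, whereas a length-1 array can:

```python
import numpy as np

zero_dim = np.array(3.14, np.float32)      # zero-dimensional array
scalar = np.float32(3.14)                  # NumPy scalar
length_one = np.array([3.14], np.float32)  # genuinely one-dimensional

# Both the zero-dimensional array and the scalar report an empty shape.
shapes = (zero_dim.shape, scalar.shape)
ndims = (zero_dim.ndim, scalar.ndim)


def is_iterable(obj):
    """Hypothetical helper: True if iter(obj) succeeds, False on TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


# Neither zero-dimensional object is iterable; the length-1 array is.
iterable_flags = (is_iterable(zero_dim), is_iterable(scalar), is_iterable(length_one))
print(shapes, ndims, iterable_flags)
```

So treating np.array(3.14, np.float32) as a length-1 array adds a dimension that the input never had.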

ak.Array and ak.from_numpy do not accept scalars:

>>> ak.Array(np.float32(3.14))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/miniconda3/lib/python3.8/site-packages/awkward/highlevel.py", line 254, in __init__
    layout = ak.operations.convert.from_iter(
  File "/home/jpivarski/miniconda3/lib/python3.8/site-packages/awkward/operations/convert.py", line 885, in from_iter
    for x in iterable:
TypeError: 'numpy.float32' object is not iterable

So they should not accept zero-dimensional arrays, for consistency. In other words, not this:

>>> ak.Array(np.array(3.14, np.float32))
<Array [3.14] type='1 * float32'>

Originally posted by @jpivarski in #1055 (comment), with @bfis's response in #1055 (comment)

jpivarski added the policy (Choice of behavior) label Aug 10, 2021
agoose77 closed this as completed Oct 6, 2022
agoose77 reopened this Oct 6, 2022
@agoose77
Collaborator

agoose77 commented Oct 6, 2022

@jpivarski do you want to ban this at the to_layout level, or just the array constructor? I imagine both, as we don't want from_numpy to succeed either.

@jpivarski
Member Author

Yeah, it should be banned at the to_layout level. Nothing in the layouts can accept zero-dimensional arrays.

to_layout with an allow_other=True argument allows for non-array types, and most often, the intention is to get numbers this way. In NumPy and especially NumPy-API-like libraries, it's easy to get a zero-dimensional array when you really wanted a scalar: np.array(3.14) instead of np.float64(3.14). Therefore, the allow_other=True case should still let these through, but turning them into actual scalars would be a safer thing to do.

Both zero-dimensional arrays and scalars have a shape (and dtype).

>>> np.array(3.14).shape
()
>>> np.float64(3.14).shape
()

But scalars are not instances of np.ndarray.

>>> isinstance(np.array(3.14), np.ndarray)
True
>>> isinstance(np.float64(3.14), np.ndarray)
False

If to_layout with allow_other=True converts anything with an empty shape into a NumPy scalar, I think that would be sufficient:

>>> something = np.array(3.14)
>>> isinstance(something, np.ndarray)
True
>>> # Below is the check that to_layout could do...
>>> if len(getattr(something, "shape", (None,))) == 0 and hasattr(something, "dtype"):
...     something = something.dtype.type(something)
... 
>>> isinstance(something, np.ndarray)
False
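
The same check can be wrapped in a function, as a rough sketch. The name unwrap_zero_dimensional is hypothetical and this is not how to_layout is actually implemented; it only assumes the rule described above (empty shape plus a dtype means "convert to a NumPy scalar") and leaves everything else untouched:

```python
import numpy as np


def unwrap_zero_dimensional(obj):
    """Hypothetical helper: turn anything with an empty shape and a dtype
    into a NumPy scalar; return every other object unchanged."""
    shape = getattr(obj, "shape", (None,))  # objects without a shape get length 1
    if len(shape) == 0 and hasattr(obj, "dtype"):
        return obj.dtype.type(obj)
    return obj


from_zero_dim = unwrap_zero_dimensional(np.array(3.14))      # ndarray -> np.float64
from_scalar = unwrap_zero_dimensional(np.float64(3.14))      # already a scalar
from_python = unwrap_zero_dimensional(3.14)                  # plain float, no shape
one_dim = np.array([1, 2, 3])
from_one_dim = unwrap_zero_dimensional(one_dim)              # left as an array

print(type(from_zero_dim), type(from_scalar), type(from_python), type(from_one_dim))
```

Python numbers and arrays with one or more dimensions pass through as they are, so only the ambiguous zero-dimensional case changes type.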

jpivarski added this to Unprioritized in Finalization Jan 19, 2024
@jpivarski
Member Author

As of now, this is banned at the ak.Array level, but not at the ak.from_numpy or ak.to_layout level.

>>> ak.Array(np.array(123))
TypeError: Encountered a scalar (ndarray), but scalar conversion/promotion is disabled

This error occurred while calling

    ak.to_layout(
        numpy.ndarray(123)
        allow_record = False
        regulararray = False
        primitive_policy = 'error'
    )

>>> ak.from_numpy(np.array(123))
<Array [123] type='1 * int64'>

>>> ak.to_layout(np.array(123))
<NumpyArray dtype='int64' len='1'>[123]</NumpyArray>

It's relevant for Ragged because Ragged has to support zero-dimensional arrays. (The ragged.array._impl can either be an Awkward Array or it can be a NumPy/CuPy scalar.)
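
For the remaining ak.from_numpy and ak.to_layout cases, the kind of guard being asked for might look like the sketch below. It uses NumPy only; require_at_least_1d is a hypothetical name, not an Awkward function, and the error message is invented for illustration. A caller who really wants the length-1 interpretation can opt in explicitly with np.atleast_1d:

```python
import numpy as np


def require_at_least_1d(array):
    """Hypothetical guard: reject zero-dimensional NumPy arrays instead of
    silently promoting them to length-1 arrays."""
    if not isinstance(array, np.ndarray):
        raise TypeError(f"expected a NumPy array, got {type(array).__name__}")
    if array.ndim == 0:
        raise TypeError(
            "zero-dimensional arrays are not accepted; "
            "use np.atleast_1d to promote explicitly"
        )
    return array


zero_dim = np.array(123)

try:
    require_at_least_1d(zero_dim)
    rejected = False
except TypeError:
    rejected = True

# Explicit opt-in: the caller, not the library, chooses the length-1 reading.
promoted = require_at_least_1d(np.atleast_1d(zero_dim))
print(rejected, promoted.shape)
```

With the promotion made explicit, a round trip back to NumPy would give the shape the caller chose rather than a silently added dimension.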

Projects
Finalization
P1 (highest)