[Python] Inferring / converting nested Numpy array is very slow #18790
Closed
Description
Converting a nested Numpy array walks over the Numpy data as Python objects, even if the dtype is not "object". This makes it pointlessly slow compared to the non-nested case, and even compared to the nested Python list case:
>>> %%timeit data = list(range(10000))
...: pa.array(data)
...:
746 µs ± 8.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %%timeit data = np.arange(10000)
...: pa.array(data)
...:
81.1 µs ± 57.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %%timeit data = [np.arange(10000)]
...: pa.array(data)
...:
3.39 ms ± 6.27 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Reporter: Antoine Pitrou / @pitrou
Assignee: Antoine Pitrou / @pitrou
PRs and other links:
Note: This issue was originally created as ARROW-2514. Please see the migration documentation for further details.