Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: drop zero-cost views of ak.Array #2697

Merged
merged 3 commits into from
Sep 6, 2023
Merged

Conversation

agoose77
Copy link
Collaborator

@agoose77 agoose77 commented Sep 6, 2023

In the words of Taylor Swift, I'm the problem — it's me.

In #2649, I exposed a DLPack view of Awkward Arrays, alongside the __array_interface__ and __cuda_array_interface__ properties. However, these are supposed to be zero copy, and non-owned views. As such, we can't return local temporary copies that leave scope.

So, whoops, and here's a fix: we don't support DLPack or CUDA views of Awkward Arrays; they need to be explicitly converted to user-referenced backend arrays.

@agoose77
Copy link
Collaborator Author

agoose77 commented Sep 6, 2023

@lgray this will be in the next release, ASAP.

@codecov
Copy link

codecov bot commented Sep 6, 2023

Codecov Report

Merging #2697 (98fce24) into main (684a84d) will increase coverage by 0.01%.
The diff coverage is 75.00%.

Additional details and impacted files
Files Changed Coverage Δ
src/awkward/highlevel.py 76.84% <66.66%> (+0.45%) ⬆️
src/awkward/_connect/numpy.py 91.62% <80.00%> (-0.27%) ⬇️
src/awkward/_connect/dlpack.py 100.00% <100.00%> (+7.14%) ⬆️

@agoose77 agoose77 temporarily deployed to docs-preview September 6, 2023 18:41 — with GitHub Actions Inactive
ikrommyd added a commit to ikrommyd/egamma-tnp that referenced this pull request Sep 6, 2023
@ikrommyd
Copy link

ikrommyd commented Sep 6, 2023

Fixes my code as seen in the output of the last cell here: https://github.com/iasonkrom/egamma-tnp/blob/debug_dask_histogram/debug_dask_histogram.ipynb

@agoose77 agoose77 temporarily deployed to docs-preview September 6, 2023 19:10 — with GitHub Actions Inactive
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had thought that __cuda_array_interface__ and __dlpack_device__ would only affect rectilinear arrays on GPUs, and these don't have applications yet.

That might still be true. It may be that the __array_interface__ is the one that's causing the trouble. Yes, hist uses it:

>>> class Something:
...     __array_interface__ = {"data": np.arange(10), "shape": (10,), "typestr": "<i8"}
... 
>>> hist.Hist.new.Reg(10, 0, 10).Double().fill(Something())
Hist(Regular(10, 0, 10, label='Axis 0'), storage=Double()) # Sum: 10.0

CuPy uses __cuda_array_interface__ but nobody else does:

>>> data = cp.arange(10)
>>> data.__cuda_array_interface__
{'shape': (10,), 'typestr': '<i8', 'descr': [('', '<i8')], 'stream': 1, 'version': 3, 'strides': None, 'data': (140416645071360, False)}
>>> class Something:
...     __cuda_array_interface__ = {"data": (140416645071360, False), "shape": (10,), "typestr": "<i8", "stream": 1, "version": 3}
... 
>>> cp.asarray(Something())
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

(The above worked, but both of the below didn't.)

```python
>>> np.asarray(Something())
array(<__main__.Something object at 0x7fb574e86cd0>, dtype=object)
>>> hist.Hist.new.Reg(10, 0, 10).Double().fill(Something())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/hist/basehist.py", line 232, in fill
    return super().fill(*args, *data, weight=weight, sample=sample, threads=threads)
  File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/boost_histogram/_internal/hist.py", line 529, in fill
    self._hist.fill(*args_ars, weight=weight_ars, sample=sample_ars)  # type: ignore[arg-type]
TypeError: float() argument must be a string or a number, not 'Something'

I haven't managed to get the DLPack protocol to be identified by anything:

>>> from awkward._connect.dlpack import DLPackDevice
>>> class Something:
...     def __dlpack_device__(self):
...         return DLPackDevice.CPU
...     def __dlpack__(self, stream=None):
...         return np.arange(10)

On Slack, @agoose77 said he'd split out the __cuda_array_interface__ and __dlpack_device__ removals into a different PR because they have a different motivation than the __array_interface__ removal. The __array_interface__ removal is for fixing #2695, and this is the right fix for that.

@ikrommyd
Copy link

ikrommyd commented Sep 6, 2023

Would it be possible to roll out a patch 2.4.2 release?

@agoose77 agoose77 enabled auto-merge (squash) September 6, 2023 19:23
@agoose77 agoose77 merged commit 3b2f251 into main Sep 6, 2023
35 checks passed
@agoose77 agoose77 deleted the agoose77/fix-view-of-array branch September 6, 2023 19:24
ikrommyd added a commit to ikrommyd/egamma-tnp that referenced this pull request Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zip + combinations clobbers data on x86?
3 participants