Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise exception when 'from_array' is given a dask array #5280

Merged
merged 4 commits into from Aug 15, 2019

Conversation

djhoese
Copy link
Contributor

@djhoese djhoese commented Aug 15, 2019

Closes #5262.

  • Tests added / passed
  • Passes black dask / flake8 dask

@djhoese
Copy link
Contributor Author

djhoese commented Aug 15, 2019

@jakirkham Let me know if you want a test. I don't see any from_array specific tests so not sure where the best place is.

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

Let me know if you want a test.

That would be great. Could you please add one in dask/array/tests/test_array_core.py?

I don't see any from_array specific tests so not sure where the best place is.

Hmm...that's concerning. Could you please raise an issue for this?

@djhoese
Copy link
Contributor Author

djhoese commented Aug 15, 2019

Ah I didn't realize the tests were separated that way. I was only looking in dask/tests. I'm sure there are from_array tests in dask/array/tests/.

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

Yeah, if you search for def test_from_array_*, there are a few different cases tested. They are a bit scattered though.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 15, 2019

So test was added but there is one failure in dask/array/tests/test_optimization.py:test_gh3937. The issue is that asarray is given a list of dask arrays. asarray looks like:

    if isinstance(a, Array):
        return a
    elif hasattr(a, "to_dask_array"):
        return a.to_dask_array()
    elif isinstance(a, (list, tuple)) and any(isinstance(i, Array) for i in a):
        a = stack(a)
    elif not isinstance(getattr(a, "shape", None), Iterable):
        a = np.asarray(a)
    return from_array(a, chunks=a.shape, getitem=getter_inline, **kwargs)

This will hit the stack if statement which will convert everything to a dask array which it then passes to from_array at the bottom of the function.

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

Where is it calling from_array in stack?

Edit: Oh right, in asarray you mean?

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

Could we just move that from_array into the else case? I don't think any other case needs it.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 15, 2019

We could, but then chunks, getitem, and **kwargs are not being used.

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

They don't seem to be used by anything else anyways.

@jrbourbeau
Copy link
Member

jrbourbeau commented Aug 15, 2019

The following seems to pass tests for me locally

diff --git a/dask/array/core.py b/dask/array/core.py
index bb40549c..286dbcbc 100644
--- a/dask/array/core.py
+++ b/dask/array/core.py
@@ -3580,7 +3580,7 @@ def asarray(a, **kwargs):
     elif hasattr(a, "to_dask_array"):
         return a.to_dask_array()
     elif isinstance(a, (list, tuple)) and any(isinstance(i, Array) for i in a):
-        a = stack(a)
+        return stack(a)
     elif not isinstance(getattr(a, "shape", None), Iterable):
         a = np.asarray(a)
     return from_array(a, chunks=a.shape, getitem=getter_inline, **kwargs)

@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

Yeah I think the application of from_array to anything other than the result of np.asarray here is unintentional. It's probably best for us to go ahead and clean that up now to avoid similar mistakes in the future.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 15, 2019

Sounds good. Done. Thanks @jrbourbeau.

@jakirkham jakirkham merged commit 46525a3 into dask:master Aug 15, 2019
2 checks passed
@jakirkham
Copy link
Member

jakirkham commented Aug 15, 2019

Thanks @djhoese! 😄

@djhoese djhoese deleted the bugfix-from-array branch Aug 15, 2019
ericpre added a commit to ericpre/hyperspy that referenced this issue Sep 2, 2019
gjoseph92 added a commit to gjoseph92/dask that referenced this issue Sep 3, 2020
`asarray` worked on lists containing dask Arrays, but `asanyarray` didn't.

```python
x = da.asarray([1, 2, 3])
# works
da.asarray([0, x[1], x[2]])
# failed
da.asanyarray([0, x[1], x[2]])
```

Looks like dask#5280 only changed `asarray`, not `asanyarray`. I made most of the `asarray` tests use both methods, just to guard against such drift in the future.

`test_asarray_chunks` was the only one with inconsistent behavior, because `asanyarray` sets `chunks=a.shape`, but `asarray` does not. Unclear to me if that's intentional---if not, I'm happy to make them consistent in this PR.
jakirkham pushed a commit that referenced this issue Sep 9, 2020
* asanyarray: handle sequences containing dask Arrays

`asarray` worked on lists containing dask Arrays, but `asanyarray` didn't.

```python
x = da.asarray([1, 2, 3])
# works
da.asarray([0, x[1], x[2]])
# failed
da.asanyarray([0, x[1], x[2]])
```

Looks like #5280 only changed `asarray`, not `asanyarray`. I made most of the `asarray` tests use both methods, just to guard against such drift in the future.

`test_asarray_chunks` was the only one with inconsistent behavior, because `asanyarray` sets `chunks=a.shape`, but `asarray` does not. Unclear to me if that's intentional---if not, I'm happy to make them consistent in this PR.

* re-blacken
kumarprabhu1988 pushed a commit to kumarprabhu1988/dask that referenced this issue Oct 29, 2020
* asanyarray: handle sequences containing dask Arrays

`asarray` worked on lists containing dask Arrays, but `asanyarray` didn't.

```python
x = da.asarray([1, 2, 3])
# works
da.asarray([0, x[1], x[2]])
# failed
da.asanyarray([0, x[1], x[2]])
```

Looks like dask#5280 only changed `asarray`, not `asanyarray`. I made most of the `asarray` tests use both methods, just to guard against such drift in the future.

`test_asarray_chunks` was the only one with inconsistent behavior, because `asanyarray` sets `chunks=a.shape`, but `asarray` does not. Unclear to me if that's intentional---if not, I'm happy to make them consistent in this PR.

* re-blacken
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants