Warning Message: FutureWarning: The `numpy.moveaxis` function is not implemented by Dask array #690
I think the warnings can be ignored. intake/intake-esm#121 is solving the one from intake. It looks like dask/dask#4822 is implementing moveaxis on dask.array. I'll see where that's at.
You mean the warnings after importing the packages? Yea I figured since I am able to run subsequent cells. Ok, any updates would be appreciated, thanks!
Sorry, yes, I meant the warnings on import. For now, you can also probably safely ignore the moveaxis warning. I'm guessing it'll be fixed in Dask soon.
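If the goal is just to quiet the repeated message while waiting for the Dask fix (it does not change the underlying behavior), a standard warnings filter should work. A minimal sketch, matching the warning text quoted later in this thread:

```python
import warnings

# Silence only this specific FutureWarning; everything else still shows up.
warnings.filterwarnings(
    "ignore",
    message="The `numpy.moveaxis` function is not implemented by Dask array",
    category=FutureWarning,
)
```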
Unfortunately, I'm not able to run subsequent cells since the kernel seems to be preoccupied resolving the
Oh, sorry I missed that part of your post. I'm trying this out on ocean.pangeo.io
My notebook kernel died on the line
Yeah, you're right. I was completely wrong about https://github.com/pangeo-data/pangeo-cloud-federation/issues/364#issuecomment-520064580. In NumPy 1.16,

```python
In [11]: np.__version__
Out[11]: '1.16.0'

In [12]: a = np.random.random((4, 4, 4))

In [13]: np.moveaxis(da.from_array(a, 2), 1, 2)
Out[13]: dask.array<transpose, shape=(4, 4, 4), dtype=float64, chunksize=(2, 2, 2)>
```

With NumPy 1.17, that returns a NumPy array.

```python
In [6]: a = np.random.random((4, 4, 4))

In [7]: np.moveaxis(da.from_array(a, 2), 1, 2)
/Users/taugspurger/sandbox/dask/dask/array/core.py:1264: FutureWarning: The `numpy.moveaxis` function is not implemented by Dask array. You may want to use the da.map_blocks function or something similar to silence this warning. Your code may stop working in a future release.
  FutureWarning,
Out[7]:
array([[[0.883594 , 0.83361276, 0.11596388, 0.42493785],
        [0.29075857, 0.3312683 , 0.70986969, 0.76634831],
        [0.61024485, 0.038276  , 0.14124975, 0.20009608],
        [0.74891671, 0.28027278, 0.62557011, 0.32603486]],

       [[0.45846013, 0.65317719, 0.14381856, 0.67333014],
        [0.18534854, 0.53083362, 0.01030157, 0.8822557 ],
        [0.55225587, 0.45671406, 0.58132645, 0.72099828],
        [0.64439194, 0.01546631, 0.136054  , 0.45866154]],

       [[0.9110986 , 0.71479734, 0.41174671, 0.63004493],
        [0.90519822, 0.07737934, 0.72285197, 0.25865702],
        [0.49462467, 0.56716872, 0.8396765 , 0.63395948],
        [0.58644267, 0.62561324, 0.00824153, 0.90913008]],

       [[0.51209298, 0.11582602, 0.89098367, 0.95992173],
        [0.35492695, 0.8645212 , 0.53640816, 0.12354237],
        [0.80328269, 0.50222311, 0.93996505, 0.23952077],
        [0.57965991, 0.00851389, 0.71330849, 0.20458262]]])
```

So I think we end up trying to load all 134GB of the data onto the worker running your notebook. Not good.

It may end up breaking things, but if you need a quick solution, you can export the environment variable NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0.
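A lazy alternative is to stay inside dask.array instead of calling the NumPy function. The warning text suggests da.map_blocks, but for a pure axis reordering dask's own transpose is simpler and stays lazy. A minimal sketch (the array sizes are just for illustration):

```python
import numpy as np
import dask.array as da

a = da.from_array(np.random.random((4, 4, 4)), chunks=2)

# np.moveaxis(a, 1, 2) falls back to NumPy here (computing the whole array);
# da.transpose performs the same axis reordering but stays a lazy dask array.
moved = da.transpose(a, axes=(0, 2, 1))
print(moved)                  # still a dask array, nothing computed yet
print(moved.compute().shape)  # (4, 4, 4)
```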
I am using ocean.pangeo.io as well; I just use the Dask dashboard provided in the jupyter lab and scale anywhere from 4 to 6 workers.
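For reference, the same scaling can also be done from code instead of the dashboard widget. A sketch, assuming `cluster` is the KubeCluster object created in the notebook:

```python
# Ask for a fixed number of workers...
cluster.scale(6)

# ...or let the cluster autoscale within a range based on load.
cluster.adapt(minimum=4, maximum=6)
```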
OK, if you add

```python
import os
os.environ['NUMPY_EXPERIMENTAL_ARRAY_FUNCTION'] = '0'
```

before importing numpy (not sure if before or after matters), and then

```python
from dask_kubernetes import KubeCluster

cluster = KubeCluster(n_workers=10, env={"NUMPY_EXPERIMENTAL_ARRAY_FUNCTION": "0"})
```

when you create the cluster, things will hopefully work.

Trying it out now.
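One way to confirm that the variable is actually visible in both places (the notebook process and the worker pods) is to query it through the client. A sketch, assuming `cluster` is the KubeCluster from the snippet above:

```python
import os
from dask.distributed import Client

client = Client(cluster)

# The notebook process only sees the variable if os.environ was set before
# importing numpy, as in the snippet above.
print(os.environ.get("NUMPY_EXPERIMENTAL_ARRAY_FUNCTION"))

# client.run executes a function on every worker and returns a dict keyed by
# worker address, so this shows what each worker pod received via env=.
print(client.run(lambda: os.environ.get("NUMPY_EXPERIMENTAL_ARRAY_FUNCTION")))
```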
Oh, yeah that definitely worked, since
I am so pleased that
In general, I am quite puzzled by this behavior from numpy. The 1.17 release seems like a step in the wrong direction: duck-array functionality that worked in 1.16 is now broken in 1.17 without this special environment variable. The numpy docs seem to suggest the opposite.
This seems backwards from what we are experiencing:
Over in dask/dask#2559, @shoyer noted that
Something related to the new `__array_function__` protocol. I wonder if it is worth opening an issue in numpy to alert the broader community to this.
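For context on what that protocol does: under NumPy 1.17, a function like np.moveaxis first asks its array-like arguments whether they implement `__array_function__` and, if so, hands the call to them instead of running NumPy's own implementation. A toy illustration (not Dask's actual code; the class and names are made up):

```python
import numpy as np

class LoggingDuckArray:
    """Minimal duck array that intercepts NumPy functions via __array_function__."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # With the protocol enabled, np.moveaxis(duck, 1, 2) lands here instead
        # of NumPy silently coercing `duck` to a plain ndarray.
        print(f"dispatched: {func.__name__}")
        unwrapped = [a.data if isinstance(a, LoggingDuckArray) else a for a in args]
        return LoggingDuckArray(func(*unwrapped, **kwargs))

arr = LoggingDuckArray(np.random.random((4, 4, 4)))
out = np.moveaxis(arr, 1, 2)       # prints "dispatched: moveaxis"
print(type(out), out.data.shape)   # LoggingDuckArray, (4, 4, 4)
```

With NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=0 the protocol is skipped and NumPy falls back to its older duck-typed code paths, which is why the environment-variable workaround above restores the 1.16-style behavior.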
I think perhaps we should, it wouldn't hurt! Though @TomAugspurger provided an easy workaround, the root problem is still active, and we should let the numpy community know about it so someone can potentially provide more insight or a fix.
Inside Dask, we chose to issue a warning and fall back to casting to NumPy arrays when unwrapped functions were encountered. The alternative would be to raise an error, but I don't know how much more useful that would be here. To avoid warnings or errors, dask will need to reimplement this function. This is discussed in the relevant PR (dask/dask#4822), but the issue itself is out of date now.
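Spelled out as code, that policy (warn, coerce to plain NumPy, call the NumPy function) looks roughly like the following. This is only a sketch of the behavior described above, not Dask's implementation; the class and registry names are invented:

```python
import warnings
import numpy as np

class FallbackArray:
    """Duck array illustrating the 'warn and fall back to NumPy' policy."""

    HANDLED_FUNCTIONS = {}  # hypothetical registry: NumPy function -> native implementation

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func in self.HANDLED_FUNCTIONS:
            return self.HANDLED_FUNCTIONS[func](*args, **kwargs)
        # No native implementation: warn, cast everything to ndarray, call NumPy.
        warnings.warn(
            f"numpy.{func.__name__} is not implemented by FallbackArray; "
            "falling back to NumPy.",
            FutureWarning,
        )
        coerced = [a.data if isinstance(a, FallbackArray) else a for a in args]
        return func(*coerced, **kwargs)

arr = FallbackArray(np.random.random((4, 4, 4)))
out = np.moveaxis(arr, 1, 2)   # warns, then returns a plain numpy.ndarray
print(type(out))
```

For a Dask array the equivalent coercion means computing the whole task graph, which is exactly the out-of-memory problem hit earlier in this thread.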
FYI dask/dask#4822 was just merged, so the next version of dask (like 2.2.1) will work nicely with `np.moveaxis`.
FWIW, as a library maintainer, I'm happy to have all of NumPy's dispatching unified under `__array_function__`.
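Once that release is on the cluster, the original call should stay lazy again without any environment variable. A quick check, assuming NumPy >= 1.17 and dask >= 2.2.1 are installed:

```python
import numpy as np
import dask
import dask.array as da

print(np.__version__, dask.__version__)

a = da.from_array(np.random.random((4, 4, 4)), chunks=2)
out = np.moveaxis(a, 1, 2)   # dispatches to dask's own moveaxis via __array_function__
print(type(out))             # a dask array, not numpy.ndarray
```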
Meta comment: this discussion would have been impossible (or at least a lot slower) on discourse (see #677).
@TomAugspurger, your workaround fix for creating a cluster after importing
Hmm, that's strange. I just tried it out on ocean.pangeo.io with
Do you specify anything else when you create the KubeCluster?
FYI, this is fixed on Dask master now. We're doing a 2.2.1 release today hopefully, so the workaround won't be necessary once the cluster is updated to use it.
No, I just copied and pasted the same code. Will try again now. That's great to hear!
I tried it again (with the environment specification) and waited around 10 minutes for workers to load, and the dask dashboard was still blank. But when I omitted that part, it took about 2 minutes for workers to load. Looking forward to that dask update!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.
I keep getting a warning message that repeats itself until either 1) I am kicked out of the server (when using clusters), or 2) the kernel just keeps running and repeating the error message (when not using clusters), to the point where I have to interrupt the cell. I think it might have something to do with some updates that went through yesterday affecting the numpy/dask interface, but I'm not completely sure.
Reproducible Code:
I also get a warning message after I run the cell importing all the necessary packages:
Here is the error message (the cell runs for a while if not using clusters):
I'd appreciate some insight to resolve this issue, thank you.