Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__array__ and __array_wrap__ #589

Closed
kohr-h opened this issue Oct 8, 2017 · 10 comments
Closed

__array__ and __array_wrap__ #589

kohr-h opened this issue Oct 8, 2017 · 10 comments
Labels
cat:feature New features/APIs

Comments

@kohr-h
Copy link

kohr-h commented Oct 8, 2017

A question about the Numpy array interface. cupy.ndarray implements __array__, but it doesn't work since np.asarray(cupy_array) raises an exception:

ValueError: object __array__ method not producing an array

So unless you have other plans for this method, it should be changed to return a Numpy array.

Do you want a PR? I'd also like to have the __array_wrap_ method, so I'd implement that one as well.

@niboshi
Copy link
Member

niboshi commented Nov 8, 2017

I'm sorry for the late reply.

I agree with you that it would be useful if CuPy arrays worked just like NumPy arrays. However, currently we intentionally do not allow that because that would incur hidden performance hit due to transfer between device and host.

As this feature has been requested from multiple users, however, we're currently considering some other way to accomplish this, by using configuration context (like chainer.using_config in Chainer), for example.

@niboshi niboshi added the cat:feature New features/APIs label Nov 8, 2017
@niboshi niboshi self-assigned this Nov 8, 2017
@kohr-h
Copy link
Author

kohr-h commented Nov 8, 2017

Thanks for your reply. So I read too much into the TODO comment in the file :-)

Anyway, for now I intend to use CuPy as a drop-in replacement of NumPy, independent of Chainer. There it would be good to have interoperability with NumPy arrays since it would make a number of implementations simpler. Avoiding data transfers between host and device would then be the job of the calling code.

Would there be a way to achieve this without involving Chainer? For example, by switching the feature on by default and disabling it by default in Chainer, with a config option there to enable it again.

@niboshi
Copy link
Member

niboshi commented Nov 21, 2017

I understand your point. But if NumPy and CuPy arrays were implicitly convertible, users would have to be very careful about which arrays they are dealing with. If a single NumPy array were involved in a series of CuPy operations by mistake, it would result in synchronization of all operations without notice. It would be very difficult to find the bottleneck.

As for the alternative method, I didn't mean CuPy would depend on Chainer. I meant introducing similar configuration mechanism used in Chainer.

@kohr-h
Copy link
Author

kohr-h commented Nov 21, 2017

It seems I understood your comment wrong. A config option handling as it is done in Chainer would be perfectly fine.

I also agree that for your primary use case in Chainer, hidden bottlenecks would be very bad, and that you want to avoid automatic CPU<->GPU transfers. I'll have to think a bit more about my use case, and if I should also avoid mixing up CPU and GPU memory/code.

For the record: Since Numpy 1.13 the __array_ufunc__ interface can be used to intercept ufunc calls like np.sin or np.add.reduce. In particular one can set __array_ufunc__ = None to signal that it is not supposed to be implemented.

@okuta
Copy link
Member

okuta commented May 10, 2018

CuPy uses asynchronous CPU to GPU memory copy by default mode. This is very easy feature because CUDA supports asynchronous execution (kernel launch).
But, CuPy does not use asynchronous GPU to CPU memory copy by default. NumPy only supports synchronous execution. So, Python code waits to finish GPU to CPU memory copy. This causes hard to find performance problems.

It seems I understood your comment wrong. A config option handling as it is done in Chainer would be perfectly fine.

It is good if the user can select whether to use __array_wrap__ or not by environment variable or global configure flag.

@brandondube
Copy link

brandondube commented May 24, 2018

Regarding __array__, could the inspect module not be used to check if the caller comes from cupy or , and change behavior based on that? Often a user may want to e.g. plot with matplotlib, but mpl will error because cupy's __array__ does not return an array. A warning could be thrown hinting at performance implications while allowing the intended behavior.

@kmaehashi
Copy link
Member

@asi1024
Copy link
Member

asi1024 commented Jun 16, 2020

We decided not to allow implicit conversion to NumPy arrays in #3421.

@asi1024 asi1024 closed this as completed Jun 16, 2020
@leofang
Copy link
Member

leofang commented Jun 16, 2020

Just wanna confirm, is this no longer the plan?

It is good if the user can select whether to use __array_wrap__ or not by environment variable or global configure flag.

@asi1024
Copy link
Member

asi1024 commented Jun 16, 2020

We have no plan for it currently, but we will reopen if someone continues the discussion to support it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat:feature New features/APIs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants