should chainer matmul have an equal behavior as numpy matmul? #1963
Comments
I agree that matmul should have the same interface as that of numpy. We want to keep the interface during v1, so it is good to make this change for v2. It would be better to leave the current matmul with a different name to make the migration to v2 easy. The above implementation looks good as a starting point, except the use of |
If I understand it correctly, the behavior of the new matmul (#2426) is as follows:
So, it reduces functionality and starts to raise more errors than before? |
I would call it consistent with numpy. It drops some operations and adds others.
>>> import numpy as np
>>> def test(shape1, shape2):
...     return (np.zeros(shape1) @ np.zeros(shape2)).shape
>>> test([2, 3], [3, 4])
(2, 4)
>>> test([2], [1, 4])
Traceback (most recent call last):
...
ValueError: shapes (2,) and (1,4) not aligned: 2 (dim 0) != 1 (dim 0)
>>> test([2, 3], [3]) # this does not work in the new version
(2,)
>>> test([5, 2, 3], [5, 3])
Traceback (most recent call last):
...
ValueError: shapes (5,2,3) and (5,3) not aligned: 3 (dim 2) != 5 (dim 0)
>>> test([5, 3], [5, 1, 3])
Traceback (most recent call last):
...
ValueError: shapes (5,3) and (5,1,3) not aligned: 3 (dim 1) != 1 (dim 1) |
Oh, sorry. I didn't notice input shapes are different between numpy.matmul and chainer functions in your original post. You're right. So my examples should be
So, it's more consistent with numpy.matmul because it reduces functionality that is inconsistent and adds some operations that are consistent with numpy.matmul e.g. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs. Thank you for your contributions. |
This issue is closed as announced. Feel free to re-open it if needed. |
Currently chainer has matmul and batch_matmul, but both behave differently from numpy's matmul:
Here are some shapes:
In my opinion, the behavior of chainer matmul should be equal to numpy matmul or raise an Exception.
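To make "equal to numpy matmul or raise an Exception" concrete, here is a minimal sketch of the shape rules that numpy.matmul documents (1-D promotion, matching inner dimensions, broadcasting of leading batch dimensions). The helper names `matmul_shape` and `_broadcast_batch` are hypothetical, made up for this illustration; they are not part of numpy or chainer, and the sketch only computes result shapes without touching any array data.

```python
from itertools import zip_longest


def _broadcast_batch(a_batch, b_batch):
    """Broadcast two tuples of leading (batch) dimensions, right-aligned."""
    out = []
    for x, y in zip_longest(reversed(a_batch), reversed(b_batch), fillvalue=1):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(
                "batch dimensions %r and %r cannot be broadcast" % (a_batch, b_batch))
    return tuple(reversed(out))


def matmul_shape(shape1, shape2):
    """Hypothetical helper: result shape under numpy.matmul's documented rules.

    Raises ValueError where the shapes are not compatible.
    """
    a, b = tuple(shape1), tuple(shape2)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("matmul does not accept scalar operands")

    # A 1-D operand is promoted to a matrix; the added axis is dropped again below.
    a_is_vector = len(a) == 1
    b_is_vector = len(b) == 1
    if a_is_vector:
        a = (1,) + a   # treat as a row vector
    if b_is_vector:
        b = b + (1,)   # treat as a column vector

    # The contracted dimensions must match exactly (they are never broadcast).
    if a[-1] != b[-2]:
        raise ValueError("shapes %r and %r not aligned" % (tuple(shape1), tuple(shape2)))

    # Everything before the last two axes is a stack of matrices and is broadcast.
    batch = _broadcast_batch(a[:-2], b[:-2])

    core = ()
    if not a_is_vector:
        core += (a[-2],)
    if not b_is_vector:
        core += (b[-1],)
    return batch + core
```

For example, `matmul_shape([5, 2, 3], [5, 3, 4])` gives `(5, 2, 4)`, while `matmul_shape([5, 2, 3], [5, 3])` raises a `ValueError` because the second operand is treated as a single 5x3 matrix rather than a batch of vectors.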
In #1901 @okuta has started to rewrite matmul's internal code and I asked to also rewrite the interface.
Therefore, I want to start a discussion about whether the interface should be changed.
A possible starting point for a new matmul is the following code: