New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent Parameter Order in "mx.dot" Resulting in "dot shape error" #55

Closed
chyanju opened this Issue Jan 11, 2016 · 7 comments

Comments

Projects
None yet
4 participants
@chyanju

chyanju commented Jan 11, 2016

I was trying the "mx.dot()" to calculate 2D matrix multiplication. Codes & exceptions in Julia REPL are like:

# Julia REPL
julia> using MXNet
julia> A=mx.ones((5,3))
mx.NDArray(5,3)
julia> B=mx.ones((3,9))
mx.NDArray(3,9)
julia> mx.dot(A,B)
[10:41:49] ./dmlc-core/include/dmlc/logging.h:208: [10:41:49] src/ndarray/./ndarray_function.h:71: Check failed: (lshape[1]) == (rshape[0]) dot shape error: (3, 5) X (9, 3)
ERROR: MXNet.mx.MXError("[10:41:49] src/ndarray/./ndarray_function.h:71: Check failed: (lshape[1]) == (rshape[0]) dot shape error: (3, 5) X (9, 3)")
 [inlined code] from /home/joseph/.julia/v0.4/MXNet/src/base.jl:57
 in _invoke_mxfunction at /home/joseph/.julia/v0.4/MXNet/src/ndarray.jl:836
 in dot at no file
julia> mx.dot(B,A)
mx.NDArray(5,9)

So I figure the order of the parameters should be like "mx.dot(rhs::NDArray,lhs::NDArray)", which is different from others, like Python, as the following Python code works:

# Python REPL
>>> import mxnet as mx
>>> A=mx.nd.ones((5,3))
>>> B=mx.nd.ones((3,9))
>>> mx.ndarray.dot(A,B)
<mxnet.ndarray.NDArray object at 0x7f1f00a06650>

And is this specific in Julia or it is a bug? Should it be "mx.dot(lhs::NDArray,rhs::NDArray)"?

@chyanju

This comment has been minimized.

Show comment
Hide comment
@chyanju

chyanju Jan 11, 2016

BTW, the mxnet version is
"MXNet" => v"0.0.7"
in Julia, and the Python I tested above is using the same libmxnet.so as Julia does.

chyanju commented Jan 11, 2016

BTW, the mxnet version is
"MXNet" => v"0.0.7"
in Julia, and the Python I tested above is using the same libmxnet.so as Julia does.

@pluskid

This comment has been minimized.

Show comment
Hide comment
@pluskid

pluskid Jan 11, 2016

Member

In my opinion, the semantic of dot(A,B) should be A' * B -- (A transposed) x (B). The inconsistency should be a bug at some place. I will have a look. Thanks for reporting.

Member

pluskid commented Jan 11, 2016

In my opinion, the semantic of dot(A,B) should be A' * B -- (A transposed) x (B). The inconsistency should be a bug at some place. I will have a look. Thanks for reporting.

@pluskid

This comment has been minimized.

Show comment
Hide comment
@pluskid

pluskid Jan 11, 2016

Member

In Julia (and R, Matlab), the NDArray has column-majoring. So a matrix with shape (2, 3) is a matrix with shape (3, 2) in libmxnet. The implementation of dot in libmxnet checks

CHECK_EQ(lshape[1], rshape[0])

So in Python, dot of matrices (2, 3) and (3, 8) checks pass because lshape[1] is 3, and rshape[0] is 3.

But in Julia (and R, Matlab), the check will fail because lshape[1] is 2 and rshape[0] is 8.

One possible solution is when defining dot in Julia (and R, matlab), swap the lhs with rhs. Semantically, passing a matrix from Julia to libmxnet (and back) means a transpose. So (A' * B')' = B * A. Though this is quite hacky and obscure, as we need to match function name in automatic operator defining of NDArray. @tqchen @mli I guess the R and Matlab bindings have similar issue.

Member

pluskid commented Jan 11, 2016

In Julia (and R, Matlab), the NDArray has column-majoring. So a matrix with shape (2, 3) is a matrix with shape (3, 2) in libmxnet. The implementation of dot in libmxnet checks

CHECK_EQ(lshape[1], rshape[0])

So in Python, dot of matrices (2, 3) and (3, 8) checks pass because lshape[1] is 3, and rshape[0] is 3.

But in Julia (and R, Matlab), the check will fail because lshape[1] is 2 and rshape[0] is 8.

One possible solution is when defining dot in Julia (and R, matlab), swap the lhs with rhs. Semantically, passing a matrix from Julia to libmxnet (and back) means a transpose. So (A' * B')' = B * A. Though this is quite hacky and obscure, as we need to match function name in automatic operator defining of NDArray. @tqchen @mli I guess the R and Matlab bindings have similar issue.

pluskid added a commit that referenced this issue Jan 16, 2016

@pluskid

This comment has been minimized.

Show comment
Hide comment
@pluskid

pluskid Jan 16, 2016

Member

A workaround is added to swap the arguments for dot. However, we need to worry about consistency also in the symbolic API side.

Member

pluskid commented Jan 16, 2016

A workaround is added to swap the arguments for dot. However, we need to worry about consistency also in the symbolic API side.

@tqchen

This comment has been minimized.

Show comment
Hide comment
@tqchen

tqchen Jan 16, 2016

Member

This problem only occurs for few non-elementwise cases for now, e.g. dot, The symbolic API(FullyConnected) is fine as weight and bias are properly specified

Member

tqchen commented Jan 16, 2016

This problem only occurs for few non-elementwise cases for now, e.g. dot, The symbolic API(FullyConnected) is fine as weight and bias are properly specified

@vchuravy

This comment has been minimized.

Show comment
Hide comment
@vchuravy
Collaborator

vchuravy commented Sep 5, 2016

@pluskid

This comment has been minimized.

Show comment
Hide comment
@pluskid

pluskid Sep 12, 2016

Member

This should have already been fixed in v0.1.0

Member

pluskid commented Sep 12, 2016

This should have already been fixed in v0.1.0

@pluskid pluskid closed this Sep 12, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment