Inconsistency between tf.matmul in Python and tf$matmul in R #9
I don't know numpy and R matrix functions well enough to say why, but I can tell you that the numpy constants you are creating are:
[1 2 3 4 5 6 7 8]
[1 2 3 4]
The R matrices you are creating are (when converted to NumPy):
[[[1 5]
[3 7]]
[[2 6]
[4 8]]]
[[[1]
[3]]
So I think the R matrices are a red herring (you don't need to be doing that much munging around with matrices!). If we simplify your code to this, it yields the expected result:
library(tensorflow)
A_tf = tf$constant(c(1:8), dtype="float64", shape=c(2,2,2))
B_tf = tf$constant(c(1:4), dtype="float64", shape=c(2,2,1))
sess = tf$Session()
sess$run(A_tf)
sess$run(B_tf)
sess$run(tf$matmul(A_tf,B_tf)) |
From @bwlewis on February 14, 2017 14:2 No, it's not quite the same R code:
Python equivalent:
There is indeed something funny going on here, possibly related to avoidance of native Python lists; trying to figure it out... Also note, as @G-Lynn notes above:
(sorry to keep editing this), https://github.com/bwlewis/python is also wrong! Curiously, the error is different--it's a transposition error instead of a dimensional error. Closer, but still wrong. Wow.
|
From @bwlewis on February 14, 2017 14:29 Type of bummer. But it seems to work with either a Python list or a numpy array:
|
The R Does that help? |
From @bwlewis on February 14, 2017 15:20 It doesn't, curiously enough:
|
Even explicitly passing an R array (which will become a numpy array on the other side) gives the same result: tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=array(c(2,2,1)))) |
This code yields the expected output:
py <- import("__builtin__")
py$print(tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=c(2,2,1))))
This means that the problem is NOT on the input side; it's in the marshaling of a 3-d array back into R. That code is here: https://github.com/rstudio/reticulate/blob/master/src/python.cpp#L516
I do this to "cast" the NumPy array into a Fortran-compatible array:
array = (PyArrayObject*)PyArray_CastToType(array, descr, NPY_ARRAY_FARRAY);
Then I just access the array data in memory order. Clearly this isn't working. I had assumed this was working properly based on this test (https://github.com/rstudio/reticulate/blob/master/tests/testthat/test-python-numpy.R#L18-L22); however, looking closely, the test is invalid because the result it is compared to also goes through the faulty marshaling layer. |
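As a rough NumPy-level analogue of the cast described above (this is not reticulate's C++ code, and the variable names are made up for illustration), the sketch below converts a C-ordered 3-d array to Fortran order and reads it back in memory order. The resulting sequence is what a column-major consumer such as R would fill its array from.

```python
import numpy as np

# A C-ordered (row-major) 3-d array, as NumPy creates by default.
b = np.arange(1, 9).reshape(2, 2, 2)

# Analogue of casting with NPY_ARRAY_FARRAY: same values at the same
# indices, but laid out column-major in memory.
b_fortran = np.asfortranarray(b)

# Read the elements in the order they sit in memory (order="K").
memory_order = b_fortran.ravel(order="K")
print(memory_order.tolist())
```

Filling a 2x2x2 R array column by column from this sequence puts every value at the same index triple it had in NumPy, so the cast itself looks sound under these assumptions.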
From @bwlewis on February 14, 2017 16:12 Aha. Note also a row/column memory ordering problem in the tensor:
versus
again, as @jjallaire shows, this is on the R marshaling side because we see the right order in Python:
|
I just exported an > library(reticulate)
> a <- array(c(1:12), dim = c(2,3,2))
> a
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
> r_to_py(a)
[[[ 1 7]
[ 3 9]
[ 5 11]]
[[ 2 8]
[ 4 10]
[ 6 12]]]
@bwlewis I'm not thinking clearly in multiple Fortran vs. C dimensions right now! Do these results surprise you? (it seems to me like the arrays are different however it could be that I'm missing a subtlety of the distinct dimension ordering). |
From @bwlewis on February 14, 2017 16:36 @jjallaire It's a memory row/column order problem. I need to carefully look at the numpy source. I suspect this:
Anyway, will carefully look at numpy source tonight. |
From @G-Lynn on February 14, 2017 17:0 @jjallaire @bwlewis Thank you for looking into this! |
Okay, @bwlewis I am eternally grateful for your help on this! |
From @eddelbuettel on February 14, 2017 17:13 Department of foggy memory here, but IIRC in package RcppCNPy we always end up transposing (for R) what we get (via the cnpy library) as NumPy data. |
Would one possibility be that tensorflow is taking the NumPy array and accessing its buffer directly with the assumption of row-major ordering? Under the hood tensorflow is using Eigen and I'm guessing they don't make copies of the NumPy arrays. |
I think there are problems going into Python as well. Here's a simple example where on the R side we have four 2x3 arrays but on the Python side we end up with two 3x4 arrays:
> library(reticulate)
> a <- array(c(1:24), dim = c(2,3,4))
> a
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
, , 3
[,1] [,2] [,3]
[1,] 13 15 17
[2,] 14 16 18
, , 4
[,1] [,2] [,3]
[1,] 19 21 23
[2,] 20 22 24
> py_a <- r_to_py(a)
> py_a
[[[ 1 7 13 19]
[ 3 9 15 21]
[ 5 11 17 23]]
[[ 2 8 14 20]
[ 4 10 16 22]
[ 6 12 18 24]]]
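The same printout can be reproduced in NumPy alone, which may help separate the marshaling question from the display question. This is a sketch that assumes `r_to_py` hands NumPy a column-major array of shape (2, 3, 4); `order="F"` is used here to imitate R's fill order.

```python
import numpy as np

# Fill a 2x3x4 array column-major, the way R's array(1:24, dim = c(2,3,4)) does.
a = np.arange(1, 25).reshape((2, 3, 4), order="F")

print(a)           # NumPy shows 2 blocks of shape 3x4 (grouped by the FIRST axis)
print(a[:, :, 0])  # R's ", , 1" slice: fix the LAST axis instead
print(a[:, :, 3])  # R's ", , 4" slice
```

Slicing along the last axis recovers the four 2x3 matrices that R prints, so the two displays describe the same array grouped along different axes.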
|
From @bwlewis on February 14, 2017 18:22 in these examples "transpose" needs to apply to a 3d array.
|
From @bwlewis on February 14, 2017 18:23 possible! would perhaps then be a python/tensorflow bug.
|
But then again it seems like my simple repro above that has nothing to do with tensorflow indicates we definitely have problems of our own. |
Here's another example which I believe illustrates that the order of the third dimension differs between R and NumPy:
> library(reticulate)
> a <- array(c(1:24), dim = c(2,3,4))
> a
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
, , 3
[,1] [,2] [,3]
[1,] 13 15 17
[2,] 14 16 18
, , 4
[,1] [,2] [,3]
[1,] 19 21 23
[2,] 20 22 24
> py_a <- r_to_py(aperm(a, c(3,1,2)))
> py_a
[[[ 1 3 5]
[ 2 4 6]]
[[ 7 9 11]
[ 8 10 12]]
[[13 15 17]
[14 16 18]]
[[19 21 23]
[20 22 24]]]
> The printed arrays are the same because we used This is sufficiently unfamiliar territory to me that I could be mistaken. Please advise :-) |
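For comparison, here is a NumPy-only sketch of the permutation used above. It assumes `np.transpose(a, (2, 0, 1))` is the counterpart of R's `aperm(a, c(3, 1, 2))` (axis numbers are zero-based in NumPy), and it builds the input with `order="F"` to imitate the R array.

```python
import numpy as np

a = np.arange(1, 25).reshape((2, 3, 4), order="F")  # same values as the R array

# Analogue of aperm(a, c(3, 1, 2)): the old last axis becomes the first.
moved = np.transpose(a, (2, 0, 1))
print(moved.shape)
print(moved)
```

After the permutation NumPy groups its printout by what was R's third dimension, which is why the two displays line up.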
This Stack Overflow thread indicates that NumPy prints 3-d arrays in a way that is somewhat unintuitive to those of us familiar with R and Matlab: http://stackoverflow.com/questions/22981845/3-dimensional-array-in-numpy Could this be the source of all the problems/confusion? (i.e. the arrays are correct, just printed incomprehensibly). |
This example proves that when 3-d arrays are marshaled back and forth they preserve their equality:
library(reticulate)
a <- array(c(1:24), dim = c(2,3,4))
py_a <- r_to_py(a)
r_a <- py_to_r(py_a)
identical(a, r_a) |
Another example which illustrates that NumPy receives the 3d array correctly:
> library(reticulate)
> a <- array(c(1:24), dim = c(2,3,4))
> py_a <- r_to_py(a)
> py_a$shape
[[1]]
[1] 2
[[2]]
[1] 3
[[3]]
[1] 4
> py_a$flatten(order = 'F')
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> r_a <- py_to_r(py_a)
> identical(a, r_a)
[1] TRUE |
Here's yet another example, showing that we get different NumPy arrays depending on whether we create them in R or NumPy:
> library(reticulate)
> np <- import("numpy")
> # create a numpy 3d array from an R array
> a <- array(c(1:24), dim = c(2,3,4))
> r_to_py(a)
[[[ 1 7 13 19]
[ 3 9 15 21]
[ 5 11 17 23]]
[[ 2 8 14 20]
[ 4 10 16 22]
[ 6 12 18 24]]]
> # create a numpy 3d array via Python
> py <- py_run_string("import numpy; a = numpy.arange(1,25).reshape(2,3,4)")
> py_get_attr(py, "a")
[[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
[[13 14 15 16]
[17 18 19 20]
[21 22 23 24]]]
> I'm so confused! I feel like rather than 2 hard problems in computer science (naming things and cache invalidation) we now need to add a 3rd (in-memory array representations)! |
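The difference between the two printouts comes down to index arithmetic, which can be sketched without NumPy at all. The helper names `c_offset` and `f_offset` below are hypothetical; they compute where an index triple lands in a flat sequence under row-major and column-major layouts respectively.

```python
def c_offset(index, shape):
    """Flat position of `index` when the LAST axis varies fastest (NumPy default)."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def f_offset(index, shape):
    """Flat position of `index` when the FIRST axis varies fastest (R, Fortran)."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


shape = (2, 3, 4)
values = list(range(1, 25))  # the same 24 numbers in both cases

index = (0, 0, 1)  # zero-based; R would write a[1, 1, 2]
print(values[c_offset(index, shape)])  # value numpy.arange(1,25).reshape(2,3,4) puts there
print(values[f_offset(index, shape)])  # value R's array(1:24, dim = c(2,3,4)) puts there
```

Both arrays hold the numbers 1 to 24 with the same shape, but the fill order assigns them to different index triples, so they are genuinely different arrays and not just differently printed ones.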
The reason they are different is that the higher order dimensions go first in Python (e.g. it should be |
The NumPy C API is also reporting dims with higher order dimensions first! This means that we are getting the |
EDIT on above CBLAS comment: I forgot that cblas is just a lightweight wrapper over blas that inserts transposes as needed, see for instance http://www.netlib.org/blas/blast-forum/cblas.tgz -- so my comments on copying are wrong above, no copies are needed. |
@jjallaire Consider your example above:
On the other hand:
Thus:
This leads to the question, why isn't this line of C++ code translating the array for us? https://github.com/rstudio/reticulate/blob/master/src/python.cpp#L549:
Getting back to the original problem, there may be cases in tensorflow where row order is assumed. But one thing at a time. |
Maybe a Python n-d array bug in PyArray_CastToType? It works fine in 2-d. Here using the old
I think the problems are limited to the n-d case, n > 2. |
Also, just a nit on the use of
Really, shouldn't the default be K or, arguably, A? Seems nuts to me otherwise. |
And just to verify that your example above is not merely a print/display problem, these things are in fact different within Python. So there is a problem in the assignment of the Python objects from R?
|
Maybe it's not us. Forget about R for a minute:
EDIT: I'm wrong, and have been misinterpreting the order flag above. It's not simply saying how to fill data structures by rows first ("C_CONTIGUOUS") or columns first ("F_CONTIGUOUS"), but is also a signal to all the operators about the element order for arithmetic. In the above example, the data layouts are identical in memory -- see this with |
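One way to illustrate the point in the EDIT (not necessarily the example the author had in mind) is to reshape the same flat sequence with both orders and compare bytes, strides and indexing. The names `c_view` and `f_view` are made up for this sketch.

```python
import numpy as np

flat = np.arange(1, 9)

c_view = flat.reshape((2, 2, 2), order="C")
f_view = flat.reshape((2, 2, 2), order="F")

# Both serialise to the same bytes when walked in their own order ...
same_bytes = c_view.tobytes(order="C") == f_view.tobytes(order="F")

# ... but the strides (bytes to step per axis) are reversed, so the same
# index triple lands on a different element.
print(same_bytes)
print(c_view.strides, f_view.strides)
print(c_view[0, 0, 1], f_view[0, 0, 1])
```

The order flag therefore changes how indices map onto memory, which is what every operator on the array sees.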
OK, getting back to the original problem, here it is in a nutshell:
There is an inconsistency in how C_CONTIGUOUS arrays are brought into R from Python. Things work as expected in the 2-d case, but not so in the 3-d case. This difference is unexpected. |
Okay, thanks again for digging into this. I'll do some more experimenting
and hopefully soon emerge with the incantations required to make this work
as expected.
…On Wed, Feb 15, 2017 at 11:26 AM, B. W. Lewis ***@***.***> wrote:
OK, getting back to the original problem, here it is in a nutshell:
library(reticulate)
np <- import("numpy")
# 2-d
a <- py_run_string("import numpy; a = numpy.reshape(numpy.arange(1, 5), (2, 2))")
a <- py_get_attr(a, "a")
# 3-d not consistent
b <- py_run_string("import numpy; b = numpy.arange(1, 9).reshape(2, 2, 2)")
b <- py_get_attr(b, "b")
a
#[[1 2]
# [3 4]]
py_to_r(a)
# [,1] [,2]
#[1,] 1 2
#[2,] 3 4
b
#[[[1 2]
# [3 4]]
#
# [[5 6]
# [7 8]]]
py_to_r(b)
#, , 1
# [,1] [,2]
#[1,] 1 3
#[2,] 5 7
#
#, , 2
# [,1] [,2]
#[1,] 2 4
#[2,] 6 8
There is an inconsistency in how C_CONTIGUOUS arrays are brought into R
from Python. Things work as expected in the 2-d case, but not so in the 3-d
case. This difference is unexpected.
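A quick index-by-index check, sketched in NumPy alone, suggests that the `py_to_r(b)` output above holds the same element at every index as `b`, and that the apparent difference comes from which axis each language groups by when printing.

```python
import numpy as np

b = np.arange(1, 9).reshape(2, 2, 2)

# R prints one matrix per value of the LAST index: ", , 1" is b[:, :, 0] here.
print(b[:, :, 0])
print(b[:, :, 1])

# NumPy prints one matrix per value of the FIRST index instead.
print(b[0])
print(b[1])
```

The first two slices match the `, , 1` and `, , 2` blocks in the R output, while the last two match the NumPy printout. In the 2-d case there is no third axis to group by, which may be why that case looked consistent.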
|
yeah, still working on this...more soon
|
The issue directly gets to a common source of confusion with n-d A lightly-edited reproduction of the reference Python code in the issue
The issue goes on to reproduce the example using R-generated arrays
However, already at this point we see that the R-generated array A is However, we can see how it can be easy to make the mistake that they are the
Instead, we need to construct the R array A differently to match the
With similar care ordering the values in the b array
An extended version of this discussion will appear in a package vignette. |
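One possible construction, sketched in NumPy and not necessarily the one the vignette uses: fill the values column-major as R does, then reverse the axes so the result matches the arrays built by the Python reference code. The `_col` and `_fixed` names are hypothetical, and the R call in the comment is an assumed equivalent.

```python
import numpy as np

# Reference arrays from the Python code in the issue.
a_ref = np.arange(1, 9).reshape(2, 2, 2)
b_ref = np.arange(1, 5).reshape(2, 2, 1)

# Column-major fill, as R's array(1:8, dim = c(2, 2, 2)) would produce.
a_col = np.arange(1, 9).reshape((2, 2, 2), order="F")
b_col = np.arange(1, 5).reshape((1, 2, 2), order="F")

# Reversing the axes (assumed R equivalent: aperm(x, c(3, 2, 1)))
# recovers the reference layout.
a_fixed = a_col.transpose(2, 1, 0)
b_fixed = b_col.transpose(2, 1, 0)

print(np.array_equal(a_col, a_ref))
print(np.array_equal(a_fixed, a_ref), np.array_equal(b_fixed, b_ref))
```

Note that `b_col` starts with shape (1, 2, 2) so that reversing its axes yields the (2, 2, 1) shape the reference code uses.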
From @G-Lynn on February 13, 2017 23:31
Attached is a simple example of an issue in using matmul to multiply an array of matrices in R. I think the issue is a discrepancy in the way that the shape argument works in the R and Python versions.
I am trying to use the tf$matmul function to multiply an array of matrices by an array of vectors. As an example, I am trying to multiply [1, 2; 3, 4] * [1; 2] and [5,6; 7,8] * [3; 4]. The example works as expected in Python, but in the TensorFlow API for R, a dimension error is generated.
In Python, the code for the example is:
############# Beginning of Python code
import numpy as np
import tensorflow as tf
a = tf.constant(np.arange(1, 9, dtype=np.int32), shape=[2, 2, 2])  # create array of 2 matrices, each 2x2: (1,2; 3,4) and (5,6; 7,8)
b = tf.constant(np.arange(1, 5, dtype=np.int32), shape=[2, 2, 1])  # multiply each of the matrices by a 2x1 vector: (1,2)' and (3,4)'
sess = tf.Session()
sess.run(a)
c = tf.matmul(a, b)
sess.run(a)
sess.run(b)
sess.run(c) #the answer is the correct set of 2x1 vectors (5,11)' and (39,53)'
############# End of Python Code
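The expected result can be checked without TensorFlow. The sketch below uses `np.matmul`, which also treats the leading axis as a batch and multiplies the trailing matrices pairwise; it stands in for `tf.matmul` here only as an illustration.

```python
import numpy as np

a = np.arange(1, 9, dtype=np.int32).reshape(2, 2, 2)  # (1,2; 3,4) and (5,6; 7,8)
b = np.arange(1, 5, dtype=np.int32).reshape(2, 2, 1)  # (1,2)' and (3,4)'

# np.matmul treats the leading axis as a batch and multiplies the trailing
# 2x2 and 2x1 matrices pairwise.
c = np.matmul(a, b)
print(c.shape)
print(c[:, :, 0])
```

Each row of `c[:, :, 0]` is one of the two result vectors described in the comment above.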
When I try to implement this same example in R, an error is produced due to a difference in the way the dimensions of the arrays are indexed in the shape argument.
##################### Begin R Code
rm(list = ls())
devtools::install_github("rstudio/tensorflow")
library(tensorflow)
#Create an array of 2 matrices
A = list(matrix(1:4, nrow=2, byrow=T), matrix(5:8, nrow=2, byrow=T))
A = array(unlist(A), dim=c(2,2,2) ) #2 matrices of dimension 2x2
B = array(1:4, dim = c(2,1,2) ) #2 vectors of 2x1
A_tf = tf$constant(A, dtype="float64", shape=c(2,2,2))
B_tf = tf$constant(B, dtype="float64", shape=c(2,1,2))
sess = tf$Session()
sess$run(A_tf)
sess$run(B_tf)
sess$run(tf$matmul(A_tf,B_tf))
################## End R code
I believe the error is because of a discrepancy between the way the shape argument works in tf$constant (R) and the way it works in tf.constant (Python). In R, the number of elements in the array is the last entry in shape, so that shape = c(2,1,3) means an array of 3 2x1 vectors. In the Python implementation, the number of array elements is the first entry, so that shape=[3, 2, 1] means 3 vectors of 2x1.
When the function tf$matmul(A_tf,B_tf) is called, I think the difference in indexing the shapes of the array is causing an error.
Thanks for your attention.
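The dimension error can be reproduced in NumPy, under the assumption that `tf.matmul` follows the same batch-first convention as `np.matmul` (leading axis is the batch, last two axes are the matrix). The names `b_bad` and `b_good` are made up for this sketch.

```python
import numpy as np

a = np.zeros((2, 2, 2))
b_bad = np.zeros((2, 1, 2))   # shape used in the R code above
b_good = np.zeros((2, 2, 1))  # shape used in the Python code above

try:
    np.matmul(a, b_bad)
    failed = False
except ValueError as err:
    failed = True
    print("matmul rejected the shapes:", err)

print(np.matmul(a, b_good).shape)
```

With shape (2, 1, 2) the trailing matrices are 1x2, which cannot be multiplied by a 2x2 matrix on the left, so passing the R-style shape unchanged fails regardless of the values.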
Copied from original issue: rstudio/tensorflow#88