Inconsistency between tf.matmul in Python and tf$matmul in R #9

Closed
jjallaire opened this issue Feb 14, 2017 · 35 comments

Comments

@jjallaire
Member

From @G-Lynn on February 13, 2017 23:31

Attached is a simple example of an issue in using matmul to multiply an array of matrices in R. I think the issue is a discrepancy in the way that the shape argument works in the R and python version.

I am trying to use the tf$matmul function to multiply an array of matrices by an array of vectors. As an example, I am trying to multiply [1, 2; 3, 4] * [1; 2] and [5,6; 7,8] * [3; 4]. The example works as expected in Python, but in the TensorFlow API for R, a dimension error is generated.

In Python, the code for the example is:
############# Beginning of Python code
import numpy as np
import tensorflow as tf
a = tf.constant(np.arange(1, 9, dtype=np.int32), shape=[2, 2, 2])  # array of two 2x2 matrices: (1,2; 3,4) and (5,6; 7,8)
b = tf.constant(np.arange(1, 5, dtype=np.int32), shape=[2, 2, 1])  # multiply each matrix by a 2x1 vector: (1,2)' and (3,4)'
sess = tf.Session()
sess.run(a)
c = tf.matmul(a, b)
sess.run(a)
sess.run(b)
sess.run(c) #the answer is the correct set of 2x1 vectors (5,11)' and (39,53)'
############# End of Python Code
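
For reference, the expected result can be checked without a TensorFlow session. The following is a minimal NumPy sketch; np.matmul stands in for tf.matmul here, which is an assumption made only to keep the check self-contained. The leading axis is treated as the index over matrices.

```python
import numpy as np

# Two 2x2 matrices stacked along the leading axis:
# a[0] = [[1, 2], [3, 4]], a[1] = [[5, 6], [7, 8]]
a = np.arange(1, 9, dtype=np.int32).reshape(2, 2, 2)

# Two 2x1 column vectors: b[0] = [[1], [2]], b[1] = [[3], [4]]
b = np.arange(1, 5, dtype=np.int32).reshape(2, 2, 1)

# matmul multiplies over the trailing two axes and loops over the leading one
c = np.matmul(a, b)
print(c.shape)       # (2, 2, 1)
print(c[:, :, 0])    # rows are the two result vectors: [5, 11] and [39, 53]
```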

When I try to implement this same example in R, an error is produced due to a difference in the way the dimensions of the arrays are indexed in the shape argument.

##################### Begin R Code
rm(list = ls())
devtools::install_github("rstudio/tensorflow")
library(tensorflow)

#Create an array of 2 matrices
A = list(matrix(1:4, nrow=2, byrow=T), matrix(5:8, nrow=2, byrow=T))
A = array(unlist(A), dim=c(2,2,2) ) #2 matrices of dimension 2x2
B = array(1:4, dim = c(2,1,2) ) #2 vectors of 2x1
A_tf = tf$constant(A, dtype="float64", shape=c(2,2,2))
B_tf = tf$constant(B, dtype="float64", shape=c(2,1,2))
sess = tf$Session()
sess$run(A_tf)
sess$run(B_tf)
sess$run(tf$matmul(A_tf,B_tf))

################## End R code

I believe the error is because of a discrepancy between the way the shape argument works in tf$constant (R) and the way it works in tf.constant (Python). In R, the number of elements in the array is the last entry in shape, so that shape = c(2,1,3) means an array of 3 vectors, each 2x1. In the Python implementation, the number of array elements is the first entry, so that shape=[3, 2, 1] means 3 vectors of 2x1.

When the function tf$matmul(A_tf,B_tf) is called, I think the difference in indexing the shapes of the array is causing an error.
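
To make the two indexing conventions concrete, here is a small plain-Python sketch. The helper names are hypothetical and exist only for illustration. It computes where a 0-based index lands in a flat buffer when the first dimension varies fastest (R, Fortran) versus when the last dimension varies fastest (C, NumPy's default).

```python
def column_major_offset(index, dims):
    """Flat offset of a 0-based index when the FIRST dimension varies fastest."""
    offset, stride = 0, 1
    for i, n in zip(index, dims):
        offset += i * stride
        stride *= n
    return offset


def row_major_offset(index, dims):
    """Flat offset of a 0-based index when the LAST dimension varies fastest."""
    offset, stride = 0, 1
    for i, n in zip(reversed(index), reversed(dims)):
        offset += i * stride
        stride *= n
    return offset


values = list(range(1, 9))   # the flat data 1..8
dims = (2, 2, 2)

# The same index (0, 0, 1) picks a different element under each convention
r_style = values[column_major_offset((0, 0, 1), dims)]
numpy_style = values[row_major_offset((0, 0, 1), dims)]
print(r_style, numpy_style)   # 5 2
```

The same flat data with the same dims therefore describes two different arrays, depending on which fill order the reader assumes.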

Thanks for your attention.

Copied from original issue: rstudio/tensorflow#88

@jjallaire
Member Author

I don't know numpy and R matrix functions well enough to say why, but I can tell you that tf$constant in R literally calls Python's tf.constant (there is no intermediate code which transforms the input and/or would cause the functions to behave differently).

The numpy constants you are creating are:

[1 2 3 4 5 6 7 8]
[1 2 3 4]

The R matrices you are creating are (when converted to NumPy):

[[[1 5]
  [3 7]]

 [[2 6]
  [4 8]]]

[[[1]
  [3]]

So I think the R matrices are a red herring (you don't need to be doing that much munging around with matrices!). If we simplify your code to this, it yields the expected result:

library(tensorflow)
A_tf = tf$constant(c(1:8), dtype="float64", shape=c(2,2,2))
B_tf = tf$constant(c(1:4), dtype="float64", shape=c(2,2,1))
sess = tf$Session()
sess$run(A_tf)
sess$run(B_tf)
sess$run(tf$matmul(A_tf,B_tf))

@jjallaire
Member Author

From @bwlewis on February 14, 2017 14:2

No, it's not quite the same B_tf above, and consequently the result
is a matrix instead of an array of vectors as desired.

R code:

library(tensorflow)
import("numpy") -> np
tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=c(2,2,1)))

#, , 1
#     [,1] [,2]
#[1,]    1    2
#[2,]    3    4

Python equivalent:

tf.Session().run(tf.constant(np.arange(1, 5, dtype=np.int32),shape=[2, 2, 1]) )

#array([[[1],
#        [2]],
#
#       [[3],
#        [4]]], dtype=int32)

There is indeed something funny going on here, possibly related to avoidance of native Python lists; trying to figure it out...

Also note, as @G-Lynn notes above:

tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=c(2,1,2)))
#, , 1
#
#     [,1]
#[1,]    1
#[2,]    3
#
#, , 2
#
#     [,1]
#[1,]    2
#[2,]    4

(sorry to keep editing this), https://github.com/bwlewis/python is also wrong! Curiously, the error is different--it's a transposition error instead of a dimensional error. Closer, but still wrong. Wow.

library(python)
 import("tensorflow") -> tf
 import("numpy") -> np
tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=c(2,2,1)))
#[[[1]
#  [2]]
#
# [[3]
#  [4]]]

@jjallaire
Member Author

From @bwlewis on February 14, 2017 14:29

Type of shape is not defined in the doc: https://www.tensorflow.org/api_docs/python/constant_op/constant_value_tensors

bummer. But it seems to work with either a Python list or a numpy array:

>>> tf.Session().run(tf.constant(np.arange(1,5), shape=np.array([2,2,1])))
array([[[1],
        [2]],

       [[3],
        [4]]])
>>> tf.Session().run(tf.constant(np.arange(1,5), shape=[2,2,1]))
array([[[1],
        [2]],

       [[3],
        [4]]])

@jjallaire
Member Author

The R shape function will yield a Python list: https://rstudio.github.io/tensorflow/using_tensorflow_api.html#tensor_shapes

Does that help?

@jjallaire
Member Author

From @bwlewis on February 14, 2017 15:20

It doesn't, curiously enough:

tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=shape(2,2,1)))
#, , 1
#
#     [,1] [,2]
#[1,]    1    2
#[2,]    3    4

@jjallaire
Member Author

Even explicitly passing an R array (which will become a numpy array on the other side) gives the same result:

tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=array(c(2,2,1))))

@jjallaire
Member Author

This code yields the expected output:

py <- import("__builtin__")
py$print(tf$Session()$run(tf$constant(np$arange(1, 5, dtype=np$int32), shape=c(2,2,1))))

This means that the problem is NOT on the input side, it's on the marshaling of a 3-d array back into R. That code is here:

https://github.com/rstudio/reticulate/blob/master/src/python.cpp#L516

I do this to "cast" the NumPy array into a Fortran compatible array:

array = (PyArrayObject*)PyArray_CastToType(array, descr, NPY_ARRAY_FARRAY);

Then I just access the array data in memory order. Clearly this isn't working. I had assumed this was working properly based on this test (https://github.com/rstudio/reticulate/blob/master/tests/testthat/test-python-numpy.R#L18-L22). However, looking closely, the test is invalid because the expected value it compares against also goes through the faulty marshaling layer.
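
A rough sketch of the idea behind that cast, using only NumPy. np.asfortranarray is used here as a stand-in for the C-level cast (an assumption for illustration; this is not the reticulate code). It shows that converting to Fortran order keeps the logical values and changes only the order in which elements sit in memory.

```python
import numpy as np

c_arr = np.arange(1, 9).reshape(2, 2, 2)   # C-contiguous, memory holds 1..8
f_arr = np.asfortranarray(c_arr)           # same logical values, column-major memory

# Logical contents are unchanged by the conversion
same_values = np.array_equal(c_arr, f_arr)

# order="K" reads the elements in the order they are stored in memory
c_memory = c_arr.ravel(order="K").tolist()
f_memory = f_arr.ravel(order="K").tolist()

print(same_values)   # True
print(c_memory)      # [1, 2, 3, 4, 5, 6, 7, 8]
print(f_memory)      # [1, 5, 3, 7, 2, 6, 4, 8]
```

Reading f_memory into a column-major array with the same dimensions reproduces the same logical array, which is what an R consumer would need.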

@jjallaire
Member Author

From @bwlewis on February 14, 2017 16:12

Aha. Note also a row/column memory ordering problem in the tensor:

import tensorflow as tf
import numpy as np
tf.Session().run(tf.constant(np.arange(1, 9, dtype=np.int32),shape=[2, 2, 2]))
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]], dtype=int32)

versus

library(tensorflow)
import("numpy") -> np
tf$constant(np$arange(1, 9, dtype=np$int32),shape=c(2,2,2))
tf$Session()$run(tf$constant(np$arange(1, 9, dtype=np$int32),shape=c(2,2,2)))
, , 1
     [,1] [,2]
[1,]    1    3
[2,]    5    7
, , 2
     [,1] [,2]
[1,]    2    4
[2,]    6    8

again, as @jjallaire shows, this is on the R marshaling side because we see the right order in Python:

py <- import("__builtin__")
py$print(tf$Session()$run(tf$constant(np$arange(1, 9, dtype=np$int32),shape=c(2,2,2))))
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]

@jjallaire
Member Author

I just exported an r_to_py function which enabled me to write this test code:

> library(reticulate)

> a <- array(c(1:12), dim = c(2,3,2))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

 
> r_to_py(a)
[[[ 1  7]
  [ 3  9]
  [ 5 11]]

 [[ 2  8]
  [ 4 10]
  [ 6 12]]]

@bwlewis I'm not thinking clearly in multiple Fortran vs. C dimensions right now! Do these results surprise you? (it seems to me like the arrays are different however it could be that I'm missing a subtlety of the distinct dimension ordering).

@jjallaire
Member Author

From @bwlewis on February 14, 2017 16:36

@jjallaire It's a memory row/column order problem. I need to carefully look at the numpy source. I suspect this:

  • since default numpy is row-ordered, that means it probably links to a "c" blas instead of blas.
  • results from that cblas are probably ordered by row, maybe not respecting the NPY_ARRAY_FARRAY_RO flag?
  • even given the NPY_ARRAY_FARRAY_RO flag, if numpy is lazy then most buffers are copied and re-ordered anyway. Even if numpy is clever that is not avoidable in many cases. The effort to make zero-copy representations between R and Python is wasted in this case. Needs further investigation...

Anyway, will carefully look at numpy source tonight.

@jjallaire
Member Author

From @G-Lynn on February 14, 2017 17:0

@jjallaire @bwlewis Thank you for looking into this!

@jjallaire
Member Author

Okay, @bwlewis I am eternally grateful for your help on this!

@jjallaire
Member Author

From @eddelbuettel on February 14, 2017 17:13

Department of foggy memory here, but IIRC in package RcppCNPy we always end up transposing (for R) what we get (via the cnpy library) as NumPy data.

@jjallaire
Member Author

Would one possibility be that tensorflow is taking the NumPy array and accessing its buffer directly with the assumption of row-major ordering? Under the hood tensorflow is using Eigen and I'm guessing they don't make copies of the NumPy arrays.

@jjallaire
Member Author

I think there are problems going into Python as well. Here's a simple example where on the R side we have 4 2x3 arrays but on the Python side we end up with 2 3x4 arrays:

> library(reticulate)

> a <- array(c(1:24), dim = c(2,3,4))

> a
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

, , 4

     [,1] [,2] [,3]
[1,]   19   21   23
[2,]   20   22   24


> py_a <- r_to_py(a)

> py_a
[[[ 1  7 13 19]
  [ 3  9 15 21]
  [ 5 11 17 23]]

 [[ 2  8 14 20]
  [ 4 10 16 22]
  [ 6 12 18 24]]]

@jjallaire
Member Author

From @bwlewis on February 14, 2017 18:22

In these examples, "transpose" needs to apply to a 3d array.
On Feb 14, 2017 12:13, "Dirk Eddelbuettel" notifications@github.com wrote:

Department of foggy memory here, but IIRC in package RcppCNPy we always
end up transposing (for R) what we get (via the cnpy library) as NumPy data.



@jjallaire
Member Author

From @bwlewis on February 14, 2017 18:23

Possible! That would perhaps then be a Python/TensorFlow bug.
On Feb 14, 2017 13:11, "JJ Allaire" notifications@github.com wrote:

Would one possibility be that tensorflow is taking the NumPy array and
accessing it's buffer directly with the assumption of row-major ordering?
Under the hood tensorflow is using Eigen and I'm guessing they don't make
copies of the NumPy arrays.



@jjallaire
Member Author

But then again it seems like my simple repro above that has nothing to do with tensorflow indicates we definitely have problems of our own.

@jjallaire
Member Author

Here's another example which I believe illustrates that the order of the third dimension differs between R and NumPy:

> library(reticulate)
> a <- array(c(1:24), dim = c(2,3,4))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

, , 4

     [,1] [,2] [,3]
[1,]   19   21   23
[2,]   20   22   24

> py_a <- r_to_py(aperm(a, c(3,1,2)))
> py_a
[[[ 1  3  5]
  [ 2  4  6]]

 [[ 7  9 11]
  [ 8 10 12]]

 [[13 15 17]
  [14 16 18]]

 [[19 21 23]
  [20 22 24]]]
> 

The printed arrays are the same because we used aperm to move the 3rd dimension to the first slot.
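
The same permutation can be sketched on the NumPy side. Two assumptions are made for illustration: reshape with order="F" is used to mimic R's column-major fill of array(1:24, dim = c(2,3,4)), and np.transpose with 0-based axes (2, 0, 1) plays the role of aperm(a, c(3,1,2)).

```python
import numpy as np

# Column-major fill, mirroring R's array(1:24, dim = c(2, 3, 4))
a = np.arange(1, 25).reshape((2, 3, 4), order="F")

# Move the third axis to the front, as aperm(a, c(3, 1, 2)) does in R
permuted = np.transpose(a, (2, 0, 1))

print(permuted.shape)   # (4, 2, 3)
print(permuted[0])      # rows [1, 3, 5] and [2, 4, 6]
print(permuted[3])      # rows [19, 21, 23] and [20, 22, 24]
```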

This is sufficiently unfamiliar territory to me that I could be mistaken. Please advise :-)

@jjallaire
Member Author

This stack overflow thread indicates that NumPy prints 3-d arrays in a way that is somewhat unintuitive to those of us familiar with R and Matlab:

http://stackoverflow.com/questions/22981845/3-dimensional-array-in-numpy

Could this be the source of all the problems/confusion? (i.e. the arrays are correct just printed incomprehensibly).
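
One way to see the two printing conventions side by side is the NumPy sketch below (an illustration only, not taken from either package). NumPy prints a 3-d array as blocks along its first axis, while R prints ", , 1", ", , 2" slices along the last axis.

```python
import numpy as np

b = np.arange(1, 9).reshape(2, 2, 2)

# NumPy's printed blocks are b[0], then b[1] (first axis)
first_axis_blocks = [b[0].tolist(), b[1].tolist()]

# R's printed ", , 1" and ", , 2" correspond to slices along the last axis
last_axis_slices = [b[:, :, 0].tolist(), b[:, :, 1].tolist()]

print(first_axis_blocks)   # [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(last_axis_slices)    # [[[1, 3], [5, 7]], [[2, 4], [6, 8]]]
```

The second line matches what R displays for an array holding the same logical values, so two printouts that look different can describe the same array.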

@jjallaire
Member Author

This example proves that when 3-d arrays are marshaled back and forth they preserve their equality:

library(reticulate)

a <- array(c(1:24), dim = c(2,3,4))

py_a <- r_to_py(a)

r_a <- py_to_r(py_a)

identical(a, r_a)

@jjallaire
Member Author

Another example which illustrates that NumPy receives the 3d array correctly:

> library(reticulate)

> a <- array(c(1:24), dim = c(2,3,4))

> py_a <- r_to_py(a)

> py_a$shape
[[1]]
[1] 2

[[2]]
[1] 3

[[3]]
[1] 4


> py_a$flatten(order = 'F')
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

> r_a <- py_to_r(py_a)

> identical(a, r_a)
[1] TRUE

@jjallaire
Member Author

Here's yet another example showing that we get different NumPy arrays depending on whether we create them in R or NumPy:

> library(reticulate)

> np <- import("numpy")

> # create a numpy 3d array from an R array
> a <- array(c(1:24), dim = c(2,3,4))

> r_to_py(a)
[[[ 1  7 13 19]
  [ 3  9 15 21]
  [ 5 11 17 23]]

 [[ 2  8 14 20]
  [ 4 10 16 22]
  [ 6 12 18 24]]]

> # create a numpy 3d array via Python
> py <- py_run_string("import numpy; a = numpy.arange(1,25).reshape(2,3,4)")

> py_get_attr(py, "a")    
[[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]
> 

I'm so confused! I feel like rather than 2 hard problems in computer science (naming things and cache invalidation) we now need to add a 3rd (in-memory array representations)!

@jjallaire
Member Author

The reason they are different is that the higher order dimensions go first in Python (e.g. it should be reshape(4, 2, 3) to match the dim = c(2,3,4) used in R). So there may not be a problem here.

@jjallaire
Member Author

The NumPy C API is also reporting dims with higher order dimensions first! This means that we are getting the dim attribute wrong in R (the reason that we round trip successfully is that we get it wrong going in and going out in the same way). I'll keep experimenting.

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

EDIT on above CBLAS comment: I forgot that cblas is just a lightweight wrapper over blas that inserts transposes as needed, see for instance http://www.netlib.org/blas/blast-forum/cblas.tgz -- so my comments on copying are wrong above, no copies are needed.

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

@jjallaire Consider your example above:

library(reticulate)
np <- import("numpy")
a <- array(c(1:24), dim = c(2,3,4))
p <- r_to_py(a)
p$flags
#  C_CONTIGUOUS : False
#  F_CONTIGUOUS : True
#  OWNDATA : False
#  WRITEABLE : False
#  ALIGNED : True
#  UPDATEIFCOPY : False

On the other hand:

py_run_string("import numpy; a = numpy.arange(1,25).reshape(2,3,4); print  a.flags")
#  C_CONTIGUOUS : True
#  F_CONTIGUOUS : False
#  OWNDATA : False
#  WRITEABLE : True
#  ALIGNED : True
#  UPDATEIFCOPY : False

Thus:

  1. we need to check numpy array attributes on return to R to see if they are column or row ordered and adjust accordingly on the return.
  2. This may cause some inefficiencies in binary operators with mixed order arguments...in that one case, one of those arrays needs to be copied internally by Python. I don't think this is a super-serious problem.

This leads to the question, why isn't this line of C++ code translating the array for us? https://github.com/rstudio/reticulate/blob/master/src/python.cpp#L549:

array = (PyArrayObject*)PyArray_CastToType(array, descr, NPY_ARRAY_FARRAY);

Getting back to the original problem, there may be cases in tensorflow where row order is assumed. But one thing at a time.
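
The check described in point 1 above could look roughly like the following on the NumPy side. The helper name to_column_major is hypothetical, and this is only a sketch of the idea, not the reticulate implementation.

```python
import numpy as np


def to_column_major(arr):
    """Return an array with the same values whose memory is Fortran-ordered.

    Hypothetical helper: copies only when the input is not already F-contiguous.
    """
    if arr.flags["F_CONTIGUOUS"]:
        return arr
    return np.asfortranarray(arr)


row_major = np.arange(1, 25).reshape(2, 3, 4)
col_major = to_column_major(row_major)

print(row_major.flags["C_CONTIGUOUS"], row_major.flags["F_CONTIGUOUS"])   # True False
print(col_major.flags["C_CONTIGUOUS"], col_major.flags["F_CONTIGUOUS"])   # False True
print(np.array_equal(row_major, col_major))                               # True
```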

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

Maybe a Python n-d array bug in PyArray_CastToType? It works fine in 2-d. Here using the old bwlewis/python package to help see what's going on:

x <- matrix(1:25.0, 5)
x %*% x
#     [,1] [,2] [,3] [,4] [,5]
#[1,]  215  490  765 1040 1315
#[2,]  230  530  830 1130 1430
#[3,]  245  570  895 1220 1545
#[4,]  260  610  960 1310 1660
#[5,]  275  650 1025 1400 1775

library(python)
import("numpy") -> np
p <- np$array(x)
p$dot(p)$flags
#  C_CONTIGUOUS : True          ## <<<---- NOTE
#  F_CONTIGUOUS : False
#  OWNDATA : True
#  WRITEABLE : True
#  ALIGNED : True
#  UPDATEIFCOPY : False

# But, it comes back to R in this example correctly!
R(p$dot(p))
#     [,1] [,2] [,3] [,4] [,5]
#[1,]  215  490  765 1040 1315
#[2,]  230  530  830 1130 1430
#[3,]  245  570  895 1220 1545
#[4,]  260  610  960 1310 1660
#[5,]  275  650 1025 1400 1775

I think the problems are limited to the n-d case, n > 2.

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

Also, just a nit on the use of flatten above, from the Python API doc:

order : {‘C’, ‘F’, ‘A’, ‘K’}, optional
‘C’ means to flatten in row-major (C-style) order.
‘F’ means to flatten in column-major (Fortran- style) order.
‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise.
‘K’ means to flatten a in the order the elements occur in memory.
The default is ‘C’.

Really, shouldn't the default be K or, arguably, A? Seems nuts to me otherwise.
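
A short sketch of how the four order values behave on a small 2x3 array stored both ways (NumPy only; the variable names are illustrative). With the default 'C', the output ignores how the array is laid out in memory.

```python
import numpy as np

c_arr = np.arange(1, 7).reshape(2, 3)   # row-major memory: 1 2 3 4 5 6
f_arr = np.asfortranarray(c_arr)        # same values, column-major memory: 1 4 2 5 3 6

# Print the flattened result for each order, for both memory layouts
for order in ("C", "F", "A", "K"):
    print(order, c_arr.flatten(order=order).tolist(), f_arr.flatten(order=order).tolist())
```

For f_arr, 'C' yields 1..6 while 'A' and 'K' yield 1, 4, 2, 5, 3, 6, so only 'A' and 'K' reflect the underlying storage.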

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

And just to verify that your example above is not merely a print/display problem, these things are in fact different within Python. So there is a problem in the assignment of the Python objects from R?

library(reticulate)
np <- import("numpy")
a <- array(c(1:24), dim = c(2,3,4))
pf <- r_to_py(a)
pc <- py_run_string("import numpy; a = numpy.arange(1,25).reshape(2,3,4)")
pc <- py_get_attr(pc, "a")

o <- import("operator")
o$sub(pf, pc)
#, , 1
#
#     [,1] [,2] [,3]
#[1,]    0   -2   -4
#[2,]  -11  -13  -15

# ...  (should all be zero)

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

Maybe it's not us. Forget about R for a minute:

library(reticulate)
np <- import("numpy")
pf <- py_run_string("import numpy; a = numpy.reshape(numpy.arange(1,25), (2,3,4), 'F')")
pf <- py_get_attr(pf, "a")
pc <- py_run_string("import numpy; b = numpy.arange(1,25).reshape(2,3,4)")
pc <- py_get_attr(pc, "b")

o <- import("operator")
o$sub(pf, pc)
#, , 1
#
#     [,1] [,2] [,3]
#[1,]    0   -2   -4
#[2,]  -11  -13  -15   # ...  (not all zero)

EDIT: I'm wrong, and have been mis-interpreting the order flag above. It's not simply saying how to fill data structures by rows first ("C_CONTIGUOUS") or columns first ("F_CONTIGUOUS"), but also a signal to all the operators on the element order for arithmetic. In the above example, the data layout are identical in memory -- see this with pc$flatten('K') and pf$flatten('K'). Despite this, the subtract operator sees the flags and adjusts element order to match.
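
The same observation can be reproduced in plain NumPy, without going through R. This is a minimal sketch that reuses the pf and pc names from the example above: the two arrays share an identical memory layout, yet they are different logical arrays, so their difference is nonzero.

```python
import numpy as np

pf = np.reshape(np.arange(1, 25), (2, 3, 4), order="F")   # column-major interpretation
pc = np.reshape(np.arange(1, 25), (2, 3, 4), order="C")   # row-major interpretation

# Both views read 1..24 when walked in memory order
same_memory = np.array_equal(pf.flatten("K"), pc.flatten("K"))

# Element-wise comparison uses logical indices, not memory order
same_values = np.array_equal(pf, pc)
diff_first_slice = (pf - pc)[:, :, 0].tolist()

print(same_memory)        # True
print(same_values)        # False
print(diff_first_slice)   # [[0, -2, -4], [-11, -13, -15]]
```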

@bwlewis
Contributor

bwlewis commented Feb 15, 2017

OK, getting back to the original problem, here it is in a nutshell:

library(reticulate)
np <- import("numpy")
# 2-d
a <- py_run_string("import numpy; a = numpy.arange(1, 5).reshape(2, 2)")
a <- py_get_attr(a, "a")

# 3-d not consistent
b <- py_run_string("import numpy; b = numpy.arange(1, 9).reshape(2, 2, 2)")
b <- py_get_attr(b, "b")

a
#[[1 2]
# [3 4]]
py_to_r(a)
#     [,1] [,2]
#[1,]    1    2
#[2,]    3    4

b
#[[[1 2]
#  [3 4]]
#
# [[5 6]
#  [7 8]]]
py_to_r(b)
#, , 1
#     [,1] [,2]
#[1,]    1    3
#[2,]    5    7
#
#, , 2
#     [,1] [,2]
#[1,]    2    4
#[2,]    6    8

There is an inconsistency in how C_CONTIGUOUS arrays are brought into R from Python. Things work as expected in the 2-d case, but not so in the 3-d case. This difference is unexpected.

@jjallaire
Member Author

jjallaire commented Feb 15, 2017 via email

@bwlewis
Contributor

bwlewis commented Feb 15, 2017 via email

@bwlewis
Contributor

bwlewis commented Feb 18, 2017

The issue directly gets to a common source of confusion with n-d
arrays in R and Python and how they are printed and stored.
The R array you construct in the example is not the same as the
one constructed in the reference Python code, but it's really easy
to confuse them!

A lightly-edited reproduction of the reference Python code in the issue
appears below.

library(tensorflow)
np   <- import("numpy", convert=FALSE)
a    <- np$arange(1, 9)$reshape(c(2L, 2L, 2L))
b    <- np$arange(1, 5)$reshape(c(2L, 2L, 1L))
c    <- tf$matmul(tf$constant(a), tf$constant(b))
tf$Session()$run(c)

## , , 1
##      [,1] [,2]
## [1,]    5   11
## [2,]   39   53

The issue goes on to reproduce the example using R-generated arrays
as follows:

A <- list(matrix(1:4, nrow=2, byrow=T), matrix(5:8, nrow=2, byrow=T))
A <- array(unlist(A), dim=c(2,2,2))

However, already at this point the R-generated array A is
not the same as the above array a, as comparing a with
np$array(A) below shows.

It is easy to mistake them for the same array simply because of the way
they are printed! The printed R array looks superficially the same as the
printed Python array.

print(a)

## [[[ 1.  2.]
##   [ 3.  4.]]
## 
##  [[ 5.  6.]
##   [ 7.  8.]]]


print(np$array(A))

## [[[1 5]
##   [2 6]]
## 
##  [[3 7]
##   [4 8]]]


print(A)

## , , 1
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## 
## , , 2
##      [,1] [,2]
## [1,]    5    6
## [2,]    7    8

Instead, we need to construct the R array A differently to match the
row-major order of Python, discussed in the previous sections. We can
use many approaches including:

(A <- np$array(aperm(array(1:8, c(2,2,2)), c(3,2,1))))

## [[[1 2]
##   [3 4]]
## 
##  [[5 6]
##   [7 8]]]

With similar care ordering the values in the b array
we can finish replicating the example in R (with the same
result as the reference Python example above).

A <- np$array(aperm(array(1:8, c(2,2,2)), c(3,2,1)))
B <- np$array(aperm(array(1:4, c(2,2,1)), c(2,1,3)))
C <- tf$matmul(tf$constant(A), tf$constant(B))
tf$Session()$run(C)

## , , 1
##      [,1] [,2]
## [1,]    5   11
## [2,]   39   53
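
For readers following along in Python, the axis permutations above can be mirrored in NumPy. This is a sketch under two assumptions: reshape with order="F" mimics R's column-major fill of array(1:8, c(2,2,2)), and np.matmul stands in for tf$matmul so that no session is needed.

```python
import numpy as np

# Column-major fills, mirroring array(1:8, c(2,2,2)) and array(1:4, c(2,2,1)) in R
r_like_A = np.arange(1, 9).reshape((2, 2, 2), order="F")
r_like_B = np.arange(1, 5).reshape((2, 2, 1), order="F")

# aperm(..., c(3,2,1)) and aperm(..., c(2,1,3)) written as 0-based axis permutations
A = np.transpose(r_like_A, (2, 1, 0))
B = np.transpose(r_like_B, (1, 0, 2))

C = np.matmul(A, B)
print(A[0].tolist(), A[1].tolist())   # [[1, 2], [3, 4]] [[5, 6], [7, 8]]
print(C[:, :, 0].tolist())            # [[5, 11], [39, 53]]
```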

An extended version of this discussion will appear in a package vignette.
