Difference between R and Python output #1440

SantiagoD999 · 2024-05-04T18:57:56Z

Good morning,

I am using the excellent keras3 package in R. Comparing its results with those I get in Python, I noticed some differences, even though I set the same random seed and use the same neural network architecture. I am using keras 3.3.3 and tensorflow 2.16.1.

library(reticulate)
library(keras3)

py_code<-"
from sklearn.model_selection import train_test_split
import numpy as np
from keras import *

np.random.seed(42)

n = 500
reg = 3
relevant_reg=3

betas = np.random.normal(loc=3, scale=2, size=(relevant_reg, 1))
X = np.random.normal(size=(n, reg))
y = X[:, :relevant_reg] @ betas + np.random.normal(size=(n,1),loc=0,scale=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

utils.set_random_seed(1)
model_NN1 = Sequential([
  Input(shape=(X.shape[1],)),
  layers.Dense(100, activation='relu'),
  layers.Dense(1) 
])

model_NN1.compile(loss='mean_squared_error', optimizer='adam')

model_NN1.fit(x_train, y_train, epochs=100, verbose=0)

model_NN1_eval = model_NN1.evaluate(x_test, y_test)"

results_py<-py_run_string(py_code)
results_py$model_NN1_eval

x_train<-py$x_train
x_test<-py$x_test
y_test<-py$y_test
y_train<-py$y_train

x_train <- array_reshape(x_train, c(nrow(x_train), ncol(x_train)))
x_test <- array_reshape(x_test, c(nrow(x_test), ncol(x_test)))

keras3::set_random_seed(1)

model <- keras_model_sequential(input_shape = NCOL(x_train))
model |>
  layer_dense(units =100, activation = 'relu') |>
  layer_dense(units = 1)

model |> compile(
  loss = 'mean_squared_error',
  optimizer = "adam",
)

model |> fit(
  x_train, y_train,
  epochs = 100,verbose=0)

results_r<-model %>% evaluate(x_test, y_test)

results_py$model_NN1_eval
results_r$loss

Does anyone know why results_py$model_NN1_eval and results_r$loss are not the same?

Thank you very much.

t-kalinowski · 2024-05-05T15:21:13Z

Thanks for reporting. This took a few head scratches to track down, but it ended up being really simple.

The layers are instantiated in a different order in R because |> does not eagerly evaluate its left-hand side. Instead, the left-hand side (i.e., the expression that creates the earlier layer) is not evaluated until the last call in the chain tries to access it, which happens after that last call has already created its own layer.

We could change this behavior in keras3. (It's an issue only with |>, not with %>%.)
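
To make the evaluation order concrete, here is a toy model of it in plain Python. It is only a sketch: make_layer, make_model, and the convention of passing the upstream value as a zero-argument callable are hypothetical stand-ins for R's lazy arguments, not keras3 or reticulate API.

```python
# Toy model of the evaluation order described above. Nothing here is keras3
# or reticulate API: make_layer, make_model, and the thunk convention are
# illustrative only.

creation_log = []


def make_model():
    """Stand-in for creating the empty sequential model."""
    creation_log.append("model")
    return []


def make_layer(name, upstream):
    """Mimic a layer call inside a lazy pipe chain: this layer is created
    first, and only then is the (lazy) upstream argument evaluated."""
    creation_log.append(name)   # the layer object is instantiated here
    stack = upstream()          # the left-hand side is evaluated only now
    return stack + [name]


# Lazy chain: model |> dense_100 |> dense_1 becomes the nested call
# dense_1(dense_100(model)), with every argument left unevaluated.
lazy_stack = make_layer("dense_1", lambda: make_layer("dense_100", make_model))
lazy_order = list(creation_log)

# Eager variant: each left-hand side is fully evaluated before the next call,
# as with separate statements.
creation_log.clear()
model = make_model()
after_first = make_layer("dense_100", lambda: model)
eager_stack = make_layer("dense_1", lambda: after_first)
eager_order = list(creation_log)

print(lazy_order)                 # ['dense_1', 'dense_100', 'model']
print(eager_order)                # ['model', 'dense_100', 'dense_1']
print(lazy_stack == eager_stack)  # True
```

Both variants end up with the same layer stack; only the order in which the layer objects come into existence differs.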

By the way, do you know keras has keras3::split_dataset()?

Here is your MRE, reorganized while I was tracking down the difference.
The final result is identical now.

library(reticulate)
library(keras3)

# ---- make data ----
py_run_string(r"---(

from sklearn.model_selection import train_test_split
import numpy as np
from keras import *

np.random.seed(42)

n = 500
reg = 3
relevant_reg = 3

betas = np.random.normal(loc=3, scale=2, size=(relevant_reg, 1))
X = np.random.normal(size=(n, reg))
y = X[:, :relevant_reg] @ betas + np.random.normal(size=(n, 1), loc=0, scale=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

)---")

x_train <- py_eval("x_train", convert = FALSE)
x_test  <- py_eval("x_test", convert = FALSE)
y_test  <- py_eval("y_test", convert = FALSE)
y_train <- py_eval("y_train", convert = FALSE)


# ---- model makers ----
train_py_model <- function() {

  py_run_string(r"---(
utils.clear_session()
utils.set_random_seed(1)

model = Sequential([Input(shape=(3,))])
model.add(layers.Dense(100, activation="relu"))
model.add(layers.Dense(1))

model.compile(loss="mean_squared_error", optimizer="adam")

model.fit(x_train, y_train, epochs=100, verbose=0)

result = model.evaluate(x_test, y_test, return_dict = True)['loss']
)---")$result

}


train_r_model <- function() {
  evalq(envir = globalenv(), {
    clear_session()
    set_random_seed(1)

    model <- keras_model_sequential(3) # shape(x_train)[[2]])
    model |> layer_dense(100, activation = 'relu')
    model |> layer_dense(1)
    # model |>
    #   layer_dense(100, activation = 'relu') |>
    #   layer_dense(1)

    model |> compile(loss = "mean_squared_error", optimizer = "adam")

    model |> fit(x_train, y_train, epochs = 100, verbose = 0)

    result <- model |> evaluate(x_test, y_test) |> _$loss
    model$evaluate
    result
  })
}

print(train_py_model())
print(train_r_model())

waldo::compare(serialize_keras_object(py$model),
               serialize_keras_object(model))

t-kalinowski · 2024-05-05T15:47:56Z

The behavior of layer_ functions within |> chains is changed in main now.

These two snippets produce identical models now, creating the underlying layers in the same order:

model <- keras_model_sequential(3) 
model |> layer_dense(100, activation = 'relu')
model |> layer_dense(1)

model <-  keras_model_sequential(3) |> 
  layer_dense(100, activation = 'relu') |>
  layer_dense(1)
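
For anyone wondering why creation order should affect the loss at all, here is a toy sketch in Python. It assumes each layer draws from a shared seeded generator at the moment it is created. The build helper and the use of the standard random module are illustrative stand-ins, not a description of how Keras implements initialization.

```python
import random


def build(order, seed=1):
    """Create 'layers' in the given order. Each one takes a single draw from
    the shared seeded generator at creation time (a stand-in for whatever
    random state a real layer picks up when it is instantiated)."""
    rng = random.Random(seed)
    return {name: rng.random() for name in order}


# Same seed, same two layers, different creation order.
python_like = build(["dense_100", "dense_1"])
reversed_order = build(["dense_1", "dense_100"])

# The first draw goes to whichever layer happens to be created first.
print(python_like["dense_100"] == reversed_order["dense_1"])  # True
# So the per-layer values differ even though the seed is the same.
print(python_like == reversed_order)                          # False
```

Under that assumption, a fixed seed only guarantees identical results when the layers are also created in the same order.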

Thanks for reporting!

SantiagoD999 · 2024-05-05T22:32:01Z

Thank you very much for your reply. I tried running

model <- keras_model_sequential(3) 
model |> layer_dense(100, activation = 'relu')
model |> layer_dense(1)

and

model <- keras_model_sequential(3) |>
  layer_dense(100, activation = 'relu') |>
  layer_dense(1)

In the context of

library(reticulate)
library(keras3)

# ---- make data ----
py_run_string(r"---(

from sklearn.model_selection import train_test_split
import numpy as np
from keras import *

np.random.seed(42)

n = 500
reg = 3
relevant_reg = 3

betas = np.random.normal(loc=3, scale=2, size=(relevant_reg, 1))
X = np.random.normal(size=(n, reg))
y = X[:, :relevant_reg] @ betas + np.random.normal(size=(n, 1), loc=0, scale=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

)---")

x_train <- py_eval("x_train", convert = FALSE)
x_test  <- py_eval("x_test", convert = FALSE)
y_test  <- py_eval("y_test", convert = FALSE)
y_train <- py_eval("y_train", convert = FALSE)


# ---- model makers ----
train_py_model <- function() {
  
  py_run_string(r"---(
utils.clear_session()
utils.set_random_seed(1)

model = Sequential([Input(shape=(3,))])
model.add(layers.Dense(100, activation="relu"))
model.add(layers.Dense(1))

model.compile(loss="mean_squared_error", optimizer="adam")

model.fit(x_train, y_train, epochs=100, verbose=0)

result = model.evaluate(x_test, y_test, return_dict = True)['loss']
)---")$result
  
}


train_r_1_model <- function() {
    clear_session()
    set_random_seed(1)
    
    model <- keras_model_sequential(3) |> 
      layer_dense(100, activation = 'relu') %>%
      layer_dense(1)
    
    model %>% compile(loss = "mean_squared_error", optimizer = "adam")
    
    model %>% fit(x_train, y_train, epochs = 100, verbose = 0)
    
    result <- model %>% evaluate(x_test, y_test)
    model$evaluate
    result
}
train_r_2_model <- function() {
  clear_session()
  set_random_seed(1)
  
  model <- keras_model_sequential(3)
  model %>% layer_dense(100, activation = 'relu')
  model %>% layer_dense(1)
  
  model %>% compile(loss = "mean_squared_error", optimizer = "adam")
  
  model %>% fit(x_train, y_train, epochs = 100, verbose = 0)
  
  result <- model %>% evaluate(x_test, y_test)
  model$evaluate
  result
}

print(train_py_model())
print(train_r_1_model())
print(train_r_2_model())

However, train_r_1_model() and train_r_2_model() do not give the same result, even though train_py_model() and train_r_2_model() do. I also noticed that switching between %>% and |> does not change the results.

t-kalinowski · 2024-05-06T12:52:11Z

Did you update to use the development version of keras3?

remotes::install_github("rstudio/keras")

When I run the code in your last reply, I get identical loss values each time (1.095762).

If you haven't updated yet, then it's probably the stray |> remaining in train_r_1_model() that's leading to the difference.

SantiagoD999 · 2024-05-06T15:47:16Z

Thank you very much for your reply. When trying to install the development version I get the following error:

Using GitHub PAT from the git credential store.
Error: Failed to install 'keras' from GitHub:
  HTTP error 401.
  Bad credentials

I replaced the |> with %>%, but train_r_1_model() still produces a different result. Do you know why this may be happening?

train_r_1_model <- function() {
  clear_session()
  set_random_seed(1)
  
  model <- keras_model_sequential(3) %>%
    layer_dense(100, activation = 'relu') %>%
    layer_dense(1)
  
  model %>% compile(loss = "mean_squared_error", optimizer = "adam")
  
  model %>% fit(x_train, y_train, epochs = 100, verbose = 0)
  
  result <- model %>% evaluate(x_test, y_test)
  model$evaluate
  result
}


print(train_r_1_model())

1.081783

t-kalinowski · 2024-05-06T16:13:38Z

If you're unable to install the dev version, then in the interim, you'll need to avoid long pipe chains and do something like this:

model <- keras_model_sequential(3)
model |> layer_dense(100, activation = 'relu') 
model |> layer_dense(1)

However, I would recommend fixing your setup so you can install the development version via remotes::install_github(). It should work by default. My guess is that you have an expired token in your git credential store.

?usethis::create_github_token is a great place to start troubleshooting this.

SantiagoD999 · 2024-05-06T16:17:13Z

Thank you very much for your reply.

t-kalinowski closed this as completed in 210478a May 5, 2024
