Difference between R and Python output #1440

Closed
SantiagoD999 opened this issue May 4, 2024 · 7 comments


@SantiagoD999

SantiagoD999 commented May 4, 2024

Good morning,

I am using the excellent keras3 package in R. Comparing its results with those I get in Python, I noticed a difference, even though I set the same random seed and use the same neural network architecture. I am using keras 3.3.3 and tensorflow 2.16.1.

library(reticulate)
library(keras3)

py_code<-"
from sklearn.model_selection import train_test_split
import numpy as np
from keras import *

np.random.seed(42)

n = 500
reg = 3
relevant_reg=3

betas = np.random.normal(loc=3, scale=2, size=(relevant_reg, 1))
X = np.random.normal(size=(n, reg))
y = X[:, :relevant_reg] @ betas + np.random.normal(size=(n,1),loc=0,scale=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

utils.set_random_seed(1)
model_NN1 = Sequential([
  Input(shape=(X.shape[1],)),
  layers.Dense(100, activation='relu'),
  layers.Dense(1) 
])

model_NN1.compile(loss='mean_squared_error', optimizer='adam')

model_NN1.fit(x_train, y_train, epochs=100, verbose=0)

model_NN1_eval = model_NN1.evaluate(x_test, y_test)"

results_py<-py_run_string(py_code)
results_py$model_NN1_eval

x_train<-py$x_train
x_test<-py$x_test
y_test<-py$y_test
y_train<-py$y_train

x_train <- array_reshape(x_train, c(nrow(x_train), ncol(x_train)))
x_test <- array_reshape(x_test, c(nrow(x_test), ncol(x_test)))

keras3::set_random_seed(1)

model <- keras_model_sequential(input_shape = NCOL(x_train))
model |>
  layer_dense(units =100, activation = 'relu') |>
  layer_dense(units = 1)

model |> compile(
  loss = 'mean_squared_error',
  optimizer = "adam",
)

model |> fit(
  x_train, y_train,
  epochs = 100,verbose=0)

results_r<-model %>% evaluate(x_test, y_test)

results_py$model_NN1_eval
results_r$loss

Does anyone know why results_py$model_NN1_eval and results_r$loss are not the same?

Thank you very much.

@t-kalinowski
Member

Thanks for reporting. This took a few head scratches to track down, but it ended up being really simple.

The order in which the layers are instantiated is different in R because |> does not eagerly evaluate its left-hand side. Instead, that argument (i.e., the expression that creates the previous layer) is not evaluated until the last call in the chain tries to access it, which happens after that last call has already created its own layer.

We could change this behavior in keras3. (It's not an issue with %>%, only |>)
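
To illustrate why creation order matters under a fixed seed, here is a toy Python sketch. It is not keras3's actual implementation: zero-argument lambdas stand in for R's lazily evaluated arguments, a shared `random.Random` stands in for the seeded weight initializer, and `Layer`, `add_layer`, `build_stepwise`, and `build_chained` are hypothetical names made up for this example.

```python
import random


class Layer:
    """Toy stand-in for a layer: draws its initial weight when constructed."""

    def __init__(self, name, rng):
        self.name = name
        self.weight = rng.random()  # consumes the next value from the shared RNG


def add_layer(model_thunk, name, rng):
    """Append a layer, mimicking a lazily evaluated `model` argument."""
    layer = Layer(name, rng)   # the new layer is created first ...
    model = model_thunk()      # ... and only then is the upstream expression evaluated
    return model + [layer]


def build_stepwise(seed):
    """One call per statement: each model exists before the next layer is made."""
    rng = random.Random(seed)
    model = []
    model = add_layer(lambda m=model: m, "dense_1", rng)
    model = add_layer(lambda m=model: m, "dense_2", rng)
    return model


def build_chained(seed):
    """Nested calls, the shape a pipe chain expands to: add(add(model, d1), d2)."""
    rng = random.Random(seed)
    return add_layer(
        lambda: add_layer(lambda: [], "dense_1", rng), "dense_2", rng
    )


stepwise = build_stepwise(1)
chained = build_chained(1)
for a, b in zip(stepwise, chained):
    print(a.name, a.weight, b.name, b.weight)
```

Both builders return the layers in the same order, but in the chained version the outermost call draws from the RNG first, so the two layers end up with each other's initial values. The same seed therefore gives a differently initialized model.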

By the way, do you know keras has keras3::split_dataset()?

Here is your MRE, reorganized while I was tracking down the difference.
The final result is now identical.

library(reticulate)
library(keras3)

# ---- make data ----
py_run_string(r"---(

from sklearn.model_selection import train_test_split
import numpy as np
from keras import *

np.random.seed(42)

n = 500
reg = 3
relevant_reg = 3

betas = np.random.normal(loc=3, scale=2, size=(relevant_reg, 1))
X = np.random.normal(size=(n, reg))
y = X[:, :relevant_reg] @ betas + np.random.normal(size=(n, 1), loc=0, scale=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

)---")

x_train <- py_eval("x_train", convert = FALSE)
x_test  <- py_eval("x_test", convert = FALSE)
y_test  <- py_eval("y_test", convert = FALSE)
y_train <- py_eval("y_train", convert = FALSE)


# ---- model makers ----
train_py_model <- function() {

  py_run_string(r"---(
utils.clear_session()
utils.set_random_seed(1)

model = Sequential([Input(shape=(3,))])
model.add(layers.Dense(100, activation="relu"))
model.add(layers.Dense(1))

model.compile(loss="mean_squared_error", optimizer="adam")

model.fit(x_train, y_train, epochs=100, verbose=0)

result = model.evaluate(x_test, y_test, return_dict = True)['loss']
)---")$result

}


train_r_model <- function() {
  evalq(envir = globalenv(), {
    clear_session()
    set_random_seed(1)

    model <- keras_model_sequential(3) # shape(x_train)[[2]])
    model |> layer_dense(100, activation = 'relu')
    model |> layer_dense(1)
    # model |>
    #   layer_dense(100, activation = 'relu') |>
    #   layer_dense(1)

    model |> compile(loss = "mean_squared_error", optimizer = "adam")

    model |> fit(x_train, y_train, epochs = 100, verbose = 0)

    result <- model |> evaluate(x_test, y_test) |> _$loss
    model$evaluate
    result
  })
}

print(train_py_model())
print(train_r_model())

waldo::compare(serialize_keras_object(py$model),
               serialize_keras_object(model))

@t-kalinowski
Member

t-kalinowski commented May 5, 2024

The behavior of layer_ functions within |> chains is changed in main now.

These two snippets produce identical models now, creating the underlying layers in the same order:

model <- keras_model_sequential(3)
model |> layer_dense(100, activation = 'relu')
model |> layer_dense(1)

model <- keras_model_sequential(3) |>
  layer_dense(100, activation = 'relu') |>
  layer_dense(1)

Thanks for reporting!

@SantiagoD999
Author

SantiagoD999 commented May 5, 2024

Thank you very much for your reply. I tried running

model <- keras_model_sequential(3) 
model |> layer_dense(100, activation = 'relu')
model |> layer_dense(1)

and

model <- keras_model_sequential(3) 
model |> layer_dense(100, activation = 'relu')
model |> layer_dense(1)

In the context of

library(reticulate)
library(keras3)

# ---- make data ----
py_run_string(r"---(

from sklearn.model_selection import train_test_split
import numpy as np
from keras import *

np.random.seed(42)

n = 500
reg = 3
relevant_reg = 3

betas = np.random.normal(loc=3, scale=2, size=(relevant_reg, 1))
X = np.random.normal(size=(n, reg))
y = X[:, :relevant_reg] @ betas + np.random.normal(size=(n, 1), loc=0, scale=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

)---")

x_train <- py_eval("x_train", convert = FALSE)
x_test  <- py_eval("x_test", convert = FALSE)
y_test  <- py_eval("y_test", convert = FALSE)
y_train <- py_eval("y_train", convert = FALSE)


# ---- model makers ----
train_py_model <- function() {
  
  py_run_string(r"---(
utils.clear_session()
utils.set_random_seed(1)

model = Sequential([Input(shape=(3,))])
model.add(layers.Dense(100, activation="relu"))
model.add(layers.Dense(1))

model.compile(loss="mean_squared_error", optimizer="adam")

model.fit(x_train, y_train, epochs=100, verbose=0)

result = model.evaluate(x_test, y_test, return_dict = True)['loss']
)---")$result
  
}


train_r_1_model <- function() {
    clear_session()
    set_random_seed(1)
    
    model <- keras_model_sequential(3) |> 
      layer_dense(100, activation = 'relu') %>%
      layer_dense(1)
    
    model %>% compile(loss = "mean_squared_error", optimizer = "adam")
    
    model %>% fit(x_train, y_train, epochs = 100, verbose = 0)
    
    result <- model %>% evaluate(x_test, y_test)
    model$evaluate
    result
}
train_r_2_model <- function() {
  clear_session()
  set_random_seed(1)
  
  model <- keras_model_sequential(3)
  model %>% layer_dense(100, activation = 'relu') 
  model %>%layer_dense(1)
  
  model %>% compile(loss = "mean_squared_error", optimizer = "adam")
  
  model %>% fit(x_train, y_train, epochs = 100, verbose = 0)
  
  result <- model %>% evaluate(x_test, y_test)
  model$evaluate
  result
}

print(train_py_model())
print(train_r_1_model())
print(train_r_2_model())

But train_r_1_model() and train_r_2_model() do not give the same result, even though train_py_model() and train_r_2_model() do. I also noticed that switching between %>% and |> does not change the results.

@t-kalinowski
Member

t-kalinowski commented May 6, 2024

Did you update to use the development version of keras3?

remotes::install_github("rstudio/keras")

When I run the code in your last reply, I get identical loss values each time (1.095762).

If you haven't updated yet, then it's probably the stray |> remaining in train_r_1_model() that's leading to the difference.

@SantiagoD999
Author

SantiagoD999 commented May 6, 2024

Thank you very much for your reply. When trying to install the development version I get the following error:

Using GitHub PAT from the git credential store.
Error: Failed to install 'keras' from GitHub:
  HTTP error 401.
  Bad credentials

I replaced |> with %>%, but train_r_1_model() still produces a different result. Do you know why this may be happening?

train_r_1_model <- function() {
  clear_session()
  set_random_seed(1)
  
  model <- keras_model_sequential(3) %>%
    layer_dense(100, activation = 'relu') %>%
    layer_dense(1)
  
  model %>% compile(loss = "mean_squared_error", optimizer = "adam")
  
  model %>% fit(x_train, y_train, epochs = 100, verbose = 0)
  
  result <- model %>% evaluate(x_test, y_test)
  model$evaluate
  result
}


print(train_r_1_model())

1.081783

@t-kalinowski
Member

If you're unable to install the dev version, then in the interim, you'll need to avoid long pipe chains and do something like this:

model <- keras_model_sequential(3)
model |> layer_dense(100, activation = 'relu') 
model |> layer_dense(1)

However, I would recommend fixing your setup so you can install the development version via remotes::install_github(). It should work by default. My guess is that you have an expired token in your git credential store.

?usethis::create_github_token is a great place to start troubleshooting this.

@SantiagoD999
Author

Thank you very much for your reply.
