using a custom layer, loaded_model cannot give the same predicted value compared with original model [results not reproducible] #64
Comments
@h44632186 Sorry for the late response!
@sushreebarsa Thank you for your response. The gist you posted is exactly the same as the ticket I mentioned.
@sushreebarsa Btw, I have already identified that the custom layer causes the problem above. If I replace the custom layer with a standard Dense layer (see the code), the original model and the loaded_model give the same results. So I think the problem is caused by the custom layer.
Hi @h44632186,
Hi @SuryanarayanaY, I have checked the gist you posted, and confirmed it is correct.
@SuryanarayanaY Hi, do you have any suggestions to solve this issue?
I think the issue here is that in your custom layer, you are creating a new dense layer every time you call the model. So each new call to the model will create a new dense layer with new random weights. This will mean you often totally lose the state of your model (and might get extra confusing with function tracing, where your traced model will not create new weights). The recommended approach would be to not create variables or layers inside call like this. If you change your custom layer as below, everything works.

    @tf.keras.utils.register_keras_serializable()
    class Custom_Layer(tf.keras.layers.Layer):
        def __init__(self, units, **kwargs):
            super(Custom_Layer, self).__init__(**kwargs)
            self.units = units
            self.dense = tf.keras.layers.Dense(self.units)
        def call(self, x):
            x = self.dense(x)
            return x
        def get_config(self):
            config = super(Custom_Layer, self).get_config()
            config.update(units = self.units)
            return config

https://keras.io/guides/making_new_layers_and_models_via_subclassing/ has an in-depth guide.
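The point about losing state when a layer is created inside call can be illustrated without TensorFlow. The sketch below is only an analogy, not Keras code: `LayerCreatedInCall` and `LayerCreatedInInit` are hypothetical toy classes, and plain Python lists stand in for weights and tensors.

```python
import random


class LayerCreatedInCall:
    """Toy stand-in (hypothetical) for a layer that builds its sublayer inside call()."""

    def __init__(self, units):
        self.units = units

    def call(self, x):
        # Fresh random weights on every call: nothing trained or saved survives.
        weights = [random.uniform(-1.0, 1.0) for _ in range(self.units)]
        return [w * x for w in weights]


class LayerCreatedInInit:
    """Toy stand-in (hypothetical) for a layer that builds its sublayer once in __init__()."""

    def __init__(self, units):
        self.units = units
        # Weights are created once and reused by every call.
        self.weights = [random.uniform(-1.0, 1.0) for _ in range(units)]

    def call(self, x):
        return [w * x for w in self.weights]


bad = LayerCreatedInCall(units=3)
good = LayerCreatedInInit(units=3)

# Call each layer twice on the same input and check whether the outputs agree.
bad_repeatable = bad.call(2.0) == bad.call(2.0)
good_repeatable = good.call(2.0) == good.call(2.0)
print(bad_repeatable, good_repeatable)
```

The second class always returns the same output for the same input because its weights live on the instance. The first draws new weights per call, so two calls agree only by coincidence. In the real Keras layer the same reasoning applies to `self.dense` created in `__init__`.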
Hi @h44632186, As mentioned in the comment above, when subclassing models or layers you need to define the sublayers inside the constructor to make them serializable and stateful. I have modified your code as suggested above, and now both models reproduce the same outputs. Please refer to the attached gist. Thanks
@SuryanarayanaY, The constructor here has a layer, but the model is still serializable without explicit serialization and deserialization in the get_config() and from_config() methods. As per the documentation, the layer should be serialized and deserialized in get_config() and from_config(), right? What am I missing here?
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
System Info
Current Behaviour:
I implemented a custom layer and used it to build a model. After training, the original model gave a predicted value A, and I saved it as an h5 file. When I loaded the model from the h5 file, the loaded model gave a different predicted value B, which means the results are not reproducible. The two models are supposed to give the same predicted value.
The custom layer is as simple as a Dense layer, and I have already located the custom layer as the cause of the problem. If I comment out the custom layer line and uncomment the line below it (a standard tf Dense layer), the original model and the loaded_model give the same results.
Standalone code to reproduce the issue
https://colab.research.google.com/drive/19_8DqzfC2JadKM9ZykJRLDEcxEkzjrT7?usp=sharing
You can also refer to the link where the issue was first posted: tensorflow/tensorflow#59041
Relevant log output
2022-12-29 10:33:52.034627: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.136713: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-12-29 10:33:53.138072: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-12-29 10:33:53.198743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-12-29 10:33:53.198834: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.203166: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-12-29 10:33:53.203273: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-12-29 10:33:53.204328: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-12-29 10:33:53.204657: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-12-29 10:33:53.209031: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-12-29 10:33:53.209857: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-12-29 10:33:53.210030: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-12-29 10:33:53.212456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-12-29 10:33:53.335970: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-29 10:33:53.348062: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-12-29 10:33:53.349549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-12-29 10:33:53.349604: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.349644: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-12-29 10:33:53.349654: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-12-29 10:33:53.349664: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-12-29 10:33:53.349673: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-12-29 10:33:53.349682: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-12-29 10:33:53.349691: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-12-29 10:33:53.349701: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-12-29 10:33:53.352041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-12-29 10:33:53.352076: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.873305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-12-29 10:33:53.873357: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2022-12-29 10:33:53.873366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2022-12-29 10:33:53.877072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22430 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:3d:00.0, compute capability: 8.6)
/usr/local/lib64/python3.8/site-packages/tensorflow/python/data/ops/dataset_ops.py:3503: UserWarning: Even though the tf.config.experimental_run_functions_eagerly option is set, this option does not apply to tf.data functions. tf.data functions are still traced and executed as graphs.
warnings.warn(
2022-12-29 10:33:53.990568: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2022-12-29 10:33:53.997553: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2600000000 Hz
2022-12-29 10:33:54.015124: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-12-29 10:33:54.679072: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-12-29 10:33:54.679260: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
625/625 [==============================] - 5s 7ms/step - loss: 0.4327
[[0.44245386]
[0.6916534 ]
[0.49306032]
[0.98741436]
[0.7631112 ]]
[[0.48893338]
[0.54947186]
[0.40105245]
[0.56347597]
[0.32270208]]
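The two arrays printed at the end of the log can be compared numerically instead of by eye. The sketch below assumes the first array is the original model's prediction and the second is the loaded model's (the log does not label them). The helper name `predictions_match` and the tolerance `1e-5` are illustrative choices, not part of the original report.

```python
import math

# Values copied from the log output above (first and second printed arrays).
first_printed = [0.44245386, 0.6916534, 0.49306032, 0.98741436, 0.7631112]
second_printed = [0.48893338, 0.54947186, 0.40105245, 0.56347597, 0.32270208]


def predictions_match(a, b, abs_tol=1e-5):
    """Return True when every pair of predictions agrees within abs_tol (assumed tolerance)."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# Largest element-wise difference between the two printed arrays.
largest_gap = max(abs(x - y) for x, y in zip(first_printed, second_printed))

print(predictions_match(first_printed, second_printed))  # False
print(round(largest_gap, 4))  # 0.4404
```

A gap of about 0.44 on outputs in the 0 to 1 range is far beyond floating-point noise, which is consistent with the two models holding different weights.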