TFServing 2.10.0 crash when trying to do inference with GPFlow model #1978

Open · battuzz opened this issue Sep 19, 2022 · 1 comment
battuzz commented Sep 19, 2022

Bug

When serving a GPflow saved_model in TFServing 2.9+, the model server crashes:

2022-09-19 09:50:19.941042: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:229] Restoring SavedModel bundle.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
/usr/bin/tf_serving_entrypoint.sh: line 3:     7 Aborted                 (core dumped) tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"

This is related to the slice() operation inside the Kernel class. I have also filed the bug report in the tensorflow/serving repository: tensorflow/serving#2061
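
For context, the slicing happens in GPflow's kernel base class; a rough sketch of the relevant logic (paraphrased from memory of gpflow/kernels/base.py, not the exact source) looks like this:

# Paraphrased sketch of Kernel.slice; names and details are approximate.
def slice(self, X, X2=None):
    dims = self.active_dims
    if isinstance(dims, slice):
        # This strided slice (x[..., slice]) is the op that makes the
        # TFServing model server abort with std::bad_alloc.
        X = X[..., dims]
        if X2 is not None:
            X2 = X2[..., dims]
    elif dims is not None:
        X = tf.gather(X, dims, axis=-1)
        if X2 is not None:
            X2 = tf.gather(X2, dims, axis=-1)
    return X, X2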

To reproduce

Minimal, reproducible example

This does not work:

import tensorflow as tf
import gpflow

k = gpflow.kernels.SquaredExponential(variance=30., lengthscales=[1., 2., 3., 4., 5.])

module = tf.Module()
module.k = k
# The default kernel call slices its inputs internally (x[..., slice]);
# serving a model exported this way crashes the model server.
module.predict = tf.function(
    lambda x: k(x, x),
    input_signature=[tf.TensorSpec(name='x', shape=(None, 5), dtype=tf.float64)]
)

tf.saved_model.save(module, '<saved_model_location>', signatures={'predict' : module.predict})

While this one works fine:

import tensorflow as tf
import gpflow

k = gpflow.kernels.SquaredExponential(variance=30., lengthscales=[1., 2., 3., 4., 5.])

module = tf.Module()
module.k = k
# presliced=True makes the kernel skip its internal slicing, so the
# exported model loads and serves fine.
module.predict = tf.function(
    lambda x: k(x, x, presliced=True),
    input_signature=[tf.TensorSpec(name='x', shape=(None, 5), dtype=tf.float64)]
)

tf.saved_model.save(module, '<saved_model_location>', signatures={'predict' : module.predict})

The only difference is presliced=True, which skips the x[..., slice] operation that crashes the model server.

To test, I run a TensorFlow Serving instance with the following command:

docker run --rm --name mytfserving -t  -p 9500:8500 -p 9501:8501 -v <my_saved_model_location>:/models tensorflow/serving:2.10.0 --model_config_file=/models/models.config

And with the following models.config:

model_config_list {
  config {
    name: 'mymodel'
    base_path: '/models/mymodel'
    model_platform: 'tensorflow'
    model_version_policy {
      all {}
    }
  }
}
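
Once the container is up, a minimal smoke test against the REST endpoint looks like this (a sketch using the standard TFServing REST API; 'mymodel' and the 'predict' signature come from the config and export script above, and the input row is an arbitrary 5-dimensional point):

import requests

# POST to the REST port mapped to 9501 in the docker command above.
resp = requests.post(
    "http://localhost:9501/v1/models/mymodel:predict",
    json={
        "signature_name": "predict",
        "instances": [[0.1, 0.2, 0.3, 0.4, 0.5]],
    },
)
print(resp.status_code, resp.text)

With the non-presliced export the server never gets this far: it aborts while restoring the SavedModel bundle, as in the log above.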

Expected behavior

I would like to be able to serve the model with TFServing. Rewriting the slice operation could possibly fix this.
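
For example, something along these lines might work, replacing the strided slice with tf.gather (an untested sketch; it assumes Kernel.slice in GPflow 2.5 takes (X, X2) and returns the sliced pair, and that the last input dimension is statically known, as it is with the (None, 5) signature above):

import tensorflow as tf
import gpflow

class GatherSlicedSE(gpflow.kernels.SquaredExponential):
    # Untested sketch: express the x[..., slice] strided slice as a
    # tf.gather over explicit column indices, so the exported graph
    # avoids the op that crashes TFServing.
    def slice(self, X, X2=None):
        dims = self.active_dims
        if isinstance(dims, slice):
            # Materialise the slice as Python-level indices; this
            # requires a statically known last dimension.
            indices = list(range(X.shape[-1]))[dims]
            X = tf.gather(X, indices, axis=-1)
            if X2 is not None:
                X2 = tf.gather(X2, indices, axis=-1)
            return X, X2
        return super().slice(X, X2)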

System information

  • GPflow version: 2.5.2
  • GPflow installed from: pypi
  • TensorFlow version: 2.9.2
  • Python version: 3.8
  • Operating system: Windows

Additional context

It seems like a regression in TF 2.9, where support for this slice() operation was dropped (for TFServing).
I've already created an issue on the tensorflow/serving GitHub page to see whether this is the case. If so, feel free to ignore this issue...

battuzz added the bug label Sep 19, 2022
jesnie (Member) commented Sep 23, 2022

I'm sorry, but I have no experience with TFServing whatsoever, so I don't know how I'd debug this. However, GPflow is an open-source project, and if you send me a PR with a fix I'd be happy to review it.
