Server not working on model which works in Docker Hub Container; inconsistent with Google Release #10

Closed
rummens opened this issue Mar 22, 2019 · 14 comments

@rummens
commented Mar 22, 2019

Description

I want to deploy the Bitnami TensorFlow Serving image from the Red Hat registry.

First of all, there is no way to tell which tag corresponds to which Google release on Docker Hub. Which is the current latest stable release from Google? It looks like you only publish nightly builds, since a new image gets pushed every day.

In addition, the model server in several of the Red Hat images seems broken. I have the same model deployed in the Bitnami container and in the official Google one from Docker Hub. The model works fine in the Google container but cannot be queried in the Bitnami one. The server picks up the model and serves it, but a query returns an init error (the layer that fails to initialize depends on the container version):

{
"error": "Attempting to use uninitialized value bert/embeddings/LayerNorm/gamma\n\t [[{{node bert/embeddings/LayerNorm/gamma/read}}]]"
}

Why is that? Is there a way to figure out which image on the Red Hat registry is based on the current stable/latest release from Docker Hub?

Steps to reproduce the issue:

  1. Deploy Model in RedHat container
  2. Make POST inference -> weird init errors
  3. Make same POST against Docker Hub container -> works

Describe the results you received:

{
    "error": "Attempting to use uninitialized value bert/embeddings/LayerNorm/gamma\n\t [[{{node bert/embeddings/LayerNorm/gamma/read}}]]"
}

Describe the results you expected:

Normal Output of model, no error.

Additional information you deem important (e.g. issue happens only occasionally):

I tried different versions of the Red Hat image (e.g. 1.12 and 1.13). The kind of error is the same, but the layer in which it happens changes: instead of an error in layer 7, it appears in layer 1, for example.

Version

Running in OpenShift 3.4.

Image Version:

RedHat:
1.13.0-rhel-7-r14

Docker:
latest -> 1.13.0

Model Server version inside of both images:

RedHat:
TensorFlow ModelServer: 1.13.0-rc1+dev.sha.c2fe59c
TensorFlow Library: 1.13.1

Docker:
TensorFlow ModelServer: 1.13.0-rc1+dev.sha.f16e7778
TensorFlow Library: 1.13.1
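
The version strings above share a release number but end in different commit hashes. A minimal Python sketch of how one might compare them, assuming the `+dev.sha.<hash>` suffix identifies the source commit of the build (the helper names are hypothetical, not part of TensorFlow Serving):

```python
import re

# Version lines exactly as reported above for the two images.
REDHAT = "TensorFlow ModelServer: 1.13.0-rc1+dev.sha.c2fe59c"
DOCKER_HUB = "TensorFlow ModelServer: 1.13.0-rc1+dev.sha.f16e7778"

# Release part up to "+", then an optional "+dev.sha.<hex>" suffix.
_PATTERN = re.compile(
    r"TensorFlow ModelServer:\s*(?P<release>[^+\s]+)"
    r"(?:\+dev\.sha\.(?P<sha>[0-9a-f]+))?"
)


def parse_model_server_version(line):
    """Hypothetical helper: return (release, sha) from a version line."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError("not a ModelServer version line: %r" % line)
    return match.group("release"), match.group("sha")


def same_build(line_a, line_b):
    """True only if both the release and the commit hash are identical."""
    return parse_model_server_version(line_a) == parse_model_server_version(line_b)


if __name__ == "__main__":
    print(parse_model_server_version(REDHAT))
    print(parse_model_server_version(DOCKER_HUB))
    print("same build:", same_build(REDHAT, DOCKER_HUB))
```

For the two lines above, the release part matches while the hashes differ, so the two servers were not necessarily built from the same source tree.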
@javsalgar

Contributor

commented Mar 25, 2019

Hi,

Thanks for letting us know. When you say the image from Google, do you mean this image? https://hub.docker.com/r/tensorflow/tensorflow.

We download the TensorFlow source code, compile it, and build our containers from the result. We update the containers every day so that they have the latest system packages (to avoid CVE issues as much as possible).

Regarding the init errors, I would like to take a look, but I'm no TensorFlow expert. Could you give me a set of commands that I can use to reproduce the issue?

@rummens

Author

commented Mar 25, 2019

Yes, I use that image.

I have created a folder with the required model files: https://drive.google.com/open?id=1cjRSbzMqWKVpF9ERu77NHEldxvlplp9K. Deploy the model as usual and then make a POST to this endpoint with the following data:

HOST:PORT/v1/models/bert:predict

{"instances": [{"input_mask": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "input_ids": [101, 2040, 22558, 2250, 22199, 1029, 102, 20901, 2038, 2088, 2877, 3231, 4128, 2408, 2885, 1010, 5214, 1997, 5604, 1998, 6042, 1037, 2898, 2846, 1997, 3941, 1998, 5693, 2013, 2948, 2000, 7733, 9819, 1012, 2224, 2256, 9414, 3945, 2000, 2156, 2065, 2045, 2003, 1037, 4322, 2008, 6010, 2115, 5918, 1012, 2250, 22199, 12939, 3231, 4322, 5608, 2000, 5326, 2037, 4128, 1010, 3941, 1998, 5911, 11532, 2408, 1996, 20901, 2177, 1998, 6327, 6304, 1012, 2250, 22199, 2036, 12939, 3231, 4128, 2000, 23569, 27605, 4371, 1996, 27891, 1997, 2037, 3231, 19571, 2083, 3941, 6631, 1998, 2578, 1012, 4821, 1010, 2250, 22199, 3640, 2151, 4022, 5310, 1997, 2019, 20901, 3231, 4322, 1010, 1996, 5770, 1997, 2256, 2440, 11532, 2006, 1996, 2087, 7218, 1998, 2800, 2965, 2000, 3231, 1010, 9398, 3686, 2030, 8292, 28228, 12031, 1037, 5310, 1521, 1055, 4031, 2006, 2051, 1998, 5166, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "segment_ids": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "unique_ids": 1000000000}, {"input_mask": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "input_ids": [101, 2040, 22558, 2250, 22199, 1029, 102, 4821, 1010, 2250, 22199, 3640, 2151, 4022, 5310, 1997, 2019, 20901, 3231, 4322, 1010, 1996, 5770, 1997, 2256, 2440, 11532, 2006, 1996, 2087, 7218, 1998, 2800, 2965, 2000, 3231, 1010, 9398, 3686, 2030, 8292, 28228, 12031, 1037, 5310, 1521, 1055, 4031, 2006, 2051, 1998, 5166, 1012, 2250, 22199, 2003, 1996, 2691, 4132, 2306, 1996, 2177, 2029, 3084, 2009, 3722, 2000, 2424, 2019, 20901, 4256, 2040, 6753, 1037, 3327, 5604, 4023, 1012, 2009, 7861, 11452, 2015, 4322, 5608, 2000, 4503, 2037, 5906, 1010, 2037, 2897, 1998, 2449, 4346, 1996, 4555, 16991, 1998, 3754, 2000, 2491, 2037, 2951, 1012, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "segment_ids": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "unique_ids": 1000000001}]}
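
The POST described above can be sketched in Python with only the standard library. This is an illustration, not the reporter's actual client: the function names are hypothetical, and it assumes the server answers with a JSON body containing either `predictions` or `error`, as in the responses quoted in this thread.

```python
import json
import urllib.error
import urllib.request


def build_predict_request(host, port, model, instances):
    """Build (but do not send) a POST to /v1/models/<model>:predict."""
    url = "http://%s:%s/v1/models/%s:predict" % (host, port, model)
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_response(payload):
    """Return 'error', 'predictions', or 'unknown' for a decoded JSON body."""
    if "error" in payload:
        return "error"
    if "predictions" in payload:
        return "predictions"
    return "unknown"


def predict(host, port, model, instances, timeout=30.0):
    """Send the request and return the decoded JSON body.

    Assumption: on an HTTP error status the body is still JSON, so it is
    decoded and returned instead of raising. A non-JSON body would make
    json.loads raise.
    """
    request = build_predict_request(host, port, model, instances)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        return json.loads(exc.read().decode("utf-8"))
```

Calling `classify_response(predict("localhost", 8501, "bert", instances))` with the `instances` list from the payload above would then distinguish the uninitialized-value error from a normal prediction; host and port here are placeholders.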
@javsalgar

Contributor

commented Mar 26, 2019

Hi,

Could you be a bit more specific in the commands to deploy that model? I'd like to test it but I haven't worked with Tensorflow in the past.

@rummens

Author

commented Mar 26, 2019

Sure, most of the information is already on this repo's readme: Link

You can do it locally but since we are running in OpenShift, maybe you should do it too. Not sure :-)

After pulling the image do the following:

  1. Persist the configuration as described in the readme above (Section "Persisting your configuration").
  2. Copy all the files I gave you (folder bert) into the folder you connected to your docker container (this will make them available inside of the container under the path /bitnami)
  3. Change the config file inside of the container as described in the readme above (Section "Configuration file").
    This is the config I use (you might have to adapt the base_path to your setup):

    model_config_list: {
      config: {
        name: "bert",
        base_path: "/bitnami/tensorflow-serving/models/bert/",
        model_platform: "tensorflow",
      }
    }

  4. Restart the container so that the model server is reinitialized.
  5. Have a look at the container logs. They should not report that the model could not be found; instead they should look something like this:
2019-03-21 17:09:21.246960: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config:  model_name: bert model_base_path: /models/bert
2019-03-21 17:09:21.247218: I tensorflow_serving/model_servers/server_core.cc:461] Adding/updating models.
2019-03-21 17:09:21.247237: I tensorflow_serving/model_servers/server_core.cc:558]  (Re-)adding model: bert
2019-03-21 17:09:21.348790: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: bert version: 2}
2019-03-21 17:09:21.348829: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: bert version: 2}
2019-03-21 17:09:21.348844: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: bert version: 2}
2019-03-21 17:09:21.348867: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/bert/2
2019-03-21 17:09:21.348878: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/bert/2
2019-03-21 17:09:21.361377: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-03-21 17:09:21.375669: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-21 17:09:21.425810: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:182] Restoring SavedModel bundle.
2019-03-21 17:09:21.796603: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:132] Running initialization op on SavedModel bundle.
2019-03-21 17:09:21.834680: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:285] SavedModel load for tags { serve }; Status: success. Took 485775 microseconds.
2019-03-21 17:09:21.834794: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:101] No warmup data file found at /models/bert/2/assets.extra/tf_serving_warmup_requests
2019-03-21 17:09:21.835215: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: bert version: 2}
2019-03-21 17:09:21.857150: I tensorflow_serving/model_servers/server.cc:313] Running gRPC ModelServer at 0.0.0.0:8500 ...
[evhttp_server.cc : 237] RAW: Entering the event loop ...
2019-03-21 17:09:21.864033: I tensorflow_serving/model_servers/server.cc:333] Exporting HTTP/REST API at:localhost:8501 ...

Proceed with the steps from my last post to get the error.
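
The log check in the last step can be automated with a small sketch like the one below. It assumes the "Successfully loaded servable version {name: ... version: ...}" line keeps the format shown in the log above; the function name is hypothetical.

```python
import re

# Matches the loader_harness line that reports a finished load, e.g.
# "... Successfully loaded servable version {name: bert version: 2}"
_LOADED = re.compile(
    r"Successfully loaded servable version "
    r"\{name: (?P<name>\S+) version: (?P<version>\d+)\}"
)


def loaded_servables(log_text):
    """Return a list of (model_name, version) pairs found in the log text."""
    return [
        (match.group("name"), int(match.group("version")))
        for match in _LOADED.finditer(log_text)
    ]


if __name__ == "__main__":
    import sys

    # Usage: pipe the container log into this script on stdin.
    found = loaded_servables(sys.stdin.read())
    print(found if found else "no servable reported as loaded")
```

Note that a successful load only shows that the SavedModel was read; as this issue demonstrates, a query can still fail afterwards.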

@rummens

Author

commented Apr 2, 2019

Any updates on this?

@miguelaeh


commented Apr 8, 2019

Hi @rummens,
I have followed your steps, but I am not able to get the error.
I have copied your files into the /bitnami folder and I have used your configuration file, changing the base_path to mine. I post the Tensorflow logs here:

tensorflow-serving_1  |
tensorflow-serving_1  | Welcome to the Bitnami tensorflow-serving container
tensorflow-serving_1  | Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-tensorflow-serving
tensorflow-serving_1  | Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-tensorflow-serving/issues
tensorflow-serving_1  |
tensorflow-serving_1  | nami    INFO  Initializing tensorflow-serving
tensorflow-serving_1  | nami    INFO  tensorflow-serving successfully initialized
tensorflow-serving_1  | INFO  ==> Starting tensorflow-serving...
tensorflow-serving_1  | TensorFlow Serving is not running... starting server in background mode
tensorflow-serving_1  | 2019-04-08 14:48:25.016426: I tensorflow_serving/model_servers/server_core.cc:461] Adding/updating models.
tensorflow-serving_1  | 2019-04-08 14:48:25.016522: I tensorflow_serving/model_servers/server_core.cc:558]  (Re-)adding model: bert
tensorflow-serving_1  | 2019-04-08 14:48:25.116906: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: bert version: 2}
tensorflow-serving_1  | 2019-04-08 14:48:25.116960: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: bert version: 2}
tensorflow-serving_1  | 2019-04-08 14:48:25.116988: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: bert version: 2}
tensorflow-serving_1  | 2019-04-08 14:48:25.117023: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /bitnami/bert/2
tensorflow-serving_1  | 2019-04-08 14:48:25.117046: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /bitnami/bert/2
tensorflow-serving_1  | 2019-04-08 14:48:25.133610: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
tensorflow-serving_1  | 2019-04-08 14:48:25.150091: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
tensorflow-serving_1  | 2019-04-08 14:48:25.214138: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:182] Restoring SavedModel bundle.
tensorflow-serving_1  | 2019-04-08 14:48:25.609770: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:132] Running initialization op on SavedModel bundle.
tensorflow-serving_1  | 2019-04-08 14:48:25.654727: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:285] SavedModel load for tags { serve }; Status: success. Took 537658 microseconds.
tensorflow-serving_1  | 2019-04-08 14:48:25.654839: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:101] No warmup data file found at /bitnami/bert/2/assets.extra/tf_serving_warmup_requests
tensorflow-serving_1  | 2019-04-08 14:48:25.654984: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: bert version: 2}
tensorflow-serving_1  | 2019-04-08 14:48:25.667093: I tensorflow_serving/model_servers/server.cc:313] Running gRPC ModelServer at 0.0.0.0:8500 ...
tensorflow-serving_1  | [warn] getaddrinfo: address family for nodename not supported
tensorflow-serving_1  | [evhttp_server.cc : 237] RAW: Entering the event loop ...
tensorflow-serving_1  | 2019-04-08 14:48:25.674672: I tensorflow_serving/model_servers/server.cc:333] Exporting HTTP/REST API at:localhost:8501 ...

After that, I did a POST with the data you posted. And I get the following response:

{
    "predictions": [
        {
            "end_logits": [1.88635, -7.71321, -7.67642, -8.25464, -5.21849, -7.37146, -6.47271, 0.49895, -7.63512, -7.41713, -6.46705, -6.89572, -4.2181, -7.86675, -3.47279, -6.13398, -8.34171, -7.81871, -7.11278, -8.01439, -7.09987, -7.9371, -7.98266, -7.421, -7.69318, -6.76245, -7.68158, -6.1028, -7.9786, -6.87365, -7.6551, -7.12621, -5.04349, -6.34223, -8.15959, -8.17065, -8.12821, -6.02864, -7.89356, -7.79615, -7.91913, -7.95742, -7.7102, -8.07146, -7.33924, -7.90614, -7.86938, -7.49437, -5.70677, -5.44679, -8.17311, -6.11704, -7.95962, -6.01986, -6.06159, -1.49583, -7.56889, -7.4132, -7.12605, -6.68924, -7.08996, -6.8845, -7.60274, -6.83489, -5.28159, -7.22466, -7.42371, -1.97366, -1.59171, -7.37733, -6.06673, -2.25972, -4.60752, -8.52935, -6.78608, -7.60115, -8.12158, -7.19667, -4.60371, -7.80494, -8.15132, -7.5587, -6.97777, -8.16885, -7.27163, -7.27105, -7.43738, -7.16989, -5.1173, -7.87953, -7.49218, -6.58547, -7.45253, -5.18924, -4.66551, -7.57856, -6.63115, -8.49946, -6.09317, -8.00419, -7.07554, -7.17987, -5.73054, -7.22227, -6.68655, -2.27478, -6.47436, -3.94676, -6.21166, -8.15075, -7.89853, -7.62918, -7.66006, -7.6437, -6.42169, -7.556, -8.12139, -7.88107, -7.47042, -7.83188, -7.16718, -6.51117, -7.56302, -6.8062, -7.28315, -8.11626, -6.73703, -7.46329, -8.36728, -7.53583, -6.63665, -7.22987, -6.83201, -7.35009, -6.97935, -5.8814, -7.86263, -7.5484, -7.79917, -4.63619, -5.05621, -6.18158, -8.0197, -8.0098, -8.00065, -8.00563, -7.99471, -7.96485, -8.01101, -8.02766, -7.98552, -7.99756, -8.02606, -8.02027, -8.00464, -8.01293, -7.99578, -8.01951, -8.03921, -8.06384, -8.01115, -8.05136, -8.03646, -8.06708, -8.05742, -8.01833, -8.05162, -8.03772, -8.01131, -8.03326, -8.03416, -8.02604, -8.02558, -8.03628, -8.04898, -8.02824, -8.0242, -8.03927, -8.04013, -8.05176, -8.01941, -8.04782, -8.07533, -8.06134, -8.02841, -8.00807, -8.0653, -8.04339, -8.0472, -8.05569, -8.02819, -8.03766, -8.03592, -8.06554, -8.01616, -8.00958, -8.0352, -8.02069, -8.02544, 
-8.01891],
            "start_logits": [1.89395, -6.17103, -5.71777, -6.34353, -8.69349, -6.336, -7.11741, 1.11222, -6.29514, -4.62471, -7.63364, -4.95477, -6.67329, -6.67357, -5.05092, -7.26321, -6.27973, -7.77083, -6.24084, -7.70391, -7.10408, -7.12483, -7.12772, -7.75358, -7.975, -6.80516, -8.01904, -7.00503, -6.68381, -5.6422, -7.65209, -6.26765, -7.5305, -6.89652, -6.49628, -6.2252, -6.01721, -7.51752, -7.40349, -7.43719, -7.36661, -7.16736, -7.98579, -7.023, -7.12836, -7.76345, -7.28101, -7.52805, -7.34341, -6.19979, -3.6709, -7.55136, -6.47709, -2.78851, -6.94207, -4.01396, -6.99115, -6.69709, -7.85423, -7.31325, -8.52707, -7.3206, -8.03917, -6.73382, -7.38379, -5.9956, -5.01767, -0.458483, -5.08685, -7.95153, -4.45515, -5.68045, -7.13676, -4.6458, -7.83034, -7.7752, -6.85525, -5.15395, -6.57467, -7.20129, -7.00695, -8.05484, -8.3622, -7.35165, -7.32059, -8.43227, -7.34679, -5.52198, -6.91232, -7.31733, -6.0772, -7.68389, -8.36163, -7.65515, -7.37059, -6.95018, -7.23477, -4.33394, -7.70486, -6.75653, -4.11418, -6.63958, -7.21764, -8.12978, -5.75854, -1.50274, -6.63868, -7.42107, -7.58481, -6.07518, -6.80728, -7.96151, -6.33652, -6.46055, -6.80299, -7.73459, -6.86747, -6.9004, -7.3938, -7.97764, -7.68548, -7.92056, -7.77384, -7.2315, -8.36081, -7.4228, -8.63137, -8.08261, -7.17473, -8.27812, -8.50419, -7.62375, -6.96657, -8.30429, -8.24775, -7.40648, -7.35741, -6.90331, -8.0297, -6.76656, -7.58663, -7.21984, -8.05904, -8.0479, -8.07065, -8.05414, -8.06583, -8.09119, -8.05842, -8.03403, -8.0695, -8.0603, -8.05814, -8.05401, -8.06088, -8.05724, -8.07478, -8.06466, -8.04992, -8.0292, -8.07227, -8.04891, -8.06145, -8.03724, -8.03422, -8.0588, -8.04744, -8.06347, -8.08017, -8.05346, -8.05682, -8.06, -8.05593, -8.05017, -8.03559, -8.04371, -8.04633, -8.04184, -8.03873, -8.02622, -8.04819, -8.04433, -8.02308, -8.03243, -8.0554, -8.07409, -8.03549, -8.05544, -8.04929, -8.04327, -8.07355, -8.06391, -8.0601, -8.04202, -8.07568, -8.08226, -8.05762, -8.0697, -8.05561, -8.07516],
            "unique_ids": 1000000000
        },
        {
            "end_logits": [1.7971, -7.69476, -7.69979, -8.24493, -5.12942, -7.20498, -6.29411, -6.89877, -5.51851, -8.22334, -4.99419, -7.64681, -6.64564, -7.02387, -5.21917, -7.2014, -6.298, -1.74696, -4.97182, -2.79664, -5.43804, -7.81159, -7.6644, -7.69437, -7.1555, -7.42649, -5.77371, -7.44204, -7.99793, -7.89191, -7.51014, -7.96775, -7.11213, -6.68531, -7.50293, -6.598, -7.39869, -8.23903, -6.83253, -7.77239, -8.47055, -7.6694, -6.71383, -7.36133, -7.18606, -7.37836, -7.05179, -5.75529, -7.84509, -7.53321, -7.98055, -4.29564, -3.2521, -7.35999, -5.06275, -8.02504, -7.87654, -6.49647, -5.72829, -6.81087, -7.5094, -3.43106, -7.41876, -8.24059, -8.23543, -7.26439, -7.75959, -6.60055, -4.67576, -2.44382, -0.685445, -4.85748, -5.33752, -7.54303, -7.29981, -5.9717, -4.58172, -4.59226, -7.92281, -8.40496, -8.04877, -6.95516, -5.40954, -2.68969, -7.69513, -7.19537, -7.3715, -5.55237, -6.60155, -7.90538, -6.07703, -7.55561, -4.5476, -7.7187, -8.13389, -7.94818, -7.15718, -7.68567, -7.34945, -7.54567, -7.14372, -7.16989, -5.5085, -5.0984, -5.9343, -5.93809, -7.93428, -7.94276, -7.98788, -7.98569, -7.98563, -8.07888, -8.02819, -8.01415, -8.01683, -7.98067, -7.99473, -8.00823, -7.9599, -7.97134, -7.95014, -7.93278, -8.03036, -7.98561, -7.96053, -8.02014, -7.98149, -8.05193, -8.07696, -8.03016, -8.01335, -8.03546, -8.02516, -8.07061, -8.03722, -8.01404, -8.07952, -8.04804, -8.01867, -8.05968, -8.04412, -8.02413, -8.02588, -8.0363, -8.00355, -8.01281, -8.00063, -7.95522, -8.01383, -8.01509, -7.97682, -7.96553, -8.00064, -7.98441, -7.9882, -7.99824, -7.96236, -7.97507, -8.00692, -8.0433, -7.97083, -7.9887, -7.96293, -8.04843, -7.95422, -7.90545, -7.96051, -7.9667, -7.94074, -8.02239, -8.00472, -7.9908, -8.02842, -8.03611, -8.04443, -8.01002, -8.02953, -8.03085, -8.01437, -8.0454, -8.00132, -8.00447, -8.03271, -8.00425, -7.98719, -7.89763, -8.00812, -7.98091, -7.97825, -7.99747, -7.97061, -7.98386, -7.97647, -8.02069, -7.95947, -7.95605, -7.99679, -7.95132, -7.95325, 
-7.94678],
            "start_logits": [1.69411, -6.23828, -5.97208, -6.20629, -8.56986, -6.35793, -7.20431, -6.10802, -5.90824, -3.57492, -7.18318, -6.13202, -3.60061, -5.92503, -6.30996, -7.57493, -4.15579, -0.94657, -4.72717, -6.48217, -6.80237, -5.55348, -6.48098, -7.67901, -5.52591, -6.04397, -6.02081, -7.4529, -6.56467, -6.75301, -7.11356, -7.80245, -7.56675, -7.58747, -7.46179, -6.91821, -8.25127, -7.07649, -8.46045, -7.58839, -6.76486, -8.08384, -8.22104, -7.43108, -6.92263, -8.20415, -8.07333, -6.97405, -6.56892, -5.571, -7.78529, -5.77651, -7.42387, -2.69386, -7.31664, -6.88971, -5.09293, -5.15971, -6.76728, -6.79877, -5.52128, -5.73292, -7.40855, -6.75794, -6.89355, -6.67498, -6.20267, -4.60173, -1.47207, -1.18635, -2.82989, -6.29915, -5.33553, -7.45546, -7.62264, -5.82455, -7.62862, -6.6745, -5.96336, -6.46013, -7.41071, -8.31349, -3.2044, -4.46141, -7.3123, -6.60542, -7.84519, -6.7457, -7.96695, -6.07117, -6.22039, -8.19423, -5.98226, -7.28155, -7.1815, -6.93448, -7.30734, -8.16096, -7.89577, -8.05406, -7.34154, -8.30617, -7.17018, -7.9957, -7.16852, -7.3213, -8.12426, -8.12409, -8.08949, -8.08971, -8.08755, -8.02866, -8.05805, -8.06331, -8.06477, -8.08951, -8.07633, -8.07999, -8.10378, -8.09809, -8.10913, -8.12794, -8.05799, -8.08603, -8.1008, -8.05964, -8.09263, -8.03134, -8.01245, -8.04282, -8.04677, -8.03661, -8.0361, -8.01393, -8.03993, -8.04992, -8.00284, -8.023, -8.04776, -8.01076, -8.02369, -8.04803, -8.0439, -8.02561, -8.06556, -8.05153, -8.06927, -8.09828, -8.04904, -8.04665, -8.07846, -8.08126, -8.06471, -8.07746, -8.06924, -8.06898, -8.10084, -8.09168, -8.08115, -8.05335, -8.11334, -8.10348, -8.12311, -8.05894, -8.11119, -8.13315, -8.10909, -8.10413, -8.12074, -8.05254, -8.07597, -8.07138, -8.04774, -8.03797, -8.03515, -8.05518, -8.03872, -8.04257, -8.05183, -8.02965, -8.06015, -8.06646, -8.04775, -8.07133, -8.08556, -8.14601, -8.07829, -8.10066, -8.09927, -8.09366, -8.10981, -8.10192, -8.09973, -8.07898, -8.11445, -8.11818, -8.08041, -8.11404, 
-8.11151, -8.12287],
            "unique_ids": 1000000001
        }
    ]
}

But I get no error.
Are these results what you expected?
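
To check whether two responses like the one above agree, a comparison with a small tolerance is safer than exact equality, because the logits printed in this thread differ in the last digit between runs (for example -7.28101 versus -7.28102). A minimal sketch, with a hypothetical function name and an arbitrarily chosen tolerance:

```python
import math


def predictions_match(response_a, response_b, abs_tol=1e-4):
    """Compare two decoded predict responses entry by entry.

    List values (such as start_logits and end_logits) are compared
    element-wise within abs_tol; other values (such as unique_ids)
    must be equal.
    """
    preds_a = response_a.get("predictions", [])
    preds_b = response_b.get("predictions", [])
    if len(preds_a) != len(preds_b):
        return False
    for pred_a, pred_b in zip(preds_a, preds_b):
        if pred_a.keys() != pred_b.keys():
            return False
        for key, value_a in pred_a.items():
            value_b = pred_b[key]
            if isinstance(value_a, list):
                if not isinstance(value_b, list) or len(value_a) != len(value_b):
                    return False
                for x, y in zip(value_a, value_b):
                    if not math.isclose(x, y, abs_tol=abs_tol):
                        return False
            elif value_a != value_b:
                return False
    return True
```

The tolerance of 1e-4 is an assumption picked to absorb rounding in the printed output; it is not a value recommended by TensorFlow Serving.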

@rummens

Author

commented Apr 8, 2019

Hmm, yes and no. The results are absolutely correct; this is what the model is supposed to output. I just don't understand why you don't get the error and we do. Any idea why? Am I using the wrong image?

This is what OpenShift tells me about the image:

**Config**
Author: Unknown
Built: a month ago
Digest: sha256:840e4d6ceeef2a409746ea0481af18c81bdbc7bfd5e325f9b3d789566c25bccf
Identifier: sha256:7794201dcb2c91c5469e8a19a374c72252950d6600706f34a4e0d0b7364a576c
Labels: maintainer=Bitnami <containers@bitnami.com>
Annotations: openshift.io/generated-by: OpenShiftWebConsole; openshift.io/imported-from: bitnami/tensorflow-serving
Docker Version: 18.09.3

**Environment**
PATH=/opt/bitnami/tensorflow-serving/bin:/opt/bitnami/tensorflow-serving/bazel-bin/tensorflow_serving/model_servers:/opt/bitnami/nami/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
IMAGE_OS=debian-9
NAMI_VERSION=1.0.0-1
GPG_KEY_SERVERS_LIST=ha.pool.sks-keyservers.net hkp://p80.pool.sks-keyservers.net:80 keyserver.ubuntu.com hkp://keyserver.ubuntu.com:80 pgp.mit.edu
TINI_VERSION=v0.13.2
TINI_GPG_KEY=595E85A6B1B4779EA4DAAEC70B588DFF0527A9B7
GOSU_VERSION=1.10
GOSU_GPG_KEY=B42F6819007F00F88E364FD4036A9C25BF357DD4
BITNAMI_IMAGE_VERSION=1.13.0-debian-9-r0
BITNAMI_PKG_CHMOD=-R g+rwX
HOME=/
BITNAMI_APP_NAME=tensorflow-serving
NAMI_PREFIX=/.nami
TENSORFLOW_SERVING_MODEL_NAME=inception
TENSORFLOW_SERVING_PORT_NUMBER=8500
TENSORFLOW_SERVING_REST_API_PORT_NUMBER=8501


**Layers**
sha256:755e3461ac38d37509872b0f865eda493b9da04e7d31e84e1abf79bf9bae6d4b  20.7 MB
sha256:f543c5e0a42c91e9567cb7714f6cb97fa573be06bf94b37ac5e5dd0cf1ce4048  5206 B
sha256:c67d198228df953d66cb83458d483f08f9210cbd935901a38c48873c8b3dca8d  11.9 MB
sha256:c88860fb6a26ab4509756fd7222d07cebebc9cf5da8056fee863ce2e5748eb4d  14589 B
sha256:a7fc7b3fa1af9f581b23fb251c562be6e445e347383461d1086f86799db75d6b  579805 B
sha256:670b8acb8eabc968cd47a694ff024bab0aaf9a71de95705d01df5650cbcfd715  5316 B
sha256:d007c12f8d500581686102f3a1a4e4b62087faa4c3a5e6af50a0bf8582aba87a  236 B
sha256:53404f5fe8200dea03dd99f2746360cc57dc9c71ee23c276f71928d0335feb04  43.6 MB
sha256:65360d2703c5581ea28c0b58b1b600cc246a51eae60b0dcdea3bd6d0c6c11e50  605 B
sha256:01ad79583bc7711ae0161722a57748907a0fc2a49f37e9bf1dd971eea940f804


@miguelaeh


commented Apr 11, 2019

Hi @rummens,
I have reproduced similar errors with the Red Hat image. I am going to create an internal task so that we can work on it in the future. We will update this issue when we have more information.

Thanks for reporting this.

@miguelaeh miguelaeh added the on-hold label Apr 11, 2019

@miguelaeh


commented Apr 12, 2019

Hi @rummens ,
After working through some memory problems, I got your model working on OpenShift Online. This is the YAML I used:

apiVersion: v1
kind: Pod
metadata:
  name: tensorflow-serving
  labels:
    app: tensorflowlow
spec:
  imagePullSecrets:
      - name: petete
  volumes:
  - name: tensorflow-vol
    emptyDir: {}
  containers:
  - name: tensorflow-serving
    image: registry.connect.redhat.com/bitnami/tensorflow-serving
    resources:
      requests:
        memory: "2Gi"
      limits:
        memory: "2Gi"
    volumeMounts:
    - name: tensorflow-vol
      mountPath: /bitnami
  initContainers:
  - name: init-tensorflow
    image: busybox
    workingDir: /tmp
    command: ['sh','-c','wget --no-check-certificate -O data.tar.gz "https://s3.amazonaws.com/bitnami-autobuild-ondemand/temporary/s3temp-bitnami-20190411083532-128499/data.tar.gz?AWSAccessKeyId=AKIAIVO22QZHYMJEBWVA&Expires=1555058132&Signature=1JpHLTmnaXeFNGyutkxj1HUJDfE="; tar xvf data.tar.gz; cp -R ./data/* /bitnami/']
    volumeMounts:
    - name: tensorflow-vol
      mountPath: /bitnami

The wget command in the init container downloads your bert folder, which I had previously uploaded to an S3 backend. You can do this manually if you want. I requested 2Gi of memory because with less memory the process is killed.
After deploying this Pod, I sent the following POST to the model:

curl --data '{"instances": [{"input_mask": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "input_ids": [101, 2040, 22558, 2250, 22199, 1029, 102, 20901, 2038, 2088, 2877, 3231, 4128, 2408, 2885, 1010, 5214, 1997, 5604, 1998, 6042, 1037, 2898, 2846, 1997, 3941, 1998, 5693, 2013, 2948, 2000, 7733, 9819, 1012, 2224, 2256, 9414, 3945, 2000, 2156, 2065, 2045, 2003, 1037, 4322, 2008, 6010, 2115, 5918, 1012, 2250, 22199, 12939, 3231, 4322, 5608, 2000, 5326, 2037, 4128, 1010, 3941, 1998, 5911, 11532, 2408, 1996, 20901, 2177, 1998, 6327, 6304, 1012, 2250, 22199, 2036, 12939, 3231, 4128, 2000, 23569, 27605, 4371, 1996, 27891, 1997, 2037, 3231, 19571, 2083, 3941, 6631, 1998, 2578, 1012, 4821, 1010, 2250, 22199, 3640, 2151, 4022, 5310, 1997, 2019, 20901, 3231, 4322, 1010, 1996, 5770, 1997, 2256, 2440, 11532, 2006, 1996, 2087, 7218, 1998, 2800, 2965, 2000, 3231, 1010, 9398, 3686, 2030, 8292, 28228, 12031, 1037, 5310, 1521, 1055, 4031, 2006, 2051, 1998, 5166, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "segment_ids": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "unique_ids": 1000000000}, {"input_mask": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "input_ids": [101, 2040, 22558, 2250, 22199, 1029, 102, 4821, 1010, 2250, 22199, 3640, 2151, 4022, 5310, 1997, 2019, 20901, 3231, 4322, 1010, 1996, 5770, 1997, 2256, 2440, 11532, 2006, 1996, 2087, 7218, 1998, 2800, 2965, 2000, 3231, 1010, 9398, 3686, 2030, 8292, 28228, 12031, 1037, 5310, 1521, 1055, 4031, 2006, 2051, 1998, 5166, 1012, 2250, 22199, 2003, 1996, 2691, 4132, 2306, 1996, 2177, 2029, 3084, 2009, 3722, 2000, 2424, 2019, 20901, 4256, 2040, 6753, 1037, 3327, 5604, 4023, 1012, 2009, 7861, 11452, 2015, 4322, 5608, 2000, 4503, 2037, 5906, 1010, 2037, 2897, 1998, 2449, 4346, 1996, 4555, 16991, 1998, 3754, 2000, 2491, 2037, 2951, 1012, 1012, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "segment_ids": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "unique_ids": 1000000001}]}' localhost:8501/v1/models/bert:predict

And I got this output:

{
    "predictions": [
        {
            "start_logits": [1.89395, -6.17103, -5.71777, -6.34353, -8.69349, -6.336, -7.11741, 1.11222, -6.29514, -4.62471, -7.63364, -4.95477, -6.67329, -6.67357, -5.05092, -7.26321, -6.27973, -7.77083, -6.24084, -7.70391, -7.10408, -7.12483, -7.12772, -7.75358, -7.975, -6.80516, -8.01904, -7.00503, -6.68381, -5.6422, -7.65209, -6.26765, -7.5305, -6.89652, -6.49628, -6.2252, -6.01721, -7.51752, -7.40349, -7.43719, -7.36661, -7.16736, -7.98579, -7.023, -7.12836, -7.76345, -7.28102, -7.52805, -7.34341, -6.19979, -3.6709, -7.55136, -6.4771, -2.78851, -6.94207, -4.01396, -6.99116, -6.69709, -7.85423, -7.31325, -8.52707, -7.3206, -8.03917, -6.73382, -7.38379, -5.9956, -5.01766, -0.458483, -5.08685, -7.95153, -4.45515, -5.68045, -7.13676, -4.6458, -7.83034, -7.7752, -6.85525, -5.15395, -6.57467, -7.20129, -7.00695, -8.05484, -8.3622, -7.35165, -7.32059, -8.43227, -7.34679, -5.52198, -6.91232, -7.31733, -6.0772, -7.68389, -8.36163, -7.65515,-7.37059, -6.95018, -7.23477, -4.33394, -7.70486, -6.75653, -4.11418, -6.63959, -7.21764, -8.12978, -5.75854, -1.50274, -6.63868, -7.42107, -7.58481, -6.07518, -6.80728, -7.96152, -6.33652, -6.46055, -6.80299, -7.73459, -6.86747, -6.9004, -7.3938, -7.97764, -7.68548, -7.92056, -7.77384, -7.2315, -8.36081, -7.4228, -8.63137, -8.08261, -7.17473, -8.27812, -8.50419, -7.62375, -6.96657, -8.30429, -8.24775, -7.40648, -7.35741, -6.90331, -8.0297, -6.76656, -7.58663, -7.21984, -8.05904, -8.0479, -8.07065, -8.05414, -8.06583, -8.09119, -8.05842, -8.03402, -8.0695, -8.0603, -8.05814, -8.05402, -8.06088, -8.05724, -8.07478, -8.06466, -8.04992, -8.0292, -8.07227, -8.04891, -8.06145, -8.03724, -8.03422, -8.0588, -8.04744, -8.06347, -8.08017, -8.05346, -8.05682, -8.06, -8.05593, -8.05017, -8.03559, -8.04371, -8.04633, -8.04184, -8.03873, -8.02622, -8.04819, -8.04433, -8.02308, -8.03243, -8.0554, -8.07409, -8.03549, -8.05544, -8.04929, -8.04327, -8.07355, -8.06391, -8.0601, -8.04202, -8.07568, -8.08226, -8.05762, -8.0697, -8.05561, -8.07516],
            "unique_ids": 1000000000,
            "end_logits": [1.88635, -7.71321, -7.67642, -8.25464, -5.21849, -7.37146, -6.47271, 0.49895, -7.63512, -7.41713, -6.46705, -6.89572, -4.2181, -7.86675, -3.47279, -6.13398, -8.3417, -7.81871, -7.11278, -8.01439, -7.09987, -7.9371, -7.98266, -7.421, -7.69318, -6.76245, -7.68158, -6.1028, -7.9786, -6.87365, -7.6551, -7.12621, -5.04349, -6.34223, -8.15959, -8.17065, -8.12821, -6.02864, -7.89356, -7.79615, -7.91913, -7.95742, -7.71019, -8.07146, -7.33924, -7.90614, -7.86938, -7.49437, -5.70677, -5.44679, -8.17311, -6.11704, -7.95962, -6.01986, -6.06159, -1.49583, -7.56889, -7.4132, -7.12605, -6.68924, -7.08996, -6.8845, -7.60274, -6.83489, -5.28159, -7.22466, -7.42371, -1.97366, -1.59171, -7.37732, -6.06673, -2.25972, -4.60752, -8.52935, -6.78608, -7.60115, -8.12158, -7.19667, -4.60371, -7.80494, -8.15132, -7.5587, -6.97777, -8.16885, -7.27163, -7.27105, -7.43738, -7.16989, -5.1173, -7.87953, -7.49218, -6.58548, -7.45253, -5.18924, -4.66551, -7.57856, -6.63115, -8.49946, -6.09317, -8.00419, -7.07554, -7.17987, -5.73054, -7.22227, -6.68655, -2.27478, -6.47436, -3.94676, -6.21166, -8.15075, -7.89853, -7.62918, -7.66006, -7.6437, -6.42169, -7.556, -8.12139, -7.88107, -7.47042, -7.83188, -7.16717, -6.51117, -7.56302, -6.8062, -7.28315, -8.11626, -6.73703, -7.46329, -8.36728, -7.53583, -6.63665, -7.22987, -6.83201, -7.35009, -6.97935, -5.8814, -7.86263, -7.5484, -7.79917, -4.63619, -5.05621, -6.18158, -8.0197, -8.0098, -8.00065, -8.00563, -7.99471, -7.96485, -8.01101, -8.02766, -7.98552, -7.99756, -8.02606, -8.02027, -8.00464, -8.01293, -7.99578, -8.01951,-8.03921, -8.06384, -8.01115, -8.05136, -8.03646, -8.06708, -8.05742, -8.01833, -8.05162, -8.03772, -8.01131, -8.03326, -8.03416, -8.02604, -8.02558, -8.03628, -8.04898, -8.02824, -8.0242, -8.03927, -8.04013, -8.05176, -8.01941, -8.04782, -8.07533, -8.06134, -8.02841, -8.00807, -8.0653, -8.04339, -8.0472, -8.05569,-8.02819, -8.03766, -8.03592, -8.06554, -8.01616, -8.00958, -8.0352, -8.02069, -8.02544, -8.01891]
        },
        {
            "start_logits": [1.69411, -6.23828, -5.97208, -6.20629, -8.56986, -6.35793, -7.20431, -6.10802, -5.90824, -3.57492, -7.18318, -6.13202, -3.60061, -5.92503, -6.30996, -7.57493, -4.15578, -0.94657, -4.72717, -6.48217, -6.80237, -5.55348, -6.48098, -7.67901, -5.52591, -6.04397, -6.02081, -7.4529, -6.56467,-6.75301, -7.11356, -7.80245, -7.56675, -7.58747, -7.46179, -6.91821, -8.25127, -7.07649, -8.46045, -7.58839, -6.76486, -8.08384, -8.22104, -7.43108, -6.92263, -8.20416, -8.07333, -6.97405, -6.56892, -5.571, -7.78529, -5.77651, -7.42387, -2.69386, -7.31664, -6.88971, -5.09293, -5.15971, -6.76728, -6.79877, -5.52128, -5.73292, -7.40855, -6.75794, -6.89355, -6.67498, -6.20267, -4.60173, -1.47207, -1.18635, -2.82989, -6.29915, -5.33553, -7.45546, -7.62264, -5.82455, -7.62862, -6.6745, -5.96336, -6.46013, -7.41071, -8.31349, -3.20439, -4.46141, -7.3123, -6.60542, -7.84519, -6.7457, -7.96695, -6.07117, -6.22039, -8.19423, -5.98226, -7.28155, -7.1815, -6.93448, -7.30734, -8.16096, -7.89577, -8.05406, -7.34153, -8.30617, -7.17018, -7.9957, -7.16852, -7.3213, -8.12426, -8.12409, -8.08949, -8.08971, -8.08755, -8.02866, -8.05805, -8.06331, -8.06477, -8.0895, -8.07633, -8.07999, -8.10379, -8.09809, -8.10913, -8.12794, -8.058, -8.08603, -8.1008, -8.05964, -8.09263, -8.03134, -8.01245, -8.04282, -8.04677, -8.03661, -8.0361, -8.01393, -8.03993, -8.04992, -8.00284, -8.023, -8.04776, -8.01076, -8.02369, -8.04803, -8.04389, -8.02561, -8.06556, -8.05153, -8.06927, -8.09828, -8.04904, -8.04665, -8.07846, -8.08126, -8.06471, -8.07746, -8.06924, -8.06898, -8.10084, -8.09168, -8.08115, -8.05335, -8.11334, -8.10348, -8.12311, -8.05894, -8.11119, -8.13315, -8.10909, -8.10413, -8.12074, -8.05254, -8.07597, -8.07138, -8.04774, -8.03797, -8.03515, -8.05518, -8.03872, -8.04257, -8.05183, -8.02965, -8.06015, -8.06646, -8.04775, -8.07133, -8.08556, -8.14601, -8.07829, -8.10066, -8.09927,-8.09366, -8.10981, -8.10192, -8.09973, -8.07898, -8.11445, -8.11818, -8.08041, -8.11404, -8.11151, 
-8.12287],
            "unique_ids": 1000000001,
            "end_logits": [1.7971, -7.69476, -7.69979, -8.24493, -5.12942, -7.20498, -6.29411, -6.89877, -5.51851, -8.22334, -4.99419, -7.64681, -6.64564, -7.02387, -5.21917, -7.2014, -6.298, -1.74696, -4.97182, -2.79664, -5.43804, -7.81159, -7.6644, -7.69437, -7.1555, -7.42649, -5.77371, -7.44204, -7.99793, -7.89191, -7.51014, -7.96775, -7.11213, -6.68531, -7.50293, -6.598, -7.39869, -8.23903, -6.83253, -7.77239, -8.47055, -7.6694, -6.71383, -7.36133, -7.18606, -7.37836, -7.05179, -5.75529, -7.84509, -7.53321, -7.98055, -4.29565, -3.2521, -7.35999, -5.06275, -8.02504, -7.87654, -6.49647, -5.72829, -6.81087, -7.5094, -3.43106, -7.41876, -8.24059, -8.23543, -7.26439, -7.75959, -6.60055, -4.67576, -2.44382, -0.685445, -4.85748, -5.33752, -7.54303, -7.29981, -5.9717, -4.58172, -4.59226, -7.92281, -8.40496, -8.04877, -6.95516, -5.40954, -2.68969, -7.69513, -7.19537, -7.3715, -5.55238, -6.60155, -7.90538, -6.07703, -7.55561, -4.5476, -7.7187, -8.13389, -7.94818, -7.15718, -7.68566, -7.34945, -7.54567, -7.14372, -7.16989, -5.5085, -5.0984, -5.9343, -5.93809, -7.93428, -7.94276, -7.98788, -7.98569, -7.98563, -8.07888, -8.02819, -8.01415, -8.01683, -7.98067, -7.99473, -8.00823, -7.9599, -7.97134, -7.95014, -7.93278, -8.03036, -7.98562, -7.96053, -8.02014,-7.98148, -8.05193, -8.07696, -8.03016, -8.01335, -8.03546, -8.02516, -8.07061, -8.03722, -8.01404, -8.07952, -8.04804, -8.01867, -8.05968, -8.04412, -8.02413, -8.02587, -8.0363, -8.00355, -8.01281, -8.00063, -7.95522, -8.01383, -8.01509, -7.97682, -7.96553, -8.00064, -7.98441, -7.9882, -7.99824, -7.96236, -7.97507, -8.00692, -8.0433, -7.97083, -7.9887, -7.96293, -8.04843, -7.95422, -7.90545, -7.96051, -7.9667, -7.94075, -8.02239, -8.00472, -7.9908, -8.02842, -8.03611, -8.04443, -8.01002, -8.02953, -8.03085, -8.01437, -8.0454, -8.00132, -8.00447, -8.03271, -8.00425, -7.98719, -7.89763, -8.00812, -7.98091, -7.97825, -7.99747, -7.97061, -7.98386, -7.97647, -8.02069, -7.95947, -7.95605, -7.99679, -7.95132, -7.95325, -7.94678]
        }
    ]
}

Can you try that and check if it works?
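
If it helps to run the same request against both containers, here is a minimal sketch, assuming Python 3 is available on the client. The helper names `build_request`, `summarize_response`, and `predict` are hypothetical. Only the `/v1/models/bert:predict` path and the `instances` body shape are taken from the curl command above.

```python
import json
import urllib.error
import urllib.request


def build_request(instances):
    """Serialize instances into the row-format body used by the curl call above."""
    return json.dumps({"instances": instances}).encode("utf-8")


def summarize_response(body):
    """Return ("error", message) or ("ok", prediction_count) for a decoded JSON body."""
    if "error" in body:
        return ("error", body["error"])
    return ("ok", len(body.get("predictions", [])))


def predict(url, instances, timeout=30):
    """POST instances to a model server REST endpoint and summarize the reply.

    Assumes the server answers with a JSON body on both success and failure.
    """
    request = urllib.request.Request(
        url,
        data=build_request(instances),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return summarize_response(json.load(response))
    except urllib.error.HTTPError as err:
        # A non-2xx reply is raised as HTTPError; its body can still be read.
        return summarize_response(json.load(err))
```

Calling `predict("http://localhost:8501/v1/models/bert:predict", instances)` once per container, with `instances` loaded from the payload above, should return `("ok", 2)` on a working server and `("error", ...)` with the uninitialized-value message on a failing one.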

@miguelaeh miguelaeh removed the on-hold label Apr 15, 2019

@stale

commented Apr 30, 2019

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@stale stale bot added the stale label Apr 30, 2019

@rummens

Author

commented May 2, 2019

Sorry for the late response, I was traveling a lot these past few weeks and only now had a chance to test this.

The weird thing is that my YAML file looks fairly similar (except for the init pod that downloads the model, since I copy it manually to the storage).
The only explanation I have now is that this specific version of OpenShift causes the problem. I am running this on our production cluster, which has a very old version of OpenShift installed (see below). Could this be the case?

OpenShift Master: v3.4.1.44.17
Kubernetes Master: v1.4.0+776c994

For comparison, this is the YAML of the pod (created by a deployment config etc.):

apiVersion: v1
kind: Pod
metadata:
  name: tf-serving-redhat-18-lypc9
  generateName: tf-serving-redhat-18-
  namespace: tfserving
  selfLink: /api/v1/namespaces/tfserving/pods/tf-serving-redhat-18-lypc9
  uid: c0bfd853-4f03-11e9-b28f-005056a1e934
  resourceVersion: '139396143'
  creationTimestamp: '2019-03-25T13:41:54Z'
  labels:
    app: tf-serving-redhat
    deployment: tf-serving-redhat-18
    deploymentconfig: tf-serving-redhat
  annotations:
    kubernetes.io/created-by: >
      {"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"tfserving","name":"tf-serving-redhat-18","uid":"bd73656a-4f02-11e9-b28f-005056a1e934","apiVersion":"v1","resourceVersion":"137192036"}}
    openshift.io/deployment-config.latest-version: '18'
    openshift.io/deployment-config.name: tf-serving-redhat
    openshift.io/deployment.name: tf-serving-redhat-18
    openshift.io/generated-by: OpenShiftNewApp
    openshift.io/scc: restricted
spec:
  volumes:
    - name: tf-serving-redhat
      persistentVolumeClaim:
        claimName: tf-serving-redhat-pv
    - name: default-token-anf5r
      secret:
        secretName: default-token-anf5r
        defaultMode: 420
  containers:
    - name: tf-serving-redhat
      image: >-
        bitnami/tensorflow-serving@sha256:840e4d6ceeef2a409746ea0481af18c81bdbc7bfd5e325f9b3d789566c25bccf
      ports:
        - containerPort: 8500
          protocol: TCP
        - containerPort: 8501
          protocol: TCP
      resources: {}
      volumeMounts:
        - name: tf-serving-redhat
          mountPath: /bitnami
        - name: default-token-anf5r
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      terminationMessagePath: /dev/termination-log
      imagePullPolicy: Always
      securityContext:
        capabilities:
          drop:
            - KILL
            - MKNOD
            - SETGID
            - SETUID
            - SYS_CHROOT
        privileged: false
        seLinuxOptions:
          level: 's0:c21,c5'
        runAsUser: 1000430000
  restartPolicy: Always
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  nodeSelector:
    type: dev
  serviceAccountName: default
  serviceAccount: default
  nodeName: opsh-wr1.aircloud.common.airbusds.corp
  securityContext:
    seLinuxOptions:
      level: 's0:c21,c5'
    fsGroup: 1000430000
  imagePullSecrets:
    - name: default-dockercfg-vwwrn
status:
  phase: Running
  conditions:
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2019-03-25T13:41:54Z'
    - type: Ready
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2019-03-25T13:42:01Z'
    - type: PodScheduled
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2019-03-25T13:41:54Z'
  hostIP: 10.116.217.245
  podIP: 172.30.10.243
  startTime: '2019-03-25T13:41:54Z'
  containerStatuses:
    - name: tf-serving-redhat
      state:
        running:
          startedAt: '2019-03-25T13:42:01Z'
      lastState: {}
      ready: true
      restartCount: 0
      image: >-
        bitnami/tensorflow-serving@sha256:840e4d6ceeef2a409746ea0481af18c81bdbc7bfd5e325f9b3d789566c25bccf
      imageID: >-
        docker-pullable://docker.io/bitnami/tensorflow-serving@sha256:840e4d6ceeef2a409746ea0481af18c81bdbc7bfd5e325f9b3d789566c25bccf
      containerID: >-
        docker://3d3c866f8e6b152d8625d90b86005329c65e5737a3ca085b12d86831b94a9406

@stale stale bot removed the stale label May 2, 2019

@miguelaeh

commented May 3, 2019

Could this be the case?

Not sure. Could you try it on a more recent cluster? I tried it on both Minishift and OpenShift Online, and it works following your steps.

I copy it manually to the storage

Have you checked that the data is mounted correctly? If not, try opening a bash shell in the pod and searching for the model to make sure this is not the problem.
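
A rough sketch of such a check, assuming Python 3 is available wherever the volume is mounted (it may not be inside the serving container itself). The function name `check_saved_model` is hypothetical. The layout it looks for is the standard SavedModel one: a numeric version directory containing `saved_model.pb` and a `variables/` directory with an index file and at least one data shard.

```python
import os


def check_saved_model(model_dir):
    """Return a list of layout problems found under model_dir; empty if none."""
    problems = []
    # The model server expects numeric version subdirectories such as "1".
    versions = [
        name for name in os.listdir(model_dir)
        if name.isdigit() and os.path.isdir(os.path.join(model_dir, name))
    ]
    if not versions:
        return ["no numeric version directory in " + model_dir]
    for version in sorted(versions, key=int):
        root = os.path.join(model_dir, version)
        variables = os.path.join(root, "variables")
        graph_files = ("saved_model.pb", "saved_model.pbtxt")
        if not any(os.path.isfile(os.path.join(root, g)) for g in graph_files):
            problems.append(version + ": saved_model.pb missing")
        if not os.path.isfile(os.path.join(variables, "variables.index")):
            problems.append(version + ": variables/variables.index missing")
        shards = os.listdir(variables) if os.path.isdir(variables) else []
        if not any(s.startswith("variables.data-") for s in shards):
            problems.append(version + ": no variables/variables.data-* shard")
    return problems
```

Pointing it at the model directory under the `/bitnami` mount (the exact subdirectory depends on your setup) and getting an empty list back only shows that the files are present. It does not prove that the variables match the graph.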

@stale

commented May 18, 2019

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@stale stale bot added the stale label May 18, 2019

@rummens

Author

commented May 20, 2019

I tried to get access to a more recent cluster, but unfortunately we don't have anything more recent deployed :-/ But since you were able to deploy this on OpenShift Online, I have to assume that this has something to do with our old cluster. I can confirm that the model is mounted correctly.

I will close the issue for now. In case I have new information, I will reopen it.

Thanks for your help
Marcel

@rummens rummens closed this May 20, 2019
