Server not working on model which works in Docker Hub Container; inconsistent with Google Release #10
I want to deploy the Bitnami TensorFlow Serving image from the Red Hat Registry.
First of all, there is no way to tell which tag corresponds to which Google release on Docker Hub. Which is the current latest stable release from Google? It seems that you only publish nightly builds, since a new image gets pushed every day?
In addition, the model server on multiple Red Hat images seems broken. I have the same model deployed in the Bitnami container and in the official Google one from Docker Hub. The model works fine in the Google container but cannot be queried in the Bitnami one. Although the Bitnami container picks up the model and serves it, querying it returns an init error (which layer fails to initialize depends on the container version):
Why is that? Is there a way to figure out which image on the Red Hat Registry is based on the current stable/latest release from Docker Hub?
Steps to reproduce the issue:
Describe the results you received:
Describe the results you expected:
Normal Output of model, no error.
Additional information you deem important (e.g. issue happens only occasionally):
I tried different versions of the Red Hat image (e.g. 1.12 and 1.13). The kind of error is the same, but the layer in which it happens changes. For example, instead of an error in layer 7, it appears in layer 1.
Running in OpenShift 3.4.
Model Server version inside of both images:
Thanks for letting us know. When you say the image from Google, do you mean this image? https://hub.docker.com/r/tensorflow/tensorflow
What we do is download the TensorFlow source code, compile it, and build our containers from it. We update the containers every day so that they have the latest system packages (to avoid CVE issues as much as possible).
Regarding the init errors, I would like to take a look, but I'm no TensorFlow expert. Could you give me a set of commands that I can use to reproduce the issue?
Yes, I use that image.
I have created a repo with the required model files: https://drive.google.com/open?id=1cjRSbzMqWKVpF9ERu77NHEldxvlplp9K. Deploy the model as usual and then make a POST to this endpoint with the following data:
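The endpoint and request data from this comment are not preserved in the thread. As a rough illustration only, the sketch below builds a predict request for the TensorFlow Serving REST API. It assumes the REST API is enabled and reachable on port 8501 (the port used in TensorFlow Serving's documentation examples). The host, the model name `my_model`, and the input values are hypothetical placeholders, not the values used in this issue.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build (but do not send) a TensorFlow Serving REST predict request.

    host, model_name, instances and port are placeholders chosen for
    illustration; substitute the values of your own deployment.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # The REST predict API accepts a JSON body with an "instances" list.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(request, timeout=10):
    """Send the request and return the decoded JSON response.

    urlopen raises urllib.error.HTTPError on a non-2xx status; the
    server's error message is then available via the exception's read().
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send_predict_request(build_predict_request("localhost", "my_model", [[1.0, 2.0]]))` against both containers with identical input would make it easier to compare their responses side by side.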
Sure, most of the information is already on this repo's readme: Link
You can do it locally but since we are running in OpenShift, maybe you should do it too. Not sure :-)
After pulling the image do the following:
Proceed with the steps from my last post to get the error.
After that, I did a POST with the data you posted. And I get the following response:
But I get no error.
Hmm, yes and no. The results are absolutely correct; this is what the model is supposed to output. I just don't understand why you don't get the error and we do. Any idea why? Am I using the wrong image?
This is what OpenShift tells me about the image:
Hi @rummens,
And I got this output:
Can you try that and check if it works?
Sorry for the late response, I was traveling a lot these past few weeks and only now had a chance to test this.
The weird thing is that my YAML file looks fairly similar (except for the init pod which downloads the files, since I copy them manually to the storage).
OpenShift Master: v126.96.36.199.17
For comparison, this is the YAML of the pod (created by a deployment config etc.):
Not sure. Is it possible for you to try on a more recent cluster? I tried it using both Minishift and OpenShift Online, and it works following your steps.
Have you checked that the data is mounted correctly? If not, try opening a bash shell in the pod and searching for the model to be sure this is not the problem.
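One way to make that check repeatable is a small script run inside the pod. This is only a sketch: it relies on TensorFlow Serving's convention that a model base path contains numeric version subdirectories, each holding a `saved_model.pb` (or `saved_model.pbtxt`) file. The function name and the path in the usage note are hypothetical, and Python may not be available in every serving image.

```python
from pathlib import Path


def find_servable_versions(model_base_path):
    """Return the sorted numeric versions that contain a SavedModel file.

    An empty list suggests the volume is not mounted where expected or
    the directory layout is not <base>/<version>/saved_model.pb.
    """
    base = Path(model_base_path)
    if not base.is_dir():
        return []
    versions = []
    for child in base.iterdir():
        # Version directories are expected to have purely numeric names.
        if not (child.is_dir() and child.name.isascii() and child.name.isdigit()):
            continue
        has_model = (child / "saved_model.pb").is_file() or (
            child / "saved_model.pbtxt"
        ).is_file()
        if has_model:
            versions.append(int(child.name))
    return sorted(versions)
```

For example, `find_servable_versions("/path/to/mounted/model")` returning `[]` would point to a mount or layout problem rather than a problem with the model itself.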
I tried to get access to a more recent cluster, but unfortunately we don't have anything more recent deployed :-/ But since you were able to deploy this on OpenShift Online, I have to assume that this has something to do with our old cluster. I can confirm that the model is mounted correctly.
I will close the issue for now. In case I have new information, I will reopen it.
Thanks for your help