
Server connection error #387

Closed
TgithubJ opened this Issue Mar 30, 2017 · 15 comments


TgithubJ commented Mar 30, 2017

Hi, I tried to run the TensorFlow Serving MNIST example (server and client code) on two different company servers.

On the server side, it ran successfully.

[[ Server side ]]

~~:~/serving$ bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=mnist --model_base_path=/tmp/mnist_model/
2017-03-30 17:59:26.700366: I tensorflow_serving/model_servers/main.cc:152] Building single TensorFlow model file config: model_name: mnist model_base_path: /tmp/mnist_model/ model_version_policy: 0
2017-03-30 17:59:26.702443: I tensorflow_serving/model_servers/server_core.cc:338] Adding/updating models.
2017-03-30 17:59:26.702490: I tensorflow_serving/model_servers/server_core.cc:384] (Re-)adding model: mnist
2017-03-30 17:59:26.806245: I tensorflow_serving/core/basic_manager.cc:698] Successfully reserved resources to load servable {name: mnist version: 1}
2017-03-30 17:59:26.806524: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: mnist version: 1}
2017-03-30 17:59:26.806581: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: mnist version: 1}
2017-03-30 17:59:26.806911: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:360] Attempting to load native SavedModelBundle in bundle-shim from: /tmp/mnist_model/1
2017-03-30 17:59:26.807024: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:195] Loading SavedModel from: /tmp/mnist_model/1
2017-03-30 17:59:26.818125: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-03-30 17:59:26.818171: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-03-30 17:59:26.919879: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:114] Restoring SavedModel bundle.
2017-03-30 17:59:26.940303: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:149] Running LegacyInitOp on SavedModel bundle.
2017-03-30 17:59:26.952505: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:239] Loading SavedModel: success. Took 145488 microseconds.
2017-03-30 17:59:26.952705: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: mnist version: 1}
2017-03-30 17:59:27.008023: I tensorflow_serving/model_servers/main.cc:272] Running ModelServer at 0.0.0.0:9000 ...

However, I couldn't reach the server from the client side.

[[ Client side ]]

~~:~/serving$ bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=[Corporate IP]:9000
Extracting /tmp/train-images-idx3-ubyte.gz
Extracting /tmp/train-labels-idx1-ubyte.gz
Extracting /tmp/t10k-images-idx3-ubyte.gz
Extracting /tmp/t10k-labels-idx1-ubyte.gz
AbortionError(code=StatusCode.UNAVAILABLE, details="Connect Failed")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
AbortionError(code=StatusCode.UNAVAILABLE, details="Connect Failed")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
AbortionError(code=StatusCode.UNAVAILABLE, details="Connect Failed")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
AbortionError(code=StatusCode.UNAVAILABLE, details="Connect Failed")

By the way, the two machines could ping each other.

~~:~/serving$ ping [Corporate IP1]
PING 56(84) bytes of data.
64 bytes from 10.125.4.23: icmp_seq=1 ttl=63 time=1.52 ms
64 bytes from 10.125.4.23: icmp_seq=2 ttl=63 time=1.53 ms
64 bytes from 10.125.4.23: icmp_seq=3 ttl=63 time=1.54 ms

~~:~$ ping [Corporate IP2]
PING 56(84) bytes of data.
64 bytes from 10.125.4.21: icmp_seq=1 ttl=63 time=1.80 ms
64 bytes from 10.125.4.21: icmp_seq=2 ttl=63 time=3.01 ms
64 bytes from 10.125.4.21: icmp_seq=3 ttl=63 time=2.17 ms
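A successful ping only shows ICMP reachability; it does not prove that the gRPC port (9000 here) accepts TCP connections, which is what the client actually needs. A minimal TCP-level check, sketched in Python (the host and port values are illustrative, not taken from this thread):

```python
import socket

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Check the model server port itself, not just ICMP:
# tcp_reachable("10.125.4.23", 9000)
```

If this returns False while ping works, a firewall or proxy between the machines is blocking the port.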

@TgithubJ TgithubJ closed this Mar 31, 2017

TgithubJ commented Mar 31, 2017

It was a company network issue.

SgtJaehoonChoi commented Apr 8, 2017

I have the same problem. Can you explain how you solved it in more detail, please?

tspthomas commented Apr 11, 2017

Same issue here. Could you provide more details on how you solved the problem?

TgithubJ commented Apr 11, 2017

Hi @SgtJaehoonChoi @tspthomas,
I found out that the current version of Bazel does not support fine-grained proxy control.
Our IT team hasn't figured out a solution for this yet, so I am using a public cloud service instead, and it works.

If it is not a security-related problem, you should also check that the serving side is working properly.

benelot commented Apr 13, 2017

Just had this issue here, but in a localhost-server to localhost-client situation. The problem was that my user was not allowed to use the network interface. On the client side, check whether sudo fixes your problem; if it does, configure your user permissions accordingly.

tspthomas commented Apr 13, 2017

Hi @SgtJaehoonChoi, @TgithubJ. We've found a workaround for this proxy issue, but we're not sure what the root cause is yet. We ran the server normally, without any changes to its environment. On the client, however, we removed the proxy settings (with the unset command, removing the http_proxy and https_proxy environment variables), and it then connected to the server successfully. We also tried setting no_proxy for some addresses, but that didn't work. Proxies can be a nightmare in corporate environments :)

@benelot, we'll try to check whether that helps. Thanks for the information.
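For anyone hitting the same behavior: the workaround above suggests the client stack picks up http_proxy/https_proxy from the environment, so clearing them for the client process alone should be equivalent to running unset in the shell. A small sketch (this helper is mine, not part of the official mnist_client):

```python
import os

PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")

def clear_proxy_env():
    """Remove proxy variables from this process's environment so the
    client connects directly instead of through the corporate proxy.
    Returns the removed values so they can be restored later."""
    removed = {}
    for var in PROXY_VARS:
        if var in os.environ:
            removed[var] = os.environ.pop(var)
    return removed

# Call clear_proxy_env() before the client creates its gRPC channel/stub.
```

Restoring the saved values afterwards keeps the rest of the session working through the proxy as before.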

stevenbinhu21 commented Apr 17, 2017

Can anyone please help? I am getting the same exception, and unsetting the proxies did not make the error go away. Is there a solution yet?

stevenbinhu21 commented Apr 17, 2017

Ahh, never mind. I solved it by removing the proxy settings from ~/.bashrc and running the client and server in separate terminal windows.

ravindranathakila commented Jun 6, 2017

This happens when the TensorFlow model server is not reachable from the client.

JordanDalton commented Jul 8, 2017

@benelot can you provide the steps to achieve this?

Just had this issue here, but in a localhost-server to localhost-client situation. The problem was that my user was not allowed to use the network interface. On the client side, check whether sudo fixes your problem; if it does, configure your user permissions accordingly.

ravindranathakila commented Jul 8, 2017

FWIW telnet :

benelot commented Jul 10, 2017

@JordanDalton: From what I understand, this is not a TensorFlow problem but one you can run into when using TensorFlow on your Linux machine. If Linux permissions do not allow your user to use the network interface, it will not be able to communicate over the network. As a simple test, check whether sudo fixes the problem. The long-term fix is, of course, updating your user permissions so the network interface can be used without sudo.

JordanDalton commented Jul 10, 2017

What I ended up doing was running docker inspect on the container I was using as the server. From the returned JSON object, I took the IPAddress value and passed it as an argument when calling the client:

root@b81da3a4a6d9:/serving# bazel-bin/tensorflow_serving/example/mnist_client --num_tests=1000 --server=172.17.0.2:9010
Extracting /tmp/train-images-idx3-ubyte.gz
Extracting /tmp/train-labels-idx1-ubyte.gz
Extracting /tmp/t10k-images-idx3-ubyte.gz
Extracting /tmp/t10k-labels-idx1-ubyte.gz
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Inference error rate: 10.4%
root@b81da3a4a6d9:/serving# 
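docker inspect prints a JSON array with the container address under NetworkSettings.IPAddress, so the lookup above can be scripted; the same value is available in one line with docker inspect -f '{{.NetworkSettings.IPAddress}}' &lt;container&gt;. A small sketch (helper names are mine):

```python
import json
import subprocess

def parse_container_ip(inspect_json):
    """Extract the container IP from `docker inspect` JSON output."""
    return json.loads(inspect_json)[0]["NetworkSettings"]["IPAddress"]

def container_ip(name):
    """Ask the Docker daemon for a running container's IP address."""
    out = subprocess.check_output(["docker", "inspect", name])
    return parse_container_ip(out)
```

Note that this bridge-network IP only works from the Docker host itself; for access from other machines you still need a published port (see the -p discussion below).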

CLIsVeryOK commented Jun 22, 2018

You could refer to #591. I met the same problem because I did not set the -p9000:9000 parameter, which left no published network port for my Docker container.
Step 1: build the Docker image
sudo docker build --pull -t $USER/tensorflow-serving-devel -f tensorflow_serving/tools/docker/Dockerfile.devel .
Step 2: run the container, publishing the port
sudo docker run --name=tensorflow_container -p9000:9000 -it $USER/tensorflow-serving-devel
Then you can configure your own model.

SefaZeng commented Mar 4, 2019

I had the same issue. Likewise, with the server and the client on company machines, the request returned Connect Failed, even though I had made it work on Windows before.
Is the company network the cause of this error?
