Unable to detect tensorflow model version during deployment #2742

@mapattacker

Description

Expected behavior
The endpoint deploys successfully.

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.59.5
  • Framework name: TensorFlow
  • Framework version: 2.4.0
  • Python version: 3.7
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context

I have been unable to deploy a TensorFlow model and am not sure how to debug the failure. This is the deployment code:

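import sagemaker
from sagemaker.tensorflow import TensorFlowModel

# IAM role that SageMaker assumes when hosting the model
role = sagemaker.get_execution_role()

# Model definition: inference script plus the packaged SavedModel in S3
model = TensorFlowModel(entry_point='inference.py',
                        framework_version="2.4",
                        model_data="s3://xxxx/output/model.tar.gz",
                        role=role)

# Deploy the model to a real-time endpoint
predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.t2.medium')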

The contents of model.tar.gz have the following structure:

└── 0000000
    ├── assets
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
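
TensorFlow Serving loads models from numeric version subdirectories under the base path, so the name of the top-level directory in the archive matters. The following is a minimal sketch for listing the top-level directories of a local copy of model.tar.gz and flagging names that are not numeric or that evaluate to zero. The helper names version_dirs and check_version_dir are hypothetical, and treating an all-zero name such as 0000000 as suspicious is an assumption, not something the logs confirm.

```python
import tarfile


def version_dirs(tar_path):
    """Return the sorted top-level directory names inside a .tar.gz archive."""
    names = set()
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Drop empty and "." path components (e.g. "./0000000/saved_model.pb")
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if not parts:
                continue
            # Count a name as a directory if it is one, or if it contains entries
            if member.isdir() or len(parts) > 1:
                names.add(parts[0])
    return sorted(names)


def check_version_dir(name):
    """Classify a directory name (hypothetical rule: a positive integer is expected)."""
    if not (name.isascii() and name.isdigit()):
        return "not numeric"
    if int(name) == 0:
        # Assumption: an all-zero name may not yield a usable version number
        return "all zeros"
    return "ok"
```

For example, after downloading the archive, calling version_dirs("model.tar.gz") and passing each returned name to check_version_dir would report "all zeros" for the directory shown above.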

Screenshots or logs

It appears that model_config_list fails to capture the model version: the versions key has no value, which results in the error Error parsing text-format tensorflow.serving.ModelServerConfig: 9:7: Expected integer, got: }. Does anyone know how I can debug this?
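
One way to reason about the empty versions value is sketched here. It assumes, without confirmation from the logs, that the serving container derives the version number from the directory name by stripping leading zeros; under that assumption a name made only of zeros becomes an empty string. The functions normalize_version and render_config are hypothetical stand-ins, not the container's actual code. The rendered text mirrors the config printed in the log, where the closing brace after the empty versions key sits on line 9, column 7, the position the parser reports.

```python
def normalize_version(dirname):
    """Strip leading zeros from a version directory name.

    Assumption: this mimics what the serving container might do;
    it is not taken from the container source.
    """
    return dirname.lstrip("0")


def render_config(name, base_path, versions):
    """Render a text-format model config shaped like the one in the log."""
    version_lines = "".join(f"        versions: {v}\n" for v in versions)
    return (
        "model_config_list: {\n"
        "  config: {\n"
        f"    name: '{name}'\n"
        f"    base_path: '{base_path}'\n"
        "    model_platform: 'tensorflow'\n"
        "    model_version_policy: {\n"
        "      specific: {\n"
        f"{version_lines}"
        "      }\n"
        "    }\n"
        "  }\n"
        "}\n"
    )


# An all-zero directory name leaves nothing after stripping
broken = render_config("model", "/opt/ml/model", [normalize_version("0000000")])
# A name with a non-zero digit keeps a usable integer
working = render_config("model", "/opt/ml/model", [normalize_version("0000001")])
```

If that assumption holds, renaming the directory to a non-zero number such as 1 before creating model.tar.gz would give the versions key an integer value.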

message
INFO:__main__:PYTHON SERVICE: True
INFO:__main__:starting services
INFO:__main__:using default model name: model
INFO:__main__:tensorflow serving model config: 
model_config_list: {
  config: {
    name: 'model'
    base_path: '/opt/ml/model'
    model_platform: 'tensorflow'
    model_version_policy: {
      specific: {
        versions: 
      }
    }
  }
}
INFO:__main__:tensorflow version info:
2021-11-01 09:06:35.558496: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
"2021-11-01 09:06:35.560355: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped."
TensorFlow ModelServer: 2.4.0-rc4+dev.sha.no_git
TensorFlow Library: 2.4.1
INFO:__main__:tensorflow serving command: tensorflow_model_server --port=26000 --rest_api_port=26001 --model_config_file=/sagemaker/model-config.cfg --max_num_load_retries=0    
INFO:__main__:started tensorflow serving (pid: 17)
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f41b10>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
2021-11-01 09:06:36.089481: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
"2021-11-01 09:06:36.089889: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped."
"[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/text_format.cc:324] Error parsing text-format tensorflow.serving.ModelServerConfig: 9:7: Expected integer, got: }"
Failed to start server. Error: Invalid argument: Invalid protobuf file: '/sagemaker/model-config.cfg'
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5a1d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5a790>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5ad50>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5f350>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5f910>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5fed0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f694d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69a90>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
WARNING:tfs_utils:model: http://localhost:26001/v1/models/model is not available yet 
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f71e90>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18ce9490>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18ce9a50>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cef050>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cef610>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cefbd0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69e10>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69390>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69150>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
WARNING:tfs_utils:model: http://localhost:26001/v1/models/model is not available yet 
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5af90>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5a310>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f418d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f417d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda191b71d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda19a4e090>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cf7190>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cf7750>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cf7d10>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
WARNING:tfs_utils:model: http://localhost:26001/v1/models/model is not available yet 
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69210>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69fd0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cefb10>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cfe650>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cfed10>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d07310>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d078d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d07ed0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d0d510>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
WARNING:tfs_utils:model: http://localhost:26001/v1/models/model is not available yet 
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d0b990>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d0da90>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d0d310>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d07e90>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d071d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f712d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69e50>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f699d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69610>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
WARNING:tfs_utils:model: http://localhost:26001/v1/models/model is not available yet 
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18cfe710>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d0ba90>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d1e0d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d1e6d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18d1ecd0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f5fdd0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fda18f69b50>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 170, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 234, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 972, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 200, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fda18f69750>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/sagemaker/serve.py", line 444, in <module>
    ServiceManager().start()
  File "/sagemaker/serve.py", line 426, in start
    self._wait_for_tfs()
  File "/sagemaker/serve.py", line 338, in _wait_for_tfs
    self._tfs_default_model_name, self._tfs_wait_time_seconds)
  File "/sagemaker/tfs_utils.py", line 238, in wait_for_model
    response = session.get(tfs_url)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 555, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 796, in urlopen
    **response_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 796, in urlopen
    **response_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 796, in urlopen
    **response_kw
  [Previous line repeated 4 more times]
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 758, in urlopen
    retries.sleep()
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 414, in sleep
    self._sleep_backoff()
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 398, in _sleep_backoff
    time.sleep(backoff)
  File "/sagemaker/multi_model_utils.py", line 38, in _raise_timeout_error
    raise Exception(408, "Timed out after {} seconds".format(seconds))
Exception: (408, 'Timed out after 300 seconds')
INFO:__main__:PYTHON SERVICE: True
INFO:__main__:starting services
INFO:__main__:using default model name: model
INFO:__main__:tensorflow serving model config: 
model_config_list: {
  config: {
    name: 'model'
    base_path: '/opt/ml/model'
    model_platform: 'tensorflow'
    model_version_policy: {
      specific: {
        versions: 
      }
    }
  }
}
INFO:__main__:tensorflow version info:
2021-11-01 09:11:55.158393: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
"2021-11-01 09:11:55.161351: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped."
TensorFlow ModelServer: 2.4.0-rc4+dev.sha.no_git
TensorFlow Library: 2.4.1
INFO:__main__:tensorflow serving command: tensorflow_model_server --port=26000 --rest_api_port=26001 --model_config_file=/sagemaker/model-config.cfg --max_num_load_retries=0    
INFO:__main__:started tensorflow serving (pid: 17)
INFO:tfs_utils:Trying to connect with model server: http://localhost:26001/v1/models/model
"WARNING:urllib3.connectionpool:Retrying (Retry(total=8, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf2e71610>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
2021-11-01 09:11:55.698345: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
"2021-11-01 09:11:55.698503: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped."
"[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/text_format.cc:324] Error parsing text-format tensorflow.serving.ModelServerConfig: 9:7: Expected integer, got: }"
Failed to start server. Error: Invalid argument: Invalid protobuf file: '/sagemaker/model-config.cfg'
"WARNING:urllib3.connectionpool:Retrying (Retry(total=7, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf0963050>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=6, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf0963610>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf0963bd0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf09681d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf0968790>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf0968d50>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
"WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fccf0971350>: Failed to establish a new connection: [Errno 111] Connection refused')': /v1/models/model"
