Load model failed - error: Worker died #3104

Open
geraldstanje opened this issue Apr 23, 2024 · 6 comments

geraldstanje commented Apr 23, 2024

🐛 Describe the bug

When TorchServe 0.10.0 starts inside the Docker container with the initial model policy_vs_doc_model.tar.gz, all four backend workers die while loading the model and the load fails with: Load model failed: policy_vs_doc_model_tar_gz, error: Worker died. The worker tracebacks in the logs below show the underlying cause is a handler import failure: ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'.
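
For context, the hash in that error matches the temp directory TorchServe extracts the archive into (/home/model-server/tmp/models/124636bff1364f2bbc7f71a8667110af in the logs), which suggests the handler entry recorded in the archive manifest does not resolve to a built-in handler name or a packaged .py file. The following is only a hypothetical sketch, not the exact command used to build this archive, and every file name in it is a placeholder; it shows the general shape of inspecting and rebuilding a .tar.gz model archive with torch-model-archiver:

$ tar -tzf policy_vs_doc_model.tar.gz   # check that MAR-INF/MANIFEST.json and the handler .py file are packaged
$ torch-model-archiver \
    --model-name policy_vs_doc_model \
    --version 1.0 \
    --serialized-file model.pt \
    --handler custom_handler.py \      # a packaged .py file, or a built-in handler name such as text_classifier
    --extra-files index_to_name.json \
    --export-path model-store \
    --archive-format tgz               # writes policy_vs_doc_model.tar.gz into model-store/
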

Error logs

$ docker run --rm -it --gpus all torchserve-setfit:latest
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2024-04-23T05:00:33,943 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties
2024-04-23T05:00:33,945 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2024-04-23T05:00:33,999 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
2024-04-23T05:00:34,126 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.10.0
TS Home: /home/venv/lib/python3.9/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Metrics config path: /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
Number of GPUs: 4
Number of CPUs: 48
Max heap size: 30688 M
Python executable: /home/venv/bin/python
Config file: /home/model-server/conf/config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/model-server/model-store
Initial Models: policy_vs_doc_model.tar.gz
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 32
Netty client threads: 0
Default workers per model: 4
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false
Workflow Store: /home/model-server/model-store
CPP log config: N/A
Model config: N/A
System metrics command: default
2024-04-23T05:00:34,133 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2024-04-23T05:00:34,150 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: policy_vs_doc_model.tar.gz
2024-04-23T05:00:35,076 [WARN ] main org.pytorch.serve.archive.model.ModelArchive - Model archive version is not defined. Please upgrade to torch-model-archiver 0.2.0 or higher
2024-04-23T05:00:35,077 [WARN ] main org.pytorch.serve.archive.model.ModelArchive - Model archive createdOn is not defined. Please upgrade to torch-model-archiver 0.2.0 or higher
2024-04-23T05:00:35,079 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model policy_vs_doc_model_tar_gz
2024-04-23T05:00:35,079 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model policy_vs_doc_model_tar_gz
2024-04-23T05:00:35,079 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model policy_vs_doc_model_tar_gz loaded.
2024-04-23T05:00:35,080 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: policy_vs_doc_model_tar_gz, count: 4
2024-04-23T05:00:35,087 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9000, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:35,087 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9003, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:35,087 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9002, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:35,088 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2024-04-23T05:00:35,087 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9001, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:35,139 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2024-04-23T05:00:35,139 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2024-04-23T05:00:35,140 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2024-04-23T05:00:35,141 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2024-04-23T05:00:35,141 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2024-04-23T05:00:35,386 [WARN ] pool-3-thread-1 org.pytorch.serve.metrics.MetricCollector - worker pid is not available yet.
2024-04-23T05:00:36,395 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9003, pid=41
2024-04-23T05:00:36,396 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9003
2024-04-23T05:00:36,397 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9002, pid=43
2024-04-23T05:00:36,398 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9002
2024-04-23T05:00:36,399 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9000, pid=42
2024-04-23T05:00:36,400 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000
2024-04-23T05:00:36,403 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:36,404 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]41
2024-04-23T05:00:36,404 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:36,404 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-policy_vs_doc_model_tar_gz_1.0 State change null -> WORKER_STARTED
2024-04-23T05:00:36,404 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:36,406 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:36,407 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]43
2024-04-23T05:00:36,407 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-policy_vs_doc_model_tar_gz_1.0 State change null -> WORKER_STARTED
2024-04-23T05:00:36,407 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9001, pid=44
2024-04-23T05:00:36,407 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:36,407 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:36,407 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9001
2024-04-23T05:00:36,408 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9002
2024-04-23T05:00:36,408 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9003
2024-04-23T05:00:36,409 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:36,409 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]42
2024-04-23T05:00:36,409 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:36,409 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-policy_vs_doc_model_tar_gz_1.0 State change null -> WORKER_STARTED
2024-04-23T05:00:36,409 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:36,409 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000
2024-04-23T05:00:36,415 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9002.
2024-04-23T05:00:36,415 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
2024-04-23T05:00:36,415 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9003.
2024-04-23T05:00:36,416 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:36,416 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]44
2024-04-23T05:00:36,416 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:36,416 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:36,416 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-policy_vs_doc_model_tar_gz_1.0 State change null -> WORKER_STARTED
2024-04-23T05:00:36,417 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9001
2024-04-23T05:00:36,418 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848436418
2024-04-23T05:00:36,418 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9001.
2024-04-23T05:00:36,418 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848436418
2024-04-23T05:00:36,418 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848436418
2024-04-23T05:00:36,418 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848436418
2024-04-23T05:00:36,421 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848436421
2024-04-23T05:00:36,421 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848436421
2024-04-23T05:00:36,421 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848436421
2024-04-23T05:00:36,421 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848436421
2024-04-23T05:00:36,450 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:36,450 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:36,451 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:36,451 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:36,451 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:36,451 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:36,451 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:36,451 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:36,451 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,452 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,452 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,452 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,452 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:36,452 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:36,452 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:36,452 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:36,452 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:36,452 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:36,452 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:36,452 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:36,452 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:36,453 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:36,453 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:36,453 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:36,453 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:36,453 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:36,453 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:36,453 [INFO ] epollEventLoopGroup-5-4 org.pytorch.serve.wlm.WorkerThread - 9001 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:36,453 [INFO ] epollEventLoopGroup-5-2 org.pytorch.serve.wlm.WorkerThread - 9002 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:36,453 [INFO ] epollEventLoopGroup-5-3 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:36,453 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,453 [INFO ] epollEventLoopGroup-5-1 org.pytorch.serve.wlm.WorkerThread - 9003 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:36,453 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,453 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:36,453 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,453 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,453 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:36,453 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:36,454 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,453 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:36,454 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,453 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:36,454 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,454 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,454 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,454 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,454 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,454 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,454 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,454 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,454 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,454 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,454 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,454 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,455 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,455 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,455 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,455 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,455 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,455 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,455 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:36,455 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,455 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,455 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,455 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,455 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,455 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:36,455 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,456 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:36,456 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:36,456 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:36,456 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,456 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,456 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:36,456 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,456 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:36,456 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,456 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:36,456 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:36,456 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:36,456 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:36,456 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:36,457 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:36,457 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:36,457 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:36,457 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:36,457 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:36,457 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:36,457 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:36,457 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:36,457 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:36,457 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:36,457 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-23T05:00:36,457 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:36,457 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:36,458 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:36,458 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:36,458 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-23T05:00:36,458 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-23T05:00:36,458 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-23T05:00:36,458 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-23T05:00:36,458 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-23T05:00:36,458 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-23T05:00:36,458 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-23T05:00:36,458 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-23T05:00:36,459 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 110, in load
2024-04-23T05:00:36,459 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-23T05:00:36,459 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-23T05:00:36,459 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-23T05:00:36,459 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = self._load_default_handler(handler)
2024-04-23T05:00:36,459 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 110, in load
2024-04-23T05:00:36,459 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 110, in load
2024-04-23T05:00:36,459 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 110, in load
2024-04-23T05:00:36,459 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 159, in _load_default_handler
2024-04-23T05:00:36,459 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = self._load_default_handler(handler)
2024-04-23T05:00:36,460 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = self._load_default_handler(handler)
2024-04-23T05:00:36,460 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = self._load_default_handler(handler)
2024-04-23T05:00:36,460 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
2024-04-23T05:00:36,460 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 159, in _load_default_handler
2024-04-23T05:00:36,460 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,460 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 159, in _load_default_handler
2024-04-23T05:00:36,460 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 159, in _load_default_handler
2024-04-23T05:00:36,460 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
2024-04-23T05:00:36,460 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,460 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
2024-04-23T05:00:36,461 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
2024-04-23T05:00:36,461 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,461 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,461 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,461 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:36,461 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,461 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,461 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,461 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:36,461 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,461 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,461 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,461 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:36,461 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,461 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,461 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named 'ts.torch_handler./home/model-server/tmp/models/124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,461 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:36,461 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,462 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,462 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:36,462 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named 'ts.torch_handler./home/model-server/tmp/models/124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,462 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named 'ts.torch_handler./home/model-server/tmp/models/124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,462 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named 'ts.torch_handler./home/model-server/tmp/models/124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:36,454 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:36,454 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:36,454 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:36,454 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:36,466 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:36,466 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:36,466 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:36,466 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:36,466 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:36,466 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:36,466 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:36,466 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:36,467 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1713848436467
2024-04-23T05:00:36,467 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1713848436467
2024-04-23T05:00:36,467 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1713848436467
2024-04-23T05:00:36,467 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1713848436467
2024-04-23T05:00:36,468 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9003-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,468 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9002-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,468 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9001-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,468 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9003-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,468 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9002-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,468 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,468 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9001-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,468 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,469 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9003 in 1 seconds.
2024-04-23T05:00:36,469 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.
2024-04-23T05:00:36,469 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9002 in 1 seconds.
2024-04-23T05:00:36,469 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9001 in 1 seconds.
2024-04-23T05:00:36,497 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9003-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,497 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,497 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9003-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,497 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,497 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9002-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,498 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9002-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:36,503 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9001-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:36,503 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9001-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:37,278 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:20.0|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,278 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:307.1441421508789|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,279 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:685.0881881713867|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,279 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:69.0|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,279 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,DeviceId:0|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,279 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0.0|#Level:Host,DeviceId:0|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,DeviceId:1|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0.0|#Level:Host,DeviceId:1|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,DeviceId:2|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0.0|#Level:Host,DeviceId:2|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,DeviceId:3|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0.0|#Level:Host,DeviceId:3|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,280 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0.0|#Level:Host,DeviceId:0|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,281 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0.0|#Level:Host,DeviceId:1|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,281 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0.0|#Level:Host,DeviceId:2|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,281 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0.0|#Level:Host,DeviceId:3|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,281 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:187566.625|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,281 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1883.01953125|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,281 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:1.9|#Level:Host|#hostname:1fb7bafa693a,timestamp:1713848437
2024-04-23T05:00:37,470 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9001, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:37,470 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9002, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:37,470 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9000, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:37,470 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9003, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2024-04-23T05:00:38,732 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9002, pid=357
2024-04-23T05:00:38,732 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9002
2024-04-23T05:00:38,740 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9000, pid=359
2024-04-23T05:00:38,741 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000
2024-04-23T05:00:38,741 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:38,741 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]357
2024-04-23T05:00:38,741 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:38,741 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:38,741 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2024-04-23T05:00:38,742 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9002
2024-04-23T05:00:38,743 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9002.
2024-04-23T05:00:38,743 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848438743
2024-04-23T05:00:38,743 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848438743
2024-04-23T05:00:38,749 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:38,749 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]359
2024-04-23T05:00:38,749 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:38,749 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:38,749 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2024-04-23T05:00:38,749 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000
2024-04-23T05:00:38,751 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848438751
2024-04-23T05:00:38,751 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
2024-04-23T05:00:38,751 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848438751
2024-04-23T05:00:38,762 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9003, pid=358
2024-04-23T05:00:38,763 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9003
2024-04-23T05:00:38,768 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:38,769 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]358
2024-04-23T05:00:38,769 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:38,769 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:38,769 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2024-04-23T05:00:38,769 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - s_name_part0=/home/model-server/tmp/.ts.sock, s_name_part1=9001, pid=356
2024-04-23T05:00:38,769 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9003
2024-04-23T05:00:38,769 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9001
2024-04-23T05:00:38,770 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9003.
2024-04-23T05:00:38,770 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848438770
2024-04-23T05:00:38,770 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848438770
2024-04-23T05:00:38,771 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:38,786 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Successfully loaded /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml.
2024-04-23T05:00:38,786 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - [PID]356
2024-04-23T05:00:38,786 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Torch worker started.
2024-04-23T05:00:38,786 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2024-04-23T05:00:38,786 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Python runtime: 3.9.18
2024-04-23T05:00:38,786 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:38,787 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9001
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:38,787 [INFO ] epollEventLoopGroup-5-5 org.pytorch.serve.wlm.WorkerThread - 9002 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:38,787 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:38,787 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:38,787 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:38,787 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:38,788 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9001.
2024-04-23T05:00:38,788 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD repeats 1 to backend at: 1713848438788
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,788 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:38,788 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Looping backend response at: 1713848438788
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:38,788 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:38,788 [DEBUG] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:38,788 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,788 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:38,799 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-04-23T05:00:38,799 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:38,799 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:38,799 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:38,799 [INFO ] epollEventLoopGroup-5-6 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:38,799 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:38,800 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:38,800 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9002-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,800 [WARN ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9002-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,800 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:38,800 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9002 in 1 seconds.
2024-04-23T05:00:38,800 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:38,811 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:38,811 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:38,812 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9002-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,800 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,812 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:38,812 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,812 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:38,812 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,812 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:38,812 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:38,812 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:38,812 [DEBUG] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:38,813 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 110, in load
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = self._load_default_handler(handler)
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 159, in _load_default_handler
2024-04-23T05:00:38,813 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,813 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
2024-04-23T05:00:38,813 [WARN ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,814 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:38,814 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.
2024-04-23T05:00:38,814 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,826 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:38,826 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - model_name: policy_vs_doc_model_tar_gz, batchSize: 1
2024-04-23T05:00:38,826 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:38,826 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,827 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:38,827 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 108, in load
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:38,827 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:38,827 [INFO ] epollEventLoopGroup-5-7 org.pytorch.serve.wlm.WorkerThread - 9003 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:38,827 [INFO ] epollEventLoopGroup-5-8 org.pytorch.serve.wlm.WorkerThread - 9001 Worker disconnected. WORKER_STARTED
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:38,827 [INFO ] W-9002-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9002-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,827 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:38,827 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:38,827 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:38,827 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:38,827 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:38,828 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,828 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:435) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
2024-04-23T05:00:38,828 [DEBUG] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
2024-04-23T05:00:38,828 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-04-23T05:00:38,828 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,828 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - During handling of the above exception, another exception occurred:
2024-04-23T05:00:38,829 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: policy_vs_doc_model_tar_gz, error: Worker died.
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - 
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:38,829 [DEBUG] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-policy_vs_doc_model_tar_gz_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-23T05:00:38,829 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:38,829 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9001-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-23T05:00:38,829 [WARN ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9001-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 110, in load
2024-04-23T05:00:38,829 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = self._load_default_handler(handler)
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9001 in 1 seconds.
2024-04-23T05:00:38,830 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/home/venv/lib/python3.9/site-packages/ts/model_loader.py", line 159, in _load_default_handler
2024-04-23T05:00:38,829 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9001-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,830 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name, "ts.torch_handler")
2024-04-23T05:00:38,830 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9003-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,830 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -   File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
2024-04-23T05:00:38,830 [WARN ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9003-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,830 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-23T05:00:38,830 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9003 in 1 seconds.
2024-04-23T05:00:38,830 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9003-policy_vs_doc_model_tar_gz_1.0-stdout
2024-04-23T05:00:38,841 [INFO ] W-9000-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,855 [INFO ] W-9001-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9001-policy_vs_doc_model_tar_gz_1.0-stderr
2024-04-23T05:00:38,858 [INFO ] W-9003-policy_vs_doc_model_tar_gz_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9003-policy_vs_doc_model_tar_gz_1.0-stderr

Installation instructions

Dockerfile:

FROM pytorch/torchserve:latest-gpu

# Install dependencies
RUN pip install torch setfit sagemaker-inference

# Copy config.properties
COPY config.properties /home/model-server/conf/config.properties
# Copy policy_vs_doc_model.tar.gz
COPY policy_vs_doc_model.tar.gz /home/model-server/model-store/policy_vs_doc_model.tar.gz

USER root
RUN chmod +x /home/model-server/conf/config.properties

ENV TS_CONFIG_FILE /home/model-server/conf/config.properties

Build and run the Docker image:

docker build -t torchserve-setfit .
docker run --rm -it --gpus all torchserve-setfit:latest

Model Packaging

What's inside the policy_vs_doc_model.tar.gz:

tar -tzf policy_vs_doc_model.tar.gz
._.
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./
./._model.safetensors
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./model.safetensors
./._2_Normalize
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./2_Normalize/
./._model_head.pkl
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./model_head.pkl
./._1_Pooling
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./1_Pooling/
./._tokenizer_config.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./tokenizer_config.json
./._special_tokens_map.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./special_tokens_map.json
./._config.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./config.json
./policy_vs_doc_model.tar.gz
./._config_sentence_transformers.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./config_sentence_transformers.json
./._tokenizer.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./tokenizer.json
./._config_setfit.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./config_setfit.json
./._code
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./code/
./._README.md
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./README.md
./._sentence_bert_config.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./sentence_bert_config.json
./._vocab.txt
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./vocab.txt
./._.ipynb_checkpoints
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./.ipynb_checkpoints/
./._modules.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./modules.json
./.ipynb_checkpoints/._README-checkpoint.md
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./.ipynb_checkpoints/README-checkpoint.md
./code/._requirements.txt
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./code/requirements.txt
./code/._inference.py
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./code/inference.py
./code/._.ipynb_checkpoints
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./code/.ipynb_checkpoints/
./code/.ipynb_checkpoints/._requirements-checkpoint.txt
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./code/.ipynb_checkpoints/requirements-checkpoint.txt
./code/.ipynb_checkpoints/._inference-checkpoint.py
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./code/.ipynb_checkpoints/inference-checkpoint.py
./1_Pooling/._config.json
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
./1_Pooling/config.json

Tree view (note: policy_vs_doc_model.tar.gz was extracted into a model directory so it can be shown with tree):

tree model
model/
├── 1_Pooling
│   └── config.json
├── 2_Normalize
├── README.md
├── code
│   ├── inference.py
│   └── requirements.txt
├── config.json
├── config_sentence_transformers.json
├── config_setfit.json
├── model.safetensors
├── model_head.pkl
├── modules.json
├── sentence_bert_config.json
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer_config.json
└── vocab.txt

3 directories, 15 files

./code/inference.py:

import ast

from sagemaker_inference import encoder, decoder
from setfit import SetFitModel


def model_fn(model_dir):
    model = SetFitModel.from_pretrained(model_dir)
    model.to('cuda')
    print(f"model loaded successfully {model}")
    return model


def input_fn(input_data, content_type):
    """A default input_fn that can handle JSON, CSV and NPZ formats.

    Args:
        input_data: the request payload serialized in the content_type format
        content_type: the request content_type

    Returns: input_data deserialized into the expected format. Currently expected
        format is {"inputs": ["q1", "q2", ...]}
    """
    decoded = None
    try:
        print(f"input_data: {input_data}, content_type: {content_type}")
        decoded = decoder.decode(input_data, content_type)
        print(f"decoded input: {decoded}, content_type: {content_type}")
        return ast.literal_eval(str(decoded))
    except Exception as e:
        print(f"invalid input. input: {decoded}, error: {e}")
        raise e


def output_fn(prediction, accept):
    """A default output_fn for PyTorch. Serializes predictions from predict_fn to JSON, CSV or NPY format.

    Args:
        prediction: a prediction result from predict_fn
        accept: type which the output data needs to be serialized

    Returns: output data serialized
    """
    print(f"prediction: {prediction}, prediction type: {type(prediction)}, accept: {accept}")
    encoded = encoder.encode(prediction, accept)
    print(f"encoded output: {encoded}, content_type: {accept}")
    return encoded


def predict_fn(data, model):
    """A default predict_fn for PyTorch. Calls a model on data deserialized in input_fn.
    Runs prediction on GPU if cuda is available.

    Args:
        data: input data for prediction deserialized by input_fn
        model: PyTorch model loaded in memory by model_fn

    Returns: a prediction
    """
    try:
        # device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        # model.to(device)
        # model.eval()
        # with torch.no_grad():
        #     return model(data.to(device))

        print(f"data: {data}, data_type: {type(data)}")
        inputs = data.get("inputs", None)
        if inputs is None:
            raise Exception(f"\"inputs\" not found: {data}")
        return model.predict(inputs)
    except Exception as e:
        print(f"predict_fn error: {e}")
        raise e

./code/requirements.txt:

sagemaker-inference==1.10.1
setfit==1.0.1
transformers==4.37.2

config.properties

inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
#workflow_store=/home/model-server/wf-store
enable_metrics_api=true
load_models=policy_vs_doc_model.tar.gz

Versions

Docker image used: pytorch/torchserve:latest-gpu

Repro instructions

Possible Solution

geraldstanje changed the title from "error loading model" to "Load model failed - error: Worker died" on Apr 23, 2024
@mreso
Collaborator

mreso commented Apr 25, 2024

Hi @geraldstanje
Seems like you're trying to deploy a model using a SageMaker example. SageMaker uses TorchServe for model deployment, but the model artifact you're creating cannot be deployed directly with TorchServe. Is there a specific tutorial you're following?

I'm not too familiar with SageMaker itself, but the inference.py script you're providing looks like you're trying to deploy a SetFit model. You will need to integrate this into a TorchServe handler and package it with the model-archiver into a tar.gz file, which adds important metadata. Please have a look at our XGBoost example, which should be easily adaptable to your use case, as you can basically deploy any framework or library through this approach.

Let me know if you have further questions.
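
Not an official example, but to illustrate the shape of such a handler, here is a minimal sketch (file and class names are placeholders; it assumes the SetFit model files sit at the top level of the archive and that requests send JSON like {"inputs": ["text 1", "text 2"]}):

import json

from setfit import SetFitModel
from ts.torch_handler.base_handler import BaseHandler


class SetFitHandler(BaseHandler):
    def initialize(self, context):
        # model_dir is the directory TorchServe extracted the archive into
        properties = context.system_properties
        model_dir = properties.get("model_dir")
        self.model = SetFitModel.from_pretrained(model_dir)
        gpu_id = properties.get("gpu_id")
        if gpu_id is not None:
            self.model.to(f"cuda:{gpu_id}")
        self.initialized = True

    def preprocess(self, requests):
        # Collect the "inputs" list from each request in the batch
        batch = []
        for request in requests:
            body = request.get("body") or request.get("data")
            if isinstance(body, (bytes, bytearray)):
                body = json.loads(body.decode("utf-8"))
            batch.append(body["inputs"])
        return batch

    def inference(self, batch):
        # One prediction list per request in the batch
        return [self.model.predict(texts) for texts in batch]

    def postprocess(self, outputs):
        # TorchServe expects one JSON-serializable item per request
        return [[str(label) for label in result] for result in outputs]

You would then pass this file as --handler to torch-model-archiver and add the SetFit model files via --extra-files, roughly as shown in the XGBoost example.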

@geraldstanje
Author

geraldstanje commented May 4, 2024

@mreso thanks for pointing that out - is there a simple way to convert it to run SetFit models with TorchServe? Can I copy the code I have into a BaseHandler and implement those functions?

Does SageMaker return the same data type/format as the BaseHandler? What is required?

@agunapal
Collaborator

agunapal commented May 7, 2024

cc @namannandan

@mreso
Collaborator

mreso commented May 9, 2024

@geraldstanje yes, you basically follow the XGBoost example to create your own handler, or if your model is a HuggingFace model from their transformers library you can just follow one of these examples:
https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers

Let me know if you're having problems converting your example.

@geraldstanje
Author

geraldstanje commented May 12, 2024

@mreso thanks - how is SageMaker then able to use TorchServe if they don't implement ts.torch_handler.base_handler? Let's say I take this as an example: https://github.com/aws/amazon-sagemaker-examples/blob/main/frameworks/pytorch/get_started_mnist_deploy.ipynb

I looked at https://github.com/pytorch/serve/tree/master/examples/xgboost_classfication.
I have the trained SetFit model here: torchserve/setfit-test-model:

ls -la torchserve/setfit-test-model/1_Pooling/
total 12
drwx------ 2 ubuntu ubuntu 4096 May 13 03:52 .
drwx------ 4 ubuntu ubuntu 4096 May 13 03:52 ..
-rw------- 1 ubuntu ubuntu  296 May 13 03:52 config.json

ls -la torchserve/setfit-test-model/
total 89728
drwx------ 4 ubuntu ubuntu     4096 May 13 03:52 .
drwx------ 3 ubuntu ubuntu     4096 May 13 03:52 ..
drwx------ 2 ubuntu ubuntu     4096 May 13 03:52 1_Pooling
drwx------ 2 ubuntu ubuntu     4096 May 13 03:52 2_Normalize
-rw------- 1 ubuntu ubuntu     7586 May 13 03:52 README.md
-rw------- 1 ubuntu ubuntu      660 May 13 03:52 config.json
-rw------- 1 ubuntu ubuntu      164 May 13 03:52 config_sentence_transformers.json
-rw------- 1 ubuntu ubuntu      116 May 13 03:52 config_setfit.json
-rw------- 1 ubuntu ubuntu 90864192 May 13 03:52 model.safetensors
-rw------- 1 ubuntu ubuntu    13431 May 13 03:52 model_head.pkl
-rw------- 1 ubuntu ubuntu      349 May 13 03:52 modules.json
-rw------- 1 ubuntu ubuntu       53 May 13 03:52 sentence_bert_config.json
-rw------- 1 ubuntu ubuntu      695 May 13 03:52 special_tokens_map.json
-rw------- 1 ubuntu ubuntu   711649 May 13 03:52 tokenizer.json
-rw------- 1 ubuntu ubuntu     1433 May 13 03:52 tokenizer_config.json
-rw------- 1 ubuntu ubuntu   231508 May 13 03:52 vocab.txt

How can I create the model.pt for torch-model-archiver?

torch-model-archiver --model-name SetFitModel --version 1.0 --serialized-file torchserve/setfit-test-model/model.pt --handler ./setfit_handler_generalized.py --extra-files "./torchserve/setfit-test-model/config.json,./torchserve/setfit-test-model/config_sentence_transformers.json,./torchserve/setfit-test-model/config_sentence_transformers.json,./torchserve/setfit-test-model/config_setfit.json,./torchserve/setfit-test-model/model.safetensors,./torchserve/setfit-test-model/model_head.pkl,./torchserve/setfit-test-model/modules.json,./torchserve/setfit-test-model/sentence_bert_config.json,./torchserve/setfit-test-model/special_tokens_map.json,./torchserve/setfit-test-model/tokenizer.json,./torchserve/setfit-test-model/tokenizer_config.json,./torchserve/setfit-test-model/vocab.txt,./1_Pooling/config.json"

@namannandan
Collaborator

namannandan commented May 14, 2024

@geraldstanje to answer your question

how is SageMaker then able to use TorchServe if they don't implement ts.torch_handler.base_handler?

The PyTorch inference containers that are compatible with SageMaker install a package called the SageMaker PyTorch Inference Toolkit, which provides a handler implementation that is compatible with TorchServe and plugs in the input_fn, predict_fn, and output_fn that you provide in the inference.py script above. For reference, please see

If you'd like to create a custom Docker container that is SageMaker compatible, I would suggest starting out with a SageMaker PyTorch Inference Container as the base image and building on top of it. For example: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.2.0-gpu-py310-cu118-ubuntu20.04-sagemaker.

If you would like to use TorchServe natively on SageMaker, here's an example on the same: https://github.com/aws/amazon-sagemaker-examples/blob/main/inference/torchserve/mme-gpu/torchserve_multi_model_endpoint.ipynb
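
Conceptually, the handler plumbing that the SageMaker PyTorch Inference Toolkit provides looks roughly like this (a simplified illustration of the delegation pattern only, not the actual toolkit code):

class DelegatingHandler:
    """Calls the user-supplied functions from inference.py in order."""

    def __init__(self, user_module, model_dir):
        self.user = user_module
        self.model = user_module.model_fn(model_dir)

    def handle(self, data, content_type="application/json", accept="application/json"):
        inputs = self.user.input_fn(data, content_type)
        prediction = self.user.predict_fn(inputs, self.model)
        return self.user.output_fn(prediction, accept)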

Also, looking at the error logs, I see from the traceback that the model load failed because the handler was unable to find a necessary module:

ModuleNotFoundError: No module named '124636bff1364f2bbc7f71a8667110af'

Could you please check if all the required dependencies to load the model are either installed in the container or included in the model archive?
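
For example, a quick sanity check along these lines (the path is just an example; point it at wherever the archive's code/ directory ends up inside the container) would show whether the imports resolve:

import importlib
import sys

# Example path to the directory that contains inference.py after extraction
sys.path.insert(0, "/home/model-server/model/code")

for name in ("setfit", "sagemaker_inference", "inference"):
    try:
        module = importlib.import_module(name)
        print(f"ok: {name} -> {module.__file__}")
    except ImportError as exc:
        print(f"missing: {name} ({exc})")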
