IllegalMonitorStateException errors when running benchmark tests #82
Comments
@dhaniram-kshirsagar Can you attach your benchmark report, please?
@chauhang: We will look into the metrics-logging issue.
Please see the S3 URL for the JMeter reports.
@harshbafna This looks like a bug fix. Can you explain why it is marked as Enhancement?
It's a bug.
This happens on CPU as well.
As indicated in the previous comment, we will update the system metrics documentation and add the following description:
Documentation updated in the following commit:
Removing the launch-blocker tag for this, since the benchmark still runs. I'm not closing it, though, because we don't know whether this condition affects subsequent benchmarks (e.g., by not releasing system resources).
PR #172 fixes the exception trace logged on worker scale-down / model unregistration.
@harshbafna Please see the comments in the PR and verify that removing the code for terminating the child processes does not cause other side effects.
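For context, each trace below shows `ReentrantLock.unlock()` being called from `Model.pollBatch` by a thread that does not hold the lock, which is what raises `IllegalMonitorStateException`. A minimal Java sketch of that failure shape and the usual guard (illustrative only, not TorchServe's actual `pollBatch` code):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class UnlockGuardSketch {
    private final ReentrantLock lock = new ReentrantLock();

    // Buggy shape: if tryLock() times out, or the thread is interrupted
    // while waiting, the lock is never acquired, yet the finally block
    // still calls unlock() and throws IllegalMonitorStateException.
    void pollBatchBuggy() throws InterruptedException {
        try {
            lock.tryLock(100, TimeUnit.MILLISECONDS);
            // ... poll a request batch here ...
        } finally {
            lock.unlock(); // throws if this thread never got the lock
        }
    }

    // Guarded shape: only release the lock if it was actually acquired.
    void pollBatchGuarded() throws InterruptedException {
        boolean acquired = lock.tryLock(100, TimeUnit.MILLISECONDS);
        try {
            if (acquired) {
                // ... poll a request batch here ...
            }
        } finally {
            if (acquired) {
                lock.unlock();
            }
        }
    }
}
```

On scale-down the worker thread is interrupted while waiting for a batch; an interrupted or timed-out acquire followed by an unconditional `unlock()` in a finally block would produce exactly the trace seen in the logs.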
Seeing many IllegalMonitorStateException errors when running the benchmark tests. Tested on a p2.8xlarge instance
with the default configuration, using the config.properties in the benchmark folder and adding the model store path to it.
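For reference, a minimal sketch of what that config.properties might look like; the addresses are the TorchServe defaults and the model-store path is a placeholder, not the exact values used in this run:

```properties
inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
# Added for the benchmark run: local model store (placeholder path)
model_store=/home/ubuntu/model-store
```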
Output
```
(pytorch_p36) ubuntu@ip-172-31-41-247:~/serve/benchmarks$ python benchmark.py throughput --ts http://127.0.0.1:8080
Running benchmark throughput with model resnet-18
Processing jmeter output
Output available at /tmp/TSBenchmark/out/throughput/resnet-18
Report generated at /tmp/TSBenchmark/out/throughput/resnet-18/report/index.html
[{'throughput_resnet-18_Inference_Request_Average': 670,
  'throughput_resnet-18_Inference_Request_Median': 526,
  'throughput_resnet-18_Inference_Request_Throughput': 137.0,
  'throughput_resnet-18_Inference_Request_aggregate_report_90_line': 681,
  'throughput_resnet-18_Inference_Request_aggregate_report_99_line': 8585,
  'throughput_resnet-18_Inference_Request_aggregate_report_error': '0.00%'}]
```
The JMeter reports don't show any errors:
https://sagemaker-pt.s3-us-west-2.amazonaws.com/bench.zip
Even though hits/sec reaches 200, the metrics logged on the console never show more than one active request at any time, so it is unclear whether the metrics are being logged properly. Requests2XX.Count is always 1, as below:
```
2020-03-07 01:58:50,800 [INFO ] W-9015-resnet-18_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-41-247,timestamp:null
```
Errors in TorchServe console logs
```
2020-03-07 01:58:51,109 [INFO ] W-9008-resnet-18_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-41-247,timestamp:null
2020-03-07 01:58:51,109 [DEBUG] W-9008-resnet-18_1.0 org.pytorch.serve.wlm.Job - Waiting time: 0, Backend time: 38
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 40
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0 ACCESS_LOG - /127.0.0.1:50756 "POST /predictions/resnet-18 HTTP/1.1" 200 50
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-41-247,timestamp:null
2020-03-07 01:58:51,115 [DEBUG] W-9023-resnet-18_1.0 org.pytorch.serve.wlm.Job - Waiting time: 0, Backend time: 49
2020-03-07 01:58:51,115 [INFO ] W-9023-resnet-18_1.0-stdout MODEL_METRICS - PredictionTime.Milliseconds:36.98|#ModelName:resnet-18,Level:Model|#hostname:ip-172-31-41-247,requestID:63c3d128-d2ab-4e60-8ab7-fa9905b6de93,timestamp:1583546331
2020-03-07 01:58:51,130 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.ModelVersionedRefs - Removed model: resnet-18 version: 1.0
2020-03-07 01:58:51,131 [DEBUG] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.WorkerThread - W-9031-resnet-18_1.0 State change WORKER_MODEL_LOADED -> WORKER_SCALED_DOWN
2020-03-07 01:58:51,133 [WARN ] W-9031-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker thread exception.
java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
at org.pytorch.serve.wlm.Model.pollBatch(Model.java:175)
at org.pytorch.serve.wlm.BatchAggregator.getRequest(BatchAggregator.java:33)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-07 01:58:51,145 [INFO ] epollEventLoopGroup-4-32 org.pytorch.serve.wlm.WorkerThread - 9031 Worker disconnected. WORKER_SCALED_DOWN
2020-03-07 01:58:51,146 [DEBUG] W-9031-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - W-9031-resnet-18_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2020-03-07 01:58:51,148 [DEBUG] W-9031-resnet-18_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
[... the same IllegalMonitorStateException trace repeats as each remaining worker, W-9030 down through W-9002, is scaled down; workers W-9018, W-9004, W-9001, and W-9000 shut down cleanly without it ...]
2020-03-07 01:58:59,344 [INFO ] epollEventLoopGroup-3-17 org.pytorch.serve.wlm.ModelManager - Model resnet-18 unregistered.
```
Corresponding GPU/CPU utilization via CloudWatch (using the gpumon.py script from AWS):
![Screen Shot 2020-03-06 at 6 38 39 PM](https://user-images.githubusercontent.com/4461127/76135254-fa4b4400-5fd9-11ea-803d-5e1ab2d70be9.png)