Update ubuntu docker version #1970

LuigiCerone · 2022-11-12T16:27:50Z

Description

Update default from Ubuntu 18.04 to Ubuntu 20.04 (LTS).

Fixes #1889

Type of change

Bug fix (non-breaking change which fixes an issue)

Feature/Issue validation/testing

I need advice on how to test the different updates.

Checklist

Should 20.04 be used here instead of the latest tag? (asked for five separate locations)
~~This guide refers to the 18.04 version; should it be updated?~~
~~This example uses the 18.04 version~~
~~In the K8S folder there are some references to the 18.04 version~~

Edit after comments:

Update the build script so that 22.04 is optional and only used if people pass it in as an arg, since 20.04 is now the default
Build the images and test them by running an example inference; attach the logs to the PR

codecov · 2022-11-15T18:15:11Z

Codecov Report

Merging #1970 (bf38f8b) into master (2edd063) will not change coverage.
The diff coverage is n/a.

❗ Current head bf38f8b differs from pull request most recent head 0fe4882. Consider uploading reports for the commit 0fe4882 to get more accurate results

@@           Coverage Diff           @@
##           master    #1970   +/-   ##
=======================================
  Coverage   53.31%   53.31%           
=======================================
  Files          70       70           
  Lines        3157     3157           
  Branches       56       56           
=======================================
  Hits         1683     1683           
  Misses       1474     1474

msaroufim · 2022-11-15T18:34:28Z

So ubuntu-latest on a GitHub runner currently means 20.04, but I'd rather we not rely on an unpinned dependency and instead be explicit that we're using 20.04.

https://github.com/actions/runner-images

So to answer your question @LuigiCerone the answer is yes to all of the above
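A minimal sketch of how the remaining unpinned runner labels could be located before pinning them. The function name and the default `.github/workflows` path are assumptions for illustration, not part of this PR:

```shell
# Hypothetical helper: list every line that still uses the floating
# "ubuntu-latest" label so it can be pinned to "ubuntu-20.04".
find_unpinned_runners() {
  # Assumed default location of the workflow files.
  workflows_dir="${1:-.github/workflows}"
  # -r: recurse into the directory, -n: print line numbers.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -rn 'ubuntu-latest' "$workflows_dir" || true
}
```

Running `find_unpinned_runners` from the repository root prints `file:line:text` for each match; empty output means nothing is left to pin.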

Regarding testing: anything that's a GitHub Action will be covered by CI, so that's easy. For the doc-related changes I think no test is needed either.

But for the existing images you've updated, one more change we'd need is to go here and make 22.04 optional, used only if people pass it in as an arg, since 20.04 is now the default: https://github.com/pytorch/serve/blob/master/docker/build_image.sh#L97
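One way the "20.04 by default, 22.04 only on request" behaviour could look is sketched below. The flag `--ubuntu-version`, the variable `DEFAULT_BASE_IMAGE`, and the function name are hypothetical; the real build_image.sh may use different names and argument handling:

```shell
# Assumed default; the actual variable name in build_image.sh may differ.
DEFAULT_BASE_IMAGE="ubuntu:20.04"

# resolve_base_image [--ubuntu-version VERSION]   (hypothetical flag)
# Prints the base image to build from.
resolve_base_image() {
  base_image="$DEFAULT_BASE_IMAGE"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --ubuntu-version)
        # The flag must be followed by a value such as 22.04.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
        base_image="ubuntu:$2"
        shift 2
        ;;
      *)
        # Ignore arguments this sketch does not handle.
        shift
        ;;
    esac
  done
  echo "$base_image"
}
```

With no arguments this prints `ubuntu:20.04`; with `--ubuntu-version 22.04` it prints `ubuntu:22.04`. The result could then be handed to `docker build` through `--build-arg`, assuming the Dockerfile declares a matching `ARG`.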

And finally, for testing, can you build the Docker images, run a simple inference from our docker/README.md, and attach those logs here? That's the tricky but important part of this PR. If you're up for it, lmk, because we'd like to merge this change before Dec 2 for a patch release. If not, I can either create a new PR or push code to your branch directly.

LuigiCerone · 2022-11-15T20:24:35Z

Hello @msaroufim, thanks for the useful information! I'll work on this (including the last point) and update the PR in the next few days :)

docker/build_image.sh

LuigiCerone · 2022-11-16T20:14:39Z

Hello @msaroufim, these are the logs from building the image pytorch/torchserve:latest-cpu locally (by running docker/build_image.sh without arguments). I tested with the resnet-152 model as explained here.

➜  docker git:(update/ubuntu_docker) docker image ls | grep torch
pytorch/torchserve                                                                  latest-cpu       7d7680068cc1   23 hours ago    2.04GB

➜  docker git:(update/ubuntu_docker) docker run --rm -it -p 8080:8080 -p 8081:8081 -p 8082:8082 -p 7070:7070 -p 7071:7071 pytorch/torchserve
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2022-11-16T20:01:00,503 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2022-11-16T20:01:00,789 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.6.0
TS Home: /home/venv/lib/python3.8/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 4
Max heap size: 1992 M
Python executable: /home/venv/bin/python
Config file: /home/model-server/config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://0.0.0.0:8082
Model Store: /home/model-server/model-store
Initial Models: N/A
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 32
Netty client threads: 0
Default workers per model: 4
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Metrics report format: prometheus
Enable metrics API: true
Workflow Store: /home/model-server/model-store
Model config: N/A
2022-11-16T20:01:00,822 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2022-11-16T20:01:00,903 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2022-11-16T20:01:01,023 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
2022-11-16T20:01:01,024 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2022-11-16T20:01:01,027 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
2022-11-16T20:01:01,028 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2022-11-16T20:01:01,033 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://0.0.0.0:8082
Model server started.
2022-11-16T20:01:01,851 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:01:01,861 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:15.432594299316406|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:01:01,862 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:51.57096862792969|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:01:01,863 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:77.0|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:01:01,865 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:6933.99609375|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:01:01,866 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:448.640625|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:01:01,866 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:12.9|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628861
2022-11-16T20:02:01,744 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:01,746 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:15.16732406616211|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:01,749 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:51.836238861083984|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:01,752 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:77.4|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:01,755 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:6852.40234375|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:01,758 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:530.015625|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:01,761 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:13.9|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628921
2022-11-16T20:02:07,071 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 2.0 for model resnet-152-batch_v2
2022-11-16T20:02:07,073 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 2.0 for model resnet-152-batch_v2
2022-11-16T20:02:07,074 [INFO ] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - Model resnet-152-batch_v2 loaded.
2022-11-16T20:02:07,076 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - updateModel: resnet-152-batch_v2, count: 1
2022-11-16T20:02:07,093 [DEBUG] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9000]
2022-11-16T20:02:10,114 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000
2022-11-16T20:02:10,115 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - [PID]51
2022-11-16T20:02:10,116 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - Torch worker started.
2022-11-16T20:02:10,118 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - Python runtime: 3.8.0
2022-11-16T20:02:10,118 [DEBUG] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - W-9000-resnet-152-batch_v2_2.0 State change null -> WORKER_STARTED
2022-11-16T20:02:10,128 [INFO ] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000
2022-11-16T20:02:10,144 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
2022-11-16T20:02:10,156 [INFO ] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - Flushing req. to backend at: 1668628930156
2022-11-16T20:02:10,215 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - model_name: resnet-152-batch_v2, batchSize: 1
2022-11-16T20:02:12,099 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - generated new fontManager
2022-11-16T20:02:16,158 [INFO ] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 5942
2022-11-16T20:02:16,160 [DEBUG] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - W-9000-resnet-152-batch_v2_2.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2022-11-16T20:02:16,161 [INFO ] W-9000-resnet-152-batch_v2_2.0 TS_METRICS - W-9000-resnet-152-batch_v2_2.0.ms:9072|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628936
2022-11-16T20:02:16,163 [INFO ] W-9000-resnet-152-batch_v2_2.0 TS_METRICS - WorkerThreadTime.ms:65|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628936
2022-11-16T20:02:16,168 [INFO ] epollEventLoopGroup-3-1 ACCESS_LOG - /172.17.0.1:64904 "POST /models?url=https://torchserve.pytorch.org/mar_files/resnet-152-batch_v2.mar&batch_size=1&max_batch_delay=50&initial_workers=1 HTTP/1.1" 200 54092
2022-11-16T20:02:16,169 [INFO ] epollEventLoopGroup-3-1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628882
2022-11-16T20:03:01,647 [INFO ] pool-3-thread-2 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:03:01,658 [INFO ] pool-3-thread-2 TS_METRICS - DiskAvailable.Gigabytes:14.998088836669922|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:03:01,662 [INFO ] pool-3-thread-2 TS_METRICS - DiskUsage.Gigabytes:52.00547409057617|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:03:01,662 [INFO ] pool-3-thread-2 TS_METRICS - DiskUtilization.Percent:77.6|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:03:01,663 [INFO ] pool-3-thread-2 TS_METRICS - MemoryAvailable.Megabytes:6401.51171875|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:03:01,663 [INFO ] pool-3-thread-2 TS_METRICS - MemoryUsed.Megabytes:980.953125|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:03:01,663 [INFO ] pool-3-thread-2 TS_METRICS - MemoryUtilization.Percent:19.6|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628981
2022-11-16T20:04:01,591 [INFO ] pool-3-thread-2 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:01,592 [INFO ] pool-3-thread-2 TS_METRICS - DiskAvailable.Gigabytes:14.998088836669922|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:01,594 [INFO ] pool-3-thread-2 TS_METRICS - DiskUsage.Gigabytes:52.00547409057617|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:01,595 [INFO ] pool-3-thread-2 TS_METRICS - DiskUtilization.Percent:77.6|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:01,596 [INFO ] pool-3-thread-2 TS_METRICS - MemoryAvailable.Megabytes:6411.35546875|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:01,597 [INFO ] pool-3-thread-2 TS_METRICS - MemoryUsed.Megabytes:971.11328125|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:01,597 [INFO ] pool-3-thread-2 TS_METRICS - MemoryUtilization.Percent:19.5|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629041
2022-11-16T20:04:16,610 [INFO ] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - Flushing req. to backend at: 1668629056610
2022-11-16T20:04:16,616 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_LOG - Backend received inference at: 1668629056
2022-11-16T20:04:16,924 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_METRICS - HandlerTime.Milliseconds:307.83|#ModelName:resnet-152-batch_v2,Level:Model|#hostname:4d2ed7855de0,requestID:fea7e8cf-0c4e-4a21-9003-60ec6fe551f1,timestamp:1668629056
2022-11-16T20:04:16,926 [INFO ] W-9000-resnet-152-batch_v2_2.0-stdout MODEL_METRICS - PredictionTime.Milliseconds:307.93|#ModelName:resnet-152-batch_v2,Level:Model|#hostname:4d2ed7855de0,requestID:fea7e8cf-0c4e-4a21-9003-60ec6fe551f1,timestamp:1668629056
2022-11-16T20:04:16,926 [INFO ] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 311
2022-11-16T20:04:16,928 [INFO ] W-9000-resnet-152-batch_v2_2.0 ACCESS_LOG - /172.17.0.1:56654 "PUT /predictions/resnet-152-batch_v2 HTTP/1.1" 200 336
2022-11-16T20:04:16,931 [INFO ] W-9000-resnet-152-batch_v2_2.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668628882
2022-11-16T20:04:16,932 [DEBUG] W-9000-resnet-152-batch_v2_2.0 org.pytorch.serve.job.Job - Waiting time ns: 541400, Backend time ns: 322506900
2022-11-16T20:04:16,934 [INFO ] W-9000-resnet-152-batch_v2_2.0 TS_METRICS - QueueTime.ms:0|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629056
2022-11-16T20:04:16,935 [INFO ] W-9000-resnet-152-batch_v2_2.0 TS_METRICS - WorkerThreadTime.ms:14|#Level:Host|#hostname:4d2ed7855de0,timestamp:1668629056
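
For reference, the two ACCESS_LOG entries above (the `POST /models?...` registration and the `PUT /predictions/...` inference) could be reproduced with requests along these lines. This is a sketch: the `localhost` host, the function names, and the image path argument are assumptions, while the ports, model name, and query parameters are taken from the log:

```shell
# Ports taken from the "docker run -p" mapping above; localhost is assumed.
MANAGEMENT_API="http://localhost:8081"
INFERENCE_API="http://localhost:8080"
MODEL_NAME="resnet-152-batch_v2"
MAR_URL="https://torchserve.pytorch.org/mar_files/${MODEL_NAME}.mar"

# Registration URL with the query parameters seen in the ACCESS_LOG.
register_url() {
  echo "${MANAGEMENT_API}/models?url=${MAR_URL}&batch_size=1&max_batch_delay=50&initial_workers=1"
}

# Inference URL for the registered model.
predict_url() {
  echo "${INFERENCE_API}/predictions/${MODEL_NAME}"
}

# Hypothetical wrapper: register the model, then send one image.
smoke_test() {
  image_path="$1"   # any local image file
  curl -X POST "$(register_url)" || return 1
  # -T uploads the file, which curl sends as an HTTP PUT.
  curl -T "$image_path" "$(predict_url)"
}
```

Calling `smoke_test kitten.jpg` against a running container should produce one 200 response per request if the image works, as in the log.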

msaroufim

Looks great! Thank you!

maaquib

Registering a model on Ubuntu 20.04 + TS 0.6.1 fails with:

$ docker run --rm -it -v/home/ubuntu/Downloads/model_store/:/home/model-server/model-store -p8080:8080 -p8081:8081 pytorch/torchserve:latest-cpu  serve

...
2022-11-16T23:57:50,418 [WARN ] W-9000-resnet-18_1.0-stderr MODEL_LOG - Traceback (most recent call last):
2022-11-16T23:57:50,419 [WARN ] W-9000-resnet-18_1.0-stderr MODEL_LOG -   File "/home/venv/lib/python3.8/site-packages/ts/model_service_worker.py", line 16, in <module>
2022-11-16T23:57:50,419 [WARN ] W-9000-resnet-18_1.0-stderr MODEL_LOG -     from ts.metrics.metric_cache_yaml_impl import MetricsCacheYamlImpl
2022-11-16T23:57:50,419 [WARN ] W-9000-resnet-18_1.0-stderr MODEL_LOG -   File "/home/venv/lib/python3.8/site-packages/ts/metrics/metric_cache_yaml_impl.py", line 5, in <module>
2022-11-16T23:57:50,420 [WARN ] W-9000-resnet-18_1.0-stderr MODEL_LOG -     import yaml
2022-11-16T23:57:50,420 [WARN ] W-9000-resnet-18_1.0-stderr MODEL_LOG - ModuleNotFoundError: No module named 'yaml'
...

@LuigiCerone Can you add PyYAML as a dependency? This was missed in #1954.

Update: Building with the dev image pulls in the dependencies. It seems we don't use the Dockerfile for official releases anymore. Approving.

Update ubuntu docker version

e4b2098

LuigiCerone force-pushed the update/ubuntu_docker branch from 78c54b7 to e4b2098 on November 12, 2022 16:28

maaquib requested review from msaroufim and lxning November 15, 2022 17:48

Merge branch 'master' into update/ubuntu_docker

e4d23d3

maaquib linked an issue Nov 15, 2022 that may be closed by this pull request

security issue #1971

Closed

msaroufim requested review from rohithkrn and agunapal November 15, 2022 18:07

LuigiCerone and others added 2 commits November 15, 2022 21:52

Fix ubuntu docker version

13b9ca7

Merge branch 'master' into update/ubuntu_docker

738bcb4

maaquib reviewed Nov 16, 2022

maaquib approved these changes Nov 16, 2022

Merge branch 'master' into update/ubuntu_docker

bf38f8b

msaroufim approved these changes Nov 16, 2022

msaroufim self-requested a review November 17, 2022 00:00

maaquib requested changes Nov 17, 2022

maaquib approved these changes Nov 17, 2022

agunapal approved these changes Nov 17, 2022

msaroufim approved these changes Nov 17, 2022

Fix build docker script

0fe4882

agunapal merged commit 95c5052 into pytorch:master Nov 17, 2022
