-
Notifications
You must be signed in to change notification settings - Fork 816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Newer Base Image KServe Container fails with exec /usr/local/bin/dockerd-entrypoint.sh: exec format error #3033
Comments
Before it gets asked here, yes I have tried to capture logs from within the deployed container, however the container does not even start so no other logs are recorded (other than the liveness probe and queue-proxy failing and all of that) |
Thanks for reporting..looking into this. Able to repro the error. Earlier we didn't move to 22.04 as the ubuntu 22.04 runners were flaky. I will try running CI on 22.04 to see if its resolved now. |
@tylertitsworth Please pull the submodules before you build kfs image
I am able to build it with 22.04 after doing this
|
@agunapal In the build script I use to build this container I pull submodules (https://github.com/intel/ai-containers/blob/main/pytorch/serving/build-kfs.sh#L9) I am able to build the container, however, my issue is when it is deployed to k8s. |
@agunapal any update on this? Is there any misunderstanding I can help alleviate? |
Hi @tylertitsworth I understand the problem. I will get back to you this week. |
On ubuntu 22.04, tried running grpc testcases..these worked
So, it may be something specific to docker/kserve.. Will try the steps you have mentioned |
🐛 Describe the bug
The public TorchServe KFS Image that was recently updated for 0.10.0 has
ubuntu:20.04
as its base.Intel is publishing an Intel Optimized version of both the torchserve and torchserve-kfs images, which includes Intel Extension for PyTorch. However, due to Intel's Security First policies, we use
ubuntu:22.04
as our base image for both containers (soon to beubuntu:24.04
.When we deploy with the latest
0.10.0
version of torchserve on kserve, the image immediately enters theCrashLoopBackOff
state due to the following error:exec /usr/local/bin/dockerd-entrypoint.sh: exec format error
.We determined that the solution to this issue was to change the base back to
ubuntu:20.04
, however this means that anyone who intends to create a custom torchserve-kfs container won't be able to use theubuntu:rolling
base specified in https://github.com/pytorch/serve/blob/master/docker/Dockerfile#L19.This issue is not present in the previous version my team published, only with the latest kserve and torchserve version, and I was not able to reproduce from the command line, only in my cluster.
Error logs
When using
ubuntu:23.10
, it fails during buildtime:But I am more interested in the output with
ubuntu:22.04
, which fails during deployment:Installation instructions
Install TorchServe from source? No
Are you using Docker? Yes
Model Packaing
n/a
config.properties
n/a
Versions
With
ubuntu:22.04
as baseWith
ubuntu:20.04
as baseRepro instructions
From https://github.com/intel/ai-containers,
docker tag intel/aiops/mlops-ci:b-0-ubuntu-22.04-pip-py3.10-torchserve intel/torchserve:latest
cd serving ./build-kfs.sh
kserve-torchserve
to use the new imagePossible Solution
No response
The text was updated successfully, but these errors were encountered: