KubeRay Deployment Failure with Large ServeZip File in Working_Dir #44614
Labels: bug, core, core-runtime-env, @external-author-action-required, P1, serve
What happened + What you expected to happen
I am using KubeRay with the ray_ml:2.9.0 image. I built a 92 MB Serve zip package and set it as the working_dir in the YAML. After startup, the head node's pod did not fully download the zip file. In the container's tmp folder I found my zip package, but it was incomplete, so unzipping it produced an empty folder and the deployment failed. When I set working_dir to a smaller Serve zip, the problem does not occur.
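One way to confirm that the package found in the container's tmp folder is truncated, rather than failing at the unzip step for some other reason, is to check it with Python's standard `zipfile` module. The sketch below is only a diagnostic aid, not part of Ray or KubeRay. The helper name `check_zip` and the optional `expected_size` argument are hypothetical. The path to the downloaded package must be supplied by hand, since the exact location is not given in this report.

```python
import os
import zipfile


def check_zip(path, expected_size=None):
    """Return (ok, reason) describing whether the zip at `path` looks complete.

    `expected_size` is an optional byte count taken from the original
    package (hypothetical parameter, used only for comparison).
    """
    if not os.path.isfile(path):
        return False, "file not found"

    actual_size = os.path.getsize(path)
    if expected_size is not None and actual_size != expected_size:
        return False, f"size mismatch: expected {expected_size} bytes, got {actual_size}"

    # A truncated download usually loses the end-of-central-directory record,
    # which is stored at the end of the file.
    if not zipfile.is_zipfile(path):
        return False, "not a valid zip archive (possibly truncated)"

    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the first corrupt name, or None.
        bad_member = zf.testzip()
        if bad_member is not None:
            return False, f"corrupt member: {bad_member}"
        if not zf.namelist():
            return False, "archive is empty"

    return True, "ok"
```

Running `check_zip` on both the original 92 MB package and the copy inside the head pod, passing the original file's size as `expected_size` for the second call, would show whether the bytes on disk differ from what was uploaded.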
Versions / Dependencies
ray_ml:2.9.0 image, Ubuntu 18.04, KubeRay
Reproduction script
pass