Models deployed with ModelMesh-Serving get restarted on upgrade #486

Open
vaibhavjainwiz opened this issue Jan 31, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@vaibhavjainwiz

vaibhavjainwiz commented Jan 31, 2024

Describe the bug

The KServe community releases all repos together, regardless of whether an individual repo has any code changes.
For example, in release v0.11.2, a new tag was created for all repos (modelmesh, modelmesh-runtime-adaptor, rest-proxy, etc.).
Earlier we had built images for v0.11.1, but with the new release (v0.11.2) we built a new set of images pointing to the v0.11.2 tag.
Now, when we roll out a new modelmesh-serving (v0.11.2) controller, all deployed runtimes are restarted because the image tags of the runtime sidecar containers (modelmesh, modelmesh-runtime-adaptor, rest-proxy) have changed.
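
As a rough illustration of why a tag bump alone causes restarts: Kubernetes rolls out a Deployment whenever its pod template changes, and each container's image reference string is part of that template. The sketch below is not the modelmesh-serving controller's code; the function names and the `example.test` registry are made up for illustration.

```python
def changed_containers(old_images: dict[str, str], new_images: dict[str, str]) -> list[str]:
    """Return the names of containers whose image reference string differs."""
    return sorted(
        name
        for name in old_images.keys() | new_images.keys()
        if old_images.get(name) != new_images.get(name)
    )


def needs_rollout(old_images: dict[str, str], new_images: dict[str, str]) -> bool:
    """A pod template change (here: any image string change) means a rollout."""
    return bool(changed_containers(old_images, new_images))


# Hypothetical image references; only the tag differs between the two releases.
SIDECARS = ("modelmesh", "modelmesh-runtime-adaptor", "rest-proxy")
old = {name: f"example.test/{name}:v0.11.1" for name in SIDECARS}
new = {name: f"example.test/{name}:v0.11.2" for name in SIDECARS}

print(changed_containers(old, new))  # every sidecar counts as changed
print(needs_rollout(old, new))       # so the runtime pods are restarted
```

The comparison is purely on the reference string, so identical image contents published under a new tag still count as a change.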

The problem is that the ModelMesh tech stack is quite stable, and repos like modelmesh, modelmesh-runtime-adaptor, and rest-proxy hardly see any code changes between releases. But in order to release the complete KServe org all together, we publish new versions of them as well.

Expected behavior

Model runtime pods should not be restarted on a ModelMesh-Serving controller upgrade if there is no change at the runtime/application level.

Additional context

I propose that every repo be released independently, and only if there is a code change.
For example, in the next release cycle, a new tag for the modelmesh-runtime-adaptor repo should be created only if there are changes.
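
A minimal sketch of this proposal, assuming each repo's last tagged commit and its current head commit are known. `RepoState`, `plan_release`, and the commit hashes are hypothetical and not part of any KServe release tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoState:
    last_tag: str       # most recent release tag for this repo
    tagged_commit: str  # commit that last_tag points to
    head_commit: str    # current commit on the release branch


def plan_release(repos: dict[str, RepoState], new_tag: str) -> dict[str, str]:
    """Pick a tag per repo: new_tag only if the code changed since last_tag."""
    return {
        name: new_tag if state.head_commit != state.tagged_commit else state.last_tag
        for name, state in repos.items()
    }


# Hypothetical state: only modelmesh has commits after its v0.11.1 tag.
repos = {
    "modelmesh": RepoState("v0.11.1", "aaa111", "bbb222"),
    "modelmesh-runtime-adaptor": RepoState("v0.11.1", "ccc333", "ccc333"),
    "rest-proxy": RepoState("v0.11.1", "ddd444", "ddd444"),
}
plan = plan_release(repos, "v0.11.2")
print(plan)
```

In this example only modelmesh would get the v0.11.2 tag, and the other two repos would keep their v0.11.1 images.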

vaibhavjainwiz added the bug (Something isn't working) label Jan 31, 2024
@Jooho
Contributor

Jooho commented Jan 31, 2024

@vaibhavjainwiz If we don't set a new tag for modelmesh images that didn't change at all, will that solve the issue of all runtime pods being rolled out when modelmesh-serving is upgraded?
@ckadner could you please review this ticket?

rafvasq added the enhancement (New feature or request) label Jan 31, 2024
@vaibhavjainwiz
Author

> @vaibhavjainwiz If we don't set a new tag for modelmesh images that didn't change at all, will that solve the issue of all runtime pods being rolled out when modelmesh-serving is upgraded?

Yes. If we don't build a new image for a ModelMesh repo whose code hasn't changed since the last build, this issue would be resolved.

@ckadner
Member

ckadner commented Feb 14, 2024

> I propose that every repo be released independently, and only if there is a code change.
> For example, in the next release cycle, a new tag for the modelmesh-runtime-adaptor repo should be created only if there are changes.

There usually are code changes in at least one of the 3 modelmesh repos for each release. At the very least we have a few security fixes. If one of the images changes, a new deployment would get rolled out.

We could make some changes to the MM controller to stagger the update/rollout over time to reduce the resource spike, e.g. one runtime kind at a time.
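
One way to picture the staggered rollout idea is the sketch below, which updates one runtime kind at a time and waits for each batch before moving on. It is not the MM controller's implementation: `group_by_kind`, `staggered_rollout`, and the `update` and `wait_until_ready` callbacks are hypothetical, and the runtime names and kinds are placeholders.

```python
from collections import defaultdict
from typing import Callable, Iterable


def group_by_kind(runtimes: Iterable[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (runtime_name, kind) pairs into {kind: [runtime_name, ...]}."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for name, kind in runtimes:
        grouped[kind].append(name)
    # Sort kinds and names so the rollout order is deterministic.
    return {kind: sorted(names) for kind, names in sorted(grouped.items())}


def staggered_rollout(
    runtimes: Iterable[tuple[str, str]],
    update: Callable[[str], None],
    wait_until_ready: Callable[[list[str]], None],
) -> list[str]:
    """Update one runtime kind at a time; return the order kinds were rolled out."""
    order = []
    for kind, names in group_by_kind(runtimes).items():
        for name in names:
            update(name)          # hypothetical: apply the new sidecar images
        wait_until_ready(names)   # hypothetical: block until this batch is healthy
        order.append(kind)
    return order


# Placeholder runtimes; the callbacks only record what would happen.
events = []
order = staggered_rollout(
    [("rt-a", "kind-1"), ("rt-b", "kind-2"), ("rt-c", "kind-1")],
    update=lambda name: events.append(("update", name)),
    wait_until_ready=lambda names: events.append(("ready", tuple(names))),
)
print(order)
```

Only one kind's pods restart at any given time, which spreads the resource spike across the rollout instead of concentrating it at the moment of the upgrade.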

ckadner removed the bug (Something isn't working) label Feb 19, 2024