feat(api): Split UserContainerLimitRequestFactor into two separate config values #367

deadlycoconuts · 2024-01-16T09:41:13Z

Context

PR #146 introduced a new config value UserContainerLimitRequestFactor that allows the resource limits of any pod deployed by the Turing API server to be set as the resource request value decided by the user, multiplied by the value of UserContainerLimitRequestFactor. However, this single factor applies to both CPU and memory resources, so it does not offer much flexibility in adjusting the two resource limit values independently.

In particular, we would like to allow CPU resource limit values to be set as 0, which would represent no CPU limits being applied to pods deployed by the Turing API server. Doing so would require splitting the UserContainerLimitRequestFactor config value into two separate variables - UserContainerCPULimitRequestFactor and UserContainerMemoryLimitRequestFactor.

These two values serve pretty much the same function as the original UserContainerLimitRequestFactor config value, with the additional ability to, when set as 0, not apply any CPU or memory resource limits to the pod that's being deployed.
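Read as a formula, the behaviour described above is roughly the following. The symbols are notation introduced here for illustration, not names from the code base: $f_r$ stands for the factor configured for resource $r$.

```latex
\text{limit}_r =
\begin{cases}
f_r \cdot \text{request}_r & \text{if } f_r > 0 \\
\text{unset (no limit applied)} & \text{if } f_r = 0
\end{cases}
\qquad r \in \{\text{CPU}, \text{memory}\}
```

For example, assuming a memory request of 512Mi and a memory factor of 2, the memory limit would be 1024Mi; with a CPU factor of 0, the pod would carry a CPU request but no CPU limit.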

This PR contains changes similar to what is implemented for Merlin in caraml-dev/merlin#519.

Modifications

- Refactoring of UserContainerLimitRequestFactor into UserContainerCPULimitRequestFactor and UserContainerMemoryLimitRequestFactor
- Introduction of conditional blocks to not apply CPU or memory limits if any of the corresponding values mentioned above are set as 0
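The conditional blocks mentioned above could look something like the following minimal sketch. It is not the code from api/turing/cluster/kubernetes_service.go: the `limitFactors` type, the `buildLimits` function, and the use of plain integers (millicores and MiB) instead of Kubernetes resource quantities are all assumptions made to keep the example self-contained.

```go
package main

import "fmt"

// limitFactors is a hypothetical stand-in for the two new config values.
type limitFactors struct {
	CPU    float64 // stands in for UserContainerCPULimitRequestFactor
	Memory float64 // stands in for UserContainerMemoryLimitRequestFactor
}

// buildLimits derives limits from requests. A resource whose factor is 0 is
// left out of the returned map, which models "no limit applied".
// CPU is in millicores and memory in MiB to keep the arithmetic simple;
// fractional results are truncated, which is an assumption of this sketch.
func buildLimits(cpuMilli, memMiB int64, f limitFactors) map[string]int64 {
	limits := map[string]int64{}
	if f.CPU != 0 {
		limits["cpu"] = int64(float64(cpuMilli) * f.CPU)
	}
	if f.Memory != 0 {
		limits["memory"] = int64(float64(memMiB) * f.Memory)
	}
	return limits
}

func main() {
	// A CPU factor of 0 drops the CPU limit; the memory limit is twice the request.
	limits := buildLimits(500, 512, limitFactors{CPU: 0, Memory: 2})
	_, hasCPU := limits["cpu"]
	fmt.Println(hasCPU, limits["memory"])
}
```

Omitting the key entirely, rather than writing a zero value, mirrors how a Kubernetes container spec expresses "no limit": the resource is simply absent from the limits list.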

Minor Changes

- Removal of some outdated golang linters

.golangci.yml

api/turing/cluster/kubernetes_service.go

leonlnj

thanks, lgtm!

deadlycoconuts · 2024-01-25T08:11:52Z

Thanks for the review! Merging this now! 🚀

…er (#519)

# Description

In order to allow the resource limits of any pod deployed by the Merlin API server to be scaled automatically as a factor of the resource request value initially decided by the user, this PR introduces two new config values, `UserContainerCPULimitRequestFactor` and `UserContainerMemoryLimitRequestFactor`, which will be used to determine the resource limit when multiplied by the corresponding resource request value.

In addition, this PR also allows the CPU resource limit values to be set as `0` when these new config values are set as `0`, which would represent no resource limits being applied to pods deployed by the Merlin API server.

**This PR contains changes similar to what is implemented for Turing in caraml-dev/turing#367.**

🚨 **EDIT**: KServe automatically sets default resource request and limit values when they are not specified in the KService inference service specification. However, these default (e.g. CPU) values cannot be configured manually or removed (they are actually [hardcoded](https://github.com/kserve/kserve/blob/c254e704c3719d3310391ee994123d5b49588e5d/pkg/apis/serving/v1beta1/inference_service_defaults.go#L35) in the controller code itself), and as a (hopefully temporary) workaround to this, **this PR introduces another new configuration value** `UserContainerCPUDefaultLimit` **that specifies the default CPU limit of any pods deployed by the Merlin API server, when** `UserContainerCPULimitRequestFactor` **is set to** `0`. This ensures that KServe does not impose its own default CPU limit values on them.

# Modifications

- Introduction of the `UserContainerCPULimitRequestFactor` and `UserContainerMemoryLimitRequestFactor` config values
- Introduction of conditional blocks to not apply CPU or memory limits if any of the corresponding values mentioned above are set as `0`

# Tests

- Ensure existing models redeployed have their resource limit values correctly set as a factor of their resource request values

# Checklist

- [x] Added PR label
- [x] Added unit test, integration, and/or e2e tests
- [ ] Tested locally
- [ ] Updated documentation
- [ ] Update Swagger spec if the PR introduces API changes
- [ ] Regenerated Golang and Python client if the PR introduces API changes

# Release Notes

```release-note
NONE
```
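The Merlin-side workaround described in the commit message above can be sketched as follows. This is only an illustration under stated assumptions: the function name `cpuLimitMilli` is hypothetical, and the default limit is modelled as an integer number of millicores, whereas the real `UserContainerCPUDefaultLimit` value may well be expressed differently (for instance as a Kubernetes quantity string).

```go
package main

import "fmt"

// cpuLimitMilli returns the CPU limit in millicores.
// factor stands in for UserContainerCPULimitRequestFactor and
// defaultLimitMilli for UserContainerCPUDefaultLimit (both hypothetical
// representations). When the factor is 0, the configured default is used
// so that KServe has no reason to apply its own hardcoded default.
func cpuLimitMilli(requestMilli int64, factor float64, defaultLimitMilli int64) int64 {
	if factor == 0 {
		return defaultLimitMilli
	}
	return int64(float64(requestMilli) * factor)
}

func main() {
	fmt.Println(cpuLimitMilli(500, 0, 8000)) // factor 0: falls back to the default
	fmt.Println(cpuLimitMilli(500, 2, 8000)) // factor 2: twice the request
}
```

Note that this differs from the Turing behaviour in this PR, where a factor of 0 means the limit is not set at all.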

Split UserContainerLimitRequestFactor into two separate config values

d5c9e6a

deadlycoconuts self-assigned this Jan 16, 2024

Set UserContainerCPULimitRequestFactor as 0 in unit tests

a1de1a4

deadlycoconuts force-pushed the split_user_container_limit_request_factor branch from a70e3aa to a1de1a4 January 16, 2024 10:35

deadlycoconuts added 2 commits January 17, 2024 11:57

Fix lint comment to shorten line length

09b4661

Remove outdated golangci linters

068818a

deadlycoconuts force-pushed the split_user_container_limit_request_factor branch from 99b7793 to 8c71d73 January 17, 2024 06:51

Make setting of resource limits conditional

8eb734c

deadlycoconuts force-pushed the split_user_container_limit_request_factor branch from 8c71d73 to 8eb734c January 17, 2024 08:37

deadlycoconuts changed the title ~~feat: Split UserContainerLimitRequestFactor into two separate config values~~ feat(api): Split UserContainerLimitRequestFactor into two separate config values Jan 17, 2024

deadlycoconuts added the enhancement label Jan 17, 2024

deadlycoconuts commented Jan 17, 2024

.golangci.yml

deadlycoconuts commented Jan 17, 2024

api/turing/cluster/kubernetes_service.go (outdated)

deadlycoconuts mentioned this pull request Jan 17, 2024

feat(api): Introduce CPU and memory limit request factors to API server caraml-dev/merlin#519

Merged

6 tasks

deadlycoconuts requested review from ariefrahmansyah, leonlnj and tkpd-hafizhan January 17, 2024 09:54

deadlycoconuts marked this pull request as ready for review January 17, 2024 09:54

Increase defaultMemoryLimitRequestFactor to 2

b08c568

deadlycoconuts force-pushed the split_user_container_limit_request_factor branch from 2cf892f to b08c568 January 23, 2024 10:41

leonlnj approved these changes Jan 25, 2024

deadlycoconuts merged commit 189801b into caraml-dev:main Jan 25, 2024
12 checks passed

deadlycoconuts mentioned this pull request Feb 1, 2024

fix(image-builder):Refactor ensembler image naming convention #368

Merged

deadlycoconuts deleted the split_user_container_limit_request_factor branch February 1, 2024 07:38
