How to block auto-scaling in a specific situation #15375

Open
jinholee-makinarocks opened this issue Jul 5, 2024 · 1 comment
Labels
kind/question Further information is requested

Comments

@jinholee-makinarocks

Ask your question here:

We use Knative and KServe in our product to provide inference services with auto-scaling. In some cases, we need to pause auto-scaling to enforce multi-tenant resource limits.
For example, we have two tenants that are each allocated the same amount of resources:

  • 1st tenant: CPU 10 cores, memory 10 GiB
  • 2nd tenant: CPU 10 cores, memory 10 GiB

If the 2nd tenant deploys many objects, such as InferenceService or Service, it can exhaust its own resources.
In this situation, we have to block auto-scaling to prevent the 2nd tenant from consuming the 1st tenant's resources.

Could you suggest an approach for solving this?

jinholee-makinarocks added the kind/question label Jul 5, 2024
@skonto
Contributor

skonto commented Jul 11, 2024

Hi @jinholee-makinarocks. I suspect you could map your tenant concept to one or more namespaces and apply a quota per namespace that reflects the total amount of resources for each tenant. Knative services run in a namespace and scaling happens within that namespace, so the quota would restrict the resources a tenant can consume. Are you looking for something else?
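
The per-namespace quota idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the namespace names `tenant-1` and `tenant-2`, the quota naming scheme, and the `tenant_quota` helper are hypothetical, and the 10 CPU / 10Gi figures are taken from the question. The script builds standard Kubernetes `ResourceQuota` manifests and prints them as JSON.

```python
import json


def tenant_quota(namespace: str, cpu: str, memory: str) -> dict:
    """Build a ResourceQuota manifest capping total requests and limits in a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        # The "<namespace>-quota" name is an arbitrary choice for this sketch.
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }


# Hypothetical mapping: one namespace per tenant, each with 10 CPU and 10Gi.
quotas = [tenant_quota(ns, "10", "10Gi") for ns in ("tenant-1", "tenant-2")]

if __name__ == "__main__":
    # Print all quotas as a single JSON List object.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": quotas}, indent=2))
```

The output should be usable with something like `python quotas.py | kubectl apply -f -`, where `quotas.py` is an assumed file name. With a quota like this in place, new pods that would push a namespace past its total are rejected, so scaling in one tenant's namespace cannot draw on the other tenant's allocation. Note that once a compute quota is set, pods in that namespace generally need to declare CPU and memory requests and limits, or their creation may be rejected.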
