[Feature] User's resources quota #211
Comments
Maybe this could be solved by implementing a custom AccessChecker to limit each user's quota. I have done this.
At present, the app level is also limited, right?
I haven't seen a similar PR so far, so I created this issue.
Yes, I introduced a custom access checker to do the following operation
So I think the resource quota limitation could be implemented in a custom access checker. Please let me know if I misunderstand you.
Coincidentally, our ideas are similar :)
Maybe this is the best practice.
When will this feature be available? @zuston
I think you misunderstood me. I implemented the custom access checker to solve the problem you mentioned, and you can do something similar. I don't think I will submit this access checker to the Uniffle codebase, because it may not be general enough.
Well, I think this may actually help with user isolation, and we could try letting high-priority users use more resources. Still, I understand what you mean.
In fact, we have also implemented this and launched it. The effect is noticeable: users' resources are managed effectively, and it is easier to calculate each user's usage cost for billing. That is why I raised this issue.
User quota is OK for us. I think it's part of the work on multi-tenant user support.
I can raise a PR if needed. @jerqi
If the PR is large, you could write a design document first.
Any update? We also plan to do this. May I ask what the scope of the quota limit is? Is it per shuffle server, or for the total shuffle size? I'm thinking we could do it as a server-level quota, so this feature can work with the multiple-server feature: the shuffle write could send the remaining blocks to another server.
Currently I have no ideas on a concrete design. If you want to contribute this feature, it's better to have a simple design doc for review. @Gustfh Do you have any plans to invest time in this ticket? @smallzhongfeng
In the version used internally at our company, we use quotas to limit the number of apps that a single user can submit. I don't have much of an idea yet about limiting the number of shuffle servers that a single user can use. But I will write a simple document this weekend to discuss whether there are other requirements that could be developed in the future. @jerqi @zuston @Gustfh
Could you add some diagrams?
OK, I will add them later.
Could you give us the authority of the
+1
So it's a user-level quota. What if a single app produces a large amount of shuffle data and impacts other apps? For example, take an app that has large shuffle data and many stages and runs for days: if you enable memory storage, this app's shuffle data could live in memory for a long time. I wonder whether we should have a quota for this situation.
+1. I think a quota on the bytes used per app/Hadoop user should also be covered in the design. And I think the different quota limits, such as app number or storage bytes, could each be enabled by the user.
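To make the suggestion above concrete, here is a minimal sketch of how independently enabled limits might be combined. It is not taken from the Uniffle codebase or the design document; `QuotaType`, `QuotaPolicy`, and `allowsNewApp` are hypothetical names chosen for illustration only.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical quota kinds; only app-number and storage-bytes are mentioned in the thread.
enum QuotaType { APP_NUMBER, STORAGE_BYTES }

// Hypothetical policy object: each limit is checked only if its type is enabled.
class QuotaPolicy {
    private final Set<QuotaType> enabled;
    private final int maxApps;
    private final long maxBytes;

    QuotaPolicy(Set<QuotaType> enabled, int maxApps, long maxBytes) {
        // EnumSet.copyOf rejects an empty non-EnumSet collection, so handle that case explicitly.
        this.enabled = enabled.isEmpty()
            ? EnumSet.noneOf(QuotaType.class)
            : EnumSet.copyOf(enabled);
        this.maxApps = maxApps;
        this.maxBytes = maxBytes;
    }

    // Returns true if a user with the given current usage may start one more app.
    boolean allowsNewApp(int runningApps, long usedBytes) {
        if (enabled.contains(QuotaType.APP_NUMBER) && runningApps >= maxApps) {
            return false;
        }
        if (enabled.contains(QuotaType.STORAGE_BYTES) && usedBytes >= maxBytes) {
            return false;
        }
        return true;
    }
}
```

With only `APP_NUMBER` enabled, the byte limit is ignored entirely, which matches the idea that each kind of limit can be switched on separately.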
@smallzhongfeng If you add some extra interfaces, you should describe them in the document.
I added a simple diagram to illustrate the process of Spark's resource limitation. A more complete PR will be proposed this week.
This is a good suggestion. I am currently developing it, and it may be implemented in the next PR.
@smallzhongfeng @Gustfh @zuston Do you want to discuss this issue in a meeting? I will start a meeting to discuss issue #80, and I want to discuss this issue, too. There are some other issues we need to discuss, so I will send an email to our dev mailing list and pick a suitable date for the meeting. You can tell me by email when you are free.
Of course, I'm looking forward to it.
@Gustfh @smallzhongfeng I have already sent an email: https://lists.apache.org/thread/2jlm3fswmsxy619ldyo4px700p3ybnvc. Do you have time at 11 am (UTC+8) on Thursday this week?
The meeting link is https://meeting.tencent.com/dm/oR95wASCNe91
Yes, we are looking forward to it.
Offline Discussion Result:
### What changes were proposed in this pull request?

For issue #211 and the design document [https://docs.google.com/document/d/1MApSMFQgoS1VAoKbZjomqSRm0iTbSuKG1yvKNlWW65c/edit?usp=sharing](https://docs.google.com/document/d/1MApSMFQgoS1VAoKbZjomqSRm0iTbSuKG1yvKNlWW65c/edit?usp=sharing)

### Why are the changes needed?

Better isolation of resources between different users.

### Does this PR introduce _any_ user-facing change?

Adds config `rss.coordinator.quota.default.app.num` to set the default number of apps for each user, and `rss.coordinator.quota.default.path` to set a path to record the number of apps that each user can run.

### How was this patch tested?

Added UTs.
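As a rough illustration of the two settings named in the PR description, a coordinator configuration might contain lines like the following. The value `5` and the file path are placeholders chosen for this example, and the space-separated `key value` layout is an assumption about the coordinator's conf file; the format of the quota file itself is not described in this thread.

```
# Illustrative only: value and path are placeholders, not recommended settings
rss.coordinator.quota.default.app.num 5
rss.coordinator.quota.default.path /path/to/quota/file
```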
Closed by #311.
At present, we cannot limit a user's resources. Maybe we could manually set the number of apps each user may submit through a configuration file. When the quota is exceeded, the app will be rejected, so the number of apps per user can be used to represent resource quotas. What do you think? @jerqi
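The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Uniffle's implementation: `UserAppQuota`, `tryAdmit`, and `release` are hypothetical names, and the per-user limits are passed in as a map rather than loaded from a real configuration file.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: limits how many apps each user may run at the same time.
class UserAppQuota {
    private final int defaultAppNum;                    // limit for users without an explicit entry
    private final Map<String, Integer> perUserAppNum;   // per-user limits, e.g. parsed from a config file
    private final Map<String, Set<String>> runningApps = new HashMap<>();

    UserAppQuota(int defaultAppNum, Map<String, Integer> perUserAppNum) {
        this.defaultAppNum = defaultAppNum;
        this.perUserAppNum = new HashMap<>(perUserAppNum);
    }

    // Returns true if the app is admitted, false if the user's quota is exceeded.
    synchronized boolean tryAdmit(String user, String appId) {
        Set<String> apps = runningApps.computeIfAbsent(user, u -> new HashSet<>());
        if (apps.contains(appId)) {
            return true; // already admitted; do not count it twice
        }
        int limit = perUserAppNum.getOrDefault(user, defaultAppNum);
        if (apps.size() >= limit) {
            return false; // quota exceeded: reject the app
        }
        apps.add(appId);
        return true;
    }

    // Frees the slot when an app finishes, so the user can submit another one.
    synchronized void release(String user, String appId) {
        Set<String> apps = runningApps.get(user);
        if (apps != null) {
            apps.remove(appId);
        }
    }
}
```

Updating a user's entry in the limits map would then play the role of the manual configuration change mentioned above.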