CLOUDSTACK-10006: Internal DRS-like load balancing implementation for Vmware #2189
Hi @nvazquez, thanks for this great feature. Have you planned on adding Marvin tests?
@blueorangutan package
@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-808
@blueorangutan test
@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-1212)
Thanks @borisstoyanov, sorry for the delayed response. I'll work on a way to test this feature.
ACS CI BVT Run Summary:
Link to logs folder (search by build_no): https://www.dropbox.com/sh/r2si930m8xxzavs/AAAzNrnoF1fC3auFrvsKo_8-a?dl=0
Failed tests:
Skipped tests:
Passed test suites:
@nvazquez some pieces of this code look a little familiar to me :) Your initiative is great. From what I saw in the code of this PR, the “management model” used here is similar to the one in the beta version of Autonomiccs [1], which was developed as a plugin for ACS. As I told you when you tested that solution for Autodesk, it does not scale well.

We have a dataset [2] that we use to compare our management models, and we have compared this simple solution with a more comprehensive model that we developed a while ago. It turns out that this simple management approach can cause quite a few problems in dynamic production environments. Under some conditions it performs worse than using nothing at all. The dataset and the simulation tool used to check the results are public [2], so anyone can verify them ;)

The blue bar is one of our management models, which was presented at the IEEE SERVICES conference at the beginning of this year; the red line is the simple management model we made available at [1], which is similar to the one being introduced here; the yellow line is the imbalance of the cloud environment when nothing is used at all (relying only on the allocation algorithm; first fit was used).

This figure presents the RAM imbalance over time in the dataset.

This figure presents the CPU imbalance over time in the dataset.

The blue bar, which is one of our management models, performs far better because it is a multi-dimensional management model. The message I want to send here is the following: use this solution in your production cloud environment with caution.

[1] https://github.com/Autonomiccs/autonomiccs-platform
@rafaelweingartner somehow I missed your last comment, sorry for the late response :). I'll close this PR temporarily.
JIRA TICKET: https://issues.apache.org/jira/browse/CLOUDSTACK-10006
Introduction
One of the most useful features provided by VMware is DRS (Distributed Resource Scheduler), whose main job is to balance workload within clusters when needed (given a migration threshold). However, this feature is only available with Enterprise Plus licenses. We would like to provide a similar feature internal to CloudStack for those VMware licenses that do not include DRS.
Usage
This feature is disabled by default. It can be activated by switching the global setting vmware.drs.internal.enabled to true (management server restart is required).
When the feature is active, it sets up a thread per cluster which executes every vmware.drs.interval seconds:
vmware.drs.internal.enabled = false -> no action performed on cluster
vmware.drs.internal.enabled = true:
vmware.drs.threshold:
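The periodic per-cluster check described above could be sketched roughly as follows. This is only an illustration and not the code in this PR: the class InternalDrsSketch, the methods runOnce and start, and the cluster names are hypothetical, the settings are passed in as plain values instead of being read from CloudStack, and the balancing step itself is left as a callback. It also uses a shared scheduled pool sized to the number of clusters as a stand-in for one thread per cluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from the PR.
public class InternalDrsSketch {

    // One pass for one cluster: act only when the enabled flag is true for it.
    static boolean runOnce(String cluster, Predicate<String> drsEnabled, Consumer<String> balance) {
        if (!drsEnabled.test(cluster)) {
            return false; // vmware.drs.internal.enabled = false -> no action
        }
        balance.accept(cluster); // placeholder for the actual balancing logic
        return true;
    }

    // Schedules one recurring task per cluster, every intervalSeconds
    // (standing in for vmware.drs.interval).
    static ScheduledExecutorService start(List<String> clusters, long intervalSeconds,
                                          Predicate<String> drsEnabled, Consumer<String> balance) {
        ScheduledExecutorService pool =
                Executors.newScheduledThreadPool(Math.max(1, clusters.size()));
        for (String cluster : clusters) {
            pool.scheduleAtFixedRate(() -> runOnce(cluster, drsEnabled, balance),
                    0, intervalSeconds, TimeUnit.SECONDS);
        }
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService pool = start(
                Arrays.asList("cluster-1", "cluster-2"), 1,
                c -> c.equals("cluster-1"),                  // only cluster-1 is enabled
                c -> System.out.println("balancing " + c));
        Thread.sleep(2500);
        pool.shutdownNow();
    }
}
```

Checking the flag inside each run, and not only at scheduling time, means a cluster whose flag is false is skipped without any change to the schedule.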
:Good migration search works like this: