FEATURE-3823: kvm agent hooks #3839
@bwsw can you add documentation too, on how to use it? |
@svenvogel I'll do it for sure, I would just like to do it after there are no more design concerns about the PR and it is approved for merge. As far as I know, there is no specific place for documentation like that. Where would we like to have it? |
@nvazquez this resembles something you did. Can you have a look? |
Packaging result: ✖centos6 ✔centos7 ✔debian. JID-674 |
I tried this feature and it provided a solution for my use case (mapping an SRIOV PCI device to a KVM host). Is this going to be merged to master anytime soon? |
@poussa Wow! Great you have found this useful. We hoped that some other interested users exist. Hope it will be merged as well. |
Hi @bwsw, I find the XML use case pretty similar to what's been introduced by this PR: #3510. The start and stop VM hooks look good. In PR #3610 I'm working on adding hook scripts for rolling maintenance. Similarly to this PR, an admin can define their own scripts and has to specify a directory where to find these scripts in the agent.properties file. Perhaps this property can be shared between both PRs. |
@DaanHoogland @rhtyd Read that discussion: E.g. Case 1. In your PR there is no way to select any of the available NVMe drives installed in the platform during VM deployment and attach it to the VM using some advanced logic. Case 2. When a VM is deployed, for every CS account, I create a VXLAN unmanaged by CS. I create a bridge and would like to add a NIC into the configuration dynamically. It's impossible to implement with your PR. Case 3. I would like to find any available Quadro RTX 4000 GPU and pass it into the VM. Case 4. Read Sakari Poussa's request from the mailing list:
None of those cases are possible with the PR you mention. To sum up:
The PR is ready to be merged from my point of view. I suppose it's better to merge it to avoid diff accumulation. |
@bwsw I'm happy to put some effort into getting this merged. Your code looks neat, even though I don't understand it fully yet. A few things are putting me off in your latest reply. Please have patience with me?
I imagine you mean this PR, right?
What PR is that? Not "who's!", can you please refer to bits of code and not people? I can see some frustration in this comment and I have been involved in this conflict between private clouds and public clouds a lot. My question is genuine. On a side note, it's all Apache's PRs ;)
(what PR?) can you add/explain how you are adding that info? I would expect some kind of dialog going back to the MS to provide the user with a list of available resources but can not see that in the code.
So how do you guarantee the same vxlan's aren't also used by ACS? (And also, parallel to case 1 above, how is the vxlan id fed back into ACS)
In this case you do not want the user to add it, right? but have it automagically added?
Can you add a reference to the mail thread in lists.apache.org please? The snippet contains references to context. @svenvogel is asking for documentation above. Are you already using this in production? Can you add a PR to https://github.com/apache/cloudstack-documentation, please? @weizhouapache , @GabrielBrascher , @kiwiflyer thoughts/reviews? |
@bwsw can I kindly join the effort on asking for some documentation - at the bare minimum I would like to see an example of how to use this feature (some specific, very-simple scenario) here on the PR description - so we can test it - i.e. I can't test it by going through the code as I'm not a developer myself, but would like to help testing it. i.e. expand on "How Has This Been Tested" section please. thx |
This is a very simple hook which utilizes the stop and transform cases:

    // These hooks add a temporary high-speed device to serve as a local VM cache device.
    // Simple, error-prone implementation.
    package groovy

    import groovy.util.XmlParser
    import groovy.util.Node
    import groovy.xml.XmlUtil

    class CacheDeviceAdder {
        def highSpeedDir = "/mnt/nvme0/"
        // libvirt reports <memory> in KiB by default, so KiB -> GiB is 1024 * 1024
        final def ramGBDivider = 1024 * 1024

        Node diskXmlSpec(String swapFile) {
            return new XmlParser().parseText(
                " <disk type='file' device='disk'>\n" +
                " <driver name='qemu' type='qcow2' cache='none'/>\n" +
                " <source file='$swapFile'/>\n" +
                " <target dev='vdb' bus='ide'/>\n" +
                " <alias name='ide0-0-1'/>\n" +
                " <address type='drive' controller='0' bus='0' target='0' unit='1'/>\n" +
                " </disk>")
        }

        String transform(Object logger, String xml) {
            def vmDef = new XmlParser().parseText(xml)
            // get VM RAM amount in whole GiB (at least 1)
            def memory = Math.max(1, Integer.parseInt(vmDef.memory.text().trim()).intdiv(ramGBDivider))
            // get VM name
            def name = vmDef.name.text()
            def swapFile = highSpeedDir + name + ".qcow2"
            // remove swap device if it exists; wait so it cannot race with the create below
            "rm -f ${swapFile}".execute().waitFor()
            def volCmd = "qemu-img create -f qcow2 ${swapFile} ${memory}G"
            // create new swap device and wait until the image exists
            volCmd.execute().waitFor()
            def diskSpec = diskXmlSpec(swapFile)
            // update XML definition
            vmDef.devices[0].append(diskSpec)
            // return new XML definition
            return XmlUtil.serialize(vmDef)
        }

        Object stop(Object logger, String vmName) {
            def swapFile = highSpeedDir + vmName.toString() + ".qcow2"
            // remove unused swap device
            "rm -f ${swapFile}".execute().waitFor()
            return null
        }

        Object start(Object logger, String vmName) {
            return null
        }
    }

    new CacheDeviceAdder()

It's very primitive and skips many checks, but it gives a general idea. Again, I'll add the documentation once somebody wants to approve the idea. Right now there is only this very simple documentation:

    package groovy

    class AnyNameNoMatter {
        String method(Object logger, String xml) {
            // your code
            // onStart/onStop hooks: return null instead
            // XML transformation hook: return the (possibly modified) XML
            return xml
        }
    }

    new AnyNameNoMatter()

@blueorangutan package |
Sure, they must be enabled/disabled by the person who manages the agent after it was connected to the CloudStack, the same way, say, you change RNG and activate watchdog. ACS doesn't do that, you do it after the agent is connected and initial configuration is produced by the agent deployment/integration script. |
code looks good, some style remarks, no blockers.
...ypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
.../kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtStartCommandWrapper.java
.../kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtStartCommandWrapper.java
...s/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtStopCommandWrapper.java
...ypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
@weizhouapache did you test and can you approve? If so, I think we can merge... |
@DaanHoogland @weizhouapache please give me one or two days to complete the refactoring Daan proposed. |
@bwsw freeze is planned for Friday end of day. |
tested. start/stop vm works fine.
Did not test the agent hook scripts.
…ature/3823-kvm-agent-hooks
Conflicts:
plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper/LibvirtStartCommandWrapper.java
code lgtm
@blueorangutan package |
@DaanHoogland a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1048 |
@blueorangutan test |
@DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
Waiting for tests results, and if all good, will merge regardless of the freeze date (as approvals and everything else is good atm) |
@blueorangutan package |
@andrijapanicsb a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1051 |
@blueorangutan test |
@andrijapanicsb a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
Trillian test result (tid-1242)
Trillian test result (tid-1244)
Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1052 |
Description
The PR introduces a new KVM agent extension interface in the form of hooks. Every hook is implemented in Groovy, like the examples above.
Three hooks are implemented now:
XML transformer, which is called right before the VM is launched and allows modifying its XML definition (example above);
onStart and onStop hooks, which are called right after the VM state changes to started/stopped.
Every hook is run inside try {} catch (Exception e) {}, so it cannot cause agent misbehavior. If hooks are not defined, they are skipped.
Also, the PR adds initial support for GitLab CI/CD for those who do WIPs with GitLab rather than GitHub.
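For anyone who wants to picture the guard described above without reading the diff, here is a minimal Java sketch. It is not the PR's actual code: `GuardedHooks`, `transformXml` and `runLifecycleHook` are hypothetical names, the hooks are modelled as plain functional interfaces rather than loaded Groovy scripts, and falling back to the original XML when the transformer fails or returns null is an assumption.

```java
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Hypothetical helper illustrating "skip if undefined, never let a hook break the agent".
final class GuardedHooks {
    private GuardedHooks() {}

    // XML transformer hook: returns the hook's result, or the untouched XML
    // when the hook is not defined, throws, or returns null (assumed fallback).
    static String transformXml(UnaryOperator<String> hook, String xml) {
        if (hook == null) {
            return xml; // hook not defined: skipped
        }
        try {
            String result = hook.apply(xml);
            return result != null ? result : xml;
        } catch (Exception e) {
            System.err.println("XML transformer hook failed: " + e);
            return xml;
        }
    }

    // onStart/onStop style hook: returns true only if the hook was defined
    // and completed without throwing.
    static boolean runLifecycleHook(Consumer<String> hook, String vmName) {
        if (hook == null) {
            return false; // hook not defined: skipped
        }
        try {
            hook.accept(vmName);
            return true;
        } catch (Exception e) {
            System.err.println("Lifecycle hook failed for " + vmName + ": " + e);
            return false;
        }
    }
}
```

With this shape, a failing hook costs a log line and the VM still starts with its original definition, which matches the "cannot cause agent misbehavior" intent stated above.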
Initial RFC/Proposal: #3823
Types of changes
Screenshots (if appropriate):
How Has This Been Tested?
Tested in unit tests and manually for master.