This repository has been archived by the owner on May 12, 2021. It is now read-only.

ppc64le: Hot addition of vCPUs to Kata Container fails on ppc64le #1155

Closed
nitkon opened this issue Jan 20, 2019 · 8 comments

Comments

@nitkon
Contributor

nitkon commented Jan 20, 2019

Description of problem

Hot addition of vCPUs to Kata Container fails on ppc64le

Expected result

docker run  --runtime kata-runtime --rm --cgroup-parent 0 --cpus 2 debian bash
*No error*

Actual result

 docker run  --runtime kata-runtime --rm --cgroup-parent 0 --cpus 2 debian bash
docker: Error response from daemon: OCI runtime create failed: failed to hot add vCPUs: only 0 vCPUs of 2 were added: unknown.


kata-collect-data.sh output:

kata-collect-data-HotAdd.log

@nitkon
Contributor Author

nitkon commented Jan 23, 2019

Previously, memory and CPU hotplug on IBM Power Systems required installing packages such as powerpc-utils, ppc64_diag and librtas in the VM. Recent kernel versions, however, include an in-kernel hotplug feature that works without those packages.

I am planning to bump the kernel version to 4.19 for ppc64le. Is there anything specific that needs to be done or enabled in the kernel config file for hotplug to work?
/cc @jodh-intel @grahamwhaley @jcvenegas

@grahamwhaley
Contributor

Hi @nitkon - there might be clues in the config fragments over in the kata-containers/packaging#314 PR. A quick grep for 'PLUG' shows items in the ACPI and PCI domains, but that may be quite different for your architecture ;-) There is of course CONFIG_MEMORY_HOTPLUG.
On the general 4.19 item, also see #1111.

@nitkon
Contributor Author

nitkon commented Jan 27, 2019

Even after bumping to kernel 4.19, I see the following error in the debug log:

Jan 27 20:49:57 soe13 kata-runtime[37967]: time="2019-01-27T20:49:57.269463757+05:30" level=info msg="HOTPLUGGABLE VCPUS are !(EXTRA []qemu.HotpluggableCPU=[{host-spapr-cpu-core 1 {0 0 19 0} } {host-spapr-cpu-core 1 {0 0 18 0} } {host-spapr-cpu-core 1 {0 0 17 0} } {host-spapr-cpu-core 1 {0 0 16 0} } {host-spapr-cpu-core 1 {0 0 15 0} } {host-spapr-cpu-core 1 {0 0 14 0} } {host-spapr-cpu-core 1 {0 0 13 0} } {host-spapr-cpu-core 1 {0 0 12 0} } {host-spapr-cpu-core 1 {0 0 11 0} } {host-spapr-cpu-core 1 {0 0 10 0} } {host-spapr-cpu-core 1 {0 0 9 0} } {host-spapr-cpu-core 1 {0 0 8 0} } {host-spapr-cpu-core 1 {0 0 7 0} } {host-spapr-cpu-core 1 {0 0 6 0} } {host-spapr-cpu-core 1 {0 0 5 0} } {host-spapr-cpu-core 1 {0 0 4 0} } {host-spapr-cpu-core 1 {0 0 3 0} } {host-spapr-cpu-core 1 {0 0 2 0} } {host-spapr-cpu-core 1 {0 0 1 0} } {host-spapr-cpu-core 1 {0 0 0 0} /machine/unattached/device[0]}])"

Jan 27 20:49:57 soe13 kata-runtime[37967]: time="2019-01-27T20:49:57.269563625+05:30" level=info msg="{"arguments":{"core-id":"19","driver":"host-spapr-cpu-core","id":"cpu-0","socket-id":"0","thread-id":"0"},"execute":"device_add"}" arch=ppc64le command=create container=f79375f64d7aaa0533ce62e6d2100b15727e2e55ebfc52e02423123dc8996990 name=kata-runtime pid=37967 source=virtcontainers subsystem=qmp

Jan 27 20:49:57 soe13 kata-runtime[37967]: time="2019-01-27T20:49:57.272369474+05:30" level=info msg="{"error": {"class": "GenericError", "desc": "Property '.thread-id' not found"}}" arch=ppc64le command=create container=f79375f64d7aaa0533ce62e6d2100b15727e2e55ebfc52e02423123dc8996990 name=kata-runtime pid=37967 source=virtcontainers subsystem=qmp

This is probably arch-specific. How are other architectures handling it?

"thread-id": threadID,

@nitkon
Contributor Author

nitkon commented Jan 28, 2019

So, if we want to handle QMP calls differently depending on the architecture Kata is running on, where do we patch? I don't think we can make arch-specific changes to intel/govmm/qemu/qmp.go.

Example: We do not provide thread-id when hotplugging on ppc64le.
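
For illustration, one way to leave `thread-id` out of the request is to build the `device_add` arguments so that empty IDs are simply skipped. This is only a sketch under assumptions: `buildCPUDeviceAdd` and `qmpCommand` are hypothetical names, not govmm's API, and the message shape is copied from the QMP log above. The same guard would apply to any other optional property the target machine type rejects.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// qmpCommand mirrors the wire shape seen in the log above:
// {"arguments": {...}, "execute": "device_add"}.
type qmpCommand struct {
	Arguments map[string]interface{} `json:"arguments"`
	Execute   string                 `json:"execute"`
}

// buildCPUDeviceAdd is a hypothetical helper (not part of govmm).
// socketID and threadID are treated as optional: an empty string
// means "do not send this property to QEMU at all".
func buildCPUDeviceAdd(driver, cpuID, socketID, coreID, threadID string) qmpCommand {
	args := map[string]interface{}{
		"driver":  driver,
		"id":      cpuID,
		"core-id": coreID,
	}
	if socketID != "" {
		args["socket-id"] = socketID
	}
	if threadID != "" {
		args["thread-id"] = threadID
	}
	return qmpCommand{Arguments: args, Execute: "device_add"}
}

func main() {
	// Same values as the failing request in the log, minus thread-id.
	cmd := buildCPUDeviceAdd("host-spapr-cpu-core", "cpu-0", "0", "19", "")
	out, err := json.Marshal(cmd)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With this shape the caller decides which IDs to pass, so the code that talks QMP needs no per-architecture branch.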

@grahamwhaley
Contributor

Hi @nitkon - my guess would have been govmm, but it probably depends on whether the specific option is wired into govmm or whether we pass it into the govmm call....?

@jodh-intel
Contributor

/cc @markdryan

@markdryan
Contributor

We do already have some architecture-specific code in govmm for s390x. If there's no other way to solve the issue, we could introduce a ppc64le version of ExecuteCPUDeviceAdd. How would it differ from the existing implementation? Would it simply ignore the thread ID?

@nitkon
Contributor Author

nitkon commented Jan 28, 2019

@markdryan : Works on ppc64le if we do not pass threadID and socketID.

# docker run -it  --runtime kata-runtime --cpus 3 fedora bash
[root@ee74bd6ad33f /]# lscpu
Architecture:        ppc64le
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           4
NUMA node(s):        1
Model:               2.1 (pvr 004b 0201)
Model name:          POWER8 (architected), altivec supported
L1d cache:           64K
L1i cache:           32K
NUMA node0 CPU(s):   0-3

I will send a patch soon that avoids arch-specific changes in govmm.
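
As a rough sketch of how the caller side could look (this is an assumption, not the actual patch): the runtime picks which topology IDs to hand over based on the architecture, and leaves the others empty so that the QMP layer can skip them. `hotplugIDsFor` and `cpuTopologyIDs` are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// cpuTopologyIDs holds the IDs that accompany a CPU hotplug request.
// An empty string means the property should not be sent.
type cpuTopologyIDs struct {
	SocketID string
	CoreID   string
	ThreadID string
}

// hotplugIDsFor is a hypothetical helper. On ppc64le it returns only
// the core ID, following the observation above that hotplug works when
// threadID and socketID are not passed. Other architectures get all three.
func hotplugIDsFor(arch string, socket, core, thread int) cpuTopologyIDs {
	ids := cpuTopologyIDs{CoreID: fmt.Sprint(core)}
	if arch == "ppc64le" {
		return ids
	}
	ids.SocketID = fmt.Sprint(socket)
	ids.ThreadID = fmt.Sprint(thread)
	return ids
}

func main() {
	// IDs for the architecture this binary was built for.
	fmt.Printf("%+v\n", hotplugIDsFor(runtime.GOARCH, 0, 19, 0))
	// IDs as they would be chosen on ppc64le.
	fmt.Printf("%+v\n", hotplugIDsFor("ppc64le", 0, 19, 0))
}
```

Taking the architecture as a parameter, rather than reading `runtime.GOARCH` inside the helper, keeps the decision testable on any host.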

nitkon added a commit to nitkon/runtime that referenced this issue Jan 28, 2019
ppc64le qemu does not need threadID and
socketID parameters when hotplugging.

Fixes: kata-containers#1155

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
nitkon added a commit to nitkon/runtime that referenced this issue Jan 28, 2019
Revendor govmm to get the latest changes.

Fixes: kata-containers#1155

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>
nitkon added a commit to nitkon/runtime that referenced this issue Jan 28, 2019
Shortlog:

78d079d Merge pull request kata-containers#84 from nitkon/master
4692f6b qmp: Conditionally pass threadID and socketID when CPU device add
b9c8f76 Merge pull request kata-containers#85 from markdryan/fix-travis
1f51b43 Update the versions of Go used to build GoVMM
ad310f9 Fix staticcheck S1023
932fdc7 Fix staticcheck S1023
cb2ce93 Fix staticcheck S1008
f0172cd Fix staticcheck (S1002)
5f2e630 Fix staticcheck (S1025)
4beea51 Fix staticcheck (ST1005) errors

Fixes: kata-containers#1155

Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>