Commits
spapr-pci-hotp…
Name already in use
Commits on Aug 14, 2014
-
spapr_pci: emit hotplug add/remove events during hotplug
This uses extension of existing EPOW interrupt/event mechanism to notify userspace tools like librtas/drmgr to handle in-guest configuration/cleanup operations in response to device_add/device_del. Userspace tools that don't implement this extension will need to be run manually in response/advance of device_add/device_del, respectively. Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_events: event-scan RTAS interface
We don't actually rely on this interface to surface hotplug events, and instead rely on the similar-but-interrupt-driven check-exception RTAS interface used for EPOW events. However, the existence of this interface is needed to ensure guest kernels initialize the event-reporting interfaces which will in turn be used by userspace tools to handle these events, so we implement this interface as a stub. Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_events: re-use EPOW event infrastructure for hotplug events
This extends the data structures currently used to report EPOW events to gets via the check-exception RTAS interfaces to also include event types for hotplug/unplug events. This is currently undocumented and being finalized for inclusion in PAPR specification, but we implement this here as an extension for guest userspace tools to implement (existing guest kernels simply log these events via a sysfs interface that's read by rtas_errd). Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_pci: enable basic hotplug operations
This enables hotplug for PHB bridges. Upon hotplug we generate the OF-nodes required by PAPR specification and IEEE 1275-1994 "PCI Bus Binding to Open Firmware" for the device. We associate the corresponding FDT for these nodes with the DrcEntry corresponding to the slot, which will be fetched via ibm,configure-connector RTAS calls by the guest as described by PAPR specification. The FDT is cleaned up in the case of unplug. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commits on Aug 13, 2014
-
pci: allow 0 address for PCI IO regions
Some kernels program a 0 address for io regions. PCI 3.0 spec section 6.2.5.1 doesn't seem to disallow this. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_pci: add ibm,configure-connector RTAS interface
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_pci: add get-sensor-state RTAS interface
Signed-off-by: Mike Day <ncmike@ncultra.org> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_pci: add get/set-power-level RTAS interfaces
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_pci: add set-indicator RTAS interface
Signed-off-by: Mike Day <ncmike@ncultra.org> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr: add helper to retrieve a PHB/device DrcEntry
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr_pci: populate DRC dt entries for PHBs
Reserve 32 entries of type PCI in each PHB's initial FDT. This advertises to guests that each PHB is DR-capable device with physical hotpluggable slots. This is necessary for allowing hotplugging of devices to it later via bus rescan or guest rpaphp hotplug module. Each entry is assigned a name of "Slot <<bus_idx>*32 +1>", advertised as a hotpluggable PCI slot, and assigned to power domain -1 to indicate to the guest that power management is handled by the hardware. This models a DR-capable PCI expansion device attached to a host/lpar via a single PHB with 32 physical hotpluggable slots (as opposed to a virtual bridge device with external management console). Hotplug will be handled by the guest via bus rescan or the rpaphp hotplug module. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
-
spapr: populate DRC entries for root dt node
This add entries to the root OF node to advertise our PHBs as being DR-capable in according with PAPR specification. Each PHB is given a name of PHB<bus#>, advertised as a PHB type, and associated with a power domain of -1 (indicating to guests that power management is handled automatically by hardware). We currently allocate entries for up to 32 DR-capable PHBs, though this limit can be increased later. DrcEntry objects to track the state of the DR-connector associated with each PHB are stored in a 32-entry array, and each DrcEntry has in turn have a dynamically-sized number of child DR-connectors, which we will use later to track the state of DR-connectors associated with a PHB's physical slots. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commits on Aug 12, 2014
-
Fix the check for carry in the srad helper to properly construct the mask -- a "1ULL" must be used (instead of "1") in order to get the desired result. Example: R3 8000000000000000 R4 F3511AD4A2CD4C38 srad 3,3,4 Should *not* set XER[CA] but does without this patch. Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
For 64 bit implementations, the special case of a shift by zero should result in the sign extension of the least significant 32 bits of the source GPR (not a direct copy of the 64 bit source GPR). Example: R3 A6212433228F41DC srawi 3,3,0 R3 expected : 00000000228F41DC R3 actual : A6212433228F41DC (without this patch) Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
target-ppc: Bug Fix: mulldo OV Detection
Fix the code to properly detect overflow; the 128 bit signed product must have all zeroes or all ones in the first 65 bits otherwise OV should be set. Example: R3 45F086A5D5887509 R4 0000000000000002 mulldo 3,3,4 Should set XER[OV]. Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
For 64-bit implementations, the mullw result is the 64 bit product of the sign-extended least significant 32 bits of the source registers. Fix the code to properly sign extend the source operands and produce a 64 bit product. Example: R3 00000000002F37A0 R4 41C33D242F816715 mullw 3,3,4 R3 expected : 0008C3146AE0F020 R3 actual : 000000006AE0F020 (without this patch) Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
On 64-bit implementations, the mullwo result is the 64 bit product of the signed 32 bit operands. Fix the implementation to properly deposit the upper 32 bits into the target register. Example: R3 0407DED115077586 R4 53778DF3CA992E09 mullwo 3,3,4 R3 expected : FB9D02730D7735B6 R3 actual : 000000000D7735B6 (without this patch) Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
The rlwimi specification includes the ROTL32 operation, which is defined to be a left rotation of two copies of the least significant 32 bits of the source GPR. The current implementation is incorrect on 64-bit implementations in that it rotates a single copy of the least significant 32 bits, padding with zeroes in the most significant bits. Fix the code to properly implement this ROTL32 operation. Also fix the special case of MB=31 and ME=0 to copy the entire contents of the source GPR. Examples: R3 FFFFFFFFFFFFFFF0 rlwimi 3,3,29,14,1 R3 expected : 1FFFFFFE3FFFFFFE R3 actual : 000000003FFFFFFE (without this patch) R3 ED7EB4DD824F0853 rlwimi 3,3,10,31,0 R3 expected : 3C214E09024F0853 R3 actual : 00000000024F0853 (without this patch) Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
The rlwnm specification includes the ROTL32 operation, which is defined to be a left rotation of two copies of the least significant 32 bits of the source GPR. The current implementation is incorrect on 64-bit implementations in that it rotates a single copy of the least significant 32 bits, padding with zeroes in the most significant bits. Fix the code to properly implement this ROTL32 operation. Example: R3 = 0000000000000002 R4 = 7FFFFFFFFFFFFFFF rlwnm 3,3,4,31,16 R3 expected : 0000000100000001 R3 actual : 0000000000000001 (without this patch) Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
The rlwinm specification includes the ROTL32 operation, which is defined to be a left rotation of two copies of the least significant 32 bits of the source GPR. The current implementation is incorrect on 64-bit implementations in that it rotates a single copy of the least significant 32 bits, padding with zeroes in the most significant bits. Fix the code to properly implement this ROTL32 operation. Example: R3 = F7487D82EC6F75DF rlwinm 3,3,5,12,4 R3 expected : 8DEEBBFD880EBBFD R3 actual : 00000000880EBBFD (without this fix) Signed-off-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
ppc/spapr: Fix MAX_CPUS to 255
MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This MAX_CPUS number was percolated back to "virsh capabilities" with wrong max_cpus. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
ppc: Add hw breakpoint watchpoint support
This patch adds hardware breakpoint and hardware watchpoint support for ppc. On BOOKE architecture we cannot share debug resources between QEMU and guest because: When QEMU is using debug resources then debug exception must be always enabled. To achieve this we set MSR_DE and also set MSRP_DEP so guest cannot change MSR_DE. When emulating debug resource for guest we want guest to control MSR_DE (enable/disable debug interrupt on need). So above mentioned two configuration cannot be supported at the same time. So the result is that we cannot share debug resources between QEMU and Guest on BOOKE architecture. In the current design QEMU gets priority over guest, this means that if QEMU is using debug resources then guest cannot use them and if guest is using debug resource then qemu can overwrite them. When QEMU is not able to handle debug exception then we inject program exception to guest. Yes program exception NOT debug exception and the reason is: 1) QEMU and guest not sharing debug resources 2) For software breakpoint QEMU uses a ehpriv-1 instruction; So there cannot be any reason that we are in qemu with exit reason KVM_EXIT_DEBUG for guest set debug exception, only possibility is guest executed ehpriv-1 privilege instruction and that's why we are injecting program exception. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> -
ppc: Add software breakpoint support
This patch allow insert/remove software breakpoint. When QEMU is not able to handle debug exception then we inject program exception to guest because for software breakpoint QEMU uses a ehpriv-1 instruction; So there cannot be any reason that we are in qemu with exit reason KVM_EXIT_DEBUG for guest set debug exception, only possibility is guest executed ehpriv-1 privilege instruction and that's why we are injecting program exception. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> [agraf: make deflect comment booke/book3s agnostic] Signed-off-by: Alexander Graf <agraf@suse.de>
-
ppc: synchronize excp_vectors for injecting exception
This patch synchronizes env->excp_vectors[] with env->iovr[]. This is required for using the existing interrupt injection mechanism for kvm. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
ppc: debug stub: Get trap instruction opcode from KVM
Get trap instruction opcode from KVM and this opcode will be used for setting software breakpoint in following patch Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: Locate RTAS and device-tree based on real RMA
We currently calculate the final RTAS and FDT location based on the early estimate of the RMA size, cropped to 256M on KVM since we only know the real RMA size at reset time which happens much later in the boot process. This means the FDT and RTAS end up right below 256M while they could be much higher, using precious RMA space and limiting what the OS bootloader can put there which has proved to be a problem with some OSes (such as when using very large initrd's) Fortunately, we do the actual copy of the device-tree into guest memory much later, during reset, late enough to be able to do it using the final RMA value, we just need to move the calculation to the right place. However, RTAS is still loaded too early, so we change the code to load the tiny blob into qemu memory early on, and then copy it into guest memory at reset time. It's small enough that the memory usage doesn't matter. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [agraf: fix compilation on 32bit hosts] Signed-off-by: Alexander Graf <agraf@suse.de>
-
loader: Add load_image_size() to replace load_image()
A subsequent patch to ppc/spapr needs to load the RTAS blob into qemu memory rather than target memory (so it can later be copied into the right spot at machine reset time). I would use load_image() but it is marked deprecated because it doesn't take a buffer size as argument, so let's add load_image_size() that does. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [aik: fixed errors from checkpatch.pl] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: Fix ibm, associativity for memory nodes
We want the associtivity lists of memory and CPU nodes to match but memory nodes have incorrect domain#3 which is zero for CPU so they won't match. This clears domain#3 in the list to match CPUs associtivity lists. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: Add a helper for node0_size calculation
In multiple places there is a node0_size variable calculation which assumes that NUMA node #0 and memory node #0 are the same things which they are not. Since we are going to change it and do not want to change it in multiple places, let's make a helper. This adds a spapr_node0_size() helper and makes use of it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: Split memory nodes to power-of-two blocks
Linux kernel expects nodes to have power-of-two size and does WARN_ON if this is not the case: [ 0.041456] WARNING: at drivers/base/memory.c:115 which is: === /* Validate blk_sz is a power of 2 and not less than section size */ if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) { WARN_ON(1); block_sz = MIN_MEMORY_BLOCK_SIZE; } === This splits memory nodes into set of smaller blocks with a size which is a power of two. This makes sure the start address of every node is aligned to the node size. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de> -
spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however actual hardware may have them so it makes sense to have a way to emulate them in QEMU. This prepares SPAPR for that. This moves 2 calls of spapr_populate_memory_node() into the existing loop over numa nodes so first several nodes may have no memory and this still will work. If there is no numa configuration, the code assumes there is just a single node at 0 and it has all the guest memory. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: Use DT memory node rendering helper for other nodes
This finishes refactoring by using the spapr_populate_memory_node helper for all nodes and removing leftovers from spapr_populate_memory(). This is not a part of the previous patch because the patches look nicer apart. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: Move DT memory node rendering to a helper
This moves recurring bits of code related to memory@xxx nodes creation to a helper. This makes use of the new helper for node@0. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alexander Graf <agraf@suse.de>
-
spapr: fix possible memory leak
get_boot_devices_list() will malloc memory, spapr_finalize_fdt doesn't free it. Signed-off-by: Chenliang <chenliang88@huawei.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: mac99: Move NVRAM to page boundary when necessary
When running KVM we have to adhere to host page boundaries for memory slots. Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash area. So if our host is configured with 64k page size, we can't use the mac99 target with KVM. This is a real shame, as this limitation is not really an issue - we can easily map NVRAM somewhere else and at least Linux and Mac OS X use it at their new location. So in that emergency case when it's about failing to run at all and moving NVRAM to a place it shouldn't be at, choose the latter. This patch enables -M mac99 with KVM on 64k page size hosts. Signed-off-by: Alexander Graf <agraf@suse.de>