vfio: Define device migration protocol v2
Replace the existing region based migration protocol with an ioctl based
protocol. The two protocols have the same general semantic behaviors, but
the way the data is transported is changed.

This is the mandatory portion of the new protocol. It defines the 5
mandatory states for basic stop and copy migration and the protocol to
move the migration data in/out of the kernel.

Compared to the clarification of the v1 protocol Alex proposed:

https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen

This has a few deliberate functional differences:

 - ERROR arcs allow the device function to remain unchanged.

 - The protocol is not required to return to the original state on
   transition failure. Instead we directly return the current state,
   whatever it may be. Userspace can execute an unwind back to the
   original state, reset, or do something else without needing kernel
   support. This simplifies the kernel design and, should userspace
   choose a policy like always reset, avoids doing useless work in the
   kernel on error handling paths.

 - PRE_COPY is made optional; userspace must discover it before using
   it. This reflects the fact that the majority of drivers we are aware
   of right now will not implement PRE_COPY.

 - Segmentation is not part of the data stream protocol; the receiver
   does not have to reproduce the framing boundaries.

The hybrid FSM for the device_state is described as a Mealy machine by
documenting each of the arcs the driver is required to implement. The
remaining old/new device_state transitions are defined as 'combination
transitions', which take multiple FSM arcs along the shortest path within
the FSM's digraph. Together these give a complete matrix of transitions.

A new IOCTL VFIO_DEVICE_MIG_SET_STATE is defined to replace writing to the
device_state field in the region. This allows returning more information
in the case of failure, and includes returning a brand new FD whenever the
requested arc opens a data transfer session.
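As an illustration of how a combination transition can be elaborated from
a small set of driver arcs, here is a minimal sketch in C. The state names
and the six arcs in the table are assumptions for illustration only (this
message does not enumerate them), and next_hop() is a hypothetical name,
not the helper the VFIO core provides. It runs a breadth-first search
backwards from the target state and returns the next state on a shortest
path.

```c
#include <assert.h>

/* Hypothetical state names; the commit text does not list them. */
enum mig_state {
	ST_ERROR,
	ST_STOP,
	ST_RUNNING,
	ST_STOP_COPY,
	ST_RESUMING,
	ST_COUNT
};

/* arc[a][b] != 0 means the driver implements a -> b directly (assumed set). */
static const unsigned char arc[ST_COUNT][ST_COUNT] = {
	[ST_STOP][ST_RUNNING] = 1,
	[ST_RUNNING][ST_STOP] = 1,
	[ST_STOP][ST_STOP_COPY] = 1,
	[ST_STOP_COPY][ST_STOP] = 1,
	[ST_STOP][ST_RESUMING] = 1,
	[ST_RESUMING][ST_STOP] = 1,
};

/*
 * Return the next state on a shortest path from cur to target, cur itself
 * if the two are equal, or -1 if target cannot be reached from cur.
 */
static int next_hop(int cur, int target)
{
	int dist[ST_COUNT];
	int queue[ST_COUNT];
	int head = 0, tail = 0;
	int i;

	assert(cur >= 0 && cur < ST_COUNT);
	assert(target >= 0 && target < ST_COUNT);

	if (cur == target)
		return cur;

	/* BFS over reversed arcs: dist[s] = number of arcs from s to target. */
	for (i = 0; i < ST_COUNT; i++)
		dist[i] = -1;
	dist[target] = 0;
	queue[tail++] = target;

	while (head < tail) {
		int s = queue[head++];

		for (i = 0; i < ST_COUNT; i++) {
			if (arc[i][s] && dist[i] < 0) {
				dist[i] = dist[s] + 1;
				queue[tail++] = i;
			}
		}
	}

	if (dist[cur] < 0)
		return -1;

	/* Step along any arc that gets exactly one arc closer to target. */
	for (i = 0; i < ST_COUNT; i++) {
		if (arc[cur][i] && dist[i] == dist[cur] - 1)
			return i;
	}
	return -1;
}
```

Under this assumed arc table, next_hop(ST_RUNNING, ST_RESUMING) yields
ST_STOP and a second call from ST_STOP yields ST_RESUMING, so the
combination transition is executed as two driver arcs. A caller that stops
at the first failing arc simply reports whichever state it reached.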

The VFIO core code implements the new ioctl and provides a helper function
to the driver. Using the helper, the driver only has to implement 6 of the
FSM arcs, and the other combination transitions are elaborated
consistently from those arcs.

The ioctl VFIO_DEVICE_MIG_ARC_SUPPORTED is defined as a way to query the
kernel for support of FSM capabilities. This allows userspace to discover
optional FSM features, and provides a robust route for future expansion.
Combined with the ability of the kernel to execute combination
transitions, there is a lot of flexibility to define new arcs and states
in the future while still providing a backward compatible SET_STATE
interface to userspace. The existing VFIO_DEVICE_FEATURE ioctl can also be
used as part of any future migration feature negotiation.

Data transfer sessions are now carried over a file descriptor, instead of
the region. The FD functions for the lifetime of the data transfer
session. read() and write() transfer the data with normal Linux stream FD
semantics. This design allows future expansion to support poll(),
io_uring, and other performance optimizations.

As the current qemu design requires the available data size up front, the
VFIO_DEVICE_MIG_FD_SEGMENT ioctl allows querying this so it can build the
data frame.

The complicated mmap mode for data transfer is discarded, as current qemu
doesn't take meaningful advantage of it and the new qemu implementation
avoids substantially all the performance penalty of using read() on the
region.

Change-Id: Iaf7940cd9804becf7a1040e019e39af7e0b75fa7
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
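To illustrate what stream FD semantics mean for a receiver that does not
reproduce the sender's framing, here is a minimal sketch in C. It uses an
ordinary pipe() as a stand-in for the migration data FD, which is an
assumption made so the example runs anywhere; write_all(),
read_until_eof() and reframe_demo() are hypothetical helper names. The
sender writes 3 bytes and then 5 bytes, the receiver reads 2 bytes at a
time, and the byte stream still comes out intact.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Read until EOF or until cap bytes are stored, asking for at most chunk
 * bytes per read(). Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_until_eof(int fd, void *buf, size_t cap, size_t chunk)
{
	char *p = buf;
	size_t total = 0;

	assert(chunk > 0);
	while (total < cap) {
		size_t want = cap - total < chunk ? cap - total : chunk;
		ssize_t n = read(fd, p + total, want);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)	/* EOF: the writer closed its end */
			break;
		total += (size_t)n;
	}
	return (ssize_t)total;
}

/* Returns 0 if the receiver recovers the same bytes the sender wrote. */
static int reframe_demo(void)
{
	static const char payload[] = "abcdefgh";
	char out[sizeof(payload)] = { 0 };
	ssize_t got;
	int fds[2];

	if (pipe(fds) < 0)
		return -1;

	/* Sender framing: one 3 byte write, then one 5 byte write. */
	if (write_all(fds[1], payload, 3) < 0 ||
	    write_all(fds[1], payload + 3, 5) < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	close(fds[1]);

	/* Receiver framing: 2 byte reads, unrelated to the writes above. */
	got = read_until_eof(fds[0], out, sizeof(out) - 1, 2);
	close(fds[0]);

	return (got == 8 && memcmp(out, payload, 8) == 0) ? 0 : -1;
}
```

The retry loops are the part that carries over to a real stream FD: a
short read or write is normal and is not an error, so callers keep going
until the full length or EOF is reached.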