Skip to content

Commit

Permalink
Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/…
Browse files Browse the repository at this point in the history
…qemu into staging

Migartion pull request for 20240304

- Bryan's fix on multifd compression level API
- Fabiano's mapped-ram series (base + multifd only)
- Steve's amend on cpr document in qapi/

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZeUjKhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbv5QD/ZexBUsmZA5qyxgGvZ2yvlUBEGNOvtmKY
# kRdiYPU7khMA/0N43rn4LcqKCoq4+T+EAnYizGjIyhH/7BRUyn4DUxgO
# =AeEn
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Mar 2024 01:26:02 GMT
# gpg:                using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg:                issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg:                 aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D  D1A9 3B5F CCCD F3AB D706

* tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (27 commits)
  migration/multifd: Document two places for mapped-ram
  tests/qtest/migration: Add a multifd + mapped-ram migration test
  migration/multifd: Add mapped-ram support to fd: URI
  migration/multifd: Support incoming mapped-ram stream format
  migration/multifd: Support outgoing mapped-ram stream format
  migration/multifd: Prepare multifd sync for mapped-ram migration
  migration/multifd: Add incoming QIOChannelFile support
  migration/multifd: Add outgoing QIOChannelFile support
  migration/multifd: Add a wrapper for channels_created
  migration/multifd: Allow receiving pages without packets
  migration/multifd: Allow multifd without packets
  migration/multifd: Decouple recv method from pages
  migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data
  tests/qtest/migration: Add tests for mapped-ram file-based migration
  migration/ram: Add incoming 'mapped-ram' migration
  migration/ram: Add outgoing 'mapped-ram' migration
  migration: Add mapped-ram URI compatibility check
  migration/ram: Introduce 'mapped-ram' migration capability
  migration/qemu-file: add utility methods for working with seekable channels
  io: fsync before closing a file channel
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	migration/ram.c
  • Loading branch information
pm215 committed Mar 5, 2024
2 parents 4eac9df + 1a6e217 commit c90cfb5
Show file tree
Hide file tree
Showing 27 changed files with 1,666 additions and 160 deletions.
1 change: 1 addition & 0 deletions docs/devel/migration/features.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,4 @@ Migration has plenty of features to support different use cases.
dirty-limit
vfio
virtio
mapped-ram
138 changes: 138 additions & 0 deletions docs/devel/migration/mapped-ram.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
Mapped-ram
==========

Mapped-ram is a new stream format for the RAM section designed to
supplement the existing ``file:`` migration and make it compatible
with ``multifd``. This enables parallel migration of a guest's RAM to
a file.

The core of the feature is to ensure that RAM pages are mapped
directly to offsets in the resulting migration file. This enables the
``multifd`` threads to write exclusively to those offsets even if the
guest is constantly dirtying pages (i.e. live migration). Another
benefit is that the resulting file will have a bounded size, since
pages which are dirtied multiple times will always go to a fixed
location in the file, rather than constantly being added to a
sequential stream. Having the pages at fixed offsets also allows the
usage of O_DIRECT for save/restore of the migration stream as the
pages are ensured to be written respecting O_DIRECT alignment
restrictions (direct-io support not yet implemented).

Usage
-----

On both source and destination, enable the ``multifd`` and
``mapped-ram`` capabilities:

``migrate_set_capability multifd on``

``migrate_set_capability mapped-ram on``

Use a ``file:`` URL for migration:

``migrate file:/path/to/migration/file``

Mapped-ram migration is best done non-live, i.e. by stopping the VM on
the source side before migrating.

Use-cases
---------

The mapped-ram feature was designed for use cases where the migration
stream will be directed to a file in the filesystem and not
immediately restored on the destination VM [#]_. These could be
thought of as snapshots. We can further categorize them into live and
non-live.

- Non-live snapshot

If the use case requires a VM to be stopped before taking a snapshot,
that's the ideal scenario for mapped-ram migration. Not having to
track dirty pages, the migration will write the RAM pages to the disk
as fast as it can.

Note: if a snapshot is taken of a running VM, but the VM will be
stopped after the snapshot by the admin, then consider stopping it
right before the snapshot to take benefit of the performance gains
mentioned above.

- Live snapshot

If the use case requires that the VM keeps running during and after
the snapshot operation, then mapped-ram migration can still be used,
but will be less performant. Other strategies such as
background-snapshot should be evaluated as well. One benefit of
mapped-ram in this scenario is portability since background-snapshot
depends on async dirty tracking (KVM_GET_DIRTY_LOG) which is not
supported outside of Linux.

.. [#] While this same effect could be obtained with the usage of
snapshots or the ``file:`` migration alone, mapped-ram provides
a performance increase for VMs with larger RAM sizes (10s to
100s of GiBs), specially if the VM has been stopped beforehand.
RAM section format
------------------

Instead of having a sequential stream of pages that follow the
RAMBlock headers, the dirty pages for a RAMBlock follow its header
instead. This ensures that each RAM page has a fixed offset in the
resulting migration file.

A bitmap is introduced to track which pages have been written in the
migration file. Pages are written at a fixed location for every
ramblock. Zero pages are ignored as they'd be zero in the destination
migration as well.

::

Without mapped-ram: With mapped-ram:

--------------------- --------------------------------
| ramblock 1 header | | ramblock 1 header |
--------------------- --------------------------------
| ramblock 2 header | | ramblock 1 mapped-ram header |
--------------------- --------------------------------
| ... | | padding to next 1MB boundary |
--------------------- | ... |
| ramblock n header | --------------------------------
--------------------- | ramblock 1 pages |
| RAM_SAVE_FLAG_EOS | | ... |
--------------------- --------------------------------
| stream of pages | | ramblock 2 header |
| (iter 1) | --------------------------------
| ... | | ramblock 2 mapped-ram header |
--------------------- --------------------------------
| RAM_SAVE_FLAG_EOS | | padding to next 1MB boundary |
--------------------- | ... |
| stream of pages | --------------------------------
| (iter 2) | | ramblock 2 pages |
| ... | | ... |
--------------------- --------------------------------
| ... | | ... |
--------------------- --------------------------------
| RAM_SAVE_FLAG_EOS |
--------------------------------
| ... |
--------------------------------

where:
- ramblock header: the generic information for a ramblock, such as
idstr, used_len, etc.

- ramblock mapped-ram header: the information added by this feature:
bitmap of pages written, bitmap size and offset of pages in the
migration file.

Restrictions
------------

Since pages are written to their relative offsets and out of order
(due to the memory dirtying patterns), streaming channels such as
sockets are not supported. A seekable channel such as a file is
required. This can be verified in the QIOChannel by the presence of
the QIO_CHANNEL_FEATURE_SEEKABLE.

The improvements brought by this feature apply only to guest physical
RAM. Other types of memory such as VRAM are migrated as part of device
states.
13 changes: 13 additions & 0 deletions include/exec/ramblock.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,19 @@ struct RAMBlock {
size_t page_size;
/* dirty bitmap used during migration */
unsigned long *bmap;

/*
* Below fields are only used by mapped-ram migration
*/
/* bitmap of pages present in the migration file */
unsigned long *file_bmap;
/*
* offset in the file pages belonging to this ramblock are saved,
* used only during migration to a file.
*/
off_t bitmap_offset;
uint64_t pages_offset;

/* bitmap of already received pages in postcopy */
unsigned long *receivedmap;

Expand Down
83 changes: 83 additions & 0 deletions include/io/channel.h
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ enum QIOChannelFeature {
QIO_CHANNEL_FEATURE_LISTEN,
QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY,
QIO_CHANNEL_FEATURE_READ_MSG_PEEK,
QIO_CHANNEL_FEATURE_SEEKABLE,
};


Expand Down Expand Up @@ -130,6 +131,16 @@ struct QIOChannelClass {
Error **errp);

/* Optional callbacks */
ssize_t (*io_pwritev)(QIOChannel *ioc,
const struct iovec *iov,
size_t niov,
off_t offset,
Error **errp);
ssize_t (*io_preadv)(QIOChannel *ioc,
const struct iovec *iov,
size_t niov,
off_t offset,
Error **errp);
int (*io_shutdown)(QIOChannel *ioc,
QIOChannelShutdown how,
Error **errp);
Expand Down Expand Up @@ -528,6 +539,78 @@ void qio_channel_set_follow_coroutine_ctx(QIOChannel *ioc, bool enabled);
int qio_channel_close(QIOChannel *ioc,
Error **errp);

/**
* qio_channel_pwritev
* @ioc: the channel object
* @iov: the array of memory regions to write data from
* @niov: the length of the @iov array
* @offset: offset in the channel where writes should begin
* @errp: pointer to a NULL-initialized error object
*
* Not all implementations will support this facility, so may report
* an error. To avoid errors, the caller may check for the feature
* flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
*
* Behaves as qio_channel_writev_full, apart from not supporting
* sending of file handles as well as beginning the write at the
* passed @offset
*
*/
ssize_t qio_channel_pwritev(QIOChannel *ioc, const struct iovec *iov,
size_t niov, off_t offset, Error **errp);

/**
* qio_channel_pwrite
* @ioc: the channel object
* @buf: the memory region to write data into
* @buflen: the number of bytes to @buf
* @offset: offset in the channel where writes should begin
* @errp: pointer to a NULL-initialized error object
*
* Not all implementations will support this facility, so may report
* an error. To avoid errors, the caller may check for the feature
* flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
*
*/
ssize_t qio_channel_pwrite(QIOChannel *ioc, char *buf, size_t buflen,
off_t offset, Error **errp);

/**
* qio_channel_preadv
* @ioc: the channel object
* @iov: the array of memory regions to read data into
* @niov: the length of the @iov array
* @offset: offset in the channel where writes should begin
* @errp: pointer to a NULL-initialized error object
*
* Not all implementations will support this facility, so may report
* an error. To avoid errors, the caller may check for the feature
* flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
*
* Behaves as qio_channel_readv_full, apart from not supporting
* receiving of file handles as well as beginning the read at the
* passed @offset
*
*/
ssize_t qio_channel_preadv(QIOChannel *ioc, const struct iovec *iov,
size_t niov, off_t offset, Error **errp);

/**
* qio_channel_pread
* @ioc: the channel object
* @buf: the memory region to write data into
* @buflen: the number of bytes to @buf
* @offset: offset in the channel where writes should begin
* @errp: pointer to a NULL-initialized error object
*
* Not all implementations will support this facility, so may report
* an error. To avoid errors, the caller may check for the feature
* flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
*
*/
ssize_t qio_channel_pread(QIOChannel *ioc, char *buf, size_t buflen,
off_t offset, Error **errp);

/**
* qio_channel_shutdown:
* @ioc: the channel object
Expand Down
2 changes: 2 additions & 0 deletions include/migration/qemu-file-types.h
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,8 @@ unsigned int qemu_get_be16(QEMUFile *f);
unsigned int qemu_get_be32(QEMUFile *f);
uint64_t qemu_get_be64(QEMUFile *f);

bool qemu_file_is_seekable(QEMUFile *f);

static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
{
qemu_put_be64(f, *pv);
Expand Down
13 changes: 13 additions & 0 deletions include/qemu/bitops.h
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,19 @@ static inline void clear_bit(long nr, unsigned long *addr)
*p &= ~mask;
}

/**
* clear_bit_atomic - Clears a bit in memory atomically
* @nr: Bit to clear
* @addr: Address to start counting from
*/
static inline void clear_bit_atomic(long nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);

return qatomic_and(p, ~mask);
}

/**
* change_bit - Toggle a bit in memory
* @nr: Bit to change
Expand Down

0 comments on commit c90cfb5

Please sign in to comment.