Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add nbd stream to live migration #44640

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

effi-ofer
Copy link

@idryomov @trociny I would appreciate your input. The following PR is more of a WIP to get your feedback rather than a fully baked and tested PR. FYI @orozery

RBD images can be live-migrated from external data sources. The sources may be native rbd, raw file, or qcow2 and may be streamed from file, http, or s3. However, ceph's qcow2 implementation is incomplete and lacks encryption, compression and backing files support. As a workaround, one can live-importing qcow2 images by first mounting the source qcow2 images using qemu-nbd which exposes the image as a raw file. Unfortunately, by importing as a raw file we lose information about the sparseness of an image.

I therefore would like to suggest adding an NBD source to live-migration which will allow us to mount a qcow2 image using qemu-nbd and then live-import the image using the native NBD protocol (via a socket rather than a raw file). This has multiple advantages. It enables live import to preserve sparseness, performs better (it doesn't go through the kernel), makes it possible to execute the live import from one machine while the image is accessed from other compute nodes without having to qemu-nbd/kernel mount the image on multiple machines. And it is also a step in the direction of having full qcow2 support with multiple differential backing images.
 
Note that by creating an NBD stream, I had to 'overload' the list_snap() so it gets executed by the stream, If you find this ugly, an alternative is to introduce an NBD format rather than a stream. Let me know what you think and thanks in advance.
 

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

Copy link
Contributor

@trociny trociny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it looks like a good idea. At least I don't see reasons why not.

As I mentioned in the comment below my main concern is yet another library dependency.

I left some comments, though I have not looked carefully and most likely missed some minor things.

And we will need to update the documentation and add some tests (if we decide to have this).

And just as a note. We have the PR #44447 where we are discussing adding "sparsify" feature to raw format. Though it will still read zeroes from the source and would be less optimal than nbd_stream.

src/librbd/CMakeLists.txt Outdated Show resolved Hide resolved
src/librbd/migration/NBDStream.cc Outdated Show resolved Hide resolved
src/librbd/migration/NBDStream.cc Outdated Show resolved Hide resolved
src/librbd/migration/NBDStream.cc Outdated Show resolved Hide resolved
src/librbd/migration/NBDStream.cc Outdated Show resolved Hide resolved
src/librbd/migration/RawSnapshot.cc Outdated Show resolved Hide resolved
@effi-ofer
Copy link
Author

In general it looks like a good idea. At least I don't see reasons why not.

As I mentioned in the comment below my main concern is yet another library dependency.

I left some comments, though I have not looked carefully and most likely missed some minor things.

And we will need to update the documentation and add some tests (if we decide to have this).

And just as a note. We have the PR #44447 where we are discussing adding "sparsify" feature to raw format. Though it will still read zeroes from the source and would be less optimal than nbd_stream.

Thanks for the insightful comments @trociny. I'll proceed with proper testing/updating docs/handling your comments and update the PR when ready.

@effi-ofer
Copy link
Author

I have updated the PR as per your suggestions. Can you please take a look @trociny?

CMakeLists.txt Show resolved Hide resolved
src/librbd/migration/SourceSpecBuilder.cc Outdated Show resolved Hide resolved
src/librbd/migration/RawSnapshot.cc Outdated Show resolved Hide resolved
src/librbd/migration/SourceSpecBuilder.cc Outdated Show resolved Hide resolved
doc/rbd/rbd-live-migration.rst Outdated Show resolved Hide resolved
src/librbd/migration/NBDStream.cc Outdated Show resolved Hide resolved
}

while (nbd_aio_in_flight(nbd) > 0) {
if (nbd_poll (nbd, -1) == -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still does not make it async, as you are blocking here for the request to complete. What I meant is to register a callback in nbd_aio_pread (instead of NBD_NULL_COMPLETION), which would check the return value and call on_finish->complete(r) (on_finish could be stored in user_data in nbd_completion_callback).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The libnbd library does not spawn its own threads, so the code spawns a new thread in the read() function. The read function then immediately returns without waiting for the actual processing of the read in ReadRequest. This is similar to how the File stream handles aio. But in addition, I switched to using the nbd_completion_callback. Is that resonable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not quite sure I understand what you mean by "the code spawns a new thread in the read() function".

Anyway, My idea is that if you set a read callback you don't need call nbd_poll to wait until the read is complete and may just return immediately. And when the callback is callet it will complete the on_finish context. At least this is how I suppose the nbd_aio API could be used here (I have not tried it myself).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I wasn't clear. The call to boost::asio::post(m_strand, ctx { ctx->send_read(); }); in NBDStream::read() immediately returns control, so we are not blocking. But regardless, I still believe we need the nbd_poll. Libnbd uses an event loop (see https://manpages.ubuntu.com/manpages/focal/man3/libnbd.3.html) and we either call nbd_poll or implement our our own version of checking for events. BTW, I updated the code in question and pushed an update. If you'll take a look you'll see I switched to using the callback, but I still needed to keep the nbd_poll call. Without it, we never get into the callback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds weird to me. I don't have experience with libnbd but but your current aio version does not make much sense to me. I don't see why to use aio read if you are still blocking on poll. Then your initial "sync" version would be easier.

I am going to try your code this weekend and see what is going there.

auto rc = nbd_aio_pread(nbd, boost::asio::buffer_cast<void *>(buffer),
byte_length, byte_offset,
(nbd_completion_callback) { .callback=task_completed,
.user_data=on_finish }, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still in the process of trying it but here is one thing that I just noticed is definitely wrong. When I suggested to just provide on_finish as user_data and complete it in the callback, I missed that we are doing it in the for loop for every extent. And we may compete it only once when all callbacks are completed. So it should be a bit more complicated. E.g. C_Gather may be used (you may find many examples in librbd source), or you could implement your own "completion" class.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. Good catch. I'll see what I can do to handle it better. BTW, I hope you and you loved ones are safe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @trociny I hope all is well. I recognize that you might have more pressing issues to deal with, so take your time. I pushed back the original handling of the on_finish which calls it once the requests are fully served. The current version uses a boost asio strand to avoid blocking the main caller. I am not certain we really gain much benefit from the using the async version of the api other than possibly batching them. And if I look at the file stream handling, I see that there a strand is also used, but the data is read serially. We can switch back to that model if you prefer it.

@trociny
Copy link
Contributor

trociny commented Apr 18, 2022

@effi-ofer I read more carefully the libnbd(3) man page and see now what you meant. Ok, so using nbd asynchronous API is not so simple as I initially assumed. We still could eventually use it but not in the way you implemented. Because the way you are using it does not differ from using the sync API, just more complicated. It looks like to use it effectively we would need to have a separate thread for polling. If you don't have plans to work on this (at least not in this initial PR) then I suggest returning back to your sync version.

And it might make sense to learn @idryomov opinion about this feature first, if we want to add libnbd dependency. So you would not waste your time working on a feature that will not be merged.

@effi-ofer
Copy link
Author

Hey @trociny it's great hearing from you. I hope all is well. The PR used a boost thread similar to how the file source is done. Is this what you mean by a different thread for polling? Or did you had in mind something else? I spoke with @idryomov last Monday or Tue and made sure he's aware of the libnbd dependency.

@trociny
Copy link
Contributor

trociny commented Apr 24, 2022

The PR used a boost thread similar to how the file source is done. Is this what you mean by a different thread for polling?

No. My idea was that NBDStream::read would just call nbd_aio_pread passing on_finish to nbd_completion_callback, and return. You would not need to use a boost thread here. But then you would need to do polling somewhere and then call nbd_aio_notify_read, which would call the completion callback (at least this is how I understand it works). And you would need a separate thread for polling. Initially I thought you would need to just call nbd_aio_pread passing the callback, and the libnbd would do all the magic (polling and triggering the callback) itself, but I was wrong.

The way you implemented (nbd_aio_pread and then poll waiting for the read complete in the same thread) does not differ from just calling the sync nbd_pread, just it is more complicated. It is probably ok (at least as the first implementation) and I suggest just return to you fist sync version. Or may be someone else has a suggestion how we could use nbd_aio API effectively here.

@effi-ofer
Copy link
Author

So basically add a thread that will wait for nbd requests, right? I don't mind making this change, but note again that this differs from how the file stream is handled. Also, we can't switch to the sync calls without a separate thread because then the caller will hang. Of course we could just do the sync calls within a separate thread. Or maybe that's what you meant.

@trociny
Copy link
Contributor

trociny commented Apr 26, 2022

If you decide to use the libnbd sync API (or async API but in the "sync" way you do currently, i.e. polling in the same thread, which I don't like) then yes, you should do it the same way as we have for file stream, which uses "sync" calls.

And currently I don't insist on using async API for this implementation. I suggested it initially because I thought it would be rather easy to use, which is not. And the way with a thread is a way that I only see, but people might have some other suggestions. And actually I have no idea how much it will improve things in comparison with the simple "sync" implementation, unless we try both.

@effi-ofer
Copy link
Author

Understood. Thanks for the clarification. I'll switch to the sync api with the boost thread.

@djgalloway djgalloway changed the base branch from master to main May 27, 2022 16:10
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jul 29, 2022
@effi-ofer
Copy link
Author

Moving out of stale. Waiting for review.

@github-actions github-actions bot removed the stale label Jul 29, 2022
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jan 12, 2023
@idryomov idryomov removed the stale label Jan 27, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Mar 28, 2023
@idryomov idryomov removed the stale label Mar 28, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label May 27, 2023
@idryomov idryomov removed the stale label May 29, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@effi-ofer effi-ofer requested a review from a team as a code owner September 10, 2023 09:41
@effi-ofer effi-ofer changed the title WIP: add nbd stream to live migration add nbd stream to live migration Oct 19, 2023
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Dec 18, 2023
@idryomov idryomov removed the stale label Dec 18, 2023
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Feb 16, 2024
@idryomov idryomov removed the stale label Feb 16, 2024
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Apr 16, 2024
@idryomov idryomov removed the stale label Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants