
librbd: initial set of changes to migration API for instant-restore #37714

Merged: 13 commits merged into ceph:master on Oct 26, 2020

Conversation

@dillaman (Author):

Start the process of allowing a JSON-encoded migration source-spec string to be provided. This implies that migration should always attempt to first open the destination image for execute/commit/abort actions. This also includes an initial implementation of a raw, file-based image source IO path handler.
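For illustration only, a source-spec for the raw, file-based IO path handler could look roughly like the JSON below; the exact keys and schema are not pinned down by this initial PR, so treat the field names as hypothetical:

```json
{
  "type": "raw",
  "stream": {
    "type": "file",
    "file_path": "/mnt/exports/image.raw"
  }
}
```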

The partial result should be based upon buffer-extent positions,
not the original image-extent positions.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
dillaman force-pushed the wip-librbd-migration-1 branch 2 times, most recently from b502669 to c5742c8 on October 22, 2020 00:52
image_id == ms.image_id && source_spec == ms.source_spec &&
overlap == ms.overlap && flatten == ms.flatten &&
mirroring == ms.mirroring && mirror_image_mode == ms.mirror_image_mode &&
state == ms.state && state_description == ms.state_description;
Contributor (inline review comment on the diff above):
nit: just wondering, wouldn't it be more correct to compare the source spec just after comparing the header type?

It wouldn't make any difference in real life though I suppose...

@trociny (Contributor) commented Oct 22, 2020

And just wondering, what was the reason to use JSON for the migration source spec? It seems that "ceph native encoding" could work here too, similarly to how it is used, e.g., for encoding different journal event types.

@dillaman (Author):

> And just wondering, what was the reason to use JSON for the migration source spec? It seems that "ceph native encoding" could work here too, similarly to how it is used, e.g., for encoding different journal event types.

We want someone to be able to define it on the CLI without using a hex editor, so once we have a JSON/XML format, we might as well just keep it in that single format instead of creating yet another format (which would also need to be extensible to permit new formats, streams, etc.).

@trociny (Contributor) commented Oct 22, 2020

> We want someone to be able to define it on the CLI without using a hex editor, so once we have a JSON/XML format, we might as well just keep it in that single format instead of creating yet another format (which would also need to be extensible to permit new formats, streams, etc.).

We could make the CLI/API convert the user format to a native ceph encoding. (Anyway, I suppose we might want to somehow validate the user input before storing it in the image migration header?) And users can't extend a format until it is supported by librbd, so this extension could be added to the native encoding as well.

But I don't have a strong opinion here. Just ignore if you like your way.

Jason Dillaman added 3 commits October 22, 2020 09:40
The initial hooks merely abstract the standard RBD parent/child
clone relationship. The basic interface, however, is abstract
enough to allow third party data formats and streams to be eventually
integrated with RBD live-migration.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
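As a rough illustration of the shape such a hook could take, the sketch below defines an abstract format interface; the class name, method set, and signatures are hypothetical and are not taken from the headers added under librbd/migration/:

```cpp
#include <cstdint>

struct Context;  // Ceph's generic async completion callback

// Hypothetical sketch of an abstract migration source "format" hook; the
// real interface in this PR may differ in names and signatures.
namespace librbd {
namespace migration {

struct FormatInterface {
  virtual ~FormatInterface() = default;

  // open/close the underlying source (native clone, raw file, etc.)
  virtual void open(Context* on_finish) = 0;
  virtual void close(Context* on_finish) = 0;

  // expose the metadata needed to drive the migration
  virtual void get_image_size(uint64_t snap_id, uint64_t* size,
                              Context* on_finish) = 0;

  // satisfy reads against the source image in terms of byte extents
  virtual void read(uint64_t offset, uint64_t length, char* buffer,
                    Context* on_finish) = 0;
};

} // namespace migration
} // namespace librbd
```

A third-party format would then only need to implement this handful of hooks for live-migration to treat it as a source.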
The migration open-source-image state machine will handle how to initialize the source ImageCtx depending on the format and protocol utilized for the source image.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
… spec

This source-spec will include a JSON-encoded structure describing the source
format and protocol for accessing the source image data.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@dillaman (Author):

If we had an intermediate binary format, the OSDs (cls_rbd methods) would still need to treat the source_spec as a binary blob and not attempt to decode it, or else adding new client-side-only features would require OSD upgrades. On the client side, the next PR adds basic validation (valid JSON, and the source can be properly opened before the destination image is created with the embedded JSON). This basic validation could probably be extended to fail or warn on unknown attributes (i.e. don't let an image be created with attributes that are not used / understood) or to warn about unused/unknown arguments on open.

The journal was all binary encoded for performance/space-efficiency reasons, which I don't think apply here. I think at the end of it all, JSON can provide the exact same forward and backward compatibility protection as native encoding without all the extra boilerplate code. I would think at best you'd get just a few bytes of space savings here and there, since you wouldn't have to encode key names and non-text values could be binary encoded. Just my opinion.
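As an aside, the basic client-side validation described above could look roughly like this sketch, using the json_spirit parser that the Ceph tree already bundles; the required key ("type") and the helper itself are illustrative, not the actual librbd code:

```cpp
#include <cerrno>
#include <string>
#include "json_spirit/json_spirit.h"  // bundled with the Ceph tree; path may differ

// Sketch: reject a source-spec that is not a JSON object or that lacks a
// string "type" key identifying the source format.
int validate_source_spec(const std::string& source_spec) {
  json_spirit::mValue root;
  if (!json_spirit::read(source_spec, root) ||
      root.type() != json_spirit::obj_type) {
    return -EINVAL;  // not valid JSON, or not a JSON object
  }

  const auto& obj = root.get_obj();
  auto it = obj.find("type");
  if (it == obj.end() || it->second.type() != json_spirit::str_type) {
    return -EINVAL;  // the source format type is required
  }
  return 0;
}
```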

@trociny (Contributor) left a review:

LGTM

@trociny (Contributor) commented Oct 24, 2020

Jason Dillaman added 7 commits October 24, 2020 13:51
The legacy migration source format has explicit pool, namespace, and image
parameters. This helper method will convert it into the JSON format that
will be used for the new-style migration source-spec.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
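A rough sketch of what such a conversion could produce is below; the key names are guesses at the new-style source-spec layout, and a real implementation would use a proper JSON formatter rather than string concatenation:

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical illustration: build a JSON source-spec from the legacy
// pool/namespace/image parameters (no escaping handled here).
std::string legacy_to_source_spec(int64_t pool_id,
                                  const std::string& pool_namespace,
                                  const std::string& image_name,
                                  const std::string& image_id) {
  std::ostringstream ss;
  ss << "{"
     << "\"type\": \"native\", "
     << "\"pool_id\": " << pool_id << ", "
     << "\"pool_namespace\": \"" << pool_namespace << "\", "
     << "\"image_name\": \"" << image_name << "\", "
     << "\"image_id\": \"" << image_id << "\""
     << "}";
  return ss.str();
}
```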
When using advanced, non-legacy migration sources like a remote Ceph
cluster or a flat file, this API will return the JSON-encoded description
of the migration source.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This will help ensure that the source ImageCtx can be made optional
when migrating from a read-only, non-native image source.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
With the exception of the 'prepare' command, always attempt to open
the destination image first. This will better align with the
ability for read-only, non-native images to be used as a source
image.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The file stream helper will translate byte-extent IO read requests
to ASIO file read requests on the specified backing file. This can be
layered with file format helpers to translate image IO extents down
to the backing file.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
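A much-simplified sketch of the idea follows; the actual FileStream may use different ASIO primitives, and the function below is illustrative only:

```cpp
#include <cerrno>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>
#include <boost/asio.hpp>
#include <fcntl.h>
#include <unistd.h>

// Sketch: service a byte-extent read against the backing file from the ASIO
// execution context, completing with a negative errno or 0 plus the data.
void async_read_extent(boost::asio::io_context& ioc, const std::string& path,
                       uint64_t offset, uint64_t length,
                       std::function<void(int, std::vector<char>)> on_finish) {
  boost::asio::post(ioc, [=]() {
    std::vector<char> buf(length);
    int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) {
      on_finish(-errno, {});
      return;
    }
    ssize_t r = ::pread(fd, buf.data(), length, offset);
    ::close(fd);
    on_finish(r < 0 ? -errno : 0, std::move(buf));
  });
}
```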
The raw format maps an abstract, fully allocated, raw block device image
to the migration source IO API.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
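The key property is that a raw image needs no translation at all: an image byte extent maps 1:1 onto the backing stream. A hypothetical sketch (class and member names are illustrative, not the actual RawFormat implementation):

```cpp
#include <cstdint>
#include <functional>
#include <memory>

using ReadCallback = std::function<void(int /*result*/)>;

// Illustrative stream abstraction: reads a byte extent from the backing
// source (e.g. the file stream sketched earlier).
struct StreamSketch {
  virtual ~StreamSketch() = default;
  virtual void read(uint64_t byte_offset, uint64_t byte_length,
                    ReadCallback on_finish) = 0;
};

// Illustrative raw format: image extents pass straight through to the stream.
struct RawFormatSketch {
  explicit RawFormatSketch(std::unique_ptr<StreamSketch> stream)
    : m_stream(std::move(stream)) {}

  void read(uint64_t image_offset, uint64_t image_length,
            ReadCallback on_finish) {
    // identity mapping: image extent == stream extent
    m_stream->read(image_offset, image_length, std::move(on_finish));
  }

  std::unique_ptr<StreamSketch> m_stream;
};
```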
@dillaman (Author):

@trociny Thanks. The first two (cli and python) were due to typos during rebases and have been corrected. I'll fix the memory leak tomorrow as a new appended commit.

Jason Dillaman added 2 commits October 26, 2020 12:01
Tweak the format interface's 'read' call to just let the existing
read request continue past the migration layer if the native
format is in use. Otherwise, the provided AioCompletion would need to
be wrapped with another AioCompletion to prevent the original
image dispatch spec from getting lost.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Avoid re-using the same AioCompletion between multiple ImageDispatchSpec objects to prevent the possibility of a memory leak.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@trociny (Contributor) commented Oct 26, 2020

@dillaman Jenkins failed on an assertion in TestLibRBD.FlushEmptyOpsOnExternalSnapshot, but I suspect it is related to my recent changes to the watcher. I am going to investigate this tomorrow (if the cause is not obvious to you).

@dillaman (Author):

> @dillaman Jenkins failed on an assertion in TestLibRBD.FlushEmptyOpsOnExternalSnapshot, but I suspect it is related to my recent changes to the watcher. I am going to investigate this tomorrow (if the cause is not obvious to you).

Yeah, it's an issue related to the watcher changes. Looks like the SafeTimer isn't actually configured to be safe (m_safe_timer = new SafeTimer(cct, m_lock, false)), so there is a possibility for the timer callback to fire without m_lock held and therefore race with any of the cancel_event calls.
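For context, a sketch of the two constructions being discussed; the comments paraphrase the SafeTimer contract, and the change shown is only one possibility, not necessarily the patch that later landed in #37880:

```cpp
// SafeTimer's third constructor argument controls "safe callbacks". With it
// set to false, a cancelled event's callback may still be invoked, so the
// timer thread can race with callers of cancel_event():
m_safe_timer = new SafeTimer(cct, m_lock, false);  // current (racy) usage

// With safe callbacks enabled, the timer holds m_lock across the callback and
// a cancelled event's callback is guaranteed not to fire:
m_safe_timer = new SafeTimer(cct, m_lock, true);   // one possible fix
```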

@dillaman (Author):

jenkins test make check

trociny merged commit ec64935 into ceph:master on Oct 26, 2020
dillaman deleted the wip-librbd-migration-1 branch on October 26, 2020 19:26
@trociny (Contributor) commented Oct 28, 2020

> Yeah, it's an issue related to the watcher changes. Looks like the SafeTimer isn't actually configured to be safe (m_safe_timer = new SafeTimer(cct, m_lock, false)), so there is a possibility for the timer callback to fire without m_lock held and therefore race with any of the cancel_event calls.

Pushed #37880 for this.

@trociny (Contributor) commented Oct 29, 2020

https://pulpito.ceph.com/jdillaman-2020-10-26_10:08:16-rbd-wip-jd-testing-distro-basic-smithi/
https://pulpito.ceph.com/jdillaman-2020-10-26_10:23:51-rbd-wip-jd-testing-distro-basic-smithi/

@dillaman I missed this before merging (the job was still running when I was looking, and later I forgot to check its status before merging), but there was one failure [1], though I have no idea if it is related. If I interpret the logs correctly, it seems like qemu got stuck?

Just wanted to let you know in case you had not seen this and would be interested.

[1] http://qa-proxy.ceph.com/teuthology/jdillaman-2020-10-26_10:23:51-rbd-wip-jd-testing-distro-basic-smithi/5560517/teuthology.log

@dillaman (Author) commented Oct 29, 2020

The dynamic_features_no_cache test has been getting randomly stuck for a while. I tried to look into it a couple of weeks ago by increasing the log levels, but then it failed to reproduce. I'll open a ticket against it so that I don't forget about it [1].

[1] https://tracker.ceph.com/issues/48038
