Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librbd: integrate journaling support for IO operations #6541

Merged
merged 45 commits into from Nov 17, 2015

Conversation

dillaman
Copy link

No description provided.

@dillaman dillaman changed the title librbd: integrate journaling support for IO paths librbd: integrate journaling support for IO operations Nov 11, 2015
@jdurgin
Copy link
Member

jdurgin commented Nov 13, 2015

I haven't gotten through all of this yet (up to the meat of it now, 'librdb: initial interface with journal library', but have a couple comments. I really like the i/o path refactoring - much more consistent now.

For enabling the journal, we'd mentioned doing a hidden snapshot as a basis for mirroring at that point. Were you thinking of doing that at another time, like when mirroring is enabled, or just didn't need it yet?

The cmake files will need updating for this (probably in later branches too).

@dillaman
Copy link
Author

@jdurgin I punted on the maintenance level stuff for the time being. The hidden snapshot for mirroring should be added later -- I was thinking as part of the mirroring daemon since it makes for a consistent workflow (it will already need to remove the snapshot after it catches up).

Definitely will break cmake -- I will add a new commit to the end to patch up the CMakefile.

Jason Dillaman added 22 commits November 12, 2015 20:17
When enabled, all mutable operations against an RBD image
will be recorded to a journal prior to changing the underlying
RBD image.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This context is used for more than just write requests. It is always
tied to an AioCompletion, so moved request reference counting
management within class.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This better reflects the fact that these represent requests against
an object extent and helps differentiate it from the future
AioImageRequest classes.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
New work queue has the ability to suspend write operations, which
should occur when exclusive locking is enabled and the client doesn't
own the lock.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Move aio_read/write/discard/flush to new AioImageRequest classes
in support of a unified aio queue / journaling.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The ImageCtx is known so there is no need to pass it with each
function call.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Required moving non-AIO read/write/discard methods to
AioImageRequestWQ to avoid deadlock on lock request.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Previously the ImageWatcher stored delayed ops that were waiting
on the image exclusive lock.  This management has been moved to the
AioImageRequestWQ to ensure requests are processed in-order.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
ImageWatcher is no longer responsible for queueing write ops while
waiting for the exclusive lock.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
New journal entries to cover AIO write/discard/flush operations.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman added 13 commits November 12, 2015 23:27
Cache writeback should be delayed until after journal event has been
commmitted to disk.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
When caching is disabled, the AioCompletion notifies the journal that
the update is safe.  When caching is enabled, writeback can result
in partial write extents being overwritten (and no longer associated
to the original journal event).  In this case, the writeback handler
is responsible for informing the journal when writes are safe.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Valgrind-related updates to the journal library required tweaks to
current implementation of librbd journaling.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This will ensure that the journal is properly opened to handle
appending events.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
There is a possibility of a race if the JournalTrimmer destructor is
waiting for an async op to complete but the op is flagged as complete
while the mutex is being held.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@jdurgin
Copy link
Member

jdurgin commented Nov 17, 2015

Looking good in general. Will run through a subset of the rbd suite to check for regressions.

@jdurgin
Copy link
Member

jdurgin commented Nov 17, 2015

At least a couple real failures here, from copy.sh and read-flags.sh:

http://pulpito.ceph.com/joshd-2015-11-16_22:08:47-rbd-wip-11287-rebased---basic-vps/

@dillaman
Copy link
Author

@jdurgin The read-flags tests are only working on the jewel branch (needs merged back to master). I'll take a look at the other failures.

http://qa-proxy.ceph.com/teuthology/teuthology-2015-11-13_23:00:08-rbd-master---basic-multi/1148949/teuthology.log

Jason Dillaman added 2 commits November 17, 2015 08:52
If another (independent) queue was processing, drain could
block waiting.  Instead, allow drain to exit quickly if
no items are being processed and the queue is empty for
the current WQ.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Ensure all AIO to the parent image is properly flushed and assert
that all work queues are empty before closing the parent image.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jason Dillaman added 2 commits November 17, 2015 10:22
Add a new convenience method to ImageCtx for handling flush
requests and cleanup flush handling with dealing with the cache.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This ensures that the exclusive lock is properly acquired prior to
writing to the new image.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@dillaman
Copy link
Author

@jdurgin I had fixes for the exposed issues on my other journaling branch. I cherry-picked and locally retested librbd_python and copy.sh successfully.

jdurgin added a commit that referenced this pull request Nov 17, 2015
librbd: integrate journaling support for IO operations

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
@jdurgin jdurgin merged commit 46249fe into ceph:master Nov 17, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants