New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
librbd: integrate journaling support for IO operations #6541
Conversation
I haven't gotten through all of this yet (up to the meat of it now, 'librdb: initial interface with journal library', but have a couple comments. I really like the i/o path refactoring - much more consistent now. For enabling the journal, we'd mentioned doing a hidden snapshot as a basis for mirroring at that point. Were you thinking of doing that at another time, like when mirroring is enabled, or just didn't need it yet? The cmake files will need updating for this (probably in later branches too). |
@jdurgin I punted on the maintenance level stuff for the time being. The hidden snapshot for mirroring should be added later -- I was thinking as part of the mirroring daemon since it makes for a consistent workflow (it will already need to remove the snapshot after it catches up). Definitely will break cmake -- I will add a new commit to the end to patch up the CMakefile. |
When enabled, all mutable operations against an RBD image will be recorded to a journal prior to changing the underlying RBD image. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This context is used for more than just write requests. It is always tied to an AioCompletion, so moved request reference counting management within class. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This better reflects the fact that these represent requests against an object extent and helps differentiate it from the future AioImageRequest classes. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
New work queue has the ability to suspend write operations, which should occur when exclusive locking is enabled and the client doesn't own the lock. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Move aio_read/write/discard/flush to new AioImageRequest classes in support of a unified aio queue / journaling. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The ImageCtx is known so there is no need to pass it with each function call. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Required moving non-AIO read/write/discard methods to AioImageRequestWQ to avoid deadlock on lock request. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Previously the ImageWatcher stored delayed ops that were waiting on the image exclusive lock. This management has been moved to the AioImageRequestWQ to ensure requests are processed in-order. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
ImageWatcher is no longer responsible for queueing write ops while waiting for the exclusive lock. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
New journal entries to cover AIO write/discard/flush operations. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Cache writeback should be delayed until after journal event has been commmitted to disk. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
When caching is disabled, the AioCompletion notifies the journal that the update is safe. When caching is enabled, writeback can result in partial write extents being overwritten (and no longer associated to the original journal event). In this case, the writeback handler is responsible for informing the journal when writes are safe. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Valgrind-related updates to the journal library required tweaks to current implementation of librbd journaling. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This will ensure that the journal is properly opened to handle appending events. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
There is a possibility of a race if the JournalTrimmer destructor is waiting for an async op to complete but the op is flagged as complete while the mutex is being held. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Looking good in general. Will run through a subset of the rbd suite to check for regressions. |
At least a couple real failures here, from copy.sh and read-flags.sh: http://pulpito.ceph.com/joshd-2015-11-16_22:08:47-rbd-wip-11287-rebased---basic-vps/ |
@jdurgin The read-flags tests are only working on the jewel branch (needs merged back to master). I'll take a look at the other failures. |
If another (independent) queue was processing, drain could block waiting. Instead, allow drain to exit quickly if no items are being processed and the queue is empty for the current WQ. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Ensure all AIO to the parent image is properly flushed and assert that all work queues are empty before closing the parent image. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Add a new convenience method to ImageCtx for handling flush requests and cleanup flush handling with dealing with the cache. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
This ensures that the exclusive lock is properly acquired prior to writing to the new image. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@jdurgin I had fixes for the exposed issues on my other journaling branch. I cherry-picked and locally retested librbd_python and copy.sh successfully. |
librbd: integrate journaling support for IO operations Reviewed-by: Josh Durgin <jdurgin@redhat.com>
No description provided.