rbd-mirror: reduce memory footprint during journal replay #10341
4f54add to 182ef6c
@@ -323,7 +324,7 @@ int register_journal(rados_ioctx_t ioctx, const char *image_name) {
     return r;
   }

-  journal::Journaler journaler(io_ctx, image_id, JOURNAL_CLIENT_ID, 0);
+  journal::Journaler journaler(io_ctx, image_id, JOURNAL_CLIENT_ID, {});
The default commit_interval is 5, but the test was using value 0. If the default value does not matter for the test, please ignore this comment!
The same for the remaining similar changes.
I ratcheted it down in the cases where the test case truly depended on looking at the commit position. But in general, this parameter only has an effect after a crash (i.e. it bounds the maximum number of seconds of events that would need to be replayed in the worst case).
316474b to 52ad1ba

  // trim empty player to prefetch the next available object
  for (auto &player_pair : m_object_players) {
    ObjectPlayerPtr object_player(player_pair.second);
@dillaman What is the purpose of this line?
Apparently nothing -- will clean up
Additional runtime configuration settings will be needed. The new class will avoid the need to expand the constructor. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Support fetching the full object or incremental chunks (with a minimum of at least a single decoded entry if available). Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Journal playback will need to read at least a full entry, which was previously limited to the maximum object size. In memory-constrained environments, this new optional limit will set a fixed upper bound on memory usage. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
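To illustrate the incremental fetch described in the two commits above, here is a minimal sketch of reading an object in bounded chunks while still guaranteeing at least one decoded entry when one is available. Everything in it is an assumption made for illustration: `ChunkedReader`, `max_fetch_bytes`, `fetch_at_least_one`, and the 1-byte length-prefixed entry encoding are hypothetical and do not reflect the actual ObjectPlayer code or the Ceph journal entry format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model: an "object" is a byte string of entries, each encoded as
// a 1-byte length followed by that many payload bytes (not the Ceph format).
class ChunkedReader {
public:
  ChunkedReader(std::string object, std::size_t max_fetch_bytes)
    : m_object(std::move(object)), m_max_fetch_bytes(max_fetch_bytes) {
    assert(m_max_fetch_bytes > 0);
  }

  // Read one more chunk (at most max_fetch_bytes) and decode every complete
  // entry now buffered. May return nothing if an entry spans chunks.
  std::vector<std::string> fetch() {
    std::size_t n = std::min(m_max_fetch_bytes, m_object.size() - m_read_off);
    m_pending.append(m_object, m_read_off, n);
    m_read_off += n;

    std::vector<std::string> entries;
    std::size_t pos = 0;
    while (pos < m_pending.size()) {
      std::size_t len = static_cast<std::uint8_t>(m_pending[pos]);
      if (pos + 1 + len > m_pending.size()) {
        break;  // partial entry: keep the bytes and wait for the next chunk
      }
      entries.emplace_back(m_pending, pos + 1, len);
      pos += 1 + len;
    }
    m_pending.erase(0, pos);  // drop only the bytes that were decoded
    return entries;
  }

  bool done() const { return m_read_off == m_object.size(); }

private:
  std::string m_object;           // stands in for a journal data object
  std::size_t m_max_fetch_bytes;  // upper bound on bytes read per fetch
  std::size_t m_read_off = 0;     // how much of the object has been read
  std::string m_pending;          // bytes read but not yet decoded
};

// Keep fetching until at least one entry is decoded or the object is exhausted.
inline std::vector<std::string> fetch_at_least_one(ChunkedReader &reader) {
  std::vector<std::string> out;
  while (out.empty() && !reader.done()) {
    out = reader.fetch();
  }
  return out;
}
```

In this sketch the buffered data never exceeds one partial entry plus one chunk, which is the kind of fixed bound the commit message describes.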
Previously it was prefetching up to 2 object sets worth of journal data objects which consumed too much memory. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Now that it's possible for the ObjectPlayer to only read a partial subset of available entries, the JournalPlayer needs to detect that more entries might be available. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
rbd-mirror debugging involved potentially thousands of journals concurrently running. The instance address will correlate log messages between journals. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
If a future flush is requested at the exact same moment that an overflow is detected, the two threads will deadlock since locks are not taken in a consistent order. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
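The deadlock described above is the classic inconsistent lock order problem. The sketch below shows one generic way to avoid it in C++, using `std::lock` to acquire both mutexes with deadlock avoidance. It is not the fix applied in this commit, and `Recorder`, `recorder_lock`, `object_lock`, `flush`, and `handle_overflow` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in; this is not the real journal recorder class.
struct Recorder {
  std::mutex recorder_lock;  // guards flush bookkeeping
  std::mutex object_lock;    // guards per-object overflow state
  int flushes = 0;
  int overflows = 0;

  // Flush path: needs both locks.
  void flush() {
    // std::lock acquires both mutexes using a deadlock-avoidance algorithm,
    // so the argument order does not matter.
    std::lock(recorder_lock, object_lock);
    std::lock_guard<std::mutex> g1(recorder_lock, std::adopt_lock);
    std::lock_guard<std::mutex> g2(object_lock, std::adopt_lock);
    ++flushes;
  }

  // Overflow path: needs the same two locks, named in the opposite order.
  // With plain lock() calls this ordering could deadlock against flush().
  void handle_overflow() {
    std::lock(object_lock, recorder_lock);
    std::lock_guard<std::mutex> g1(object_lock, std::adopt_lock);
    std::lock_guard<std::mutex> g2(recorder_lock, std::adopt_lock);
    ++overflows;
  }
};
```

Calling `flush()` from one thread and `handle_overflow()` from another is then safe with respect to lock ordering. The alternative is to document a single acquisition order and follow it on every path.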
When streaming playback, avoid the unnecessary watch delay when one or more entries have been pruned. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Operation request op finish events should not be fire and forget. Instead, ensure the event is committed to the journal before completing the op. This will avoid several possible split-brain events during mirroring. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Ensure that, by default, IO journal events are broken up into manageable sizes when factoring in that an rbd-mirror daemon might be replaying events from thousands of images. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
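As a rough illustration of breaking an IO event into bounded pieces, the sketch below splits one write extent into chunks no larger than a configured maximum. The function name `split_write` and the parameter `max_event_bytes` are hypothetical, and the sketch ignores how the real journal encodes or appends events.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: split a write of `length` bytes starting at `offset`
// into (offset, length) pieces of at most `max_event_bytes` each.
inline std::vector<std::pair<std::uint64_t, std::uint64_t>>
split_write(std::uint64_t offset, std::uint64_t length,
            std::uint64_t max_event_bytes) {
  assert(max_event_bytes > 0);
  std::vector<std::pair<std::uint64_t, std::uint64_t>> pieces;
  while (length > 0) {
    std::uint64_t n = std::min(length, max_event_bytes);
    pieces.emplace_back(offset, n);  // one journal event per piece
    offset += n;
    length -= n;
  }
  return pieces;
}
```

For example, a 10-byte write at offset 100 with a 4-byte limit yields pieces at offsets 100, 104, and 108, the last one 2 bytes long. A replayer then only needs to hold one piece in memory at a time per image.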
Fixes: http://tracker.ceph.com/issues/16223 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
When multiple pools are being replicated, start the shut down process concurrently across all pool replayers. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Fixed lockdep issue from status update callback and fixed the potential for a stuck status state. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The cancel request could race with the actual scheduling of the image sync operation. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
librbd will replay these ops when opening an image, so rbd-mirror should also ensure these ops are replayed. Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
lgtm
No description provided.