rbd-mirror: reduce memory footprint during journal replay #10341

Merged
trociny merged 16 commits into ceph:master from wip-16223 on Jul 30, 2016

Conversation

dillaman

No description provided.

@@ -323,7 +324,7 @@ int register_journal(rados_ioctx_t ioctx, const char *image_name) {
return r;
}

- journal::Journaler journaler(io_ctx, image_id, JOURNAL_CLIENT_ID, 0);
+ journal::Journaler journaler(io_ctx, image_id, JOURNAL_CLIENT_ID, {});
Contributor

The default commit_interval is 5, but the test was using a value of 0. If the default value does not matter for the test, please ignore this comment!
The same applies to the remaining similar changes.

Author

@dillaman dillaman Jul 20, 2016

I ratcheted it down in the cases where the test case truly depended on looking at the commit position. But in general, this parameter only has an effect after a crash (i.e., it bounds the maximum number of seconds of events that would need to be replayed in the worst case).

@dillaman dillaman force-pushed the wip-16223 branch 3 times, most recently from 316474b to 52ad1ba Compare July 21, 2016 11:30

// trim empty player to prefetch the next available object
for (auto &player_pair : m_object_players) {
ObjectPlayerPtr object_player(player_pair.second);
Contributor

@dillaman What is the purpose of this line?

Author

Apparently nothing -- will clean up

Jason Dillaman added 16 commits July 21, 2016 12:52
Additional runtime configuration settings will be needed. The
new class will avoid the need to expand the constructor.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
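
A minimal sketch of the pattern this commit message describes, not the actual Ceph code. The `Settings` struct, its `max_fetch_bytes` field, and the `Journaler` stand-in below are assumptions made for illustration; only the `commit_interval` default of 5 comes from the review discussion. Grouping options in a struct with default member initializers lets a caller pass `{}` to accept every default, which matches the shape of the `{}` argument in the diff above.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical settings aggregate. Only the commit_interval default of 5
// is taken from the review thread; max_fetch_bytes is illustrative.
struct Settings {
  double commit_interval = 5;    // seconds between commit position updates
  uint64_t max_fetch_bytes = 0;  // 0 = no limit (assumed convention)
};

// Stand-in for a journaler-like class: new runtime options are added to
// Settings, so the constructor signature does not have to grow.
class Journaler {
public:
  Journaler(std::string journal_id, std::string client_id,
            const Settings &settings)
    : m_journal_id(std::move(journal_id)),
      m_client_id(std::move(client_id)),
      m_settings(settings) {}

  const Settings &settings() const { return m_settings; }

private:
  std::string m_journal_id;
  std::string m_client_id;
  Settings m_settings;
};
```

With this shape, `Journaler j("image", "client", {});` picks up all defaults, while a test that depends on the commit position can copy a `Settings`, lower `commit_interval`, and pass that instead.
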
Support fetching the full object or incremental chunks (with a
minimum of at least a single decoded entry if available).

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
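
The fetch behaviour described in this commit message can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the ObjectPlayer implementation: the journal object is modelled as a vector of already-framed entries, and `FetchResult`, `fetch`, and the "0 means unlimited" convention are hypothetical names and choices. The point it shows is the rule that a byte budget limits each fetch, but a fetch still returns at least one entry when one is available.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result of one fetch call.
struct FetchResult {
  std::vector<std::string> entries;  // entries decoded by this fetch
  std::size_t next_index;            // where the next fetch should resume
  bool more;                         // true if entries remain in the object
};

// Fetch entries starting at `start`. A max_bytes of 0 fetches the full
// object; otherwise stop once the budget would be exceeded, but always
// return at least one entry if any is available.
FetchResult fetch(const std::vector<std::string> &object,
                  std::size_t start, std::size_t max_bytes) {
  FetchResult result{{}, start, false};
  std::size_t used = 0;
  while (result.next_index < object.size()) {
    const std::string &entry = object[result.next_index];
    bool over_budget = max_bytes != 0 && used + entry.size() > max_bytes;
    if (over_budget && !result.entries.empty()) {
      break;  // budget reached and the one-entry minimum is satisfied
    }
    result.entries.push_back(entry);
    used += entry.size();
    ++result.next_index;
  }
  result.more = result.next_index < object.size();
  return result;
}
```

A caller would loop on `fetch` while `more` is true, passing `next_index` back in as `start`, so that only one chunk of entries is held in memory at a time.
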
Journal playback will need to read at least a full entry, which is
currently limited to the maximum object size. In memory-constrained
environments, this new optional limit will set a fixed upper bound on
memory usage.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Previously, it prefetched up to two object sets' worth of journal
data objects, which consumed too much memory.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Now that it's possible for the ObjectPlayer to only read a
partial subset of available entries, the JournalPlayer needs
to detect that more entries might be available.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Debugging rbd-mirror can involve potentially thousands of journals
running concurrently. The instance address makes it possible to
correlate log messages with individual journals.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
If a future flush is requested at the exact same moment that an
overflow is detected, the two threads will deadlock since locks
are not taken in a consistent order.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
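
The class of bug described in this commit message, two threads taking the same pair of locks in opposite orders, has a few standard remedies. The sketch below shows one of them using `std::lock`, which acquires both mutexes with a deadlock-avoidance algorithm. It is not the fix applied in this PR (that diff is not shown here); the `Recorder` struct, its two mutexes, and both method names are hypothetical.

```cpp
#include <mutex>

// Hypothetical recorder with two locks that both code paths need.
struct Recorder {
  std::mutex flush_lock;     // assumed: guards pending flush state
  std::mutex overflow_lock;  // assumed: guards overflow handling state
  int flushes = 0;
  int overflows = 0;

  void flush() {
    // Acquire both locks atomically with respect to deadlock, then hand
    // ownership to guards so they are released on scope exit.
    std::lock(flush_lock, overflow_lock);
    std::lock_guard<std::mutex> a(flush_lock, std::adopt_lock);
    std::lock_guard<std::mutex> b(overflow_lock, std::adopt_lock);
    ++flushes;
  }

  void handle_overflow() {
    // Same acquisition call as flush(), so the two paths cannot end up
    // each holding one lock while waiting for the other.
    std::lock(flush_lock, overflow_lock);
    std::lock_guard<std::mutex> a(flush_lock, std::adopt_lock);
    std::lock_guard<std::mutex> b(overflow_lock, std::adopt_lock);
    ++overflows;
  }
};
```

The other common remedy is to document a fixed order (for example, always `flush_lock` before `overflow_lock`) and follow it in every code path.
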
When streaming playback, avoid the unnecessary watch delay when
one or more entries have been pruned.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Operation request op finish events should not be fire-and-forget.
Instead, ensure the event is committed to the journal before
completing the op. This will avoid several possible split-brain
events during mirroring.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
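
To illustrate the difference between fire-and-forget and waiting for the commit, here is a small in-memory sketch. `FakeJournal`, `append`, `commit_all`, and `finish_op` are all hypothetical names, not librbd or journal APIs; the only idea taken from the commit message is that the op completes from the journal's "safe" callback rather than right after the event is queued.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical in-memory journal used only to illustrate ordering.
class FakeJournal {
public:
  // Queue an event; on_safe runs only once the event is committed.
  void append(int event, std::function<void()> on_safe) {
    m_pending.push_back({event, std::move(on_safe)});
  }

  // Simulate the journal committing everything queued so far.
  void commit_all() {
    std::vector<Pending> pending;
    pending.swap(m_pending);
    for (auto &p : pending) {
      m_committed.push_back(p.event);
      if (p.on_safe) {
        p.on_safe();
      }
    }
  }

  const std::vector<int> &committed() const { return m_committed; }

private:
  struct Pending {
    int event;
    std::function<void()> on_safe;
  };
  std::vector<Pending> m_pending;
  std::vector<int> m_committed;
};

// Complete the op from the journal's on_safe callback instead of
// immediately after queueing the op finish event.
void finish_op(FakeJournal &journal, int op_finish_event,
               std::function<void()> on_op_complete) {
  journal.append(op_finish_event, std::move(on_op_complete));
}
```

In this model the caller's completion callback does not fire until `commit_all` has recorded the event, so a crash between queueing and commit cannot leave a completed op with no journal record.
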
Ensure that, by default, IO journal events are broken up into manageable
sizes when factoring in that an rbd-mirror daemon might be replaying
events from thousands of images.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
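
Breaking a large IO into bounded journal events can be sketched as below. This is an assumption-laden illustration rather than the librbd code: `WriteEvent`, `split_write`, and the `max_event_bytes` parameter are hypothetical, and the real events carry payload data rather than just an extent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical journal event describing one contiguous write extent.
struct WriteEvent {
  uint64_t offset;
  uint64_t length;
};

// Split a write of `length` bytes at `offset` into events that each
// cover at most max_event_bytes, preserving order and coverage.
std::vector<WriteEvent> split_write(uint64_t offset, uint64_t length,
                                    uint64_t max_event_bytes) {
  assert(max_event_bytes > 0);
  std::vector<WriteEvent> events;
  while (length > 0) {
    uint64_t chunk = std::min(length, max_event_bytes);
    events.push_back({offset, chunk});
    offset += chunk;
    length -= chunk;
  }
  return events;
}
```

Bounding the size of each event bounds what a replayer must buffer per image, which matters when one daemon replays events from thousands of images.
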
When multiple pools are being replicated, start the shutdown
process concurrently across all pool replayers.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Fixed lockdep issue from status update callback and fixed the
potential for a stuck status state.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The cancel request could race with the actual scheduling of the image
sync operation.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
librbd will replay these ops when opening an image, so rbd-mirror
should also ensure these ops are replayed.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@dillaman dillaman changed the title from "[DNM] rbd-mirror: reduce memory footprint during journal replay" to "rbd-mirror: reduce memory footprint during journal replay" Jul 21, 2016
@trociny trociny self-assigned this Jul 21, 2016
trociny pushed a commit that referenced this pull request Jul 25, 2016
rbd-mirror: reduce memory footprint during journal replay #10341
trociny pushed a commit that referenced this pull request Jul 27, 2016
rbd-mirror: reduce memory footprint during journal replay #10341
trociny pushed a commit that referenced this pull request Jul 28, 2016
rbd-mirror: reduce memory footprint during journal replay #10341
@trociny
Contributor

trociny commented Jul 30, 2016

lgtm

@trociny trociny merged commit df2aa58 into ceph:master Jul 30, 2016
@dillaman dillaman deleted the wip-16223 branch July 30, 2016 16:30
3 participants