jewel: rbd: rbd-mirror: reduce memory footprint during journal replay #10684

Merged
30 commits merged into from
Aug 19, 2016

Conversation

@dillaman
Author

retest this please

@dillaman
Author

(unittest_bluefs seg fault causing test failure)

@ghost

ghost commented Aug 12, 2016

hey jenkins, test this please (unittest_bluefs seg fault causing test failure)

@ghost ghost changed the title jewel: rbd-mirror: reduce memory footprint during journal replay DNM: jewel: rbd-mirror: reduce memory footprint during journal replay Aug 12, 2016
@ghost ghost self-assigned this Aug 12, 2016
@ghost

ghost commented Aug 17, 2016

@dillaman could you please rebase ? #10678 has been merged, hence the conflicts.

@dillaman
Author

@dachary rebased

@ghost

ghost commented Aug 17, 2016

@dillaman thanks ! There is an actual compilation problem though:

CXX      test/rbd_mirror/image_sync/unittest_rbd_mirror-test_mock_SyncPointPruneRequest.o
test/rbd_mirror/image_sync/test_mock_ImageCopyRequest.cc:632:7: error: redefinition of ‘class rbd::mirror::image_sync::TestMockImageSyncImageCopyRequest_EmptySnapSeqs_Test’
 TEST_F(TestMockImageSyncImageCopyRequest, EmptySnapSeqs) {
       ^
test/rbd_mirror/image_sync/test_mock_ImageCopyRequest.cc:610:7: error: previous definition of ‘class 

Jason Dillaman and others added 16 commits August 17, 2016 13:22
Fixes: http://tracker.ceph.com/issues/16489
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c97f724)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 48f301d)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 03c2aec)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 1fc2754)
Fixes: http://tracker.ceph.com/issues/16349
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2f55aa5)
The watching object name is changed when renaming an old format image,
so unregister the watcher before the rename, and register back after,
to avoid "Transport endpoint is not connected" error.

Fixes: http://tracker.ceph.com/issues/16321
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
(cherry picked from commit 1a3973c)
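
The ordering described in the commit message above (unregister the watcher, rename, then register again under the new object name) can be sketched roughly as follows. This is only an illustration: `HeaderWatcher`, `Image`, `header_object_name`, and the `.rbd` suffix used here are hypothetical stand-ins, not the actual librbd classes or code from this change.

```cpp
#include <string>

// Hypothetical stand-in for a header watcher; not librbd's real class.
struct HeaderWatcher {
  std::string watched_object;  // empty means "no watch registered"

  bool registered() const { return !watched_object.empty(); }
  void register_watch(const std::string &object) { watched_object = object; }
  void unregister_watch() { watched_object.clear(); }
};

// Hypothetical image handle holding its name and its watcher.
struct Image {
  std::string name;
  HeaderWatcher watcher;
};

// Assumption for illustration: the watched object name is derived from the
// image name, so it changes whenever the image is renamed.
inline std::string header_object_name(const std::string &image_name) {
  return image_name + ".rbd";
}

// Rename while keeping the watch consistent: drop the watch on the old
// object first, rename, then watch the object under its new name.
inline void rename_image(Image &image, const std::string &new_name) {
  const bool was_watching = image.watcher.registered();
  if (was_watching) {
    image.watcher.unregister_watch();
  }
  image.name = new_name;
  if (was_watching) {
    image.watcher.register_watch(header_object_name(new_name));
  }
}
```

The point of the sketch is only the ordering: the watch is never left pointing at an object name that no longer exists.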
Snapshot rename operations utilize the (cluster) unique snapshot
sequence to prevent attempts at replays. When mirroring to a
different cluster, these sequences will not align.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2f4cb26)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 77699bf)

Conflicts:
	src/test/librbd/mock/MockOperations.h: no shrink restriction
Remote peers need a key to map snapshot ids between clusters.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f70b90c)
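
As a rough illustration of the two commit messages above: because snapshot sequences are unique only within one cluster, a replayer needs an explicit table from remote snapshot ids to local ones. The sketch below assumes such a table is a plain map; the `SnapSeqs` alias (the name is borrowed from the `EmptySnapSeqs` test mentioned earlier in this thread) and the `map_remote_snap` helper are hypothetical and not taken from the Ceph source.

```cpp
#include <cstdint>
#include <map>

// Hypothetical mapping: remote cluster snapshot id -> local cluster snapshot id.
using SnapSeqs = std::map<uint64_t, uint64_t>;

// Translate a remote snapshot id into the local id.
// Returns false (and leaves *local_snap_id untouched) when the remote id is
// unknown, so the caller can decide how to handle the missing mapping.
inline bool map_remote_snap(const SnapSeqs &snap_seqs,
                            uint64_t remote_snap_id,
                            uint64_t *local_snap_id) {
  auto it = snap_seqs.find(remote_snap_id);
  if (it == snap_seqs.end()) {
    return false;
  }
  *local_snap_id = it->second;
  return true;
}
```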
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 57cd75e)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 270cb74)

Conflicts:
	src/librbd/journal/Replay.cc: no snap limit restriction
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit fdfca55)
Fixes: http://tracker.ceph.com/issues/16622
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 4df913d)
Additional runtime configuration settings will be needed. The
new class will avoid the need to expand the constructor.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit dad8328)
Support fetching the full object or incremental chunks (with a
minimum of at least a single decoded entry if available).

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f7362e9)
Journal playback needs to read at least one full entry, whose size was
previously bounded only by the maximum object size. In memory-constrained
environments, this new optional limit sets a fixed upper bound on
memory usage.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 8c1877b)
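
The two commits above describe reading a journal object either in full or in bounded chunks, continuing until at least one complete entry has been decoded. A minimal sketch of that loop follows. It assumes a toy wire format (a 1-byte length prefix followed by the payload) and hypothetical names (`FetchResult`, `fetch_entries`, `max_fetch_bytes`); the real journal encoding and the ObjectPlayer interface are different.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Result of one fetch: the decoded entries and how much of the object was read.
struct FetchResult {
  std::vector<std::string> entries;
  std::size_t bytes_read = 0;
};

// Toy format (an assumption for illustration): each entry is a 1-byte length
// followed by that many payload bytes.
// max_fetch_bytes == 0 means "no limit": read the whole object in one step.
inline FetchResult fetch_entries(const std::string &object,
                                 std::size_t max_fetch_bytes) {
  FetchResult result;
  std::size_t decoded_to = 0;
  const std::size_t step =
      (max_fetch_bytes == 0) ? object.size() : max_fetch_bytes;

  while (result.bytes_read < object.size()) {
    // Read one more chunk, never past the end of the object.
    result.bytes_read = std::min(object.size(), result.bytes_read + step);

    // Decode every entry that is completely contained in the bytes read so far.
    while (decoded_to < result.bytes_read) {
      const std::size_t len = static_cast<unsigned char>(object[decoded_to]);
      if (decoded_to + 1 + len > result.bytes_read) {
        break;  // partial entry: more data is needed
      }
      result.entries.push_back(object.substr(decoded_to + 1, len));
      decoded_to += 1 + len;
    }

    // Stop as soon as at least one entry is available.
    if (!result.entries.empty()) {
      break;
    }
  }
  return result;
}
```

With a small limit the caller gets back as few bytes as are needed for one entry and has to ask again for more, which is the trade-off between memory use and round trips that the commit messages point at.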
Jason Dillaman and others added 14 commits August 17, 2016 13:22
Previously it was prefetching up to 2 object sets worth of journal
data objects which consumed too much memory.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2666d36)
Now that it's possible for the ObjectPlayer to only read a
partial subset of available entries, the JournalPlayer needs
to detect that more entries might be available.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 28d5ca1)
rbd-mirror debugging involves potentially thousands of journals
running concurrently. The instance address will correlate log
messages between journals.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 11475f4)
If a future flush is requested at the exact same moment that an
overflow is detected, the two threads will deadlock since locks
are not taken in a consistent order.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2c65471)
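
The deadlock described in the commit above is the classic lock-ordering problem: two code paths take the same pair of locks in opposite orders. One common remedy, sketched here, is to make every path acquire the locks in the same fixed order. The `Recorder` class and its member names are hypothetical and only illustrate the pattern; this is not the code from the actual fix.

```cpp
#include <mutex>

// Hypothetical recorder with two locks; names are illustrative only.
class Recorder {
public:
  // Flush path: takes m_lock first, then m_object_lock.
  void flush() {
    std::lock_guard<std::mutex> outer(m_lock);
    std::lock_guard<std::mutex> inner(m_object_lock);
    ++m_flushes;
  }

  // Overflow path: uses the same order (m_lock, then m_object_lock), so it
  // can never hold m_object_lock while waiting for m_lock.
  void handle_overflow() {
    std::lock_guard<std::mutex> outer(m_lock);
    std::lock_guard<std::mutex> inner(m_object_lock);
    ++m_overflows;
  }

  int flushes() {
    std::lock_guard<std::mutex> guard(m_lock);
    return m_flushes;
  }

  int overflows() {
    std::lock_guard<std::mutex> guard(m_lock);
    return m_overflows;
  }

private:
  std::mutex m_lock;         // always acquired first
  std::mutex m_object_lock;  // always acquired second
  int m_flushes = 0;
  int m_overflows = 0;
};
```

Where a fixed order is hard to maintain by convention, `std::lock` can acquire several mutexes at once with built-in deadlock avoidance.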
When streaming playback, avoid the unnecessary watch delay when
one or more entries have been pruned.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 08a8ee9)
Operation request op finish events should not be fire and forget.
Instead, ensure the event is committed to the journal before
completing the op. This will avoid several possible split-brain
events during mirroring.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 47e0fbf)

Conflicts:
	src/test/librbd/operation/test_mock_ResizeRequest.cc: no shrink restriction
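
To illustrate the difference between fire-and-forget and waiting for the journal commit, as described in the commit above: in the sketch below the operation's completion callback is handed to the journal and only runs once the finish event has been committed. `MockJournal` and `finish_op` are hypothetical names invented for this example; they are not librbd's journaling API.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory journal: events stay pending until commit_all().
struct MockJournal {
  std::vector<std::pair<std::string, std::function<void()>>> pending;
  std::vector<std::string> committed;

  // Queue an event together with a callback to run once it is committed.
  void append(std::string event, std::function<void()> on_safe) {
    pending.emplace_back(std::move(event), std::move(on_safe));
  }

  // Commit everything queued so far and fire the callbacks afterwards.
  void commit_all() {
    auto batch = std::move(pending);
    pending.clear();
    for (auto &item : batch) {
      committed.push_back(item.first);
      if (item.second) {
        item.second();
      }
    }
  }
};

// Instead of appending the finish event and completing the op immediately,
// the op's completion is deferred until the event is safely in the journal.
inline void finish_op(MockJournal &journal,
                      std::function<void()> on_op_complete) {
  journal.append("op_finish", std::move(on_op_complete));
}
```

In this arrangement a reader of the journal can never observe a completed operation whose finish event is missing.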
Ensure that, by default, IO journal events are broken up into manageable
sizes when factoring in that an rbd-mirror daemon might be replaying
events from thousands of images.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 11d7500)
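
Breaking a large IO into bounded journal events, as the commit above describes, amounts to splitting an (offset, length) extent into pieces no larger than some maximum. A minimal sketch, assuming a hypothetical `split_io` helper and `max_event_bytes` parameter (the actual configuration option and its default are not shown in this thread):

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Each element is an (offset, length) pair describing one journal event.
using Extents = std::vector<std::pair<uint64_t, uint64_t>>;

// Split one IO into events of at most max_event_bytes each.
// max_event_bytes == 0 is treated here as "do not split".
inline Extents split_io(uint64_t offset, uint64_t length,
                        uint64_t max_event_bytes) {
  Extents extents;
  if (max_event_bytes == 0) {
    if (length > 0) {
      extents.emplace_back(offset, length);
    }
    return extents;
  }
  while (length > 0) {
    const uint64_t chunk = std::min(length, max_event_bytes);
    extents.emplace_back(offset, chunk);
    offset += chunk;
    length -= chunk;
  }
  return extents;
}
```

A replayer handling many images at once then never has to hold more than one bounded event per image in memory, rather than an arbitrarily large write.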
Fixes: http://tracker.ceph.com/issues/16223
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 24883e0)
When multiple pools are being replicated, start the shut down
process concurrently across all pool replayers.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 73cdd08)
Fixed lockdep issue from status update callback and fixed the
potential for a stuck status state.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 0275c7c)
The cancel request could race with the actual scheduling of the image
sync operation.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit e6cdf95)
librbd will replay these ops when opening an image, so rbd-mirror
should also ensure these ops are replayed.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 862e581)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 574be74)
Fixes: http://tracker.ceph.com/issues/16539
Signed-off-by: Mykola Golub <mgolub@mirantis.com>
(cherry picked from commit 06a333b)
@ghost ghost changed the title DNM: jewel: rbd-mirror: reduce memory footprint during journal replay jewel: rbd-mirror: reduce memory footprint during journal replay Aug 17, 2016
ghost pushed a commit that referenced this pull request Aug 17, 2016
… during journal replay

Reviewed-by: Loic Dachary <ldachary@redhat.com>
@ghost

ghost commented Aug 18, 2016

Testing in progress at http://tracker.ceph.com/issues/16904#note-5

@ghost

ghost commented Aug 18, 2016

@dillaman it passed an rbd suite run, modulo (I think ;-) known false negatives: http://tracker.ceph.com/issues/16904#note-5. Good to merge?

@dillaman
Author

👍

@ghost ghost merged commit b98e27c into ceph:jewel Aug 19, 2016
@dillaman dillaman deleted the wip-16904-jewel branch August 19, 2016 17:08
@ghost ghost changed the title jewel: rbd-mirror: reduce memory footprint during journal replay jewel: rbd: rbd-mirror: reduce memory footprint during journal replay Aug 25, 2016
This pull request was closed.