disable restart for destroy-pending repl-dev by JacksonYao287 · Pull Request #605 · eBay/HomeStore

JacksonYao287 · 2024-12-11T03:31:45Z

When we destroy a repl_dev, we first mark it as destroy-pending; a background GC thread then periodically finds such repl_devs and permanently destroys them. However, if a crash happens before the repl_dev is permanently destroyed, two issues remain.

1 A destroy-pending repl_dev is not put into m_rd_map, so raft_group_config_found returns a nullptr for this repl_dev, and repl_dev->restart then causes a null-pointer fault (fixed).

2 When permanently destroying a repl_dev, we call:

    m_rd_sb.destroy();
    m_raft_config_sb.destroy();
    m_data_journal->remove_store();
    logstore_service().destroy_log_dev(m_data_journal->logdev_id());

If a crash happens after m_rd_sb.destroy() but before m_data_journal->remove_store(), we have no chance to reclaim the log-related resources of this repl_dev.

This PR checks for and reclaims those resources at startup, and destroys the repl_dev superblock only after all the related resources have been reclaimed.
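The ordering argument can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the HomeStore implementation: `ReplDevState`, `CrashPoint`, `SimulatedCrash`, `destroy_permanently`, `recover_on_start` and `fully_reclaimed` are hypothetical names, and the boolean flags merely stand in for the superblock, raft config superblock, journal store and logdev. The point it shows is that when the superblock is removed last, a crash at any earlier step leaves enough state behind for startup to finish the job.

```cpp
#include <stdexcept>

// Hypothetical model of one repl_dev's persistent state. The flags only mirror
// m_rd_sb, m_raft_config_sb, the journal store and the logdev; they are not
// the real HomeStore objects.
struct ReplDevState {
    bool destroy_pending = false;
    bool superblk_present = true;
    bool raft_config_present = true;
    bool journal_store_present = true;
    bool logdev_present = true;
};

// Where the simulated crash is injected during destruction.
enum class CrashPoint { None, AfterRaftConfig, AfterJournalStore, AfterLogDev };

struct SimulatedCrash : std::runtime_error {
    SimulatedCrash() : std::runtime_error("simulated crash") {}
};

// Reclaim every dependent resource first and drop the superblock last.
// Each step is idempotent, so re-running it after a crash is harmless.
void destroy_permanently(ReplDevState& s, CrashPoint crash = CrashPoint::None) {
    s.raft_config_present = false;
    if (crash == CrashPoint::AfterRaftConfig) { throw SimulatedCrash{}; }
    s.journal_store_present = false;
    if (crash == CrashPoint::AfterJournalStore) { throw SimulatedCrash{}; }
    s.logdev_present = false;
    if (crash == CrashPoint::AfterLogDev) { throw SimulatedCrash{}; }
    s.superblk_present = false; // the last record that the repl_dev ever existed
}

// Startup path: a surviving superblock in destroy-pending state means the
// previous destruction did not finish, so it is simply redone.
void recover_on_start(ReplDevState& s) {
    if (s.superblk_present && s.destroy_pending) { destroy_permanently(s); }
}

bool fully_reclaimed(const ReplDevState& s) {
    return !s.superblk_present && !s.raft_config_present &&
           !s.journal_store_present && !s.logdev_present;
}
```

With the original order (superblock destroyed first), a crash after the first step would leave the journal store and logdev with no record pointing at them, which is the leak described above.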

codecov-commenter · 2024-12-11T04:01:16Z

Codecov Report

Attention: Patch coverage is 10.00000% with 9 lines in your changes missing coverage. Please review.

Project coverage is 66.51%. Comparing base (1a0cef8) to head (4933dfb).
Report is 103 commits behind head on master.

Files with missing lines	Patch %	Lines
src/lib/logstore/log_dev.cpp	0.00%	2 Missing and 1 partial ⚠️
src/lib/logstore/log_store_service.cpp	0.00%	2 Missing and 1 partial ⚠️
src/lib/replication/service/raft_repl_service.cpp	0.00%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #605       +/-   ##
===========================================
+ Coverage   56.51%   66.51%   +10.00%     
===========================================
  Files         108      109        +1     
  Lines       10300    10836      +536     
  Branches     1402     1484       +82     
===========================================
+ Hits         5821     7208     +1387     
+ Misses       3894     2921      -973     
- Partials      585      707      +122


xiaoxichen

lgtm in general.

src/lib/replication/service/raft_repl_service.cpp

xiaoxichen

Also, one thing I am not clear about:

void LogDev::handle_unopened_log_stores(bool format) {

This seems to pursue a similar goal, namely GC-ing leaked log stores. It does not take care of the logdev, but would it be hard to add that, so that when a logdev has zero open log stores we GC the logdev?

I think it is better to have one clear solution. If we believe the GC is handled here, handle_unopened_log_stores should be changed to an ASSERT or LOG_ERROR.

xiaoxichen

lgtm

Feel free to merge if you want to create a ticket and work on the enhancement later.

JacksonYao287 · 2024-12-16T08:03:47Z

I think better to have one clear solution

This is a good question. After going through the code, the logic is as follows.

1 destroy log store
For a particular logdev, a log store is put into the pending-GC state (in m_garbage_store_ids) in two cases:
1 it was not opened before the logdev started
2 it was removed (remove_log_store) at runtime.

When log truncation happens, a pending-GC log store is permanently destroyed once all of its log entries have been truncated.

2 destroy logdev
This releases all the chunks of the logdev and deletes its metablk. As a result, all the log stores of this logdev are also destroyed, since all the log store ids of this logdev are kept in the metablk of this logdev.

Coming to the repl_dev case: if we do not open the logdev, it will also be permanently destroyed.
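The log store lifecycle described above can be sketched as a toy model. Everything here is an assumption made for illustration: `LogDevModel` and its methods are hypothetical, each store is reduced to the index of its last log entry, and only the name `m_garbage_store_ids` is borrowed from the discussion. It is not the `LogDev` class.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical model of a logdev that tracks which log stores are pending GC.
class LogDevModel {
public:
    // Register a store whose newest log entry has index last_entry_idx.
    void add_store(std::uint32_t id, std::int64_t last_entry_idx) { m_stores[id] = last_entry_idx; }

    void open_store(std::uint32_t id) { m_opened.insert(id); }

    // Case 1: at logdev start, every store nobody opened becomes garbage.
    void mark_unopened_as_garbage() {
        for (const auto& kv : m_stores) {
            if (m_opened.count(kv.first) == 0) { m_garbage_store_ids.insert(kv.first); }
        }
    }

    // Case 2: a store removed at runtime becomes garbage, but is not freed yet.
    void remove_log_store(std::uint32_t id) {
        if (m_stores.count(id) != 0) { m_garbage_store_ids.insert(id); }
    }

    // Truncation: a garbage store is permanently destroyed only once all of
    // its entries are at or below the truncation point.
    void truncate(std::int64_t upto_idx) {
        for (auto it = m_garbage_store_ids.begin(); it != m_garbage_store_ids.end();) {
            const std::uint32_t id = *it;
            auto sit = m_stores.find(id);
            if (sit == m_stores.end() || sit->second <= upto_idx) {
                if (sit != m_stores.end()) { m_stores.erase(sit); }
                m_opened.erase(id);
                it = m_garbage_store_ids.erase(it);
            } else {
                ++it;
            }
        }
    }

    bool exists(std::uint32_t id) const { return m_stores.count(id) != 0; }
    bool pending_gc(std::uint32_t id) const { return m_garbage_store_ids.count(id) != 0; }

private:
    std::map<std::uint32_t, std::int64_t> m_stores; // store id -> last entry index
    std::set<std::uint32_t> m_opened;
    std::set<std::uint32_t> m_garbage_store_ids;
};
```

In this model a store that still holds entries above the truncation point stays in the pending set, which matches the description that destruction is deferred until truncation catches up.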

JacksonYao287 · 2024-12-16T08:26:54Z

src/lib/replication/service/raft_repl_service.cpp

+        // skip it.
+
+        // 3 logdev will be destroyed in delete_unopened_logdevs() if we don't open it(create repl_dev) here, so skip
+        // it.


We do not need to do anything here: if we do not create the repl_dev, the related logdev will not be opened, and as a result the logdev will eventually be destroyed at

HomeStore/src/lib/replication/service/raft_repl_service.cpp

Line 176 in 4933dfb

hs()->logstore_service().delete_unopened_logdevs();
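The reasoning in this comment relies on an "anything not opened during startup is garbage" sweep. A minimal sketch of that idea, assuming a hypothetical `LogStoreServiceModel` that only tracks logdev ids (the real `delete_unopened_logdevs()` is not reproduced here and may work differently):

```cpp
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model of a service that knows which logdevs exist on disk and
// which of them were opened while repl_devs were being loaded.
class LogStoreServiceModel {
public:
    explicit LogStoreServiceModel(std::set<std::uint32_t> on_disk) : m_on_disk(std::move(on_disk)) {}

    // Opening succeeds only for logdevs that exist on disk.
    bool open_logdev(std::uint32_t id) {
        if (m_on_disk.count(id) == 0) { return false; }
        m_opened.insert(id);
        return true;
    }

    // Sweep run after all live repl_devs are loaded: every logdev that nobody
    // opened is deleted. Returns the ids that were deleted.
    std::vector<std::uint32_t> delete_unopened_logdevs() {
        std::vector<std::uint32_t> deleted;
        for (auto it = m_on_disk.begin(); it != m_on_disk.end();) {
            if (m_opened.count(*it) == 0) {
                deleted.push_back(*it);
                it = m_on_disk.erase(it);
            } else {
                ++it;
            }
        }
        return deleted;
    }

    bool has_logdev(std::uint32_t id) const { return m_on_disk.count(id) != 0; }

private:
    std::set<std::uint32_t> m_on_disk;
    std::set<std::uint32_t> m_opened;
};
```

Under this model, skipping the creation of a destroy-pending repl_dev is enough: its logdev is never opened, so the sweep reclaims it without any dedicated cleanup code.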

JacksonYao287 marked this pull request as draft December 11, 2024 03:35

JacksonYao287 marked this pull request as ready for review December 11, 2024 07:11

JacksonYao287 requested review from sanebay, xiaoxichen and yamingk December 11, 2024 08:29

Hooper9973 mentioned this pull request Dec 12, 2024

PG cleanup for moved out member eBay/HomeObject#242

Merged

xiaoxichen reviewed Dec 16, 2024

src/lib/replication/service/raft_repl_service.cpp (resolved)

xiaoxichen reviewed Dec 16, 2024

src/lib/replication/service/raft_repl_service.cpp (outdated, resolved)

xiaoxichen reviewed Dec 16, 2024

xiaoxichen previously approved these changes Dec 16, 2024

JacksonYao287 added 3 commits December 16, 2024 01:05

disable restart for destroy-pending repl-dev

8a50fa5

guarantee reclaim all the stale repl_dev resource after restart

3cfcb45

update

4933dfb

JacksonYao287 dismissed xiaoxichen’s stale review via 4933dfb December 16, 2024 08:23

JacksonYao287 force-pushed the donot-restart-destroy-pending-repl branch from fba9034 to 4933dfb Compare December 16, 2024 08:23

JacksonYao287 commented Dec 16, 2024

JacksonYao287 requested a review from xiaoxichen December 16, 2024 08:27

xiaoxichen approved these changes Dec 16, 2024

JacksonYao287 merged commit 6756b81 into eBay:master Dec 17, 2024
21 checks passed

JacksonYao287 deleted the donot-restart-destroy-pending-repl branch December 17, 2024 00:19

hkadayam pushed a commit to hkadayam/HomeStore that referenced this pull request Aug 7, 2025

disable restart for destroy-pending repl-dev (eBay#605)

56c8f35
