
Refactor and support option migration for db with multiple CFs#14059

Closed
hx235 wants to merge 2 commits into facebook:main from hx235:migrate_cf_2

Conversation


@hx235 hx235 commented Oct 16, 2025

Context/Summary:
This PR adds multi-CF support to option migration. The original implementation sets options, opens the DB, compacts files, and reopens the DB in almost all three of the branches below. That design makes expanding to multiple CFs difficult, since every branch would need the same changes, causing code redundancy.

Status OptionChangeMigration(std::string dbname, const Options& old_opts,
                             const Options& new_opts) {
  if (old_opts.compaction_style == CompactionStyle::kCompactionStyleFIFO) {
    // LSM generated by FIFO compaction can be opened by any compaction.
    return Status::OK();
  } else if (new_opts.compaction_style ==
             CompactionStyle::kCompactionStyleUniversal) {
    return MigrateToUniversal(dbname, old_opts, new_opts);
  } else if (new_opts.compaction_style ==
             CompactionStyle::kCompactionStyleLevel) {
    return MigrateToLevelBase(dbname, old_opts, new_opts);
  } else if (new_opts.compaction_style ==
             CompactionStyle::kCompactionStyleFIFO) {
    return CompactToLevel(old_opts, dbname, 0, 0 /* l0_file_size */, true);
  } else {
    return Status::NotSupported(
        "Do not how to migrate to this compaction style");
  }
}

Therefore this PR:

  • Refactors the option migration implementation by moving the common parts into the high-level OptionChangeMigration() through PrepareNoCompactionCFDescriptors() and OpenDBWithCFs(), so MigrateAllCFs() can focus on compaction only
  • Treats the original OptionChangeMigration() API as a special case of the multi-CF option migration
  • Adds multi-CF support

A few notes:

  • CompactToLevel() originally modified the compaction-related options conditionally before compacting. That logic is moved into earlier steps through ApplySpecialSingleLevelSettings() in PrepareNoCompactionCFDescriptors()
  • MigrateToUniversal() originally opened the DB twice with essentially the same options. This PR reduces that to one open
  • Option migration does not always use the old options to compact the DB and reopen it after migration; see return CompactToLevel(new_opts, dbname, new_opts.num_levels - 1, /*l0_file_size=*/0, false);. PrepareNoCompactionCFDescriptors() is where those decisions are handled

Test plan:

  • Existing UTs
  • New UTs

@meta-cla meta-cla bot added the CLA Signed label Oct 16, 2025
@hx235 hx235 changed the title Support option migration for db with multiple CFs Refactor and support option migration for db with multiple CFs Oct 16, 2025

meta-codesync bot commented Oct 16, 2025

@hx235 has imported this pull request. If you are a Meta employee, you can view this in D84852970.

@hx235 hx235 force-pushed the migrate_cf_2 branch 4 times, most recently from 3cadcba to 632e2ce on October 17, 2025 18:12

hx235 commented Oct 17, 2025

The crash test failed with an irrelevant error: ReadAsync failed with Not implemented: ReadAsync: ROCKSDB_IOURING_PRESENT is not set


@xingbowang xingbowang left a comment


Looks good to me. I am curious whether our existing crash test supports config migration. If so, do we need to expand it to support multiple CFs?

// violated.
//
// WARNING: using this to migrate from non-FIFO to FIFO compaction
// with `max_table_files_size` > 0 can cause the whole DB to be dropped right

Given the data-dropping concern, is it possible to add an extra flag fail_on_data_deletion=true during the migration, so that if any data would be deleted during migration, the migration fails and prints how much data would be deleted? If data deletion is intentional, the client could set the flag to false and retry, to make sure it is indeed the intended behavior. I know we don't like to add new flags. However, given the potential impact of accidental data deletion, I feel it is worth adding a new flag for this if possible.


@hx235 hx235 Nov 17, 2025


so that if any data would be deleted during migration, the migration fails and prints how much data would be deleted. ... the client could set the flag to false and retry,

That's a good suggestion. The challenge is correctly estimating the to-be-deleted data without actually doing the compaction, so that users can retry later. I will need to think more in a different PR. The API has been here for 5+ years, and hopefully the WARNING (and the well-known nature of FIFO compaction dropping data) has been around long enough that existing users won't make that mistake.

new_opts.compaction_style == CompactionStyle::kCompactionStyleLevel) &&
new_opts.num_levels == 1) ||
new_opts.compaction_style == CompactionStyle::kCompactionStyleFIFO) {
base_opts->target_file_size_base = 999999999999999;

UINT64_MAX?


Thanks - it ends up having to be UINT64_MAX/2 to avoid some overflow in later computation with target_file_size_base.


@hx235 hx235 Nov 19, 2025


Actually, there exists a subtle bug in the later computation with target_file_size_base, in MultiplyCheckOverflow():

uint64_t MultiplyCheckOverflow(uint64_t op1, double op2) {
  if (op1 == 0 || op2 <= 0) {
    return 0;
  }

  // Buggy check!
  if (std::numeric_limits<uint64_t>::max() / op1 < op2) {
    return op1;
  }

  return static_cast<uint64_t>(op1 * op2);
}

This check does not catch the troublesome case where op1 * op2 == std::numeric_limits<uint64_t>::max(). It's troublesome because op2 is a double, so op1 * op2 is promoted to double first. std::numeric_limits<uint64_t>::max() can't be represented precisely in a double, and during promotion it can be rounded up to the next precisely representable double, 2^64, when rounding up has less error than rounding down. The rounded-up result is then greater than std::numeric_limits<uint64_t>::max(), so the cast fails, as in https://github.com/facebook/rocksdb/actions/runs/19486118069/job/55768715545?pr=14059

For this case, changing < to <= is enough (see #14132 for the fix).

But I'm worried about cases where op1 * op2 equals a number slightly smaller than std::numeric_limits<uint64_t>::max() that still can't be represented precisely and similarly gets rounded up to 2^64. The std::numeric_limits<uint64_t>::max() / op1 <= op2 check doesn't seem enough to catch that. Before I figure out whether this case is possible, I will keep the old/existing value 999999999999999 instead of making it bigger (and more likely to hit this bug if it exists).


hx235 commented Nov 17, 2025

I am curious whether our existing crash test supports config migration. If so, do we need to expand it to support multi-CF?

Option migration is on my TODO list to add to stress/crash test. FIFO is more challenging but possible.

return s;
}

// Step 2: Prepare no-compaction CF descriptors

Confused by the naming here: what does PrepareNoCompaction* mean?


@hx235 hx235 Nov 17, 2025


It means creating CF options that cause no automatic compaction to happen in that CF, so that the migration can rely entirely on manual compaction.

cro.target_level = dest_level;

if (dest_level == 0) {
// cannot use kForceOptimized because the compaction is expected to

Do you know why we want a single L0 file?


Looking at the history (420bdb4), it was to prevent a trivial move to L0 from violating an assertion. Will add a note.


// Step 6: Reopen DB if needed to rewrite manifest
if (s.ok() && any_need_reopen) {
db.reset();

Check DB close status?


@hx235 hx235 Nov 19, 2025


Fixed. I also had to fix a few more places where reset() was being used to close the DB.


@cbi42 cbi42 left a comment


LGTM

@hx235 hx235 force-pushed the migrate_cf_2 branch 3 times, most recently from 49e423e to 3436d15 on November 19, 2025 11:49

meta-codesync bot commented Nov 19, 2025

@hx235 merged this pull request in 57a6fb9.
