Rename and clarify CompactionJobStats::has_num_input_records for clarity and set true by default #13929

Closed
hx235 wants to merge 1 commit into facebook:main from hx235:rename_has_num_input_records

@hx235 hx235 commented Sep 9, 2025

Context/Summary:
Internally, CompactionJobStats::num_input_records is only used for input record count verification, and that verification always checks CompactionJobStats::has_num_input_records (now renamed) before using the field. The check is needed because CompactionJobStats::num_input_records gets its value from CompactionIterator::NumInputEntryScanned() in a subcompaction, and that number can be intentionally inaccurate for performance reasons; see CompactionIterator::must_count_input_entries for more.

  • This PR renames CompactionJobStats::has_num_input_records to the more explicit has_accurate_num_input_records and adds more comments. Not a behavior change.
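
A minimal sketch of the check-before-use pattern described above. `StatsSketch` and `VerifyInputRecordCount` are hypothetical names made up for illustration; only the two field names come from this PR, and the real verification code in RocksDB may look different.

```cpp
#include <cstdint>
#include <string>

// Simplified stand-in for the two CompactionJobStats fields discussed here.
// The real struct has many more members.
struct StatsSketch {
  std::uint64_t num_input_records = 0;
  bool has_accurate_num_input_records = true;
};

// Hypothetical helper: returns an empty string when verification passes or is
// skipped, and an error message when the counts disagree.
std::string VerifyInputRecordCount(const StatsSketch& stats,
                                   std::uint64_t expected_records) {
  if (!stats.has_accurate_num_input_records) {
    // The count may be deliberately imprecise, so comparing it against the
    // expected value would only produce false alarms. Skip the check.
    return "";
  }
  if (stats.num_input_records != expected_records) {
    return "input record count mismatch: expected " +
           std::to_string(expected_records) + ", got " +
           std::to_string(stats.num_input_records);
  }
  return "";
}
```

With the flag set to false, the helper reports no error even when the two counts differ, which is why an inaccurate count is harmless as long as every reader checks the flag first.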

Also, CompactionJobStats::has_num_input_records is aggregated across all subcompactions with an AND operation, so it is false if any subcompaction has the field set to false. The default value should therefore be "true" so that the aggregate is not mistakenly "false" from the start. We are currently fine because CompactionJobStats::Reset(), which sets the value to true, is always called before such aggregation.

  • This PR changes the default value to be true.
  • Resumable compaction development plans to set CompactionJobStats::has_num_input_records to false if the previous compaction carries inaccurate records. So that this is not overwritten by subsequent progress, this PR also changes the flag's = to an AND operation (&=) and the count's = to +=. With CompactionJobStats::has_num_input_records now defaulting to true (or Reset() already called) and CompactionJobStats::num_input_records already 0, this is not a behavior change.
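
The two points above can be sketched as follows. `SubStats`, `RecordProgress`, and `Aggregate` are hypothetical names used only for this illustration, not RocksDB APIs. The sketch assumes the flag defaults to true, the identity for AND, and the count to 0, the identity for addition.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the per-subcompaction stats.
struct SubStats {
  std::uint64_t num_input_records = 0;
  // Defaults to true so that AND-ing it with real values cannot turn the
  // result false by accident.
  bool has_accurate_num_input_records = true;
};

// Hypothetical per-subcompaction update. Using &= and += instead of plain
// assignment keeps any state carried over from earlier progress.
void RecordProgress(SubStats& stats, std::uint64_t scanned,
                    bool count_is_accurate) {
  stats.has_accurate_num_input_records &= count_is_accurate;
  stats.num_input_records += scanned;
}

// Hypothetical aggregation across subcompactions: the total is accurate only
// if every subcompaction's count is accurate.
SubStats Aggregate(const std::vector<SubStats>& subcompactions) {
  SubStats total;
  for (const SubStats& sub : subcompactions) {
    total.has_accurate_num_input_records &= sub.has_accurate_num_input_records;
    total.num_input_records += sub.num_input_records;
  }
  return total;
}
```

If the default were false, `Aggregate` would report an inaccurate total no matter what the subcompactions reported. Likewise, a plain assignment in `RecordProgress` would discard a false flag carried over from earlier progress.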

Test:

  • Existing unit tests cover that "...changes the default value to be true" is safe.

@meta-cla meta-cla bot added the CLA Signed label Sep 9, 2025
@hx235 hx235 changed the title Rename has_num_input_records for clarity and set true by default Rename CompactionJobStats::has_num_input_records for clarity and set true by default Sep 9, 2025
@facebook-github-bot
@hx235 has imported this pull request. If you are a Meta employee, you can view this in D82014912.

hx235 commented Sep 9, 2025

The gcc_no_test_run/clang-analyze failures are unrelated: they also show up in other simple changes such as https://github.com/facebook/rocksdb/actions/runs/17566358798/job/49894009961?pr=13924 and https://github.com/facebook/rocksdb/actions/runs/17566358798/job/49894009939?pr=13924

@jaykorean jaykorean left a comment

Would you be able to add the context in the summary? Also a new test that would fail before the change and succeed after the change would be greatly helpful.

    sub_compact->compaction_job_stats.has_accurate_num_input_records &=
        c_iter->HasNumInputEntryScanned();
-   sub_compact->compaction_job_stats.num_input_records =
+   sub_compact->compaction_job_stats.num_input_records +=

Does this mean the number of input records has been wrong when max_subcompactions > 1? I'm a little surprised that this wasn't caught by CompactionJobStatsTest::CompactionJobStatsTest

@hx235 hx235 Sep 9, 2025

It can be wrong even when max_subcompactions = 1 (edited). But it is not causing us trouble now because we always check CompactionJobStats::has_num_input_records before using this field, and we only use this field in input count verification.

> It can be wrong even when max_subcompactions = 0

Would you be able to elaborate a little more? Also, what does max_subcompactions = 0 mean? My understanding is that max_subcompactions = 1, the default value, means no subcompaction.

If there's no sub_compaction (basically no chunking, compaction = one sub compaction), wouldn't we have the correct record count in the current code?

Sorry, I meant 1. With 1 subcompaction, that one subcompaction still uses c_iter and gets NumInputEntryScanned() from it. NumInputEntryScanned() can be inaccurate for the reason cited in the PR summary: https://github.com/facebook/rocksdb/pull/13929/files#diff-e6c876f655a21865c0f3dff94b9763f1bd40cf88a8a86f04868201b2e845a890R186-R199

I thought must_count_input_entries is always true for compactions. I thought it was false for Flushes only, but I just learned that sub_compact->compaction->DoesInputReferenceBlobFiles() being false can set must_count_input_entries false. Now it makes sense to me that a single subcompaction's input record count could have been wrong in some cases like referencing blob files (not sure if there are other cases that we set must_count_input_entries false, though).
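
To make the trade-off concrete, here is a toy model of how an iterator could give up an exact scan count for speed. It is a sketch under assumptions, not RocksDB's implementation: `CountingCursor` and all of its members are hypothetical. It assumes the inaccuracy comes from jumping over entries on a seek instead of stepping through them one by one.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy cursor over sorted integer keys. When must_count_input_entries is true,
// Seek() steps entry by entry so every skipped entry is counted. Otherwise it
// jumps directly and marks the count as inaccurate.
class CountingCursor {
 public:
  CountingCursor(std::vector<int> sorted_keys, bool must_count_input_entries)
      : keys_(std::move(sorted_keys)),
        must_count_input_entries_(must_count_input_entries) {}

  bool Valid() const { return pos_ < keys_.size(); }
  int key() const { return keys_[pos_]; }

  void Next() {
    if (Valid()) {
      ++pos_;
      ++num_scanned_;
    }
  }

  // Moves to the first key >= target, at or after the current position.
  void Seek(int target) {
    if (must_count_input_entries_) {
      // Slower path: walk forward so each passed entry is counted.
      while (Valid() && key() < target) {
        Next();
      }
    } else {
      // Faster path: binary search, so the passed entries are never counted.
      auto first = keys_.begin() + static_cast<std::ptrdiff_t>(pos_);
      pos_ = static_cast<std::size_t>(
          std::lower_bound(first, keys_.end(), target) - keys_.begin());
      has_accurate_count_ = false;
    }
  }

  std::uint64_t NumScanned() const { return num_scanned_; }
  bool HasAccurateCount() const { return has_accurate_count_; }

 private:
  std::vector<int> keys_;
  bool must_count_input_entries_;
  std::size_t pos_ = 0;
  std::uint64_t num_scanned_ = 0;
  bool has_accurate_count_ = true;
};
```

In this model both modes land on the same key after a seek. Only the counting mode can vouch for `NumScanned()`, which is the role the accuracy flag plays for `num_input_records`.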

hx235 commented Sep 9, 2025

> Would you be able to add the context in the summary? Also a new test that would fail before the change and succeed after the change would be greatly helpful.

Hi - will add that when it's ready for review. Thanks!

Added more context in the summary. It should not be a behavior change, since we already happen to do the right thing and avoid the inaccuracy.

@hx235 hx235 marked this pull request as draft September 9, 2025 19:38
@hx235 hx235 force-pushed the rename_has_num_input_records branch from 5867ca1 to dab1920 Compare September 9, 2025 21:36
@hx235 hx235 changed the title Rename CompactionJobStats::has_num_input_records for clarity and set true by default Rename and clarify CompactionJobStats::has_num_input_records for clarity and set true by default Sep 9, 2025
@hx235 hx235 force-pushed the rename_has_num_input_records branch from dab1920 to 57dcf58 Compare September 9, 2025 22:00
@facebook-github-bot
@hx235 has imported this pull request. If you are a Meta employee, you can view this in D82014912.

@hx235 hx235 requested review from cbi42 and jaykorean September 9, 2025 23:09
@hx235 hx235 marked this pull request as ready for review September 9, 2025 23:10
// True if `num_input_records` is accurate across all subcompactions.
// See CompactionIterator::must_count_input_entries for some implementation
// details why `num_input_records` may not be accurate.
bool has_accurate_num_input_records = true;

Will this ever be false now?

@jaykorean jaykorean Sep 11, 2025

Yes, I know that it's AND. My rephrased question would be "would stats.has_accurate_num_input_records ever be false"?

But now that I realize must_count_input_entries can be false in some cases, stats.has_accurate_num_input_records can be false. My original question was based on the wrong assumption that must_count_input_entries is always true for Compactions and false for Flushes only.

@jaykorean jaykorean left a comment

Thanks for the clarification!

@facebook-github-bot
@hx235 merged this pull request in 799f83a.
