[DocDB] PITR - Single file compaction does not reset Hybrid Time filter set during restore #17261

sanketkedia · 2023-05-10T15:47:15Z

Jira Link: DB-6496

Description

If a compaction has only one input file, the output file it generates still carries the hybrid time filter; the filter is not reset. The smallest and largest frontiers are cloned as-is from the first input file and are then updated while iterating over the subsequent input files. The filter is reset only during that update step, so with a single input file no update ever runs and the filter survives into the output file.

for (size_t level_idx = 0; level_idx < compact_->compaction->num_input_levels(); level_idx++) {
  for (FileMetaData *fmd : *compact_->compaction->inputs(level_idx)) {
    out.meta.UpdateBoundariesExceptKey(fmd->smallest, UpdateBoundariesType::kSmallest);
    out.meta.UpdateBoundariesExceptKey(fmd->largest, UpdateBoundariesType::kLargest);
  }
}

void UserFrontier::Update(const UserFrontier* rhs, UpdateUserValueType type, UserFrontierPtr* out) {
  if (!rhs) {
    return;
  }
  if (*out) {
    (**out).Update(*rhs, type);
  } else {
    *out = rhs->Clone();
  }
}
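
The interaction between the two snippets above can be reproduced in isolation. The following is a minimal sketch, not the actual DocDB code: `ToyFrontier`, `UpdateFrontier`, `MergeInputs`, and `MergeInputsResettingFilter` are hypothetical names, the frontier is reduced to a single hybrid time plus an optional filter, and the assumption that `Update` clears the filter is taken from the description above. The last function shows one possible remedy (clearing the filter unconditionally after the merge loop), which may differ from what the real patch does.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a frontier stored in file metadata.
struct ToyFrontier {
  uint64_t hybrid_time = 0;
  // Assumed to be set on files during a PITR restore.
  std::optional<uint64_t> hybrid_time_filter;

  std::unique_ptr<ToyFrontier> Clone() const {
    // A clone copies everything, including the filter.
    return std::make_unique<ToyFrontier>(*this);
  }

  void Update(const ToyFrontier& rhs) {
    hybrid_time = std::max(hybrid_time, rhs.hybrid_time);
    // Per the issue description, merging two frontiers resets the filter.
    hybrid_time_filter.reset();
  }
};

// Mirrors the shape of UserFrontier::Update: clone on first use, merge afterwards.
void UpdateFrontier(const ToyFrontier* rhs, std::unique_ptr<ToyFrontier>* out) {
  if (!rhs) {
    return;
  }
  if (*out) {
    (**out).Update(*rhs);
  } else {
    *out = rhs->Clone();
  }
}

// Mirrors the compaction loop: fold every input frontier into the output.
std::unique_ptr<ToyFrontier> MergeInputs(const std::vector<ToyFrontier>& inputs) {
  std::unique_ptr<ToyFrontier> out;
  for (const ToyFrontier& frontier : inputs) {
    UpdateFrontier(&frontier, &out);
  }
  return out;
}

// One possible remedy (an assumption, not the actual patch):
// clear the filter after the loop so the single-input case is covered too.
std::unique_ptr<ToyFrontier> MergeInputsResettingFilter(
    const std::vector<ToyFrontier>& inputs) {
  std::unique_ptr<ToyFrontier> out = MergeInputs(inputs);
  if (out) {
    out->hybrid_time_filter.reset();
  }
  return out;
}
```

With a single input whose filter is set, `MergeInputs` returns a frontier that still has the filter, because only the `Clone()` branch runs. With two or more inputs the `Update` branch runs at least once and the filter is cleared.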

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

…e filter
Summary: If there's only one input file for compaction then the output file generated by compaction still contains the hybrid time filter. It is not reset. This is because the largest and smallest frontiers are cloned as is from the first input file and then iterating over the subsequent input files, we update them. During update, we reset the filter. So if there's only one input file the filter is not reset. This is a perf fix rather than correctness. If we have this filter then we go through the hassle of creating a filtering iterator wrapper over the regular rocksdb iterator and comparing every key against it which is not necessary given that there are no records in the file that will match.
Jira: DB-6496
Test Plan:
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/0
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/1
Reviewers: bogdan, sergei
Reviewed By: sergei
Subscribers: zdrudi, mhaddad, slingam, ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D25216

…eset Hybrid Time filter
Summary: Original commit: 639133f / D25216
If there's only one input file for compaction then the output file generated by compaction still contains the hybrid time filter. It is not reset. This is because the largest and smallest frontiers are cloned as is from the first input file and then iterating over the subsequent input files, we update them. During update, we reset the filter. So if there's only one input file the filter is not reset. This is a perf fix rather than correctness. If we have this filter then we go through the hassle of creating a filtering iterator wrapper over the regular rocksdb iterator and comparing every key against it which is not necessary given that there are no records in the file that will match.
Jira: DB-6496
Test Plan:
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/0
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/1
Reviewers: sergei, bogdan
Reviewed By: bogdan
Subscribers: ybase, slingam, mhaddad, zdrudi
Differential Revision: https://phorge.dev.yugabyte.com/D26093

…eset Hybrid Time filter
Summary: Original commit: 639133f / D25216
If there's only one input file for compaction then the output file generated by compaction still contains the hybrid time filter. It is not reset. This is because the largest and smallest frontiers are cloned as is from the first input file and then iterating over the subsequent input files, we update them. During update, we reset the filter. So if there's only one input file the filter is not reset. This is a perf fix rather than correctness. If we have this filter then we go through the hassle of creating a filtering iterator wrapper over the regular rocksdb iterator and comparing every key against it which is not necessary given that there are no records in the file that will match.
Jira: DB-6496
Test Plan:
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/0
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/1
Reviewers: sergei, bogdan
Reviewed By: bogdan
Subscribers: zdrudi, mhaddad, slingam, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D26092

…eset Hybrid Time filter
Summary: Original commit: 639133f / D25216
If there's only one input file for compaction then the output file generated by compaction still contains the hybrid time filter. It is not reset. This is because the largest and smallest frontiers are cloned as is from the first input file and then iterating over the subsequent input files, we update them. During update, we reset the filter. So if there's only one input file the filter is not reset. This is a perf fix rather than correctness. If we have this filter then we go through the hassle of creating a filtering iterator wrapper over the regular rocksdb iterator and comparing every key against it which is not necessary given that there are no records in the file that will match.
Jira: DB-6496
Test Plan:
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/0
ybd release --cxx_test docdb-test --gtest-filter DocDBTests/DocDBTestWrapper.SetHybridTimeFilterSingleFile/1
Reviewers: sergei, bogdan
Reviewed By: bogdan
Subscribers: ybase, slingam, mhaddad, zdrudi
Differential Revision: https://phorge.dev.yugabyte.com/D26091

sanketkedia added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels May 10, 2023

sanketkedia self-assigned this May 10, 2023

sanketkedia added this to To do in PITR via automation May 10, 2023

yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels May 10, 2023

yugabyte-ci assigned bmatican and sanketkedia and unassigned sanketkedia and bmatican May 16, 2023

yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label May 16, 2023

sanketkedia added 2.14 Backport Required 2.16 Backport Required 2.18 Backport Required labels Jun 8, 2023

sanketkedia closed this as completed Jun 13, 2023

PITR automation moved this from To do to Done Jun 13, 2023
