Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix FIFO causing overlapping seqnos in L0 files due to overlapped seqnos between ingested files and memtable's #10777

Closed

Conversation

hx235
Copy link
Contributor

@hx235 hx235 commented Oct 5, 2022

Context:
Same as #5958 (comment) but apply the fix to FIFO Compaction case
Repro:

COERCE_CONTEXT_SWICH=1 make -j56 db_stress

./db_stress --acquire_snapshot_one_in=0 --adaptive_readahead=0 --allow_data_in_errors=True --async_io=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=18 --bottommost_compression_type=disable --bytes_per_sync=262144 --cache_index_and_filter_blocks=0 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=0 --charge_file_metadata=1 --charge_filter_construction=1 --charge_table_reader=1 --checkpoint_one_in=0 --checksum_type=kCRC32c --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=0 --compact_range_one_in=1000 --compaction_pri=3 --open_files=-1 --compaction_style=2 --fifo_allow_compaction=1 --compaction_ttl=0 --compression_max_dict_buffer_bytes=8388607 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zlib --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test0/rocksdb_crashtest_whitebox --db_write_buffer_size=8388608 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --fail_if_options_file_error=1 --file_checksum_impl=none --flush_one_in=1000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=0 --get_property_one_in=0 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=15 --index_type=3 --ingest_external_file_one_in=100 --initial_auto_readahead_size=0 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --log2_keys_per_lock=10 --long_running_snapshots=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=16384 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=100000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=4194304 --memtable_prefix_bloom_size_ratio=0.5 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_file_reads_for_auto_readahead=0 --num_levels=1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=32 --open_write_fault_one_in=0 --ops_per_thread=200000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=1 --pause_background_one_in=0 --periodic_compaction_seconds=0 --prefix_size=8 --prefixpercent=5 --prepopulate_block_cache=0 --progress_reports=0 --read_fault_one_in=0 --readahead_size=16384 --readpercent=45 --recycle_log_file_num=1 --reopen=20 --ribbon_starting_level=999 --snapshot_hold_ops=1000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=2 --sync=0 --sync_fault_injection=0 --target_file_size_base=524288 --target_file_size_multiplier=2 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=1 --use_merge=0 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=0 --verify_db_one_in=1000 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=zstd --write_buffer_size=524288 --write_dbid_to_manifest=0 --writepercent=35
    

put or merge error: Corruption: force_consistency_checks(DEBUG): VersionBuilder: L0 file #479 with seqno 23711 29070 vs. file #482 with seqno 27138 29049

Summary:
FIFO only does intra-L0 compaction in the following four cases. For other cases, FIFO drops data instead of compacting on data, which is irrelevant to the overlapping seqno issue we are solving.

Some additional clean-up included in this PR:

  • Renamed earliest_memtable_seqno to earliest_mem_seqno for consistent naming
  • Added comment about earliest_memtable_seqno in related APIs
  • Made parameter earliest_memtable_seqno constant and required

Test:

  • make check
  • New unit test TEST_P(DBCompactionTestFIFOCheckConsistencyWithParam, FlushAfterIntraL0CompactionWithIngestedFile)corresponding to the above 4 cases, which will fail accordingly without the fix
  • Regular CI stress run on this PR + stress test with aggressive value [CI only]Aggressive crash test value #10761 and on FIFO compaction only

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 changed the title [Draft] Fix overlapping seqno between files in L0 FIFO Compaction due to file ingestion's seqno overlapping with memtable's [Draft] Fix overlapping seqno between files in L0 due to incorrectly including ingested files for FIFO Compaction Oct 5, 2022
@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from bd0a275 to e3c745f Compare October 6, 2022 05:45
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from e3c745f to c0ed7f2 Compare October 6, 2022 07:01
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 changed the title [Draft] Fix overlapping seqno between files in L0 due to incorrectly including ingested files for FIFO Compaction [Draft] Fix overlapping seqnos in L0 files cased by overlapped seqnos between ingested files and memtable (FIFO Compaction case) Oct 6, 2022
@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from c0ed7f2 to c1121da Compare October 6, 2022 21:08
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from c1121da to df03338 Compare October 7, 2022 23:26
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from df03338 to cded3e7 Compare October 8, 2022 00:44
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please mention both bug fixes in HISTORY.md and remove unrelated info from the description

// SST files of seqnos overlap with memtables' seqnos. Such SST file can
// exist when it's ingested to L0.
if (prev_file != nullptr &&
f->fd.largest_seqno <= earliest_memtable_seqno) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it check prev_file's largest_seqno?

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from cded3e7 to e78552c Compare October 10, 2022 22:43
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from e78552c to b119fa0 Compare October 11, 2022 20:28
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235
Copy link
Contributor Author

hx235 commented Oct 11, 2022

Update: fix a corner case where we didn't pass in earliest_memtable_seqno in the path of FIFO's CompactRange

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from b119fa0 to 9c21e01 Compare October 11, 2022 20:29
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from 9c21e01 to 694697c Compare October 12, 2022 00:10
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 changed the title [Draft] Fix overlapping seqnos in L0 files cased by overlapped seqnos between ingested files and memtable (FIFO Compaction case) Fix overlapping seqnos in L0 files cased by overlapped seqnos between ingested files and memtable (FIFO Compaction case) Oct 12, 2022
@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from 694697c to d90f3c1 Compare October 12, 2022 01:43
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from 8fca84b to 94d2ae1 Compare October 21, 2022 21:19
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from 94d2ae1 to 8cb4715 Compare October 21, 2022 21:24
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from 8cb4715 to cfdb48e Compare October 21, 2022 21:40
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235
Copy link
Contributor Author

hx235 commented Oct 21, 2022

@ajkr - added UT. I noticed the merged PR e267909#diff-d8fb3d50749aa69b378de447e3d9cf2f48abe0281437f010b5d61365a7b813fd. I took a look and don't think it will impact our PR but will still stress-run over the weekend before landing, just in case.

@hx235 hx235 changed the title Fix overlapping seqnos in L0 files caused by overlapped seqnos between ingested files and memtable (FIFO Compaction case) Fix FIFO causing overlapping seqnos in L0 files due to overlapped seqnos between ingested files and memtable's Oct 21, 2022
@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from cfdb48e to ebbe9a0 Compare October 21, 2022 23:59
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 force-pushed the fifo_overlap_file_ingest_with_memtable branch from ebbe9a0 to 8c65521 Compare October 24, 2022 17:19
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

// And in this case, Flush() will return Status::Corruption() caught by
// `force_consistency_checks=1`
EXPECT_EQ(3, NumTableFilesAtLevel(0));
EXPECT_OK(Flush());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether this should fail or not without the fix is an open question (see my other complaints about the existence of the smallest_seqno validation). It might be interesting to check Get(Key(1)) returned wrong result before the fix (without the smallest_seqno validation) and the correct result after the fix.

Copy link
Contributor Author

@hx235 hx235 Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right - good idea. Will fix this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #10776

"PickCompactionToWarm",
"CompactRange", "CompactFile"));

TEST_P(DBCompactionTestFIFOCheckConsistencyWithParam,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test case!

facebook-github-bot pushed a commit that referenced this pull request Oct 31, 2022
Summary:
**Context/Summary:**

#10777 mistakenly added a history entry under 7.8 release but the PR is not included in 7.8. This mistake was due to rebase and merge didn't realize it was a conflict when "## Unreleased" was changed to "## 7.8.0 (10/22/2022)".

Pull Request resolved: #10898

Test Plan: Make check

Reviewed By: akankshamahajan15

Differential Revision: D40861001

Pulled By: hx235

fbshipit-source-id: b2310c95490f6ebb90834a210c965a74c9560b51
hx235 added a commit to hx235/rocksdb that referenced this pull request Nov 29, 2022
…pped seqnos between ingested files and memtable's (facebook#10777)"

This reverts commit fc74abb.
facebook-github-bot pushed a commit that referenced this pull request Jan 3, 2023
…rash test (#11063)

Summary:
**Context/Summary:**
#10777 was reverted (#10999) due to internal blocker and replaced with a better fix #10922. However, the revert also reverted the `Options::CompactionOptionsFIFO::allow_compaction` stress/crash coverage added by the PR.

It's an useful coverage cuz setting `Options::CompactionOptionsFIFO::allow_compaction=true` will [increase](https://github.com/facebook/rocksdb/blob/7.8.fb/db/version_set.cc#L3255) the compaction score of L0 files for FIFO and then trigger more FIFO compaction. This speed up discovery of bug related to FIFO compaction like #10955. To see the speedup, compare the failure occurrence in following commands with `Options::CompactionOptionsFIFO::allow_compaction=true/false`

```
--fifo_allow_compaction=1 --acquire_snapshot_one_in=10000 --adaptive_readahead=0 --allow_concurrent_memtable_write=0 --allow_data_in_errors=True --async_io=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=8.869062094789008 --bottommost_compression_type=none --bytes_per_sync=0 --cache_index_and_filter_blocks=1 --cache_size=8388608 --cache_type=lru_cache --charge_compression_dictionary_building_buffer=1 --charge_file_metadata=1 --charge_filter_construction=0 --charge_table_reader=1 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_pri=3 --compaction_style=2 --compaction_ttl=0 --compression_max_dict_buffer_bytes=8589934591 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_use_zstd_dict_trainer=1 --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --data_block_index_type=0 --db=/dev/shm/rocksdb_test/rocksdb_crashtest_whitebox --db_write_buffer_size=1048576 --delpercent=4 --delrangepercent=1 --destroy_db_initially=1 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=1 --expected_values_dir=/dev/shm/rocksdb_test/rocksdb_crashtest_expected --fail_if_options_file_error=1 --file_checksum_impl=xxh64 --flush_one_in=1000000 --format_version=4 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=10 --index_type=2 --ingest_external_file_one_in=100 --initial_auto_readahead_size=16384 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=False --log2_keys_per_lock=10 --long_running_snapshots=0 --manual_wal_flush_one_in=0 --mark_for_compaction_one_file_in=10 --max_auto_readahead_size=524288 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=25000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_protection_bytes_per_key=1 --memtable_whole_key_filtering=1 --memtablerep=skip_list --min_write_buffer_number_to_merge=2 --mmap_read=0 --mock_direct_io=True --nooverwritepercent=1 --num_file_reads_for_auto_readahead=2 --open_files=-1 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=0 --open_write_fault_one_in=0 --ops_per_thread=40000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=3 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=7 --prefixpercent=5 --prepopulate_block_cache=0 --preserve_internal_time_seconds=3600 --progress_reports=0 --read_fault_one_in=1000 --readahead_size=0 --readpercent=15 --recycle_log_file_num=1 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=0  --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --stats_dump_period_sec=0 --subcompactions=2 --sync=0 --sync_fault_injection=0 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=1 --unpartitioned_pinning=1 --use_direct_io_for_flush_and_compaction=1 --use_direct_reads=1 --use_full_merge_v1=1 --use_merge=0 --use_multiget=0 --use_put_entity_one_in=0 --user_timestamp_size=0 --value_size_mult=32 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --verify_iterator_with_expected_state_one_in=5 --verify_sst_unique_id_in_manifest=1 --wal_bytes_per_sync=0 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=1 --writepercent=65
```

Therefore this PR is adding it back to stress/crash test.

Pull Request resolved: #11063

Test Plan: Rehearsal stress test to make sure stress/crash test is stable

Reviewed By: ajkr

Differential Revision: D42283650

Pulled By: hx235

fbshipit-source-id: 132e6396ab6e24d8dcb8fe51c62dd5211cdf53ef
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants