Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memtable_list: do not roll back memtable flush on CF drop #144

Closed
isaac-io opened this issue Aug 30, 2022 · 7 comments · Fixed by #145
Closed

memtable_list: do not roll back memtable flush on CF drop #144

isaac-io opened this issue Aug 30, 2022 · 7 comments · Fixed by #145
Assignees
Labels
enhancement New feature or request Upstreamable can be upstreamed to RocksDB
Milestone

Comments

@isaac-io
Copy link
Contributor

When installing memtable flush results, if the flush was unsuccessful (due to an error during the flush or the manifest write) or if the CF was dropped, the memtable is rolled back into a flushable state. However, this is unnecessary since it reactivates the pending flush flag on the relevant CF even though no flush can take place after CF drop, and as in #126 it also causes an issue for the upcoming improvements to the WriteBufferManager immutable memory tracking (#113) because it marks the memtable memory as ready for flush again for a short period before finally getting freed when the memtable is dropped during the destruction of the dropped CF (and this might upset the WBM behaviour because in the future we plan to rely on the amount of memory that isn't already being flushed for triggering flushes rather than on the amount of mutable memory alone as is done today).

This is a continuation of #126 (I simply missed this rollback), and since the rollback only marks the memtable as ready for flush again and clears related flush state, there's no harm in skipping it and just letting the memtable drop in that state.

@isaac-io isaac-io added enhancement New feature or request Upstreamable can be upstreamed to RocksDB labels Aug 30, 2022
@isaac-io isaac-io self-assigned this Aug 30, 2022
isaac-io added a commit that referenced this issue Aug 30, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
isaac-io added a commit that referenced this issue Aug 30, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
isaac-io added a commit that referenced this issue Aug 30, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
isaac-io added a commit that referenced this issue Aug 30, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
isaac-io added a commit that referenced this issue Aug 31, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
@Yuval-Ariel
Copy link
Contributor

Yuval-Ariel commented Sep 5, 2022

to introduce cf drop, use flag --clear_column_family_one_in .
note this only works with:
test_batches_snapshots=false
expected_values_dir=None
backup_one_in=0
checkpoint_one_in=0

@assaf-speedb
Copy link
Contributor

The white box is failing.

Verification failed for column family 0 key 0000000000038053000000000000004F000000000000003B0000000000000009000000000000004F000000000000003E787878787878 (1638432): Unexpected value found
No writes or ops?
Verification failed :(

my command:
time make db_stress; time CRASH_TEST_EXT_ARGS="--clear_column_family_one_in=5 --test_batches_snapshots=0 --expected_values_dir="" --backup_one_in=0 --checkpoint_one_in=0" make whitebox_crash_test; echo $?

@isaac-io
Copy link
Contributor Author

Thanks! Can you also post the db_stress command that the crash test executed when this failed?

@assaf-speedb
Copy link
Contributor

Sure. BTW, the black box failed as well - for the same reason:

Whitebox
./db_stress --acquire_snapshot_one_in=10000 --adaptive_readahead=1 --allow_concurrent_memtable_write=1 --async_io=1 --avoid_flush_during_recovery=1 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --block_size=4096 --bloom_bits=10 --bottommost_compression_type=lz4hc --cache_index_and_filter_blocks=0 --cache_size=8388608 --checkpoint_one_in=0 --checksum_type=kCRC32c --clear_column_family_one_in=4 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_ttl=0 --compare_full_db_state_snapshot=0 --compression_max_dict_buffer_bytes=67108863 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --customopspercent=0 --data_block_hash_table_util_ratio=0.63 --data_block_index_type=1 --db=/dev/shm/rocksdb.j1e0/rocksdb_crashtest_whitebox --db_write_buffer_size=0 --delpercent=1 --delrangepercent=5 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_compaction_filter=0 --enable_pipelined_write=0 --expected_values_dir= --experimental_mempurge_threshold=6.023655548032458 --fail_if_options_file_error=0 --file_checksum_impl=big --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=100000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=13 --index_type=2 --iterpercent=19 --key_len_percent_dist=14,8,6,1,8,27,11,11,8,6 --kill_exclude_prefixes=WritableFileWriter::Append,WritableFileWriter::WriteBuffered --kill_random_test=88889 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=1 --mark_for_compaction_one_file_in=0 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=10485760 --max_key_len=10 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=0 --memtable_prefix_bloom_size_ratio=0.01 --memtable_whole_key_filtering=1 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=50 --num_iterations=91 --open_files=500000 --open_metadata_write_fault_one_in=8 --open_read_fault_one_in=32 --open_write_fault_one_in=16 --ops_per_thread=20000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=1 --progress_reports=0 --read_fault_one_in=1000 --readpercent=23 --recycle_log_file_num=1 --reopen=20 --reserve_table_reader_memory=1 --ribbon_starting_level=3 --secondary_cache_fault_one_in=32 --seed=3062778522 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=1048576 --subcompactions=1 --sync=False --sync_fault_injection=False --sync_wal_one_in=100000 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=2 --unpartitioned_pinning=2 --use_block_based_filter=0 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=True --use_merge=0 --use_multiget=1 --user_timestamp_size=0 --value_size_mult=32 --verify_before_write=True --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=0 --writepercent=52

Blackbox
./db_stress --acquire_snapshot_one_in=0 --adaptive_readahead=0 --allow_concurrent_memtable_write=1 --allow_setting_blob_options_dynamically=1 --async_io=0 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=0 --backup_max_size=104857600 --backup_one_in=0 --batch_protection_bytes_per_key=0 --blob_compaction_readahead_size=4194304 --blob_compression_type=lz4 --blob_file_size=268435456 --blob_garbage_collection_age_cutoff=0.5 --blob_garbage_collection_force_threshold=0.75 --block_size=16384 --bloom_bits=13 --bottommost_compression_type=none --cache_index_and_filter_blocks=1 --cache_size=8388608 --checkpoint_one_in=0 --checksum_type=kxxHash64 --clear_column_family_one_in=5 --column_families=1 --compact_files_one_in=1000000 --compact_range_one_in=0 --compaction_ttl=1 --compare_full_db_state_snapshot=0 --compression_max_dict_buffer_bytes=7 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=xpress --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --customopspercent=0 --data_block_hash_table_util_ratio=0.73 --data_block_index_type=1 --db=/dev/shm/rocksdb.qGR3/rocksdb_crashtest_blackbox --db_write_buffer_size=134217728 --delpercent=9 --delrangepercent=4 --destroy_db_initially=0 --detect_filter_construct_corruption=0 --disable_wal=0 --enable_blob_files=1 --enable_blob_garbage_collection=1 --enable_compaction_filter=1 --enable_pipelined_write=0 --expected_values_dir= --experimental_mempurge_threshold=0.3731507821551916 --fail_if_options_file_error=1 --file_checksum_impl=big --flush_one_in=1000000 --format_version=2 --get_current_wal_file_one_in=0 --get_live_files_one_in=100000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=8 --index_type=0 --iterpercent=0 --key_len_percent_dist=2,16,10,3,40,29 --level_compaction_dynamic_level_bytes=False --long_running_snapshots=1 --mark_for_compaction_one_file_in=10 --max_background_compactions=1 --max_bytes_for_level_base=67108864 --max_key=1048576 --max_key_len=6 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=16 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=1048576 --memtable_prefix_bloom_size_ratio=0 --memtable_whole_key_filtering=0 --memtablerep=skip_list --min_blob_size=8 --mmap_read=1 --mock_direct_io=False --nooverwritepercent=0 --num_iterations=77 --open_files=-1 --open_metadata_write_fault_one_in=8 --open_read_fault_one_in=32 --open_write_fault_one_in=0 --ops_per_thread=100000000 --optimize_filters_for_memory=0 --paranoid_file_checks=1 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefix_size=-1 --prefixpercent=0 --prepopulate_block_cache=1 --progress_reports=0 --read_fault_one_in=1000 --readpercent=39 --recycle_log_file_num=1 --reopen=0 --reserve_table_reader_memory=0 --ribbon_starting_level=6 --secondary_cache_fault_one_in=32 --seed=3069121425 --set_options_one_in=0 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=0 --sst_file_manager_bytes_per_truncate=0 --subcompactions=3 --sync=False --sync_fault_injection=False --sync_wal_one_in=100000 --target_file_size_base=16777216 --target_file_size_multiplier=1 --test_batches_snapshots=0 --top_level_index_pinning=3 --unpartitioned_pinning=3 --use_block_based_filter=0 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=False --use_merge=1 --use_multiget=0 --user_timestamp_size=0 --value_size_mult=32 --verify_before_write=True --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --wal_compression=none --write_buffer_size=33554432 --write_dbid_to_manifest=0 --writepercent=48

@isaac-io
Copy link
Contributor Author

Pretty sure that it's unrelated to the change in this issue because you have --clear_column_family_one_in=4 with --column_families=1, and in that case no drop is actually performed, so the code changes of this issue aren't reached. I'll try to reproduce locally on this branch and on main.

@isaac-io isaac-io removed this from the v2.1.0 milestone Oct 2, 2022
@isaac-io isaac-io added this to the v2.2.0 milestone Oct 26, 2022
@Guyme
Copy link

Guyme commented Nov 3, 2022

@Yuval-Ariel - please run full cycle as it seams to be unrelated to this change

isaac-io added a commit that referenced this issue Nov 4, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
@Yuval-Ariel Yuval-Ariel removed their assignment Nov 6, 2022
@Yuval-Ariel
Copy link
Contributor

QA passed on c6567a6

isaac-io added a commit that referenced this issue Nov 7, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
Yuval-Ariel pushed a commit that referenced this issue Nov 15, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
Yuval-Ariel pushed a commit that referenced this issue Nov 25, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
Yuval-Ariel pushed a commit that referenced this issue Nov 25, 2022
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
Yuval-Ariel pushed a commit that referenced this issue Apr 30, 2023
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
Yuval-Ariel pushed a commit that referenced this issue May 4, 2023
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
udi-speedb pushed a commit that referenced this issue Nov 13, 2023
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
udi-speedb pushed a commit that referenced this issue Nov 15, 2023
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
udi-speedb pushed a commit that referenced this issue Dec 3, 2023
This is a continuation of #126 with changes that were missed there in case
a CF drop was encountered after the flush was completed but before or during
manifest write.

Rolling back on CF drop is meaningless because a flush is not possible
in that state, and the only path forwards is for the memtables to be freed
shortly afterwards, so making them available for flush again is useless
at best.

Additionally, this might be needed for the WriteBufferManager changes in
in the future for triggering flushes based on immutable memory as well,
and rolling back flushes causes the memtable memory to become ready for
flush again for a brief period of time until it is dropped, which might
wrongly affect the WBM decisions.

Finally, the code installs a new version even if no changes are made. This
is unnecessary and as such that part is moved into the version-mutating
code path only.

This PR also adds two regression tests, so we could rely on rollback not
being performed on CF drop.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Upstreamable can be upstreamed to RocksDB
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants