remote partition_manifest validation for topic recovery #16915

Merged
merged 13 commits into from
Apr 2, 2024

Conversation

andijcr
Contributor

@andijcr andijcr commented Mar 6, 2024

Introduces a new function that validates the metadata of a remote topic; it is used during topic recovery.

For each partition, it checks that the manifest exists in the cloud and, optionally, that the file can be decoded and is self-consistent for up to the N most recent segment_meta entries.

For each partition, the S3 cost is 1 HeadObject request for the partition_manifest, OR 1 Get + 1 HeadObject request for N segment_meta.

In general, we could limit the depth of the check by:

  • num segments
  • offset
  • total time

num segments correlates directly with the speed of the operation while being deterministic.
offset could map better to what the user wants, but it is more challenging to implement with the current reverse-iteration implementation of segment_meta_cstore.
total time is a strong guarantee, but the final result depends on the load of the system.

This PR implements a max_num_segment limit, as it's easier to reason about. A follow-up can implement the other two modes if there is a request for this.

The check runs in parallel across partitions, with a cap on concurrency.

The result can be

passed   // checks are on
missing_manifest // allowed for topics that did not have time to upload a manifest

failure    // decoding failure or metadata inconsistencies; should stop 
download_issue // cloud storage configuration error or service error; should stop
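
The four outcomes above could be modeled roughly as follows. This is only a sketch: the names `validation_result`, `blocks_recovery`, and `can_recover` are assumptions made for illustration and are not taken from the PR's code.

```cpp
#include <algorithm>
#include <map>

// Hypothetical names; the PR's actual types may differ.
enum class validation_result {
    passed,           // checks ran and succeeded
    missing_manifest, // allowed: the topic may not have uploaded a manifest yet
    failure,          // decoding failure or metadata inconsistency
    download_issue,   // cloud storage configuration or service error
};

// Only failure and download_issue should stop the recovery.
constexpr bool blocks_recovery(validation_result r) {
    return r == validation_result::failure
           || r == validation_result::download_issue;
}

// Recovery proceeds only if no partition reported a blocking result.
// The int key is a stand-in for a partition id.
inline bool
can_recover(const std::map<int, validation_result>& per_partition) {
    return std::none_of(
      per_partition.begin(), per_partition.end(), [](const auto& kv) {
          return blocks_recovery(kv.second);
      });
}
```

Treating `missing_manifest` as non-blocking mirrors the note above that a topic may not have had time to upload a manifest.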

The validation mode is driven by a new topic property (nullopt by default) and by a cluster-level default plus a force flag.

A new topic_property allows users to define validation manually by editing the topic_manifest.json file.
Otherwise, cluster-level defaults are used.

    enum_property<model::recovery_validation_mode>
      cloud_storage_recovery_topic_validation_mode;
    property<uint16_t> cloud_storage_recovery_topic_validation_depth;
    property<bool> cloud_storage_recovery_topic_force_ovveride_cfg;

The last one is an escape hatch that overrides the topic's recovery_checks with the cluster defaults.

Fixes https://github.com/redpanda-data/core-internal/issues/1138
Fixes https://github.com/redpanda-data/core-internal/issues/1139

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

Improvements

  • before creating a recovery topic, perform metadata validation on the cloud data to ensure that each partition can be recovered successfully

@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from 6ea2d18 to de76f5f Compare March 6, 2024 14:30
@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch 2 times, most recently from 2b779c2 to 338a119 Compare March 20, 2024 10:38
@vbotbuildovich
Collaborator

vbotbuildovich commented Mar 20, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5ba2-f06b-4254-944c-8f621456b8ca:

"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_partition.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.cluster_config_test.ClusterConfigTest.test_rpk_export_import"
"rptest.tests.control_character_flag_test.ControlCharacterPermittedAfterUpgrade.test_upgrade_from_pre_v23_2.initial_version=.23.1.1"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_vcluster_id.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_empty_segments.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5ba2-f06e-4180-b3a8-65034779a9d0:

"rptest.tests.consumer_group_recovery_test.ConsumerOffsetsRecoveryTest.test_consumer_offsets_partition_recovery"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_partition.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_vcluster_id.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_empty_segments.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5ba2-f067-4f2b-88c7-069a925426c0:

"rptest.tests.cluster_recovery_test.ClusterRecoveryTest.test_bootstrap_with_recovery"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast1.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast3.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_admin_api_recovery.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_segment.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_no_data.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5ba2-f071-43b5-b667-bd2aa49f158f:

"rptest.tests.retention_policy_test.ShadowIndexingCloudRetentionTest.test_topic_recovery_retention_settings"
"rptest.tests.cluster_recovery_test.ClusterRecoveryTest.test_basic_controller_snapshot_restore"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast3.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast1.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_segment.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_admin_api_recovery.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_no_data.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5bb4-6654-4dfa-8349-e920128e17e2:

"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_aborted_tx.recovery_overrides=.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.cluster_recovery_test.ClusterRecoveryTest.test_basic_controller_snapshot_restore"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore.message_size=5000.num_messages=100000.recovery_overrides=.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_recover"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_segment.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast1.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast3.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_admin_api_recovery.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.cluster_config_test.ClusterConfigTest.test_restart"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_no_data.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5bb4-665b-4689-b62f-152e1645b0c1:

"rptest.tests.retention_policy_test.ShadowIndexingCloudRetentionTest.test_topic_recovery_retention_settings"
"rptest.tests.consumer_group_recovery_test.ConsumerOffsetsRecoveryTest.test_consumer_offsets_partition_recovery"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_aborted_tx.recovery_overrides=.retention.local.target.bytes.1024.redpanda.remote.write.True.redpanda.remote.read.True.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore.message_size=5000.num_messages=100000.recovery_overrides=.retention.local.target.bytes.1024.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_partition.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_vcluster_id.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_empty_segments.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5bb4-6658-4926-a2da-c93177cd2c1b:

"rptest.tests.e2e_shadow_indexing_test.ShadowIndexingManyPartitionsTest.test_many_partitions_recovery"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_aborted_tx.recovery_overrides=.retention.local.target.bytes.1024.redpanda.remote.write.True.redpanda.remote.read.True.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore.message_size=5000.num_messages=100000.recovery_overrides=.retention.local.target.bytes.1024.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_partition.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.cluster_config_test.ClusterConfigTest.test_rpk_export_import"
"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_vcluster_id.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_empty_segments.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46492#018e5bb4-6656-4377-8779-0cf4949f088a:

"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=False.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.cluster_recovery_test.ClusterRecoveryTest.test_bootstrap_with_recovery"
"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_recover_after_delete_records"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore.message_size=5000.num_messages=100000.recovery_overrides=.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_aborted_tx.recovery_overrides=.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast1.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_missing_segment.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_fast3.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_admin_api_recovery.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_no_data.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e0a-4a9c-47db-8868-807498b8dbfe:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_existence"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e0a-4a94-40ed-b0a7-6143a871d571:

"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"
"rptest.tests.control_character_flag_test.ControlCharacterPermittedAfterUpgrade.test_upgrade_from_pre_v23_2.initial_version=.22.3.11"
"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_own_metadata_only"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=no_check"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e0a-4aa0-43dd-a7bc-96bd7db712e5:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_existence"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_own_metadata_only"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e0a-4a99-4e35-a423-269656f9ef63:

"rptest.tests.cluster_config_test.ClusterConfigTest.test_rpk_export_import"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=no_check"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e1c-34e7-4b8f-a78e-a84eed8c4ce4:

"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_existence"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_own_metadata_only"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e1c-34ef-4c03-a951-f8ebde0cd29b:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_existence"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e1c-34ec-4da9-a494-3ddde49d513c:

"rptest.tests.cluster_config_test.ClusterConfigTest.test_rpk_export_import"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=no_check"

new failures in https://buildkite.com/redpanda/redpanda/builds/46523#018e5e1c-34ea-49ca-b480-2c2e5c118ba8:

"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=False.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.cluster_config_test.ClusterConfigTest.test_valid_settings"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_own_metadata_only"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=no_check"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61e6-c6cb-4fca-a45b-d3cbeeefa737:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61e6-c6c8-45c0-be1d-eb9cb0d6cf85:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61e6-c6c5-4606-a862-8228315cfe14:

"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61f9-fa6d-4d1e-8c80-810f4abe6242:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61f9-fa6a-4301-a79c-38f16f6b0550:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61f9-fa64-4d64-941f-b783c746bf60:

"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"

new failures in https://buildkite.com/redpanda/redpanda/builds/46568#018e61f9-fa67-48b5-ab62-9c2d18000713:

"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=False.num_to_upgrade=0.with_tiered_storage=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/46717#018e76e5-1db9-4459-884e-3cda38001308:

"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_reset_from_cloud.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"

new failures in https://buildkite.com/redpanda/redpanda/builds/46717#018e76e5-1db3-4876-91dc-3e97ffba9112:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46717#018e76e5-1dbe-4cb0-bad6-94ded757fb18:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46717#018e76f7-fcd1-441a-a00f-0362c30c0046:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/46717#018e76f7-fcd7-496f-a234-a2ab12290165:

"rptest.tests.topic_creation_test.CreateTopicsTest.test_no_log_bloat_when_recreating_existing_topics"

new failures in https://buildkite.com/redpanda/redpanda/builds/46717#018e76f7-fcd4-4ea2-a1a5-2ff4321e2916:

"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=False.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.random_node_operations_test.RandomNodeOperationsTest.test_node_operations.enable_failures=True.num_to_upgrade=0.with_tiered_storage=False"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/46933#018e821a-f5a5-45c6-8123-08a27295983c:

"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_reset_from_cloud.cloud_storage_type=CloudStorageType.S3"

@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch 4 times, most recently from 291d47f to 6396208 Compare March 21, 2024 15:45
@CLAassistant

CLAassistant commented Mar 21, 2024

CLA assistant check
All committers have signed the CLA.

@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from 6396208 to a6bf506 Compare March 21, 2024 15:51
@andijcr
Contributor Author

andijcr commented Mar 21, 2024

force push to fix conflicts and restart ci

Contributor

@andrwng andrwng left a comment

The high level structure seems reasonable. Left a bunch of nits, and some questions about how this is exposed via configs.

Also I think the topic recovery validator and new bits in the anomaly detector could use some unit testing.

Member

@dotnwat dotnwat left a comment

minor comments/questions.

how do you feel about test coverage and what testing do you think we need to do?

agree with pretty much all of andrew's comments as well.

@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from a6bf506 to fc65155 Compare March 25, 2024 17:42
@andijcr
Contributor Author

andijcr commented Mar 25, 2024

force push: rebased, fixed merge conflicts, addressed some of the comments.

TBD:

  • content and level of log messages in topic_recovery_validator
  • unit test for topic_recovery validator

waiting for ci to check for regressions

@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch 2 times, most recently from 64a4d10 to 0fdd16c Compare March 27, 2024 14:04
@andijcr
Contributor Author

andijcr commented Mar 27, 2024

fixed merge conflict, added unit test for anomalies detector, structured topic_recovery_validator with an internal class to factor common parts

TBD: extend TopicRecoveryTest to test more combinations of recovery checks

@andijcr andijcr requested review from andrwng and dotnwat March 27, 2024 22:25
max_num_segments limits the total number of segment_meta that will be
checked in a run. To check a segment_meta, the code checks the existence
of its segment in remote storage. This is done with a HEAD request.

the limit is counted from most recent to oldest, so that a depth of 1 will
only cause the newest segment to be checked.

an appropriately low number will limit the scrub to only the stm manifest
section.

we could limit by
- num segments
- offset
- total time

a limit by num segments correlates directly with the speed
of the operation while being deterministic.

offset could map better to what the user wants, but the current implementation
of reverse iteration in segment_meta_cstore makes it a bit involved to implement.

total time is a strong guarantee, but the final result depends on the load of the system.

This commit implements a limit by number of segments with the parameter
max_num_segments. The implementation of this mode is easier to reason about,
and other PRs can implement the other two modes.
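
The newest-to-oldest limit described in this commit message can be sketched as below. The sketch makes several assumptions: `segment_meta_stub` and `check_recent_segments` are hypothetical names, a plain `std::vector` stands in for segment_meta_cstore, and the `exists` callback stands in for the HEAD request against remote storage.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a segment_meta entry; only an id is kept here.
struct segment_meta_stub {
    int id;
};

// Walks the segments from newest to oldest and checks at most
// max_num_segments of them. `exists` stands in for the HEAD request
// against remote storage. Returns the number of missing segments found.
inline std::size_t check_recent_segments(
  const std::vector<segment_meta_stub>& oldest_to_newest,
  std::size_t max_num_segments,
  const std::function<bool(const segment_meta_stub&)>& exists) {
    std::size_t checked = 0;
    std::size_t missing = 0;
    for (auto it = oldest_to_newest.rbegin();
         it != oldest_to_newest.rend() && checked < max_num_segments;
         ++it, ++checked) {
        if (!exists(*it)) {
            ++missing;
        }
    }
    return missing;
}
```

With a depth of 1 the loop touches only the last element of the vector, which matches the statement that only the newest segment is checked.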
@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from dc06b08 to 4119e98 Compare March 28, 2024 12:58
@andijcr
Contributor Author

andijcr commented Mar 28, 2024

last failure was interesting: i had to manually cast the mode enum to its underlying unsigned type, otherwise it would not be deserializable as uint. Other enums are fine. i didn't investigate the reason (too many jumps), but i suspect that an ODR violation might be the problem and i'm hitting the wrong impl.
from the fix:

inline void rjson_serialize(
  json::Writer<json::StringBuffer>& w, const cluster::recovery_checks& rc) {
    w.StartObject();
    // TODO: investigate the reason. It seems that manually casting the
    // rc.mode enum to uint16_t is needed; otherwise we get an assertion at
    // decoding time, when we try to read an unsigned but the json value is
    // not tagged as unsigned.
    write_member(
      w,
      "mode",
      static_cast<std::underlying_type_t<decltype(rc.mode)>>(rc.mode));
    write_member(w, "max_segment_depth", rc.max_segment_depth);
    w.EndObject();
}
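
For readers unfamiliar with the cast used in the fix, here is a minimal sketch of what `std::underlying_type_t` yields for a scoped enum. It does not reproduce the JSON tagging problem or confirm the suspected ODR violation. The enum `mode_stub` and the helper `enum_to_underlying` are hypothetical, and the `uint16_t` underlying type is assumed from the comment in the fix.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical enum; the uint16_t underlying type is assumed from the
// comment in the fix, and the enumerator names are made up.
enum class mode_stub : std::uint16_t { a = 0, b = 1 };

// Converts an enum value to its underlying integer type, which is what
// the fix does inline with static_cast.
template<typename E>
constexpr std::underlying_type_t<E> enum_to_underlying(E e) {
    return static_cast<std::underlying_type_t<E>>(e);
}

// After the cast the value has a plain unsigned integer type, so overload
// resolution sees an unsigned integer rather than an enum.
static_assert(
  std::is_same_v<decltype(enum_to_underlying(mode_stub::b)), std::uint16_t>);
static_assert(std::is_unsigned_v<std::underlying_type_t<mode_stub>>);
```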

@andijcr
Contributor Author

andijcr commented Mar 28, 2024

#17198 issue is known

Contributor

@andrwng andrwng left a comment

In my last review I hadn't considered adding the option to the create request rather than the topic property, and the more I think about it the more natural it seems. WDYT?

Comment on lines 19 to 22
class partition_validator {
public:
// Each partition gets a separate retry_chain_node and a logger tied to it.
// From this a common retry_chain_logger is created
Contributor

Do you think it's worth unit testing this partition validator on its own? Defining it in the .cc file seems to discourage testing

Member

> Do you think it's worth unit testing this partition validator on its own? Defining it in the .cc file seems to discourage testing

good point. @andijcr we want unit testing to be the first line of defense for testing new code.

however, if we have some testing that we feel like will be exercising this, i'm ok if we do this as a follow up rather than continuing to iterate on this large PR. if we don't think any of this code is getting exercised, then we need to do that.

Contributor Author

tests/rptest/tests/topic_recovery_test.py::TopicRecoveryTest.[test_prevent_topic_recovery|test_many_partitions] are written to test this code. i'll get a unit test going, like the one for anomalies_detector. the dimensions are [validation mode, cardinality 4], [partition damage, cardinality ~6], [download results, cardinality ~3]

Comment on lines +492 to +494
auto validation_map = co_await maybe_validate_recovery_topic(
assignable_config, bucket, _cloud_storage_api.local(), _as.local());
if (std::ranges::any_of(
Contributor

Should we consider the timeout here too?

Contributor Author

you mean the timeout for the create_topics operations?
yes we should (https://github.com/redpanda-data/core-internal/issues/1221)

in the ducktape local environment, and from some testing on a cluster with s3, the current defaults seem to be ok.

setting the cluster default validation to no_check would work as an escape hatch in case this operation becomes the bottleneck.

maybe i can return an optional<map<partition_id, validation_res>> to remove the runtime cost in the no_check case
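
The idea of returning an optional map could look roughly like this. It is a sketch under assumptions: `validation_mode`, `validation_res`, `partition_id`, and `maybe_validate` are simplified stand-ins, and the real function is a coroutine that talks to cloud storage, which is not modeled here.

```cpp
#include <map>
#include <optional>

// Hypothetical, simplified stand-ins for the PR's types.
enum class validation_mode { no_check, check_manifest_existence };
enum class validation_res { passed, failure };
using partition_id = int;

// Returns nullopt when validation is disabled, so the caller can skip
// building and scanning a map entirely.
inline std::optional<std::map<partition_id, validation_res>>
maybe_validate(validation_mode mode, int partition_count) {
    if (mode == validation_mode::no_check) {
        return std::nullopt;
    }
    std::map<partition_id, validation_res> out;
    for (partition_id p = 0; p < partition_count; ++p) {
        // Placeholder for the real per-partition check.
        out.emplace(p, validation_res::passed);
    }
    return out;
}
```

A caller would then test `has_value()` first and only run the `any_of` scan over the results when a map was actually produced.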

partition_manifest_exists chains a check for a serde manifest and a json
manifest
describes how to perform validation for a partition.

one of:
- manifest existence only
- manifest file integrity and metadata check on contained objects
- no_check
@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from 4119e98 to 5cf31d6 Compare March 29, 2024 11:13
cluster level default values

cloud_storage_recovery_topic_validation_mode is the
recovery_validation_mode

cloud_storage_recovery_topic_validation_depth is the validation depth,
meaningful when validation_mode is model::recovery_validation_mode::check_manifest_own_metadata_only.

cloud_storage_recovery_topic_force_ovveride_cfg is the escape hatch to
use cluster level defaults instead of the topic properties
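
The interplay between the optional topic property, the cluster-level defaults, and the force flag can be sketched as follows. The field names `mode` and `max_segment_depth` follow the serialization snippet earlier in this thread, and the mode names follow the test parameters in the CI output. The function `effective_checks` and the exact struct layout are assumptions.

```cpp
#include <cstdint>
#include <optional>

// Mode names as they appear in the test parameters of this PR.
enum class recovery_validation_mode {
    check_manifest_existence,
    check_manifest_own_metadata_only,
    no_check,
};

// Simplified stand-in for cluster::recovery_checks.
struct recovery_checks {
    recovery_validation_mode mode;
    std::uint16_t max_segment_depth;
};

// Picks the checks to run: the topic-level value wins unless it is absent
// or the force flag tells us to use the cluster defaults.
inline recovery_checks effective_checks(
  const std::optional<recovery_checks>& topic_value,
  const recovery_checks& cluster_default,
  bool force_override) {
    if (force_override || !topic_value.has_value()) {
        return cluster_default;
    }
    return *topic_value;
}
```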
@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from 5cf31d6 to 36f0e99 Compare March 29, 2024 12:51
@andijcr
Copy link
Contributor Author

andijcr commented Mar 29, 2024

as discussed, removed recovery_checks from topic_properties and updated the rest of the code.

Now checks are performed based on cluster properties.

@andijcr andijcr requested a review from andrwng March 29, 2024 12:56
@@ -988,6 +988,40 @@ ss::future<download_result> remote::segment_exists(
existence_check_type::segment);
}

ss::future<remote::partition_manifest_existence>
remote::partition_manifest_exists(
Contributor


why do we need a dedicated method to check manifests?

Contributor Author


It's to perform an existence check (HEAD instead of GET).
Since partition_manifest could be in serde or json format, I chained the checks here.

I guess it would be difficult to find manifest.json in the wild, but I don't think we have a policy that prevents reading very old data
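The chaining described above can be sketched roughly as follows. This is not the code from the PR: `fake_store`, the result types, and the `/manifest.bin` suffix for the serde format are assumptions made for illustration; only `manifest.json` is named in the comment.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class head_result { found, not_found, error };

// Hypothetical in-memory object store that only answers HEAD requests.
struct fake_store {
    std::set<std::string> keys;
    bool unavailable = false;
    int head_requests = 0;

    head_result head(const std::string& key) {
        ++head_requests;
        if (unavailable) {
            return head_result::error;
        }
        return keys.count(key) ? head_result::found : head_result::not_found;
    }
};

enum class manifest_format { serde, json };

struct manifest_existence {
    head_result result;
    manifest_format format; // meaningful only when result == found
};

// Probe the newer serde manifest first; fall back to the legacy json
// manifest only when the serde one is definitely absent. An error on the
// first probe is reported as-is instead of being masked by the fallback.
inline manifest_existence
partition_manifest_exists(fake_store& store, const std::string& base_path) {
    auto serde = store.head(base_path + "/manifest.bin"); // assumed suffix
    if (serde != head_result::not_found) {
        return {serde, manifest_format::serde};
    }
    auto json = store.head(base_path + "/manifest.json");
    return {json, manifest_format::json};
}
```

In this sketch a partition with a serde manifest costs one HEAD request, while a legacy or missing manifest costs two.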

, rev_id_{rev_id}
, op_rtc_{retry_chain_node{
as,
300s,
Contributor


Setting a timeout in the root rtc is very often an error. It only works if the object is used immediately after creation.
You can create a temporary rtc based on this one right before invoking a method of _remote and pass the timeout to the c-tor.

Contributor Author


so something like
op_rtc_{as} {...} // have to check syntax

and in do_validate_manifest_existence() and do_validate_manifest_metadata()

auto rtc = retry_chain_node(&op_rtc_, 300s, config::shard_local_cfg().cloud_storage_initial_backoff_ms.value())?

would the op_logger_ still work like this, or do i need to move it after rtc?
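To illustrate the reviewer's point, here is a simplified model of why a timeout on the root node is fragile. It is not the real `retry_chain_node` API: `chain_node`, `sim_time`, and the constructor shapes are hypothetical, and time is simulated with plain second counts so the behavior is deterministic.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using namespace std::chrono_literals;

// Simulated "now": seconds elapsed since the owning object was created.
using sim_time = std::chrono::seconds;

// Simplified model of a retry-chain node; not the real retry_chain_node.
class chain_node {
public:
    // Root without a deadline of its own.
    chain_node() = default;

    // Node whose deadline starts counting at construction time.
    chain_node(sim_time now, std::chrono::seconds timeout)
      : _deadline(now + timeout) {}

    // Child node: gets its own timeout, capped by the parent's deadline.
    chain_node(
      const chain_node& parent, sim_time now, std::chrono::seconds timeout)
      : _deadline(now + timeout) {
        if (parent._deadline && *parent._deadline < *_deadline) {
            _deadline = parent._deadline;
        }
    }

    bool expired(sim_time now) const {
        return _deadline.has_value() && now >= *_deadline;
    }

private:
    std::optional<sim_time> _deadline;
};
```

In this model, a root built as `chain_node(0s, 300s)` has already expired if the first request is issued at 600s, whereas a child built as `chain_node(root, 600s, 300s)` from a deadline-free root stays valid until 900s.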

new function that will perform validation for all the partitions of a
topic, in parallel.
returns a map(partition_id, validation_result) that can be used to
decide whether to block topic recovery in case of a fatal error

the function will read topic_property::recovery_checks and the cluster
defaults to perform either manifest existence or metadata checks, via
the anomalies_detector class.

parallelism is limited, and the checks are performed with a long timeout
to be resilient against backoff requests.

for each partition the possible result can be

passed <- validation successful
no_manifest <- no manifest in cloud storage, allowed
failure <- some inconsistencies that need intervention
download_issue <- likely misconfigured cloud storage or service issue

passed/no_manifest can be accepted for recovery, while
failure/download_issue should raise an error

error logs will point out partitions that did not pass the check
if any partition fails validation, stop creation. a missing manifest is
not considered a failure, for the purpose of recovery
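The accept/reject rule from the two commit messages above can be sketched as a pair of small functions. The enumerator names follow the list in the message; the type aliases and function names are hypothetical and chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

enum class validation_result { passed, no_manifest, failure, download_issue };

using partition_id = int; // hypothetical alias, a plain int for the sketch
using validation_map = std::map<partition_id, validation_result>;

// failure and download_issue should stop recovery; a missing manifest is
// tolerated because the partition may never have uploaded one.
inline bool is_fatal(validation_result r) {
    return r == validation_result::failure
           || r == validation_result::download_issue;
}

// Partitions to mention in the error log, in ascending id order.
inline std::vector<partition_id>
failing_partitions(const validation_map& results) {
    std::vector<partition_id> failed;
    for (const auto& [pid, res] : results) {
        if (is_fatal(res)) {
            failed.push_back(pid);
        }
    }
    return failed;
}

// Topic creation proceeds only when no partition reported a fatal result.
inline bool can_proceed_with_recovery(const validation_map& results) {
    return failing_partitions(results).empty();
}
```

A topic whose partitions report only `passed` and `no_manifest` would be created; a single `failure` or `download_issue` would stop creation and name the offending partition.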
for consistency between versions, be explicit on the default behavior of
checks during recovery
simple test to ensure that a higher number of topics/partitions are not
detrimental to the system
and an optional topic to restore, and an optional dict of topic
properties to add during recovery
test that missing segments will stop recovery early
when the check mode is no_check, validation is skipped.

this commit reduces the runtime cost by not populating the
map<partition_id, validation_result>.

on the caller side, this is already interpreted as "validation ok"
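A rough sketch of the shortcut described in this commit message, assuming a simplified two-value mode and a pluggable per-partition validator (both hypothetical). It shows why an unpopulated map needs no special handling on the caller side: a check over an empty map is vacuously true.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <vector>

enum class validation_result { passed, no_manifest, failure, download_issue };
enum class validation_mode { check, no_check }; // simplified for the sketch

using partition_id = int; // hypothetical alias
using validation_map = std::map<partition_id, validation_result>;
using validator_fn = std::function<validation_result(partition_id)>;

// With no_check the validator is never invoked and the map stays empty.
inline validation_map validate_topic(
  validation_mode mode,
  const std::vector<partition_id>& partitions,
  const validator_fn& validate_one) {
    validation_map results;
    if (mode == validation_mode::no_check) {
        return results;
    }
    for (auto pid : partitions) {
        results.emplace(pid, validate_one(pid));
    }
    return results;
}

// all_of over an empty map returns true, so "no entries" reads as "ok".
inline bool validation_ok(const validation_map& results) {
    return std::all_of(results.begin(), results.end(), [](const auto& kv) {
        return kv.second == validation_result::passed
               || kv.second == validation_result::no_manifest;
    });
}
```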
@andijcr andijcr force-pushed the feat/topic_recovery_prevalidation branch from 36f0e99 to 0ce9a2a Compare March 29, 2024 21:50
@andijcr
Copy link
Contributor Author

andijcr commented Mar 29, 2024

updated per the comments; the major change is to match partition_probe when determining which segment_meta anomalies to consider fatal

@andijcr andijcr requested a review from andrwng March 29, 2024 21:57
@andijcr andijcr merged commit f8982ed into redpanda-data:dev Apr 2, 2024
18 checks passed
6 participants