Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tx/tm_stm: fix unboundedness of _pid_tx_id #18198

Merged
merged 5 commits into from
May 2, 2024

Conversation

bharathv
Copy link
Contributor

@bharathv bharathv commented May 1, 2024

If there is a new producer_id, we do not need older incarnations of the
same id as the corresponding transactions are sealed before
re-registration.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@bharathv bharathv marked this pull request as ready for review May 1, 2024 21:24
@bharathv bharathv requested a review from a team as a code owner May 1, 2024 21:24
@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented May 1, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/48581#018f3649-8a4a-4640-89b0-210831c63dff:

"rptest.tests.read_replica_e2e_test.ReadReplicasUpgradeTest.test_upgrades.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48581#018f3649-8a52-47cc-a991-97884e9ce7c9:

"rptest.tests.upgrade_test.UpgradeFromPriorFeatureVersionCloudStorageTest.test_rolling_upgrade.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48581#018f3651-fb61-49b7-96ed-0e06eb403027:

"rptest.tests.read_replica_e2e_test.ReadReplicasUpgradeTest.test_upgrades.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_shadow_indexing.num_to_upgrade=2.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48581#018f3651-fb6a-4f79-bbb9-598969acd494:

"rptest.tests.upgrade_test.UpgradeFromPriorFeatureVersionCloudStorageTest.test_rolling_upgrade.cloud_storage_type=CloudStorageType.S3"

@@ -1297,6 +1316,7 @@ ss::future<cluster::init_tm_tx_reply> tx_gateway_frontend::do_init_tm_tx(

tx = r.value();
init_tm_tx_reply reply;
model::producer_identity rolled_pid = tx.pid;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use last_pid to cleanup overwritten producer state here ? It seems that it has exactly the same purpose as the rolled_pid

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I couldn't quite grok the difference between pid, last_pid and rolled_pid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As chatted offline, last_pid seems somewhat sketchy and not updated in all cases, added a clarifying comment.

@mmaslankaprv
Copy link
Member

/ci-repeat

@@ -307,6 +307,7 @@ class tm_stm final : public raft::persisted_stm<> {
kafka::transactional_id,
std::chrono::milliseconds,
model::producer_identity,
model::producer_identity,
model::producer_identity);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this would benefit from parameter names, 3 parameters with the same type but without names is a bit too much :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1297,6 +1316,7 @@ ss::future<cluster::init_tm_tx_reply> tx_gateway_frontend::do_init_tm_tx(

tx = r.value();
init_tm_tx_reply reply;
model::producer_identity rolled_pid = tx.pid;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I couldn't quite grok the difference between pid, last_pid and rolled_pid

If there is a new producer_id, we do not need older incarnations of the
same id as the corresponding transactions are sealed before
re-registration.
@bharathv bharathv merged commit 5ee7ff3 into redpanda-data:dev May 2, 2024
18 checks passed
@bharathv bharathv deleted the tx_fix_unbounded_map branch May 2, 2024 19:38
@vbotbuildovich
Copy link
Collaborator

/backport v24.1.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.2.x

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-18198-v23.2.x-742 remotes/upstream/v23.2.x
git cherry-pick -x d0c55096c8c16d921022e2c4c771577b13fab280 1186a6ed6168ebd63c3881c2020567e5ec7af001 be961f8011eff9f87a2026783fe583defeafd86f 834025cef5eecc93b37c880cc2922c2b79288cd9 4c1e6620eba70af4a69019f89691a30fe593ed0f

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants