changefeedccl: support alternative kafka hash algo#161265

Merged
craig[bot] merged 1 commit into cockroachdb:master from rharding6373:20260114-kafka-hashing-option on Jan 21, 2026

Collaborator

@rharding6373 rharding6373 commented Jan 16, 2026

Previously, the franz-go based kafka sink only supported the fnv-1a hashing algorithm, because the hashing function was compatible with the original sarama-based sink. However, some users need alternative hashing functions that are compatible with their existing ecosystem.

This PR adds the partition_alg changefeed option, with the options fnv-1a (default) and murmur2 (kafka's default). This is protected by a cluster setting.

Epic: CRDB-58732
Fixes: #161202

Release note (general change): Changefeeds now support the partition_alg
option for specifying a kafka partitioning algorithm. Currently fnv-1a
(default) and murmur2 are supported. The option is only valid on kafka
v2 sinks. This is protected by the cluster setting
changefeed.partition_alg.enabled. An example usage:
SET CLUSTER SETTING changefeed.partition_alg.enabled=true;
CREATE CHANGEFEED ... INTO 'kafka://...' WITH partition_alg='murmur2';
Note that if a changefeed is created using the murmur2 algorithm, and
then the cluster setting is disabled, the changefeed will continue using
the murmur2 algorithm unless the changefeed is altered to use a
different partition_alg.
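
To make the difference between the two options concrete, here is a minimal, standard-library-only Go sketch of how each algorithm could map a key to a partition. It is not the code from this PR. `fnv1aPartition` follows the Sarama-style scheme as commonly described (32-bit FNV-1a, signed modulo, sign dropped), and `murmur2` is a hand transcription of the 32-bit variant used by Kafka's Java client, masked to a non-negative value before the modulo. All function names are hypothetical, and the transcription should be checked against known test vectors before anyone relies on it.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnv1aPartition sketches the Sarama-style mapping (assumption): hash the key
// with 32-bit FNV-1a, reinterpret as int32, take the remainder, drop the sign.
func fnv1aPartition(key []byte, numPartitions int32) int32 {
	h := fnv.New32a()
	h.Write(key) // Write on a hash.Hash never returns an error.
	p := int32(h.Sum32()) % numPartitions
	if p < 0 {
		p = -p
	}
	return p
}

// murmur2 is a transcription (assumption) of the 32-bit murmur2 variant
// used by Kafka's Java client, with its fixed seed.
func murmur2(data []byte) uint32 {
	const (
		seed uint32 = 0x9747b28c
		m    uint32 = 0x5bd1e995
	)
	h := seed ^ uint32(len(data))
	// Mix the input four little-endian bytes at a time.
	for i := 0; i+4 <= len(data); i += 4 {
		k := uint32(data[i]) | uint32(data[i+1])<<8 |
			uint32(data[i+2])<<16 | uint32(data[i+3])<<24
		k *= m
		k ^= k >> 24
		k *= m
		h *= m
		h ^= k
	}
	// Fold in the remaining 0 to 3 bytes.
	tail := data[len(data)&^3:]
	switch len(tail) {
	case 3:
		h ^= uint32(tail[2]) << 16
		fallthrough
	case 2:
		h ^= uint32(tail[1]) << 8
		fallthrough
	case 1:
		h ^= uint32(tail[0])
		h *= m
	}
	h ^= h >> 13
	h *= m
	h ^= h >> 15
	return h
}

// murmur2Partition clears the sign bit before the modulo, as Kafka's
// default partitioner does for keyed records.
func murmur2Partition(key []byte, numPartitions int32) int32 {
	return int32(murmur2(key)&0x7fffffff) % numPartitions
}

func main() {
	for _, key := range []string{"user-1", "user-2", "user-3"} {
		fmt.Printf("%s: fnv-1a=%d murmur2=%d\n", key,
			fnv1aPartition([]byte(key), 12), murmur2Partition([]byte(key), 12))
	}
}
```

The two functions generally disagree on where a given key lands, which is why a producer ecosystem that already relies on murmur2 cannot simply consume from a feed partitioned with fnv-1a and expect co-located keys.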

@rharding6373 rharding6373 requested a review from aerfrei January 16, 2026 17:25
@rharding6373 rharding6373 requested a review from a team as a code owner January 16, 2026 17:25
@cockroach-teamcity
Member

This change is Reviewable

Contributor

@aerfrei aerfrei left a comment

Looks good! Great to have this done. Left some nits

if opts.IsSet(changefeedbase.OptHashAlg) {
if !changefeedbase.HashAlgEnabled.Get(&settings.SV) {
return errors.Newf(
"option %s requires cluster setting %s to be enabled",
Contributor

nit: We might want to add a test for this error message.

Collaborator Author

Done.
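
For readers following along, the check under discussion and a test of its error message might look roughly like the sketch below. This is an assumption-laden illustration, not the PR's code: `validatePartitionAlg`, the map-based options, and the boolean standing in for the cluster setting are all hypothetical, and it uses the final option name `partition_alg` rather than the `hash_alg` name shown in the snippet above.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the option name and the cluster setting name.
const (
	optPartitionAlg     = "partition_alg"
	partitionAlgSetting = "changefeed.partition_alg.enabled"
)

// validatePartitionAlg rejects the option when the gating setting is off,
// and rejects values other than the two supported algorithms.
func validatePartitionAlg(opts map[string]string, settingEnabled bool) error {
	alg, ok := opts[optPartitionAlg]
	if !ok {
		return nil // Option not set: nothing to validate.
	}
	if !settingEnabled {
		return fmt.Errorf("option %s requires cluster setting %s to be enabled",
			optPartitionAlg, partitionAlgSetting)
	}
	switch strings.ToLower(alg) {
	case "fnv-1a", "murmur2":
		return nil
	}
	return fmt.Errorf("unknown %s value %q", optPartitionAlg, alg)
}

func main() {
	opts := map[string]string{optPartitionAlg: "murmur2"}
	fmt.Println(validatePartitionAlg(opts, false))
	fmt.Println(validatePartitionAlg(opts, true))
}
```

A test for the error message then only needs to call the validator with the setting disabled and compare the returned text.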

}
}

// Check feature flag for hash_alg.
Contributor

nit: This comment seems to mostly read aloud the code below, maybe remove it?

Collaborator Author

AGI. Done.

settings.ApplicationLevel,
"changefeed.hash_alg.enabled",
"if enabled, allows specifying the hash_alg changefeed option to"+
" choose between fnv-1a (default) and murmur2 hash functions for"+
Contributor

nit: The strings we're concatenating here do not have a space between the second and third, meaning this will become "... forKafka partitioning". To match what we do elsewhere in this file, have each string end with a trailing space before the closing quote (` "+`) and no space at the beginning of the strings.

Collaborator Author

Done.
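
The concatenation issue is easy to reproduce in isolation. In the sketch below, the first two fragments are copied from the snippet above; the third fragment (`"Kafka partitioning"`) is an assumption inferred from the reviewer's description of the result, and the constant names are hypothetical.

```go
package main

import "fmt"

// before mirrors the layout under review: leading spaces on continuation
// lines, which silently breaks when one continuation line omits the space.
const before = "if enabled, allows specifying the hash_alg changefeed option to" +
	" choose between fnv-1a (default) and murmur2 hash functions for" +
	"Kafka partitioning"

// after uses the convention the reviewer asks for: every fragment except
// the last ends with a space, and no fragment starts with one.
const after = "if enabled, allows specifying the hash_alg changefeed option to " +
	"choose between fnv-1a (default) and murmur2 hash functions for " +
	"Kafka partitioning"

func main() {
	fmt.Println(before)
	fmt.Println(after)
}
```

Keeping the space at the end of each fragment makes a missing separator visible at the right edge of the line, where the `+` operators already line up.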

var inner kgo.Partitioner

if changefeedbase.HashAlgEnabled.Get(&settings.SV) && strings.ToLower(hashMethod) == "murmur2" {
// Use franz-go default (murmur2)
Contributor

super nit: this comment and the one below "Default to fnv-1a" could be full sentences. Maybe we'd prefer something like "murmur2 is the franz-go default and can be specified by the changefeed". And "When not otherwise specified, we use the Sarama default hash method of fnv-1a."

Collaborator Author

Did some wordsmithing, PTAL.

defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)

settings := cluster.MakeTestingClusterSettings()
Contributor

nit: Might we want to define this before we define kgoPartitioner, where it's used?

Collaborator Author

Done.

if changefeedbase.HashAlgEnabled.Get(&settings.SV) && strings.ToLower(hashMethod) == "murmur2" {
// Use franz-go default (murmur2)
inner = kgo.StickyKeyPartitioner(nil /* hasher */)
} else {
Contributor

To avoid this else, we should be able to return directly in the if and then handle the default case not nested.

Collaborator Author

Done.
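
The early-return shape suggested here can be sketched without franz-go as follows. The `partitioner` interface and both implementations are hypothetical placeholders for the real `kgo.Partitioner` values, and the cluster-setting check is left out to keep the focus on control flow.

```go
package main

import (
	"fmt"
	"strings"
)

// partitioner is a hypothetical stand-in for kgo.Partitioner.
type partitioner interface{ algorithm() string }

type murmur2Partitioner struct{}

func (murmur2Partitioner) algorithm() string { return "murmur2" }

type fnv1aPartitioner struct{}

func (fnv1aPartitioner) algorithm() string { return "fnv-1a" }

// newPartitioner returns directly from the special case, so the default
// case sits at the top level of the function instead of inside an else.
func newPartitioner(alg string) partitioner {
	if strings.ToLower(alg) == "murmur2" {
		return murmur2Partitioner{}
	}
	return fnv1aPartitioner{}
}

func main() {
	fmt.Println(newPartitioner("murmur2").algorithm())
	fmt.Println(newPartitioner("").algorithm())
}
```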

@rharding6373 rharding6373 force-pushed the 20260114-kafka-hashing-option branch from b587855 to 0e147b1 Compare January 16, 2026 18:39
Collaborator Author

@rharding6373 rharding6373 left a comment

Thanks for the review! Ready for another look.

@github-actions
Contributor

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

  • If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
  • If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

@github-actions bot added the o-AI-Review-Potential-Issue-Detected label Jan 16, 2026

inner = kgo.StickyKeyPartitioner(nil /* hasher */)
return &kgoChangefeedPartitioner{inner: inner}
}
// If not enabled, or if fnv-1a is specified, use fnv-1a by default.
Contributor

nit: Maybe "If the cluster setting is not enabled or no option is specified, use fnv-1a by default.". It wasn't clear to me what is not enabled (the cluster setting? or no option provided?). And if we specify fnv-1a, it's not really by default.

Collaborator Author

Cleaned up

func newKgoChangefeedPartitioner(hashMethod string, settings *cluster.Settings) kgo.Partitioner {
var inner kgo.Partitioner

if changefeedbase.HashAlgEnabled.Get(&settings.SV) && strings.ToLower(hashMethod) == "murmur2" {
Contributor

Claude raised an issue here that seems legit to me: we actually don't want to check this cluster setting here, since if the setting was turned on to create the changefeed successfully with murmur2, if it is turned off later we don't want to switch to the fnv-1a hash mid changefeed. We should also probably add a test for this scenario if possible.

Location: pkg/ccl/changefeedccl/sink_kafka_v2.go:620

Issue: The runtime cluster setting check in newKgoChangefeedPartitioner causes partition inconsistency for existing changefeeds. When a changefeed created with hash_alg='murmur2' resumes after the cluster setting changefeed.hash_alg.enabled is disabled, the partitioner silently falls back to fnv-1a, routing the same keys to different Kafka partitions.

Collaborator Author

Hm, this is an interesting case where using a flag to prevent executing a risky new codepath (I realize the new codepath is not very risky, but this is why we require flags when backporting), can lead to weird, possibly undesirable behavior.

I removed the flag from the partitioner code, since it doesn't seem risky. I updated the commit message to reflect the expected behavior if the setting is toggled.

I tested manually to make sure it behaves as expected. Unfortunately there's a blocker to implementing a test in the test suite for this case, since kafkaFeed does not support multiple partitions, which are critical for validating that the hasher is hashing keys correctly. I spent a couple hours on this today but it's going to take more effort. I created #161310 to track this effort.

Contributor

Great, I changed the PR description to reflect the commit message change. I'm thinking with that manual testing even without the work of #161310 it should be fine to merge. Approving.
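
The scenario in this thread can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: `settings`, `changefeed`, and every function name are hypothetical, and the real code builds a `kgo.Partitioner` rather than returning a string. It contrasts a partitioner choice that consults the cluster setting at resume time with one that relies only on the option stored with the changefeed.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// settings stands in for the cluster setting; changefeed stands in for the
// persisted changefeed options. Both are hypothetical.
type settings struct{ partitionAlgEnabled bool }
type changefeed struct{ partitionAlg string }

// createChangefeed applies the feature gate once, at creation time.
func createChangefeed(s *settings, alg string) (*changefeed, error) {
	if alg != "" && !s.partitionAlgEnabled {
		return nil, errors.New("partition_alg requires changefeed.partition_alg.enabled")
	}
	return &changefeed{partitionAlg: alg}, nil
}

// algorithmGated mirrors the flagged behavior: the setting is re-read
// whenever the partitioner is built, so toggling it changes the hash.
func algorithmGated(s *settings, cf *changefeed) string {
	if s.partitionAlgEnabled && strings.EqualFold(cf.partitionAlg, "murmur2") {
		return "murmur2"
	}
	return "fnv-1a"
}

// algorithmFromOption mirrors the fix: only the stored option matters.
func algorithmFromOption(cf *changefeed) string {
	if strings.EqualFold(cf.partitionAlg, "murmur2") {
		return "murmur2"
	}
	return "fnv-1a"
}

func main() {
	s := &settings{partitionAlgEnabled: true}
	cf, err := createChangefeed(s, "murmur2")
	if err != nil {
		panic(err)
	}
	s.partitionAlgEnabled = false // Setting disabled after creation.
	fmt.Println("gated:", algorithmGated(s, cf))
	fmt.Println("from option:", algorithmFromOption(cf))
}
```

With the gate applied only at creation (and alter) time, disabling the setting blocks new uses of the option without rerouting keys for changefeeds that already depend on it.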

@rharding6373 rharding6373 force-pushed the 20260114-kafka-hashing-option branch from 0e147b1 to 4508455 Compare January 17, 2026 00:09
@rharding6373 added the O-AI-Review-Real-Issue-Found label Jan 17, 2026
@rharding6373 rharding6373 force-pushed the 20260114-kafka-hashing-option branch from 4508455 to 17010c2 Compare January 17, 2026 00:13
@rharding6373 rharding6373 requested a review from aerfrei January 17, 2026 00:13
@rharding6373 added the backport-25.2.x, backport-25.4.x, and backport-26.1.x labels Jan 20, 2026

}

if opts.IsSet(changefeedbase.OptHashAlg) {
if !changefeedbase.HashAlgEnabled.Get(&settings.SV) {
Collaborator

We should do the cluster setting check in ALTER CHANGEFEED too. We might also want to restrict changing the option altogether so that we don't send keys to different partitions after changing the hash algorithm.

Once you add the check, could you also add a test for it?

Collaborator Author

> We might also want to restrict changing the option altogether so that we don't send keys to different partitions after changing the hash algorithm.

Changing the key partition mapping is the point of changing the hash algorithm. Seems like this should be left to the user, though it also seems like a potential foot-gun.

Added a test for alter changefeed, thanks for catching the lack of coverage. The test shows that the existing check already covers the alter case.

@rharding6373 rharding6373 force-pushed the 20260114-kafka-hashing-option branch 3 times, most recently from 85d2788 to 2292bbc Compare January 20, 2026 23:38
@rharding6373
Collaborator Author

Changed the name from hash_alg to partition_alg. PTAL.

@github-actions
Contributor

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

  • If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
  • If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

Previously, the franz-go based kafka sink only supported the fnv-1a
hashing algorithm, because the hashing function was compatible with the
original sarama-based sink. However, some users need alternative hashing
functions that are compatible with their existing ecosystem.

This PR adds the `partition_alg` changefeed option, with the options
`fnv-1a` (default) and `murmur2` (kafka's default). This is protected by
a cluster setting.

Epic: CRDB-58732
Fixes: cockroachdb#161202

Release note (general change): Changefeeds now support the `partition_alg`
option for specifying a kafka partitioning algorithm. Currently `fnv-1a`
(default) and `murmur2` are supported. The option is only valid on kafka
v2 sinks. This is protected by the cluster setting
`changefeed.partition_alg.enabled`. An example usage: `SET CLUSTER SETTING
changefeed.partition_alg.enabled=true; CREATE CHANGEFEED ... INTO
'kafka://...' WITH partition_alg='murmur2';` Note that if a changefeed is
created using the murmur2 algorithm, and then the cluster setting is
disabled, the changefeed will continue using the murmur2 algorithm
unless the changefeed is altered to use a different `partition_alg`.
@rharding6373 rharding6373 force-pushed the 20260114-kafka-hashing-option branch from 2292bbc to b5f89d9 Compare January 20, 2026 23:48
@rharding6373
Collaborator Author

Thanks Claude for finding the two instances of the old name I forgot to replace in the PR description.

Collaborator

@andyyang890 andyyang890 left a comment

LGTM!

@rharding6373
Collaborator Author

TFTRs!

bors t=aerfrei,andyyang890

@andyyang890
Collaborator

bors r=aerfrei,andyyang890

(fixing a typo)

@craig
Contributor

craig bot commented Jan 21, 2026

@craig craig bot merged commit 60d00e7 into cockroachdb:master Jan 21, 2026
29 checks passed
@blathers-crl

blathers-crl bot commented Jan 21, 2026

Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches.


Issue #161202: branch-release-25.2, branch-release-25.4, branch-release-26.1.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl

blathers-crl bot commented Jan 21, 2026

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from b5f89d9 to blathers/backport-release-25.2-161265: POST https://api.github.com/repos/rharding6373/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 25.2.x failed. See errors above.


error creating merge commit from b5f89d9 to blathers/backport-release-25.4-161265: POST https://api.github.com/repos/rharding6373/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 25.4.x failed. See errors above.


error creating merge commit from b5f89d9 to blathers/backport-release-26.1-161265: POST https://api.github.com/repos/rharding6373/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 26.1.x failed. See errors above.



Labels

backport-25.2.x, backport-25.4.x, backport-26.1.x, backport-failed, o-AI-Review-Potential-Issue-Detected, O-AI-Review-Real-Issue-Found, target-release-26.2.0, v26.2.0-prerelease

Development

Successfully merging this pull request may close these issues.

cdc: support murmur2 as a kafka hashing algorithm

4 participants