[cdc] Add KME schema reader in CDC to support schema evolution #2177

Merged
sushantmane merged 3 commits into linkedin:main from sushantmane:Li-Fix-KME-Evolution-in-CDC
Oct 2, 2025

@sushantmane
Contributor

[cdc] Add KME schema reader in CDC to support schema evolution

CDC currently fails when the KME schema evolves, because records encoded with a newer
schema version cannot be deserialized. This change introduces a KME schema reader so that
records with updated schema versions can be deserialized and processed correctly.
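
To make the failure mode concrete, here is a minimal sketch of a deserializer that only knows the schema versions compiled into the client, plus an optional lookup for versions it has not seen. Every name in it (`EnvelopeDecoderSketch`, `Decoder`, the `schemaReader` lookup) is hypothetical and is not the Venice API; it only illustrates the idea described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Hypothetical sketch; these types do not exist in Venice.
final class EnvelopeDecoderSketch {
  /** Decodes a payload that was written with one specific schema version. */
  interface Decoder {
    String decode(byte[] payload);
  }

  // Decoders for the schema versions known when the client was built.
  private final Map<Integer, Decoder> knownDecoders = new ConcurrentHashMap<>();
  // Optional lookup for versions not compiled into the client; may be null.
  private final IntFunction<Decoder> schemaReader;

  EnvelopeDecoderSketch(Map<Integer, Decoder> compiledIn, IntFunction<Decoder> schemaReader) {
    this.knownDecoders.putAll(compiledIn);
    this.schemaReader = schemaReader;
  }

  String deserialize(int schemaVersion, byte[] payload) {
    Decoder decoder = knownDecoders.get(schemaVersion);
    if (decoder == null && schemaReader != null) {
      // Unknown version: ask the schema reader, then cache the result.
      decoder = schemaReader.apply(schemaVersion);
      if (decoder != null) {
        knownDecoders.put(schemaVersion, decoder);
      }
    }
    if (decoder == null) {
      // Without a schema reader, a newer version ends up here.
      throw new IllegalStateException("Unknown envelope schema version: " + schemaVersion);
    }
    return decoder.decode(payload);
  }
}
```

Constructed with a `null` lookup, the sketch throws as soon as it sees a version it was not built with. With a lookup in place, the same record is decoded and the decoder is cached for later records.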

Code changes

  • Added new code behind a config. If so, list the config names and their default values in the PR description.
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

Copilot AI review requested due to automatic review settings October 2, 2025 09:09

Copilot AI left a comment


Pull Request Overview

This PR adds KME (Kafka Message Envelope) schema reader support in CDC to handle schema evolution. The main purpose is to prevent CDC failures when KME schema evolves by enabling proper deserialization of records with newer schema versions.

  • Introduces a configurable KME schema reader that can be enabled via kme.schema.reader.for.schema.evolution.enabled
  • Refactors PubSubMessageDeserializer creation to use the new schema-aware deserializer when conditions are met
  • Updates dictionary reading functionality to accept custom deserializers
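
The gating described in the bullets above can be sketched as follows. Only the config key string is taken from this review summary. The class, the enum, the use of `java.util.Properties`, and the choice to treat a missing key as `false` are assumptions made for illustration. The PR does not state the flag's default here.

```java
import java.util.Properties;

// Hypothetical sketch; only the config key string comes from the review summary.
final class DeserializerSelectionSketch {
  static final String KME_SCHEMA_READER_ENABLED = "kme.schema.reader.for.schema.evolution.enabled";

  enum DeserializerKind { DEFAULT, SCHEMA_READER_BACKED }

  /**
   * Picks the schema-reader-backed deserializer only when the flag is on
   * and a D2 client (modelled here as a plain Object) is available.
   */
  static DeserializerKind choose(Properties props, Object d2Client) {
    // Assumption: a missing key is treated as "false"; the real default is not stated in the PR.
    boolean enabled = Boolean.parseBoolean(props.getProperty(KME_SCHEMA_READER_ENABLED, "false"));
    if (enabled && d2Client != null) {
      return DeserializerKind.SCHEMA_READER_BACKED;
    }
    return DeserializerKind.DEFAULT;
  }
}
```

Falling back to the default when either condition is missing keeps existing clients working unchanged, which is one plausible reason to put the new path behind a flag.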

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| DictionaryUtils.java | Adds overloaded method to accept custom PubSubMessageDeserializer for dictionary reading |
| PubSubMessageDeserializer.java | Adds documentation warning against using default deserializers in production |
| ConfigKeys.java | Introduces new configuration flag for enabling KME schema reader |
| VeniceChangelogConsumerImpl.java | Updates dictionary reading to use the new schema-aware deserializer |
| VeniceChangelogConsumerClientFactory.java | Implements logic to create KME-backed deserializer when enabled and D2 client is available |
| ChangelogClientConfig.java | Removes hardcoded PubSubMessageDeserializer field to enable dynamic creation |
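
The overload mentioned for DictionaryUtils.java follows a common pattern: keep the old entry point and have it delegate to a new one that takes the deserializer as a parameter. The sketch below shows that pattern only. The method name, the `Function<byte[], String>` stand-in for a deserializer, and the return type are all hypothetical and do not reflect the actual signatures in the PR.

```java
import java.util.function.Function;

// Hypothetical sketch of the overload pattern; not the real DictionaryUtils.
final class DictionaryReadSketch {
  // Stand-in for a default deserializer.
  static final Function<byte[], String> DEFAULT_DESERIALIZER = bytes -> "default:" + bytes.length;

  // Original-style entry point: existing callers keep working via delegation.
  static String readDictionary(byte[] rawRecord) {
    return readDictionary(rawRecord, DEFAULT_DESERIALIZER);
  }

  // New overload: the caller supplies the deserializer, for example a schema-aware one.
  static String readDictionary(byte[] rawRecord, Function<byte[], String> deserializer) {
    return deserializer.apply(rawRecord);
  }
}
```

A caller that has built a schema-aware deserializer passes it explicitly. Callers that do not care keep the one-argument form.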


Copilot AI review requested due to automatic review settings October 2, 2025 20:43
@sushantmane force-pushed the Li-Fix-KME-Evolution-in-CDC branch from fe3c7e6 to 3fb9d06 October 2, 2025 20:43

Copilot AI left a comment


Pull Request Overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.



@sushantmane
Contributor Author

Thanks for the review, @kvargha!

@sushantmane enabled auto-merge (squash) October 2, 2025 20:54
@sushantmane merged commit 7c8e9a1 into linkedin:main Oct 2, 2025
50 checks passed
@sushantmane deleted the Li-Fix-KME-Evolution-in-CDC branch October 2, 2025 21:21
arjun4084346 pushed a commit to arjun4084346/venice that referenced this pull request Dec 9, 2025
…din#2177)

sushantmane added a commit to sushantmane/venice that referenced this pull request Jan 30, 2026
…din#2177)

misyel pushed a commit to misyel/venice that referenced this pull request Feb 2, 2026
…din#2177)

sushantmane added a commit to sushantmane/venice that referenced this pull request Feb 8, 2026
…din#2177)

misyel pushed a commit to misyel/venice that referenced this pull request Feb 17, 2026
…din#2177)


3 participants