Fixes for small batches deduplication #7418

Merged
kanwarujjaval merged 2 commits into newarchitecture from dedupe-fixes
Mar 28, 2026

@kanwarujjaval
Member

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

Updates the ClickHouse event deduplication job to better handle small mutation batches by switching to single-pass duplicate discovery and chunked mutation dispatch, with safeguards against overly large discovery runs.

Changes:

  • Replace multi-batch discovery loop with a single discovery query capped by DISCOVERY_LIMIT
  • Dispatch DELETE mutations in smaller chunks via MUTATION_BATCH_SIZE
  • Add ClickHouse query setting (max_query_size) and update checkpoint/result metadata fields
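
The chunked dispatch described above could look roughly like the sketch below. It is not the code from this PR: the constant values, the table and column names (`events`, `row_id`), and the helper functions are hypothetical, and the statement text only assumes ClickHouse's `ALTER TABLE ... DELETE WHERE ...` mutation syntax. The idea is that one capped discovery result is split into groups of at most `MUTATION_BATCH_SIZE` identifiers, so each DELETE statement stays short, which presumably also keeps it comfortably under `max_query_size`.

```python
from typing import Iterator, List, Sequence

# Hypothetical values: the PR names these constants but does not show them.
DISCOVERY_LIMIT = 10_000
MUTATION_BATCH_SIZE = 500


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def quote(value: str) -> str:
    """Quote a string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_delete_mutations(
    table: str,
    column: str,
    ids: Sequence[str],
    batch_size: int = MUTATION_BATCH_SIZE,
) -> List[str]:
    """Build one DELETE mutation statement per chunk of identifiers."""
    statements = []
    for chunk in chunked(ids, batch_size):
        id_list = ", ".join(quote(v) for v in chunk)
        statements.append(
            f"ALTER TABLE {table} DELETE WHERE {column} IN ({id_list})"
        )
    return statements


# Pretend a single discovery query returned 1200 identifiers,
# already capped at DISCOVERY_LIMIT by the query itself.
discovered = [f"row-{i}" for i in range(1200)][:DISCOVERY_LIMIT]
mutations = build_delete_mutations("events", "row_id", discovered)
print(len(mutations))  # 3 statements: 500 + 500 + 200 identifiers
```

Building the statements separately from executing them keeps the chunking logic easy to test without a ClickHouse connection. A real job would pass each statement to its client and record progress in the checkpoint after each one.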

@kanwarujjaval kanwarujjaval merged commit f52cf27 into newarchitecture Mar 28, 2026
9 of 10 checks passed
3 participants