fix: data sink worker optimizations (CM-1054)#3952

Merged
themarolt merged 14 commits into main from
fix/CM-1054-throttle-org-updated-at
Mar 25, 2026
@themarolt
Contributor

@themarolt themarolt commented Mar 25, 2026

Note

Medium Risk
Moderate risk because it changes data-sink worker batching semantics (org creation, relation upserts, sync triggering) and adds new production indexes/migration behavior, which could affect throughput and correctness under load.

Overview
Improves data-sink ingestion performance by adding new covering/partial Postgres indexes for verified username/email member-identity lookups, and refactoring findMembersByVerifiedEmails to use an ANY(array[...]) predicate.
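As a rough illustration of the ANY(array[...]) predicate mentioned above, the sketch below builds a single parameterized lookup for a whole batch of emails instead of one query per email. The function name, table name, and column names are hypothetical, and the trimming and lowercasing step is an assumption, not something the PR states. It relies on the driver (node-postgres does this, for example) serializing a JavaScript array parameter into a Postgres array.

```typescript
// Hypothetical shape of a parameterized query; not the repository's own type.
interface ParameterizedQuery {
  text: string
  values: unknown[]
}

// Builds one lookup for many emails using `= any($1::text[])`.
// Table and column names are assumptions for illustration only.
function buildVerifiedEmailLookup(emails: string[]): ParameterizedQuery {
  // Assumed normalization: trim and lowercase, then drop duplicates.
  const unique = Array.from(new Set(emails.map((e) => e.trim().toLowerCase())))

  return {
    text: [
      'select "memberId", value',
      'from "memberIdentities"',
      "where verified = true and type = 'email'",
      'and value = any($1::text[])',
    ].join(' '),
    // The whole list travels as a single array parameter.
    values: [unique],
  }
}
```

Passing the list as one array parameter keeps the statement text identical regardless of batch size, which is friendlier to prepared-statement reuse than a variable-length IN (...) list.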

Reduces redundant work in the data-sink worker by introducing a per-batch organization creation promise cache, skipping a known placeholder org (individual-noaccount.com), throttling organizations.updatedAt updates during findOrCreateOrganization, and batching/parallelizing activity queue dispatch.
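A per-batch promise cache of the kind described above might look like the following minimal sketch. The class name, the key format, and the create callback are hypothetical; the actual implementation in the worker may differ. The idea is that concurrent requests for the same organization within one batch share a single in-flight creation instead of each issuing their own.

```typescript
type OrgId = string

// Hypothetical per-batch cache: one instance is created per batch and discarded afterwards.
class OrgCreationCache {
  private readonly inFlight = new Map<string, Promise<OrgId>>()

  // Returns the shared promise for `key`, invoking `create` only on the first request.
  getOrCreate(key: string, create: () => Promise<OrgId>): Promise<OrgId> {
    const existing = this.inFlight.get(key)
    if (existing) {
      return existing
    }

    const promise = create().catch((err) => {
      // Drop failed attempts so a later activity in the batch can retry.
      this.inFlight.delete(key)
      throw err
    })

    this.inFlight.set(key, promise)
    return promise
  }
}
```

Caching the promise rather than the resolved value is what makes this safe under parallel dispatch: a second caller arriving before the first insert finishes awaits the same promise instead of racing it.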

Optimizes downstream side effects by deduplicating triggerMemberSync calls per (memberId, segmentId), batching Redis org-id aggregation writes, using bulk affiliation-policy checks, and adding in-function caches to cut repeated existence checks when upserting activity relations (with createOrUpdateRelations(..., true) to skip checks in this path).
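The deduplication of sync triggers per (memberId, segmentId) can be sketched as below. The interface and function names are hypothetical, and the real code may deduplicate at a different point in the pipeline; this only shows the keying idea.

```typescript
// Hypothetical request shape; the worker's real payload may carry more fields.
interface MemberSyncRequest {
  memberId: string
  segmentId: string
}

// Keeps the first request for each (memberId, segmentId) pair, preserving order.
function dedupeSyncRequests(requests: MemberSyncRequest[]): MemberSyncRequest[] {
  const seen = new Set<string>()
  const unique: MemberSyncRequest[] = []

  for (const request of requests) {
    // JSON-encode the pair so the key is unambiguous whatever the ids contain.
    const key = JSON.stringify([request.memberId, request.segmentId])
    if (!seen.has(key)) {
      seen.add(key)
      unique.push(request)
    }
  }

  return unique
}
```

A batch with many activities from the same member in the same segment would then trigger one sync for that pair rather than one per activity.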

Written by Cursor Bugbot for commit 08893ea. This will update automatically on new commits.

Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
Copilot AI review requested due to automatic review settings March 25, 2026 08:21
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces optimizations aimed at reducing write amplification and redundant DB lookups in the data sink worker path (CM-1054), primarily by throttling organizations.updatedAt updates and caching existence checks while upserting activity relations.

Changes:

  • Add optional throttleUpdatedAt behavior to organization updates and enable it from data_sink_worker.
  • Reduce repeated SELECT checks in createOrUpdateRelations by caching member/org/segment/conversation existence results within a batch.
  • Thread the new throttling flag through findOrCreateOrganization -> updateOrganization.
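The throttling decision behind throttleUpdatedAt can be modeled as a small pure function. This is only a sketch of the idea: the PR implements it as a SQL update expression, and the one-hour window, the function name, and the parameter names here are all assumptions.

```typescript
// Assumed throttle window; the value used in the PR is not stated in this conversation.
const ASSUMED_THROTTLE_WINDOW_MS = 60 * 60 * 1000

// Returns the timestamp to store as updatedAt.
// With throttling off, every update bumps the timestamp.
// With throttling on, the stored value is kept unless it is older than the window.
function nextUpdatedAt(
  current: Date,
  now: Date,
  throttleUpdatedAt: boolean,
  windowMs: number = ASSUMED_THROTTLE_WINDOW_MS,
): Date {
  if (!throttleUpdatedAt) {
    return now
  }

  const ageMs = now.getTime() - current.getTime()
  return ageMs >= windowMs ? now : current
}
```

When the stored value is returned unchanged and no other column differs, the write can be skipped entirely, which is where the reduction in write amplification would come from.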

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| services/libs/data-access-layer/src/organizations/base.ts | Adds an optional throttled updatedAt update expression and passes the flag through the org upsert flow. |
| services/libs/data-access-layer/src/activities/sql.ts | Adds in-function caches to reduce repeated existence checks during activity relation upserts. |
| services/apps/data_sink_worker/src/service/organization.service.ts | Enables throttled updatedAt updates for data sink worker organization upserts. |


@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.


Signed-off-by: Uroš Marolt <uros@marolt.me>
…on logging

Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Uroš Marolt <uros@marolt.me>
@themarolt themarolt merged commit d6a76c9 into main Mar 25, 2026
11 checks passed
@themarolt themarolt deleted the fix/CM-1054-throttle-org-updated-at branch March 25, 2026 21:22
mbani01 pushed a commit that referenced this pull request Mar 26, 2026
Signed-off-by: Uroš Marolt <uros@marolt.me>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>

3 participants