
feat(ws): improve warehouse-sources-load observability #60231

Merged
estefaniarabadan merged 2 commits into master from posthog-code/improve-warehouse-sources-load-logging
May 27, 2026


@estefaniarabadan estefaniarabadan commented May 27, 2026

Problem

We need more logs to understand what is happening in pipeline V3.

Changes

  • Add logs to pipeline V3 (both producer and consumer)

How did you test this code?

Tests pass.

Publish to changelog?

NO

@estefaniarabadan estefaniarabadan requested a review from a team May 27, 2026 11:48

greptile-apps Bot commented May 27, 2026

Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
posthog/temporal/data_imports/pipelines/pipeline_v3/load/processor.py:140
Using `capture_exception` with a freshly-constructed `Exception` is a misuse of the error-tracking API. Sentry treats this as an exception event: it groups all occurrences under the same issue (no real traceback, no surrounding context), and the grouping algorithm will conflate different counts as one noisy fingerprint. The situation has already been logged at `WARNING` level by structlog immediately above — if Sentry alerting is desired, `capture_message` is the appropriate call here since there is no real exception to capture.

```suggestion
            capture_message(f"Found {len(modified_files)} modified delta files during first sync")
```

Reviews (1): Last reviewed commit: "add batch index" | Re-trigger Greptile

"Found modified files during first sync, skipping partial data loading",
batch_index=export_signal.batch_index,
modified_count=len(modified_files),
)

P2 Using `capture_exception` with a freshly-constructed `Exception` is a misuse of the error-tracking API. Sentry treats this as an exception event: it groups all occurrences under the same issue (no real traceback, no surrounding context), and the grouping algorithm will conflate different counts as one noisy fingerprint. The situation has already been logged at `WARNING` level by structlog immediately above. If Sentry alerting is desired, `capture_message` is the appropriate call here since there is no real exception to capture.

Suggested change
)
capture_message(f"Found {len(modified_files)} modified delta files during first sync")
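
For illustration only: a minimal sketch of the pattern the comment describes, where the condition is recorded as a warning with structured fields and no exception object is constructed. It uses the standard library `logging` module as a stand-in for structlog. `report_modified_files` and the logger name are hypothetical and are not taken from `processor.py`; whether an additional message-level alert is wanted is left open.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("pipeline_v3.load")


def report_modified_files(modified_files: list[str], batch_index: int) -> dict:
    """Log a warning with a constant message and the variable parts as fields.

    Returns the fields so callers (and tests) can inspect what was recorded.
    """
    fields = {"batch_index": batch_index, "modified_count": len(modified_files)}
    # The message text is constant; the count travels as a field rather than
    # being interpolated into the string, and no Exception is created.
    logger.warning(
        "Found modified files during first sync, skipping partial data loading",
        extra=fields,
    )
    return fields
```

Calling `report_modified_files(["a.parquet", "b.parquet"], batch_index=3)` emits one warning record carrying `batch_index` and `modified_count` as attributes.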


Sentry? I thought that was phased out. Does this make sense for us?

@estefaniarabadan estefaniarabadan enabled auto-merge (squash) May 27, 2026 15:15

@danielcarletti danielcarletti left a comment


LOGS BABYYYYY

@estefaniarabadan estefaniarabadan merged commit 3406cb0 into master May 27, 2026
223 of 224 checks passed
@estefaniarabadan estefaniarabadan deleted the posthog-code/improve-warehouse-sources-load-logging branch May 27, 2026 15:19
deployment-status-posthog Bot commented May 27, 2026

Deploy status

| Environment | Status | Deployed At | Workflow |
| --- | --- | --- | --- |
| dev | ✅ Deployed | 2026-05-27 16:01 UTC | Run |
| prod-us | ✅ Deployed | 2026-05-27 16:16 UTC | Run |
| prod-eu | ✅ Deployed | 2026-05-27 16:16 UTC | Run |

2 participants