@cmanallen cmanallen commented Sep 22, 2025

Adds initial outlines for a ClickHouse row data export and GCS export.


Note

Implements session replay data export via ClickHouse row pagination and GCS Storage Transfer jobs, with background tasks, new referrer, dependency, and comprehensive tests.

  • Replays Backend:
    • New export pipeline in src/sentry/replays/data_export.py:
      • ClickHouse row export (export_clickhouse_rows, query_replays_dataset, get_replay_date_query_ranges, CSV via row_iterator_to_csv, save_to_storage).
      • Async tasks: export_replay_row_set_async, export_replay_project_async, export_replay_data.
      • GCS Storage Transfer integration: job creation/run (create_transfer_job, request_create_transfer_job, request_run_transfer_job) and retry handler (retry_transfer_job_run); blob export scheduler (export_replay_blob_data).
  • Snuba Referrer: add Referrer.EU_DATA_EXPORT.
  • Dependencies/Types:
    • Add google-cloud-storage-transfer and use google.type.date_pb2; allowlist google.type.* for mypy.
  • Tests:
    • Add tests/sentry/replays/{unit,integration}/test_data_export.py and tests/sentry/replays/conftest.py with ReplayStore fixture.
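The row-export half of the summary above (paginate ClickHouse rows, then serialize to CSV) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/sentry/replays/data_export.py`: `paginate_rows`, `rows_to_csv`, and the `fetch_page(limit, offset)` callback are hypothetical names standing in for whatever the real query and CSV helpers do, and limit/offset paging is assumed.

```python
import csv
import io
from collections.abc import Callable, Iterable, Iterator
from typing import Any

Row = dict[str, Any]


def paginate_rows(
    fetch_page: Callable[[int, int], list[Row]], limit: int
) -> Iterator[Row]:
    """Yield rows page by page until a short (or empty) page is returned.

    `fetch_page(limit, offset)` is a hypothetical stand-in for a dataset query.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            # A page smaller than the limit means there is nothing left.
            return
        offset += limit


def rows_to_csv(rows: Iterable[Row]) -> str:
    """Serialize rows to CSV text, taking the header from the first row."""
    buf = io.StringIO()
    writer = None
    for row in rows:
        if writer is None:
            writer = csv.DictWriter(buf, fieldnames=list(row.keys()))
            writer.writeheader()
        writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    # Fake in-memory "table" used in place of a real ClickHouse query.
    table = [{"replay_id": c, "segment_id": i} for i, c in enumerate("abcde")]
    csv_text = rows_to_csv(
        paginate_rows(lambda limit, offset: table[offset : offset + limit], limit=2)
    )
    print(csv_text)
```

Streaming rows through a generator keeps memory bounded by the page size, which is presumably why the export is split into row sets handled by separate async tasks.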

Written by Cursor Bugbot for commit ac5d365. This will update automatically on new commits. Configure here.

@github-actions github-actions bot added the Scope: Backend label Sep 22, 2025

codecov bot commented Sep 22, 2025

Codecov Report

❌ Patch coverage is 92.35294% with 13 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines             Patch %   Lines
src/sentry/replays/data_export.py    91.39%    13 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #100009      +/-   ##
===========================================
+ Coverage   81.11%    81.23%   +0.11%     
===========================================
  Files        8595      8589       -6     
  Lines      380843    381315     +472     
  Branches    23900     23861      -39     
===========================================
+ Hits       308930    309745     +815     
+ Misses      71551     71217     -334     
+ Partials      362       353       -9     

@github-actions github-actions bot added the Scope: Frontend label Sep 30, 2025

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and made forward or backward compatible, so that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.


owner = ApiOwner.REPLAY
publish_status = {"POST": ApiPublishStatus.PRIVATE}
permission_classes = (SentryIsAuthenticated,)

Bug: Pub/Sub Notifications Blocked by Authentication Check

The DataExportNotificationsEndpoint uses SentryIsAuthenticated. This endpoint is for Google Cloud Pub/Sub notifications, which don't carry user authentication. This means legitimate Pub/Sub notifications will be rejected, breaking automatic retry functionality for failed transfer jobs.
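To make the concern concrete: a Pub/Sub push request is an HTTP POST with a JSON envelope and no Sentry user session, so any check has to rely on something the request itself carries. The sketch below is an assumption-laden illustration, not the fix adopted in this PR. It decodes the standard push envelope (`message.data` is base64-encoded) and compares a shared secret in constant time. The helper names, the shared-token scheme, and the `eventType` attribute used in the usage example are all hypothetical here.

```python
import base64
import hmac
import json
from typing import Any


def is_authorized_push(presented_token: str, expected_token: str) -> bool:
    """Constant-time comparison of a shared secret.

    Assumption: the push subscription URL carries a secret token that the
    endpoint checks instead of a user session.
    """
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())


def decode_push_envelope(body: bytes) -> dict[str, Any]:
    """Extract attributes and the decoded payload from a push envelope."""
    envelope = json.loads(body)
    message = envelope["message"]
    # Pub/Sub base64-encodes the message payload in the "data" field.
    data = base64.b64decode(message.get("data", "")).decode("utf-8")
    return {"attributes": message.get("attributes", {}), "data": data}


if __name__ == "__main__":
    body = json.dumps(
        {
            "message": {
                "data": base64.b64encode(b"payload").decode(),
                # Hypothetical attribute name used only for illustration.
                "attributes": {"eventType": "TRANSFER_OPERATION_FAILED"},
                "messageId": "1",
            },
            "subscription": "projects/example/subscriptions/example",
        }
    ).encode()
    if is_authorized_push("s3cret", "s3cret"):
        print(decode_push_envelope(body))
```

Whatever mechanism is chosen, the point of the review comment stands: the permission class has to accept requests that carry no user identity.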


@cmanallen cmanallen merged commit 0dc5ff4 into master Oct 1, 2025
69 checks passed
@cmanallen cmanallen deleted the cmanallen/replays-data-export branch October 1, 2025 13:55

sentry-io bot commented Oct 1, 2025

Issues attributed to commits in this pull request

This pull request was merged and Sentry observed the following issues:

@cmanallen

Well that's a bummer. Guess I'll revert then.

@cmanallen cmanallen added the Trigger: Revert label Oct 1, 2025
@getsentry-bot

PR reverted: 5c6d595

getsentry-bot added a commit that referenced this pull request Oct 1, 2025
This reverts commit 0dc5ff4.

Co-authored-by: cmanallen <6018782+cmanallen@users.noreply.github.com>
priscilawebdev pushed a commit that referenced this pull request Oct 1, 2025
Adds initial outlines for a ClickHouse row data export and GCS export.


Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
priscilawebdev pushed a commit that referenced this pull request Oct 1, 2025
This reverts commit 0dc5ff4.

Co-authored-by: cmanallen <6018782+cmanallen@users.noreply.github.com>
cmanallen added a commit that referenced this pull request Oct 2, 2025