feat(replays): Add data export #100009
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #100009     +/-   ##
===========================================
+ Coverage   81.11%    81.23%   +0.11%
===========================================
  Files        8595      8589       -6
  Lines      380843    381315     +472
  Branches    23900     23861      -39
===========================================
+ Hits       308930    309745     +815
+ Misses      71551     71217     -334
+ Partials      362       353       -9
🚨 Warning: This pull request contains Frontend and Backend changes! It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and made forward or backward compatible, so that the Backend or Frontend can be safely deployed independently.
owner = ApiOwner.REPLAY
publish_status = {"POST": ApiPublishStatus.PRIVATE}
permission_classes = (SentryIsAuthenticated,)
Bug: Pub/Sub Notifications Blocked by Authentication Check
The `DataExportNotificationsEndpoint` uses `SentryIsAuthenticated`. This endpoint is for Google Cloud Pub/Sub notifications, which don't carry user authentication. This means legitimate Pub/Sub notifications will be rejected, breaking automatic retry functionality for failed transfer jobs.
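To illustrate the concern, here is a minimal sketch of how a push notification could be trusted without a user session. It is not the fix adopted in this pull request. It assumes a hypothetical shared secret appended to the push endpoint URL as a `token` query parameter, and the helper names `is_trusted_notification` and `decode_push_payload` are invented for this example. A production endpoint might instead verify the OIDC token that Pub/Sub can attach to push requests.

```python
import base64
import hmac
import json


def is_trusted_notification(query_params: dict, expected_token: str) -> bool:
    """Return True when the request carries the expected shared secret.

    Assumption: the push subscription URL was configured with a `token`
    query parameter. This is a hypothetical scheme, not Sentry's code.
    """
    supplied = query_params.get("token", "")
    if not supplied or not expected_token:
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


def decode_push_payload(body: str) -> dict:
    """Decode the base64 `message.data` field of a Pub/Sub push envelope.

    Assumption: the publisher put a JSON document in the message data.
    """
    envelope = json.loads(body)
    raw = base64.b64decode(envelope["message"]["data"])
    return json.loads(raw)
```

With a check like this, the endpoint would not depend on `SentryIsAuthenticated`, so notifications without a signed-in user could still reach the retry handler.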
Issues attributed to commits in this pull request
This pull request was merged and Sentry observed the following issues:
Well that's a bummer. Guess I'll revert then.
PR reverted: 5c6d595
This reverts commit 0dc5ff4. Co-authored-by: cmanallen <6018782+cmanallen@users.noreply.github.com>
Adds initial outlines for a ClickHouse row data export and GCS export.

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Implements session replay data export via ClickHouse row pagination and GCS Storage Transfer jobs, with background tasks, new referrer, dependency, and comprehensive tests.
>
> - **Replays Backend**:
>   - **New export pipeline** in `src/sentry/replays/data_export.py`:
>     - ClickHouse row export (`export_clickhouse_rows`, `query_replays_dataset`, `get_replay_date_query_ranges`, CSV via `row_iterator_to_csv`, `save_to_storage`).
>     - Async tasks: `export_replay_row_set_async`, `export_replay_project_async`, `export_replay_data`.
>     - GCS Storage Transfer integration: job creation/run (`create_transfer_job`, `request_create_transfer_job`, `request_run_transfer_job`) and retry handler (`retry_transfer_job_run`); blob export scheduler (`export_replay_blob_data`).
>   - **Snuba Referrer**: add `Referrer.EU_DATA_EXPORT`.
> - **Dependencies/Types**:
>   - Add `google-cloud-storage-transfer` and use `google.type.date_pb2`; allowlist `google.type.*` for mypy.
> - **Tests**:
>   - Add `tests/sentry/replays/{unit,integration}/test_data_export.py` and `tests/sentry/replays/conftest.py` with `ReplayStore` fixture.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit ac5d365. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Adds initial outlines for a ClickHouse row data export and GCS export.

Note
Implements session replay data export via ClickHouse row pagination and GCS Storage Transfer jobs, with background tasks, new referrer, dependency, and comprehensive tests.

- Replays Backend:
  - New export pipeline in `src/sentry/replays/data_export.py`:
    - ClickHouse row export (`export_clickhouse_rows`, `query_replays_dataset`, `get_replay_date_query_ranges`, CSV via `row_iterator_to_csv`, `save_to_storage`).
    - Async tasks: `export_replay_row_set_async`, `export_replay_project_async`, `export_replay_data`.
    - GCS Storage Transfer integration: job creation/run (`create_transfer_job`, `request_create_transfer_job`, `request_run_transfer_job`) and retry handler (`retry_transfer_job_run`); blob export scheduler (`export_replay_blob_data`).
  - Snuba Referrer: add `Referrer.EU_DATA_EXPORT`.
- Dependencies/Types:
  - Add `google-cloud-storage-transfer` and use `google.type.date_pb2`; allowlist `google.type.*` for mypy.
- Tests:
  - Add `tests/sentry/replays/{unit,integration}/test_data_export.py` and `tests/sentry/replays/conftest.py` with `ReplayStore` fixture.

Written by Cursor Bugbot for commit ac5d365. This will update automatically on new commits.
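
The row-export half of the summary (date ranges, paginated queries, CSV output) can be sketched in plain Python. This is only an illustration of the general technique, not the code in `data_export.py`: the helpers `split_date_range`, `paginate_rows`, and `rows_to_csv` are hypothetical names, their signatures are assumptions, and `fetch_page` stands in for whatever actually queries ClickHouse.

```python
import csv
import io
from datetime import datetime, timedelta
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Dict[str, object]


def split_date_range(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield half-open [lower, upper) windows that cover start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def paginate_rows(
    fetch_page: Callable[[int, int], List[Row]], limit: int
) -> Iterator[Row]:
    """Call fetch_page(limit, offset) until a short page signals the end."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit


def rows_to_csv(rows: Iterable[Row], columns: Sequence[str]) -> str:
    """Serialize rows to CSV text with a header; unknown keys are ignored."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A caller would typically loop over the windows from `split_date_range`, stream each window's rows through `paginate_rows`, and hand the resulting CSV text to a storage layer. Bounding each query by a date window and a row limit keeps individual queries small.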