Merged
2 changes: 1 addition & 1 deletion src/sentry/profiles/data_export.py
@@ -44,7 +44,7 @@ def export_profiles_data(
gcp_project_id=gcp_project_id,
transfer_job_name=None,
source_bucket=source_bucket,
-source_prefix=f"{organization_id}",
+source_prefix=f"{organization_id}/",

Bug: export_replay_blob_data creates GCS transfer jobs with source_prefix values missing a trailing slash, violating GCS API requirements.
Severity: CRITICAL | Confidence: 1.00

🔍 Detailed Analysis

The export_replay_blob_data function in src/sentry/replays/data_export.py constructs source_prefix values like "30/1", "60/1", "90/1" without a trailing slash. The Google Cloud Storage Transfer Service API requires source paths to end with a /. This will cause google.api_core.exceptions.InvalidArgument: 400 GCS source Path argument must end with '/' errors when attempting to create transfer jobs, making replay blob exports non-functional.

💡 Suggested Fix

Modify src/sentry/replays/data_export.py to append a trailing slash to source_prefix in export_replay_blob_data, e.g., source_prefix=f"{retention_days}/{project_id}/". Update test_export_replay_blob_data() to assert the correct path format.

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/sentry/profiles/data_export.py#L47

Potential issue: The `export_replay_blob_data` function in
`src/sentry/replays/data_export.py` constructs `source_prefix` values like `"30/1"`,
`"60/1"`, `"90/1"` without a trailing slash. The Google Cloud Storage Transfer Service
API requires source paths to end with a `/`. This will cause
`google.api_core.exceptions.InvalidArgument: 400 GCS source Path argument must end with
'/'` errors when attempting to create transfer jobs, making replay blob exports
non-functional.

Reference_id: 2760760
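
The suggested fix appends the slash directly in the f-string. As a minimal sketch of the same idea in reusable form, the snippet below normalizes a prefix so it always ends with exactly one `/`. The helper name `ensure_trailing_slash` is hypothetical and does not appear in the Sentry codebase; leaving an empty prefix unchanged is also an assumption, not something the review states.

```python
def ensure_trailing_slash(prefix: str) -> str:
    """Return ``prefix`` ending with exactly one "/".

    Hypothetical helper for illustration only. An empty prefix is
    returned unchanged (assumption: the caller means "no prefix").
    """
    if not prefix:
        return prefix
    # Drop any existing trailing slashes, then add a single one back.
    return prefix.rstrip("/") + "/"


# Prefixes shaped like the ones discussed in the review comment.
for raw in ("30/1", "60/1/", "90/1//"):
    print(repr(raw), "->", repr(ensure_trailing_slash(raw)))
```

Writing the slash inline, as the PR does, keeps the change small; a helper like this would only pay off if more call sites build transfer prefixes.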

destination_bucket=destination_bucket,
destination_prefix=destination_prefix,
notification_topic=pubsub_topic_name,
2 changes: 1 addition & 1 deletion src/sentry/replays/data_export.py
@@ -593,7 +593,7 @@ def export_replay_blob_data[T](
gcp_project_id=gcp_project_id,
transfer_job_name=None,
source_bucket=source_bucket,
-source_prefix=f"{retention_days}/{project_id}",
+source_prefix=f"{retention_days}/{project_id}/",
destination_bucket=destination_bucket,
destination_prefix=destination_prefix,
notification_topic=pubsub_topic_name,
6 changes: 3 additions & 3 deletions tests/sentry/replays/unit/test_data_export.py
@@ -109,6 +109,6 @@ def test_export_replay_blob_data() -> None:

# Assert a job is created for each retention-period.
assert len(jobs) == 3
-assert jobs[0].transfer_job.transfer_spec.gcs_data_source.path == "30/1"
-assert jobs[1].transfer_job.transfer_spec.gcs_data_source.path == "60/1"
-assert jobs[2].transfer_job.transfer_spec.gcs_data_source.path == "90/1"
+assert jobs[0].transfer_job.transfer_spec.gcs_data_source.path == "30/1/"
+assert jobs[1].transfer_job.transfer_spec.gcs_data_source.path == "60/1/"
+assert jobs[2].transfer_job.transfer_spec.gcs_data_source.path == "90/1/"
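
The updated assertions pin three literal paths. A rough sketch of the same expectation written as a property over the retention periods is shown below. The retention values (30, 60, 90) and project ID 1 are inferred from the asserted paths, and `build_source_prefixes` is a hypothetical stand-in for the prefix construction, not the real `export_replay_blob_data` function.

```python
RETENTION_DAYS = (30, 60, 90)  # assumed from the asserted paths "30/1/", "60/1/", "90/1/"


def build_source_prefixes(project_id: int, retention_days=RETENTION_DAYS) -> list[str]:
    """Hypothetical stand-in: one "<retention_days>/<project_id>/" prefix per period."""
    return [f"{days}/{project_id}/" for days in retention_days]


prefixes = build_source_prefixes(project_id=1)

# One prefix per retention period, each ending with the trailing slash.
assert len(prefixes) == 3
assert all(prefix.endswith("/") for prefix in prefixes)
```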