feat(seer): Add helper for bulk updating Seer project settings #115756
…ings Refactor update_seer_project_settings to extract _get_seer_project_options_to_update so the option-mapping logic can be reused by both the single-project and upcoming bulk-project code paths. Add bulk_update_seer_project_settings that uses bulk_create with upsert and bulk delete instead of per-project loops.
| scannerAutomation: bool | ||
| def update_seer_project_settings(project: Project, data: SeerProjectSettingsUpdate) -> None: |
This function's logic is unchanged; we just return the options to set and clear instead of applying them right there, so that the same mapping can feed both single-project and bulk DB operations.
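A minimal sketch of the pattern described in this comment, using an in-memory dict in place of `ProjectOption` storage. The names `SettingsUpdate`, `get_options_to_update`, `apply_single`, and `apply_bulk` are hypothetical stand-ins, not the helpers in this PR, and the rule that a value equal to the default is stored by clearing the key is an assumption made for illustration.

```python
from typing import Any, TypedDict

SCANNER_KEY = "sentry:seer_scanner_automation"

# Hypothetical in-memory stand-in for ProjectOption rows: project_id -> {key: value}.
Store = dict[int, dict[str, Any]]


class SettingsUpdate(TypedDict, total=False):
    # Hypothetical stand-in for SeerProjectSettingsUpdate; only one field is shown.
    scannerAutomation: bool


def get_options_to_update(data: SettingsUpdate) -> tuple[dict[str, Any], set[str]]:
    """Return (options to set, option keys to clear) without touching storage."""
    to_set: dict[str, Any] = {}
    to_clear: set[str] = set()
    if "scannerAutomation" in data:
        if data["scannerAutomation"]:
            # Assumption: a value equal to the default (True) is stored by clearing the key.
            to_clear.add(SCANNER_KEY)
        else:
            to_set[SCANNER_KEY] = False
    return to_set, to_clear


def apply_single(store: Store, project_id: int, data: SettingsUpdate) -> None:
    """Single-project path: compute the mapping, then apply it to one project."""
    to_set, to_clear = get_options_to_update(data)
    options = store.setdefault(project_id, {})
    options.update(to_set)
    for key in to_clear:
        options.pop(key, None)


def apply_bulk(store: Store, project_ids: list[int], data: SettingsUpdate) -> None:
    """Bulk path: compute the mapping once, then apply it to every project.

    The loop here only stands in for the bulk upsert and bulk delete queries.
    """
    to_set, to_clear = get_options_to_update(data)
    for project_id in project_ids:
        options = store.setdefault(project_id, {})
        options.update(to_set)
        for key in to_clear:
            options.pop(key, None)
```

Because the mapping function has no side effects, both paths share it and differ only in how they write the result.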
| ).exists() | ||
| class TestBulkUpdateSeerProjectSettings(TestCase): |
Kept this pretty simple since the business logic is already tested in TestUpdateSeerProjectSettings.
| # For all projects, manually reload cache and invalidate Relay config | ||
| # since bulk ProjectOption operations bypass update_option/delete_option. | ||
| for project_id in project_ids: | ||
| ProjectOption.objects.reload_cache(project_id, "projectoption.bulk_set_value") |
Note that we always reload the cache for each project, instead of first checking whether any options actually changed, as ProjectOptionManager.set_value does. I thought it might get too complicated otherwise.
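For comparison, a rough sketch of what the change-detection alternative mentioned in this comment could involve. It works on plain dicts rather than `ProjectOption` rows, and `projects_needing_reload` is a hypothetical helper, not something in this PR or in Sentry.

```python
from typing import Any

_MISSING = object()  # sentinel so that a stored None still compares correctly


def projects_needing_reload(
    current: dict[int, dict[str, Any]],
    to_set: dict[str, Any],
    to_clear: set[str],
) -> list[int]:
    """Return the IDs of projects whose stored options would actually change."""
    changed = []
    for project_id, options in current.items():
        # A set changes the project if the key is absent or holds a different value.
        sets_differ = any(options.get(key, _MISSING) != value for key, value in to_set.items())
        # A clear changes the project only if the key is currently stored.
        clears_hit = any(key in options for key in to_clear)
        if sets_differ or clears_hit:
            changed.append(project_id)
    return changed
```

This requires reading every affected option row before writing, which is the extra complexity the unconditional reload avoids.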
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 184eeb2.
| "sentry:seer_scanner_automation", data["scannerAutomation"], default=True | ||
| with transaction.atomic(using=router.db_for_write(ProjectOption)): | ||
| # Lock project rows to serialize concurrent writes. | ||
| list(Project.objects.select_for_update().filter(id__in=project_ids).order_by("id")) |
Selecting for update is necessary because we are modifying multiple related options that should be treated as a group (like handoff). Here is our current project count distribution: 99.9% of our orgs have fewer than 500 projects, and 99.34% have fewer than 25. Given those stats, I'm wondering whether it's necessary to batch so that we don't lock up many projects at once or cause timeouts. I don't see any precedent for this in other code, but figured I'd bring it up.
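If batching did turn out to be needed, the ID-splitting part could look roughly like the sketch below. `batched_ids` is a hypothetical helper and the batch size is arbitrary; the PR as written locks all projects in one transaction.

```python
from collections.abc import Iterator, Sequence


def batched_ids(project_ids: Sequence[int], batch_size: int) -> Iterator[list[int]]:
    """Yield project IDs in ascending order, at most batch_size per batch.

    Sorting keeps the lock order consistent across concurrent callers, in the
    same spirit as the order_by("id") on the select_for_update query.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(project_ids)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start : start + batch_size]


# Each batch would then get its own transaction and row locks, for example:
#
#     for batch in batched_ids(project_ids, 100):
#         with transaction.atomic(using=router.db_for_write(ProjectOption)):
#             list(Project.objects.select_for_update().filter(id__in=batch).order_by("id"))
#             ...  # bulk upsert and delete for this batch only
```

The trade-off is that the update is no longer atomic across the whole organization: a failure partway through leaves earlier batches committed.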
| # Manually reload each project's cache, since _raw_delete and bulk_create | ||
| # bypass the cache reloading in update_option and delete_option. | ||
| for project_id in project_ids: |
If this ends up being prohibitively slow, we could always move this to an async task.
JoshFerge left a comment
looks great! just a couple of small comments.
Fixes CW-1285, AIML-2753

Depends on #115230, #115756

Adds a bulk/org-level endpoint for managing per-project Seer settings across multiple projects:

- `GET /api/0/organizations/{org}/seer/projects/`: paginated list with search/sort/filter
- `PUT /api/0/organizations/{org}/seer/projects/`: bulk update across multiple projects

Use the helpers from #115037 and #115756 to translate high-level fields (agent, integrationId, stoppingPoint, scannerAutomation) into project options.

### Supported filters (via `query` parameter)

| Filter | Operators | Example |
|---|---|---|
| `id` | `=`, `!=`, `IN`, `NOT IN` | `id:1`, `id:[1,2,3]` |
| `name` (free text) | `=`, `!=` | `my-project` |
| `reposCount` | `=`, `!=`, `>`, `<`, `>=`, `<=` | `reposCount:>0` |
| `stoppingPoint` | `=`, `!=`, `IN`, `NOT IN` | `stoppingPoint:off` |
| `agent` | `=`, `!=`, `IN`, `NOT IN` | `agent:seer`, `!agent:cursor_background_agent` |

### Supported sort fields (via `sortBy` parameter)

`name`, `-name`, `reposCount`, `-reposCount`, `agent`, `-agent`, `stoppingPoint`, `-stoppingPoint`

### Example: GET paginated list with default sort

```
GET /api/0/organizations/sentry/seer/projects/
```

```json
[
  {
    "projectId": "2",
    "projectSlug": "test-seer-settings",
    "agent": "seer",
    "integrationId": null,
    "stoppingPoint": "code_changes",
    "scannerAutomation": true,
    "reposCount": 1
  },
  {
    "projectId": "5",
    "projectSlug": "z-project",
    "agent": "cursor_background_agent",
    "integrationId": "42",
    "stoppingPoint": "open_pr",
    "scannerAutomation": false,
    "reposCount": 3
  }
]
```

### Example: GET with search and sort

```
GET /api/0/organizations/sentry/seer/projects/
{"query": "reposCount:>0 agent:seer", "sortBy": "-reposCount"}
```

Returns only Seer-agent projects with at least one repo, sorted by repo count descending.

### Example: GET with free text search

```
GET /api/0/organizations/sentry/seer/projects/
{"query": "my-project"}
```

Matches against both project name and slug (case-insensitive).

### Example: PUT bulk update with filter

```
PUT /api/0/organizations/sentry/seer/projects/
{"query": "agent:cursor_background_agent reposCount:>0", "stoppingPoint": "open_pr"}
```

Returns `204 No Content`. Updates only projects matching the query.

### Example: PUT bulk update all projects

```
PUT /api/0/organizations/sentry/seer/projects/
{"scannerAutomation": false}
```

Returns `204 No Content`. With no query, updates all accessible projects.

---------

Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>

Relates to CW-1285
Support both single-project and bulk-project Seer settings updates via `update_seer_project_settings` and `bulk_update_seer_project_settings`.

- Refactor `update_seer_project_settings` so that the extracted option-mapping logic can be reused by both the single-project and upcoming bulk-project code paths.
- Add a `bulk_update_seer_project_settings` helper that bulk creates and deletes project options instead of using per-project loops. We use `_raw_delete` to bypass per-row `post_delete` cache reloads, and manually reload each project's cache at the end (`bulk_create` also bypasses `post_save` cache reloads).

### Why `_raw_delete` + manual `reload_cache`?

`ProjectOptionManager` auto-connects a `post_delete` signal handler that calls `reload_cache` on every `ProjectOption` row deletion. However, `bulk_create` does not trigger any `post_save` signal, so we have to reload each project's cache at the end of the transaction anyway. For a bulk update touching N projects × up to 7 option keys, that's up to 7N redundant `reload_cache` calls for row deletions.

We considered three approaches:
1. Per-project loop (sequential)

   Use `update_option`/`delete_option` per project inside a single transaction. Simplest, but extremely slow due to serialized DB round-trips.

2. Per-project loop (thread pool, ~10 workers)

   Parallelize the per-project loop with `ThreadPoolExecutor`. ~3x speedup over sequential, but still much slower than bulk due to per-project round-trips. Also adds complexity: `connections.close_all()` per thread, no shared transaction, choosing a worker count.

3. Bulk ops with `_raw_delete` (chosen)

   `_raw_delete` is a private Django `QuerySet` method that does not send `post_delete` signals. We pair it with `bulk_create` for upserts and do a single `reload_cache` per project at the end. This gives us exactly N cache refreshes instead of up to 7N.

We chose `_raw_delete` over `post_delete.disconnect`/`connect` because signal disconnect is global: another request thread deleting a `ProjectOption` during our window would miss its signal. `_raw_delete` is scoped to the queryset, so there is no thread-safety concern. `_raw_delete` is private API, but it is stable and has been in Django since 1.5 (2013); only two commits, 58b27e0 and ddefc3f, have ever touched it, and both are cosmetic changes.

### Local benchmarks (max_workers=10 for thread pool)
Bulk is 6-46x faster than the sequential loop and 3-9x faster than the thread pool. At N=100 (99.9% of orgs), bulk completes in <0.3s vs 13s for the loop.
The bulk approach is also the easiest to migrate away from if we move off `ProjectOption` in the future: the bulk query logic stays the same, and we'd just swap the model. A thread pool approach would require ripping out concurrency machinery.
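The reload-count argument behind the `_raw_delete` choice can be illustrated with a toy model. `FakeOptionStore` below is a hypothetical in-memory stand-in, not Sentry or Django code: its `delete` imitates a signal-driven delete that reloads once per deleted row, and its `raw_delete` imitates a delete that sends no signals.

```python
class FakeOptionStore:
    """Toy model of option rows plus a counter of cache reloads."""

    def __init__(self, rows: set[tuple[int, str]]) -> None:
        self.rows = set(rows)  # each row is (project_id, option_key)
        self.reload_calls = 0

    def reload_cache(self, project_id: int) -> None:
        self.reload_calls += 1

    def delete(self, project_ids: set[int], keys: set[str]) -> None:
        """Imitates a signal-driven delete: one reload per deleted row."""
        doomed = sorted(r for r in self.rows if r[0] in project_ids and r[1] in keys)
        for row in doomed:
            self.rows.discard(row)
            self.reload_cache(row[0])

    def raw_delete(self, project_ids: set[int], keys: set[str]) -> None:
        """Imitates a delete that sends no signals: no reloads."""
        self.rows = {r for r in self.rows if not (r[0] in project_ids and r[1] in keys)}


def count_reloads(num_projects: int, num_keys: int, use_raw_delete: bool) -> int:
    """Delete every option for every project, then reload each project once."""
    project_ids = set(range(num_projects))
    keys = {f"option_{i}" for i in range(num_keys)}  # hypothetical option keys
    store = FakeOptionStore({(p, k) for p in project_ids for k in keys})
    if use_raw_delete:
        store.raw_delete(project_ids, keys)
    else:
        store.delete(project_ids, keys)
    # The final per-project reload is needed either way, since the upsert sends no signal.
    for project_id in sorted(project_ids):
        store.reload_cache(project_id)
    return store.reload_calls
```

With 3 projects and 7 keys, the signal-driven path performs 21 reloads for the row deletions on top of the 3 final ones, while the raw-delete path performs only the 3 final reloads.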