feat: Add cron job to schedule indexer job#106377

Merged
shruthilayaj merged 7 commits into master from shruthi/feat/schedule-explorer-index-tasks
Jan 15, 2026

@shruthilayaj (Member) commented Jan 15, 2026

Adds a periodic task that schedules Seer Explorer indexing
for projects in organizations with the seer-explorer-index
feature flag enabled. The crontab fires every hour so that
the task still gets picked up if a deploy or scheduler restart
causes a missed window. The last_run timestamp is stored in the
cache so the actual scheduling work runs only once every 24 hours.
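
For illustration only, here is a minimal sketch of the "fire hourly, run once per 24 hours" gate described above. It is not the code in this PR: `FakeCache`, `should_run`, `simulate`, and the `RUN_FREQUENCY` value are hypothetical stand-ins, and a plain in-memory dict replaces the real cache backend.

```python
from datetime import datetime, timedelta, timezone

# Assumed value, mirroring "only run every 24 hours" in the description.
RUN_FREQUENCY = timedelta(hours=24)


class FakeCache:
    """In-memory stand-in for a cache backend (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def should_run(cache, now, key="last_run"):
    """Return True and record the run if the previous run is old enough."""
    last_run = cache.get(key)
    if last_run is not None and last_run > now - RUN_FREQUENCY:
        return False
    cache.set(key, now)
    return True


def simulate(hours, missed=()):
    """Fire the gate once per hour, skipping ticks listed in `missed`."""
    cache = FakeCache()
    start = datetime(2026, 1, 15, tzinfo=timezone.utc)
    runs = []
    for hour in range(hours):
        if hour in missed:
            continue  # e.g. a deploy or scheduler restart ate this tick
        if should_run(cache, start + timedelta(hours=hour)):
            runs.append(hour)
    return runs
```

In this sketch, `simulate(48)` runs at hours 0 and 24, while `simulate(48, missed={24})` runs at hours 0 and 25: the missed tick costs one hour rather than a full day, which is the motivation given for the hourly crontab.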

The main task batches projects and, for every batch of 100,
spawns a task that calls Seer's /v1/automation/explorer/index
endpoint with computed delays (same logic as statistical detectors).
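
As a rough sketch of the batching step, the snippet below splits project IDs into batches of 100 and assigns each batch a staggered delay. The delay formula here is an assumption chosen for illustration (an even spread over a fixed window); it is not taken from the statistical detectors code, and `DISPATCH_WINDOW_SECONDS`, `DELAY_STEP_SECONDS`, `batched`, `compute_delay`, and `plan_dispatch` are hypothetical names.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

BATCH_SIZE = 100                # from the PR description
DISPATCH_WINDOW_SECONDS = 3600  # hypothetical: spread work over one hour
DELAY_STEP_SECONDS = 60         # hypothetical: one slot per minute


def batched(items: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def compute_delay(
    batch_index: int,
    window: int = DISPATCH_WINDOW_SECONDS,
    step: int = DELAY_STEP_SECONDS,
) -> int:
    """Map a batch index to a delay in seconds, wrapping around the window."""
    slots = max(window // step, 1)
    return (batch_index % slots) * step


def plan_dispatch(project_ids: Iterable[int]) -> Iterator[Tuple[int, List[int]]]:
    """Yield (delay_seconds, batch) pairs; a real task would enqueue each one."""
    for index, batch in enumerate(batched(project_ids, BATCH_SIZE)):
        yield compute_delay(index), batch
```

With 250 project IDs this plan yields three batches of sizes 100, 100, and 50 with delays of 0, 60, and 120 seconds. Staggering the delays keeps the Seer endpoint from receiving every batch at the same instant.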

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jan 15, 2026
    },
    "seer-explorer-index": {
        "task": "seer:sentry.tasks.seer_explorer_index.schedule_explorer_index",
        "schedule": task_crontab("0", "*/1", "*", "*", "*"),
Member Author

Running it every hour, just to make sure the task gets picked up in case a deploy or scheduler restart causes a missed window. Storing last_run in cache so it's only run every 24 hours.

@shruthilayaj shruthilayaj marked this pull request as ready for review January 15, 2026 18:08
@shruthilayaj shruthilayaj requested review from a team and roaga January 15, 2026 18:08
if last_run and last_run > django_timezone.now() - EXPLORER_INDEX_RUN_FREQUENCY:
    return

cache.set(LAST_RUN_CACHE_KEY, django_timezone.now(), LAST_RUN_CACHE_TIMEOUT)
Contributor

Cache set before work completes blocks retries for 24 hours

Medium Severity

The cache.set() call at line 67 stores the last run timestamp before the actual work is performed (lines 75-76 where the generator is consumed). If any exception occurs during dispatch_explorer_index_projects() or get_seer_explorer_enabled_projects(), the cache entry is already set. Subsequent hourly runs will hit the early-return check at lines 63-65 and skip execution for ~24 hours, defeating the stated goal of running hourly "to make sure the task gets picked up in case a deploy or scheduler restart causes a missed window."
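
One possible way to address this, sketched under assumptions rather than taken from the PR, is to record the timestamp only after the work has finished without raising. `run_if_due`, `do_work`, and the dict-based cache below are hypothetical stand-ins.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict

# Assumed value, mirroring the 24-hour frequency discussed in this PR.
RUN_FREQUENCY = timedelta(hours=24)


def run_if_due(
    cache: Dict[str, datetime],
    now: datetime,
    do_work: Callable[[], None],
    key: str = "last_run",
) -> bool:
    """Run `do_work` if due; mark the run only when it completes."""
    last_run = cache.get(key)
    if last_run is not None and last_run > now - RUN_FREQUENCY:
        return False
    do_work()         # if this raises, the cache is left untouched
    cache[key] = now  # recorded only after success
    return True
```

With this ordering, a failure at one hourly tick leaves the gate open, so the next tick retries. The trade-off is that a slow or overlapping run is no longer blocked by the timestamp alone, so a real implementation might also need a lock or an "in progress" marker.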

@roaga (Contributor) left a comment

makes sense to me. are we making sure that we're not running both this and the cron job defined in seer?

@shruthilayaj (Member Author)

> makes sense to me. are we making sure that we're not running both this and the cron job defined in seer?

I've added an option to make sure this doesn't just start running when merged. I'll kill the seer job once I confirm this works!
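
A minimal sketch of that kind of opt-in gate is shown below. The option name `seer.explorer-index.enable`, the `maybe_schedule` helper, and the dict-based options store are all hypothetical; the actual option added in this PR is not shown in the conversation.

```python
from typing import Callable, Mapping

# Hypothetical option name, used only for this illustration.
ENABLE_OPTION = "seer.explorer-index.enable"


def maybe_schedule(options: Mapping[str, bool], schedule: Callable[[], None]) -> bool:
    """Run `schedule` only when the rollout option is explicitly enabled."""
    if not options.get(ENABLE_OPTION, False):
        return False  # default off: merging alone changes nothing
    schedule()
    return True
```

Defaulting the option to off means the new cron entry can ship while the existing Seer job keeps running, and the switch can be flipped once the new path is verified.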

@shruthilayaj shruthilayaj merged commit 78971e3 into master Jan 15, 2026
66 checks passed
@shruthilayaj shruthilayaj deleted the shruthi/feat/schedule-explorer-index-tasks branch January 15, 2026 20:35
@github-actions github-actions bot locked and limited conversation to collaborators Jan 31, 2026