Closed
Labels
backend, enhancement, priority:low
Description
Problem
The nightly cron only captures campaigns sorted by newest, so any campaign that was already live before the system was first deployed is never ingested (it will have scrolled far past page 10 in the newest listing). Users who set keyword alerts on day 1 will get no matches for campaigns that have been running for weeks.
Expected Behaviour
A one-time (or periodic catch-up) backfill run that fetches campaigns across all sort orders and deeper page depths to seed the database with historically active campaigns.
Proposed Fix
- Add a `/admin/backfill` endpoint (or a CLI flag) that triggers a deep crawl:
  - All 15 root categories (or subcategories once #12, "feat: crawl subcategories in nightly cron (currently only 15 root categories)", is resolved)
  - Sort orders: `magic`, `end_date`, `most_backed`
  - Page depth: 25–50 pages per category
- Run once after deploy; subsequent nightly crons maintain freshness
- Rate-limit to avoid hammering ScrapingBee (existing `RateLimiter` can be reused)
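The deep crawl described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `plan_backfill`, `run_backfill`, `CrawlJob`, and `MinIntervalLimiter` are hypothetical names, the limiter is a stand-in because the interface of the existing `RateLimiter` is not shown in this issue, and `fetch_page` is a placeholder for whatever function the nightly cron already uses to fetch one listing page through ScrapingBee.

```python
import itertools
import time
from typing import Callable, Iterable, Iterator, NamedTuple, Optional

# Sort orders named in the proposal.
SORT_ORDERS = ("magic", "end_date", "most_backed")


class CrawlJob(NamedTuple):
    """One listing page to fetch (hypothetical structure)."""
    category: str
    sort: str
    page: int


def plan_backfill(
    categories: Iterable[str],
    sorts: Iterable[str] = SORT_ORDERS,
    max_pages: int = 25,
) -> Iterator[CrawlJob]:
    """Yield every (category, sort, page) combination, pages starting at 1."""
    for category, sort in itertools.product(categories, sorts):
        for page in range(1, max_pages + 1):
            yield CrawlJob(category, sort, page)


class MinIntervalLimiter:
    """Stand-in for the existing RateLimiter, whose API is an unknown here.

    Enforces a minimum delay between consecutive calls to wait().
    """

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        if self._last is not None:
            remaining = self._min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


def run_backfill(
    categories: Iterable[str],
    fetch_page: Callable[[CrawlJob], None],
    limiter: MinIntervalLimiter,
    max_pages: int = 25,
) -> int:
    """Fetch every planned page, rate-limited; return the number of requests."""
    requests_made = 0
    for job in plan_backfill(categories, max_pages=max_pages):
        limiter.wait()
        fetch_page(job)  # placeholder for the real ScrapingBee-backed fetch
        requests_made += 1
    return requests_made
```

Keeping the plan as a generator separate from the runner means the same job list could back either the `/admin/backfill` endpoint or a CLI flag, and the request count can be checked before any credits are spent.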
Notes
- One-time cost estimate: 15 categories × 3 sorts × 25 pages = 1,125 ScrapingBee requests × 5 credits = 5,625 credits (< 3% of monthly allowance)
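The estimate can be reproduced, and extended to the upper end of the proposed page depth, with a few lines of arithmetic. The function name `backfill_cost` is hypothetical; the figure of 5 credits per request is the one quoted in this issue, not something verified against ScrapingBee pricing.

```python
def backfill_cost(
    categories: int = 15,
    sorts: int = 3,
    pages: int = 25,
    credits_per_request: int = 5,  # figure quoted in this issue
) -> tuple[int, int]:
    """Return (requests, credits) for one full backfill run."""
    requests = categories * sorts * pages
    return requests, requests * credits_per_request


print(backfill_cost())          # (1125, 5625)
print(backfill_cost(pages=50))  # (2250, 11250)
```

At the 50-page depth the cost doubles to 11,250 credits, so the share of the monthly allowance roughly doubles as well.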