Add one-time startup task system (PP-3684) #3035
Conversation
This is very cool. Great idea. I love the Alembic-inspired design.
Codecov Report

@@ Coverage Diff @@
## main #3035 +/- ##
========================================
Coverage 93.04% 93.05%
========================================
Files 480 482 +2
Lines 43716 43869 +153
Branches 6027 6045 +18
========================================
+ Hits 40677 40823 +146
- Misses 1968 1972 +4
- Partials 1071 1074 +3
Codecov Report

@@ Coverage Diff @@
## main #3035 +/- ##
==========================================
+ Coverage 93.16% 93.18% +0.01%
==========================================
Files 487 489 +2
Lines 44787 44943 +156
Branches 6173 6191 +18
==========================================
+ Hits 41726 41880 +154
- Misses 1985 1986 +1
- Partials 1076 1077 +1
dbernstein left a comment
Looks good. A couple of minor comments to consider as you see fit.
```python
    dispatched_task_id = async_result.id
    _record_task(session, key, state=StartupTaskState.RUN)
except Exception:
    logger.exception("Failed to execute startup task %r.", key)
```
If this task fails to start (or at least to be queued in Celery), is it correct to say that the next time this routine runs, the tasks that failed would still be pending and would be run again?
Perhaps we'll need a new CloudWatch alarm or some other mechanism to let us know if pending task failures accumulate?
That's right. A task that fails to run won't stop the startup, but it will emit a log message.
That message goes to migrate.log, which shows up in Ansible in the Read migration log task. So if there is a failure, we should see it when we do the upgrade.
You're right, though: a CloudWatch alert is a good idea once we automate those upgrades.
Introduce a general-purpose registry for one-time startup tasks that are automatically discovered and dispatched to Celery on the first application start after deployment. This is useful for data backfills, re-imports, or reindexing jobs that are too long-running for database migrations.

- Add a StartupTask SQLAlchemy model and migration for tracking queued tasks
- Add an auto-discovery system that scans startup_tasks/ for Python files defining a create_signature() callable
- Add a StartupTaskRunner integrated into InstanceInitializationScript, protected by the existing advisory lock
- On fresh database installs, stamp tasks without dispatching them (there is no existing data to migrate)
- Add a bin/create_startup_task scaffolding command
- Include an example task that force-harvests OPDS for Distributors collections
- Revert bin/util/initialize_instance to pass config_file instead of repo_root, matching the InstanceInitializationScript constructor
- Resolve the tasks_dir=None default to STARTUP_TASKS_DIR in run_startup_tasks and stamp_startup_tasks
- Fix test_initialization.py tests to match the actual API signatures and return-value semantics of initialize_database and run_startup_tasks
- Remove tests for nonexistent error-handling behavior
- Update the run_startup_tasks docstring to accurately describe error handling
- Record task execution in the same transaction as the task itself to prevent duplicate Celery dispatches on crash
- Rename the queued_at column to recorded_at
- Replace the run boolean column with a state enum (RUN, MARKED)
- Fix the StartupTaskCallable type alias to include the logger parameter
- Fix test task module signatures to match the actual 3-arg call convention
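The model changes above (the recorded_at rename and the RUN/MARKED state enum) could be sketched roughly as follows. This is a hypothetical reconstruction: the class, column, and enum-value names are inferred from the commit messages, not copied from the PR.

```python
# Hypothetical sketch of the StartupTask model described above.
import enum
from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, Enum, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class StartupTaskState(enum.Enum):
    RUN = "run"        # the task was actually executed/dispatched
    MARKED = "marked"  # stamped without running (fresh database install)


class StartupTask(Base):
    __tablename__ = "startup_tasks"

    id = Column(Integer, primary_key=True)
    # One row per task file; the key is derived from the filename.
    key = Column(String, unique=True, nullable=False)
    state = Column(Enum(StartupTaskState), nullable=False)
    # Renamed from queued_at in a later commit.
    recorded_at = Column(
        DateTime(timezone=True),
        nullable=False,
        default=lambda: datetime.now(timezone.utc),
    )
```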
Move apply_async() inside the transaction so broker failures roll back the task record, allowing retry on next startup. Add enum type cleanup to the migration downgrade to match project conventions. Add migration tests and a Celery dispatch retry test.
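The rollback-on-broker-failure behavior can be illustrated with a self-contained sketch. FakeSession, dispatch_startup_task, and the signature classes below are stand-ins for illustration, not the PR's actual code:

```python
from contextlib import contextmanager


class FakeSession:
    """Toy stand-in for a database session with commit/rollback."""

    def __init__(self):
        self.records = []   # committed rows
        self._pending = []  # rows added in the open transaction

    @contextmanager
    def begin(self):
        self._pending = []
        try:
            yield self
            self.records.extend(self._pending)  # commit on success
        finally:
            self._pending = []  # rollback discards anything uncommitted

    def add(self, row):
        self._pending.append(row)


def dispatch_startup_task(session, key, signature, errors):
    """Record the task and dispatch it in one transaction.

    If apply_async() raises (e.g. the broker is down), the exception
    escapes the begin() block before commit, so the record is rolled
    back and the task stays pending for the next startup.
    """
    try:
        with session.begin():
            session.add(key)         # the PR's _record_task(...)
            signature.apply_async()  # may raise on broker failure
        return True
    except Exception:
        errors.append(f"Failed to execute startup task {key!r}.")
        return False


class OkSignature:
    def apply_async(self):
        pass


class BrokenSignature:
    def apply_async(self):
        raise ConnectionError("broker unreachable")


session, errors = FakeSession(), []
assert dispatch_startup_task(session, "backfill", OkSignature(), errors)
assert not dispatch_startup_task(session, "reindex", BrokenSignature(), errors)
print(session.records)  # → ['backfill']: only the dispatched task was recorded
```

The key design point is ordering: because the dispatch happens before the commit, a broker failure can never leave a recorded-but-never-dispatched row behind.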
Move the run-vs-stamp branching from initialization.py into startup.py behind a single run_startup_tasks() function with an already_initialized parameter. Trim the module docstring to a README pointer. Add slug length truncation to create_startup_task.
The glob results are already iterated in sorted order, so the dict is built in key order and the final sorted() call was unnecessary.
- Fix a template-rendering crash on descriptions containing braces by using str.replace instead of str.format
- Rename the ambiguous local variable db_initialized to already_initialized in the initialization script
- Document that startup tasks block application startup and should dispatch heavy work via Celery
- Fix a docstring to say "sorted by filename" instead of "sorted by key"
- Replace DatabaseTransactionFixture + Session patching with function_database for realistic engine-based test isolation
- Extract RunStartupTasksFixture to share mocked discover_startup_tasks and services across all tests
- Merge a duplicate failure test into test_run_handles_failure_gracefully
- Remove the _engine context manager and related imports
Replace str.replace with a Jinja2 template for the startup task scaffolding to avoid ambiguity with brace-style placeholders. Also add an else branch so non-Celery tasks log "Executed" while Celery tasks only log "dispatched", avoiding a redundant double log.
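The reason Jinja2 is safer here: str.format treats every brace pair as a placeholder, and str.replace is ambiguous if the description happens to contain the placeholder text, while Jinja2 only interprets its {{ ... }} delimiters. A minimal sketch, assuming a stub template (the template text below is illustrative, not the project's actual scaffolding template):

```python
from jinja2 import Template

# Illustrative stub template. Literal braces in the rendered
# description are safe because Jinja2 only expands {{ ... }}.
STUB = Template(
    '"""{{ description }}"""\n'
    "\n"
    "\n"
    "def run(services, session, log):\n"
    "    raise NotImplementedError\n"
)


def render_startup_task_stub(description: str) -> str:
    return STUB.render(description=description)


print(render_startup_task_stub("Backfill {weird} identifiers"))
```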
Replace per-task "already executed; skipping" log lines with a single summary count to avoid spamming logs on every startup as tasks accumulate.
Fix grammar, correct the example filename format, remove the code example with an incorrect function signature, and simplify the documentation.
…PP-3684) (#3055)

## Description

> Note: #3035 needs to get merged before this can go in.

Adds a startup task that dispatches a forced re-harvest of all OPDS for Distributors collections on the next deployment.

## Motivation and Context

After streaming media support landed (PR #3015), existing OPDS for Distributors collections need to be re-imported with the new parsing logic to pick up the changes.

Resolves PP-3684

## How Has This Been Tested?

The startup task follows the established pattern and delegates to the existing `import_all` Celery task with `force=True`. The `import_all` task is already covered by existing tests.

## Checklist

- [x] I have updated the documentation accordingly.
- [x] All new and existing tests passed.
Description
Add a one-time startup task system that auto-discovers and executes tasks on the first application start after deployment. This provides a clean mechanism for post-deployment work that doesn't belong in a database migration. The main purpose of this system is to queue Celery tasks, but it can also do other necessary work, such as cache invalidations.
Each task is a Python file in the top-level `startup_tasks/` directory that defines a `run(services, session, log)` callable. Tasks can perform inline work with the database/Redis/search or return a Celery `Signature` to dispatch heavy work asynchronously. On each container start, the initialization script discovers all task files, checks the `startup_tasks` database table for previously executed keys, and runs any new ones. The process is idempotent: each task runs only once.

Motivation and Context

After deploying new code (e.g. adding streaming media support), we sometimes need to run one-time tasks such as force-harvesting all collections. Alembic migrations handle schema changes but aren't suited for queuing long-running async Celery work or performing post-deployment operations that need access to the full services container. This system provides a clean, self-documenting mechanism for these post-deployment tasks.
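Under the convention described above, a task file might look like the following sketch. The filename, the `services.celery.force_reharvest_signature()` accessor, and the helper classes are hypothetical; only the `run(services, session, log)` signature and the return-a-Signature convention come from this description:

```python
# startup_tasks/force_reharvest_collections.py  (hypothetical example)


def run(services, session, log):
    """Runs once, on the first application start after deployment.

    Do quick inline work here, or return a Celery Signature so the
    heavy lifting happens asynchronously in a worker.
    """
    log.info("Dispatching forced re-harvest of all collections.")
    # Hypothetical accessor: delegate long-running work to an existing
    # Celery task instead of blocking application startup.
    return services.celery.force_reharvest_signature()
```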
PP-3684
How Has This Been Tested?
- Unit tests cover discovery (`TestDiscoverStartupTasks`), execution (`TestRunStartupTasks`), and scaffolding (`TestCreateStartupTask`)
- Cases include missing `run` attributes, import errors, underscore-prefixed file skipping, a nonexistent directory, executing new tasks, Celery signature dispatch, skipping already-executed tasks, failure handling, idempotency, stamp-only mode for fresh installs, and CLI edge cases
- Tests in `test_initialization.py` verify call ordering and `already_initialized` forwarding

Checklist