REF_READ_ONLY_DATABASE: provider.configure() still writes to the DB at startup #31

@lewisjared

Split-off from #29.

Summary

When REF_READ_ONLY_DATABASE=true the API opens the DB via sqlite:///file:<path>?mode=ro&immutable=1&uri=true, but startup crashes before the first request because the provider registry still tries to register diagnostics.

DEBUG | climate_ref_core.providers:configure:82 - Configuring provider esmvaltool ...
...
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) attempt to write a readonly database
[SQL: INSERT INTO diagnostic (slug, name, provider_id, enabled) VALUES (?, ?, ?, ?) RETURNING id, created_at, updated_at]
[parameters: ('ozone-annual-cycle', 'Ozone Diagnostics', 1, 1)]

So REF_READ_ONLY_DATABASE today only changes the connection string; the startup code path still mutates the DB. Until this is fixed, deployments can't actually mount /ref read-only — which was the whole point of the flag.
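
The failure mode can be reproduced outside the application with the standard-library sqlite3 module. This is a minimal sketch, not the backend's code: the table layout and the helper names (make_db, write_error_on_read_only) are made up for illustration, and only the mode=ro&immutable=1 URI parameters are taken from the connection string above.

```python
import os
import sqlite3
import tempfile


def make_db() -> str:
    """Create a throwaway SQLite file with a simplified diagnostic table."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE diagnostic (id INTEGER PRIMARY KEY, slug TEXT, name TEXT)")
    conn.commit()
    conn.close()
    return path


def write_error_on_read_only(path: str) -> str:
    """Open the file read-only and return the error raised by an INSERT ("" if none)."""
    conn = sqlite3.connect(f"file:{path}?mode=ro&immutable=1", uri=True)
    try:
        conn.execute(
            "INSERT INTO diagnostic (slug, name) VALUES (?, ?)",
            ("ozone-annual-cycle", "Ozone Diagnostics"),
        )
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()
    return ""


if __name__ == "__main__":
    db_path = make_db()
    print(write_error_on_read_only(db_path))
    os.remove(db_path)
```

Any INSERT issued through such a connection fails in SQLite itself, so the flag can only work if startup issues no writes at all.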

Where

backend/src/ref_backend/core/ref.py::get_provider_registry:

def get_provider_registry(ref_config: Config) -> ProviderRegistry:
    database = get_database(ref_config)
    return ProviderRegistry.build_from_config(ref_config, database)

ProviderRegistry.build_from_config calls provider.configure(config) per provider (in climate_ref_core.providers), which upserts diagnostic rows.

Proposed fix

Pick one, not all:

  1. Skip registration in read-only mode. If ref_config.db.read_only (or the equivalent flag) is set, construct the registry by loading existing diagnostics out of the DB instead of calling provider.configure(). The API only needs to read what the CLI/workers registered; the writable path stays for ref providers setup and the workers.
  2. Split provider.configure() into two phases. A read path (hydrate the registry from the DB) and a write path (register/update diagnostic rows). The API uses the read path; workers + CLI continue to use the write path.
  3. Idempotent upsert that tolerates a read-only session. Weaker — it still writes on a fresh DB. Only viable combined with an "assume-already-registered" short-circuit when the session is read-only.

(1) is the cleanest: registration becomes an explicit operator action (ref providers setup), and the API stays a pure reader.
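
A rough sketch of the control flow behind option (1), using stand-in types rather than the real climate_ref classes. StubDatabase, upsert_diagnostic, load_diagnostics, and build_registry are all hypothetical names; the actual ProviderRegistry and database APIs may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class StubDatabase:
    """Hypothetical stand-in for the real database wrapper."""

    read_only: bool
    rows: dict[str, list[str]] = field(default_factory=dict)
    writes: int = 0

    def upsert_diagnostic(self, provider: str, slug: str) -> None:
        # Mimic SQLite: any write on a read-only database fails.
        if self.read_only:
            raise PermissionError("attempt to write a readonly database")
        slugs = self.rows.setdefault(provider, [])
        if slug not in slugs:
            slugs.append(slug)
        self.writes += 1

    def load_diagnostics(self, provider: str) -> list[str]:
        return list(self.rows.get(provider, []))


def build_registry(
    providers: dict[str, list[str]], database: StubDatabase
) -> dict[str, list[str]]:
    """Register diagnostics only when the database is writable."""
    registry: dict[str, list[str]] = {}
    for provider, slugs in providers.items():
        if not database.read_only:
            # Write path (CLI, workers): register or update diagnostic rows.
            for slug in slugs:
                database.upsert_diagnostic(provider, slug)
        # Read path (shared): hydrate the registry from what is in the DB.
        registry[provider] = database.load_diagnostics(provider)
    return registry
```

In this shape a read-only API process never reaches the write path, and a fresh read-only DB simply yields an empty registry instead of crashing.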

Acceptance

  • With a DB already populated by ref db migrate + ref providers setup, the API starts cleanly when:
    • REF_READ_ONLY_DATABASE=true
    • the /ref volume is mounted readOnly: true
  • Existing writable-mode behavior unchanged: ref providers setup still registers diagnostics, workers still register on first start.
  • Ideally: a test similar to tests/test_core/test_ref.py::test_get_database_read_only_rejects_writes that starts the full provider registry against a read-only DB and asserts no write is attempted.
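
One way such a test could detect write attempts, sketched with the standard-library sqlite3 module rather than the project's SQLAlchemy session: install an authorizer that records every INSERT, UPDATE, or DELETE that SQLite is asked to compile. record_write_attempts is a hypothetical helper and the diagnostic table is a simplified stand-in; a real test would build the provider registry instead of running the raw statements shown here.

```python
import sqlite3

# Authorizer action codes that correspond to row-level writes.
WRITE_ACTIONS = {sqlite3.SQLITE_INSERT, sqlite3.SQLITE_UPDATE, sqlite3.SQLITE_DELETE}


def record_write_attempts(conn: sqlite3.Connection) -> list:
    """Record (action, table) for each write statement compiled on conn."""
    attempts: list = []

    def authorizer(action, arg1, arg2, db_name, source):
        if action in WRITE_ACTIONS:
            attempts.append((action, arg1))
        return sqlite3.SQLITE_OK  # observe only, never block

    conn.set_authorizer(authorizer)
    return attempts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diagnostic (id INTEGER PRIMARY KEY, slug TEXT)")
# Install the recorder after schema setup so the CREATE TABLE is not counted.
attempts = record_write_attempts(conn)

# A read-style startup path should leave the list empty.
conn.execute("SELECT slug FROM diagnostic").fetchall()
after_read = list(attempts)

# A registration-style statement shows up as a recorded attempt.
conn.execute("INSERT INTO diagnostic (slug) VALUES (?)", ("ozone-annual-cycle",))
after_write = list(attempts)
```

Asserting that the recorded list is empty after startup checks "no write is attempted" directly, rather than relying on the read-only connection to raise.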

Context

Deployed chart: climate-ref-aft 0.1.0 (PR Climate-REF/climate-ref-aft#7), image ghcr.io/climate-ref/climate-ref-frontend:v0.3.0. Full trail in #29.
