Harden search cache and add pg_stat_statements/pg_cron support #440

Merged
bitner merged 26 commits into main from v010-pr1-hash-and-dead-code-rerun
May 13, 2026

@bitner (Collaborator) commented May 12, 2026

Summary

This branch tightens PgSTAC search cache behavior, improves concurrency safety around search stats, and adds first-class support for pg_stat_statements and pg_cron in the pgstac Docker image and test flow.

What changed

  • Reworked search cache handling to use a canonical where-clause hash and better concurrency controls when creating and touching cached searches.
  • Added search stats refresh support so updatestats flows through cached and uncached search paths and keeps numberMatched / context counts current.
  • Added columns and tooling to save named queries, to pin queries so that TTL-based cleanup does not remove them, and to garbage-collect queries that are not recently used, named, or pinned.
  • Added a garbage collection function for searches (it must still be invoked by a scheduler such as pg_cron).
  • Added pg_stat_statements and pg_cron to the pgstac image, enabled them through shared_preload_libraries, and initialized them during container bootstrap.
  • Added smoke tests for both extensions in scripts/container-scripts/test.
  • Updated pgtap/basic SQL coverage around search, token handling, readonly behavior, and related cache behavior.
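
The canonical where-clause hash mentioned in the first bullet can be illustrated with a minimal sketch. This is not the PgSTAC implementation, which lives in SQL. `canonical_where` and `where_hash` are hypothetical names, and both the normalization (trimming and collapsing whitespace) and the choice of SHA-256 are assumptions made for illustration. The point is only that where clauses differing in incidental formatting map to one cache key, while different clauses map to different keys.

```python
import hashlib
import re


def canonical_where(where: str) -> str:
    """Hypothetical normalization: trim and collapse runs of whitespace.

    Deliberately naive: it also collapses whitespace inside quoted
    literals, which a real canonicalizer would need to preserve.
    """
    return re.sub(r"\s+", " ", where.strip())


def where_hash(where: str) -> str:
    """Hash the canonical form so equivalent clauses share one cache key."""
    return hashlib.sha256(canonical_where(where).encode("utf-8")).hexdigest()


# Same predicate, different formatting: expected to share a key.
a = where_hash("collection = 'landsat'  AND   datetime >= '2020-01-01'")
b = where_hash("  collection = 'landsat' AND\n datetime >= '2020-01-01' ")
# Different predicate: expected to get a different key.
c = where_hash("collection = 'sentinel' AND datetime >= '2020-01-01'")
```

Hashing the generated where clause rather than the raw request means two requests that compile to the same filter can reuse one cached search row.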

Why

The search-cache work aims to reduce collision risk, avoid stale or inconsistent stats under concurrent requests, and make the search implementation easier to maintain by keeping hashing and cache logic in the same SQL module. We also want to be able to clear cached searches that we do not need to keep addressable by name or hash (for example, for titiler-pgstac integration).
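
The cleanup rule described above can be sketched as follows, under stated assumptions. The real garbage collection is a SQL function whose name and signature are not given in this description, so `CachedSearch`, its fields, and `gc_candidates` are hypothetical. The sketch assumes that pinned and named searches are kept, and that any other search unused for longer than a TTL is eligible for removal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class CachedSearch:
    """Hypothetical stand-in for one row of the search cache."""
    search_hash: str
    lastused: datetime
    name: Optional[str] = None
    pinned: bool = False


def gc_candidates(
    searches: List[CachedSearch], ttl: timedelta, now: datetime
) -> List[str]:
    """Return hashes of searches that are unpinned, unnamed, and idle longer than ttl."""
    cutoff = now - ttl
    return [
        s.search_hash
        for s in searches
        if not s.pinned and s.name is None and s.lastused < cutoff
    ]


now = datetime(2026, 5, 13, tzinfo=timezone.utc)
searches = [
    CachedSearch("aaa", now - timedelta(days=30)),                 # stale, anonymous
    CachedSearch("bbb", now - timedelta(days=30), name="mosaic"),  # stale but named
    CachedSearch("ccc", now - timedelta(days=30), pinned=True),    # stale but pinned
    CachedSearch("ddd", now - timedelta(hours=1)),                 # recently used
]
stale = gc_candidates(searches, ttl=timedelta(days=7), now=now)
```

In a deployment the equivalent SQL function would be run on a schedule, for example by pg_cron, since nothing calls it automatically.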

The extension work makes the standard test image closer to the production runtime and gives us direct verification that both extensions are loaded and usable, rather than assuming the container configuration is correct.

bitner and others added 18 commits May 5, 2026 17:00
Co-authored-by: Pete Gadomski <pete.gadomski@gmail.com>
Co-authored-by: Pete Gadomski <pete.gadomski@gmail.com>
…ions

- Update pgstac-migrate pyproject.toml to require pgpkg>=0.1.1 (includes routine body-change detection)
- Regenerate migrations with pgpkg 0.1.1 which correctly includes search/search_query replacements
- Suppress unsafe DROP FUNCTION statements for routines that exist in target schema
- Fix PGTap test 116 to check column names in alphabetical order (migration adds columns at end)
- Update test plan count from 229 to 248 (tests added for GC, context_count, statslastupdated)
- Validate migration chain end-to-end with all tests passing
- All precommit hooks passing (migrations, pgtap, pypgstac)
- expand pgstac-migrate README with full CLI/API/env var docs and troubleshooting
- make psycopg[binary] mandatory in pgstac-migrate and pypgstac
- make psycopg-pool mandatory in pypgstac
- remove redundant psycopg optional/group wiring and update test script flags
- remove pgstac-migrate upper bound in pypgstac dependency
- update release workflow paths and uv setup/build step
- refresh docs/changelog references for pgpkg>=0.1.1
- regenerate uv lockfiles
…ash-and-dead-code-rerun

# Conflicts:
#	src/pgstac-migrate/pyproject.toml
#	src/pgstac-migrate/uv.lock
…d-code-rerun

# Conflicts:
#	.github/instructions/scripts.instructions.md
#	.gitignore
#	AGENTS.md
#	CLAUDE.md
#	src/pgstac/migrations/pgstac--0.9.11--unreleased.sql
@bitner bitner marked this pull request as ready for review May 12, 2026 19:20
@bitner bitner requested review from gadomski and hrodmn May 12, 2026 19:20
@gadomski (Member) left a comment

Do we need to update the best-practices docs to tell folks to set up a cronjob to clean up searches?

@bitner (Collaborator, Author) commented May 13, 2026

> Do we need to update the best-practices docs to tell folks to set up a cronjob to clean up searches?

At the end of this series of PRs, the docs are going to need a big cleanup, which will cover several "cron ready" functions/procedures that I'd rather document all together.

@bitner bitner merged commit 1b810ed into main May 13, 2026
13 checks passed