Skip to content

add(observation): db pool flushes observation#4668

Open
mkleczek wants to merge 3 commits intoPostgREST:mainfrom
mkleczek:pool-flush-obs
Open

add(observation): db pool flushes observation#4668
mkleczek wants to merge 3 commits intoPostgREST:mainfrom
mkleczek:pool-flush-obs

Conversation

@mkleczek
Copy link
Contributor

Emit a dedicated PoolFlushed observation when the DB pool is released during schema cache reload.

Comment on lines +100 to +101
obsChan <- newChan
appState <- AppState.initWithPool sockets pool config loggerState metricsState (Metrics.observationMetrics metricsState <> writeChan obsChan)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, so we use the newChan instead of logging to stdout/stderr right?

What would be the advantage of this over the python tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, so we use the newChan instead of logging to stdout/stderr right?

Haskell makes it so easy to add new observers (Observation -> IO () is a Semigroup) that it is too tempting not to use it :)

What would be the advantage of this over the python tests?

More seriously (this is a mix of more or less subjective opinions):

It is not really clear to me what is the project policy on "what tests should go where". My take on this is that I really think it is quite a bad idea to base your business logic tests on debug output.

I spent a considerable amount of time chasing failing io tests after adding this new observation and was really annoyed with the experience. So decided to move to HSpec tests instead.

I think HSpec tests are really well suited for logic testing. They are type safe and make refactoring easy. Also, reading observations as a stream of typed messages in tests (instead of untyped string debug output) makes the tests less brittle and more readable IMHO.

Chan is a simple but powerful concurrency primitive and, IMHO, perfect fit for testing asynchronous messaging scenarios. See https://github.com/PostgREST/postgrest/pull/4668/changes#diff-0f4d50d33174e92e7c2b8121202cbcb7923360111854c13c2f4baa9eadb80602R44 in this PR. Thanks to some higher level functions based on Chan and dupChan we have a simple DSL to validate observations that also provides nice error messages that simplify debugging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a considerable amount of time chasing failing io tests after adding this new observation and was really annoyed with the experience. So decided to move to HSpec tests instead.

Agree, I've had the same experience before.

My take on this is that I really think it is quite a bad idea to base your business logic tests on debug output.
Also, reading observations as a stream of typed messages in tests (instead of untyped string debug output) makes the tests less brittle and more readable IMHO.

Agree too. I like the direction overall.


Sometimes we've had cases where the ordering of the log messages change and that was reflected in tests. Say if we decided to move all logging python tests to Haskell, would a change in the order be detected too? (order does matter in some cases, for clarity)

Copy link
Contributor Author

@mkleczek mkleczek Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes we've had cases where the ordering of the log messages change and that was reflected in tests. Say if we decided to move all logging python tests to Haskell, would a change in the order be detected too? (order does matter in some cases, for clarity)

That depends on how observations order is related to debug log entries order. They should be the same but we don't test it in HSpec tests - we only read observations from a Chan.
There is a simple helper function waitFor which reads from the Chan skipping messages until it reads the message meeting provided criteria (or hangs waiting for an observation (or timeout) if none available). I think it is sufficient primitive for most cases.
You can test if required observations happen in right order (but possibly with other observations in between), eg:

        waitFor (2 * sec) "PoolFlushed 2" $ \x -> [ o | o@PoolFlushed <- pure x ]
        waitFor (1 * sec) "SchemaCacheLoadedObs" $ \x -> [ o | o@SchemaCacheLoadedObs{} <- pure x ]

This will skip SchemaCacheQueriedObs in between.

If necessary we can implement non-lenient validation.

Let's move this discussion to #4671 (comment)

Some helpers are provided for introspecting metrics already (used in JWT cache tests). This change provides facilities to additionally validate emited Observation events.
A new Spec module is also implemented, adding basic tests of schema cache reloading - their main goal is to excercise the new infrastructure.
@mkleczek mkleczek force-pushed the pool-flush-obs branch 3 times, most recently from 9d07481 to 3725ed3 Compare March 2, 2026 22:17
This change introduces a match_log function, that implements lenient matching of log output lines. Tests of log output are retrofited to use the function.
Emit a dedicated PoolFlushed observation when the DB pool is released during schema cache reload.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants