pdo_pgsql_pool_killed_concurrent #114

@EdmondDantes

Summary

The test tests/pdo_pgsql/029-pdo_pgsql_pool_killed_concurrent.phpt intermittently fails on CI with this diff:

```
- Pool count: 1
+ Pool count: 2
  Done
```

After `pg_terminate_backend`, the pool count is expected to drop from 2 to 1 within the polling window, but sometimes the count stays at 2.

Latest occurrence

Prior history of the same test

The polling-window extension was tried and reverted, so the underlying race is not "polling too short" — there is something else going on.

Hypothesis

After `pg_terminate_backend(pid)`:

  • On its next syscall on the killed connection, libpq must observe an `SSL_read` / `recv` failure; the pool's broken-connection detector must then mark the slot as broken and decrement the count.
  • If the killed coroutine is parked in a different state (e.g., not actively reading, or the next event loop tick hasn't yet processed the terminated socket), `pool->count()` reports the stale value.

Possibly the broken-connection callback isn't wired to fire when the connection is idle in the pool (only when actively doing IO), so until somebody tries to use that slot, it isn't released.
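The suspected failure mode can be illustrated outside PHP with a toy model. The sketch below is only an analogy, assuming plain sockets instead of libpq connections: `IdlePool`, `reap_broken`, and `is_closed_by_peer` are hypothetical names, not part of the extension, and closing the peer socket stands in for `pg_terminate_backend`. It shows that a counter which does no I/O keeps reporting the stale value until something actually inspects the idle socket.

```python
import select
import socket


def is_closed_by_peer(conn):
    """Return True if the peer has closed the connection (EOF is pending)."""
    # Zero-timeout select: do not block, just ask whether the fd is readable.
    readable, _, _ = select.select([conn], [], [], 0)
    if not readable:
        return False
    try:
        # Peeking zero bytes from a readable stream socket means EOF.
        return conn.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True


class IdlePool:
    """Toy pool of idle sockets (hypothetical stand-in for the real pool)."""

    def __init__(self, conns):
        self.conns = list(conns)

    def count(self):
        # No I/O happens here, so a dead peer goes unnoticed.
        return len(self.conns)

    def reap_broken(self):
        """Drop idle sockets whose peer is gone; return how many were dropped."""
        alive = []
        for conn in self.conns:
            if is_closed_by_peer(conn):
                conn.close()
            else:
                alive.append(conn)
        dropped = len(self.conns) - len(alive)
        self.conns = alive
        return dropped


def demo():
    client_a, server_a = socket.socketpair()
    client_b, server_b = socket.socketpair()
    pool = IdlePool([client_a, client_b])

    server_b.close()              # stands in for pg_terminate_backend
    stale = pool.count()          # nothing has touched the socket yet
    dropped = pool.reap_broken()  # explicit health check on idle slots
    fresh = pool.count()

    client_a.close()
    server_a.close()
    return stale, dropped, fresh


if __name__ == "__main__":
    print(demo())  # (2, 1, 1)
```

If the real pool behaves like `count()` here, the test's polling loop can spin for its whole window without the count ever changing, which would be consistent with the longer window not helping.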

Repro

```bash
cd ext/async
# Run the test 30 times and print only the PASS/FAIL line of each run.
for i in {1..30}; do
  /usr/local/bin/php-cgi ../../run-tests.php -P -q -j4 \
    tests/pdo_pgsql/029-pdo_pgsql_pool_killed_concurrent.phpt 2>&1 \
    | grep -E "PASS|FAIL"
done
```

Suggested next steps

  1. Reproduce locally under load (the failure surfaces on CI runners more often than on dev boxes — possibly CPU-starved scheduling).
  2. Inspect what triggers the slot release: is it the broken-connection callback firing on the libpq fd, or a polling check?
  3. If callback-driven: ensure the callback is registered while the slot is idle, not only while a coroutine holds it.
  4. If polling: investigate whether the 5s window from `libuv reactor: getaddrinfo result leaks when consumer is cancelled before disposal #111` was actually long enough (the revert may have been premature).
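For step 3, one possible shape of the fix is to keep a read watcher registered for as long as a slot sits idle, and to remove it only while a coroutine holds the connection. The sketch below models that with Python's `selectors` module on plain sockets. It is an assumption about the design, not the extension's actual code: `WatchedPool`, `release`, `acquire`, and `tick` are hypothetical names, and `tick` stands in for one event-loop iteration.

```python
import selectors
import socket


class WatchedPool:
    """Toy pool that watches idle sockets for EOF (hypothetical design)."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.idle = set()

    def release(self, conn):
        # Returning a connection to the pool registers the read watcher.
        self.idle.add(conn)
        self.selector.register(conn, selectors.EVENT_READ)

    def acquire(self):
        # Borrowing a connection hands I/O responsibility to the caller.
        conn = self.idle.pop()
        self.selector.unregister(conn)
        return conn

    def count(self):
        return len(self.idle)

    def tick(self, timeout=0):
        """One loop iteration: evict idle sockets that report EOF."""
        evicted = 0
        for key, _events in self.selector.select(timeout):
            conn = key.fileobj
            try:
                dead = conn.recv(1, socket.MSG_PEEK) == b""
            except ConnectionError:
                dead = True
            if dead:
                self.selector.unregister(conn)
                self.idle.discard(conn)
                conn.close()
                evicted += 1
        return evicted


def demo():
    pool = WatchedPool()
    client_a, server_a = socket.socketpair()
    client_b, server_b = socket.socketpair()
    pool.release(client_a)
    pool.release(client_b)

    server_b.close()        # stands in for pg_terminate_backend
    before = pool.count()   # the loop has not run yet
    evicted = pool.tick()   # watcher fires on the idle slot
    after = pool.count()

    pool.selector.close()
    client_a.close()
    server_a.close()
    return before, evicted, after


if __name__ == "__main__":
    print(demo())  # (2, 1, 1)
```

With this arrangement the count drops on the first loop iteration after the backend dies, with no borrower involved. If the extension already works this way, the remaining suspect is the timing of that iteration on CPU-starved runners.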

Labels

bug (Something isn't working)

Status

Done
