`TokenAccessStorage::authenticateImpl` was holding its own recursive
mutex while calling `AccessChangesNotifier::sendNotifications`, which
takes the notifier's mutex and then dispatches subscribers -- one of
which (`processRoleChange`, subscribed to `Role` changes from the same
storage) takes the storage's mutex back. The reverse order is taken by a
concurrent `CREATE ROLE`, which first goes through `sendNotifications`
(notifier mutex) and then into `processRoleChange` (storage mutex).
TSan flagged the resulting lock-order inversion and the server aborted
on first authentication, breaking all four `test_token_roles_mapping`
tests in the `Integration tests (amd_tsan, 2/6)` shard.
Switch the `lock_guard` to a `unique_lock` and explicitly `unlock` it
right before draining notifications. The drain still has to happen --
it's why this call was added in the first place -- but it doesn't need
to run under the storage mutex, and dropping the lock breaks the cycle.
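
A minimal sketch of the pattern, not the actual patch: `SketchNotifier` and `SketchStorage` are hypothetical stand-ins for `AccessChangesNotifier` and `TokenAccessStorage`, and their members are assumptions made for illustration. It shows only the ordering: finish the work that needs the storage mutex, `unlock`, then drain.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the notifier: it holds its own mutex
// while it dispatches subscribers.
class SketchNotifier
{
public:
    void subscribe(std::function<void()> handler)
    {
        std::lock_guard<std::mutex> lock(mutex);
        handlers.push_back(std::move(handler));
    }

    void sendNotifications()
    {
        std::lock_guard<std::mutex> lock(mutex);
        for (auto & handler : handlers)
            handler();
    }

private:
    std::mutex mutex;
    std::vector<std::function<void()>> handlers;
};

// Hypothetical stand-in for the storage: its subscriber takes the
// storage mutex, so the storage must not hold that mutex while draining.
class SketchStorage
{
public:
    explicit SketchStorage(SketchNotifier & notifier_) : notifier(notifier_)
    {
        notifier.subscribe([this] { processRoleChange(); });
    }

    bool authenticate()
    {
        std::unique_lock<std::recursive_mutex> lock(mutex);
        ++authentications;   // work that needs the storage mutex
        lock.unlock();       // release before touching the notifier mutex

        // Lock order here is notifier -> storage, the same order a
        // concurrent sendNotifications() caller would use.
        notifier.sendNotifications();
        return true;
    }

    int roleChangesSeen()
    {
        std::lock_guard<std::recursive_mutex> lock(mutex);
        return role_changes;
    }

private:
    void processRoleChange()
    {
        std::lock_guard<std::recursive_mutex> lock(mutex);
        ++role_changes;
    }

    SketchNotifier & notifier;
    std::recursive_mutex mutex;
    int authentications = 0;
    int role_changes = 0;
};
```

With a `lock_guard` in `authenticate`, the same call would acquire storage then notifier, which is the inverted order TSan reported.
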
Addresses the 4 failing tests in `Integration tests (amd_tsan, 2/6)` on
#1803. After this fix the still-failing set shrank from 4 to 0.
CI report:
https://altinity-build-artifacts.s3.amazonaws.com/json.html?PR=1803&sha=34750777806dc0e657e750822401efc331b74ef3&name_0=PR&name_1=Integration%20tests%20%28amd_tsan%2C%202%2F6%29
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cherry-picked from ClickHouse#102825.
Documentation entry for user-facing changes
It looks like we were not able to use the parquet metadata cache because the format string reaches us in several spellings and we only matched "Parquet". For example, tools like Spark pass "PARQUET", our own code base uses "Parquet", and the Iceberg spec uses "parquet".
In the future, it would be good to normalize this format string in one place, because the same issue comes up in a few different places in the code.
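
One possible shape for such a normalization, as a sketch only: `normalizeFormatName` and `isParquetFormat` are hypothetical names, not functions from the code base, and the sketch assumes format names are plain ASCII.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: return an ASCII lower-cased copy of the format name.
std::string normalizeFormatName(const std::string & name)
{
    std::string result(name);
    std::transform(result.begin(), result.end(), result.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return result;
}

// Hypothetical check: compare the normalized name, so "PARQUET",
// "Parquet" and "parquet" are all accepted.
bool isParquetFormat(const std::string & name)
{
    return normalizeFormatName(name) == "parquet";
}
```

Routing every format comparison through one helper like this would keep the spellings from Spark, our own code, and the Iceberg spec from diverging again.
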