Entra ID access token is acquired on every connect(), even on a pool hit #659

## Summary

With Entra ID authentication (e.g. `Authentication=ActiveDirectoryDefault`), `Connection.__init__` acquires an access token on **every** `connect()`, before the native connection pool is consulted. On a pool hit the freshly acquired token is never used: the pooled physical connection is already authenticated, and the pool keys only on the sanitized connection string, so the token in `attrs_before` is not reapplied. The token acquisition and struct packing are therefore wasted work on every reused connection.

This partially defeats the purpose of pooling for token-auth workloads: pooling is enabled to avoid per-connection cost, yet a token is still acquired and packed for every `connect()` call, including those served from the pool.

## Where (v1.10.0)

`mssql_python/connection.py`, `Connection.__init__`:

```python
# token acquired unconditionally, before the pool is consulted
sanitized = remove_sensitive_params(parsed_params)
self.connection_str = _ConnectionStringBuilder(sanitized).build()
token = get_auth_token(auth_type, credential_kwargs)      # <-- always runs
if token:
    self._attrs_before[ConstantsDDBC.SQL_COPT_SS_ACCESS_TOKEN.value] = token

# ... later ...

# pool checkout happens here, in the C layer, keyed on connection_str
if not PoolingManager.is_initialized():
    PoolingManager.enable()
self._pooling = PoolingManager.is_enabled()
self._conn = ddbc_bindings.Connection(
    self.connection_str, self._pooling, self._attrs_before
)
```

`get_auth_token` -> `AADAuth._acquire_token` reuses a cached *credential instance*, but still calls `credential.get_token("https://database.windows.net/.default")` and `get_token_struct()` (UTF-16-LE encode + `struct.pack`) on every call. azure-identity serves the token from its own in-memory cache while it is still valid, so this is not a full network round-trip each time, but it is per-connection CPU work whose result is discarded on a pool hit.
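To make the per-call cost concrete, here is a minimal sketch of the packing step. It assumes `get_token_struct()` follows the usual ODBC layout for `SQL_COPT_SS_ACCESS_TOKEN` (a 4-byte little-endian length followed by the UTF-16-LE token bytes); `pack_access_token` is a hypothetical name and the driver's actual helper may differ in detail.

```python
import struct


def pack_access_token(token: str) -> bytes:
    """Pack a token the way ODBC drivers commonly expect it (assumed layout).

    Layout: 4-byte little-endian byte length, then the token as UTF-16-LE.
    """
    token_bytes = token.encode("utf-16-le")
    return struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)


# Placeholder string, not a real token.
packed = pack_access_token("abc")
```

A real access token is typically a couple of kilobytes, so each call re-encodes and copies that much data even when the pool then hands back an already authenticated connection.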

## Evidence

A `dlt` pipeline loading many tables to a Fabric Warehouse with `Authentication=ActiveDirectoryDefault` and native pooling enabled:

- The native pool is active and reusing connections: `SQL_ATTR_RESET_CONNECTION` (pool checkout reset) is logged ~212 times, with a single real cold login (~2.9s) followed by a uniform ~0.28s per open.
- Yet `get_token: Azure AD token acquired successfully` is logged once per `connect()` (146 token acquisitions for 146 opens, exactly 1:1), i.e. a token is produced even for the reused connections.
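The counts above can be reproduced from a captured log with something like the sketch below. It only matches the two marker strings quoted in this issue; the surrounding log-line format and the sample lines are synthetic and purely illustrative.

```python
from typing import Dict, Iterable

# Marker strings as quoted in this issue.
TOKEN_MARKER = "get_token: Azure AD token acquired successfully"
RESET_MARKER = "SQL_ATTR_RESET_CONNECTION"


def count_markers(lines: Iterable[str]) -> Dict[str, int]:
    """Count token acquisitions and pool-checkout resets in log lines."""
    counts = {"token_acquired": 0, "pool_reset": 0}
    for line in lines:
        if TOKEN_MARKER in line:
            counts["token_acquired"] += 1
        if RESET_MARKER in line:
            counts["pool_reset"] += 1
    return counts


# Synthetic sample, not taken from the actual run.
sample = [
    "DEBUG get_token: Azure AD token acquired successfully",
    "DEBUG setting SQL_ATTR_RESET_CONNECTION",
    "DEBUG get_token: Azure AD token acquired successfully",
    "DEBUG unrelated line",
]
result = count_markers(sample)
```

If token acquisitions track `connect()` calls 1:1 while reset lines show the pool is reusing connections, the token work is happening on pool hits.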

## Impact

- Wasted CPU per `connect()` (token struct packing + `credential.get_token` bookkeeping) in exactly the high-frequency, short-connection scenario that pooling is meant to optimize.
- Makes it harder to reason about pooling from the caller side: every open still performs an auth step, so a pooled checkout is indistinguishable from a fresh login by wall-clock.

## Possible direction

Consult the pool before acquiring/materializing the token, and only acquire when a new physical connection will actually be opened (pool miss). This depends on the pool-key/identity work tracked in #651: the pool currently cannot tell the caller whether a checkout will reuse or open a connection, and it cannot safely reuse a token-auth connection across identities. If the pool became identity-aware (per #651), a pool hit for the same identity could skip token acquisition entirely.
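A rough sketch of the idea, in pure Python rather than the C layer: the caller hands the pool a token *provider* instead of a token, and the pool only invokes it on a miss. `FakePool`, its methods, and the `(connection string, identity)` key are all hypothetical; they stand in for whatever #651 ends up defining and are not the driver's API.

```python
from typing import Callable, Dict, List, Tuple


class FakePool:
    """Hypothetical identity-aware pool; illustrates lazy token acquisition."""

    def __init__(self) -> None:
        self._idle: Dict[Tuple[str, str], List[str]] = {}
        self.opened = 0

    def checkout(
        self, conn_str: str, identity: str, token_provider: Callable[[], bytes]
    ) -> str:
        idle = self._idle.get((conn_str, identity))
        if idle:
            # Pool hit: connection is already authenticated, no token needed.
            return idle.pop()
        # Pool miss: only now pay for token acquisition and packing.
        token = token_provider()
        assert token  # a real pool would pass this to the login call
        self.opened += 1
        return f"conn#{self.opened}"

    def checkin(self, conn_str: str, identity: str, conn: str) -> None:
        self._idle.setdefault((conn_str, identity), []).append(conn)


token_calls: List[int] = []


def provider() -> bytes:
    """Stand-in for get_auth_token(); records each invocation."""
    token_calls.append(1)
    return b"packed-token"


pool = FakePool()
for _ in range(5):
    conn = pool.checkout("Server=x", "user-a", provider)
    pool.checkin("Server=x", "user-a", conn)
```

In this sketch five open/close cycles for the same identity invoke the provider once, while a checkout for a different identity misses and acquires its own token.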

Related: #651 (pool identity separation), #580 (reducing per-connect parsing overhead).

