Skip to content

GIL is held for the entire duration of cursor.execute(), starving other threads / asyncio event loop #540

@StrayCatDBA

Description

@StrayCatDBA

Describe the bug

mssql-python holds the Python GIL for the entire duration of a query
executed via cursor.execute(...). Other Python threads — including the
asyncio event loop on the main thread when the query is dispatched via
asyncio.to_thread — are completely starved while the driver waits on the
network round-trip to SQL Server.

This is a regression versus pyodbc, which releases the GIL around the
blocking ODBC calls (SQLExecute, SQLExecDirect, SQLFetch, ...). Any
user combining mssql-python with asyncio.to_thread,
concurrent.futures.ThreadPoolExecutor, FastAPI/Starlette background
threads, a Flask threaded server, or even a simple background heartbeat
thread will see the entire process freeze for the duration of every query —
even though the query is dispatched to a worker thread specifically to
avoid that.

No exception is raised; the symptom is a stalled event loop / unresponsive
threads.

To reproduce

Self-contained repro using asyncio.to_thread and WAITFOR DELAY so no
schema or table is needed. The main thread runs an asyncio "heartbeat"
coroutine that prints a tick every 2 seconds; the query is dispatched to a
worker thread. If the GIL were released around the ODBC call, the heartbeat
would keep ticking. It does not — it freezes for the full 10 seconds.

"""
Minimal repro: mssql-python holds the GIL for the entire duration of a query.

Setup:
    pip install mssql-python
    set DB_CONNECTION_STRING=Server=...;Database=...;UID=...;PWD=...;Encrypt=yes;

Run:
    python repro_gil_held.py
"""

import asyncio
import os
import time

import mssql_python

QUERY_DELAY_SECONDS = 10
HEARTBEAT_INTERVAL = 2.0


def run_blocking_query() -> None:
    """Runs in a worker thread via asyncio.to_thread."""
    conn_str = os.environ["DB_CONNECTION_STRING"]
    conn = mssql_python.connect(conn_str)
    try:
        cursor = conn.cursor()
        t0 = time.monotonic()
        print(f"[query  ] {t0:8.3f} starting WAITFOR DELAY '00:00:{QUERY_DELAY_SECONDS:02d}'")
        cursor.execute(f"WAITFOR DELAY '00:00:{QUERY_DELAY_SECONDS:02d}'")
        cursor.fetchall() if cursor.description else None
        t1 = time.monotonic()
        print(f"[query  ] {t1:8.3f} query returned (elapsed {t1 - t0:.3f}s)")
        cursor.close()
    finally:
        conn.close()


async def heartbeat(stop: asyncio.Event) -> None:
    """Prints a tick every HEARTBEAT_INTERVAL seconds on the event loop thread."""
    last = time.monotonic()
    while not stop.is_set():
        now = time.monotonic()
        gap = now - last
        marker = "  <-- STALLED (GIL held by query thread)" if gap > HEARTBEAT_INTERVAL * 1.5 else ""
        print(f"[hbeat  ] {now:8.3f} tick (gap {gap:5.3f}s){marker}")
        last = now
        try:
            await asyncio.wait_for(stop.wait(), timeout=HEARTBEAT_INTERVAL)
        except asyncio.TimeoutError:
            pass


async def main() -> None:
    if "DB_CONNECTION_STRING" not in os.environ:
        raise SystemExit("Set DB_CONNECTION_STRING env var first.")

    stop = asyncio.Event()
    hb_task = asyncio.create_task(heartbeat(stop))

    await asyncio.sleep(2 * HEARTBEAT_INTERVAL)
    await asyncio.to_thread(run_blocking_query)
    await asyncio.sleep(2 * HEARTBEAT_INTERVAL)

    stop.set()
    await hb_task


if __name__ == "__main__":
    print(f"mssql_python version: {getattr(mssql_python, '__version__', 'unknown')}")
    asyncio.run(main())

Observed output (1.6.0):

mssql_python version: 1.6.0
[hbeat  ]    0.000 tick (gap 0.000s)
[hbeat  ]    2.001 tick (gap 2.001s)
[query  ]    4.003 starting WAITFOR DELAY '00:00:10'
[hbeat  ]   14.012 tick (gap 12.011s)  <-- STALLED (GIL held by query thread)
[query  ]   14.014 query returned (elapsed 10.011s)
[hbeat  ]   16.014 tick (gap 2.002s)

The 12-second gap between heartbeat ticks is the smoking gun: the asyncio
event loop on the main thread cannot run at all while a worker thread is
inside cursor.execute(...).

Expected behavior

The blocking ODBC calls should release the GIL around the
network/server-bound portion of the call, the way pyodbc does. The
heartbeat in the repro should continue to fire every ~2 s while the query
runs:

[hbeat  ]    0.000 tick (gap 0.000s)
[hbeat  ]    2.001 tick (gap 2.001s)
[query  ]    4.003 starting WAITFOR DELAY '00:00:10'
[hbeat  ]    4.001 tick (gap 2.000s)
[hbeat  ]    6.001 tick (gap 2.000s)
[hbeat  ]    8.001 tick (gap 2.000s)
[hbeat  ]   10.001 tick (gap 2.000s)
[hbeat  ]   12.002 tick (gap 2.001s)
[hbeat  ]   14.002 tick (gap 2.000s)
[query  ]   14.014 query returned (elapsed 10.011s)

Further technical details

Python version: 3.13.1 (CPython, 64-bit)
SQL Server version: SQL Server 2022
Operating system: Windows
mssql-python version: 1.6.0 (latest at time of filing)

Additional context

Root cause (from reading mssql_python/pybind/)

A grep across the entire pybind/ tree shows the GIL is released in only
two places, both in connection/connection.cpp:

$ rg "gil_scoped_release|call_guard|Py_BEGIN_ALLOW_THREADS" mssql_python/pybind
mssql_python/pybind/connection/connection.cpp:94    SQLDriverConnect
mssql_python/pybind/connection/connection.cpp:156   SQLDisconnect

ddbc_bindings.cpp (~288 KB, contains every statement-level wrapper)
contains zero gil_scoped_release and zero
py::call_guard<py::gil_scoped_release>(). As a result, every blocking
statement-level ODBC call holds the GIL across the network round-trip:

Wrapper C++ function ODBC call Approx. line in ddbc_bindings.cpp
DDBCSQLExecDirect SQLExecDirect_wrap SQLExecDirectW 1705
DDBCSQLExecute SQLExecute_wrap SQLExecute (+ DAE: SQLParamData, SQLPutData) 1881
SQLExecuteMany SQLExecuteMany_wrap repeated SQLExecute (+ DAE) 2750, 2770
DDBCSQLFetch SQLFetch_wrap SQLFetch 2944
DDBCSQLFetchOne/Many/All FetchOne_wrap / FetchMany_wrap / FetchAll_wrap SQLFetch / SQLFetchScroll (+ SQLGetData for LOBs) 5483+, 5587, 5621+
DDBCSQLFetchScroll SQLFetchScroll_wrap SQLFetchScroll (registered at 5810)
DDBCSQLMoreResults SQLMoreResults_wrap SQLMoreResults 5649
DDBCSQLGetData FetchLobColumnData SQLGetData (LOB streaming loop) 2958
DDBCSQLTables/Columns/PrimaryKeys/ForeignKeys/Procedures/SpecialColumns/Statistics/GetTypeInfo catalog wrappers corresponding catalog ODBC calls — all blocking 5806–5862
Connection.commit / rollback (in connection.cpp) SQLEndTran declared at 354

Suggested fix

The maintainers clearly know the pattern — it's already used correctly for
SQLDriverConnect and SQLDisconnect in
mssql_python/pybind/connection/connection.cpp line 94:

SQLRETURN ret;
{
    py::gil_scoped_release release;
    ret = SQLDriverConnect_ptr(_dbcHandle->get(), nullptr, connStrPtr,
                               SQL_NTS, nullptr, 0, nullptr,
                               SQL_DRIVER_NOPROMPT);
}

The same pattern needs to be applied to every blocking ODBC call in
ddbc_bindings.cpp. Two equivalent options:

Option A — explicit scope (required for wrappers that build Python
objects, e.g. the Fetch*_wrap family and SQLExecute_wrap's DAE loop,
which interleave Python-object construction with ODBC calls):

SQLRETURN ret;
{
    py::gil_scoped_release release;
    ret = SQLExecDirect_ptr(StatementHandle->get(), queryPtr, SQL_NTS);
}

Option B — pybind11 call_guard on the binding (only safe for wrappers
that do not touch any Python object between entry and the ODBC call, e.g.
SQLFetch_wrap, SQLMoreResults_wrap):

m.def("DDBCSQLFetch",       &SQLFetch_wrap,       "...",
      py::call_guard<py::gil_scoped_release>());
m.def("DDBCSQLMoreResults", &SQLMoreResults_wrap, "...",
      py::call_guard<py::gil_scoped_release>());

Releasing the GIL around just SQLExecute_wrap, SQLExecDirect_wrap, and
the Fetch*_wrap family would resolve the vast majority of user-visible
contention, including this repro.

Metadata

Metadata

Assignees

No one assigned

    Labels

    triage neededFor new issues, not triaged yet.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions