unexpected transaction start times with mssql temporal (system versioned) tables #11444
Replies: 5 comments 26 replies
-
In the meantime I'd appreciate a workaround as this is making our system-versioned tables unusable. |
Beta Was this translation helpful? Give feedback.
-
hi - I see in issue #5690 there was no reproducer created, in this case you've provided a script, thanks for that. However the claim that SQLAlchemy is holding open a transaction is definitely not the case, so we would assume this is some artifact of connection pooling - that is, your "pyodbc" example uses two separate connections, whereas your SQLAlchemy example will use the same connection by default. We would then make a random guess that something about pyodbc / system versioned tables etc. is producing this behavior (which to be honest I haven't walked through to understand exactly what the problem is, since this looks like driver behaviors). Try turning off pooling: from sqlalchemy.pool import NullPool
engine = create_engine(connection_url, poolclass=NullPool) the sequence of connection usage should then be mostly identical to your pyodbc version. |
Beta Was this translation helpful? Give feedback.
-
Here's the same behavior with pyodbc directly, so the problem is in pyodbc and I would report this issue to them. You can use a new connection each time by using NullPool as mentioned above from typing import List
from time import sleep
from datetime import datetime, timezone
import pyodbc
def format_time(time: datetime) -> str:
return time.strftime("%H:%M:%S.%f")
commitTimes: List[datetime] = []
sqlTimes: List[datetime] = []
connect_string = 'DRIVER={ODBC Driver 18 for SQL Server};Server=mssql2022,1433;Database=test;UID=scott;PWD=tiger^5HHH;TrustServerCertificate=yes;Encrypt=Optional'
print("The buggy behavior with pyodbc directly:")
# autocommit=False does not seem to matter (note SQLAlhcemy version is not setting this
# unless you add an appropriate isolation_level setting)
pyodbc_conn = pyodbc.connect(connect_string, autocommit=False)
with pyodbc_conn.cursor() as cursor:
cursor.execute("UPDATE [_TestTable] SET UserName=? WHERE Id=?", 'foo', 1)
pyodbc_conn.commit()
commitTimes.append(datetime.now(timezone.utc))
with pyodbc_conn.cursor() as cursor:
cursor.execute("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
row = cursor.fetchone()
sqlTimes.append(row[0])
pyodbc_conn.rollback()
sleep(10)
# uncomment to "fix"
# pyodbc_conn = pyodbc.connect(connect_string)
with pyodbc_conn.cursor() as cursor:
cursor.execute("UPDATE [_TestTable] SET UserName=? WHERE Id=?", 'bar', 1)
pyodbc_conn.commit()
commitTimes.append(datetime.now(timezone.utc))
with pyodbc_conn.cursor() as cursor:
cursor.execute("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
row = cursor.fetchone()
sqlTimes.append(row[0])
pyodbc_conn.rollback()
print(f"{'First commit:':<15} {format_time(commitTimes[0])}")
print(f"{'First save:':<15} {format_time(sqlTimes[0])}")
print(f"{'Second commit:':<15} {format_time(commitTimes[1])}")
print(f"{'Second save:':<15} {format_time(sqlTimes[1])}")``` output, without the "fix"
for my own reference, below is my full original script that includes DDL plus all three versions from typing import List
from time import sleep
from datetime import datetime, timezone
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.engine import URL
from sqlalchemy.pool import NullPool
import pyodbc
def format_time(time: datetime) -> str:
return time.strftime("%H:%M:%S.%f")
CONNECTION_STRING= "DRIVER={ODBC Driver 18 for SQL Server};Server=mssql2022,1433;Database=test;UID=scott;PWD=tiger^5HHH;TrustServerCertificate=yes;Encrypt=Optional"
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": CONNECTION_STRING})
engine = create_engine(connection_url,
echo=True,
# uncomment to "fix"
poolclass=NullPool
)
with engine.begin() as conn:
conn.exec_driver_sql("""
ALTER TABLE [dbo].[_TestTable] SET ( SYSTEM_VERSIONING = OFF)
""")
conn.exec_driver_sql("""DROP TABLE [dbo].[_TestTable]""")
conn.exec_driver_sql("""DROP TABLE [dbo].[_TestTableHistory]""")
conn.exec_driver_sql("""
CREATE TABLE [dbo].[_TestTable](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserName] [nvarchar](50) NOT NULL,
[StartTime] [datetime2](7) GENERATED ALWAYS AS ROW START NOT NULL,
[EndTime] [datetime2](7) GENERATED ALWAYS AS ROW END NOT NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY],
PERIOD FOR SYSTEM_TIME ([StartTime], [EndTime])
) ON [PRIMARY]
WITH
(
SYSTEM_VERSIONING = ON ( HISTORY_TABLE = [dbo].[_TestTableHistory] )
)
""")
conn.exec_driver_sql("""INSERT INTO dbo.[_TestTable] (UserName) VALUES('John')""")
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
commitTimes: List[datetime] = []
sqlTimes: List[datetime] = []
print("SQLAlchemy Behaviour:")
with SessionLocal() as db:
cmd = text("UPDATE [_TestTable] SET UserName=:name WHERE Id=:id")
cmd = cmd.bindparams(name='foo', id=1)
db.execute(cmd)
db.commit()
commitTimes.append(datetime.now(timezone.utc))
query = text("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
result = db.execute(query)
row = result.fetchone()
sqlTimes.append(row[0])
sleep(10)
with SessionLocal() as db:
cmd = text("UPDATE [_TestTable] SET UserName=:name WHERE Id=:id")
cmd = cmd.bindparams(name='bar', id=1)
db.execute(cmd)
db.commit()
commitTimes.append(datetime.now(timezone.utc))
query = text("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
result = db.execute(query)
row = result.fetchone()
sqlTimes.append(row[0])
print(f"{'First commit:':<15} {format_time(commitTimes[0])}")
print(f"{'First save:':<15} {format_time(sqlTimes[0])}")
print(f"{'Second commit:':<15} {format_time(commitTimes[1])}")
print(f"{'Second save:':<15} {format_time(sqlTimes[1])}")
commitTimes.clear()
sqlTimes.clear()
print()
print("Expected Behaviour (pyodbc):")
with pyodbc.connect(CONNECTION_STRING, autocommit=False) as conn:
with conn.cursor() as cursor:
cursor.execute("UPDATE [_TestTable] SET UserName=? WHERE Id=?", 'foo', 1)
conn.commit()
commitTimes.append(datetime.now(timezone.utc))
cursor.execute("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
row = cursor.fetchone()
sqlTimes.append(row[0])
sleep(10)
with pyodbc.connect(CONNECTION_STRING, autocommit=False) as conn:
with conn.cursor() as cursor:
cursor.execute("UPDATE [_TestTable] SET UserName=? WHERE Id=?", 'bar', 1)
conn.commit()
commitTimes.append(datetime.now(timezone.utc))
cursor.execute("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
row = cursor.fetchone()
sqlTimes.append(row[0])
# shouldn't conn be closed already?? script below fails if we dont
# do this first
conn.close()
print(f"{'First commit:':<15} {format_time(commitTimes[0])}")
print(f"{'First save:':<15} {format_time(sqlTimes[0])}")
print(f"{'Second commit:':<15} {format_time(commitTimes[1])}")
print(f"{'Second save:':<15} {format_time(sqlTimes[1])}")
commitTimes.clear()
sqlTimes.clear()
print()
print("The buggy behavior with pyodbc directly:")
pyodbc_conn = pyodbc.connect(CONNECTION_STRING, autocommit=False)
with pyodbc_conn.cursor() as cursor:
cursor.execute("UPDATE [_TestTable] SET UserName=? WHERE Id=?", 'foo', 1)
pyodbc_conn.commit()
commitTimes.append(datetime.now(timezone.utc))
with pyodbc_conn.cursor() as cursor:
cursor.execute("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
row = cursor.fetchone()
sqlTimes.append(row[0])
pyodbc_conn.rollback()
sleep(10)
# uncomment to "fix"
# pyodbc_conn = pyodbc.connect(connect_string, autocommit=False)
with pyodbc_conn.cursor() as cursor:
cursor.execute("UPDATE [_TestTable] SET UserName=? WHERE Id=?", 'bar', 1)
pyodbc_conn.commit()
commitTimes.append(datetime.now(timezone.utc))
with pyodbc_conn.cursor() as cursor:
cursor.execute("SELECT [StartTime] FROM [_TestTable] WHERE Id=1")
row = cursor.fetchone()
sqlTimes.append(row[0])
pyodbc_conn.rollback()
print(f"{'First commit:':<15} {format_time(commitTimes[0])}")
print(f"{'First save:':<15} {format_time(sqlTimes[0])}")
print(f"{'Second commit:':<15} {format_time(commitTimes[1])}")
print(f"{'Second save:':<15} {format_time(sqlTimes[1])}") |
Beta Was this translation helpful? Give feedback.
-
@zzzeek instead of a nullpool maybe it's enough to just add a https://docs.sqlalchemy.org/en/20/core/events.html#sqlalchemy.events.PoolEvents.checkout event to execute a rollback/commit to reset the transaction? @event.listens_for(engine , 'checkout')
def receive_checkout(dbapi_connection, connection_record, connection_proxy):
dbapi_connection.rollback() # commit should work too It seems that pyodbc starts a new transaction on the server as soon as commit / rollback is executed. |
Beta Was this translation helpful? Give feedback.
-
Summary TL;DR - If Windows' own connection pooling is active (which it is by default for Microsoft's "ODBC Driver nn for Windows") then using NullPool for the SQLAlchemy pool is the best solution. With With SQLAlchemy's default QueuePool, when a SQLAlchemy connection is "closed" the DBAPI connection is not closed, it is just returned to SQLAlchemy's connection pool. When that connection is checked out of the QueuePool With NullPool, closing the SQLAlchemy connection also closes the DBAPI connection. However, that connection is still in Windows' connection pool. The next time SQLAlchemy "creates" a DBAPI connection via |
Beta Was this translation helpful? Give feedback.
-
Describe the bug
This is a duplicate of issue #5690 which I don't think was well understood. This doesn't happen using plain pyodbc so it all points to a SQLAlchemy issue, maybe related to how sessions are managed.
Context:
I'm using SQL server's system versioned tables. These tables automatically record the time of creation/modification and according to the documentation it's the time the transaction was started. After more investigation this seemingly refers to the outtermost transaction which in this case will be the implicit transaction opened by ODBC.
Problem:
As I create new sessions to update records in the system versioned table it records times from before the new session was created. The script below may help understand this.
Optional link from https://docs.sqlalchemy.org which documents the behavior that is expected
No response
SQLAlchemy Version in Use
2.0.30
DBAPI (i.e. the database driver)
pyodbc
Database Vendor and Major Version
SQL Server 12 (2014)
Python Version
3.10
Operating system
Windows
To Reproduce
Create a system versioned table and add a record to modify:
Run the following script:
Error
The script will print something like this:
Expected/Actual behaviour
Notice how in the expected behaviour sample the save time is always a few miliseconds before the respective commit time, which makes sense as this is the time from when the transaction was opened.
In the SQLAlchemy sample instead the second save time is shortly after the first commit, way before the second session is even created. It looks as if SQLAlchemy is keeping the old transaction in between sessions.
Additional context
No response
Beta Was this translation helpful? Give feedback.
All reactions