Unable to use transactions with Azure Synapse Analytics #8231
Comments
@gordthompson can you please elaborate on what the actual SQLAlchemy bug is? The trace shows pyodbc's connection.rollback() failing. How is that our issue?
When we run the query at sqlalchemy/lib/sqlalchemy/dialects/mssql/base.py line 2927 (commit 805a132), plain old SQL Server uses the former (`sys.dm_exec_sessions`), while the Data Warehouse products use the latter (`sys.dm_pdw_nodes_exec_sessions`). @ww2406 discovered that the error went away if they removed the first entry and just queried the PDW view. So this is just a workaround for the difference in behaviour between SQL Server and Synapse.
OK I need more detail than that:
tl;dr: reversing them is a band-aid; let's please use real server telemetry to choose the correct view up front.
If the "reversing" helps only because it avoids an error condition that otherwise somehow corrupts the connection, then that is also kind of a bigger problem.
@zzzeek

3. No, sys.dm_exec_sessions does not exist in the PDW products. This is what causes this issue in the first place, I believe, and why pure pyodbc does not experience issues. SQLAlchemy fails when it attempts to query it.
4. `sys.dm_pdw_nodes_exec_sessions` is the equivalent in PDW products. PDW is architected differently from traditional SQL Server -- it is more akin to Redshift/BigQuery, with the concept of distributed nodes etc. `dm_pdw_nodes_exec_sessions` provides a little more info about what node a query is executing on etc. that is immaterial for normal SQL Server.
5. No. Plain SQL Server only contains `sys.dm_exec_sessions`.
6. I can do a little digging on this, and if nothing sticks out, I can see if my MS rep knows anything.

As a user, a band-aid that fixes a long-standing restriction with no impact to existing functionality is preferable now, as opposed to having nothing while waiting for a deeper fix with no additional functional benefits [assuming that no-impact statement holds true; gordthompson ran a test suite and I personally tested against SQL Server 2019 and regular Azure SQL Database with positive results].

(Also, I use PDW here to loosely refer to SQL Server Parallel Data Warehouse, Azure Synapse Analytics, and Azure SQL Data Warehouse -- they are all similar.)
Re: 6 -- This looks like a way to differentiate between the two:

```sql
declare @EngineEdition as sql_variant
set @EngineEdition = (select SERVERPROPERTY('EngineEdition'))
select @EngineEdition as EngineEdition,
    case @EngineEdition
        when 1 then 'Personal or Desktop Engine'
        when 2 then 'Standard'
        when 3 then 'Enterprise'
        when 4 then 'Express'
        when 5 then 'SQL Database'
        when 6 then 'SQL Data Warehouse'
    end as 'Engine Edition Based on ID'
```

I tested against a dedicated pool on my personal Azure account and confirmed I received a response of 6. I also tested against an enterprise instance in docker and received a response of 3.
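For what it's worth, this EngineEdition check could drive the view selection directly. A minimal sketch (the function name and the set of PDW edition codes are my own; only the value 6 was confirmed above, so treat this as illustrative):

```python
# Map SERVERPROPERTY('EngineEdition') to the sessions view that exists
# on that engine. Edition 6 (SQL Data Warehouse / Synapse dedicated
# pool) lacks sys.dm_exec_sessions, per the discussion above.
PDW_EDITIONS = {6}  # assumption: only edition 6 was confirmed in this thread


def sessions_view_for(engine_edition: int) -> str:
    """Return the catalog view to query for session info (illustrative)."""
    if engine_edition in PDW_EDITIONS:
        return "sys.dm_pdw_nodes_exec_sessions"
    return "sys.dm_exec_sessions"


# With a live pyodbc cursor, one might do (untested sketch):
#   cur.execute("SELECT CAST(SERVERPROPERTY('EngineEdition') AS int)")
#   view = sessions_view_for(cur.fetchone()[0])
```

This avoids the try-one-view-then-fall-back approach entirely, which is what triggers the broken transaction state in the first place.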
Are the tables in sys reported by the information tables, like ...
Yes, at least on SQL Server 2019:
Yes, I like looking in system tables for it. However, I really need to understand the actual problem. Is the problem the following?
What's involved in getting a connection to one of these DBs? Is there some MSFT developer cloud thing we can try?
Not free AFAIK, but the lowest operating level of Synapse is "only" $1.x an hour, which isn't horrible for quick testing like this. If someone has a VS Enterprise subscription, they get $150 in free credits monthly against Azure svcs.
I'm willing to set up and pay for a dedicated pool for a few hours of testing if that would be helpful.
I need the answer to number 3 above, but also, if someone wants to work on a "test just the correct table only" patch, that is fine as well.
OK, @ww2406 has added a comment showing the original, actual bug, which is what it looked like from the beginning: pyodbc is being put into an invalid state it does not expect:

```python
import pyodbc

conn_string = ''
con = pyodbc.connect(conn_string)
cur = con.cursor()
cur.execute("SELECT * FROM sys.dm_exec_sessions")
# ProgrammingError -- catalog view is not supported in this version
con.rollback()
# 111214 -- attempt to complete transaction failed
```

Here's why this is important:
@ww2406 - Perhaps Synapse might be deferring the start of the transaction? If you create the stored procedure

```sql
CREATE PROCEDURE [dbo].[sqla_issue_8231]
AS
DECLARE @n int;
BEGIN
    SET NOCOUNT ON;
    SELECT @n = COUNT(*) FROM sys.system_views WHERE name = 'dm_exec_sessions';
    -- purposely try hitting the view that does not exist
    IF @n = 1
        SELECT COUNT(*) AS i FROM sys.dm_pdw_nodes_exec_sessions;
    ELSE
        SELECT COUNT(*) AS i FROM sys.dm_exec_sessions;
END
```

and then run the following script in SSMS

```sql
BEGIN TRANSACTION
BEGIN TRY
    EXEC dbo.sqla_issue_8231;
END TRY
BEGIN CATCH
    -- ignore error
END CATCH
ROLLBACK
```

does Synapse throw a "No corresponding transaction found." sort of error?
@gordthompson it does not; no error whatsoever:

```python
import pyodbc

conn_string = ''
con = pyodbc.connect(conn_string)
cur = con.cursor()
cur.execute("SELECT * FROM sys.dm_pdw_nodes_exec_sessions")
con.rollback()
# success
```
Well, yes: the failure is related to the fact that a SQL statement fails, which is exactly when you'd want to call rollback, and then the rollback fails, because somehow that SQL statement's error condition has already rolled back the transaction. pyodbc doesn't know about it, and boom. This is all wrong on someone's end.
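To make the desync concrete: since the server has already rolled back, one client-side option is to treat Synapse error 111214 on rollback as a no-op. This is a hedged sketch with a hypothetical helper name, not pyodbc or SQLAlchemy API; the exception type is passed in so the sketch stays driver-agnostic:

```python
def rollback_if_needed(con, programming_error=Exception):
    """Hypothetical helper: roll back, but swallow only Synapse error
    111214 ('No corresponding transaction found'), which signals that
    the server already rolled the transaction back after a statement
    error. Any other rollback failure is re-raised."""
    try:
        con.rollback()
    except programming_error as err:
        if "111214" not in str(err):
            raise  # a genuine rollback failure, don't hide it
```

Matching on the numeric error code rather than the whole message keeps the check narrow, though it is still string-matching on driver output.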
Okay, just mentioned that because you said "maybe all ... rollback calls will not work on this database" and this is one where it works. At any rate, I believe this is database functionality. I tested the following C# code (dotnet 5.0, Microsoft.Data.SqlClient 4.1.0) and received a similar result:

```csharp
using System;
using Microsoft.Data.SqlClient;

namespace SynapseTest
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Starting test...");
            SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder();
            builder.DataSource = "SERVER";
            builder.UserID = "UID";
            builder.Password = "PWD";
            builder.InitialCatalog = "DATABASE";
            using SqlConnection conn = new SqlConnection(builder.ConnectionString);
            conn.Open();
            using var txn = conn.BeginTransaction();
            using var cmd = conn.CreateCommand();
            cmd.Transaction = txn;
            cmd.CommandText = "SELECT * FROM sys.dm_exec_sessions";
            try
            {
                cmd.ExecuteNonQuery();
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
            txn.Rollback();
            Console.WriteLine("All done");
        }
    }
}
```
A third-party site suggested XACT_ABORT may default to ON on DW, but I haven't found any MS documentation to substantiate this.
Actually, I have -- just didn't read carefully before. From the MS error-handling documentation:

> If COMMIT or ROLLBACK are run and there is no active transaction, an error is raised. If a BEGIN TRANSACTION is run while a transaction is already in progress, an error is raised. This can occur if a BEGIN TRANSACTION occurs after a successful BEGIN TRANSACTION statement or when the session is under SET AUTOCOMMIT OFF.
>
> If an error other than a run-time statement error prevents the successful completion of an explicit transaction, Azure Synapse Analytics automatically rolls back the transaction and frees all resources held by the transaction. For example, if the client's network connection to an instance of Azure Synapse Analytics is broken or the client logs off the application, any uncommitted transactions for the connection are rolled back when the network notifies the instance of the break.
>
> If a run-time statement error occurs in a batch, Azure Synapse Analytics behaves consistently with SQL Server XACT_ABORT set to ON and the entire transaction is rolled back. For more information about the XACT_ABORT setting, see SET XACT_ABORT (Transact-SQL).
Agreed. The fact that .NET ... Also, the repro code for Azure from this comment does not produce an error on SQL Server 2019.
The two database platforms just behave differently. One possibility might be to ...
My point is the same as always: DBAPI is not supposed to have restrictions on rollback(); otherwise you'd never know when you can call it or not, if certain error conditions "autorollback" and break the driver's knowledge of the sequence of events. This is where pyodbc pretending to be "database agnostic" (95% of all pyodbc use is for SQL Server) is a problem, because it really should have logic to know that certain error conditions mean the driver just did an "autorollback", unless there's some way to configure this with the ODBC driver or similar.
Of course, if only a certain Redmond-based company that knows these products inside and out (hopefully) would consider writing a Python client like, idk, virtually every other database vendor… 🤷‍♂️
Re: XACT_STATE -- in my ganders through the Synapse documentation, I think I saw that MS introduced a new -2 value for XACT_STATE in PDW that doesn't exist normally.
Ah, but the pyodbc team would argue (rightly, IMO) that it's not their job either. If Synapse starts a transaction at the client's request and later rolls it back arbitrarily for whatever reason, then it shouldn't raise an error when the client requests the rollback that has already occurred. We can always override ...
I would not feel comfortable squashing an exception like that for something as critical as a database rollback. Better would be if there were logic that literally ran "select can_we_rollback_here", whatever that has to be, and then does the rollback. Then we make it a recipe for those using Azure Synapse. I don't want hardcoded logic in the dialect that adds latency to all MSSQL code everywhere to work around what is IMO a major bug in Synapse and/or its drivers.
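The "select can_we_rollback_here" idea could look roughly like this, using XACT_STATE() (1 = active transaction, 0 = none, -1 = uncommittable). safe_rollback is a hypothetical recipe-style helper, not SQLAlchemy API, and the extra round trip on every rollback is exactly the latency trade-off at issue:

```python
def safe_rollback(dbapi_connection):
    """Roll back only if the server still reports an open transaction.
    Returns True if rollback() was actually issued (illustrative)."""
    cur = dbapi_connection.cursor()
    try:
        cur.execute("SELECT XACT_STATE()")
        state = cur.fetchone()[0]
    finally:
        cur.close()
    if state == 0:
        # Synapse already rolled back after the failed statement;
        # calling rollback() now would raise error 111214.
        return False
    dbapi_connection.rollback()
    return True
```

Note this assumes XACT_STATE() itself can be queried on a connection whose transaction was auto-rolled-back, which would need verifying against a real Synapse instance.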
Except we don't at the moment have a DialectEvents.do_rollback() method that can conditionalize whether rollback is called or not. So to be clear: have we fully confirmed there is absolutely no way to get the DB or ODBC driver to not do this "autorollback", and what would the pyodbc devs actually tell us to do? Catch all exceptions and regexp them?
I would assume when we use ordinary SQL Server we don't have XACT_ABORT set to ON. However, perhaps the standardized nature of XACT_ABORT allows us to add an option to the SQL Server dialect overall called "xact_abort=True", meaning we assume the DB has XACT_ABORT turned on. That gives us an excuse to augment do_rollback() to check ahead of time for an aborted transaction, or to try/catch the rollback -- but it's important that we catch a very specific error and make sure the code matches xact_abort.
Hi, here is what I would accept:
Hey @zzzeek, I made a commit last night on that PR towards 1. I think the base part of it is figured out, unless there are changes you'd like to see. Would you be willing to take a look and make sure I went down an okay path with it? Then, I'm feeling kind of unsure how to approach developing testing for it (or do you think no additional testing beyond the existing test infrastructure is necessary)? I'll try to take a look at 2, too, but I feel less certain about how to make that work. It seemed like the place to start is ...
This doesn't need new tests, as coverage of the code in question is intrinsic to the database in use. For 2, the parameter would go here, since I think we are saying it should apply to any backend (noting pyodbc is the only backend we test right now); then you add a do_rollback() method to MSDialect as well, which overrides this one. This is the idea:

```diff
diff --git a/lib/sqlalchemy/dialects/mssql/base.py b/lib/sqlalchemy/dialects/mssql/base.py
index ed4139ad17..576d6c830d 100644
--- a/lib/sqlalchemy/dialects/mssql/base.py
+++ b/lib/sqlalchemy/dialects/mssql/base.py
@@ -2843,6 +2843,7 @@ class MSDialect(default.DefaultDialect):
     _supports_offset_fetch = False
     _supports_nvarchar_max = False
+    _xact_abort_enabled = False
 
     legacy_schema_aliasing = False
     server_version_info = ()
@@ -2870,6 +2871,7 @@ class MSDialect(default.DefaultDialect):
         deprecate_large_types=None,
         json_serializer=None,
         json_deserializer=None,
+        xact_abort_enabled=True,
         legacy_schema_aliasing=None,
         **opts,
     ):
@@ -2891,6 +2893,7 @@ class MSDialect(default.DefaultDialect):
         self._json_serializer = json_serializer
         self._json_deserializer = json_deserializer
+        self._xact_abort_enabled = xact_abort_enabled
 
     def do_savepoint(self, connection, name):
         # give the DBAPI a push
@@ -2901,6 +2904,14 @@ class MSDialect(default.DefaultDialect):
         # SQL Server does not support RELEASE SAVEPOINT
         pass
 
+    def do_rollback(self, dbapi_connection):
+        try:
+            dbapi_connection.rollback()
+        except self.dbapi.ProgrammingError as err:
+            if self._xact_abort_enabled and "111214" in str(err):
+                return
+            raise
+
     _isolation_lookup = set(
         [
             "SERIALIZABLE",
```
Can you reference "Fixes: #8231" in your PR so it links here, thanks.
This is what I've been working on for xact_abort: I've tested it as much as I can without Synapse, but it looks like it should work. |
great minds think alike, nice job |
Thanks. @ww2406 - Can you test that patch in isolation (i.e., without your select-o-matic patch to use the appropriate view) and verify that it works? That is,

    pip install ***@***.***_abort

(or equivalent) and then do something like

```python
import sqlalchemy as sa

engine = sa.create_engine(connection_url, xact_abort=True)
with engine.begin() as conn:
    pass
```
Yes! I’ll take a look tonight.
@gordthompson sadly 111214 is still raised :(
Rats. Okay, thanks for checking.
Absolutely! If it'd be helpful, I'd be happy to get you credentials to my personal SQL DW dedicated pool I play with occasionally, for you to test directly without going through a middleman. Also happy to keep testing for you!
Thanks, @ww2406. Having a real DW connection certainly would be helpful to try and figure out what the heck is going on with this 111214 thing.
@gordthompson sorry for the delay; life is a bit of an adventure right now. Just sent you an email with some details!
Okay, with big thanks to @ww2406, https://github.com/gordthompson/sqlalchemy/tree/xact_abort_0721a now has a patch that works for me against a real Synapse instance. After doing ... the following code

```python
import logging

import sqlalchemy as sa

logging.basicConfig(level=logging.INFO)

connection_url = "<insert your connection URL here>"

engine = sa.create_engine(
    connection_url,
    xact_abort=True,
)
with engine.begin() as conn:
    pass
```

should not fail, but simply log ...

If this looks okay then I can either ...
Let's keep this all on one PR / gerrit if possible.
Is my patch still needed then?
I thought Gord could append to your patch. Either way. Sorry, I just wait for the "it's done, here's our solution" patch to come out before I look too closely :)
Gord Thompson has proposed a fix for this issue in the main branch: Fix 'No … transaction found' error on Synapse. https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/4006
Gord Thompson has proposed a fix for this issue in the rel_1_4 branch: Fix 'No transaction found' error on Synapse. https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/4020
A bit OT, but there is at least a whole documentation section on docs.microsoft.com about pymssql/FreeTDS: https://docs.microsoft.com/en-us/sql/connect/python/pymssql/step-1-configure-development-environment-for-pymssql-python-development?view=sql-server-ver16
Fixed issue where the SQL Server dialect's query for the current isolation level would fail on Azure Synapse Analytics, due to the way in which this database handles transaction rollbacks after an error has occurred. The initial query has been modified to no longer rely upon catching an error when attempting to detect the appropriate system view. Additionally, to better support this database's very specific "rollback" behavior, implemented new parameter ``ignore_no_transaction_on_rollback`` indicating that a rollback should ignore Azure Synapse error 'No corresponding transaction found. (111214)', which is raised if no transaction is present, in conflict with the Python DBAPI.

Fixes: #8231
Closes: #8233
Pull-request: #8233
Pull-request-sha: c48bd44
Change-Id: I6407a03148f45cc9eba8fe1d31d4f59ebf9c7ef7
(cherry picked from commit 8fe3cd69c5f2d8f73e75fb19ae929273282fba57)
Hi guys, initially I was able to connect to the Synapse dedicated pool and perform GET requests; however, I was not able to perform POST requests and received the same error. I followed the above approach, but after performing this action I was not even able to connect to the server, with the same error as earlier. Can you please advise me on this? Are there any additional steps to be performed?
looks like instructions for this are at https://docs.sqlalchemy.org/en/20/dialects/mssql.html#avoiding-transaction-related-exceptions-on-azure-synapse-analytics |
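For later readers: the parameter that eventually shipped (per the changelog above and the linked docs) is ``ignore_no_transaction_on_rollback``. A hedged sketch of wiring it up; the server, database, and credentials below are placeholders, and the ``create_engine`` call is left commented out since it needs a live ODBC driver:

```python
from urllib.parse import quote_plus

# Placeholder DSN-less ODBC connection string for a Synapse dedicated pool.
odbc_str = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=your-workspace.sql.azuresynapse.net;"
    "DATABASE=your_pool;UID=user;PWD=secret"
)
# Percent-encode the ODBC string into a SQLAlchemy URL.
connection_url = "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_str)

# import sqlalchemy as sa
# engine = sa.create_engine(
#     connection_url,
#     ignore_no_transaction_on_rollback=True,  # swallow Synapse error 111214
# )
```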
Discussed in #8223
Originally posted by ww2406 July 7, 2022
Describe the bug
Hello!
Problem: Unable to use transactions with Azure Synapse Analytics
Description: ODBC Error 111214 is returned: An attempt to complete a transaction has failed. No corresponding transaction found.
Other notes: Transactions function properly when using pyodbc directly [see MCVE]. I saw this post by gordthompson in response to a question on StackOverflow (and since added to the documentation) that
but this is outdated. Azure Synapse Analytics does support transactions in most cases; see this Microsoft documentation:
To Reproduce
Error
Versions
Additional context
No response