Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: clarify how sqlalchemy uses SQL transactions and how to combine with client retries #26966

Closed
awoods187 opened this issue Jun 25, 2018 · 13 comments
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-investigation Further steps needed to qualify. C-label will change. docs-todo
Milestone

Comments

@awoods187
Copy link
Contributor

awoods187 commented Jun 25, 2018

sqlalchemy.exc.InternalError: (psycopg2.InternalError) SAVEPOINT COCKROACH_RESTART needs to be the first statement in a transaction

Ran into on the MOVR database cockroachdb/movr#16

@awoods187 awoods187 added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Jun 25, 2018
@knz
Copy link
Contributor

knz commented Jun 27, 2018

@nstewart @awoods187 Please provide example code that reproduces the error

@knz knz added the C-investigation Further steps needed to qualify. C-label will change. label Jun 27, 2018
@knz knz assigned awoods187 and unassigned knz Jun 30, 2018
@knz
Copy link
Contributor

knz commented Jun 30, 2018

@awoods187 I'm sorry this issue is currently either improperly titled or lacking description. The sentence "ORM expect savepoints to be the first statement in a transaction" is factually false - the reason why you observe an error is instead that your program is trying to use a non-savepoint statement first.

We can't really do anything with this issue as-is. Please provide more information.

@knz knz added the X-unactionable This was closed because it was unactionable. label Jun 30, 2018
@nstewart
Copy link
Contributor

nstewart commented Jun 30, 2018 via email

@nstewart
Copy link
Contributor

SQLAlchemy + Postgres output

2018-06-30 07:54:31,062 INFO sqlalchemy.engine.base.Engine select version()
2018-06-30 07:54:31,063 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,065 INFO sqlalchemy.engine.base.Engine select current_schema()
2018-06-30 07:54:31,065 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,066 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2018-06-30 07:54:31,066 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,067 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2018-06-30 07:54:31,067 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,068 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2018-06-30 07:54:31,068 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,069 INFO sqlalchemy.engine.base.Engine select relname from pg_class c join pg_namespace n on n.oid=c.relnamespace where pg_catalog.pg_table_is_visible(c.oid) and relname=%(name)s
2018-06-30 07:54:31,069 INFO sqlalchemy.engine.base.Engine {'name': u'accounts'}
2018-06-30 07:54:31,073 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2018-06-30 07:54:31,073 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 07:54:31,073 INFO sqlalchemy.engine.base.Engine {'id_1': 1}
2018-06-30 07:54:31,075 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 07:54:31,075 INFO sqlalchemy.engine.base.Engine {'id_1': 2}
2018-06-30 07:54:31,076 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
2018-06-30 07:54:31,076 INFO sqlalchemy.engine.base.Engine {}
1 500
2 1300
2018-06-30 07:54:31,077 INFO sqlalchemy.engine.base.Engine SAVEPOINT cockroach_restart
2018-06-30 07:54:31,077 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,078 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 07:54:31,078 INFO sqlalchemy.engine.base.Engine {'id_1': 1}
2018-06-30 07:54:31,080 INFO sqlalchemy.engine.base.Engine UPDATE accounts SET balance=%(balance)s WHERE accounts.id = %(accounts_id)s
2018-06-30 07:54:31,080 INFO sqlalchemy.engine.base.Engine {'balance': 400, 'accounts_id': 1}
2018-06-30 07:54:31,081 INFO sqlalchemy.engine.base.Engine UPDATE accounts SET balance=(accounts.balance + %(balance_1)s) WHERE accounts.id = %(id_1)s
2018-06-30 07:54:31,081 INFO sqlalchemy.engine.base.Engine {'balance_1': 100, 'id_1': 2}
2018-06-30 07:54:31,082 INFO sqlalchemy.engine.base.Engine RELEASE SAVEPOINT cockroach_restart
2018-06-30 07:54:31,082 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:54:31,082 INFO sqlalchemy.engine.base.Engine COMMIT
2018-06-30 07:54:31,083 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2018-06-30 07:54:31,084 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
2018-06-30 07:54:31,084 INFO sqlalchemy.engine.base.Engine {}
1 400
2 1400
2018-06-30 07:54:31,085 INFO sqlalchemy.engine.base.Engine ROLLBACK

SQLAlchemy + CockroachDB output

2018-06-30 07:55:20,094 INFO sqlalchemy.engine.base.Engine select current_schema()
2018-06-30 07:55:20,094 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:55:20,096 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2018-06-30 07:55:20,096 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:55:20,097 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2018-06-30 07:55:20,097 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:55:20,099 INFO sqlalchemy.engine.base.Engine SHOW TABLES
2018-06-30 07:55:20,099 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 07:55:20,104 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2018-06-30 07:55:20,106 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 07:55:20,106 INFO sqlalchemy.engine.base.Engine {'id_1': 1}
2018-06-30 07:55:20,109 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 07:55:20,109 INFO sqlalchemy.engine.base.Engine {'id_1': 2}
2018-06-30 07:55:20,111 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
2018-06-30 07:55:20,111 INFO sqlalchemy.engine.base.Engine {}
1 600
2 650
2018-06-30 07:55:20,113 INFO sqlalchemy.engine.base.Engine SAVEPOINT cockroach_restart
2018-06-30 07:55:20,113 INFO sqlalchemy.engine.base.Engine {}
Traceback (most recent call last):
  File "main.py", line 79, in <module>
    run_transaction(session, lambda conn: transfer_funds(session, 1, 2, 100))
  File "main.py", line 43, in run_transaction
    session.execute("SAVEPOINT cockroach_restart")
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 1176, in execute
    bind, close_with_result=True).execute(clause, params or {})
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
    return meth(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
    exc_info
  File "/Library/Python/2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.InternalError: (psycopg2.InternalError) SAVEPOINT COCKROACH_RESTART needs to be the first statement in a transaction
 [SQL: 'SAVEPOINT cockroach_restart'] (Background on this error at: http://sqlalche.me/e/2j85)

The repro code can be found here.

@nstewart nstewart removed the X-unactionable This was closed because it was unactionable. label Jun 30, 2018
@nstewart nstewart changed the title ORM expect savepoints to be the first statement in a transaction SAVEPOINT COCKROACH_RESTART needs to be the first statement in a transaction Jun 30, 2018
@knz
Copy link
Contributor

knz commented Jun 30, 2018

The sqlalchemy code is missing a txn begin call, so it is attempting to put all the SQL queries into a single (implicit) transaction.
Instead of this:

print_account_balances()
run_transaction(session, lambda conn: transfer_funds(session, 1, 2, 100))
print_account_balances()

try this:

print_account_balances()
session.begin()
run_transaction(session, lambda conn: transfer_funds(session, 1, 2, 100))
print_account_balances()

Arguably our doc that explains how to use run_transaction should warn the programmer that this is necessary.

@knz
Copy link
Contributor

knz commented Jun 30, 2018

@knz knz changed the title SAVEPOINT COCKROACH_RESTART needs to be the first statement in a transaction docs: clarify how sqlalchemy uses SQL transactions and how to combine with client retries Jun 30, 2018
@knz knz added the docs-todo label Jun 30, 2018
@knz
Copy link
Contributor

knz commented Jun 30, 2018

@rmloveland if you could have a look at this that would be swell. thanks!

@knz
Copy link
Contributor

knz commented Jun 30, 2018

@rmloveland also as Andy suspects, this problem may not be specific to SQLAlchemy. I think we should call out in more places that any client-side library or framework is more likely than not to start transactions implicitly, the client-directed retry system does not support this (well), so the two must be combined with care (making txns explicit).

@nstewart
Copy link
Contributor

nstewart commented Jun 30, 2018

session.begin() doesn't work. I've tried that and a few other things from their docs/stackoverflow. For anyone following along, just doing normal transaction rollbacks without savepoints was the workaround I ended up with in the short term.

Nathaniels-MacBook-Pro:savepointissue nate$ python main.py cockroachdb://root@localhost:26257/bank?sslmode=disable
2018-06-30 09:49:15,195 INFO sqlalchemy.engine.base.Engine select current_schema()
2018-06-30 09:49:15,195 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 09:49:15,199 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2018-06-30 09:49:15,199 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 09:49:15,201 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2018-06-30 09:49:15,201 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 09:49:15,204 INFO sqlalchemy.engine.base.Engine SHOW TABLES
2018-06-30 09:49:15,204 INFO sqlalchemy.engine.base.Engine {}
2018-06-30 09:49:15,210 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2018-06-30 09:49:15,210 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 09:49:15,210 INFO sqlalchemy.engine.base.Engine {'id_1': 1}
2018-06-30 09:49:15,216 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
WHERE accounts.id = %(id_1)s
2018-06-30 09:49:15,216 INFO sqlalchemy.engine.base.Engine {'id_1': 2}
2018-06-30 09:49:15,218 INFO sqlalchemy.engine.base.Engine SELECT accounts.id AS accounts_id, accounts.balance AS accounts_balance
FROM accounts
2018-06-30 09:49:15,218 INFO sqlalchemy.engine.base.Engine {}
1 600
2 650
Traceback (most recent call last):
  File "main.py", line 79, in <module>
    session.begin()
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 857, in begin
    "A transaction is already begun.  Use "
sqlalchemy.exc.InvalidRequestError: A transaction is already begun.  Use subtransactions=True to allow subtransactions.

There are two things here:

  1. short term, as you mentioned, we need to document how to make this work. I'm pretty sure we can drill into sqlalchemy docs and get to a solution. I just didn't have time.
  2. (the one I'm most interested in). What do we need to do so the initial example works out of the box without modification (since it works fine for postgres)? Will the ticket you filed (sql: support rolling back nested transactions containing schema changes #10735) solve this?

@bdarnell
Copy link
Contributor

SQLAlchemy's session model has never made much sense to me. I believe the best way to use it is to create a new session for each transaction (if you don't, it may have implicit caches that obscure what's going on and create transactional inconsistencies). Our official run_transaction wrapper takes a sessionmaker instead of a session for this reason. (You should be using this instead of writing your own, BTW. We need to document this)

If you do want to reuse sessions, you must commit the implicit transaction before starting a new transaction using savepoints. In add_account, you unconditionally use the session, but you only commit if no account was found. In the other branch the transaction is left open, leading to this problem.

See also cockroachdb/sqlalchemy-cockroachdb#52

@nstewart
Copy link
Contributor

Thanks @bdarnell that works. While I have you, can you help me understand if general savepoint support would make this work out of the box (like the above demo app does when connected to pg)? In your blog you say We don’t support nested transactions, but we do support the special case of a single SAVEPOINT that covers the entire transaction:.

Would adding that alleviate this whole class of ORM problems?

@bdarnell
Copy link
Contributor

bdarnell commented Jul 1, 2018

Would adding that alleviate this whole class of ORM problems?

Generalized savepoints are a useful feature that some ORMs can take advantage of, but it doesn't exactly help with the current issue. The special cockroach_restart savepoint (which is used to have better starvation-avoidance for conflicting transactions) must encompass the entire transaction to do its job. The problem here is that the SQLAlchemy Session object is obscuring the true boundaries of your transaction from you, and that's not something we can fix on the server side.

SAVEPOINT cockroach_restart is admittedly a hack - if we had some other way to identify that one transaction is an attempt to retry another, we could get rid of it. However, as far as I can see this is impossible without even more invasive client-side modifications.

To the extent that we are able to recommend tools instead of meeting users with the tools they are already using, we should promote low-magic tools like the SQLAlchemy query builder instead of its ORM and Sessions (and analogous choices in other languages, such as jOOQ instead of Hibernate for java).

@knz knz moved this from Triage to Candidate next milestone in (DEPRECATED) SQL Front-end, Lang & Semantics Jul 5, 2018
@knz knz moved this from Candidate next milestone to Current milestone in (DEPRECATED) SQL Front-end, Lang & Semantics Jul 5, 2018
@knz knz added this to the 2.1 milestone Jul 23, 2018
@knz knz modified the milestones: 2.1, 2.1.x Oct 11, 2018
@rmloveland
Copy link
Collaborator

Closing since this was addressed via docs issue #3315.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL C-investigation Further steps needed to qualify. C-label will change. docs-todo
Projects
No open projects
Development

No branches or pull requests

5 participants