feat(dbapi): use inline begin to eliminate BeginTransaction RPC#1502
feat(dbapi): use inline begin to eliminate BeginTransaction RPC#1502waiho-gumloop wants to merge 1 commit intogoogleapis:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Summary of ChangesHello @waiho-gumloop, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request optimizes transaction handling in the Spanner DBAPI by leveraging the existing inline begin mechanism. It removes an unnecessary explicit Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable optimization by removing the explicit BeginTransaction RPC call in transaction_checkout. Instead, it leverages the existing inline begin functionality, where the transaction is initiated with the first SQL execution. This change effectively reduces latency by saving one network round-trip per transaction. The modification is well-supported by a new unit test ensuring begin() is not called, and the detailed description provides a thorough safety analysis, confirming the change is sound.
f176484 to
880081e
Compare
Fixes #1503
Summary
Connection.transaction_checkout()currently callsTransaction.begin()explicitly before the first query, which sends a standaloneBeginTransactiongRPC RPC. This is unnecessary because theTransactionclass already supports inline begin — piggybackingBeginTransactiononto the firstExecuteSql/ExecuteBatchDmlrequest viaTransactionSelector(begin=...).This PR removes the explicit
begin()call, letting the existing inline begin logic inexecute_sql(),execute_update(), andbatch_update()handle transaction creation. This saves one gRPC round-trip per transaction (~16ms measured on the emulator).What changed
google/cloud/spanner_dbapi/connection.py—transaction_checkout():self._transaction.begin()on L413_transaction_id=Nonetests/unit/spanner_dbapi/test_connection.py:test_transaction_checkout_does_not_call_beginto assertbegin()is not calledWhy this is safe
The inline begin code path already exists and is battle-tested —
Session.run_in_transaction()creates aTransactionwithout callingbegin()and relies on the same inline begin logic (session.py L566).Specific safety analysis:
transaction_checkout()callers always execute SQL immediately: It's only called fromrun_statement()→execute_sql()andbatch_dml_executor→batch_update(). Both set_transaction_idvia inline begin before any commit/rollback path.execute_sql/execute_update/batch_updatehandle_transaction_id is None: They acquire a lock, use_make_txn_selector()which returnsTransactionSelector(begin=...), and store the returned_transaction_id(transaction.py L612-L623).rollback()handles_transaction_id is None: Skips the RPC — correct when no server-side transaction exists (transaction.py L163).commit()handles_transaction_id is None: Falls back to_begin_mutations_only_transaction()for mutation-only transactions (transaction.py L263-L267).Retry mechanism is compatible:
_set_connection_for_retry()resets_spanner_transaction_started=False, so replayed statements go throughtransaction_checkout()again, create a freshTransaction, and use inline begin.PEP 249 conformance
This change is fully conformant with PEP 249 (DB-API 2.0). The spec does not define a
begin()method — transactions are implicit. The PEP author clarified on the DB-SIG mailing list that "transactions start implicitly after you connect and after you call.commit()or.rollback()", and the mechanism by which the driver starts the server-side transaction is an implementation detail. Deferring the server-side begin to the first SQL execution (as psycopg2 and other mature DB-API drivers do) is the standard approach. The observable transactional semantics — atomicity betweencommit()/rollback()calls — are unchanged.Performance impact
Before (4 RPCs per read-write transaction):
After (3 RPCs per read-write transaction):
Measured ~16ms savings per transaction on the Spanner emulator.
Context
This optimization was identified while profiling SQLAlchemy/DBAPI performance against Spanner. The DBAPI was created ~2 years before inline begin support was added to the Python client library (inline begin landed in PR #740, Dec 2022). The
run_in_transactionpath was updated to use inline begin, buttransaction_checkoutwas not.Test plan
tests/unit/spanner_dbapi/)tests/system/test_dbapi.py)begin()is not called bytransaction_checkout()