Replies: 4 comments 4 replies
-
(Looks like Discussions doesn't support internal links :( ) Beginning of the discussion: #1708 (comment) Now as I look at it more, I like
-
This is great, thanks for starting the discussion.
The above strikes me as the most Pythonic approach. Nesting
I'm not sure I understand the point in that section. If you didn't enclose your reads and your writes in a transaction, your code is wrong, regardless of whether the statement is repeated. Thus, I don't think this is a good argument against adding an option to automatically repeat single statements. Compare:

```python
await pool.execute('INSERT ...')
```

and

```python
async with pool.atomic() as atomic:
    async for transaction in atomic:
        await transaction.execute('INSERT')
```
-
Thank you, Paul, for summarizing this. It's an excellent start. I'd like us to continue the discussion and then create an RFC out of this (shouldn't be too much work given the amount of content you already have here). A few comments:
Out of all proposed variants I personally like
Note that applying the
While still reading your proposal I've immediately thought of something like this. There are a few fundamental problems with it though:
The reverse API ( So it's basically a choice between the simple-looking:

```python
for tx in db.atomic():  # or 'async for' for better UX
    async with tx:
        await tx.query(...)
```

and a more explicit:

```python
async with db.atomic() as atomic:
    async for tx in atomic:
        async with tx:
            await tx.query(...)
```

(I'm using I like the former variant (proposed by Paul) better, even though it's a hack. Given that this looks like the only sane API for Python, I'd like to propose to use:

```python
async for tx in db.retry():
    async with tx:
        # ... code ...
```
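For concreteness, here is a minimal runnable sketch of how such a `retry()` async iterator could drive that loop. All names here (`Retry`, `Transient`, `_Attempt`) are hypothetical stand-ins, not the actual client implementation; a real client would issue `BEGIN`/`COMMIT` on the wire and apply backoff between attempts:

```python
import asyncio

class Transient(Exception):
    """Stand-in for a transient serialization/deadlock error."""

class Retry:
    """Async iterator that yields transaction attempts until one commits."""
    def __init__(self, attempts=3):
        self.attempts = attempts
        self.done = False

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.done or self.attempts == 0:
            raise StopAsyncIteration
        self.attempts -= 1
        return _Attempt(self)

class _Attempt:
    """One attempt; an async context manager playing the `tx` role."""
    def __init__(self, retry):
        self._retry = retry

    async def __aenter__(self):
        return self  # a real client would send BEGIN here

    async def __aexit__(self, exc_type, exc, tb):
        if exc_type is None:
            self._retry.done = True   # a real client would send COMMIT here
            return False
        if issubclass(exc_type, Transient) and self._retry.attempts > 0:
            return True               # swallow the error; the loop retries
        return False                  # out of attempts: propagate

async def main():
    calls = 0
    async for tx in Retry():
        async with tx:
            calls += 1
            if calls < 3:             # fail the first two attempts
                raise Transient("serialization failure")
    return calls

print(asyncio.run(main()))  # → 3 (two transient failures, then success)
```

The key trick is that `__aexit__` returning `True` suppresses the transient error, so the `async for` loop simply proceeds to the next attempt.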
Can you show examples of where we'd need multiple inheritance? Using methods would be preferable in JS, where the
Composability is really the last nail in the old (current) API's coffin. I'm -1 on making
-
I like
Much prefer
-
Motivation
Sometimes PostgreSQL can't apply concurrent transactions and errors out with:

It's expected that the transaction may be repeated and succeed when repeated. A common case where this happens is using the `INSERT ... UNLESS CONFLICT ..` statement in EdgeDB. See the Repeating EdgeQL section for an explanation of why Postgres and EdgeDB can't repeat transactions automatically.

EdgeDB has only `REPEATABLE READ` and `SERIALIZABLE` isolation levels. That means errors like the above may happen more often than with the default `READ COMMITTED` isolation level in Postgres.

As we implement retrying connections on "concurrent update" errors, we also want to handle network failures after `BEGIN TRANSACTION` and before `COMMIT` (we can't reliably know whether the commit itself worked if we sent the last `COMMIT` and got no response). There might be a different number of attempts and different timeout settings, though.
Proposal
General idea of the feature:
We need to introduce a new term. We can't use the term "transaction" for a retryable transaction, because users might not expect "transaction" to be a retryable block. Here is a quick brainstorm on the block name:

- `db.atomic(t => t.execute(..))` is my favorite so far, so I'll use it in examples. It reasonably means a block that is applied "atomically" (as seen by an out-of-transaction observer), but not a "transaction", so one will need to look in the docs to find out what it does specifically the first time they see it.
- `db.mutate(transaction => transaction.execute(..))`
- `db.apply(transaction => transaction.execute(..))`
- `db.unit_of_work(t => t.execute(..))`
- `db.block(t => t.execute(..))`
- `for transaction in db.retry():` (see Alternatives)
- `db.try(transaction => transaction.execute(..))`
- `db.retry_transaction(t => t.execute(..))`
Also, we should rename `pool.transaction` so it isn't the first thing people pick when they need transactions:

- `with db.raw_transaction() as t:` is my (@tailhook) favourite; it also requires the user to find out what a raw transaction is
- `with db.try_transaction() as t:` also good, as in `for _ in range(10): with db.try_transaction(): ...`
- `with db.plain_transaction() as t:`
- `with db.unreliable_transaction() as t:`
- `with db.single_transaction() as t:`
JavaScript API
The JavaScript API is quite straightforward:
Settings for how many retries and which timeouts should live either in `poolConfig` or in `connectOptions`, depending on how we decide `atomic()` should work on a connection object, or if at all (see Open Questions).

We may also allow an option form: `pool.atomic({retries: 5}, t => {...})`, or `pool.with_options({retries: 5}).atomic(t => {...})`
Exceptions API
Errors thrown by the client should have three methods:

- `err.isTransient()` -- for concurrent-update and deadlock errors
- `err.isNetworkFailure()` -- basically is `true` when the connection `is_closed()` after the exception
- `err.isEarlyNetworkFailure()` -- is `true` when the connection was closed before the request was sent. This is needed to retry `conn.query()` in a non-transaction context

And we should ensure that all network errors are caught and transformed into proper error classes having these methods.

We don't provide a catch-all method like `isRetriable`, because network failures must not be caught during `COMMIT`, so users have to handle the two errors a little bit differently:

Python API
The Python API is harder. We can't use the `with` statement any more, because `with` blocks can't be retried.
We probably want to support two APIs, as a decorator of an in-place function:
And as a function call:
Note: this form forwards all the arguments to the function. We may want to special-case options: `db.atomic(my_tx, options=TransactionOptions(**opts))`.

Also see Alternatives.
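As an illustration of the function-call form described above, here is a runnable sketch. `FakeDB` and `TransientError` are hypothetical stand-ins (the real signatures are an open question); the decorator form would wrap the same retry loop:

```python
import asyncio

class TransientError(Exception):
    """Stand-in for a retryable (concurrent-update/deadlock) error."""

class FakeDB:
    """Hypothetical stand-in for a client pool."""
    def __init__(self, attempts=3):
        self.attempts = attempts

    async def atomic(self, fn, *args, **kw):
        # Function-call form: forwards all arguments to `fn`,
        # retrying the whole block on transient errors.
        for attempt in range(self.attempts):
            try:
                return await fn(self, *args, **kw)  # `self` plays the tx role
            except TransientError:
                if attempt == self.attempts - 1:
                    raise  # out of attempts: propagate

db = FakeDB()
state = {"tries": 0}

async def transfer(tx, amount):
    state["tries"] += 1
    if state["tries"] < 2:           # fail the first attempt
        raise TransientError
    return amount

print(asyncio.run(db.atomic(transfer, 10)))  # → 10, after one retry
```

Forwarding `*args, **kw` is what makes the special-casing of an `options=` keyword necessary, since it must be peeled off before the remaining arguments reach the user's function.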
Exceptions API
Errors thrown by the client should have three methods:

- `err.is_transient()`
- `err.is_network_failure()`
- `err.is_early_network_failure()`

See the JavaScript API section for descriptions.
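A possible shape for these predicates, sketched with a hypothetical `ClientError` that carries boolean flags (the real client may classify errors differently):

```python
class ClientError(Exception):
    """Hypothetical error class using predicate methods instead of subclasses."""
    def __init__(self, msg, *, transient=False, network=False, early=False):
        super().__init__(msg)
        self._transient = transient
        self._network = network
        self._early = early

    def is_transient(self):
        # Concurrent-update and deadlock errors: safe to retry the block.
        return self._transient

    def is_network_failure(self):
        # The connection is closed after the exception.
        return self._network

    def is_early_network_failure(self):
        # The connection was closed before the request was sent,
        # so even a bare conn.query() can be retried safely.
        return self._early and self._network

err = ClientError("connection reset", network=True, early=True)
print(err.is_transient(), err.is_early_network_failure())  # → False True
```

Using predicates keeps the classification orthogonal: one error can be both a network failure and early, without any class hierarchy.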
Open Questions
Transaction on Specific Connection
Do we want `connection.atomic()`, and how should it work?

a) It may only retry on the same connection and fail on disconnect
b) It may reconnect, replace the underlying socket in the connection object, and retry
c) We may only allow transactions on connection pools
Retry Single Queries on Connection Error
To make changing the EdgeDB address or restarting EdgeDB work nicely, we need to retry simple queries on connection errors too:

But there are a couple of issues:

There are a couple of ideas for differentiating read-only and mutable queries:
- `Prepare` (and it's always safe to retry before `Execute` happens)
- `pool.query_atomic(..)` shortcut for `pool.atomic(lambda t: t.query(..))`
- `pool.read_only_query()` method
- `pool.read_only().query()`
- `pool.with_options({readonly: true}).query()`

Note: methods (2-6) are also helpful for working with primary/replica installations. But probably only the last two would allow full power, as they allow `pool.read_only(primary=true)` (i.e. in case you need a read-only transaction that can't go to a replica).

This issue can be solved by a later RFC.
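For instance, the `query_atomic` shortcut could be a thin wrapper over the retryable block; a runnable sketch with a hypothetical `Pool` (the `atomic` body here is a no-op placeholder where the real retry loop would live):

```python
import asyncio

class Pool:
    """Hypothetical pool illustrating the query_atomic shortcut."""
    async def atomic(self, fn):
        # A real implementation would retry `fn` on transient errors here.
        return await fn(self)

    async def query(self, q):
        return f"result of {q}"

    async def query_atomic(self, q):
        # Shortcut: wrap a single query in a retryable block.
        return await self.atomic(lambda t: t.query(q))

print(asyncio.run(Pool().query_atomic("SELECT 1")))  # → result of SELECT 1
```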
Learning Curve
This steepens the learning curve, but:

So while steepening the learning curve, we fix heisenbugs and simplify operations.
Failure Injection
I propose the following enabled by default:
Collect statistics on how many queries were executed in the previous second, and on each new request trigger a failure with probability `1/n`, where `n` is the number of requests in the previous second. We still need to figure out whether `n` counts queries, transactions, or mutable queries/transactions (and have a list of exceptions, perhaps: dump+restore+migrations).

The idea is that there will be ~1 failure per second. So on a local instance when testing manually, it would hit almost every request (which is fine, as repeating them shouldn't be prohibitively costly). But under a huge load of thousands of requests per second, 1 failure per second doesn't influence anything, so even for production and/or benchmarks this is fine.
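The `1/n` rule can be sketched as follows. `FailureInjector` is hypothetical; a real server would maintain a sliding per-second counter and use its own RNG:

```python
import random

class FailureInjector:
    """Inject roughly one artificial failure per second, regardless of load."""
    def __init__(self, rng=random.random):
        self.prev_second_count = 0   # requests observed in the previous second
        self.rng = rng

    def should_fail(self):
        n = max(self.prev_second_count, 1)
        # Probability 1/n per request → expected ~1 failure per second:
        # n requests/second * (1/n) failures/request = 1 failure/second.
        return self.rng() < 1.0 / n

# Deterministic RNG for the example: rng() = 0.0 always triggers.
inj = FailureInjector(rng=lambda: 0.0)
inj.prev_second_count = 1000
print(inj.should_fail())  # → True (0.0 < 1/1000)
```

The expected-value argument is the whole point: the absolute failure rate stays constant at ~1/second whether the instance serves one request or ten thousand.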
I think it should be disabled by an explicit command-line argument like `--disable-failure-injection`, but might be tweaked with configuration settings?

Alternatives
Python Variable Propagation
The function-call API may not support arbitrary `*args, **kw`, and instead rely on `partial`:

In this case we should make the `transaction` argument a keyword argument, because using a positional argument would make optional arguments impossible (skipping an argument would shift the `transaction` arg), and adding another argument during refactoring is very error-prone.

Python Loop/With API
We may also want a `for`/`with` API:

Or the reverse:

There are a number of pros and cons of this compared to the decorator:
Exception API
Instead of `is_transient` and `is_network_failure` we could have subclasses that are tested with `isinstance`/`instanceof`. But this may eventually need multiple inheritance. Multiple inheritance doesn't work in JavaScript. And multiple inheritance and deep class hierarchies are rarely good. And at some point we may want to introduce other helpers like `is_retriable`; that certainly doesn't work without multiple inheritance. (@tailhook: but maybe I like it because it is the Rust way)

Repeating EdgeQL
One may wonder why we can't collect all the queries in the client (or even on the server) and retry them automatically.

The problem is that sometimes writes depend on previous reads:
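As an illustration (hypothetical schema, with a dict standing in for the database), logic like the following cannot be replayed query-by-query, because the `UPDATE` depends on the balance read earlier:

```python
def withdraw(tx, user_id, amount):
    """Read-then-write logic: the write depends on the earlier read.

    If the server replayed only the recorded queries, the UPDATE would be
    re-sent even when the re-read balance no longer passes the `if` check.
    Retrying the whole client-side block re-evaluates the condition.
    """
    balance = tx["balance"]               # SELECT balance FROM users WHERE ...
    if balance >= amount:                 # decision made in client code
        tx["balance"] = balance - amount  # UPDATE users SET balance = ...
        return True
    return False

tx = {"balance": 50}
print(withdraw(tx, 1, 30), tx["balance"])  # → True 20
print(withdraw(tx, 1, 30), tx["balance"])  # → False 20 (insufficient funds)
```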
If two transactions updating money happen to run concurrently, it's possible that the user ends up with a negative balance, even though the code suggests that can't happen (when replaying recorded queries we don't check the `if` again). But if we retry the whole block of code, it works correctly.

Enabling Retries in Connection Options
At least for JavaScript we could keep the API as is, and then use connection configuration to introduce retries:
There are a few problems with this approach:

- `with db.transaction()` as we allow now.
- `transaction` on a connection object: reconnecting on network failures would be an issue

Keep `transaction` as is, but add a Helper for Retrying

The problem with this approach is that it is hard to teach using `atomic` when raw transactions "work on my laptop". However, this is somewhat alleviated by failure injection.