
sql: provide finer-grained client control over txn 2PC #26780

Closed
wzrdtales opened this issue Jun 18, 2018 · 12 comments
Labels
A-sql-execution Relating to SQL execution. C-investigation Further steps needed to qualify. C-label will change. O-community Originated from the community S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.
Milestone

Comments

@wzrdtales

wzrdtales commented Jun 18, 2018

BUG REPORT

Transactions don't fail even when they read data they should not be allowed to read. This is a huge problem when you actually want to use transactions and rely on them failing safely when reading something that, in a locking scenario, should be protected.

I produced a little test case to reproduce this; you will need at least Node.js 8 to execute it.

Executing it results in the following:

[INFO] No migrations to run
[INFO] Done
{ driver: 'cockroachdb',
  host: 'localhost',
  port: 26257,
  user: 'root',
  database: 'somedb' }
start 0
start 3
start 1
start 2
start 4
0 'survived here with nonce' '1'
3 'survived here with nonce' '2'
2 'survived here with nonce' '2'
start 2
2 'survived here with nonce' '3'
4 'survived here with nonce' '4'
1 'survived here with nonce' '2'
start 1
1 'survived here with nonce' '5'

So what we can see here is that runs 3 and 2 both read the nonce 2, and they did not fail before the commit was submitted. This effectively impacts any other call made in between, within this transaction. In many scenarios this behaviour is not wanted and should at least be controllable.

Here is the repo to reproduce (subfolder unserialized): https://gitlab.com/wzrdtales/cockroachdb-nukes

This is a critical issue, since it means transactions are limited to queries and cannot protect business logic, which makes them completely useless in many use cases.

@wzrdtales
Author

One example where this completely throws everything overboard:

select something;
  wait for service execution;
update local state;

if the transaction now fails, the external service execution has been started anyway and the state is permanently corrupt.

@wzrdtales
Author

I guess this is related to #6583

@tbg tbg changed the title serializable transaction does not fail before commit is submitted kv: fail transaction early instead of at commit time Jun 18, 2018
@tbg tbg changed the title kv: fail transaction early instead of at commit time sql: fail transaction early instead of at commit time Jun 18, 2018
@tbg tbg added the A-sql-execution Relating to SQL execution. label Jun 18, 2018
@tbg
Member

tbg commented Jun 18, 2018

Looking at your code, a summary of the workload is

# each client:
BEGIN;
SELECT "nonce" FROM "nonce" WHERE "id" = 0;
UPDATE "nonce" SET "nonce" = "nonce" + 1 WHERE "id" = 0;
 console.log(i, "survived here with nonce", nonce.nonce);
COMMIT;

You sometimes see the update succeed and then the commit fails, whereas you'd prefer to see conflicts earlier, in the UPDATE statement.
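The race described above can be sketched without a real database. The following is a minimal, purely illustrative Node.js simulation of commit-time conflict detection: reads snapshot a version, writes are buffered, and the version check (and hence the failure) happens only at commit. None of these class or method names are CockroachDB APIs.

```javascript
// Illustrative only: two "transactions" read the same nonce, both stage
// nonce + 1, and the conflict surfaces only at commit time via a version
// check, mirroring deferred conflict detection.

class Row {
  constructor(value) { this.value = value; this.version = 0; }
}

class Txn {
  constructor(row) { this.row = row; }
  read() {               // snapshot read: remember the version we saw
    this.readVersion = this.row.version;
    return this.row.value;
  }
  write(value) {         // buffer the write; nothing can fail here yet
    this.pending = value;
  }
  commit() {             // deferred check: fail only now if someone else wrote
    if (this.row.version !== this.readVersion) {
      throw new Error('restart transaction');
    }
    this.row.value = this.pending;
    this.row.version++;
  }
}

const nonce = new Row(1);
const t1 = new Txn(nonce);
const t2 = new Txn(nonce);

t1.write(t1.read() + 1);   // t1 reads 1, stages 2
t2.write(t2.read() + 1);   // t2 also reads 1, stages 2 -- no error yet

t1.commit();               // first committer wins
let aborted = false;
try { t2.commit(); } catch (e) { aborted = true; }  // fails only at COMMIT

console.log(nonce.value, aborted);  // 2 true
```

Both clients "survive" their UPDATE with the same nonce, exactly as in the reported output, and only the second COMMIT fails.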

In CockroachDB, conflicts are usually deferred. There are two main reasons for this (ignoring historical ones):

  1. allowing a txn to go through with its complete write set prevents many starvation scenarios
  2. it is more effective in reducing the latency required (we're exploiting this in a proposal called async writes).

You're right that sometimes you may explicitly want to opt out of that behavior.

Note that this could be a footgun. For example, even if the update succeeds, your transaction could still be aborted (by another, higher-priority transaction removing your write) before it gets to commit. Basically, you always want to defer side effects until after you have committed (which I assume is where you're coming from). So if we implemented this, it would only abort you eagerly "most of the time" on the UPDATE when you're not going to commit; it's not guaranteed to always do so.
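The "defer side effects until after you have committed" advice pairs naturally with a client-side retry loop. Below is a hedged sketch in plain Node.js; `withRetries` and `fakeTxn` are invented names, and the aborted-then-successful transaction is faked in-process rather than driven by a real database.

```javascript
// Sketch of the pattern: keep external side effects out of the
// transaction body and fire them exactly once, after COMMIT succeeds.

function withRetries(txnBody, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return txnBody();                      // may throw a retryable error
    } catch (err) {
      if (attempt === maxAttempts) throw err; // give up after the last attempt
    }
  }
}

// Fake transaction: aborted on the first attempt (as a conflicting
// writer would cause), succeeds on the second.
let attempts = 0;
function fakeTxn() {
  attempts++;
  if (attempts === 1) throw new Error('restart transaction');
  return 5;                                   // the committed nonce value
}

const sideEffects = [];
const nonce = withRetries(fakeTxn);
// Only now, after the commit is durable, notify the external service.
sideEffects.push(nonce);

console.log(attempts, sideEffects.length);   // 2 1
```

The point is that the side effect runs once, with the value that actually committed, rather than once per (possibly aborted) attempt.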

Do you have a concrete example of what you're trying to achieve?

@wzrdtales
Author

wzrdtales commented Jun 18, 2018

I have a huge microservice system in front of me. One of the examples where this is critical: I want a second service to store some information that is only true for this single transaction, and to wait for it to be saved, or otherwise roll back. This is currently not possible at all, and not fail-safe. The example given at the top illustrates this perfectly: I read WRONG information from the table, send it off to something else to act upon it, maybe even save it, and am not able to roll it back.

In the example of the nonce, your only option is to get the nonce outside of the transaction, i.e. after it has executed. This is terrible, since your transaction won't be rolled back anymore should your service die unexpectedly (for example due to a power cut); you still run with the wrong value, and depending on how critical a correct state is, it will cause even greater damage along the other services and consumers.

@wzrdtales
Author

So the problem is really the incorrect information that is read from the database and might be utilized elsewhere. There is currently no safe way to do this, which means you can't trust your data in some concurrency schemes.

@tbg
Member

tbg commented Jun 19, 2018

A transaction gives you a consistent view of the data while it's running, and guarantees that the data has been written (in a serializable way) if and only if it commits successfully. What you are doing is pretending that the transaction will commit before you have done so. Doing so is incorrect not only in CRDB, but in many other data stores as well (a few may give you the guarantee that you want, but probably not intentionally).

For example, take a look at the "black and white" Postgres example. The first transaction runs an UPDATE and gets an error on COMMIT. This is very similar to the situation here.

What's a bit different in CRDB is that we intentionally defer many restart errors until the final COMMIT operation. It's conceivable that this should be opt-out; it isn't currently. However, it doesn't seem that that would be what you want as you're looking to get some guarantee about your writes in the absence of an abort that just won't hold.

@knz
Contributor

knz commented Jul 9, 2018

@wzrdtales from your problem description it looks like what you want is a two-phase commit across your db client microservice and the other microservice. This might be helped by XA distributed transactions (#22329); can you confirm?

(we're looking for realistic scenarios to motivate further work on #22329.)

@knz knz added this to Triage in (DEPRECATED) SQL execution via automation Jul 9, 2018
@knz knz added C-investigation Further steps needed to qualify. C-label will change. O-community Originated from the community S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. labels Jul 9, 2018
@wzrdtales
Author

@knz Actually, yes. Having controllable "locks", or at least lock-like behaviour, would be an easy way to achieve what I am looking for. A transaction that a second microservice can join would be the perfect solution, though.

@wzrdtales
Author

But I wouldn't need Open XA for that and by all means I always avoid ODBC.

@knz
Contributor

knz commented Jul 10, 2018

Thanks. Yes I understand you just need more control over the transaction lifecycle.

@knz knz changed the title sql: fail transaction early instead of at commit time sql: provide finer-grained control over txn 2PC Jul 10, 2018
@knz knz changed the title sql: provide finer-grained control over txn 2PC sql: provide finer-grained clkent control over txn 2PC Jul 10, 2018
@knz knz changed the title sql: provide finer-grained clkent control over txn 2PC sql: provide finer-grained client control over txn 2PC Jul 10, 2018
@wzrdtales
Author

Yes, basically. Two things are interesting here. From the microservice side, transactions that can be joined across services on the same DB would be quite powerful. Since you need to isolate data ownership in microservice systems, you often find yourself in situations where you may end up with an inconsistent state.

On the other side there is your described scenario, where you need ACID semantics across different services. My nonce example is actually a real-world one: it is relevant for the Ethereum blockchain, since you need to manage the nonce externally to effectively keep your service itself concurrent (the Ethereum design around the whole nonce thing is anything but pretty, though; not a huge fan of it). For that second scenario something like Open XA could actually be a good way, but not Open XA itself. In this case the example instead ignores the success; error correction takes place when something goes south, which defers the detection and handling of the error, but in that case it is asynchronous anyway.

@knz knz added this to Triage in (DEPRECATED) SQL execution via automation Jul 21, 2018
@tbg tbg added this to the Later milestone Jul 22, 2018
@jordanlewis jordanlewis moved this from Triage to Lower Priority Backlog in (DEPRECATED) SQL execution Oct 17, 2018
@jordanlewis jordanlewis added this to Incoming in KV via automation May 7, 2019
@jordanlewis jordanlewis removed this from Lower Priority Backlog in (DEPRECATED) SQL execution May 7, 2019
@jordanlewis jordanlewis removed this from Incoming in KV May 7, 2019
@nvanbenschoten
Member

A transaction gives you a consistent view of the data while it's running, and guarantees that the data has been written (in a serializable way) if and only if it commits successfully. What you are doing is pretending that the transaction will commit before you have done so. Doing so is incorrect not only in CRDB, but in many other data stores as well (a few may give you the guarantee that you want, but probably not intentionally).

I agree with this. If a transaction has not reached its atomicity point, it cannot be assumed to have succeeded. So the only option for providing the guarantee you are asking for here is for all transactions to serialize before/while evaluating. We now support SELECT FOR UPDATE, which you mentioned above and which provides more control over locking. However, even with more control over locking, the hard guarantee that you want is still fundamentally impossible to provide in a distributed system that tolerates node failures and network partitions.
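To illustrate what SELECT FOR UPDATE changes, here is an in-process Node.js sketch (not a real client, and not CockroachDB's actual locking implementation): the lock is taken at read time, so a conflicting transaction fails on its read instead of surviving until COMMIT. All names are invented for illustration.

```javascript
// Illustrative only: a row that is locked at read time, so the second
// reader conflicts immediately rather than at commit.

class LockedRow {
  constructor(value) { this.value = value; this.lockedBy = null; }
  readForUpdate(txnId) {
    if (this.lockedBy !== null && this.lockedBy !== txnId) {
      // Conflict surfaces here, at the read, not at COMMIT.
      throw new Error(`row locked by txn ${this.lockedBy}`);
    }
    this.lockedBy = txnId;
    return this.value;
  }
  writeAndCommit(txnId, value) {
    this.value = value;
    this.lockedBy = null;   // release the lock on commit
  }
}

const row = new LockedRow(1);
const v = row.readForUpdate('t1');   // t1 locks the row on read
let earlyConflict = false;
try {
  row.readForUpdate('t2');           // t2 conflicts immediately
} catch (e) {
  earlyConflict = true;
}
row.writeAndCommit('t1', v + 1);     // t1 commits; lock released

console.log(row.value, earlyConflict);  // 2 true
```

Compare this with the deferred-detection behaviour in the original report: here the losing transaction never "survives" its read with a stale nonce, which is the control the issue asks for, though (as noted above) even read-time locking cannot guarantee the eventual commit.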

I'm going to close this in favor of #22329. If we do decide to expose finer-grained control over transactions such that ACID semantics can be preserved across multiple database engines, I don't think we'd want to stray from the existing standard.

4 participants