Orleans Transactions #3369

Closed · jason-bragg opened this issue Aug 30, 2017 · 21 comments
Labels: stale (Issues with no activity for the past 6 months)

@jason-bragg
Contributor

jason-bragg commented Aug 30, 2017

Over the course of the next few weeks we'll be introducing a series of code changes to make an implementation of Orleans Transactions developed by Microsoft Research (MSR) publicly available. This initial version of Orleans Transactions is based on this paper from @philbe and Tamer Eldeeb, with the exception that it will use timestamp-based concurrency control instead of locking, and dependency injection (via the Facet System) instead of inheritance.
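
To give a feel for the facet-based approach, here's a rough sketch of what a transactional grain could end up looking like. The attribute, interface, and namespace names below (ITransactionalState<T>, [TransactionalState], [Transaction(...)]) are illustrative assumptions about the direction, not the final API:

```csharp
// Illustrative sketch only; attribute/interface/namespace names are assumptions, not the final API.
using System;
using System.Threading.Tasks;
using Orleans;
using Orleans.Transactions.Abstractions;

[Serializable]
public class Balance
{
    public decimal Value { get; set; }
}

public interface IAccountGrain : IGrainWithGuidKey
{
    // Join the caller's transaction if one exists, otherwise start a new one.
    [Transaction(TransactionOption.Required)]
    Task Deposit(decimal amount);

    [Transaction(TransactionOption.Required)]
    Task Withdraw(decimal amount);
}

public class AccountGrain : Grain, IAccountGrain
{
    private readonly ITransactionalState<Balance> balance;

    // The transactional state is injected as a facet through the constructor,
    // instead of being exposed via a transactional grain base class.
    public AccountGrain(
        [TransactionalState("balance")] ITransactionalState<Balance> balance)
    {
        this.balance = balance;
    }

    public Task Deposit(decimal amount) =>
        balance.PerformUpdate(b => b.Value += amount);

    public Task Withdraw(decimal amount) =>
        balance.PerformUpdate(b => b.Value -= amount);
}
```

The point of the facet is that AccountGrain stays a plain class with injected state rather than inheriting from a transactional base grain.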

Related work:
Facet System - #3223
Orleans transaction core - #3358
Transaction Orchestration - #3382
Transactional State - #3397
Transaction manager tests - #3473
Reusable storage bridge - #3360
Orleans.Transactions.Azure - #3426
Simple account transfer transaction example - #3653

Upcoming:
Design Documentation
ITransactionalJournaledState Facet + testing
User Documentation

@galvesribeiro
Member

galvesribeiro commented Aug 31, 2017

Thanks for the issue explaining the steps.

with the exception that it will use timestamp-based concurrency control instead of locking, and dependency injection (via the Facet System) instead of inheritance.

I understand, and agree that we should avoid inheritance and use DI/facets to make it work. I dream of a day when we will have (almost) POCO grain classes.

However, I don't understand the change from locking to timestamps. Are you saying that you are using optimistic concurrency in this implementation instead of the weak lock we have in the first phase of 2PC? If that is the case, doesn't it break the "new" strategy advocated in @philbe's paper? Can you elaborate on that? Thanks

@galvesribeiro
Member

Also:

Grain Based Transaction Manager, backed by azure storage.

That gets back to my question on #3358. Are we introducing a new provider interface here? Or will it just be an enhanced version of the current state storage providers?

@ReubenBond
Member

@galvesribeiro this introduces new interfaces, since simple small blob storage (like what we use for grain state) is not sufficient for storing transaction logs.

@philbe
Contributor

philbe commented Aug 31, 2017

@galvesribeiro, regarding timestamp-based concurrency control: yes, this replaces two-phase locking for this first release. It's not optimistic. It just requires that operations on a grain execute in timestamp order, where the timestamp is assigned when the transaction begins. The mechanism rejects an operation on a grain with a smaller timestamp that would invalidate a previously executed operation on the grain with a larger timestamp.

It doesn't break the new strategy. A transaction that modified a grain G and is in the midst of 2PC does not prevent a later transaction from operating on G in parallel with 2PC. In fact, there could be a sequence of such transactions that operate on G, if they execute fast enough and if 2PC is slow enough (due to the latency of the two write operations required by 2PC).
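
For anyone who wants the rejection rule spelled out, it's essentially the classical timestamp-ordering test. A minimal sketch (class and member names are made up for illustration; this is not the Orleans code):

```csharp
// Minimal illustration of basic timestamp-ordering validation (not the Orleans code).
// An operation whose transaction timestamp is older than something the grain has
// already accepted gets rejected, and the transaction aborts or retries.
using System;

public class TimestampOrderingValidator
{
    private long maxReadTimestamp;   // largest tx timestamp that has read this grain
    private long maxWriteTimestamp;  // largest tx timestamp that has written this grain

    public bool TryRead(long txTimestamp)
    {
        // A read older than an already-applied write would observe state it
        // should never have seen; reject it.
        if (txTimestamp < maxWriteTimestamp) return false;
        maxReadTimestamp = Math.Max(maxReadTimestamp, txTimestamp);
        return true;
    }

    public bool TryWrite(long txTimestamp)
    {
        // A write older than an already-performed read or write would invalidate
        // that earlier operation; reject it.
        if (txTimestamp < maxReadTimestamp || txTimestamp < maxWriteTimestamp) return false;
        maxWriteTimestamp = txTimestamp;
        return true;
    }
}
```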

@volcimaster

Seems like this is going to rely on exceptionally accurate timekeeping, which more or less precludes highly distributed work on a given grain G (or subsets of grains) across multiple machines.

What kind of timestamp verification and vetting is going on to ensure that timestamps are "correct" (or, at least, "correct in context") for parallel workloads?

@gabikliot
Contributor

To stress the last comment, could you please describe the timestamp allocation process? I would assume it is done by the tx coordinator. Is there an RPC to the coordinator per tx, or in batches? What happens when the coordinator fails? How do you ensure timestamp order in case of coordinator failover?
All obviously solvable, just curious about your approach.

And finally, @philbe, can you please shed some light on the reason for the change? Were you convinced it is more efficient in our context, or easier to implement?

@philbe
Contributor

philbe commented Sep 5, 2017

@volcimaster The accuracy of timekeeping has no effect on the correctness of timestamp-based concurrency control. However, it can affect performance. It is important that transaction timestamps are usually consistent with the order in which transactions are initiated in real time (even across silos). For example, if a silo consistently assigns timestamps that are smaller than those assigned at other silos, then its transactions' updates on a given grain will frequently be rejected because another transaction with a larger timestamp has already been executed.

@gabikliot To address this issue, the cluster-wide transaction manager assigns timestamps. Communication from a silo's transaction agent to the central transaction manager (TM) is batched. The TM periodically persists a timestamp high-water mark that is an upper bound on any timestamp it assigns. If it fails and recovers, it starts assigning timestamps that exceed the previous high-water mark.
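
A minimal sketch of the high-water-mark idea, assuming a single-threaded TM and a hypothetical durable store (none of these names are the actual Orleans TM code):

```csharp
// Hypothetical sketch of high-water-mark timestamp allocation (not the Orleans TM).
// The TM only hands out timestamps below a persisted upper bound; after a crash it
// resumes from the last persisted bound, so new timestamps exceed everything
// assigned before the failure. Assumes single-threaded use inside the TM.
using System.Threading.Tasks;

public interface IHighWaterMarkStore
{
    Task<long> ReadAsync();
    Task WriteAsync(long highWaterMark);
}

public class TimestampAllocator
{
    private readonly IHighWaterMarkStore store;
    private readonly long batchSize;
    private long next;          // next timestamp to hand out
    private long highWaterMark; // persisted upper bound on assigned timestamps

    public TimestampAllocator(IHighWaterMarkStore store, long batchSize = 10_000)
    {
        this.store = store;
        this.batchSize = batchSize;
    }

    public async Task StartAsync()
    {
        // Skip past everything that could have been assigned before a failure,
        // then persist a fresh upper bound before assigning anything.
        long persisted = await store.ReadAsync();
        next = persisted;
        highWaterMark = persisted + batchSize;
        await store.WriteAsync(highWaterMark);
    }

    public async Task<long> NextTimestampAsync()
    {
        if (next >= highWaterMark)
        {
            // Persist a new upper bound before assigning timestamps beyond the old one.
            highWaterMark += batchSize;
            await store.WriteAsync(highWaterMark);
        }
        return next++;
    }
}
```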

RE: reason for the change. We had an implementation of the timestamp-based solution that was closer to production quality than our locking-based solution. We believe locking will perform better, and plan to incorporate it in the future.

@veikkoeeva
Contributor

veikkoeeva commented Sep 5, 2017

Although this will change in the future, I feel like cross-referencing #1524 adds to the discussion, especially since clock drift is both a problem and an attack vector, and the linked issue has good discussion and links to other still-open issues. E.g., should the list at #2916 be updated?

@jason-bragg
Contributor Author

jason-bragg commented Sep 5, 2017

The latest PR, "Minimal implementation of transaction orchestration #3382", should provide some insight into how the timestamps are managed. While conceptually the transactions need to be ordered in time (by timestamp), what is critical to the algorithm is a consistent ordering, which, in this initial implementation, is provided by having a single authoritative source of transaction order, the Transaction Manager (TM). The TM assigns ordered transaction IDs to each transaction across the cluster, and the order of these IDs acts as the timestamp.

The single authoritative TM, while providing well-ordered transactions, is both a bottleneck and a single point of failure. In the short term we're willing to accept these shortcomings to get the system up and running, but in the longer term we'll move away from a centralized transaction manager (as @philbe alluded to).

See TransactionManager.cs

@gabikliot
Contributor

Jason, I thought Phil wrote that consistent ordering is NOT critical for correctness, only for performance (to minimize aborts). That is how I thought about it too. The correctness is via 2PC.
Phil, it sounds like if each agent gets a batch of tx IDs from the central coordinator, there will indeed be a lot of aborts, since timestamps will not correspond with the real order. That is why locking will perform better. On the other hand, an RPC per timestamp is also a bottleneck.
Am I correct?

@volcimaster

@philbe - you say, "accuracy of timekeeping has no effect on the correctness of timestamp-based concurrency control," then immediately follow it up with, "if a silo consistently assigns timestamps that are smaller than those assigned at other silos, then its transactions' updates on a given grain will frequently be rejected".

This sounds like accurate timestamping is vital: otherwise you're redoing work any time a silo is noticeably behind on its timestamps.

That's a problem.

@jason-bragg
Contributor Author

@gabikliot My comment was not intended as a contradiction of @philbe's assertions, only an explanation of how the initial implementation deals with ordering. In general, and especially with regard to transactions, if anything I say contradicts @philbe, the safe assumption is that I misspoke or am wrong. :) As for the overall properties of the general algorithms, I'll refrain from commenting, as my core focus has been on the correctness and performance of the initial implementation.

@philbe
Contributor

philbe commented Sep 6, 2017

@gabikliot

  • sounds like if each agent gets a batch of tx IDs from the central coordinator, there will indeed be a lot of aborts since timestamps will not correspond with the real order. That is why locking will perform better.

That's one reason. Another is that operations on a contended grain will not necessarily execute in timestamp order, which creates needless aborts.

  • On the other hand, an RPC per timestamp is also a bottleneck.

It degrades throughput. However, it's mitigated by the batched communication from each per-silo transaction agent to the transaction manager (TM).

@volcimaster You're right that clock skew would be a problem if each silo independently assigned a timestamp to each transaction. However, in our implementation, the centralized TM is assigning timestamps (actually, sequentially assigned transaction IDs). A silo S would still be disadvantaged if it had sluggish communication with the TM, since it would be just like having S use a slow clock to timestamp transactions. But even in this case, throughput will be OK as long as S's transactions are only accessing lightly contended grains.
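
One simplified way to picture the batching: the per-silo agent asks the TM for a block of sequential transaction IDs in a single call and hands them out locally until the block runs out. The types below are hypothetical, not the actual agent/TM code:

```csharp
// Hypothetical sketch of a per-silo agent amortizing the TM round trip by
// requesting transaction IDs in blocks. Not the actual Orleans transaction agent.
using System.Threading.Tasks;

public interface ITransactionManagerClient
{
    // One RPC returns the first ID of a contiguous block [first, first + count).
    Task<long> RequestIdBlockAsync(int count);
}

public class TransactionAgent
{
    private readonly ITransactionManagerClient tm;
    private readonly int blockSize;
    private long nextId;
    private long blockEnd; // exclusive end of the current block

    public TransactionAgent(ITransactionManagerClient tm, int blockSize = 100)
    {
        this.tm = tm;
        this.blockSize = blockSize;
    }

    public async Task<long> StartTransactionAsync()
    {
        if (nextId >= blockEnd)
        {
            // A single round trip to the TM covers the next blockSize transactions.
            long first = await tm.RequestIdBlockAsync(blockSize);
            nextId = first;
            blockEnd = first + blockSize;
        }
        // The ID doubles as the transaction's timestamp. IDs handed out from an
        // older block may be smaller than IDs already used on other silos, which
        // is the source of the extra aborts discussed above.
        return nextId++;
    }
}
```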

@raufismayilov

It is stated that a "Grain Based Transaction Manager, backed by Azure storage" is planned as part of this feature. Is integration with SQL Server also being considered, or will there be a way to implement one?

@galvesribeiro
Member

@raufismayilov I believe this feature will evolve over time, so a provider model will make it possible to implement different backends for TM state.

@jason-bragg
Contributor Author

@raufismayilov, @galvesribeiro, TM state storage is critical to both correctness and performance. We'll be placing an abstraction layer over the storage to allow for other implementations, but due to the critical nature of this component, it will likely be in flux for a bit longer. The current abstraction for this is Orleans.Transactions.Abstractions.ITransactionLogStorage, which is pluggable, though only an in-memory version is available at the moment, and we are already in the process of redesigning it.
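
To make the requirement concrete: unlike per-grain state blobs, the transaction log needs durable, ordered appends plus replay from a known point during recovery. The interface below is a deliberately simplified illustration of that shape; it is not the actual member list of ITransactionLogStorage (which, as noted, is being redesigned):

```csharp
// Deliberately simplified illustration of what a transaction log store must provide.
// This is NOT the actual Orleans.Transactions.Abstractions.ITransactionLogStorage
// surface; member names and shapes here are assumptions for discussion only.
using System.Collections.Generic;
using System.Threading.Tasks;

public class CommitRecord
{
    public long LogSequenceNumber { get; set; }
    public long TransactionId { get; set; }
    // ...participants, timestamps, and other commit metadata.
}

public interface ISimplifiedTransactionLog
{
    // Durably append records in order; the append must be persisted before the
    // TM reports the corresponding transactions as committed.
    Task AppendAsync(IReadOnlyList<CommitRecord> records);

    // Replay committed records from a given sequence number during recovery.
    Task<IReadOnlyList<CommitRecord>> ReadFromAsync(long fromSequenceNumber);

    // Trim records that are no longer needed once all participants have confirmed.
    Task TruncateAsync(long upToSequenceNumber);
}
```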

@galvesribeiro
Member

galvesribeiro commented Sep 18, 2017

@jason-bragg Good to know the abstractions are already in place. As usual, Azure storage will serve perfectly as a model for other backend implementations. Once it is done, I believe others will come up soon. :)

@zhaoqinghai

[image attachment]

@stgolem

stgolem commented Sep 4, 2018

@zhaoqinghai I think the default Required option is suitable almost always.
IMO, RequiresNew is really an edge case.
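
For context, this is roughly how those options are expressed on a grain interface. The attribute and enum names follow the discussion in this thread; exact names may differ between releases, so treat it as illustrative:

```csharp
// Illustrative only; the attribute and enum names follow this thread's discussion
// and may not match the exact names in a given release.
using System.Threading.Tasks;
using Orleans;

public interface IPaymentGrain : IGrainWithGuidKey
{
    // Required: join the caller's transaction if one exists, otherwise start one.
    // A sensible default for most transactional grain methods.
    [Transaction(TransactionOption.Required)]
    Task Pay(decimal amount);

    // RequiresNew: always start a new, independent transaction, even when called
    // from inside another one. Useful mainly for edge cases, e.g. an audit record
    // that must commit regardless of whether the outer transaction aborts.
    [Transaction(TransactionOption.RequiresNew)]
    Task RecordAuditEntry(string message);
}
```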

@ghost added the "stale" label (Issues with no activity for the past 6 months) on Dec 11, 2021
@ghost

ghost commented Dec 11, 2021

We are marking this issue as stale due to the lack of activity in the past six months. If there is no further activity within two weeks, this issue will be closed. You can always create a new issue based on the guidelines provided in our pinned announcement.

@ghost

ghost commented Mar 5, 2022

This issue has been marked stale for the past 30 days and is being closed due to lack of activity.

@ghost closed this as completed on Mar 5, 2022
@dotnet locked this issue as resolved and limited the conversation to collaborators on Apr 4, 2022