General purpose distributed transaction concept for Orleans grains. #1880

centur · 2016-06-28T14:35:13Z

Hi, guys. I want to discuss an idea for implementing distributed transactions in Orleans.
I'm not an expert in this matter, but I have read a lot of opinions on sagas and other implementation proposals such as #1090.

First, let me list some basic issues we have with the Saga concept (terminology is based on the presentation "Applying the Saga Pattern").
It sounds solid in theory, but we have some challenges in practice:

- Every CT (compensating transaction) is usually a product not only of the FT (forward transaction) but also of the current grain state (CurState). So CT = f(FT, CurState).
- There is a risk of failure when transferring the CT command anywhere else. The risk is lower if we save it on the current grain (that can fail too, but it is a localized failure, so we can retry WriteStateAsync a few times).
- Not every FT can easily produce a CT that obeys the same business rules and allowed operations. E.g. the FT is UpdatePassword, which should be handled like State.Password = oneWayHash(UpdatePassword.NewPassword). For such operations it is impossible to implement a CT without subverting application rules: the CT would be a SetPasswordHash with the handler State.Password = SetPasswordHash.OldPasswordHash, which breaks encapsulation and exposes internal implementation details to an external entity. Things get even worse at the edges of the business domain, e.g. when the FT interacts with an external service that doesn't allow such a reversal (you can't revert a Gmail password back to an old one).

So Sagas are great on paper, but engineers want to build practical systems.

We discussed a few concepts, and one of them seems reliable and doable. Just to give it a name, let's call it Eventual Transaction. This pattern relies on Orleans guarantees and behaviour and may be tricky to implement outside of Orleans.

So the pattern is:

We have general purpose EventualTransactionCoordinator (ETC) with the following properties:
- it's a stateful grain, which marks our transaction as committed or rolled back.
- It allows external grains to register on ETC and record their replies (more below)
We have participating grains with a unified interface (each grain has a single method, Perform, and the type of action is defined by the type of the input object, i.e. the message).

Transaction flow resembles the one from 2 phase commit, but with 'eventual' second phase.

1. ETC grain starts and saves few pieces of information to the storage - TransactionId, StartTime, current TransactionState == Pending and expected number of participating grains. It also starts a timer/reminder that automatically rolls the transaction back if it expires.
2. The TransactionId is passed with every FT command to every participating grain.

2.1 As part of Perform(FT), each grain saves the full FT message to a state property along with the TransactionId, stores State+FT reliably, tries to apply the FT to a copy of the current State, and sends a Confirmation back to the ETC with a reference to itself, effectively saying 'I'm ready to continue with this transaction'.

2.2 On each Confirmation, the ETC reliably saves the grainRef of the confirming grain.
3. Once all participating grains have sent a confirmation (either positive or negative), the ETC calculates the overall transaction state (Commit or Rollback) and persists this result to storage. This is the end of our transaction.
If the timeout is reached, the ETC transitions itself to the 'Rollback' state.
On activation, the ETC compares the current time with the transaction StartTime; if it is still Pending but the timer has lapsed, it does a 'Rollback'.
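
To make the first phase concrete, here is a minimal in-memory sketch of the coordinator's decision logic. It is only an illustration under stated assumptions: the names CoordinatorSketch, Confirm and CheckTimeout are hypothetical, nothing is persisted (a real grain would write its state before acknowledging), and it rolls back on the first negative reply instead of waiting for all replies.

```csharp
using System;
using System.Collections.Generic;

public enum TxState { Pending, Commit, Rollback }

// Hypothetical in-memory stand-in for the ETC grain. A real grain would persist
// every change reliably before replying; this sketch keeps everything in fields.
public sealed class CoordinatorSketch
{
    private readonly int expectedParticipants;
    private readonly DateTime startTime;
    private readonly TimeSpan timeout;
    private readonly HashSet<string> confirmed = new HashSet<string>();

    public TxState State { get; private set; } = TxState.Pending;

    public CoordinatorSketch(int expectedParticipants, DateTime startTime, TimeSpan timeout)
    {
        this.expectedParticipants = expectedParticipants;
        this.startTime = startTime;
        this.timeout = timeout;
    }

    // Step 2.2: record one participant's reply and decide once everyone has answered.
    public TxState Confirm(string grainRef, bool canApply, DateTime now)
    {
        CheckTimeout(now);
        if (State != TxState.Pending) return State;   // the outcome is final

        if (!canApply)
        {
            State = TxState.Rollback;                  // simplification: first "no" decides
            return State;
        }

        confirmed.Add(grainRef);                       // duplicates are harmless (at-least-once delivery)
        if (confirmed.Count == expectedParticipants) State = TxState.Commit;
        return State;
    }

    // Called by the timer/reminder and on activation.
    public TxState CheckTimeout(DateTime now)
    {
        if (State == TxState.Pending && now - startTime >= timeout) State = TxState.Rollback;
        return State;
    }
}
```

Because the state only ever moves from Pending to Commit or Rollback, a late or duplicated confirmation cannot change a decision that has already been made.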

Now comes the eventual consistency phase:
We are not really interested in pushing all participating grains to commit or roll back immediately, because we have extra logic around serving other requests to these grains:

4. When any other message (read or modify) comes to a grain that participated in a transaction, the grain must do the following before handling the current request: reach the ETC (using the saved TransactionId) and check its state.
4.1 If it is 'Commit', the grain must apply the FT to its own current state and persist it. The grain cannot proceed further until it has reliably stored the mutated State. After this, the grain can participate in another transaction or serve a 'read' request based on the new State.
4.2 If the ETC state is 'Rollback', the grain can drop the FT and TransactionId and reply to the current message from the existing State.
4.3 If the ETC state is 'Pending' (because the transaction is in progress or waiting for the timeout) - Grain can serve Read requests from a current state (without applying FT) but must reject Modify requests ( to prevent transactions overlap)
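
The participant side of steps 2.1 and 4 could look roughly like the sketch below. It is a simplified model, not Orleans code: ParticipantSketch, Settle and the lookupOutcome delegate (standing in for a call to the ETC grain) are hypothetical names, the state is a single integer balance, the FT is a delta, and persistence is omitted.

```csharp
using System;

public enum Outcome { Pending, Commit, Rollback }

// Hypothetical participant: state is an integer balance and the FT is a delta.
// lookupOutcome stands in for asking the ETC grain about a transaction id.
public sealed class ParticipantSketch
{
    private readonly Func<Guid, Outcome> lookupOutcome;
    private Guid? pendingTx;
    private int pendingDelta;

    public int Balance { get; private set; }

    public ParticipantSketch(int balance, Func<Guid, Outcome> lookupOutcome)
    {
        Balance = balance;
        this.lookupOutcome = lookupOutcome;
    }

    // Step 2.1: store the FT, try it against a copy of the state, report the result.
    public bool Perform(Guid txId, int delta)
    {
        Settle();
        if (pendingTx.HasValue) return false;     // 4.3: reject Modify while another transaction is pending
        if (Balance + delta < 0) return false;    // the FT would not apply to a copy of the state
        pendingTx = txId;
        pendingDelta = delta;
        return true;                              // "ready to continue with this transaction"
    }

    // Reads are served from the last settled state.
    public int Read()
    {
        Settle();
        return Balance;
    }

    // Step 4: check the ETC outcome before serving any request.
    private void Settle()
    {
        if (!pendingTx.HasValue) return;
        Outcome outcome = lookupOutcome(pendingTx.Value);
        if (outcome == Outcome.Pending) return;                   // 4.3: keep serving the old state
        if (outcome == Outcome.Commit) Balance += pendingDelta;   // 4.1: apply the stored FT
        pendingTx = null;                                         // 4.1 / 4.2: unlock the grain
        pendingDelta = 0;
    }
}
```

With a balance of 100 and a pending FT of -70, Read() keeps returning 100 until the coordinator reports Commit; the next Read() then applies the FT and returns 30.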

So based on this pattern and some extra overhead (the TransactionId and some confirmation messages for the first phase), we can ensure that once our ETC atomically commits or rolls back, all participating grains can (to a certain degree) successfully apply the FT to their states and will not serve any other Modify messages until each of them finalizes its own part of the transaction. Also, we have clear fixation points at which any silo can crash and the transaction can still be resumed.
All cross grain message delivery should be At-Least-Once as with Sagas.

It also looks doable with server-side interception if SSI has access to an actual grain state (to verify that FT is applicable to State without exception)

Downsides \ Grey areas:

If for some reason the initial application of the FT to the State passed but applying it on transaction finalization fails (a really nasty bug in our code), the transaction is committed and all grains can apply the FT and participate in new transactions, except this bad grain, which is deadlocked on itself. Once the bug is fixed (manually or after some code change), the next invocation of this grain should unlock it.
We have a hanging ETC state in the database until every grain finalises its own eventual transaction. From time to time the ETC can poll these grains (e.g. using the stored GrainRef as per 2.2). It's not clear who must destroy the ETC state after all grains finalise their states. Should it be an external process that scans all transactions and cleans up the ones that are no longer necessary (all participating grains have committed or reverted the FT)?
We have some blocking for updates, but it's localised to participating grains (and each grain unblocks itself on the next call, as per step 4).

Really keen to hear any constructive feedback and explanations why this will not work or how to achieve similar results with less pain and more gain.

cc @philbe (I saw this 'invocation' in 1090 - seems like works well ;) )

PS: sorry for a long description...

sergeybykov · 2016-06-28T23:40:41Z

I wonder what @philbe thinks about this. He's been working on something similar, although not quite the same.

Are you not concerned about deadlocks? If ETCs are different for different transactions, there will be no way to detect them? As I understand, they will get eventually broken up via timeouts and rollbacks.

sergeybykov · 2016-06-28T23:51:29Z

Grain can serve Read requests from a current state (without applying FT) but must reject Modify requests ( to prevent transactions overlap)

So callers will get random rejections for write requests if the grain happens to be in the middle of another transaction? That is probably okay for an app where such a situation is unlikely to happen due to its high-level logic flow. But as a general-purpose mechanism it seems brittle to me. If writes were queued up and processed once the current transaction completes, that would be a friendlier behavior I think.

ReubenBond · 2016-06-28T23:59:57Z

If writes were queued up and processed once the current transaction completes, that would be a friendlier behavior I think.

And queuing would be fairly easy to add, too, using interceptors.

centur · 2016-06-29T00:00:12Z

I'm not sure how deadlocks can happen with this model - they will be resolved automatically when one of the competing transactions dies:

We have FooETC and BazETC,
FooETC transaction involves FooOne, FooTwo and SharedBar grains
BazETC involves BazOne, BazTwo and also SharedBar

Whoever locks SharedBar first wins. As soon as the lock is confirmed, e.g. for FooETC, BazETC's attempt to lock SharedBar will be rejected immediately, so BazETC will roll back the entire transaction.

If there is a race and a mutual lock (both ETCs involve two shared grains, one locked by FooETC and the other by BazETC), then in the worst case both transactions will be safely rolled back, but there is a chance that one will be rolled back before the other's lock attempt, so that one of them can succeed.

centur · 2016-06-29T00:01:41Z

@sergeybykov Yes, callers can get random rejections for write requests, but the rejection happens BEFORE any actual changes happen, so they can safely roll back.

sergeybykov · 2016-06-29T00:05:40Z

ETC grain starts and saves few pieces of information to the storage - TransactionId, StartTime, current TransactionState == Pending and expected number of participating grains.

Who initially calls ETC and generates a TransactionId? A first grain in a transaction?

centur · 2016-06-29T00:14:17Z

Queuing up writes can work if a subsequent write can be applied without checking the current state (e.g. it's irrelevant for the write whether the previous transaction was committed or rolled back), but in practice that's not very common. Let's take bank accounts as an example:

Basic transaction involves ETC, Sender, Receiver.

Balances are 100 and 100.
FirstETC transaction says - Transfer 70 to Receiver.
SecondETC transaction says - Transfer 50 to Receiver.

The success or failure of SecondETC strongly depends on Success or Failure of FirstETC.

If FirstETC succeeds, balances will be 30 and 170, and SecondETC can't withdraw 50 from a balance of 30. So they are not really independent, and each transaction must complete in full before the next one starts. We can queue them up in the grain mailboxes, but any split or State write failure would cause havoc (or I don't see how it can be resolved reliably with a positive outcome for the second transaction if the first is rolled back... or...)
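
The dependency between the two transfers can be checked with a few lines of arithmetic. This is just a worked version of the numbers above; TransferExample and Transfer are hypothetical names, and the overdraft rule (a transfer fails if it exceeds the sender's balance) is an assumption about the business logic.

```csharp
using System;

public static class TransferExample
{
    // Returns the balances after trying to move `amount` from sender to receiver.
    // An overdraft leaves both balances unchanged and reports failure.
    public static (int Sender, int Receiver, bool Ok) Transfer(int sender, int receiver, int amount)
    {
        if (amount > sender) return (sender, receiver, false);
        return (sender - amount, receiver + amount, true);
    }
}
```

If FirstETC commits, Transfer(100, 100, 70) gives (30, 170) and the follow-up Transfer(30, 170, 50) fails. If FirstETC is rolled back, SecondETC runs against (100, 100) and succeeds with (50, 150). The result of the second transfer cannot be known until the first one is finalized.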

Is this a correct understanding of what you mean by queuing writes? Like a grain can be involved in 2 transactions in parallel?

ReubenBond · 2016-06-29T00:18:16Z

@centur

Is this a correct understanding of what you mean by queuing writes? Like a grain can be involved in 2 transactions in parallel?

I meant simply deferring requests to methods marked as [Write] (or whatever) by sticking them into a Queue<Tuple<InvokeMethodRequest,IGrainMethodInvoker>> or something until the current transaction has completed.
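
A rough sketch of that deferral idea, independent of the Orleans interceptor API: plain delegates stand in for the Tuple<InvokeMethodRequest,IGrainMethodInvoker> entries, and WriteDeferralSketch with its members is a hypothetical helper, not an existing type.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: defers write calls while a transaction is pending and
// replays them in arrival order once the transaction has been finalized.
public sealed class WriteDeferralSketch
{
    private readonly Queue<(Func<Task> Write, TaskCompletionSource<bool> Done)> deferred =
        new Queue<(Func<Task> Write, TaskCompletionSource<bool> Done)>();

    public bool TransactionPending { get; private set; }

    public void BeginTransaction() => TransactionPending = true;

    // Runs the write immediately, or parks it and returns a task that
    // completes only when the deferred write has actually run.
    public Task InvokeWrite(Func<Task> write)
    {
        if (!TransactionPending) return write();
        var done = new TaskCompletionSource<bool>();
        deferred.Enqueue((write, done));
        return done.Task;
    }

    // Called once the current transaction has committed or rolled back.
    public async Task CompleteTransaction()
    {
        TransactionPending = false;
        while (deferred.Count > 0)
        {
            var (write, done) = deferred.Dequeue();
            try
            {
                await write();
                done.SetResult(true);
            }
            catch (Exception ex)
            {
                done.SetException(ex);   // surface the failure to the original caller
            }
        }
    }
}
```

Callers simply await InvokeWrite; from their point of view a deferred write is a slow write rather than a rejected one.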

centur · 2016-06-29T00:19:51Z

@sergeybykov Who initially calls ETC and generates a TransactionId? A first grain in a transaction?

We are using stateless grains to bundle up the business logic of calling stateful grains and to have a central place to handle bizlogic errors: an equivalent of a Unit of Work in a stateless grain.
Anyone can start a transaction, even a client. It might be done with the IDisposable pattern:

using (var ETCGrain = GF.GetGrain<ITransactionCoordinator>(Guid.NewGuid()))
{
    await FooGrain.Perform(FooAction, ETCGrain.TransactionID);
    await BazGrain.Perform(BazAction, ETCGrain.TransactionID);
    await OtherGrain.Perform(OtherAction, ETCGrain.TransactionID);
    // Can we invoke a grain method when the proxy is disposed? If not, we can just do
    await ETCGrain.Commit();
    // and await ETCGrain.Rollback(); in catch/finally
}

centur · 2016-06-29T00:49:47Z

If I understand server-side interception correctly, we can organize this entire transaction via interceptors plus some extra state / reliable storage for transaction messages. We don't even need a unified interface on the grains, because we can operate on the intercepted message. We just need to understand how to shadow-copy the grain with its state and run the 'apply FT to a State copy' step described in 2.1...

philbe · 2016-07-01T02:58:20Z

I think your proposal ensures update transactions will be atomic and durable (the A and D of ACID). But they won’t be isolated (that is, serializable) if you allow transactions to read locked grains.

Since a program can read a locked grain, you allow inconsistent reads. Consider your money transfer example. Suppose a transaction T1 is transferring $50 from account1 to account 2. After T1 has debited account1 and before it has credited account2, another transaction T2 could read the balances in account1 and account2. This is an inconsistent state. If T2 writes the sum of the values it read in account1 and account2 into a third grain, audit-result, the latter would be missing the $50 that’s in transit. That’s a result that couldn’t happen if transactions executed serially.
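
The missing $50 can be reproduced with straight-line arithmetic. This sketch only replays the interleaving described above in a single thread; AuditExample and AuditDuringTransfer are hypothetical names.

```csharp
public static class AuditExample
{
    // Replays the interleaving: T1 debits account1, T2 reads both balances,
    // then T1 credits account2. Returns the sum that T2 observed.
    public static int AuditDuringTransfer(int account1, int account2, int amount)
    {
        account1 -= amount;               // T1: debit account1
        int seen = account1 + account2;   // T2: read both accounts mid-transfer
        account2 += amount;               // T1: credit account2
        return seen;
    }
}
```

With two accounts of 100 each and a transfer of 50, T2 observes a total of 150, although the total is 200 both before and after T1. No serial order of T1 and T2 can produce 150.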

You can avoid this outcome by not allowing transactions to read modified grains. In this example, T2 wouldn’t be able to read account1. That would ensure isolation, but it would reduce throughput.
You are aborting a transaction if it encounters a grain that has been modified but not yet committed by another transaction. That’s fine correctness-wise, but it does limit transaction throughput. Allowing a transaction to wait for the modified grain to commit will improve throughput, but then you’ll have the possibility of deadlocks. You can use your transaction timeout mechanism to avoid deadlocks. You’ll occasionally abort transactions that aren’t really deadlocked, just slow, but it won’t affect correctness.

Have you defined throughput requirements? Each transaction does quite a few sequential writes to storage.

I’m sorry and embarrassed that we haven’t finished cleaning up our transaction implementation sufficiently to make it public. I’ll see what I can do to speed things up.

centur · 2016-07-01T04:41:23Z

Thanks, @philbe .
In the described scheme I allow others to read only "all old states" or "all new", because the transition from old to new happens on transactionGrain.Commit().
Talking in T1 and T2 -

When T1 is pending - T2 can read only old states, which were before T1 started ( OldAccount1 and OldAccount2).
When T1 is committed - Acc1Grain and Acc2Grain will be in a pre-charged state - any request sent to them will cause the "Apply T1 change" side effect before the grain can continue with the current request.
When T1 is rejected - any call will cause a side effect that drops "apply T1 change" and unlocks the grain for a new transaction.

Technically, on each access to the Account1 and Account2 grains, these grains have to look up the status in the T1 grain and, based on that, decide how to serve the request: either return the current OldAccount1\OldAccount2 states, or delay the request, perform the actual T1 state mutation and serve the NewAccount1 state.
At this point we may be in a physically inconsistent state, because if Acc1 was read and T1 is committed, Acc1 will be in the NewAccount1 state, but Acc2 will still be in an 'unsure' state, awaiting the next call to trigger the same "Apply T1" effect and transition to the NewAccount2 state.
But taking into account Orleans guarantees - this will happen inevitably, right? And it will happen before any other call.

I don't have any throughput requirements yet, nor a generic enough implementation. The idea is to smoke test the concept itself and see whether more eyes can find any flaws.

gabikliot · 2016-07-01T05:39:31Z

What I did not understand in your proposal is the sentence "So Sagas are great on paper, but engineers want to build practical systems." What you propose is standard, regular transactions (with slightly different/weaker correctness semantics, as Phil mentioned).
Sagas are about multi-stage, long-running processes, which involve a sequence of transactions. So I'm not sure how you compare the two.
Or maybe you mean that since this involves both persisting state AND RPCs (grain calls), it resembles Sagas?

centur · 2016-07-01T06:05:45Z

@gabikliot That was an unnecessarily caustic statement, meant to pre-empt recommendations à la "in a distributed world you can't have reliable transactions and must use sagas". Sorry if it insulted anyone.

Before we came to this concept, we tried to model how the Saga pattern could be applied to our system and found that it doesn't work well for our scenarios and edge cases. We were looking for an atomic, easy-to-implement transaction concept that can run for an extended time (the transaction timeout can be relatively long) without affecting the entire system, creating only some minor congestion around update\delete operations (which are usually less frequent than reads).

gabikliot · 2016-07-01T07:12:41Z

Ohh, I see. So long running transactions, but not necessarily multi stage. Makes sense.

philbe · 2016-07-06T17:12:48Z

Thanks @centur for your explanation of consistent reads. That helps. However, I'm still not sure you've covered all the cases. Consider this one:

T1 updates acct1
T2 reads acct1, and is given the old value (before T1) since T1 is pending.
T1 updates acct2
T1 commits
T3 reads acct2 (which causes T1's final value to be installed, right?)
T3 updates some other grains and commits // I'm not sure I need this step to make the example work
T2 reads acct2, and since T1 committed, T2 is given the new value (after T1) of acct2
Is that a valid sequence according to your design? If so, T2 is reading an inconsistent state.

I can imagine ways to prevent T2's inconsistent read. E.g., after reading acct1, it knows it depends on T1. So from then on, T2 must inquire whether T1 updated every grain that T2 accesses after acct1. But it gets complicated. For example, it's possible that many transactions updated acct2 after T1 committed and before T2 reads acct2. ETC would have to keep track of all of them and know when it's safe to forget about that update history.
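
The sequence above can be replayed in a small single-threaded model of lazily applied updates. This is only a sketch of the anomaly, not of either implementation discussed here: LazyAccount and AnomalyDemo are hypothetical names, T1 is assumed to move 50 from acct1 to acct2, and the t1Committed delegate stands in for asking T1's coordinator.

```csharp
using System;

// Hypothetical account that holds a pending delta from T1 and installs it
// lazily on the first access after T1 has committed.
public sealed class LazyAccount
{
    private readonly Func<bool> t1Committed;
    private int balance;
    private int? pendingDelta;

    public LazyAccount(int balance, int pendingDelta, Func<bool> t1Committed)
    {
        this.balance = balance;
        this.pendingDelta = pendingDelta;
        this.t1Committed = t1Committed;
    }

    public int Read()
    {
        if (pendingDelta.HasValue && t1Committed())
        {
            balance += pendingDelta.Value;   // install T1's final value
            pendingDelta = null;
        }
        return balance;
    }
}

public static class AnomalyDemo
{
    public static int SumSeenByT2()
    {
        bool committed = false;
        var acct1 = new LazyAccount(100, -50, () => committed);
        var acct2 = new LazyAccount(100, 50, () => committed);

        int first = acct1.Read();    // T2 reads acct1 while T1 is pending: old value
        committed = true;            // T1 commits
        acct2.Read();                // T3 reads acct2, installing T1's value
        int second = acct2.Read();   // T2 reads acct2: new value
        return first + second;
    }
}
```

T2 ends up with the old acct1 (100) and the new acct2 (150), a total of 250 that never existed in any committed state.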

centur · 2016-07-07T01:20:37Z

Thanks @philbe. This is an interesting case. I tried to model the safest, read-committed behavior, but this case breaks with two closely competing transactions (T1 and T2, when T1 commits exactly between T2's reads of acc1 and acc2), which gives T2 an inconsistent view of the world.

Although it may work for our particular case :

We don't read values from grains to decide. We send a 'pre-flight' check to every grain involved in the operation ('given your current state, can you perform this operation?') before we send the exact same message to an Update method, and then we send all updates in one sequential set, which narrows the window of potential inconsistent reads for others. But I agree - it still exists.

it doesn't look like a good enough general cause where many others

Can you give any advice on how this can be generalized enough to prevent such reads, or maybe share the work-in-progress you were doing on transactions (if this is possible)?

ashkan-saeedi-mazdeh · 2016-07-13T15:50:32Z

@philbe I'm interested to have the feature as well. Currently working on a game backend using Orleans. Trades should be handled as transactions. I will look at the implementation at https://github.com/saeedakhter/OrleansStrongConsistency
But it doesn't seem general enough to me. However still it's useful.

bailud · 2016-07-27T01:05:55Z

@ashkan-saeedi-mazdeh Could you give an example of the scenario where you want to use transactions for trades?

ashkan-saeedi-mazdeh · 2016-07-27T09:40:45Z

@bailud Generally speaking, when an operation consists of calling methods on multiple grains and these method calls modify state, I want these state modifications to have transactional properties.
Example is trading virtual currency with items in a game. Imagine I have a MatchMaker and a Game to start, The things that should happen are these

Each player which wants to be in game should remove x from his/her currency
Game should start

Then if some players can remove it and the request for some fails, we have two options, eventual consistency by setting a reminder to retry for failed currency reductions later on or simply fail the transaction and leave all player currencies as it was before the transaction.

For example either all methods in the foreach below succeed or fail. I can tolerate executing them linearly instead of parallel as well.

List<Task> transactionTasks = new List<Task>();
foreach (var player in players)
{
    transactionTasks.Add(player.DecreaseCurrency("coin", 30));
}
await Task.WhenAll(transactionTasks);

bailud · 2016-07-28T18:56:28Z

@ashkan-saeedi-mazdeh What you just described is exactly the use case for a transaction. If the game itself is a stateless worker, will adding a log make sense?

Before the loop, you persist the currency of each player to a log.
For each player, you decrease his / her currency.
After the game finishes, you persist a log entry for the completion of the game.

Thus, if the system fails during the game, you can check the log to see if the players have already paid for the game. Also, you need some additional logic when you recover from a failure, e.g. check if the game has been interrupted by a failure so that the players will not be charged twice.
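
One way to read the logging steps above is sketched below. It is an interpretation, not a prescribed design: GameChargeSketch is a hypothetical class, an in-memory dictionary stands in for the durable log, and the charge is made idempotent by deriving each new balance from the logged one, so that a retry after a failure cannot charge a player twice.

```csharp
using System;
using System.Collections.Generic;

public sealed class GameChargeSketch
{
    // Stand-in for a durable log: player id -> balance recorded before the charge.
    private readonly Dictionary<string, int> balanceBefore = new Dictionary<string, int>();

    public bool GameCompleted { get; private set; }

    // Safe to call again after a crash: balances are recomputed from the log.
    public void Charge(IDictionary<string, int> balances, int fee)
    {
        // Step 1: log each player's balance once, before any charge is applied.
        foreach (KeyValuePair<string, int> entry in balances)
        {
            if (!balanceBefore.ContainsKey(entry.Key))
                balanceBefore[entry.Key] = entry.Value;
        }

        // Step 2: charge every logged player relative to the logged balance.
        foreach (KeyValuePair<string, int> entry in balanceBefore)
        {
            balances[entry.Key] = entry.Value - fee;
        }
    }

    // Step 3: record that the game has finished.
    public void MarkCompleted() => GameCompleted = true;
}
```

Calling Charge twice with a fee of 30 on balances of 100 and 40 leaves them at 70 and 10 both times, which is the property recovery needs.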

ashkan-saeedi-mazdeh · 2016-07-30T09:39:54Z

@bailud Thank you for the response!
Yeah, exactly, this is the way it currently has to be handled in Orleans. However, I think I need a few more log entries, or lists of transactions done per player and so on, to keep track of it all, but yes, the logic is exactly right.

Ideally later on we have a set of interfaces to inherit from when we want to have grains which have transactions built-in directly to them. I understand this is a tough problem to solve generally. There are many questions like

What the grain should do when in middle of a transaction to incoming requests when it is Reentrant ?
How to generalize writing reverse transactions?
How to lock state partially to allow some operations if desired?
...

bailud · 2016-08-01T19:26:05Z

@ashkan-saeedi-mazdeh We are actually working on transactions in Orleans. Thus, I would like to know more about what people really expect from the transactions in Orleans. Will that be good enough to provide transactions in a traditional sense, i.e., database transactions, or do people want something more specific for the actor model?

For example, it would be really nice if you could elaborate what behavior / support from the language is desirable for (1) transaction with Reentrant grains (2) writing reverse transactions and (3) lock the state partially

ashkan-saeedi-mazdeh · 2016-08-03T07:57:50Z

@bailud Great then. Sergey told us that some researchers are doing stuff regarding it.
From my point of view a transaction should be similar to a Database transaction so multiple actors take part in it. Either the transaction succeeds or, after a failure, all states are returned to their previous state.
I'm coming fully from a game development (both client and server) background, so the ideas of other people doing business applications are probably different.
I've used Erlang and mnesia and the Mnesia way is interesting
http://erlang.org/doc/apps/mnesia/Mnesia_chap4.html

The point is, when we talk about transactions, what we actually care about is actor state and the effects of operations on it. Things like partial state updates or reverse transactions that I mentioned are simply questions; I don't think the first version needs custom reverse transaction functions or the ability to lock part of the state.
These may be desirable in later versions; for now, the following would be great to have.

A transaction starts and writes itself to a log
any grain taking part writes its state to a store
the grain calls the function it should as part of the transaction
all grain calls in the transaction continue like this

Then if it failed all modified grains revert their state to the previous one. It's ok to lock the grain/grain state while transaction is running.
Some people want eventual transactions, so instead of failing, the transaction will be retried, but that's not a feature of normal transactions IMHO. @centur above wanted that, for example, if I understand correctly.

The hard part is when transaction fails and updates for reverting participant states fail as well. We keep the grain locked then. Maybe a good trade-off is to mark some grain methods with an attribute saying that they have nothing to do with state and can execute even when the grain is locked. Alternatively, we could lock only the state objects, so any operation on them fails while method calls are always allowed; but this second option is much harder and, I imagine, requires additional code generation for state objects, or at least an accessor function for getting the grain state.

About reentrancy of grains: transactions matter most for reentrant grains. A grain that is not reentrant cannot do anything else until its current Task returns, and that Task comes from the transaction call itself, so non-reentrant grains are actually easier to write transactions for than reentrant ones.

Orleans revolutionized the actor model based programming with virtual actors and by adding transactions to actor operations you can probably multiply it I guess. Writing something similar to Mnesia is possible but that is just a DBMS to be used in actor systems. In fact service fabric reliable collections are kinda like Mnesia however I'm not sure if they allow for dirty operations or not and how many lock types they support. I'm really interested to see what you'll come up with and its differences to simply using DB transactions in actors when you need transactional properties. This can make Orleans more interesting for the cases that actors need to interact more with each other and do it in a consistent way.

I think it's good if you come to Orleans's gitter chat room and ask people's opinion about it.

@galvesribeiro @veikkoeeva @ReubenBond @centur What do you think guys?

galvesribeiro · 2016-08-03T21:09:33Z

@ashkan-saeedi-mazdeh sorry, I didn't understand your correlation between transactions and reentrant grains...

Regarding the SF collections, you can read data before it is committed; I mean you can have inconsistent reads by reading the replicated state before the local state is actually replicated.

philbe · 2016-08-03T22:41:30Z

@ashkan-saeedi-mazdeh I think you'll find our transaction implementation supports the usage scenarios you described.

From my point of view a transaction should be similar to a Database transaction so multiple actors take part in it,

Agreed. We use the COM+/J2EE style of transaction bracketing. Each method M can have one of the following tags:
• RequiresNew – When called, M starts a new transaction, T, and completes T on exit.
• Required – If M’s caller is executing a transaction, T, then M becomes part of T. If not, it starts a new transaction, T', and completes T' on exit.
• NotSupported – M never executes within a transaction, even if its caller is executing a transaction.
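
The three tags boil down to a small decision about which transaction the callee runs under. The sketch below models only that rule; the Bracketing enum and Resolve function are hypothetical names for illustration and are not the API of the implementation described here.

```csharp
using System;

public enum Bracketing { RequiresNew, Required, NotSupported }

public static class BracketingSketch
{
    // Returns the id of the transaction the callee M executes in, or null if none.
    // callerTx is the caller's transaction id (null when the caller has none);
    // newTx creates a fresh transaction that M completes on exit.
    public static string Resolve(Bracketing tag, string callerTx, Func<string> newTx)
    {
        switch (tag)
        {
            case Bracketing.RequiresNew:
                return newTx();                 // always start a new transaction
            case Bracketing.Required:
                return callerTx ?? newTx();     // join the caller's transaction, or start one
            default:
                return null;                    // NotSupported: never run in a transaction
        }
    }
}
```

For a caller running in transaction "T", Required resolves to "T", RequiresNew resolves to a fresh transaction, and NotSupported resolves to no transaction at all.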

We allow a transaction to make many calls to the same grain. We allow grains to be re-entrant. With our locking implementation, all concurrent calls to a reentrant grain must be executing the same transaction. With our optimistic-concurrency-control implementation, concurrent calls from different transactions are allowed, but each transaction is working with its own copy of grain state.

With both locking and optimistic concurrency control, transactions have all of the ACID properties.

The hard part is when transaction fails and updates for reverting participant states fail as well. We keep the grain locked then.

In our implementation, this can happen only if storage is unavailable. In that case, a transaction can't update the grain anyway, since we aren't able to guarantee the durability of transaction results.

bailud · 2016-08-03T23:03:09Z

@ashkan-saeedi-mazdeh I took a look at the Mnesia (http://erlang.org/doc/apps/mnesia/Mnesia_chap4.html).

The regular transaction in Mnesia looks very strict. It uses two-phase locking and, based on its description, can achieve serializability if table-level locking is used properly ("All programs accessing the database through the transaction system can be written as if they had sole access to the data.")

It becomes more interesting with the dirty operations. These operations execute without any locks. The documentation says that dirty operations do not have atomicity or isolation. In addition, it seems that dirty writes can also be harmful to regular transactions, since they can update a record regardless of whether it is locked by another transaction ("The isolation property is compromised, because other Erlang processes, which use transaction to manipulate the data, do not get the benefit of isolation if dirty operations simultaneously are used to read and write records from the same table.").

It looks to me that it will be difficult to reason about the state of the system if the data can be updated by both dirty operations and transactions.

Could you give me some idea of when dirty operations are used and what you can expect from the database state?

@centur was asking for read committed, which is probably not safe enough for the trades scenario you mentioned.

ashkan-saeedi-mazdeh · 2016-08-04T07:21:28Z

@bailud I would not use both dirty operations and transactions on the same rows of a table, since it would be very hard to reason about the state of the object. Either the data is not transactional, eventually consistent views of it are ok for the app, dirty reads and writes are ok and you can use dirty operations; or you cannot use dirty operations at all.

What I think is the right approach in Mnesia is

dirty writes only if last write can win and slight errors are ok
dirty reads only if eventual consistency of reads is ok
transactions only when you really have to have them.

You should keep in mind that Mnesia is not used as a general purpose database for actual storage of the data in my mind at least. You use it to make your Erlang code do its work and the real data storage is done to regular DBMS (being RDBMS or not). Mnesia has limited storage capabilities and even disk usage limitations for tables and ...

@centur was asking for read committed, which is probably not safe enough for the trades scenario you mentioned.

I just meant I don't represent everyone's opinion; yes, what he wanted was read committed, which is not useful for my case at all.

ashkan-saeedi-mazdeh · 2016-08-04T07:24:05Z

@philbe It's great and I would say this really pushes the Orleans programming model about an order of magnitude forward at least for a certain number of applications.
Especially because of the semantics you described here.

The statement which says actor model is not suitable if many actors need to interact with each other no longer fully applies to Orleans then :) For interactions you need transactions and they are costly everywhere, in Orleans or not.

And of course this is all expected when one of the fathers of transactions is on top of the project :)

ashkan-saeedi-mazdeh · 2016-08-04T07:27:32Z

@galvesribeiro see @philbe 's comment about a reentrant grain is able to take part in the same transaction and call methods. I meant this can happen and if we lock it fully and don't allow, the performance and throughput of the system will drop for reentrant grains.

I should try to write more clearly.

centur · 2016-08-04T09:03:00Z

@ashkan-saeedi-mazdeh Slowly catching up with this thread. Yes, but my example with eventual transactions was a workaround, as I don't know if there is a robust algorithm for the case you mentioned: the transaction failed but the reversal failed too. That's the case I was thinking about when I added the eventual part to my algorithm, because you may revert not only as a result of a network or state failure but as a result of a timeout too. So the eventual part (the second commit phase) would work both for an explicit transaction grain failure and for a transaction timeout, when grains need to be unlocked for other operations. Grains can then be activated and unlocked lazily, when some action happens, instead of all being reactivated on an explicit scheduled trigger.

ashkan-saeedi-mazdeh · 2016-08-04T09:12:42Z

@centur Fair enough. Your algorithm is actually what some solutions need.
AFAIK grains that participate in transactions will acquire a lock on their state, so even if the whole thing crashes, at the first grain activation afterwards you can see that the log exists and fail the transaction. Even better, at silo startup you can check all transactions left over from the crash and either fail them all for good or finish them, depending on the anticipated timeout. That said, I have never implemented a production-ready transaction system and am far from an expert on the subject.

centur · 2016-08-04T09:19:31Z

Re a few mentions: generally speaking, I want at least some implementation of transactions.

I proposed my solution to spark the discussion and move things forward. Since we started using Orleans a year ago, there has always been a subtle notion that transactions are somewhere near, just around the corner, but after a year they still haven't appeared. I have had a few discussions about Orleans at meetups and people ask about transactions, so a good transaction implementation out of the box would make Orleans skyrocket compared to other actor systems.

If they are database-style transactions with ACID, that's perfect, as it's close to a natural OOP style in C#.
If they require some rework of our business logic (as in my proposed implementation), I'm fine with that: we are going to implement it anyway, and given our timelines and a "promised future" that has been just around the corner for a year, we have to implement something on our side before "the future" happens.

I'm really interested in reading anything about @philbe's implementation, just to understand the algo and teach myself on complexity and problems in this area, if it's possible.
If the implementation is in some public branch - this is even better so we can see the code.

And again - I'm not an expert, I'm the guy who needs this feature in any form and I can see that this is the very burning question for many others, whether they are on gitter or not or whether they are using orleans or just deciding to try.

stgolem · 2016-10-30T09:17:45Z

Hello. Found this issue while trying to understand whether there is any transactional concept in Orleans. As I understand it, @philbe has some kind of solution, but it is not public yet?
Are there any public plans, specs, or concepts out there for doing transactions with grains?

galvesribeiro · 2016-10-30T13:14:45Z

@stgolem At the last meetup @philbe mentioned that he is working on it and would release something soon... We are all waiting for it :)

veikkoeeva · 2016-10-30T20:58:53Z

Cross-referencing #2161 for further information.

ashkan-saeedi-mazdeh · 2016-10-31T08:06:24Z

@stgolem In general, in actor-based systems such as Erlang and Akka, transactions are done with database transactions. In the specific case of Erlang, there is a dedicated database called Mnesia that is usually used for the task. The DB is nothing special on its own, apart from being able to store Erlang terms.

So effectively until the feature becomes available you can use database transactions. As an example if you have a trade going on between two players in a game, when player A's grain calls a Trade method with player B's grain as the other party, you execute a DB transaction to do the operation and declare it successful if the transaction completed.
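A minimal sketch of the database-transaction approach described above, in Python with SQLite rather than Orleans and a production database. The table layout, the `make_db` and `trade` names, and the gold/item rules are all assumptions made for illustration; they do not come from this thread.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two players and one item (demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (name TEXT PRIMARY KEY, gold INTEGER NOT NULL)")
    conn.execute("CREATE TABLE items (name TEXT PRIMARY KEY, owner TEXT NOT NULL)")
    with conn:  # commit the demo rows
        conn.executemany("INSERT INTO players VALUES (?, ?)", [("A", 100), ("B", 10)])
        conn.execute("INSERT INTO items VALUES (?, ?)", ("sword", "B"))
    return conn


def trade(conn, seller, buyer, item, price):
    """Move `item` from seller to buyer and `price` gold the other way.

    Returns True if the whole trade was committed, False if it was rolled back.
    """
    try:
        with conn:  # commits on normal exit, rolls back if an exception escapes
            cur = conn.execute(
                "UPDATE players SET gold = gold - ? WHERE name = ? AND gold >= ?",
                (price, buyer, price),
            )
            if cur.rowcount != 1:
                raise ValueError("buyer cannot afford the item")
            cur = conn.execute(
                "UPDATE items SET owner = ? WHERE name = ? AND owner = ?",
                (buyer, item, seller),
            )
            if cur.rowcount != 1:
                raise ValueError("seller does not own the item")
            conn.execute(
                "UPDATE players SET gold = gold + ? WHERE name = ?", (price, seller)
            )
        return True
    except ValueError:
        return False
```

In this sketch the grain that handles the Trade call would run `trade` and report success only when it returns True. If the second step fails, the gold already deducted in the first step is rolled back along with everything else.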

The feature should be available in a couple of months from the meet-up time which was around a month ago I guess.

philbe · 2016-11-01T00:33:01Z

A technical report that describes our transaction mechanism is now available here, written by @tamereldeeb and me. We'll make code available soon. Initially, it will support optimistic concurrency control (OCC), not two-phase locking as described in the tech report. We were hoping to add the OCC description to the tech report before publishing it. But we've been slow and decided not to delay further. We'll update the tech report with an OCC description later.
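For readers unfamiliar with the distinction: under optimistic concurrency control a transaction takes no locks while it runs. It records the version of everything it reads, buffers its writes, and validates those versions at commit time. The sketch below is a generic, single-process illustration in Python. It is not the mechanism from the tech report or the Orleans implementation, and `VersionedCell` and `OptimisticTransaction` are hypothetical names.

```python
class VersionedCell:
    """A piece of state plus a version counter bumped on every committed write."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class OptimisticTransaction:
    """Buffers writes and validates observed versions at commit (no locks held)."""

    def __init__(self):
        self.read_versions = {}  # cell -> version seen when first touched
        self.writes = {}         # cell -> buffered new value

    def read(self, cell):
        if cell in self.writes:  # read-your-own-writes
            return self.writes[cell]
        self.read_versions.setdefault(cell, cell.version)
        return cell.value

    def write(self, cell, value):
        # Assumption: blind writes are validated too, so they record a version.
        self.read_versions.setdefault(cell, cell.version)
        self.writes[cell] = value

    def commit(self):
        """Return True and apply the writes if nothing touched has changed."""
        # In a real system validation and apply must happen atomically.
        for cell, seen in self.read_versions.items():
            if cell.version != seen:
                return False  # conflict: caller should retry the transaction
        for cell, value in self.writes.items():
            cell.value = value
            cell.version += 1
        return True
```

With two transactions that both read the same cell and then write it, the first `commit()` succeeds and the second returns False and has to be retried, whereas under two-phase locking the second would have waited for the first to release its lock.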

ashkan-saeedi-mazdeh · 2016-11-01T09:53:49Z

@philbe The link is not working

stgolem · 2016-11-01T12:51:11Z

@ashkan-saeedi-mazdeh This is simple when we have some kind of "committer" grain that can decide the whole piece of work. But in a more general environment, where every grain does some part of the work, I want them all to "commit" together.
And yes, the link is not working ((

philbe · 2016-11-01T16:15:45Z

@ashkan-saeedi-mazdeh and @stgolem. Very sorry, I accidentally made it private to Microsoft. It's public now, so you should be able to access it.

centur · 2016-11-01T22:05:06Z

@philbe It's still acting weird:
When opened in a normal browser window, it gives this:

400 Bad Request
Request Header Or Cookie Too Large
nginx/1.11.3

But it opens in-private just fine... Just in case - here is a direct link to PDF

philbe · 2016-11-01T22:42:13Z

@centur Indeed, that's strange. What browser are you using?

centur · 2016-11-02T06:53:35Z

Google Chrome. But I am logged in to the Azure portal in that profile (so it sends a bunch of cookies with requests to microsoft.com).
I suspect it may be the same for others, so I added the direct link. Issue worked around :)

AshkanSaeedi · 2016-11-02T10:20:19Z

@philbe Read it all this morning with a cold. Really interesting, but two things.

1- On page 6 the report says that Orleans automatically writes state back to storage when a grain deactivates, but in fact it doesn't. It might be interesting to add that feature, or the possibility of destructors, but for now Orleans doesn't persist state at deactivation time.

2- Why did you use a separate process for the TM instead of hosting it in the silos? To scale it separately? Or because you wanted all the machine's resources for the TM?

philbe · 2016-11-02T16:12:34Z

@AshkanSaeedi
1- Good catch. It's incorrectly worded. It should say "It uses that mapping to populate a grain state when it is activated. It also allows a grain to save its state at any time, e.g., just before returning from a method call that modifies its state or when the grain is deactivated."

2- Yes and Yes. The TM throughput was too low when it ran in a silo.

stgolem · 2017-11-02T12:33:44Z

@philbe @sergeybykov Is there any news about the progress here?
Should we expect an implementation in the near future, or do we have to solve this problem on our own?

ReubenBond · 2017-11-02T21:39:23Z

@stgolem 2.0.0-beta1 includes early support for transactions :) There are tests demonstrating its usage here while documentation is pending

sergeybykov · 2018-10-08T21:50:41Z

2.1.0 includes an "RC" quality implementation of cross-grain transactions.

jthelin added the enhancement label Jun 28, 2016

gabikliot added the Status: investigating label Jul 1, 2016

centur added this to the Backlog milestone Nov 7, 2016

lmagyar mentioned this issue May 26, 2017

TransactionScope activity support OrleansContrib/Orleans.Activities#40

Closed

benjaminpetit added the help wanted label Jul 26, 2018

sergeybykov modified the milestones: Backlog, 2.2.0 Sep 12, 2018

sergeybykov removed help wanted Status: investigating labels Oct 8, 2018

sergeybykov closed this as completed Oct 8, 2018

ghost locked as resolved and limited conversation to collaborators Sep 29, 2021
