Review Transaction Semantics #153

gregrluck · 2013-06-19T16:06:31Z

While the specification is quite complete, we need to do a full pass to understand and fully specify how each operation interacts with transactions.

We will likely look at this after the first public review.

e.g.
What is the impact of transactions on expiry? How do they interact? e.g. does expiry cause a transaction not to be committed? e.g. might a getAndReplace fail if the entry has expired by the time of the commit?
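
One way to make the expiry question concrete is the toy model below. It is plain Java, not JSR-107 API: `ExpiringStore`, `Tx`, and the injected clock are hypothetical names, and the rule that a buffered getAndReplace aborts the commit when the entry has expired in the meantime is only one possible answer, assumed here for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Toy model only: none of these types exist in javax.cache.
class ExpiringStore {
    private static final class Slot {
        final String value;
        final long expiresAt;
        Slot(String value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<String, Slot> slots = new HashMap<>();
    private final LongSupplier clock;   // injected so the example is deterministic
    private final long ttl;

    ExpiringStore(LongSupplier clock, long ttl) { this.clock = clock; this.ttl = ttl; }

    void put(String key, String value) {
        slots.put(key, new Slot(value, clock.getAsLong() + ttl));
    }

    // Returns null when the key is absent or its entry has expired.
    String get(String key) {
        Slot s = slots.get(key);
        return (s == null || clock.getAsLong() >= s.expiresAt) ? null : s.value;
    }

    // A "transaction" that buffers replace operations until commit.
    final class Tx {
        private final Map<String, String> pendingReplaces = new HashMap<>();

        // Mirrors getAndReplace: returns the current value and buffers the new one.
        String getAndReplace(String key, String newValue) {
            String old = get(key);
            if (old != null) pendingReplaces.put(key, newValue);
            return old;
        }

        // Assumed policy: if any replaced entry expired in the meantime, abort everything.
        boolean commit() {
            for (String key : pendingReplaces.keySet()) {
                if (get(key) == null) return false;   // expired between replace and commit
            }
            pendingReplaces.forEach(ExpiringStore.this::put);
            return true;
        }
    }

    Tx begin() { return new Tx(); }
}
```

Under this assumed policy, a transaction that replaces an entry and then outlives the entry's time-to-live fails to commit. The opposite policy (treat the replace as having succeeded at the time it was issued) is equally easy to write, which is why the behaviour needs to be specified.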

verydapeng · 2013-07-10T09:47:23Z

Can someone explain a typical use case for a cache participating in a transaction? IMO, cache = some final value that can be shared for a period of time ...

Cotton-Ben · 2013-07-10T10:05:56Z

see slides 38-43 at

https://community.jboss.org/servlet/JiveServlet/download/827161-100215/%40JPMorgan%3D%3DJavaKnowledgeForum%3DFINAL52Data%20Locality%20Latency%20and%20Caching.pptx

verydapeng · 2013-07-10T10:42:25Z

thx ben

Cotton-Ben · 2013-07-11T14:47:33Z

You're welcome. Also, there is some spirited discussion re: this subject in our community discussion forum ( https://groups.google.com/forum/#!forum/jsr107 )

gregrluck · 2013-07-18T03:13:09Z

Reviewed the Transaction section in the spec doc. Applied reformatting and fixed a couple of grammar errors. Raised #189 for missing exception types. We have a few people reviewing this area over the next few days. We know there are a lot of method interactions which need to be specified.

brianoliver · 2013-08-01T13:11:24Z

With respect to CacheWriters, in the specification we state: "The semantics of Transactional Consistency are implementation specific." This really needs some explanation. I think it's very incomplete.

eg: Consider a distributed cache, backed by a cache writer to some underlying store, managed by a cluster of n servers. Assume a developer can start a transaction against the distributed cache. They manipulate many entries (across the cluster) and then do a commit.

Currently the API does not provide the ability for the backing Cache Writers to be "prepared" with the "preparation" of the Cache entries in the transaction. It's obvious that the transaction must be two (or three) phased, but as we don't have the ability to provide the CacheWriter (or Loader) with transaction information, and/or call prepare / commit, we have to assume that the CacheWriter is non-transactional!

This is a very significant challenge. The writing to the CacheWriters across the n servers can only occur in the "commit" phase and if there's a single failure (or timeout), the entire transaction is now corrupted.
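
The failure mode can be sketched in a few lines. This is a hypothetical model, not the JSR-107 CacheWriter interface: `SimpleWriter` and `commitWriteThrough` are illustrative names, and each writer stands in for the backing store behind one of the n servers. Because the writers have no prepare phase, a failure part-way through the commit leaves the earlier writes in place.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a non-transactional writer on one server.
interface SimpleWriter {
    void write(String key, String value);   // may throw; offers no prepare or rollback
}

class CommitPhaseDemo {
    // Applies each pending entry through its writer during "commit".
    // Returns the keys that were written before the first failure.
    static List<String> commitWriteThrough(Map<String, String> pending,
                                           Map<String, SimpleWriter> writerForKey) {
        List<String> written = new ArrayList<>();
        for (Map.Entry<String, String> e : pending.entrySet()) {
            try {
                writerForKey.get(e.getKey()).write(e.getKey(), e.getValue());
                written.add(e.getKey());
            } catch (RuntimeException failure) {
                // Too late to abort cleanly: earlier writes cannot be undone
                // because the writers were never asked to prepare.
                break;
            }
        }
        return written;
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        SimpleWriter healthy = store::put;
        SimpleWriter failing = (k, v) -> { throw new IllegalStateException("server down"); };

        Map<String, String> pending = new LinkedHashMap<>();
        pending.put("a", "1");
        pending.put("b", "2");
        pending.put("c", "3");

        Map<String, SimpleWriter> writers = new LinkedHashMap<>();
        writers.put("a", healthy);
        writers.put("b", failing);
        writers.put("c", healthy);

        List<String> written = commitWriteThrough(pending, writers);
        System.out.println("written before failure: " + written + ", store: " + store);
    }
}
```

In the run above, the entry for "a" reaches the store while "b" and "c" do not, so the underlying store reflects only part of the transaction.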

There are numerous possible solutions to this issue, none of which are very good.

Don't allow CacheLoaders or CacheWriters to be used with Transactional Caches. This isn't too bad because if a developer is using a Transactional Cache, they are also Transacting against another resource, most likely a Database (or Messaging System). ie: The Cache should not be managing the underlying Database resource transactions. That could/should be done at the application level.

While this works, I can easily think of use-cases where this breaks down quickly.

Define a new API for CacheLoaders and CacheWriters that Transactional Caches use. This API would provide the appropriate transactional context (sub-transactions of the application transaction), which could then be used by CacheLoaders/Writers etc. However, it starts to erode the consistency of the API we have created. eg: We'd need to create a new type of Configuration for Transactional Caches. This isn't ideal.

brianoliver · 2013-08-01T19:05:10Z

(as identified by Bill Shannon)

Currently the specification only allows transactions to be supported or not supported. Instead we should be able to determine if local or XA transactions are supported. eg: some implementations may not support both.
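
A minimal sketch of what such a capability query could look like follows. Everything in it is hypothetical: `TransactionMode`, `TransactionSupport`, and `supports` are names assumed for illustration and are not part of the JSR-107 API.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical: the transaction modes a provider might advertise.
enum TransactionMode { LOCAL, XA }

// Hypothetical capability holder; an empty set means "transactions not supported".
class TransactionSupport {
    private final Set<TransactionMode> modes;

    TransactionSupport(Set<TransactionMode> modes) {
        // EnumSet.copyOf rejects an empty non-EnumSet collection, so handle that case.
        this.modes = modes.isEmpty() ? EnumSet.noneOf(TransactionMode.class)
                                     : EnumSet.copyOf(modes);
    }

    boolean supports(TransactionMode mode) { return modes.contains(mode); }

    boolean supportsAny() { return !modes.isEmpty(); }
}
```

With a query of this shape, an application could check for LOCAL or XA individually instead of relying on a single supported / not supported flag.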

Cotton-Ben · 2013-08-01T22:42:18Z

Brian,

Musing openly (and quickly), let me just throw out some of our growing concerns. We are now not sure that it is even possible to completely specify Transactions semantics given how we formally define "Cache" capability guarantees. The point you made about 'how do we test a Cache's transactional completeness given that a Cache, formally, has no time of durability guarantee?' is VERY VALID.

Musing openly, our use case (very much definitely) needs something that we have traditionally called a "Cache" to be at least transactional capability ambitious. Though we once said otherwise, we now believe that we don't need a "Cache" to guarantee transactional completeness -- especially wrt XA. We don't use XA and it introduces hellish complications. The important thing is that something that we call a "Cache" allow us to operate w/ accommodation for all of our use cases that are DIRTY_READ intolerant and PHANTOM_READ intolerant. i.e. we do need TX_ISOLATION completeness, but we do not need full XA capability completeness. The points you make about how striving to provide this completeness will basically disarm the whole efficacy of a Cache's core usefulness are gaining clarity with us.

We like Bill Shannon's consideration that we should be able to determine if local or XA transactions are supported. To heck with full TX completeness. Maybe clarify the semantics that we specify to consider this directly, possibly offering considerations to bridges for potential completeness? E.g. when Queues and Database participants do XA they enlist via API bridges like javax.sql.XADataSource and javax.jms.XAQueueConnection ... maybe we just specify (purposefully vaguely?) something like a javax.cache.XACacheSource as a bridge for implementations to consider providing XA?

Thx, -Ben

cruftex · 2013-08-06T15:36:16Z

Hi,

I wrote up some points on transactions here:

http://www.headissue.com/pub/jsr107-review-20130724.html#transactions

However, I don't know whether this leads somewhere, just another try to sort things.

Interestingly, there are many ways to address the topic (expectations, needs) which don't perfectly fit together.

Let me try an "ask stupid questions" approach:

What implementations exist in real life, so it is worth a standard?
What exactly is expected from a cache when data is accessed/read during a transaction? Mustn't it be the latest one because we do a transaction? Depending on whether the data source is accessed exclusively via the loader/writer or non-exclusively, the cache behavior needs to change (or be dropped?).
What applications need transaction awareness and will communicate directly with the caching API, so that this must be solved within the cache? What about collecting use cases for this?

My current conclusion is:

Caching and transactions make sense if the data source is exclusively accessed via loader/writer by the cache, or the "cache" is the storage itself (e.g. with persistence, or k/v in-memory store)
If the data is accessed non-exclusively, the coherency issues, which may also be "cross technology", will be hard or even impossible to solve.
If accessed non-exclusively, everything above READ_COMMITTED means that no caching can occur
The rules of transaction isolation need to be rethought with regard to how they apply to a cache. E.g. if I do a commit on some Cache.put operation, is it successful even if some entries were already evicted, or is eviction not allowed during a transaction?

Best,

Jens

Cotton-Ben · 2013-08-06T18:46:48Z

What implementations exist in real life, so it is worth a standard?

Most of the big Caching providers (EhCache, Coherence, eXtreme Scale, Infinispan, etc. ) have indicated they will deliver JSR-107 compliant implementations that deliver the Transactions option.

What exactly is expected from a cache when data is accessed/read during a transaction? Mustn't it be the latest one because we do a transaction? Depending on whether the data source is accessed exclusively via the loader/writer or non-exclusively, the cache behavior needs to change (or be dropped?).

The 'Cache' (or at least what my team has traditionally called a 'Cache') needs to guarantee a full ACID capability for a user-demarcated set of operation(s) (on the 'Cache' operand(s)).

This is a non-trivial challenge to deliver full ACID. For those use cases that only need the 'A' in ACID, JSR-107 provides the javax.cache.Cache.EntryProcessor<K,V,T> interface.
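
To illustrate the atomic, single-entry style of update that an entry processor is meant for, here is a sketch using only the JDK. It deliberately does not use the JSR-107 interface: `ConcurrentHashMap.compute` stands in for the idea that a read-modify-write on one key runs atomically, and `AtomicCounterCache` is a hypothetical name.

```java
import java.util.concurrent.ConcurrentHashMap;

// JDK-only stand-in for an entry-processor style update; not the JSR-107 API.
class AtomicCounterCache {
    private final ConcurrentHashMap<String, Integer> entries = new ConcurrentHashMap<>();

    // ConcurrentHashMap runs the remapping function atomically for the key, so
    // concurrent callers never lose an increment, yet no multi-key transaction
    // is involved.
    int increment(String key) {
        return entries.compute(key, (k, current) -> current == null ? 1 : current + 1);
    }

    Integer get(String key) { return entries.get(key); }

    public static void main(String[] args) throws InterruptedException {
        AtomicCounterCache cache = new AtomicCounterCache();
        Runnable work = () -> { for (int i = 0; i < 1000; i++) cache.increment("hits"); };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(cache.get("hits"));   // prints 2000
    }
}
```

This gives atomicity for one entry only. It says nothing about isolation or consistency across several entries, which is where the harder questions in this thread begin.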

Note that I have qualified what we call a 'Cache'. As your white-paper (thank you, BTW) points out (and with which I agree):

This means, within a transaction context the semantics of a cache will be redefined, it is considered to act like a transactional storage and not like a cache. The reason for this is, that it is not allowed to "forget" during the transaction. The paradoxon is, when the transaction commits, removing the mappings of all entries involved (invalidating) seems fairly legal to me.

Yeah. EXACTLY. It gets really hairy if you try to axiomatically derive even the concept of a "Pure Cache" doing Transactions. I like this point from your white-paper.

What applications need transaction awareness and will communicate directly with the caching API, so that this must be solved within the cache? What about collecting use cases for this?

For a cursory intro, see slides 38-43 at https://community.jboss.org/servlet/JiveServlet/download/827161-100215/%40JPMorgan%3D%3DJavaKnowledgeForum%3DFINAL52Data%20Locality%20Latency%20and%20Caching.pptx

For more detail, see our discussion here https://groups.google.com/forum/#!topic/jsr107/MP1ae96LMvM

If accessed non-exclusively, everything above READ_COMMITTED means that no caching can occur

If you are not being "Pure Cache" axiomatic, I disagree.

The rules of transaction isolation need to be rethought with regard to how they apply to a cache. E.g. if I do a commit on some Cache.put operation, is it successful even if some entries were already evicted, or is eviction not allowed during a transaction?

I don't see it that way. We really need to only specify the isolation interface and semantics ... which IMHO are well done in Greg and Ludovic's most recent writing of Chapter 5.

But, again, your points about the axiomatic implications of a 'Pure Cache' being transactional are noted and appreciated.

brianoliver · 2013-08-06T19:34:44Z

Ben,

This statement:

Most of the big Caching providers (EhCache, Coherence, eXtreme Scale, Infinispan, etc. ) have indicated they will deliver JSR-107 compliant implementations that deliver the Transactions option.

Is simply not true. No vendor has or can commit to implementing this optional feature. Furthermore there is no notion of "compliance" in this space as there are no TCK tests. An implementation can't be "compliant" unless it passes all of the TCK tests.

Why are there no TCK tests? Simply because there is no agreement on the semantics of transactions, especially with respect to expiry and eviction.

-- Brian

Cotton-Ben · 2013-08-06T20:11:31Z

My bad. You are (of course) correct .... as written, that sentence cannot be true, it should have said

Most of the big Caching providers I have communicated with have indicated an ambition to deliver the JSR-107 Transactions option.

And with no valid TCK, I agree that any such ambition is moot.

What to do?

brianoliver · 2013-08-06T20:21:19Z

What to do?

I wish I knew, because I know the use-cases you're trying to solve. :-)

Perhaps it's a magic 🎱

I think more discussions will need to happen. It's good we're getting to them. It's a hard problem and we really appreciate the work, effort, and thoughts on this from everyone.

cruftex · 2013-08-08T11:59:13Z

If accessed non-exclusively, everything above READ_COMMITTED means that no caching can occur

If you are not being "Pure Cache" axiomatic, I disagree.

Ben, what do you mean by "Pure Cache" axiomatic?

BTW: I was wrong, however: above READ_COMMITTED the cache can expect that the value does not change once read, so it can cache within the transaction.
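
The correction above can be sketched as follows: under an isolation level where a value must not change once it has been read, a transaction can keep its own snapshot of each value it reads, even if the shared source changes afterwards. `RepeatableReadTx` and the `Function`-based source are hypothetical names for illustration, not JSR-107 types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical transaction-scoped read cache; not part of javax.cache.
class RepeatableReadTx {
    private final Function<String, String> source;   // e.g. a shared, non-exclusive store
    private final Map<String, String> readSet = new HashMap<>();

    RepeatableReadTx(Function<String, String> source) { this.source = source; }

    // The first read goes to the source; later reads of the same key are
    // served from the transaction's own snapshot, so they cannot change.
    String read(String key) {
        if (!readSet.containsKey(key)) {
            readSet.put(key, source.apply(key));
        }
        return readSet.get(key);
    }
}
```

A second transaction started after the source changes would see the new value, since the snapshot lives only as long as the transaction that owns it.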

Cotton-Ben · 2013-08-08T13:02:30Z

Ben, what do you mean by "Pure Cache" axiomatic?

Actually, I was taking some poetic licence by using that term ... no such term exists in the literature.

The idea is this: when mathematicians make statements re: algebras, they are grounded in very intense rigor. They start from the axioms (fundamental truths - that apply in all frames of reference - wrt statements about the algebra's ({operators},{operands}) ... e.g. 0 = 0 is the 'reflexive' axiom in the algebra of Natural numbers). From these axioms, they then make "statements" about the algebra and categorize/promote these statements as they improve in quality and maturity. e.g. statements can evolve from conjecture-->lemma-->theorem-->law. These statements get promoted as they survive proof arguments (which depend explicitly on axiomatic bases).

If we apply this kind of rigor to a "Caching algebra" ... and we want to make quality statements about Caching's operators and operands, well we're in kind of trouble wrt to Transactions!

Let's say we have this conjecture: "Caches can be Transactionally sound and complete". And now we want to promote this Caching statement to be a theorem. Well, if one of Pure Caching's "axioms" (actually definition) is that "a Cache can evict an Entry at any time and has ZERO durability obligation" then our conjecture is in big trouble. Anyone can correctly counter this conjecture's efficacy by saying "Caches cannot be Transactionally sound and complete" because the statement is inconsistent with a Pure Cache's axioms.

So I am resigned to altering the original conjecture, modifying it to say "Something that I have historically called a Cache can be Transactionally sound and complete". Maybe that statement can get promoted to something of higher quality than a conjecture, but a "Pure Cache" won't be bothered with this consideration. Can't do it.

I took a lot of license using the term "Pure Cache axiomatic" in the last post (but you get what I mean).

Cotton-Ben · 2013-08-08T13:13:12Z

one of Pure Caching's "axioms" (actually definition) is that "a Cache can evict an Entry at any time and has ZERO durability obligation"

Brian, Greg - You guys may have answered this already, but is this statement accepted as fundamentally true with regard to a "Pure Cache"?

brianoliver · 2013-08-08T14:23:40Z

Yes. That's true. When we talk about Caches, we're always talking about caches that have those conditions.

lorban · 2013-08-08T15:52:35Z

I would like to answer Brian Oliver on the comment he posted about a week ago (#153 (comment)).

I personally don't see any problem with transactional caches and cache writers/readers. If a cache is configured for local transactions, then the transaction's context is local to the cache (hence its name) and should not be propagated in any way to any other resource. If you want a transaction to span across your cache and some other resource(s), this is what XA is for: you should configure your cache as XA and make sure the other resources the cache writer/reader accesses are XA compliant too and will enlist in the XA transaction's context. This may require propagating the context with the use of suspend/resume if different threads are used, but that's standard JTA stuff.

I see no need for anything more than what we currently have.

brianoliver · 2013-08-08T16:14:38Z

:) It doesn't look like a problem until a vendor tries to implement them. Like many things that have been designed, on the surface they look ok, but the devil is in the details.

As pointed out, the challenge with the specification is that it basically forces implementations to support both if they are to support the "optional" transactions feature. It's been proposed that this be split into two parts, optionally supporting Local and XA.

For XA the APIs actually need to be changed - as confirmed by those implementors that attempt to do this. Changing the API would also make Local transactions a bit easier to implement, but then we have two different Caching APIs.

Cotton-Ben · 2013-08-09T15:46:17Z

@lorban wrote: I see no need for anything more than what we currently have.

If the obligation of our spec is only to specify a sound/complete interface and limited semantics (leaving the devilish details solely to the implementation), then I agree with Ludovic. If so, what we have in Chapter 5 and Chapter 6 right now is perfectly fine. But, if it is the obligation of the spec to unburden the implementation by providing more semantic details (especially wrt XA specifics) then our spec may indeed need to say more.

@brianoliver wrote: It's been proposed that this be split into two parts, optionally supporting Local and XA.

This may be an ideal compromise.

I agree with Brian and Bill Shannon's proposal to do this split into two separate options (my agreement here may be a bit selfish .... both because (i) we don't use XA (ii) we now agree that XA betrays to some degree a Cache's fundamental efficacy).

If the devilish details of doing full XA would indeed derail any implementation's otherwise sound/complete local transactions capability from being JSR-107 compliant, then by all means let's make XA transactions and local transactions separately optional.

This proposal seems to safely liberate both spec and implementation from obligation to be "burdened".

Brian, Greg, would it be at all appropriate to put this proposal to an EG vote at https://groups.google.com/forum/#!forum/jsr107?

gregrluck · 2013-09-04T03:20:46Z

Removing transactions from V1

brianoliver closed this as completed Aug 6, 2013

brianoliver reopened this Aug 6, 2013

gregrluck closed this as completed Sep 4, 2013
