Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISPN-2384 Avoid losing entries with concurrent passivation/activation #1455

Closed
wants to merge 1 commit into from

Conversation

galderz
Copy link
Member

@galderz galderz commented Nov 7, 2012

REQUIRES CAREFUL REVIEW

  • Fixed by making sure that evicting an entry as a result of exciding the given max size is done within the internal hash map segment lock, and make sure activations happen within this lock too.
  • This way, concurrent passivation/activation operations will be lined in such way that removing from store (activation) won't happen at the same time as removing entries from memory without providing guarantees that the entry remains in memory.
  • Added a unit to replicate this entry loss.
  • Activation interceptor still needed to handle environments where eviction happens manually, via cache.evict() calls.



@LogMessage(level = WARN)
@Message(value = "Unable to remote entry under %s from cache store after activation", id = 208)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@galderz
Copy link
Member Author

galderz commented Nov 20, 2012

@tristantarrant Fixed typo and rebased

// If removing (and not evicting), activation rules
// require entry to be removed from cache store (via
// activation) and counter to be updated
evictionListener.onEntryActivated(key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "activate" here? A CacheStore.remove() may be cheaper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onEntryActivated does two things:

  1. store.remove(key)
  2. if stats are enabled, it updates the activation counter

onEntryActivated is called so that both of those things happen (if stats configured...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but think about it from a code readability perspective. You're calling a method which implies "do something because this entry has been activated" on a codepath that means "do something, this entry has been removed!".

Why should an entry removal increment an activation counter anyway? That's incorrect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The activation happens on removal by returning the previous value, which might, or might not come from the cache store. The problem is that at the data container level, I've got no idea whether this is the case or not. That's where the activation count increase comes from. Such information is only available at the interceptor level.

The problem of all of this is that the activation/passivation logic is split between interceptors and data container. Clearly, keeping the logic in the interceptor didn't work since it opened consistency gaps. At the data container level, I'm missing a lot of information of about the operation (i.e. flag information, executed command, context...etc)

Btw, I haven't added this code for the fun of it, but rather to carry on being consistent with how ActivationInterceptor was working. If you look at the current ActivationInterceptor, the removal increments the activation counter. Without this code existing unit tests will fail.

To sum up, I've gone for the approach that allowed us to cover the edge cases without changing the way passivation/activation counters are updated. TBH, we've never sat down and wrote down exactly how the activation/passivation should really work. We probably inherited it from the JBoss Cache days.

At one point we should find a better way to deal with passivation/activation wo/ this split of logic, but that's probably gonna require some decent refactoring, which I wanted to avoid right now.

If you have any other ideas, I'm all ears :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have another idea. Why not just maintain the activation counter in the interceptor and if the remove succeeds, update the counter. I don't see why they are so tightly coupled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the current ActivationInterceptor, it there's a remove() called, it removes from the store and updates the
activation counter.

Isn't that incorrect? Logically, a remove is not an activation event, if you define activation as the process of moving something from a cache store into memory. I think we should fix the test (which seems incorrect) as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, remember the reason for moving away from the activation interceptor for keeping the counters and logic to remove from store: the fact that I need to do activation from within the hash map's put call.

I appreciate the comments but if you have other solutions in mind, I'd definitely suggest giving them a shot because tests can easily break. A great deal of time was spent fixing this and finding a way to solve it in such way that all tests passed too, code duplication was avoided and logic split was kept to a bare minimum.

Sure, I can understand I might have got some observations wrong above. I coded this a month ago (!!) and so my recollection is a bit buzzy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't be so fixated on doing whatever it takes to make a test pass. This can lead to incorrect understanding of why things are done in the first place; keep in mind that a test may be incorrect as well.

This is what I gather from this case.

  • The understanding of what activation is, seems wrong. As such, the test is testing for an activation count increasing when an entry is removed. This is not correct.
  • I am fine with counters being stored in the eviction listener as you have done, but line 1731, a removal, should not increment this counter.
  • As such, line 1731 should also not invoke a method called onEntryActivated(), which is confusing, leading to code smell - since the entry is not really activated at all.
  • Maybe just add another method on the eviction listener called removeEntryFromCacheStore(), which does exactly what it says, and doesn't update any counters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, finally we're getting somewhere :).

I agree with you, the way we have specified activation to work is a bit vague. In fact I asked myself similar questions you are asking. I started a thread in the dev list to discuss changes to the way we're doing activation:

http://lists.jboss.org/pipermail/infinispan-dev/2012-November/011628.html

And support (Dennis) was against changes here:

http://lists.jboss.org/pipermail/infinispan-dev/2012-November/011629.html

The thing about removal is the fact that it can optionally return the previous value, and that's why according to support guys it should be considered an activation, but really, these values are not stored in memory any longer.

Happy to go with your view though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let me know when your changes are in and I'll handle the PR.

@galderz
Copy link
Member Author

galderz commented Dec 2, 2012

@maniksurtani I've rebased again to keep up with latest master changes. If you're happy with it, can you handle the pull req? Or you wanna comment further?

@maniksurtani
Copy link
Member

Pls see my comment above.

@galderz
Copy link
Member Author

galderz commented Dec 12, 2012

@maniksurtani, the pull req is ready

* Fixed by making sure that evicting an entry as a result of exciding
the given max size is done within the internal hash map segment lock,
and make sure activations happen within this lock too.
* This way, concurrent passivation/activation operations will be lined
in such way that removing from store (activation) won't happen at the
same time as removing entries from memory without providing guarantees
that the entry remains in memory.
* Added a unit to replicate this entry loss.
* Activation interceptor still needed to handle environments where
eviction happens manually, via cache.evict() calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants