Replace guava multimap in PCBC with custom impl #1569
For a long time, PerChannelBookieClient (PCBC) has used Guava's LinkedListMultimap to store conflicting V2 completion keys and values. This is problematic because completion keys are pooled objects.

When a key-value pair is stored in a LinkedListMultimap and it is the first value for that key, a collection is created for the values and added to a top-level map under that key; the key and the value are then added to the collection. When a second value is added for the same key, the key and value are simply added to the existing collection. The problem occurs when the first key is removed: PCBC recycles the key object, but that object is still in use as a key in the multimap's top-level map. This causes all sorts of fun, like NullPointerException and IllegalStateException.

Because of this, this patch introduces a very simple multimap implementation that stores the key only once (in the collection) and uses the hashCode of the key to separate the collections into buckets. It's pretty inefficient, but this code is only hit in the rare case where a client tries to read or write the same entry from the same ledger more than once at the same time.
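To make the description above concrete, here is a minimal sketch of a multimap of that shape: every key lives only inside the entry that also holds its value, and `hashCode` is used solely to pick a bucket. The class name `SketchMultimap`, the `Entry` holder, the fixed bucket count, and `put`/`removeAny` are assumptions made for illustration; only the name `getAnyKey` appears in the diff under review. The sketch is not thread-safe and is not the code in this patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical sketch: a multimap that never uses a key object as the key of
 * an outer map. Each key is stored next to its own value, so removing a pair
 * drops the only reference the structure holds to that key object.
 */
final class SketchMultimap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        final V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    SketchMultimap(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // hashCode only selects a bucket; equality is checked inside the bucket.
    private List<Entry<K, V>> bucketFor(K key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    void put(K key, V value) {
        bucketFor(key).add(new Entry<>(key, value));
    }

    /** Removes and returns the oldest value stored under an equal key, if any. */
    Optional<V> removeAny(K key) {
        Iterator<Entry<K, V>> it = bucketFor(key).iterator();
        while (it.hasNext()) {
            Entry<K, V> e = it.next();
            if (e.key.equals(key)) {
                it.remove();
                return Optional.of(e.value);
            }
        }
        return Optional.empty();
    }

    /** Returns some key that still has a value, or empty once everything is removed. */
    Optional<K> getAnyKey() {
        for (List<Entry<K, V>> bucket : buckets) {
            if (!bucket.isEmpty()) {
                return Optional.of(bucket.get(0).key);
            }
        }
        return Optional.empty();
    }
}
```

Lookups scan a whole bucket linearly, which matches the "pretty inefficient" caveat above. In exchange, a caller that removes a pair can recycle that pair's key object without leaving a stale reference behind in the map.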
Looks good.
Awesome implementation and good catch!
@@ -21,7 +21,8 @@
import static org.apache.bookkeeper.client.LedgerHandle.INVALID_ENTRY_ID;

import com.google.common.base.Joiner;
import com.google.common.collect.LinkedListMultimap;
import com.google.common.collect.ArrayListMultimap;
Not sure it is used
indeed it is not. checkstyle should have picked that up.
it seems it did here. just not locally :/.
There are a few checkstyle issues.
rerun bookkeeper-server tls tests
while (completionObjectsV2Conflicts.get(key).size() > 0) {
    errorOut(key, rc);
}
Optional<CompletionKey> multikey = completionObjectsV2Conflicts.getAnyKey();
This 'getAnyKey()' sounds a bit weird to me; furthermore, inside the implementation we are allocating a bunch of objects (using the stream API).
I have an alternative proposal:
completionObjectsV2Conflicts.forEachValue( k -> errorOut(k, rc))
the iteration will be internal to the implementation, we will save method calls and allocations and overall the code will look better
errorOut modifies the collection, so forEachValue will give you a ConcurrentModificationException.
Regarding allocations, this isn't in a critical path, so I didn't take any care to avoid them, but escape analysis should get rid of most of them in any case.
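The failure mode described in this reply can be reproduced with a plain `ArrayList` standing in for the conflicts collection. This is a minimal sketch under that assumption: `DrainLoopSketch`, its simplified `errorOut`, and the `failed` list are hypothetical stand-ins, not the PCBC code. It contrasts internal iteration with a callback that removes elements against a loop that re-queries the collection on every pass, which is the shape a `getAnyKey()` loop takes.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

final class DrainLoopSketch {
    /** Stand-in for errorOut: handling a key also removes it from the pending list. */
    static void errorOut(List<String> pending, String key, List<String> failed) {
        pending.remove(key);
        failed.add(key);
    }

    /** Internal iteration while the callback mutates the list being iterated. */
    static boolean forEachThrows(List<String> pending) {
        try {
            pending.forEach(k -> errorOut(pending, k, new ArrayList<>()));
            return false;
        } catch (ConcurrentModificationException e) {
            // ArrayList.forEach detects the structural change and fails fast.
            return true;
        }
    }

    /** Re-query on every pass, so no iteration is in progress when the callback mutates. */
    static List<String> drain(List<String> pending) {
        List<String> failed = new ArrayList<>();
        while (!pending.isEmpty()) {
            errorOut(pending, pending.get(0), failed);
        }
        return failed;
    }
}
```

The re-querying loop terminates only because each call removes at least one element; a callback that could leave the collection unchanged would spin forever.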
Oh, I missed the internals of errorOut.
Okay for me
+1
This change is needed to be cherry-picked to The problem has been observed using
/cc @ivankelly @merlimat
Yes, this should go into 4.7.2
Author: Ivan Kelly <ivan@ivankelly.net>
Reviewers: Enrico Olivelli <eolivelli@gmail.com>
This closes apache#1569 from ivankelly/conc-test-flake

Descriptions of the changes in this PR: (cherry-pick #1569)
Master Issue: #1569
Author: Ivan Kelly <ivan@ivankelly.net>
Reviewers: Ivan Kelly <ivank@apache.org>, Enrico Olivelli <eolivelli@gmail.com>
This closes #1618 from sijie/cherry-pick-pcbc
(cherry picked from commit 83d3abe)
Signed-off-by: JV Jujjuri <vjujjuri@salesforce.com>