Add capability to remove keys from MemoryMapState #48
…opaque semantics)
+1
I don't understand the internals of Trident quite as well as the rest of Storm, and I am a bit confused by the change to OpaqueMap.multiGet. I can see that it is a good performance improvement, but I don't see how it relates to multiRemove. I am fine with putting it in, and the tests look fine, so I am +1 on the fix, but I would like to be sure that I understand the code correctly.
Trident stores the batch id with any state it stores, and it uses this batch id to decide how to apply updates. An opaque map updates from the "prev val" if the stored batch id matches the current batch, and from the current val otherwise (due to the nature of opaque semantics). The problem arises when you make multiple updates to the same state within one batch. The batch id is then the same on the second update, but you should update from the curr val rather than the prev val. The code was correct on updates but not on gets. If the value was updated in the batch, the current val should always be returned; otherwise the standard opaque semantics apply. Whether a value has had updates applied to it in the batch is detected with the "CachedBatchReadsMap".
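To make the read rule above concrete, here is a minimal Java sketch. It is not the Trident code: `OpaqueReadSketch`, `Stored`, `writtenThisBatch`, and the integer `add` operation are hypothetical stand-ins, with `writtenThisBatch` playing the role that the explanation assigns to "CachedBatchReadsMap".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names and types do not mirror Trident's classes.
class OpaqueReadSketch {
    /** A value pair tagged with the batch that last wrote it. */
    static final class Stored {
        final long batchId;
        final Integer prev; // value before batchId wrote to this key
        final Integer curr; // value after batchId wrote to this key
        Stored(long batchId, Integer prev, Integer curr) {
            this.batchId = batchId;
            this.prev = prev;
            this.curr = curr;
        }
    }

    private final Map<String, Stored> backing = new HashMap<>();
    // Keys written during the current batch attempt (assumed stand-in for the cache).
    private final Map<String, Stored> writtenThisBatch = new HashMap<>();
    private long currentBatch = -1L;

    void beginBatch(long batchId) {
        currentBatch = batchId;
        writtenThisBatch.clear(); // a new attempt starts with no in-batch writes
    }

    Integer get(String key) {
        Stored cached = writtenThisBatch.get(key);
        if (cached != null) {
            return cached.curr; // updated in this batch: always return curr
        }
        Stored s = backing.get(key);
        if (s == null) {
            return null;
        }
        // Standard opaque read: a matching batch id means this is a retry,
        // so curr came from the failed attempt and prev is the value to trust.
        return s.batchId == currentBatch ? s.prev : s.curr;
    }

    void add(String key, int delta) {
        Integer current = get(key);
        int base = (current == null) ? 0 : current;
        Stored s = backing.get(key);
        // Keep the pre-batch value as prev so a retry can still recover it.
        Integer keepPrev;
        if (s == null) {
            keepPrev = null;
        } else if (s.batchId == currentBatch) {
            keepPrev = s.prev;
        } else {
            keepPrev = s.curr;
        }
        Stored next = new Stored(currentBatch, keepPrev, base + delta);
        backing.put(key, next);
        writtenThisBatch.put(key, next);
    }
}
```

Without the `writtenThisBatch` check in `get`, a read that follows a write in the same batch would hit the matching batch id and return prev, which is the behaviour described as incorrect above.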
I get it now. Thanks for the explanation. +1
remove extra defn
Bug 46683 - Storm HDFS bolt should ack after tuples are synced
Vagrant storm/mesos cluster setup
The implementation maintains proper opaque semantics, so if the batch is retried the keys will still be there. Keys are only physically removed once the next batch starts. This also fixes a bug in OpaqueMap where a get after a put or update in the same batch did not return the value just written.
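One way to picture the deferred removal described here is the Java sketch below. It is an assumption-laden illustration, not the MemoryMapState implementation: `DeferredRemoveSketch`, `Entry`, `pendingRemovals`, `readAtBatchStart`, and `isStored` are all hypothetical names. A remove writes a null curr under the current batch id, and the entry is only deleted when a different batch id begins.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of deferred removal; not the MemoryMapState code.
class DeferredRemoveSketch {
    /** prev is the value before batchId wrote curr. */
    static final class Entry {
        final long batchId;
        final String prev;
        final String curr;
        Entry(long batchId, String prev, String curr) {
            this.batchId = batchId;
            this.prev = prev;
            this.curr = curr;
        }
    }

    private final Map<String, Entry> db = new HashMap<>();
    private final Set<String> pendingRemovals = new HashSet<>();
    private long currentBatch = -1L;

    void beginBatch(long batchId) {
        // Only a genuinely new batch finalizes removals. A retry of the same
        // batch must still find the entry so that its prev value is readable.
        if (batchId != currentBatch) {
            for (String key : pendingRemovals) {
                db.remove(key);
            }
            pendingRemovals.clear();
        }
        currentBatch = batchId;
    }

    /** Opaque read as seen at the start of a batch attempt (no in-batch cache here). */
    String readAtBatchStart(String key) {
        Entry e = db.get(key);
        if (e == null) {
            return null;
        }
        return e.batchId == currentBatch ? e.prev : e.curr;
    }

    void put(String key, String value) {
        write(key, value);
        pendingRemovals.remove(key); // a re-put cancels the pending removal
    }

    void remove(String key) {
        write(key, null); // curr becomes null, prev survives for a retry
        pendingRemovals.add(key);
    }

    boolean isStored(String key) {
        return db.containsKey(key);
    }

    private void write(String key, String value) {
        Entry e = db.get(key);
        String prev;
        if (e == null) {
            prev = null;
        } else if (e.batchId == currentBatch) {
            prev = e.prev;
        } else {
            prev = e.curr;
        }
        db.put(key, new Entry(currentBatch, prev, value));
    }
}
```

In this sketch, removing a key in batch 2 and then retrying batch 2 still yields the old value from `readAtBatchStart`, while the entry disappears from the map as soon as batch 3 begins.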