
mempool: Keep cache hashmap and linked list in sync #2188

Merged
merged 3 commits into develop from dev/mempool_linked_list on Aug 13, 2018

Conversation

@ValarDragon (Contributor) commented Aug 8, 2018

This fixes bugs where the linked list could be full while the hashmap was empty.

Ref #2180

  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md
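
For context, the core of the fix is to make the map store a pointer to the tx's list element, so both removal and eviction delete from the map and the list under the same lock. Below is a minimal, self-contained sketch of that invariant; it uses plain []byte txs and illustrative names, not the PR's exact code.

package mempool

import (
    "container/list"
    "sync"
)

// txCache keeps the hashmap and the FIFO list in sync: every key in the map
// points at its element in the list, so removals always touch both structures.
type txCache struct {
    mtx  sync.Mutex
    size int
    txs  map[string]*list.Element // value is the list element holding the tx
    list *list.List               // oldest tx sits at the front, for eviction
}

func newTxCache(size int) *txCache {
    return &txCache{
        size: size,
        txs:  make(map[string]*list.Element, size),
        list: list.New(),
    }
}

// Push returns false if the tx is already cached. When the cache is full it
// evicts the oldest tx from both the list and the map before inserting.
func (c *txCache) Push(tx []byte) bool {
    c.mtx.Lock()
    defer c.mtx.Unlock()

    key := string(tx)
    if _, ok := c.txs[key]; ok {
        return false
    }
    if c.list.Len() >= c.size {
        oldest := c.list.Front()
        delete(c.txs, oldest.Value.(string))
        c.list.Remove(oldest)
    }
    c.txs[key] = c.list.PushBack(key)
    return true
}

// Remove deletes the tx from both the map and the list, so the cached bytes
// can actually be garbage collected instead of lingering in the list.
func (c *txCache) Remove(tx []byte) {
    c.mtx.Lock()
    defer c.mtx.Unlock()

    key := string(tx)
    if e, ok := c.txs[key]; ok {
        c.list.Remove(e)
        delete(c.txs, key)
    }
}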

This removes bugs with the linked list being full, but hashmap empty
@ValarDragon ValarDragon added this to Ongoing in current iteration via automation Aug 8, 2018
@ValarDragon ValarDragon moved this from Ongoing to Review Needed in current iteration Aug 8, 2018
@codecov-io commented Aug 8, 2018

Codecov Report

Merging #2188 into develop will decrease coverage by 1.28%.
The diff coverage is 90%.

@@            Coverage Diff             @@
##           develop   #2188      +/-   ##
==========================================
- Coverage    62.28%     61%   -1.29%     
==========================================
  Files          212     193      -19     
  Lines        17565   15761    -1804     
==========================================
- Hits         10941    9615    -1326     
+ Misses        5718    5336     -382     
+ Partials       906     810      -96
Impacted Files Coverage Δ
mempool/mempool.go 71.49% <90%> (+2.43%) ⬆️
rpc/grpc/client_server.go 66.66% <0%> (-4.77%) ⬇️
libs/db/remotedb/remotedb.go 36.84% <0%> (-4.69%) ⬇️
consensus/reactor.go 71.94% <0%> (-1.35%) ⬇️
consensus/state.go 76.09% <0%> (-1.1%) ⬇️
blockchain/pool.go 65.73% <0%> (-0.7%) ⬇️
libs/db/remotedb/grpcdb/client.go 0% <0%> (ø) ⬆️
abci/example/kvstore/persistent_kvstore.go 64.04% <0%> (ø) ⬆️
state/errors.go 0% <0%> (ø) ⬆️
p2p/trust/store.go 82.55% <0%> (ø) ⬆️
... and 41 more

@melekes (Contributor) left a comment

It would be great to create a small benchmark for the mempool and compare the old and new implementations.
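
(A sketch of what such a micro-benchmark could look like, assuming the illustrative txCache above and the standard encoding/binary and testing imports; the benchmarks actually added in this PR may differ.)

func BenchmarkCacheInsertTime(b *testing.B) {
    cache := newTxCache(b.N) // size b.N so nothing is evicted while inserting
    txs := make([][]byte, b.N)
    for i := 0; i < b.N; i++ {
        txs[i] = make([]byte, 8)
        binary.BigEndian.PutUint64(txs[i], uint64(i)) // unique 8-byte txs
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        cache.Push(txs[i])
    }
}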

@@ -528,16 +527,26 @@ func (cache *mapTxCache) Push(tx types.Tx) bool {
        // but deleting a non-existent element is fine
Contributor

comment can be removed now

    return true
}

// Remove removes the given tx from the cache.
func (cache *mapTxCache) Remove(tx types.Tx) {
    cache.mtx.Lock()
-   delete(cache.map_, string(tx))
+   stx := string(tx)
+   // TODO: Can we get value and index at the same time?
Contributor

nope

Contributor Author

Aww, okay. #postlaunch it may be worth experimenting with a "prehashed input" hashmap (a lot simpler than it sounds), where we just hash once with the same fast hash.
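
(Purely illustrative of that idea and not part of this PR: the map could be keyed by a fixed-size digest computed once per tx, so map operations never re-hash the full tx bytes. The names below are hypothetical, and SHA-256 just stands in for whatever fast hash would be used; the snippet assumes the crypto/sha256 import.)

// hypothetical sketch only, not the PR's code
type prehashedCache struct {
    byHash map[[sha256.Size]byte]struct{} // keyed by a 32-byte digest, not the raw tx
}

func (c *prehashedCache) push(tx []byte) bool {
    key := sha256.Sum256(tx) // hash the tx bytes exactly once
    if _, ok := c.byHash[key]; ok {
        return false
    }
    c.byHash[key] = struct{}{}
    return true
}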

@ValarDragon (Contributor Author) commented Aug 9, 2018

The benefit of this change is space savings + fixing bugs with cache eviction. Computation time will increase due to usage of the clist, so making a benchmark is a good idea.

Is there a good way to benchmark memory taken up? Just println unsafe.Sizeof?

@melekes (Contributor) commented Aug 9, 2018

go test -memprofile?
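
(For reference, something like this from the mempool package directory; -bench, -benchmem, and -memprofile are standard go test flags, and the output file name is arbitrary.)

go test -bench=Cache -benchmem -memprofile=mem.prof
go tool pprof mem.prof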

@ValarDragon (Contributor Author)

:o cool, didn't know about that

@ValarDragon (Contributor Author) commented Aug 9, 2018

before:

BenchmarkCacheInsertTime-8   	 5000000	       385 ns/op	      88 B/op	       3 allocs/op
BenchmarkCacheRemoveTime-8   	10000000	       152 ns/op	       0 B/op	       0 allocs/op

after:

BenchmarkCacheInsertTime-8   	 1000000	      1027 ns/op	     360 B/op	       7 allocs/op
BenchmarkCacheRemoveTime-8   	 2000000	       860 ns/op	     112 B/op	       2 allocs/op

So it looks like we're getting a performance hit due to the concurrent list. However, this performance hit is negligible compared to the signature verification a cached CheckTx saves, and we are now reclaiming memory for elements we remove from the cache. (The before case didn't remove them from the linked list at all; if we used the O(N) method to do that properly, it would certainly be slower.)

current iteration automation moved this from Review Needed to Reviewed Aug 9, 2018
@melekes (Contributor) commented Aug 9, 2018

However, this performance hit is negligible compared to the signature verification a cached CheckTx saves, and we are now reclaiming memory for elements we remove from the cache.

I tend to agree. Plus, if there are usually no "bad" txs in the network (e.g. a private deployment), the cache can always be disabled.

-   map_ map[string]struct{}
-   list *list.List // to remove oldest tx when cache gets too big
+   map_ map[string]*clist.CElement
+   list *clist.CList // to remove oldest tx when cache gets too big
Contributor

Is there some reason we needed to switch to CList?

Contributor Author

Yes. CheckTxs are called in parallel, and can have different execution times. I can describe problematic situations more if you'd like. (I ran through a couple in my head that I thought would eventually cause big problems before making the clist switch)

Contributor

But is there some new requirement that means we only now need a clist, or do you think we always needed it?

CheckTx calls actually don't run in parallel; they run sequentially, but responses come back asynchronously (though in order).

Contributor Author

Oh, you're right, we don't need it here. The reason we don't need it (and something I failed to consider earlier) is that we have a mutex on all deletions and additions. I forgot that we still needed that for the hashmap. I'll switch it back, thanks for pointing this out :D

@@ -524,20 +523,26 @@ func (cache *mapTxCache) Push(tx types.Tx) bool {
    if cache.list.Len() >= cache.size {
        popped := cache.list.Front()
        poppedTx := popped.Value.(types.Tx)
        // NOTE: the tx may have already been removed from the map
        // but deleting a non-existent element is fine
Contributor

So we knew we weren't removing from the list in .Remove, but we thought it was ok?

Contributor Author

I'm guessing no one considered the lingering memory, or how it affects cache eviction of good txs (full linked list, empty hashmap).

@ValarDragon (Contributor Author)

Now that we're back to normal lists, here are the new benchmarks:

BenchmarkCacheInsertTime-8   	 5000000	       398 ns/op	      88 B/op	       3 allocs/op
BenchmarkCacheRemoveTime-8   	10000000	       179 ns/op	       0 B/op	       0 allocs/op

basically just adds 20ns to removal time.

@ValarDragon ValarDragon force-pushed the dev/mempool_linked_list branch 2 times, most recently from a844c4a to 3ed0d72 Compare August 11, 2018 13:47
@melekes melekes merged commit 8a1a792 into develop Aug 13, 2018
current iteration automation moved this from Reviewed to Done Aug 13, 2018
@melekes melekes deleted the dev/mempool_linked_list branch August 13, 2018 10:42