-
Notifications
You must be signed in to change notification settings - Fork 110
Conversation
- storer needs to take netstore not localstore to put the chunk so that fetchers created earlier could respond
storage/pushsync: update NetStore api from master storage/pushsync: add sleeps and a bit more tracing, change subscription wait delay network: try sending receipt only if there is no closer peer storage/pushsync: use netstore, rather than localstore storage/netstore: logs with node id storage/pushsync: opentracing storage/pushsync: propagate origin on receipts is push synced - smoke test network: kademlia closer peer storage: very high retry timeout measure send chunk rlp timer pss timers and goroutine sendChunk traces for chunsk in localstore and subscribepush try minbinsize: 3 increase search timeout ; move tracing before nil check link netstore and delivery rename tag New to tag Create swarm-smoke: fix check max prox hosts for pull/push sync modes tag roundtrip wip more logs kad as part of pusher? kad to storer; revert pss change metrics to pss outbox len metric emit metrics once every 10sec. pss: refactor storage/pushsync: closer than me trace chunk: adding XOR comparison disable pushsync tests, as they are not setting up kad properly
- WaitTillDone does check initially - introduce IncN to increment with n - api: inspector to use tag.Done - unexport context, span, and Context() to be used by http server
- minor improvements in logging - calls on Kademlia directly on Pss struct - add IsClosestTo function to pubsub using kademlia.IsClosestTo
- move package from storage to toplevel - only closest peer to address returns a receipt - IsClosestTo(addr) is now part of the PubSub interface - rename TestPushSyncAndStoreWithLoopback to TestProtocol - for TestPusher and , IsClosest is mocked properly - remove Origin field from receipt message - pushed item remembers first and last sent time - retryInterval is dynamically set as 2 * average roundtrip (excluding outliers) - early send check is removed from unit test - receipts channel is just []byte for address - we are using mutli-set for setting synced status on chunks - tag increment now follows multiset and not incorrectly when the receipt arrives for this remember syncedItems is needed alongside syncedAddrs - in pusher sync loop, we use static context - correctly unsubscribe DB.SubscribePush - if pusher is closest node to a chunk, chunk is not sent using pubsub, but receipt is directly sent to self using a shortcut - therefore no self send is needed in storer - loopBack pubsub is shared in the protocol unit test but wrapped in different testPubSub structs to control the behaviour of IsClosestTo function - added delayResponse to TestProtocol too - testPushSyncIndex SubscribePush now increments tag StateStored correctly only first time - fix checkTags to wait for synced status (it caused occasional flakiness) - extract simulation test parameters (nodes/chunks/testcases) as command line flags, default is 16/16/16
- introduce custom logger on pusher/storer - dynamic setting of retryinterval extracted - timeout changed back - fix and use testutil.checkTags - use tag context - tags.Create and NewTag has context argument - fix pusher.Close before localstore
- changed order of select cases - have Tests on top followed by helpers - add sent,synced check to testutil.CheckTag - add timeout to close wait - adapt to testutil.Init() - length protection to label helper - resolve rebase conflicts
- simplify simulation using error group - make pss message handling asynchronous - simulation NewBzzInProc hive disable autoconnect (to check) - pss outbox go routine leak
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some important pss changes are lost
pss/pss.go
Outdated
@@ -290,6 +292,10 @@ func New(k *network.Kademlia, params *Params) (*Pss, error) { | |||
cp.Set(capabilitiesForward) | |||
} | |||
k.Capabilities.Add(cp) | |||
for i := 0; i < hasherCount; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will cause flaky tests. this is a bug. see here https://github.com/ethersphere/swarm/pull/1392/files#diff-6109308aa1312b43e9a8d048e2f2d1ffR271
and maybe double check other changes on my branch
This reverts commit cb82a14.
76e656c
to
0d11f36
Compare
@janos and me went line by line over the PR, and considering the tests we ran over the last few days, and the time pressure we are under, we think it is OK to merge this now, and iterate on improvements on Push Sync is working under the same performance requirements we have for Pull Sync, but obviously there is room for improvement both in terms of code simplicity and testing, as well as in performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks solid to me :)
I meticulously went through the changes.
- the tag / request context is corrected now, no regression in tests (9->8)
- in pushsync handleReceipt, pushReceipt does not need to be called in go routine if pss is async, right?
thanks for the great work guys
- retry logic is reverted to the original, that is ok.
return err | ||
} | ||
p.logger.Trace("handleReceiptMsg", "receipt", hex.EncodeToString(receipt.Addr)) | ||
go p.pushReceipt(receipt.Addr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with async pss this is not needed i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had worse results without it, even if recent pss changes suggest that it is not needed. We did not get much deeper why is that the case.
This PR represents the consolidated efforts on Push Syncing implementation based on spec https://hackmd.io/9eWxJ_MJS8i04onWg49UBA?both, mostly by @zelig @nonsense and @nolash (for pss).
The implementation package is
pushsync
and integration was done by updating other packages and making required changes in the codebase, most importantly to tags andpss
package. Pss message handling was made async and tests are updated.A backward incompatible change was made by removing --nosync cli flag and adding --sync-mode which does not enable push syncing by default. Using
all
orpush
as the value for the flag will enable push syncing.This PR i branched from #1768 and updated with master code.
Original PR #1392.