-
Notifications
You must be signed in to change notification settings - Fork 982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize how peer stats are processed #20138
Conversation
Jenkins BuildsClick to see older builds (8)
|
(js->clj event-js :keywordize-keys true)]) | ||
"waku.backedup.settings" (rf/dispatch [:profile/update-setting-from-backup | ||
(js->clj event-js :keywordize-keys true)]) | ||
"wallet" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to have some comment here about order being of some importance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't hurt to add a comment 👍🏼 Let me add it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @J-Son89, an important oversight on my part. case
has constant-time dispatch, so the order of clauses doesn't matter. So the order of clauses is essentially based on readability.
^js event-js (.-event data) | ||
type (.-type data)] | ||
(log/debug "Signal received" event-str) | ||
^js event-js (oops/oget data :event) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we use oops here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this safe to use here? it throws an error if it's nil :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@J-Son89, that's a good question.
oops
should pretty much always be used due to advanced compilation. The checks it performs, according to its documentation only happen during development. In advanced compilation, all checks should disappear.
https://github.com/binaryage/cljs-oops?tab=readme-ov-file#play-it-safe-during-development
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't remember from my experience if this holds true. I'm sure we'll find out soon if not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't remember from my experience if this holds true. I'm sure we'll find out soon if not
I'm not sure either. I'll do a check today to settle this question and will report here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@J-Son89, I just confirmed in a prod build that oops/oget
does not throw any exception on missing keys, as is documented in the library.
To give a real example, this is a warning produced now when we do advanced compilation (release build):
------ WARNING #165 - :infer-warning -------------------------------------------
File: /build/status-mobile-source-jsbundle/src/status_im/contexts/wallet/wallet_connect/effects.cljs:46:10
--------------------------------------------------------------------------------
43 | (let [{:keys [params id]} proposal
44 | approved-namespaces (wallet-connect/build-approved-namespaces params
45 | supported-namespaces)]
46 | (-> (.approveSession web3-wallet
----------------^---------------------------------------------------------------
Cannot infer target type in expression (. web3-wallet approveSession (clj->js {:id id, :namespaces approved-namespaces}))
This error disappears with the correct type hint or when we use oops/oget
.
But that's not the main problem. Advanced compilation renames properties, unless the code access object properties as strings, but we don't do that when we write (.x someObject)
. oops
solves this important problem too.
For the sake of visibility to more folks to help spread the word, cc'ing @clauxx and @smohamedjavid
e099d3e
to
fa06813
Compare
thank you @ilmotta , i think it would be even better to do it on the go side, so it won't clog the bridge as well |
Good suggestion, this could be explored in the future. The frequency at which peer stats is initially emitted is controlled by the waku client itself. We have the chance to debounce in status-go after that, but it raises the question of the time period and if the time we choose is a good one for the desktop client as well. In fact, for the mobile client, I think we could debounce for much longer because I've seen this signal arrive multiple times after login. I'd consider even 5s or 10s. But I think the desktop client can handle a lot more without choking, so keeping this decision about how long we want to debounce on our side seems better, at this least from this perspective. This PR brings a tangible improvement. Do you think this solution is good enough to be merged for now? |
fa06813
to
282855e
Compare
Hey reviewers, we need at least one more approval to merge this PR. The changes should be safe & simple, but please let me know if you think anything important should change. Thanks! |
@ilmotta it's fine to merge this, but just to be aware, we had a chat with @vitvly and we noticed that peer stats are not used at all on mobile. |
@cammellos, on the topic of the signal being used or not, the (root) subscription How would you check peer count in the mobile app if we stop consuming the signal? |
@ilmotta oh I must have missed that, I think best thing is to just query if the user lands on this page in that case |
62% of end-end tests have passed
Failed tests (17)Click to expandClass TestWalletOneDevice:
Class TestCommunityMultipleDeviceMerged:
Class TestWalletMultipleDevice:
Class TestGroupChatMultipleDeviceMergedNewUI:
Class TestActivityMultipleDevicePR:
Class TestOneToOneChatMultipleSharedDevicesNewUi:
Expected to fail tests (3)Click to expandClass TestGroupChatMultipleDeviceMergedNewUI:
Class TestCommunityOneDeviceMerged:
Passed tests (32)Click to expandClass TestWalletOneDevice:
Class TestCommunityMultipleDeviceMergedTwo:
Class TestOneToOneChatMultipleSharedDevicesNewUiTwo:
Class TestActivityCenterContactRequestMultipleDevicePR:
Class TestGroupChatMultipleDeviceMergedNewUI:
Class TestDeepLinksOneDevice:
Class TestActivityMultipleDevicePR:
Class TestOneToOneChatMultipleSharedDevicesNewUi:
Class TestActivityMultipleDevicePRTwo:
Class TestCommunityOneDeviceMerged:
|
@status-im/mobile-qa, can you help me understand if the e2e failures are expected? As far as my testing goes and from the videos themselves, I can tell signals are being processed just like in |
@cammellos, @vitvly I found another place deserving our attention. The profile visibility status event PR will come next week, it's almost done https://github.com/status-im/status-mobile/compare/ilmotta/optimize-how-peer-stats-are-processed...ilmotta/do-not-process-signal-wakuv2-peerstats?expand=1 |
I’ll modify status-im/status-go#5224 so that there is a flag to disable peerstats signal altogether. Then mobile will query for peer stats when needed. |
I'm relaunching e2e, as they have been failed mostly due to reliability issues, and checking login in a user with sufficient amount of data (just in case) Thank you for the detailed explanation @ilmotta! |
Logout for the same user works about the same (like you mentioned in the PR): about 17s for several communities / several 1-1 chats, waiting for e2e tests |
88% of end-end tests have passed
Failed tests (3)Click to expandClass TestWalletMultipleDevice:
Class TestWalletOneDevice:
Expected to fail tests (3)Click to expandClass TestGroupChatMultipleDeviceMergedNewUI:
Class TestCommunityOneDeviceMerged:
Passed tests (46)Click to expandClass TestActivityMultipleDevicePR:
Class TestCommunityMultipleDeviceMerged:
Class TestOneToOneChatMultipleSharedDevicesNewUiTwo:
Class TestCommunityMultipleDeviceMergedTwo:
Class TestOneToOneChatMultipleSharedDevicesNewUi:
Class TestActivityMultipleDevicePRTwo:
Class TestActivityCenterContactRequestMultipleDevicePR:
Class TestDeepLinksOneDevice:
Class TestWalletOneDevice:
Class TestGroupChatMultipleDeviceMergedNewUI:
Class TestCommunityOneDeviceMerged:
|
- Debounce peer stats signals by 1s because they may happen 30 times during login and the payload conversion using js->clj is not cheap. - Change the order of case macro checks: put more frequent signals at the top. - Better format code
282855e
to
3d89063
Compare
Motivation from #20059 (Slow app login)
Summary
During login, the signal
wakuv2.peerstats
happens anywhere between 10-20 times in a very short amount of time. To make things worse, the JS object payload is large(ish) and we are callingclj->js
each time.This simple PR solves this inefficiency by debouncing the event
:wakuv2-peer-stats
for the duration of 1s. As far as I measured, the event is now only being processed once throughout the entire login flow and the 1s delay to update the app-db should be negligible.Also reordered high-frequency signals to be at the top of theThis is wrong.case
macro.case
has constant-time dispatch and the order of clauses don't matter.oops/oget
instead of interop because that's our current solution to reliably survive ClojureScript advanced compilation.case
macro to minimize git diffs and to be more readable, i.e. expressions are not right-aligned according to longest signal string (less diff in future pRs) and there are less weird line breaks because of this choice.(js->clj event-js :keywordize-keys true)
to(transforms/js->clj event-js)
Note
This PR won't solve the slow login by issue #20059 (that's mainly a status-go issue I'm investigating separately) but it will help free up re-frame's queue while logging in.
And btw, the login flow in
status-im.contexts.profile.login.events
is constricting re-frame's queue way too much. I see low-hanging fruits for us there, for instance, delaying dispatching groups of events in different time slots.Areas that may be impacted
Refactors brought by this PR should be safe and don't require manual QA.
Steps to test
E2E tests should be sufficient, plus my own manual checks.
status: ready