-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[INDIS]-check whether new load balancer executor is shutdown before using in dual-read mode #984
Conversation
…in dual-read mode
d2/src/main/java/com/linkedin/d2/balancer/dualread/DualReadLoadBalancer.java
Outdated
Show resolved
Hide resolved
d2/src/main/java/com/linkedin/d2/balancer/dualread/DualReadLoadBalancer.java
Outdated
Show resolved
Hide resolved
Code LGTM. Please test locally with toki and attach the log here. You can run mint undeploy in one terminal, send a request to it in another terminal during the shutdown. The log will show the messages you added. |
Hi, @bohhyang Attach the log
It seems not work. During
|
What responses you get for those requests you send at shutdown? Could you set a breakpoint at getClient and see if it stops for the requests at shutdown? Please attach the log file. |
Thanks. Updated the description and attach the log file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. In the future, upload the log file for long-term reference since pasteIn will expire in a few days.
Thanks for reminder. Upload the log file instead of pastein |
toki-restli-backend.log @brycezhongqing you were hitting toki backend app, which is wrong, and looking at the wrong log file. It should be toki frontend app. The call stack is toki frontend uses d2 client to call toki backend. Toki backend doesn't make downstream calls at all so there is no d2 service discovery happening there. Please check the readme of Toki https://github.com/linkedin-multiproduct/toki, and use the "Querying toki" curli call "curli -v "https://localhost:18792/Toki/resources/**toki**?action=**tokiRestClient**". And the valid log is toki-war.log. You can deploy only toki frontend with "mint deploy -w toki-war -f qei-ltx1", which will send requests to EI toki-backend, so you won't mix the two apps. |
Thanks @bohhyang for ur correction. Yes. U are right, toki backend will only echo the message (checked from the code) |
Background
For
samza job
, the LoadBalancer will be shutdown during running.2024-02-28 22:28:49.469 [main] DynamicClient [INFO] shutting down dynamic client
2024-02-28 22:28:49.469 [main] DualReadLoadBalancer [INFO] New load balancer successfully shut down
2024-02-28 22:28:49.470 [main] ZKFSLoadBalancer [INFO] Shutting down
2024-02-28 22:28:49.470 [Indis xDS client executor-4-1] XdsClientImpl [INFO] Shutting down
More context could check from :
https://linkedin-randd.slack.com/archives/C033U0SJFEX/p1709165894672129?thread_ts=1709075763.043899&cid=C033U0SJFEX
SI-37856
As the
newLb executor
inDualReadLoadBalancer
is shut down, so DualReadLoadBalancer throws exceptions when it’s already shutdown but gets new tasks submitted to it. Then Samza code force-exits when uncaught exception is found.Solution
So for
_newLbExecutor
indual-read
mode. We need to check whether is itshutdown
before using to avoid new tasks process.Test done
Regression test in local on
toki-restli-backend
for dual-read modeHere is the full log:
toki-restli-backend.log
Follow up
As it’s hard to catch the timing for when the dual read LB is not shutdown yet, but the newLbExecutor is shut down. So we could not get the
newLb executor is shutdown already. Skipping getClient on newLb executor.
log usingmint undeploy
method.Will double check with samza team to check their deployment is no problem.