Do not observe changes to resolved names while building delegation tree #1763
Thanks for submitting the PR @edio we are taking a look at the changes you provided. |
Thanks for this, @edio. This actually relates pretty closely to #1731. The pickFirst method you implement here is essentially just This points to a larger change. If we truly always only care about the initial value of a delegation (and I think this is probably the case) then we should change the signature of This also allows us to change the namerd delegation thrift method to be a simple unary request instead of a long-poll. In turn, this allows us to get rid of the delegation observation cache that is causing us trouble in #1731. Of course, this is a much larger sweeping change and also a breaking API change to the namerd API so we should tread carefully. But I think going in this direction is much preferable to the isolated change here that makes the delegate implementation incongruous with its type signature. Let me know what you think. |
Delegator.json tries to walk through all branches of the delegation tree. Any resolved name is then observed for changes. On large, complicated dtabs this slows down building the delegation tree and in some cases may even cause a StackOverflowError while unfolding the resulting structure of Activities and Vars. This change makes Delegator.json use only the initial delegation result.
@adleong thanks for explaining the big picture and for pointing out the conceptual difference between I do see that And this doesn't look like something that can be included in the nearest bugfix release. And the issue is real for us (although I do understand that without STR this is not very convincing). Do you think we could implement some part of this big change asap? Namely, we'll try to return Please see the update to the code. If you're not willing to include this, would a scenario reproducing the issue we have convince you? (I would really like to avoid that, because it appears to be much more effortful than the fix itself, but if there are no other options... :)) |
Thanks, @edio. I will discuss this with the team in the new year once folks are back at work. |
Hi @edio, we may have an alternate solution for you. Take a look at this PR: #1768 which adds a Please take a look and let me know if that satisfies your requirements. |
I guess we could block |
@edio Can you share some more context around the use of delegator.json? What exactly are clients trying to do? Is this just to avoid the two calls to bind & addr? |
@olix0r, oh sorry for not making it clear. It's only delegator UI, so we can check (mostly during troubleshooting), how a name is bound and resolved. |
Hi @edio, sorry for the delay in getting back to you about this. After some internal discussion I think the best path forward is: Change the signature of the This is certainly a larger change than what you have proposed here, but I think it is more correct and will be easier to maintain in the long run. If you'd like to take a crack at implementing my suggestion, I'd be happy to help out however I can. Otherwise, we'll likely have time to do this sometime next week. Let me know what you think! |
@adleong I'd be willing to take this opportunity to dig deeper into the code, but I can't start any earlier than next week either, and probably only the week after that. So feel free to work on it if you have time. I'll post here when I'm ready to start, and if it turns out that you haven't started yet, I will work on the PR. |
Hey @edio, I'm going to close this PR but I'm looking forward to working together with you on the solution we discussed above. |
@adleong @olix0r so far this issue is the most frequent one bringing Namerd down in our environment. As far as I understand, you proposed a more fundamental solution that requires some work to implement. Unfortunately, at the moment @edio doesn't have the availability to contribute that, so I wonder if you could reconsider accepting this PR as is, as a simple remedy for the issue, or maybe we could ask you to implement a proper fix in the next version? Thanks
Thanks for reminding me about this! If you folks don't have the bandwidth to do this, I'll make sure we allocate time for it as part of 1.3.7. Does that work for you? |
Would be amazing, thanks a lot! |
…inkerd#1763) Appending proxy-init to the end of the list ensures that it won't interfere with other init containers from accessing the network, before the proxy container is created. This resolves bug linkerd#1760 Signed-off-by: ihcsim <ihcsim@gmail.com>
At NCBI we have very complicated dtabs. For a single name resolution, consul appears 16 times in the leaves of the delegation tree; only 1 of 15 lookups in consul usually gives us the expected result, and the rest are expected to fail.
With such a setup, calls to delegator.json sometimes caused namerd to start making hundreds to thousands of requests per second to consul, and in some (unfortunately, not fully identified) cases this state of namerd could last for at least several days (until restarted).
This fixes the issue for us on our setup.
--- commit message
Delegator.json tries to walk through all branches of the delegation tree.
Any resolved name is then observed for changes.
On large complicated dtabs this slows down delegation tree building
and in some cases may even cause StackOverflowError while unfolding
resulting structure of Activities and Vars.
This change makes Delegator.json use only the first lookup result.
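
To illustrate the difference the commit message describes, here is a minimal, self-contained sketch. It is not the code from this PR and does not use the linkerd or Finagle API: `Cell` is a hypothetical stand-in for an observable value (in the spirit of a `Var`), and `Tree`, `Leaf`, `Alt`, `renderObserving` and `renderSnapshot` are names invented for this example. The assumption is only that "observing" means staying subscribed to every resolved leaf, while "using the first result" means reading each leaf once.

```scala
// Hypothetical stand-in for an observable value; NOT the real Finagle/linkerd API.
final class Cell[T](initial: T) {
  private var current: T = initial
  private var observers: List[T => Unit] = Nil

  // Read the current value once, without subscribing.
  def sample(): T = current

  // Subscribe: the callback fires immediately and again on every later update.
  def observe(f: T => Unit): Unit = {
    observers = f :: observers
    f(current)
  }

  def update(next: T): Unit = {
    current = next
    observers.foreach(f => f(next))
  }

  def observerCount: Int = observers.size
}

// Hypothetical delegation tree: a leaf holds a resolved address,
// an Alt node holds alternative branches.
sealed trait Tree
final case class Leaf(path: String, addr: Cell[String]) extends Tree
final case class Alt(branches: List[Tree]) extends Tree

object DelegationSketch {
  // Observing variant: every leaf stays subscribed, so each later change
  // to any leaf triggers the callback again.
  def renderObserving(tree: Tree, onChange: String => Unit): Unit = tree match {
    case Leaf(path, addr) => addr.observe(a => onChange(s"$path -> $a"))
    case Alt(branches)    => branches.foreach(b => renderObserving(b, onChange))
  }

  // Snapshot variant: read each leaf exactly once and return a plain value.
  def renderSnapshot(tree: Tree): List[String] = tree match {
    case Leaf(path, addr) => List(s"$path -> ${addr.sample()}")
    case Alt(branches)    => branches.flatMap(b => renderSnapshot(b))
  }
}
```

In this sketch the snapshot variant leaves no subscriptions behind, so the cost of one request is bounded by the size of the tree at the time of the call. The observing variant keeps one subscription per leaf, which is the kind of lingering work the PR description attributes to large dtabs.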