Namerd not updated automatically #1669
Hey @jsenon, thanks for reporting this -- we'll take a look and get back to you. Would you mind also adding Namerd's metrics payload if you have it?
I don't have the historical payload. Would current Namerd metrics or a Heapster screenshot help you?
@jsenon That's no problem. We can look into reproducing with the info you added in the description. In the meantime, how hard would it be for you to switch to namerd's io.l5d.mesh interface? That interface is newer and it might not have the same correctness problem, so I think it would be worth a shot.
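For anyone following along, switching to the io.l5d.mesh interface means enabling it in Namerd's config and pointing linkerd's interpreter at it. A rough sketch, assuming default ports and a placeholder Namerd hostname (check the linkerd/namerd docs for your version, since the exact fields may differ):

```yaml
# namerd.yaml -- enable the mesh interface (illustrative defaults)
interfaces:
- kind: io.l5d.mesh
  ip: 0.0.0.0
  port: 4321
```

```yaml
# linkerd.yaml -- point the interpreter at namerd's mesh port
routers:
- protocol: http
  interpreter:
    kind: io.l5d.mesh
    dst: /$/inet/namerd.example.com/4321  # placeholder host
    root: /default                        # dtab namespace to watch
```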
@klingerf Unfortunately the environment is frozen for a demo. I will check on another cluster.
I was able to reproduce this behavior in one of our Kubernetes test environments, running Kubernetes 1.6.10 and Linkerd/Namerd 1.3.0. It appears that Namerd fails to successfully re-establish watches after it encounters a 410 "too old resource version" error from the watch API. In the namerd logs, I see:
After that message is printed, I see no further updates. We'll need to investigate why the watch is not successfully re-established after encountering the "too old resource version" error. This relates to #1649.
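For context, the usual client-side remedy for a 410 from the Kubernetes watch API is to re-list to obtain a fresh resourceVersion and restart the watch from there, rather than retrying with the stale version. A minimal, self-contained Python sketch of that pattern (the names, callbacks, and event shapes here are illustrative, not Namerd's actual code):

```python
class GoneError(Exception):
    """Simulates the Kubernetes 410 'too old resource version' response."""


def watch_endpoints(fetch_list, open_watch):
    """Yield watch events forever, recovering from 410s.

    `fetch_list()` returns (items, resource_version) from a fresh LIST.
    `open_watch(rv)` yields event dicts and may raise GoneError when
    `rv` is too old to watch from.
    """
    items, rv = fetch_list()  # initial LIST establishes the baseline
    while True:
        try:
            for event in open_watch(rv):
                rv = event["resourceVersion"]  # track progress as we go
                yield event
        except GoneError:
            # The stored resourceVersion is too old; re-list to get a
            # fresh one instead of retrying the watch with stale state.
            items, rv = fetch_list()
```

The key point, and apparently what went wrong here, is the `except` branch: the loop must go back to a fresh LIST, not just reopen the watch with the old resourceVersion.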
@klingerf thanks for the investigation and for reproducing the error on your side.
Just piggybacking on this issue. We're seeing the same behavior after deploying linkerd/namerd 1.3.0 to our staging env. Thanks for looking into it @klingerf.
@klingerf any update on when this might get fixed? |
Hey @activeshadow, we are actively working on it. We have a repro but not a fix yet; hopefully we'll have one soon. Will update here when we do, at which point we can cut a bugfix release.
Update: we have identified a possible root cause and are working to verify. This is a high priority issue. |
Great, thank you for the update!
…On Wed, Oct 18, 2017 at 02:43:03PM -0700, William Morgan wrote:
Update: we have identified a possible root cause and are working to verify.
This is a high priority issue.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
--
//SIGNED//
Bryan T. Richardson
Active Shadow LLC
505.382.2077
Hey @jsenon, @activeshadow, @Taik -- we just merged a fix for this issue and are working on cutting a release candidate now. Can update in an hour or two once it's available, and then we'd love to have you verify it's fixed in your environment. |
Ok, please give
@klingerf sure, I'll test it tomorrow and give you quick feedback, thanks.
@klingerf so far the fix looks to be working! I can restart pods, causing them to get a new endpoint IP, and linkerd picks it up and continues to send them data. Thanks! |
@klingerf Looks good for me so far. I'll let it bake in our staging for a bit to see how things go. Thanks for the quick turnaround! |
Ok, thanks for all of the feedback folks! We will get this officially released as part of 1.3.1 next week. |
Hello,
We are using namerd for dynamic dtabs, but when we want to update a route, the change is not taken into account without restarting the Namerd pod.
Kubernetes version:
Deployed over AWS.
Linkerd Configuration:
Namerd Configuration:
Namerd Service:
Linkerd Metrics:
metrics.txt