Failed to pull image when tracker pod is cycled #195

Closed
ThaoTrann opened this issue Aug 1, 2019 · 0 comments
ThaoTrann commented Aug 1, 2019

Issue: The agent fails to pull images after a tracker pod is cycled.
Steps to reproduce:
I have Kraken set up on a cluster that uses Consul for service discovery.
When a tracker pod is killed and brought back up (i.e. the tracker IP address has changed), the agent still tries to connect to the dead pod's IP, causing the following error:

"transferer download: scheduler: create torrent: download metainfo: network error: Get <>/metainfo: dial tcp <deadpod>:80: connect: no route to host"

Thoughts: I looked at the code, and it looks like the agent holds a PassiveRing of trackers, and the function that resolves fresh hosts,

func (r *dnsResolver) resolve() (stringset.Set, error)

is never called again after the initialization step, so the issue persists for as long as the agent pod lives. I added c.ring.Refresh() to https://github.com/uber/kraken/blob/master/tracker/metainfoclient/client.go#L54, which refreshes the tracker hash ring and fixes the issue. Should we add a Monitor to refresh it periodically?

codygibb pushed a commit that referenced this issue Aug 2, 2019
Fix for #195
Added tracker.Monitor in agent to periodically re-resolve the hostname, similar to how origin refreshes its hash ring: https://github.com/uber/kraken/blob/master/origin/cmd/cmd.go#L199