Unable to establish cluster in Kubernetes using Kubernetes.DNS strategy. #75

Closed
dkushner opened this issue Jul 11, 2018 · 3 comments

Comments

@dkushner

I'm currently deploying my Distillery application to a Kubernetes cluster (v1.9.7). I have set up the headless service, configured my release with the appropriate vm.args, and configured my application to set up the topology and start the cluster supervisor. I have confirmed that each pod created by my deployment can ping every other pod in the deployment, and that the DNS record for the headless service resolves correctly (libcluster appears to resolve the node IPs correctly as well). However, I am still getting the following error:

09:48:33.964 [warn] [libcluster:kubernetes] unable to connect to :"flywheel@172.17.0.11"
09:48:33.965 [warn] [libcluster:kubernetes] unable to connect to :"flywheel@172.17.0.14"
09:48:33.966 [error] GenServer #PID<0.1888.0> terminating
** (FunctionClauseError) no function clause matching in Cluster.Strategy.Kubernetes.DNS.load/1
    (libcluster) lib/strategy/kubernetes_dns.ex:55: Cluster.Strategy.Kubernetes.DNS.load({:noreply, %Cluster.Strategy.State{config: [service: "contrasting-lambkin-flywheel-remoting.flywheel.svc.cluster.local", application_name: "flywheel", polling_interval: 20000], connect: {:net_kernel, :connect_node, []}, disconnect: {:erlang, :disconnect_node, []}, list_nodes: {:erlang, :nodes, [:connected]}, meta: #MapSet<[:"flywheel@172.17.0.15"]>, topology: :kubernetes}})
    (libcluster) lib/strategy/kubernetes_dns.ex:48: Cluster.Strategy.Kubernetes.DNS.handle_info/2
    (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: :timeout
09:48:33.971 [info] Application flywheel exited: shutdown
{"Kernel pid terminated",application_controller,"{application_terminated,flywheel,shutdown}"}
Kernel pid terminated (application_controller) ({application_terminated,flywheel,shutdown})

It would appear that, for some reason, the VM is unable to connect to any of its peers. As I mentioned, I've verified that network traffic is unimpeded, so I figure it must be a configuration mistake on my end, but I cannot for the life of me determine what it is.
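
For readers following along, the state printed in the stack trace implies a topology roughly like the one below. This is a reconstruction, not the poster's actual code: the strategy, the topology name, and the three config values are taken from the log above, while the module name `FlywheelTopology` is hypothetical and the note about how the list reaches the cluster supervisor is an assumption about the setup.

```elixir
# Hypothetical module; only the keyword list inside it is derived from the log.
defmodule FlywheelTopology do
  # Topology named :kubernetes, using the Kubernetes.DNS strategy from the trace.
  def topologies do
    [
      kubernetes: [
        strategy: Cluster.Strategy.Kubernetes.DNS,
        config: [
          # Headless service DNS name, as printed in the logged state
          service: "contrasting-lambkin-flywheel-remoting.flywheel.svc.cluster.local",
          # Node basename, so peers are addressed as flywheel@<pod ip>
          application_name: "flywheel",
          # Milliseconds between DNS polls
          polling_interval: 20_000
        ]
      ]
    ]
  end
end

# Assumption: the list is then handed to the cluster supervisor in the
# application's supervision tree.
```

Nothing in these values points to a misconfiguration, which is consistent with the warnings being followed by a crash inside the strategy itself.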

@flowerett
Contributor

flowerett commented Jul 11, 2018

Hey @dkushner,
I don't think there's anything wrong with your env or nodes; there's just a typo in the lib.
That's what ** (FunctionClauseError) no function clause matching in Cluster.Strategy.Kubernetes.DNS.load/1 is telling you.
The code should return just %State{} there, not {:noreply, %State{}}, which is what ends up being passed to load/1.
Could you check whether it works with my PR, #76?
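
The failure mode described here can be reproduced in isolation. The sketch below is not libcluster's source: `ClusterSketch`, its nested `State`, `poll_buggy/1`, and `poll_fixed/1` are hypothetical names, and the only thing carried over from the trace is that `load/1` has a clause for a bare struct and none for a `{:noreply, state}` tuple.

```elixir
defmodule ClusterSketch do
  # Hypothetical stand-in for the strategy state; only the field used here.
  defmodule State do
    defstruct meta: MapSet.new()
  end

  # Only a bare %State{} matches; any other argument raises FunctionClauseError.
  def load(%State{meta: meta} = state) do
    %{state | meta: MapSet.put(meta, :"flywheel@172.17.0.15")}
  end

  # Buggy shape: wraps the state in a GenServer reply tuple too early.
  def poll_buggy(%State{} = state), do: {:noreply, state}

  # Fixed shape: returns the bare struct and leaves wrapping to the callback.
  def poll_fixed(%State{} = state), do: state

  # The reply tuple is built once, at the edge of the GenServer callback.
  def handle_info(:timeout, state) do
    {:noreply, state |> poll_fixed() |> load()}
  end
end
```

Calling `ClusterSketch.load(ClusterSketch.poll_buggy(state))` raises the same kind of `FunctionClauseError` as in the report, because the tuple matches no clause of `load/1`. Since the strategy process crashes on it and the log ends with the application exiting, the problem looks like a cluster-wide connectivity failure rather than a return-value mismatch.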

@dkushner
Author

@flowerett: Aha! Yeah, the error message was pretty straightforward, but I was hesitant to try to correct it directly because I'm not familiar with the design decisions behind the library.

I've just checked your fix and it works flawlessly. Thank you so much!

@bitwalker
Owner

I'll push a new release with the fix today. Sorry for the delay, lots on my plate unfortunately :(
