Mesos-master sees framework registration requests as coming from localhost #193
Comments
Looks like the Chronos scheduler driver is binding to a local address (127.0.0.1), so the registration cannot succeed. You could force the driver to use a public address by setting the LIBPROCESS_IP environment variable.
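For anyone scripting the launch, here is a minimal Python sketch of the LIBPROCESS_IP suggestion above. The helper name `env_with_libprocess_ip`, the example address 10.0.0.5, and the launcher path in the comment are assumptions made for illustration; only the LIBPROCESS_IP variable itself comes from this thread.

```python
import ipaddress
import os


def env_with_libprocess_ip(ip, base_env=None):
    """Return a copy of the environment with LIBPROCESS_IP set to `ip`.

    Hypothetical helper: rejects loopback and unspecified addresses,
    since those are what cause the registration reply to go astray.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.is_loopback or addr.is_unspecified:
        raise ValueError(f"{ip} is not reachable from other hosts")
    env = dict(os.environ if base_env is None else base_env)
    env["LIBPROCESS_IP"] = str(addr)
    return env


# Possible usage (the launcher path is a placeholder, not a real Chronos path):
#   import subprocess
#   subprocess.Popen(["/path/to/start-scheduler"],
#                    env=env_with_libprocess_ip("10.0.0.5"))
```

Passing the environment explicitly to the child process avoids depending on whatever the parent shell or init wrapper happens to export.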
Is it possible that the master is reporting itself as being 127.0.0.1 to ZK instead of a reachable address?
It is possible, but that doesn't look like the case from the master log above. The master was able to receive the registration request from the remote scheduler, but its reply never made it to the scheduler because it was sent to scheduler(1)@127.0.0.1:36198.
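To spot this symptom in a master log, one option is to pull the address out of identifiers like the one quoted above and check whether it is loopback. This is a rough sketch that assumes the `id@ip:port` shape seen in this thread with an IPv4 address; the function names are made up for the example.

```python
import ipaddress


def split_upid(upid):
    """Split 'id@ip:port' into (id, ip, port). Assumes an IPv4 address."""
    ident, _, address = upid.rpartition("@")
    host, _, port = address.rpartition(":")
    return ident, host, int(port)


def is_loopback_upid(upid):
    """True if the address part of the identifier is a loopback address."""
    _, host, _ = split_upid(upid)
    return ipaddress.ip_address(host).is_loopback
```

A scheduler that shows up with a loopback address here is advertising an address the master cannot reply to from another host.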
Yeah, that sounds right. I would also suggest explicitly passing the IP to the Mesos processes, which is what we do.
@vinodkone I'll give LIBPROCESS_IP a shot. That being said, I guess I should regard this as a temporary patch? @brndnmtthws both masters and slaves are already running with the IP configured.
@vinodkone LIBPROCESS_IP=myip does seem to fix the issue. That being said, it does look like something might be wrong in Mesos itself, at least in the RC currently available (using 0.16.0 seems to do the trick for me when using LIBPROCESS_IP). I don't think I mentioned this before, but the Vagrant machines I am working on are set up with two network interfaces. Could that be a reason for this issue?
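On a machine with two interfaces, it can help to check which local address the kernel would actually use to reach the master, and use that as the LIBPROCESS_IP value. The sketch below relies on the fact that connecting a UDP socket sends no packets but does select a source address. The function name is hypothetical, and 5050 is assumed as the master port; substitute whatever your master listens on.

```python
import socket


def local_ip_towards(host, port=5050):
    """Return the local IPv4 address the OS would use to reach host:port.

    Connecting a UDP socket transmits nothing; it only makes the kernel
    choose a route, and with it a source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((host, port))
        return sock.getsockname()[0]
```

If this returns the address of the wrong interface (for example the NAT interface on a Vagrant box), that points to routing rather than to the scheduler itself.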
I get the same, and it also affects Marathon. I've got the Mesos master and slaves configured to use the right IP, but there's no obvious way to do that with Marathon or Chronos (will try the LIBPROCESS_IP trick). Also using Vagrant!
I'm having the same issue with the Jenkins Mesos plugin. It sees 127.0.0.1. I set LIBPROCESS_IP=myip and then started Jenkins with "service jenkins start". Any help appreciated.
@suchisubhra setting LIBPROCESS_IP in the environment should make the plugin pick up the IP address. If it is not picking it up, I suspect the environment is getting cleared somewhere along the way of the plugin startup.
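One way to test the "environment is getting cleared" theory on Linux is to look at the environment the running process was actually started with, via `/proc/<pid>/environ`. This is a sketch under that assumption; the function names are invented for the example, and reading another user's process usually requires root.

```python
import os


def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE bytes found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[os.fsdecode(key)] = os.fsdecode(value)
    return env


def read_process_env(pid):
    """Return the start-up environment of a running process (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())
```

If `read_process_env(pid).get("LIBPROCESS_IP")` comes back as `None` for the Jenkins process, the variable was dropped before the plugin ever started, which would match the suspicion above.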
This is pretty old, but it popped up when I was searching for a very similar issue, and I figured it might help anyone having similar problems. My environment is Ubuntu Server 14.04 (Trusty, I think it's called) running 4 small VM clusters: 1 ZK, 1 Master, and 3 Slaves. As @vinodkone explained, setting LIBPROCESS_IP helped me solve the first problem of 127.0.0.1 appearing in the Master logs, but it didn't help with the re-registering messages in the Master logs. What I found was that Ubuntu has a pre-installed firewall pretty much blocking all access to all ports. The Master tried to connect to an open port on my slave process (in my case it was port 47250). The slave was listening on that port, but because of the firewall the Master couldn't get through. Once I poked a hole for port 47250, the whole thing magically worked and the process completed successfully. I'm rather new to this whole Mesos thing, and that really got me stuck. Once you start distributing across more than one machine, these little gotchas seem to pop up all over the place.
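A quick way to tell this firewall case apart from the 127.0.0.1 case is to try a plain TCP connection from the master's host to the port the slave is listening on. The following is a minimal sketch; `port_reachable` is a name made up for the example, and the host and port in the comment are placeholders.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused connection and a silently dropped one (timeout) both
    return False, so this does not say which of the two happened.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Possible usage from the master's host (address is a placeholder):
#   port_reachable("slave1.example", 47250)
```

If this returns False while the slave is known to be listening, something between the two hosts is blocking the port.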
Closing this, as different people reported that setting LIBPROCESS_IP fixes the issue.
I have a 3-node Mesos cluster, on which I run masters, slaves, and Marathon. For the moment I run Chronos only on node1.mesos1.dev. If the active Mesos master is not on the same node as Chronos, I get the following error:
This is how I run Chronos:
Marathon does not seem to suffer from this issue, so my guess is that the problem is in how Chronos advertises itself.