Store external ip in case of localhost addprocs #5995

Closed
amitmurthy opened this issue Mar 1, 2014 · 5 comments · Fixed by #6030
Labels
domain:parallelism Parallel or distributed computation

Comments

@amitmurthy
Contributor

Currently, "127.0.0.1" is stored in the Worker struct for addprocs(n). This causes the issue reported in JuliaParallel/ClusterManagers.jl#6.

We should preferably store the external IP address and broadcast it to workers added later.

@ihnorton
Member

ihnorton commented Mar 1, 2014

From that issue:

it might be relevant that each node in this cluster has two NICs, one facing out to the world, the other facing towards the rest of the nodes in the cluster. julia correctly finds the latter ip addr.

Note that getipaddr returns the first IP address it finds, which may not be the desired or useful one on a multi-NIC node. (Not sure it's relevant here, but it might be an issue when the head node has two NICs.)
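For illustration, here is a minimal sketch of choosing among several local addresses instead of taking the first one. It assumes a recent Julia where `getipaddrs` lives in the `Sockets` standard library (that function did not exist when this issue was opened). `pick_address` and the subnet prefixes are hypothetical, not part of any Julia API.

```julia
using Sockets

# Hypothetical helper (not part of Base or Sockets): return the first address
# whose dotted form starts with `prefix`, or `nothing` if none matches.
# String-prefix matching is a simplification; a real version would compare
# against a subnet mask.
function pick_address(addrs, prefix::AbstractString)
    for a in addrs
        startswith(string(a), prefix) && return a
    end
    return nothing
end

# Fixed example list standing in for a two-NIC node (assumed addresses).
example = [ip"127.0.0.1", ip"192.168.1.7", ip"10.0.0.5"]
cluster_addr = pick_address(example, "10.")

# On a real machine the candidates could come from the host itself:
local_addrs = getipaddrs(IPv4)   # loopback addresses are excluded by default
```

With the example list, `first(example)` would be the loopback address, while `pick_address` selects the address on the assumed cluster-facing `10.` network.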

@amitmurthy
Contributor Author

In this case however, the primary cause is the recording of the loopback address in the Worker struct.

But yes, we need to address the case of multi-homed hosts, which is a separate issue in itself.

Additionally, there is an asymmetry when incoming connections to the localhost are blocked. For example, addprocs (on EC2) followed by addprocs (on localhost) will work fine, but not the other way around, since connections from localhost to the EC2 instance will typically be allowed but not the reverse.

The multi-homed case can probably be solved by addprocs accepting a parameter specifying the interface to use, while the asymmetry is a bit more complicated to address given the current model of connection setup.

@bjarthur
Contributor

bjarthur commented Mar 6, 2014

would it make sense to have getipaddr() return a tuple of all the addresses?

@amitmurthy
Contributor Author

That could be an optional keyword argument - getipaddr(; all=false)

@ivarne
Sponsor Member

ivarne commented Mar 6, 2014

As a keyword argument, it would make the function type unstable. (Not sure if it matters in this case.)
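A small sketch of the type-stability concern, using a hypothetical stand-in rather than the real `getipaddr`. `pick` takes the address list as an argument so the example does not depend on the host, and it returns the whole vector instead of a tuple for simplicity.

```julia
# Hypothetical stand-in for the proposed getipaddr(; all=false).
# The return type depends on the *value* of `all`, not on argument types,
# so the result is inferred as a Union of String and Vector{String}.
pick(addrs::Vector{String}; all::Bool=false) = all ? addrs : first(addrs)

# One possible alternative: two functions, each with a single return type.
first_address(addrs::Vector{String}) = first(addrs)
all_addresses(addrs::Vector{String}) = addrs

addrs = ["10.0.0.5", "192.168.1.7"]

one  = pick(addrs)             # a String
many = pick(addrs; all=true)   # a Vector{String}
```

Whether the instability matters depends on the caller: for a function called once during setup, the cost is probably negligible, which matches the caveat in the comment above.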
