[docker] Docker: "This node is not requested endpoint" issues #9219

Closed
bitsofinfo opened this issue Nov 4, 2016 · 17 comments


@bitsofinfo bitsofinfo commented Nov 4, 2016

Since the original issue is closed, I'm re-posting this as we are re-visiting this now and still having issues: #4537 (comment)

Perhaps hazelcast can re-open the old issue or look into addressing it as it is sort of a blocker at this point (still need to try testing w/ overlay network), but regardless:

What is described below is reproducible with this sample project:
https://github.com/bmudda/hazelcast-docker-test

Referenced issues:
#4537
hazelcast/hazelcast-docker#10

a) clone, build and create a docker image for the sample project above (see the README)

b) Edit the docs/hazelcast.xml to reflect the "members" as they will appear as mapped to the docker host when running as containers (i.e. [dockerhostip]:[mapped-5701-port])

...
      <tcp-ip enabled="true">
        <member>[DOCKER_HOST_IP]:40001</member>
        <member>[DOCKER_HOST_IP]:40002</member>
      </tcp-ip>
...

c) In terminal one, launch node1 (mapped to 40001)

docker run --rm=true -v /path/to/hazelcast-docker-test/docs:/config -p 40001:5701 [IMAGE_ID] java -Dhazelcast.config=/config/hazelcast.xml -jar /hzdocker/hazelcast-docker-test.jar

d) In terminal two, launch node2 (mapped to port 40002)

docker run --rm=true -v /path/to/hazelcast-docker-test/docs:/config -p 40002:5701 [IMAGE_ID] java -Dhazelcast.config=/config/hazelcast.xml -jar /hzdocker/hazelcast-docker-test.jar

Once both are up, you will see these kinds of messages as each container tries to connect to the other to form the cluster:

....
Nov 04, 2016 9:05:01 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
WARNING: [172.17.0.3]:5701 [hazelcast-docker-test] [3.7.2] Wrong bind request from [172.17.0.2]:5701! This node is not requested endpoint: [192.168.0.148]:40001

...
Nov 04, 2016 9:15:44 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
WARNING: [172.17.0.2]:5701 [hazelcast-docker-test] [3.7.2] Wrong bind request from [172.17.0.2]:5701! This node is not requested endpoint: [192.168.0.148]:40002
....

Again due to this:

if (ioService.isSocketBindAny() && !connection.isClient() && !thisAddress.equals(localEndpoint)) {

The only way this will work is if the user specifies the public-address in the network configuration, as such:

  <network>
  <public-address>192.168.0.148:40001</public-address>
....
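
To make the failure mode concrete, here is a minimal, stdlib-only sketch of the address comparison in the condition quoted above. The names `BindCheckSketch`, `Endpoint` and `acceptsBind` are hypothetical and this is not Hazelcast source code; it models only the `thisAddress.equals(localEndpoint)` part, assuming socket-bind-any is enabled and the peer is a member rather than a client.

```java
import java.util.Objects;

// Hypothetical illustration only; not Hazelcast source code.
final class BindCheckSketch {

    // Minimal host:port value type standing in for a member address.
    static final class Endpoint {
        final String host;
        final int port;

        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Endpoint)) {
                return false;
            }
            Endpoint other = (Endpoint) o;
            return port == other.port && host.equals(other.host);
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, port);
        }

        @Override
        public String toString() {
            return "[" + host + "]:" + port;
        }
    }

    // The bind is accepted only when the address this node believes it has
    // equals the address the remote peer asked to reach.
    static boolean acceptsBind(Endpoint thisAddress, Endpoint requestedEndpoint) {
        return thisAddress.equals(requestedEndpoint);
    }

    public static void main(String[] args) {
        // Address the peer dialed: docker host IP plus mapped port.
        Endpoint requested = new Endpoint("192.168.0.148", 40001);
        // Address the node sees from inside the container by default.
        Endpoint containerView = new Endpoint("172.17.0.3", 5701);
        // Address the node advertises once public-address is configured.
        Endpoint withPublicAddress = new Endpoint("192.168.0.148", 40001);

        System.out.println(containerView + " accepts: " + acceptsBind(containerView, requested));
        System.out.println(withPublicAddress + " accepts: " + acceptsBind(withPublicAddress, requested));
    }
}
```

In this sketch the container-internal address never equals the mapped host address, which is the situation the warnings above report; setting public-address makes the two sides agree.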

The problem with this is that in a totally dynamic environment the docker container knows neither the docker host nor the mapped port it is provisioned to by whatever clustering framework you are launching containers with (swarm etc.). The exception is if you use a tool like https://github.com/bitsofinfo/docker-discovery-registrator-consul, which determines this information in the java process BEFORE the HZ instance is started and dynamically configures the public-address on the fly. But this MUST happen before HZ starts, and you have to launch your containers in a way that may not be acceptable or compatible for everyone's use case.

This is also contingent on if you can set the network dynamically and if that will be obeyed if using a discovery SPI (like consul)

Seems like hazelcast should have some more flexibility in this area/scenario to maybe optionally enforce this rule that is currently throwing that exception

@bitsofinfo bitsofinfo changed the title Docker: This node is not requested endpoint issues Docker: "This node is not requested endpoint" issues Nov 4, 2016
@enesakar enesakar added this to the 3.8 milestone Nov 8, 2016
Author

@bitsofinfo bitsofinfo commented Dec 12, 2016

Any thoughts, updates on this?

Contributor

@bilalyasar bilalyasar commented Dec 12, 2016

Hello @bitsofinfo ,
as you said, public-address solves the problem, but it is not a dynamic solution.
I have a workaround for you:
In hazelcast.xml, you can provide the allowed interfaces.
So if you have some info about the possible interfaces, you can list them. Then hazelcast will not use the docker0 interface.

An example:

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast id="hazelcast-consul-discovery" xmlns="http://www.hazelcast.com/schema/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.7.xsd">
  <group>
    <name>hazelcast-docker-test</name>
    <password>haz3lcast1</password>
  </group>
  <network>
    <port auto-increment="true">5701</port>
    <interfaces enabled="true">
      <interface>172.30.*.*</interface>
    </interfaces>
  </network>
</hazelcast>

I tried this on AWS: when I use this xml config together with --net=host, it binds to the ec2 host address; otherwise it binds to the docker0 interface.
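
As a rough illustration of how a wildcard entry such as 172.30.*.* would be compared against the addresses a node can see, here is a stdlib-only sketch. The class `InterfacePatternSketch` and its `matches` method are hypothetical and handle only `*` wildcards on IPv4 addresses; this is not Hazelcast's own matcher, and the sample address 172.30.1.15 is an assumption for illustration.

```java
// Hypothetical illustration only; not Hazelcast's interface matcher.
final class InterfacePatternSketch {

    // Returns true when each of the four dot-separated pattern parts is either
    // "*" or identical to the corresponding part of the IPv4 address.
    static boolean matches(String pattern, String ipv4) {
        String[] patternParts = pattern.split("\\.");
        String[] addressParts = ipv4.split("\\.");
        if (patternParts.length != 4 || addressParts.length != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (!patternParts[i].equals("*") && !patternParts[i].equals(addressParts[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String pattern = "172.30.*.*";
        // Assumed address inside the allowed range.
        System.out.println(matches(pattern, "172.30.1.15"));
        // Default docker bridge address seen in the log output earlier in this thread.
        System.out.println(matches(pattern, "172.17.0.2"));
    }
}
```

Under this simplified model, a node whose only visible addresses fall outside every listed pattern has nothing to bind to, so the pattern has to describe an address that actually exists inside the container.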

Author

@bitsofinfo bitsofinfo commented Dec 12, 2016

We will try this out, but the problem is that depending on the overlay network the container is assigned to this "range" could be different.

@mesutcelik mesutcelik modified the milestones: 3.9, 3.8 Feb 6, 2017

@gAmUssA gAmUssA commented Feb 6, 2017

@bitsofinfo hi. did you have a chance to try @bilalyasar's suggestion?

Author

@bitsofinfo bitsofinfo commented Feb 7, 2017

@gAmUssA @bilalyasar the <interfaces> solution does not work.

Putting the docker host subnet in there yields "Hazelcast CANNOT start on this node. No matching network interface found."

Putting the docker host network subnet in there yields the original error that started this thread to begin with: "This node is not requested endpoint".

Only the public-address solution works when the members are @ [docker_host_ip]:[mapped-port]

Note my members are spread across many different docker hosts and I have no idea what their ips might be, (hence using the SPI)

Either way, we are able to use https://github.com/bitsofinfo/docker-discovery-registrator-consul to dynamically determine each node's docker host ip and mapped HZ port, and then use this information to seed the public-address configuration option before HZ bootstraps.
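
A minimal sketch of that "seed public-address before bootstrap" step follows, assuming the host IP and mapped port have already been discovered and handed to the process. The class `PublicAddressSketch`, the method `resolvePublicAddress` and the variable names `HZ_PUBLIC_HOST` and `HZ_PUBLIC_PORT` are hypothetical; the Hazelcast calls appear only in comments because the sketch itself uses the standard library alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the variable names below are assumptions for illustration.
final class PublicAddressSketch {

    // Builds "host:port" from values that a registrator or orchestrator is
    // assumed to have injected, and fails fast when they are missing or invalid.
    static String resolvePublicAddress(Map<String, String> env) {
        String host = env.get("HZ_PUBLIC_HOST");
        String port = env.get("HZ_PUBLIC_PORT");
        if (host == null || host.trim().isEmpty() || port == null || port.trim().isEmpty()) {
            throw new IllegalStateException("HZ_PUBLIC_HOST and HZ_PUBLIC_PORT must both be set");
        }
        int parsedPort = Integer.parseInt(port.trim());
        if (parsedPort < 1 || parsedPort > 65535) {
            throw new IllegalArgumentException("port out of range: " + parsedPort);
        }
        return host.trim() + ":" + parsedPort;
    }

    public static void main(String[] args) {
        // Sample values matching the addresses used earlier in this thread;
        // a real process would pass System.getenv() instead.
        Map<String, String> sample = new HashMap<>();
        sample.put("HZ_PUBLIC_HOST", "192.168.0.148");
        sample.put("HZ_PUBLIC_PORT", "40001");

        String publicAddress = resolvePublicAddress(sample);

        // With Hazelcast on the classpath, the value would be applied before startup:
        //   Config config = new XmlConfigBuilder().build();
        //   config.getNetworkConfig().setPublicAddress(publicAddress);
        //   Hazelcast.newHazelcastInstance(config);
        System.out.println(publicAddress);
    }
}
```

The point of failing fast is that a node started without a resolvable public address would otherwise come up and hit the bind check at join time.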

Again if this "check" below could just be a configurable option to disable it would be great. If I configure hazelcast to listen on 5701, and 5701 is mapped to a high port on the dockerhost, and I give HZ a list of valid members that live wherever (however discovered... hardwired or via SPI), it should just work.

if (ioService.isSocketBindAny() && !connection.isClient() && !thisAddress.equals(localEndpoint)) {

Author

@bitsofinfo bitsofinfo commented Jun 21, 2017


@gAmUssA gAmUssA commented Jun 21, 2017

@bitsofinfo let me try reproduce this stuff since I'm writing swarm tutorial at this moment

Author

@bitsofinfo bitsofinfo commented Jun 21, 2017

@gAmUssA thanks, I'm in the middle of trying to write the swarm discovery spi, so this is a bit of a blocker at this point (i.e. #10801)

https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi

Author

@bitsofinfo bitsofinfo commented Aug 9, 2017

When are all these docker networking, interface, binding issue related things going to finally be looked at or scheduled to be addressed?


@tdevopsottawa tdevopsottawa commented Aug 17, 2017

@bitsofinfo I'm with you.. been struggling to get hazelcast working in docker on aws for weeks now

Contributor

@jerrinot jerrinot commented Aug 17, 2017

@bitsofinfo: we are analyzing the swarm/kubernetes issues right now.

I hope to open the document with our analysis soon. in the meantime - can you please have a look at my experimental branch. would this work for you? You can see this as an example how to use it.

Contributor

@lazerion lazerion commented Aug 18, 2017

@bitsofinfo we completed our analysis and it will be opened after reviews. I am also working on @jerrinot's experimental branch to test the solution on Docker Swarm.

@tombujok tombujok modified the milestones: 3.9.1, 3.9 Aug 22, 2017

@javanotes javanotes commented Aug 24, 2017

Landed on the same boat! Any hopes yet?

@mmedenjak mmedenjak changed the title Docker: "This node is not requested endpoint" issues [docker] Docker: "This node is not requested endpoint" issues Sep 22, 2017

@gagangoku gagangoku commented Sep 24, 2017

@bitsofinfo hi. Been trying to run hazelcast on docker and your post saved the day. I was finally able to understand what's happening and am now able to connect multiple hazelcast containers to each other.

I agree with the point you make about dynamic ports. --net=host is just bad, it won't scale.

I ended up parameterizing the public-address and tcp member list so that you can pass them as arguments to docker run command:

docker_ip=$(echo $DOCKER_HOST | sed 's,^.*/,,' | sed 's/:.*//')
docker run -e PUBLIC_IP=$docker_ip:40001 -e MEMBER_LIST=$docker_ip:40001,$docker_ip:40002 -p 40001:5701 <image-name>

For example, on my machine:

docker run -e PUBLIC_IP=192.168.99.100:40001 -e MEMBER_LIST=192.168.99.100:40001,192.168.99.100:40002 -p 40001:5701 my-hazelcast
docker run -e PUBLIC_IP=192.168.99.100:40002 -e MEMBER_LIST=192.168.99.100:40001,192.168.99.100:40002 -p 40002:5701 my-hazelcast

Here's my run.sh script inside Dockerfile:

echo "----------------------- Starting Hazelcast ---------------------------"

echo "PUBLIC_IP = $PUBLIC_IP"
echo "MEMBER_LIST = $MEMBER_LIST"
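MEMBERS=""

# Split MEMBER_LIST on commas only for this loop, then restore IFS so the
# unquoted ${JAVA_OPTS} further down is still split on whitespace.
OLD_IFS="$IFS"
IFS=","
for m in $MEMBER_LIST
do
  MEMBERS="$MEMBERS<member>$m</member>"
done
IFS="$OLD_IFS"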
echo "MEMBERS=$MEMBERS"

# Replace the variables
echo "Replacing variables in place"
sed -i.bak "s/{PUBLIC_IP}/$PUBLIC_IP/" hazelcast.xml
sed -i.bak "s,{MEMBER_LIST},$MEMBERS," hazelcast.xml

java -server ${JAVA_OPTS} -cp hazelcast-all-3.8.3.jar com.hazelcast.core.server.StartServer

And hazelcast.xml:

  <network>
    <port auto-increment="true" port-count="100">5701</port>
    <public-address>{PUBLIC_IP}</public-address>
    <join>
      <tcp-ip enabled="true">
        <member-list>{MEMBER_LIST}</member-list>
      </tcp-ip>
...

This makes a 2-node cluster, and both hazelcast instances are able to talk to each other because of the public address.

This could be simplified if we were able to access the port mapping from within the docker container. It seems it's an in-progress feature request on docker which might take a while: moby/moby#26331

With moby's change, we would not fix a port, instead let docker use a dynamic port mapping and construct the public ip from within the container itself.

Am still at a loss on how to construct the member list though. It probably has to be done via a discovery service.

Collaborator

@Holmistr Holmistr commented Feb 2, 2018

@mesutcelik Are there any updates on this?

Contributor

@mesutcelik mesutcelik commented Feb 7, 2018

closing. please see #12275

@mesutcelik mesutcelik closed this Feb 7, 2018