[discovery] Issue on connecting to HZ cluster started from Docker #9978

Closed
dsukhoroslov opened this issue Feb 25, 2017 · 8 comments
Labels
[OLD]Team: Integration, Source: Community

Comments

@dsukhoroslov
Contributor

HZ 3.8. I'm preparing a Docker image for my app, which is built on top of HZ. When I start one server node from my Docker image, I can connect to it from a Java test app started on my host machine via the exposed ip:port (192.168.99.100:10500) with no issues. But when I start two Docker nodes, it does not work any more. The nodes discover each other by multicast and form a cluster on the internal Docker network:

2017-02-25 21:50:55.727 [hz.default.priority-generic-operation.thread-0] INFO  com.hazelcast.internal.cluster.ClusterService - [172.17.0.2]:10500 [default] [3.8]

Members [2] {
        Member [172.17.0.2]:10500 - faec26f7-856c-46e4-bf07-4c02ced0feed this
        Member [172.17.0.3]:10500 - f9535704-5141-42cc-92f5-89b419afa3fa
}

The nodes are exposed outside as 192.168.99.100:10500 and 192.168.99.100:10501. Now when I start the client app, it initially connects to one of the nodes. This is the node log:

2017-02-25 21:55:22.112 [hz.default.async.thread-17] INFO  c.h.c.i.protocol.task.AuthenticationCustomCredentialsMessageTask - [172.17.0.2]:10500 [default] [3.8] Received auth from Connection[id=3, /172.17.0.2:10500->/192.168.99.1:53726, endpoint=null, alive=true, type=JAVA_CLIENT], successfully authenticated, principal : ClientPrincipal{uuid='ae5e71f5-cedf-4b37-9377-104313209699', ownerUuid='faec26f7-856c-46e4-bf07-4c02ced0feed'}, owner connection : true, client version : 3.8

but then the client hangs for 2 min and throws an exception after that:

2017-02-26 00:57:34.818 [Thread-6] ERROR com.bagri.xqj.BagriXQDataSource - initRepository. error creating Repository 
..........
Caused by: java.io.IOException: No available connection to address [172.17.0.3]:10500

Sure, the IP 172.17.0.3 is not exposed outside and is not accessible from my host. But how did the client get this IP at all? It looks like it got this info from the server side after the initial connect. The client has smart-routing set to `true`. As I said, there are no issues when I connect to a single node with the same client/server apps.

Please have a look.
Thanks, Denis.

@mesutcelik

mesutcelik commented Mar 3, 2017

Hi @dsukhoroslov ,

Can you please check that comment and see if it is helpful for you?
#4537 (comment)

The problem is that if you don't set the public address of an instance launched as a Docker container, the hazelcast-client's smart routing fetches the member list from the member it first connects to, and that list contains unreachable IP addresses, i.e. 172.17.0.x.
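
To make that concrete for the two containers in this report, here is a minimal sketch that only prints the JVM flag each member would need. It assumes the host IP and published ports from the original post; which container is published on which port is a guess, and the class and method names are hypothetical. `hazelcast.local.publicAddress` is the Hazelcast system property for overriding the address a member advertises.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PublicAddressFlags {

    // Hazelcast system property that overrides the address a member advertises.
    static final String PUBLIC_ADDRESS_PROPERTY = "hazelcast.local.publicAddress";

    // Builds the JVM flag for one member from the Docker host IP and its published port.
    static String jvmFlag(String dockerHostIp, int publishedPort) {
        return "-D" + PUBLIC_ADDRESS_PROPERTY + "=" + dockerHostIp + ":" + publishedPort;
    }

    public static void main(String[] args) {
        String dockerHostIp = "192.168.99.100"; // host IP from the original report

        // Internal container address -> port published on the Docker host.
        // Assumption: the mapping of container to published port is illustrative.
        Map<String, Integer> publishedPorts = new LinkedHashMap<>();
        publishedPorts.put("172.17.0.2:10500", 10500);
        publishedPorts.put("172.17.0.3:10500", 10501);

        for (Map.Entry<String, Integer> e : publishedPorts.entrySet()) {
            System.out.println(e.getKey() + " -> " + jvmFlag(dockerHostIp, e.getValue()));
        }
    }
}
```

With each member started this way, the member list handed to a smart-routing client should contain the 192.168.99.100 addresses rather than the 172.17.0.x ones.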

@dsukhoroslov
Contributor Author

192.168.99.100 may work for me. Thank you @mesutcelik, I'll try

@Fabryprog

check #9963

@dsukhoroslov
Contributor Author

Hi @Fabryprog,
Yes, Docker swarm is still on my list, definitely have to try it..

@bitsofinfo

#10801

@bitsofinfo

@mmedenjak changed the title from "Issue on connecting to HZ cluster started from Docker" to "[discovery] Issue on connecting to HZ cluster started from Docker" on Jul 13, 2017
@mmedenjak
Contributor

The root causes of these docker issues are:

  • the DefaultAddressPicker implementation
  • the connection checks that disallow establishing a connection

The connection checks can be disabled, but the connection still won't be established because of another issue: #11256
So your options are either to set the Hazelcast instance's public address via the Hazelcast properties or a JVM parameter, or to override the DefaultAddressPicker implementation, which picks the wrong bind and public addresses. In Hazelcast 3.9, a new SPI was added that lets you plug in custom address picker implementations. For now you will have to use the SPI yourself and write an implementation that fixes your issue, but we are planning to release implementations of our own, bundled into plugins such as the Docker or AWS plugin, for easier deployment.

Please check out the new SPI:
https://github.com/hazelcast/hazelcast/blob/3cede71cad1fe87312f0901ff77f903ed2d4383d/hazelcast/src/main/java/com/hazelcast/spi/MemberAddressProvider.java
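
As a rough sketch of what such an implementation could look like: the class below returns a fixed bind address and a fixed public address. The class name, the constructor parameters, and the two method names `getBindAddress()` and `getPublicAddress()` are assumptions about the linked interface, so check them against it. To keep the sketch compilable without Hazelcast on the classpath it does not declare `implements MemberAddressProvider`; a real plug-in would.

```java
import java.net.InetSocketAddress;

// Hypothetical class. To plug it into Hazelcast 3.9+ it would additionally declare
// "implements com.hazelcast.spi.MemberAddressProvider" (assumed shape of the linked SPI).
public class StaticAddressProvider {

    private final InetSocketAddress bindAddress;
    private final InetSocketAddress publicAddress;

    public StaticAddressProvider(int containerPort, String publicHost, int publicPort) {
        // Bind on all interfaces inside the container.
        this.bindAddress = new InetSocketAddress("0.0.0.0", containerPort);
        // Advertise the address that is reachable from outside Docker.
        this.publicAddress = new InetSocketAddress(publicHost, publicPort);
    }

    // Address the member listens on inside the container.
    public InetSocketAddress getBindAddress() {
        return bindAddress;
    }

    // Address the member hands out to other members and clients.
    public InetSocketAddress getPublicAddress() {
        return publicAddress;
    }

    public static void main(String[] args) {
        // Values from the original report: container port 10500, published as 192.168.99.100:10501.
        StaticAddressProvider p = new StaticAddressProvider(10500, "192.168.99.100", 10501);
        System.out.println("bind=" + p.getBindAddress() + " public=" + p.getPublicAddress());
    }
}
```

In a real deployment the host and ports would more likely come from environment variables or container metadata than from constructor arguments.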

Please create a new issue or reopen this one if this does not suit your use case.

@bitsofinfo

Starting work on https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi 1.0-RC4 which will integrate this new MemberAddressProvider in a version of the SPI for hazelcast 3.9.x+

@mmedenjak added the "Source: Community" label on Jul 11, 2020