Using MemberAddressProvider with custom discovery strategy SPI doesn't seem to work. #11997

Closed
bitsofinfo opened this issue Dec 14, 2017 · 4 comments


@bitsofinfo bitsofinfo commented Dec 14, 2017

I'm attempting to upgrade https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi to leverage the new MemberAddressProvider SPI (#11548)

Current working swarm discovery spi works fine w/ custom AddressPicker

The current working version of the swarm discovery strategy relies on defining a custom AddressPicker. This works fine w/ hazelcast 3.8 and 3.9.x (RC3) https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi/tree/1.0-RC3

Results against HZ 3.9.1 with a custom AddressPicker implementation: https://travis-ci.org/bitsofinfo/hazelcast-docker-swarm-discovery-spi/jobs/316600832

(see travis config in job history, the difference is just the param -Dswarm-bind-method=address-picker)

Indications of successful cluster formation (sample extract below) + member list is accurate (see travis logs above)

Dec 14, 2017 8:08:35 PM com.hazelcast.nio.tcp.TcpIpAcceptor
INFO: [10.0.0.7]:5701 [hazelcast-docker-swarm-discovery-spi] [3.9.1] Accepting socket connection from /10.0.0.9:41808
Dec 14, 2017 8:08:35 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [10.0.0.7]:5701 [hazelcast-docker-swarm-discovery-spi] [3.9.1] Established socket connection between /10.0.0.7:5701 and /10.0.0.9:41808

Pending version of swarm discovery spi with MemberAddressProvider SPI solution fails

The MASTER branch has been updated for hazelcast 3.9.x and provides a SwarmMemberAddressProvider

https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi/blob/master/src/main/java/org/bitsofinfo/hazelcast/discovery/docker/swarm/SwarmMemberAddressProvider.java

However, when this is used, the discovery code finds every member correctly, but Hazelcast never forms a cluster.

Results when the new MemberAddressProvider SPI is used:
https://travis-ci.org/bitsofinfo/hazelcast-docker-swarm-discovery-spi/jobs/316603960

(see travis config in job history, the difference is just the param -Dswarm-bind-method=member-address-provider)

Failure (sample extract below): no log lines show "Accepting socket connection", and the member list contains just ONE member. It should contain 10, so the cluster never fully joins up (see travis logs above).

Dec 14, 2017 8:17:07 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [10.0.0.8]:5701 [hazelcast-docker-swarm-discovery-spi] [3.9.1] Established socket connection between /10.0.0.8:34369 and /10.0.0.12:5701
Dec 14, 2017 8:17:07 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [10.0.0.8]:5701 [hazelcast-docker-swarm-discovery-spi] [3.9.1] Established socket connection between /10.0.0.8:33002 and /10.0.0.3:5701
Dec 14, 2017 8:17:07 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [10.0.0.8]:5701 [hazelcast-docker-swarm-discovery-spi] [3.9.1] Established socket connection between /10.0.0.8:51974 and /10.0.0.6:5701

The only real difference between these examples is the use of a custom AddressPicker (which works) vs using the new MemberAddressProvider SPI which was introduced to avoid the AddressPicker hack.

Configs used by the test operating in either mode are here:
https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi/tree/master/src/main/resources

https://github.com/bitsofinfo/hazelcast-docker-swarm-discovery-spi/blob/master/src/main/java/org/bitsofinfo/hazelcast/discovery/docker/swarm/test/DockerTestRunner.java

@mmedenjak mmedenjak added this to the 3.10 milestone Dec 15, 2017

@bitsofinfo bitsofinfo commented Dec 15, 2017

Just curious: how far off is 3.10? Is there any way this could be addressed in a 3.9.x release?

@taburet taburet self-assigned this Dec 27, 2017

@taburet taburet commented Dec 27, 2017

@bitsofinfo It seems there is a problem in SwarmMemberAddressProvider: while constructing its SwarmDiscoveryUtil instance, it passes bindSocketChannel = true to the constructor, but MemberAddressProviders are not responsible for socket creation or binding. This effectively blocks HZ from binding its own socket to the address. Please try passing bindSocketChannel = false.
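
To illustrate why an already-bound socket gets in the way, here is a minimal, self-contained sketch using only the JDK. It is not Hazelcast or SwarmDiscoveryUtil code; the class name BindConflictDemo and the method secondBindFails are hypothetical names made up for this example. The first channel stands in for a helper that binds the address itself, and the second stands in for the server that later tries to bind the same address.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; not part of Hazelcast or the swarm discovery SPI.
final class BindConflictDemo {

    /** Returns true if a second bind to an already-bound address is rejected. */
    static boolean secondBindFails() {
        // "first" plays the role of a helper that binds the socket itself.
        try (ServerSocketChannel first = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            first.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            SocketAddress taken = first.getLocalAddress();

            // "second" plays the role of the server trying to bind the same address.
            try (ServerSocketChannel second = ServerSocketChannel.open()) {
                second.bind(taken);
                return false; // the address was still free
            } catch (BindException e) {
                return true;  // the address is already in use
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
    }
}
```

By analogy, an address provider that only reports which address to use, without binding anything itself, leaves the port free for the server to bind.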

bitsofinfo added a commit to bitsofinfo/hazelcast-docker-swarm-discovery-spi that referenced this issue Dec 27, 2017

@bitsofinfo bitsofinfo commented Dec 27, 2017

boom, thanks for that pointer @taburet it fixes it...

@bitsofinfo bitsofinfo closed this Dec 27, 2017