Skip to content

GEODE-7565: Allow gw receivers with same host and port#4627

Closed
alb3rtobr wants to merge 17 commits intoapache:developfrom
Nordix:feature/GEODE-7565
Closed

GEODE-7565: Allow gw receivers with same host and port#4627
alb3rtobr wants to merge 17 commits intoapache:developfrom
Nordix:feature/GEODE-7565

Conversation

@alb3rtobr
Copy link
Contributor

There is a problem with Geode WAN replication when GW receivers are configured with the same hostname-for-senders and port on all servers. The reason for such a setup is deploying Geode cluster on a Kubernetes cluster where all GW receivers are reachable from the outside world on the same VIP and port.

The problem experienced is that shutting down one server is stopping replication to this cluster until the server is up again. This is because Geode incorrectly assumes there are no more alive servers when just one of them is down, because since they share hostname-for-senders and port, they are treated as one same server.

With these changes locator is able to distinguish the different receivers using the same hostname and port so replication is not impacted when one server is stopped.

( Link to thread in dev mailing list: https://markmail.org/thread/6qakx67rxiokdsec )

Copy link
Contributor

@jake-at-work jake-at-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use the strongly typed InternalDistributedMember rather than it's string representation?

Copy link
Contributor

@jake-at-work jake-at-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the use of ServerLocation and the ID in the key is redundant. The ID will always be unique and there will never be duplicate ID with two different ServerLocations. In the places where we need the ServerLocation we also have the LoadHolder which also has the ServerLocation, so we can just get ServerLocation from LoadHolder.

@alb3rtobr
Copy link
Contributor Author

Why use the strongly typed InternalDistributedMember rather than it's string representation?

I don't see which part of the code is your comment about, but I suppose is referred to ServerLocationAndMemberId class, am I right? I used the string representation because its the minimum information I think I need. Thanks for the fast review!

@jake-at-work
Copy link
Contributor

Why use the strongly typed InternalDistributedMember rather than it's string representation?

I don't see which part of the code is your comment about, but I suppose is referred to ServerLocationAndMemberId class, am I right? I used the string representation because its the minimum information I think I need. Thanks for the fast review!

Sorry, I should have attached it to a line of code. Yes, the issues is with using the String since it we have a more strongly typed value the string is derived from why not use it.

If you consider my other comment about the ID really being the only unique part of the key anyway it makes a stronger argument for the strong type over the string as the key for the maps.

@alb3rtobr
Copy link
Contributor Author

Hi @pivotal-jbarrett , the changes you requested are done and all checks pass, could you please take a look? Thanks!

@lgtm-com
Copy link

lgtm-com bot commented Jan 29, 2020

This pull request introduces 2 alerts when merging 3f2f49e into 9de8d20 - view on LGTM.com

new alerts:

  • 2 for Type mismatch on container access

@lgtm-com
Copy link

lgtm-com bot commented Jan 30, 2020

This pull request introduces 2 alerts when merging 924a2c4 into d294764 - view on LGTM.com

new alerts:

  • 2 for Type mismatch on container access

@alb3rtobr alb3rtobr force-pushed the feature/GEODE-7565 branch from efdf8ee to 5bc2f7e Compare March 9, 2020 13:26
Copy link
Contributor

@bschuchardt bschuchardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few failures in client/server tests that need to be addressed before this PR is ready for review. The same Pool/ServerLocation/Endpoint code is used for client/server communications and WAN communications and and these changes that are intended for WAN seem to be harming client/server functionality.

@alb3rtobr
Copy link
Contributor Author

This PR has become a bit messy, so Im closing it.
I have created a new one: #4824
And also, a wiki page describing the problem more in detail: https://cwiki.apache.org/confluence/display/GEODE/Allow+same+host+and+port+for+all+gateway+receivers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants