Zone dependent Name resolution #2545

Closed
cforce opened this issue Dec 13, 2017 · 19 comments

Comments

@cforce

cforce commented Dec 13, 2017

I have a hybrid cloud scenario with zone east and zone west, where each zone runs on its own PaaS that does outbound discovery by DNS name resolution using a router component. Discovery is done with zone affinity in each cloud (west or east) or, if discovery finds no server there, in the other region's zone, using peered Eureka in each region.
When the discovered target is in the same zone, the extra hop via the router that resolves the discovered globally unique FQDN (WAN) is too expensive. For connections to targets in the same zone, I want to use the LAN IP address instead of the global WAN FQDN.
The only idea I have is to implement a selection in Ribbon, based on matching my own zone with the target zone, that uses either the VIP or the IP address.

@cforce
Author

cforce commented Dec 13, 2017

@ryanjbaxter
Contributor

Is the problem here that you want to register with the FQDN to support the cross-region communication, but only want it used in that scenario? Otherwise, when making a request to an app in the same region, you want to use the VIP instead?

@ryanjbaxter
Contributor

So you essentially want Ribbon to be smart and choose the correct one?

@cforce
Author

cforce commented Dec 16, 2017

Yes, the caller checks whether the callee is in the same zone (this could be done by comparing the zone field, I think). What is needed is one address that the DiscoveryClient registers for the microservice for a local route (same zone) and one that can be used within the same region (outside the same PCF), which will mostly be a global DNS name or WAN address.

@cforce
Author

cforce commented Dec 20, 2017

There is a partial solution (https://content.pivotal.io/isolation-segments/6-ways-pivotal-cloud-foundry-1-10-improves-your-security-posture , https://content.pivotal.io/blog/building-spring-microservices-with-cloud-foundrys-new-container-networking-stack ) that allows registering a microservice running on PCF/PWS with its local IP address (instead of its public route). Combined with container-to-container networking, available since PCF 1.10, this allows direct communication between directly registered applications/microservices in the same PCF instance (local access route using the overlay IP). However, if I want the same microservice to be accessible from outside as well, e.g. when all same-zone app instances are unhealthy and Ribbon wants to choose an instance that runs on the "other" PCF instance in the "other" cloud, there is no overlay network spanning the two, and I need to use the public route, which must be known in my zone through peered Eurekas (see http://docs.pivotal.io/spring-cloud-services/1-2/service-registry/enabling-peer-replication.html ).
So in fact a microservice also has to offer an external route that allows access by load balancing or plain lookup from another PCF / Eureka region that is not part of the local overlay network.
If my Eureka-published instances have different addresses ("types" of subnets or domains), Ribbon needs to choose the "right" address (best route, i.e. shortest path) for the destination instance that the balancer rule has chosen.
Several things have to hold if I am to choose the "local" address over the remote WAN address:
1.) My destination instance has more than one address published in Eureka that I might have to choose from (could be even more than two).
2.) I know which one is the local (same-zone) address, so I have to match my own location (zone) with the destination instance's zone. That should be easily possible via the manifest zone attribute.
3.) The address I choose is reachable from my origin. That might not be the case if container-to-container networking is not enabled and I can't connect to that "local" overlay IP. Ribbon would then mark this instance as unhealthy on the client side, wouldn't it? Not nice, but it would work. This could be improved if spring-cloud-services-dependencies introduced a feature that lets me find out whether my local firewall allows contact with the destination's IP.

@ryanjbaxter
Contributor

That's all assuming you are using PCF, which may not always be the case.

@cforce
Author

cforce commented Dec 21, 2017

@csterwa Would you help us find a good solution, and where/how to implement it so that it is shared by both worlds (PCF or any other XaaS)? Thanks a lot.

@habuma
Contributor

habuma commented Jan 12, 2018

I'm still digging, but it seems that...

  1. Eureka already provides zone, host, and ip...the three bits of info we'd need to achieve this.
  2. Somewhere between Eureka and Ribbon, the ip is lost and not available in the ServiceInstance object.
  3. It seems that most/all of this could be achieved with a custom implementation of LoadBalancerClient, possibly extending RibbonLoadBalancerClient for simplicity's sake, if the ip were available in the ServiceInstance given to reconstructURI().

WDYT?

@cforce
Author

cforce commented Jan 13, 2018

Using a custom Ribbon LB was also the first idea I had, but I'm not sure that is all that is necessary to achieve a clean solution. It is a good start, however.
I think for 3.) there could be support per PaaS (e.g. PCF, Kubernetes). In the case of PCF, we might get the information or an implementation of this Ribbon LB from io.pivotal.spring.cloud:spring-cloud-services-dependencies. This dependency already introduces a config property "spring.cloud.services.registrationMethod.direct: true" that skips a microservice's public route registration (in the GoRouter and Spring Cloud Services / Eureka) and then publishes only local IPs in Eureka?
Reading https://content.pivotal.io/blog/building-spring-microservices-with-cloud-foundrys-new-container-networking-stack I am very unsure what is already there and what is not.
In any case, the scenario should not support only one of routing via "LAN local IPs" or "WAN global routes/domain names" per destination application, but both patterns at the same time for the same looked-up app instance. Eureka peering (don't spread/use the LAN IPs in other regions/zones) plus the client-side Ribbon LB (use the LAN IP if the target is in the same region/zone as myself and a route is possible, e.g. by querying the PaaS/PCF context) should then make the right choice.
An app instance would offer the local IP only as an additional option, because the public route always works too (including for nearby LAN connectivity) but implies an extra hop via the (Go) router, which introduces delay and puts load on a central component that is even part of an edge security zone facing the evil internet.

@habuma
Contributor

habuma commented Jan 17, 2018

In my testing, I've confirmed that the default behavior is to register both the app's route (in the "hostName" property) and the app's IP address (in the "ipAddr" property), UNLESS eureka.instance.prefer-ip-address is set to true (which is what happens in SCS when you set the registration method to "direct"). In that case, the IP address is registered in the "hostName" property and in the "ipAddr" property...which is not desirable.

In order to achieve the desired outcome, BOTH the IP and the route need to be registered--the IP for local, direct access and the route for external access through the GoRouter. And, in fact, that is precisely what happens by default (if you don't use "direct" registration). As I stated, the IP is in "ipAddr" and the route is in "hostName". Along with the "zone" in the registration metadata, we have everything we need for the load balancer to make an informed choice and select either the IP or the route.

Except...

At the point where the request URL is reconstructed, only the "zone" and "hostName" properties are available. The "ipAddr" property is not available. Therefore, there's no way for LoadBalancerClient (or a custom implementation thereof) to use the IP address, because that information isn't available. (Yes, setting eureka.instance.prefer-ip-address to true will make the IP address available, but then the host name isn't available to the LoadBalancerClient if that is needed for URL reconstruction.)

Therefore...

In order for a custom LoadBalancerClient to reconstruct the request URL with either the IP or the route, depending on the circumstance, the "ipAddr" property needs to make its way into Ribbon and to the LoadBalancerClient. This currently isn't the case, so with the current implementation it isn't possible, short of eschewing Ribbon and doing your own service lookup and load-balancing.

This surfaces the question: Can Spring Cloud Netflix be modified (in a future version) to offer the value from the service registration entry's "ipAddr" field in LoadBalancerClient? Or, better yet, is there a workaround (that I'm not seeing) in the current implementation that would make the "ipAddr" value available in a custom LoadBalancerClient?
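
To make the goal concrete, here is a minimal sketch of the URL reconstruction step as it could look if "ipAddr" were available alongside "hostName" and "zone". It is plain Java and does not use Spring Cloud, Ribbon, or Eureka classes: `Target`, `reconstruct`, the example host names, and the example IP are all hypothetical stand-ins chosen for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: hypothetical stand-ins, not Spring Cloud or Eureka types.
final class UriReconstructionSketch {

    /** The three registration values discussed in this thread. */
    static final class Target {
        final String hostName; // the app's route (Eureka "hostName")
        final String ipAddr;   // the app's IP (Eureka "ipAddr")
        final String zone;     // "zone" from the registration metadata

        Target(String hostName, String ipAddr, String zone) {
            this.hostName = hostName;
            this.ipAddr = ipAddr;
            this.zone = zone;
        }
    }

    /** Swaps the logical service name for the IP (same zone) or the route (any other zone). */
    static URI reconstruct(URI original, Target target, String callerZone) {
        boolean sameZone = callerZone != null && callerZone.equals(target.zone);
        String host = sameZone ? target.ipAddr : target.hostName;
        try {
            // Scheme, port, path, and query are kept as-is. A real setup may also
            // need a different port for direct access; that is ignored here.
            return new URI(original.getScheme(), original.getUserInfo(), host,
                    original.getPort(), original.getPath(), original.getQuery(),
                    original.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URI for host " + host, e);
        }
    }

    public static void main(String[] args) {
        URI request = URI.create("http://orders/api/items?id=7");
        Target target = new Target("orders.west.example.com", "10.0.1.15", "west");
        System.out.println(reconstruct(request, target, "west")); // caller in the same zone
        System.out.println(reconstruct(request, target, "east")); // caller in another zone
    }
}
```

The sketch only shows the decision and the host substitution; how the three values would reach this point in Ribbon is exactly the open question.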

@spencergibb
Member

I don't think a custom LoadBalancerClient is needed. In fact, see DomainExtractingServerList. It has access to InstanceInfo via DiscoveryEnabledServer which has both host and ip addr. Can probably do things in there, maybe with options?

@habuma
Contributor

habuma commented Jan 18, 2018

@spencergibb Are you thinking that perhaps SCS could (in its connectors) configure a ribbonServerList bean (overriding the one provided by autoconfig) that returns a custom DomainExtractingServerList that, instead of relying on useIpAddr as the only/universal factor when deciding to use IP vs host, compares zone with host and decides whether to use IP vs. host?

On the surface (meaning that I've not tried any experiments), that sounds like it could work and would require no changes to OSS code.
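
A rough sketch of that per-instance decision, written as plain Java so it stands on its own. It does not extend or call DomainExtractingServerList, DiscoveryEnabledServer, or InstanceInfo; `Registration`, `addressFor`, and `toServerAddresses` are hypothetical names, and falling back to the host when no IP is published is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: models the idea, not the Ribbon or Eureka API.
final class ZoneAwareServerListSketch {

    /** Hypothetical stand-in for the values a registry entry carries. */
    static final class Registration {
        final String hostName;
        final String ipAddr;
        final String zone;

        Registration(String hostName, String ipAddr, String zone) {
            this.hostName = hostName;
            this.ipAddr = ipAddr;
            this.zone = zone;
        }
    }

    /** Decides per instance, instead of applying one global use-IP flag to all of them. */
    static String addressFor(Registration r, String clientZone) {
        boolean sameZone = clientZone != null && clientZone.equals(r.zone);
        // Assumption: if no IP was published, the host is the only usable address.
        return (sameZone && r.ipAddr != null) ? r.ipAddr : r.hostName;
    }

    /** Maps every discovered registration to the address the client should dial. */
    static List<String> toServerAddresses(List<Registration> registrations, String clientZone) {
        List<String> addresses = new ArrayList<>();
        for (Registration r : registrations) {
            addresses.add(addressFor(r, clientZone));
        }
        return addresses;
    }

    public static void main(String[] args) {
        List<Registration> discovered = Arrays.asList(
                new Registration("orders.east.example.com", "10.0.0.5", "east"),
                new Registration("orders.west.example.com", "10.1.0.9", "west"));
        System.out.println(toServerAddresses(discovered, "east"));
    }
}
```

Called from a client in zone "east", the east instance resolves to its IP and the west instance to its route.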

@spencergibb
Member

Yes.

@cforce
Author

cforce commented Jan 19, 2018

Don't forget that a comparison of the target host's zone with the current zone evaluating to true is not enough to guarantee a successful route. When using the IP, it's still possible that it is not reachable because there is no container-to-container network connection (firewall open, overlay network established) between source and target.
There are two ways to solve that; either or both could be applied.
1.) The cloud-specific adapter (an auto-config dependency, part of SCS and Spring Cloud Kubernetes) allows the client to ask the PaaS/XaaS (Cloud Controller API or whatever) whether the target is reachable.
2.) The client has some (re)try that can check whether the IP is reachable, which would be cloud agnostic.

@csterwa

csterwa commented Jun 25, 2018

@cforce @spencergibb @ryanjbaxter @habuma sounds like we have a potential solution. The SCS team has prioritized this now and will provide an update when we have more information.

@cforce
Author

cforce commented Jul 7, 2018

Great news for multi/hybrid cloud scenarios 😀. Good luck with the sprint, looking forward to it.

@habuma
Contributor

habuma commented Jul 25, 2018

@cforce Your first option is a bit unsettling. At this point, the clients only deal with Eureka and any services that they discover from Eureka. This option would require that they also be aware of and communicate with the cloud platform to know if the target is reachable. This would be highly dependent on the capabilities of the platform in question and would make the application aware of the platform. I'm not sure I'm excited about this option.

The retry approach is intriguing, though. I think that you're suggesting that it should try one way (IP, let's say) and then, if that doesn't work, try the other way (host). On the surface it sounds reasonable, but thinking through where this would need to happen makes me think it is a bit complex. And I worry about performance, as making a bad first call and then falling back will take longer than just making the right call the first time. Even so, it's an option worth exploring.
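
For discussion, a minimal sketch of that try-IP-then-host idea in plain Java. It is not tied to Ribbon or to any retry library; `callWithFallback` and the fake call in `main` are hypothetical, and treating only `UncheckedIOException` as "the direct path did not work" is an assumption of the sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Function;

// Illustrative only: a hand-rolled fallback, not a Ribbon or Spring Retry feature.
final class IpThenHostFallbackSketch {

    /**
     * Tries the direct IP first and falls back to the host (route) if that attempt
     * fails with an I/O error. The failed first attempt is the extra cost mentioned above.
     */
    static <T> T callWithFallback(String ipAddr, String hostName, Function<String, T> call) {
        try {
            return call.apply(ipAddr);
        } catch (UncheckedIOException directFailed) {
            // Assumption: any I/O failure on the direct path means "not reachable".
            return call.apply(hostName);
        }
    }

    public static void main(String[] args) {
        // Fake call: pretend addresses in 10.x are not reachable from this client.
        Function<String, String> fakeCall = address -> {
            if (address.startsWith("10.")) {
                throw new UncheckedIOException(new IOException("no route to " + address));
            }
            return "response via " + address;
        };
        System.out.println(callWithFallback("10.0.1.15", "orders.west.example.com", fakeCall));
    }
}
```

Even in this toy form, the concern is visible: every request to an unreachable IP pays for one failed attempt before the working one, unless the outcome is remembered somewhere.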

@cforce
Author

cforce commented Jul 30, 2018

Actually, Spring Cloud Netflix is not aware of the XaaS, which is an advantage and a disadvantage at the same time. I don't see a big disadvantage in introducing a XaaS-native API (to ask for C2C or P2P connectivity) as a dependency in Spring Cloud Ribbon that is implemented at runtime by autoconfiguration for the supported XaaS (PCF or Kubernetes).

Kubernetes assumes that pods can communicate with other pods, regardless of which host they land on. Every pod has its own cluster-private IP address, so you do not need to explicitly create links between pods or map container ports to host ports. This means that containers within a pod can all reach each other's ports on localhost, and all pods in a cluster can see each other without NAT. Although talking directly to pods (say, pods running nginx) in a flat, cluster-wide address space is possible, when a node dies its pods die with it, and the Deployment will create new ones with different IPs. Kubernetes therefore solves this with a Service, which exposes a logical set of pods running somewhere in your cluster that all provide the same functionality. Each Service is assigned a unique cluster IP address, which is tied to the lifespan of the Service.
There are different options here, like registering with the private IP address and letting Ribbon use its regular health monitoring and LB instead of the Kubernetes ones, or just letting every Spring Cloud app register itself with the Service cluster IP, which lets the PaaS handle it but would hide details of application node distribution and status and disallow custom LB rules on the client side.
Having said that, regarding awareness of the underlying XaaS, the work done for Spring Cloud Kubernetes could be interesting here.

@spencergibb
Member

This module has entered maintenance mode. This means that the Spring Cloud team will no longer be adding new features to the module. We will fix blocker bugs and security issues, and we will also consider and review small pull requests from the community.
