This project demonstrates zone-aware load balancing with Spring Cloud Kubernetes and provides three working implementations, built after discovering that the built-in `ZonePreferenceServiceInstanceListSupplier` doesn't work with Kubernetes Discovery.
Spring Cloud's built-in zone preference doesn't work with Spring Cloud Kubernetes Discovery because zone information is stored in `podMetadata()`, but the built-in mechanism only checks `getMetadata()`.
This repository provides:
- ✅ Three working implementations achieving 100% zone-aware routing
- ✅ Complete local test environment using Kind
- ✅ Detailed documentation of the issue and workarounds
- ✅ Ready-to-submit GitHub issue for the Spring Cloud team
See SPRING_CLOUD_ISSUE.md and FINDINGS_SUMMARY.md for complete details.
- ✅ Three different working approaches for zone-aware load balancing
- ✅ Spring Cloud Kubernetes LoadBalancer with a `@LoadBalanced` `RestTemplate`
- ✅ Local Kubernetes cluster with simulated availability zones
- ✅ Fast development loop with quick rebuild scripts
- ✅ Comprehensive testing comparing all implementations
- ✅ Detailed investigation of why the built-in zone preference fails
Before starting, ensure you have the following installed:
# Required
- Java 17+
- Maven 3.6+
- Docker Desktop for Mac
- kubectl
- kind (Kubernetes in Docker)
- jq (for JSON formatting in tests)
# Install missing tools with Homebrew
brew install kubectl kind jq maven

                    Kind Cluster

  ┌────────────────┐              ┌────────────────┐
  │     Zone A     │              │     Zone B     │
  │                │              │                │
  │  ┌──────────┐  │              │  ┌──────────┐  │
  │  │ Client A │  │              │  │ Client B │  │
  │  │ (zone-a) │  │              │  │ (zone-b) │  │
  │  └────┬─────┘  │              │  └────┬─────┘  │
  │       │        │              │       │        │
  │       ▼        │              │       ▼        │
  │  ┌──────────┐  │              │  ┌──────────┐  │
  │  │Service A1│  │              │  │Service B1│  │
  │  │Service A2│  │              │  │Service B2│  │
  │  └──────────┘  │              │  └──────────┘  │
  └────────────────┘              └────────────────┘

  Client in zone-a → Prefers Service A1/A2 (same zone)
  Client in zone-b → Prefers Service B1/B2 (same zone)
A simple REST service that provides information about itself:
- Returns pod name, zone, IP address
- Deployed with 2 replicas in zone-a and 2 replicas in zone-b (4 total instances)
- Exposes the `/info` endpoint for testing (see the sketch below)
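For orientation, here is a minimal sketch of what such an endpoint can look like (the class body and the `POD_NAME`/`POD_IP` environment variables are assumptions for illustration; the repository's `InfoController.java` may differ):

```java
// Hypothetical sketch of the /info endpoint: returns the pod name, zone, and IP.
// POD_NAME and POD_IP are assumed to be injected via the Kubernetes Downward API;
// ZONE is the environment variable used throughout this demo.
import java.util.Map;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class InfoController {

    @Value("${POD_NAME:unknown}")
    private String podName;

    @Value("${ZONE:unknown}")
    private String zone;

    @Value("${POD_IP:unknown}")
    private String podIp;

    @GetMapping("/info")
    public Map<String, String> info() {
        // Echo back identifying details so clients can verify which pod/zone served the call
        return Map.of("podName", podName, "zone", zone, "podIp", podIp);
    }
}
```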
This project includes three different client implementations to demonstrate various approaches for achieving zone-aware load balancing:
- Custom Client (`client-service/`) - ✅ 100% Zone-Aware
  - Approach: Queries the Kubernetes API for pod labels by IP address
  - How: Uses `KubernetesClient` to fetch the zone from pod labels directly
  - Pros: Works perfectly, uses the standard `topology.kubernetes.io/zone` label
  - Cons: Requires extra API calls per instance
- Simple Client (`simple-client-service/`) - ✅ 100% Zone-Aware
  - Approach: Accesses `DefaultKubernetesServiceInstance.podMetadata()` directly
  - How: Custom supplier that reads the zone from the pod metadata structure
  - Pros: No extra API calls, uses existing discovery data, cleanest approach
  - Cons: Requires Kubernetes-specific code
  - Note: This demonstrates the fix for Spring Cloud's built-in zone preference
- Slice Client (`slice-client-service/`) - ✅ 100% Zone-Aware
  - Approach: Uses the Kubernetes EndpointSlices API for zone information
  - How: Queries EndpointSlices, which have native zone support
  - Pros: Kubernetes-native approach, future-proof (EndpointSlices are standard)
  - Cons: Requires additional RBAC for the endpointslices resource
- MP-Browse (`mp-browse/`) - Production Application Testing
  - Approach: Your actual production application (browse-webapp) deployed in the test cluster
  - How: Uses the same EndpointSlice-based zone-aware load balancing as slice-client
  - Pros: Test your real application locally before deploying to staging/production
  - Use Case: Validate that zone-aware routing works with your actual app and configuration
  - Note: You provide your own JAR file - see `mp-browse/README.md` for setup instructions
Why Three Implementations (+ Production App)?
We discovered that Spring Cloud's built-in `ZonePreferenceServiceInstanceListSupplier` doesn't work with Spring Cloud Kubernetes Discovery (see `SPRING_CLOUD_ISSUE.md`). These three implementations demonstrate different working approaches that achieve 100% zone-aware routing. The mp-browse integration lets you test your actual production application in the same local environment.
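As a rough illustration of the `podMetadata()` fix (a sketch only — the class name here is hypothetical, and the actual code lives in `simple-client-service`'s `LoggingServiceInstanceListSupplier`), a custom supplier can filter instances by the zone label carried in the pod metadata and fall back to all instances when nothing matches:

```java
// Sketch of a zone-filtering supplier based on DefaultKubernetesServiceInstance.podMetadata().
// Assumes podMetadata() exposes the pod's labels under a "labels" key.
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import reactor.core.publisher.Flux;

import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.kubernetes.commons.discovery.DefaultKubernetesServiceInstance;
import org.springframework.cloud.loadbalancer.core.DelegatingServiceInstanceListSupplier;
import org.springframework.cloud.loadbalancer.core.ServiceInstanceListSupplier;

public class PodMetadataZoneSupplier extends DelegatingServiceInstanceListSupplier {

    private static final String ZONE_LABEL = "topology.kubernetes.io/zone";

    private final String zone; // the client's own zone, e.g. from spring.cloud.loadbalancer.zone

    public PodMetadataZoneSupplier(ServiceInstanceListSupplier delegate, String zone) {
        super(delegate);
        this.zone = zone;
    }

    @Override
    public Flux<List<ServiceInstance>> get() {
        return getDelegate().get().map(this::preferSameZone);
    }

    private List<ServiceInstance> preferSameZone(List<ServiceInstance> instances) {
        List<ServiceInstance> sameZone = instances.stream()
                .filter(this::isSameZone)
                .collect(Collectors.toList());
        // Fall back to all instances if no same-zone instance is available
        return sameZone.isEmpty() ? instances : sameZone;
    }

    private boolean isSameZone(ServiceInstance instance) {
        if (instance instanceof DefaultKubernetesServiceInstance k8s && k8s.podMetadata() != null) {
            Map<String, String> labels = k8s.podMetadata().getOrDefault("labels", Map.of());
            return zone.equals(labels.get(ZONE_LABEL));
        }
        return false;
    }
}
```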
- Kind Cluster - Local Kubernetes cluster with nodes labeled as different zones (zone-a, zone-b)
- RBAC - Service accounts and roles for Kubernetes API access
- Namespace - All resources deployed in the `lb-demo` namespace
Create a local Kind cluster with simulated availability zones:
./scripts/setup-kind-cluster.sh

This will:
- Create a Kind cluster with 3 worker nodes
- Label nodes with zone information (zone-a, zone-b)
- Setup the cluster for zone-aware routing
You can deploy any or all of the client implementations:
Deploy Custom Client (Pod label queries):
./scripts/build-and-deploy.sh

Deploy Simple Client (podMetadata access - recommended):

./scripts/build-and-deploy-simple.sh

Deploy Slice Client (EndpointSlices API):

./scripts/build-and-deploy-slice.sh

Deploy MP-Browse (Your Production Application):
# First, copy your JAR file
cp /path/to/browse-webapp.jar mp-browse/app.jar
# Then deploy
./scripts/build-and-deploy-mp-browse.shOr deploy all at once:
./scripts/build-and-deploy.sh
./scripts/build-and-deploy-simple.sh
./scripts/build-and-deploy-slice.sh
# ./scripts/build-and-deploy-mp-browse.sh  # Optional - requires your JAR

Each script will:
- Build the service(s) with Maven
- Create Docker images
- Load images into the Kind cluster
- Deploy Kubernetes resources
- Wait for all pods to be ready
Run the automated test to compare all implementations:
./scripts/test-loadbalancing.sh

This will test all deployed client implementations and show:
- Results from Custom Client (if deployed)
- Results from Simple Client (if deployed)
- Results from Slice Client (if deployed)
- Results from MP-Browse (if deployed)
- Distribution of calls across pods and zones
- Same-zone vs cross-zone call percentages
Expected Result: All implementations should show 100% same-zone routing:
{
"clientZone": "zone-a",
"totalCalls": 20,
"sameZoneCalls": 20,
"crossZoneCalls": 0,
"sameZonePercentage": "100.0%"
}

# Access client service from zone-a
./scripts/port-forward.sh client-service zone-a
# In another terminal, access client service from zone-b
./scripts/port-forward.sh client-service zone-b

Once port-forwarded, you can access:
# Get client info (shows which zone the client is in)
curl http://localhost:8081/client-info
# Make a single call to the sample service
curl http://localhost:8081/call-service
# Test load balancing with 50 calls
curl "http://localhost:8081/test-loadbalancing?calls=50" | jq '.'

{
"clientZone": "zone-a",
"clientPod": "client-service-zone-a-xxx",
"totalCalls": 50,
"sameZoneCalls": 50,
"crossZoneCalls": 0,
"sameZonePercentage": "100.0%",
"podDistribution": {
"sample-service-zone-a-xxx-1": 25,
"sample-service-zone-a-xxx-2": 25
},
"zoneDistribution": {
"zone-a": 50
}
}

The `dev-rebuild.sh` script provides a fast development loop:
# Rebuild and redeploy just the client service
./scripts/dev-rebuild.sh client-service
# Rebuild and redeploy just the sample service
./scripts/dev-rebuild.sh sample-service
# Rebuild and redeploy everything
./scripts/dev-rebuild.sh all

This script:
- Builds only the changed service
- Creates a new Docker image
- Loads it into Kind
- Performs a rolling restart
- Takes ~30-60 seconds instead of several minutes
# View client-service logs from zone-a
./scripts/logs.sh client-service zone-a
# View sample-service logs from zone-b
./scripts/logs.sh sample-service zone-b

kubectl get pods -n lb-demo -L zone,topology.kubernetes.io/zone -o wide

# Exec into a client pod
kubectl exec -it -n lb-demo $(kubectl get pod -n lb-demo -l app=client-service,zone=zone-a -o jsonpath='{.items[0].metadata.name}') -- sh
# Inside the pod, check service endpoints
nslookup sample-service

The key configuration is in `client-service/src/main/resources/application.yml`:
spring:
  cloud:
    kubernetes:
      loadbalancer:
        mode: POD            # Use POD mode for zone-aware load balancing
    loadbalancer:
      zone: ${ZONE:unknown}  # Zone preference

And the load balancer configuration in `LoadBalancerConfig.java`:
@Bean
public ServiceInstanceListSupplier zonePreferenceServiceInstanceListSupplier(
ConfigurableApplicationContext context) {
ServiceInstanceListSupplier delegate =
ServiceInstanceListSupplier.builder()
.withDiscoveryClient()
.build(context);
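    // 'zone' is a String field on this configuration class; given the application.yml above,
    // it is presumably injected from the spring.cloud.loadbalancer.zone property.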
LoadBalancerZoneConfig zoneConfig = new LoadBalancerZoneConfig(zone);
return new ZonePreferenceServiceInstanceListSupplier(delegate, zoneConfig);
}

- Pod Labels: Each pod has a `topology.kubernetes.io/zone` label set to its zone
- Environment Variable: The `ZONE` environment variable is passed to the container
- Spring Cloud LoadBalancer: Uses the `spring.cloud.loadbalancer.zone` property to prefer instances in the same zone
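For completeness, a custom configuration class such as `LoadBalancerConfig` above is usually attached to specific clients via Spring Cloud LoadBalancer's `@LoadBalancerClient` annotation. A minimal wiring sketch (an assumption — the repository may register its supplier differently):

```java
// Hypothetical wiring: apply LoadBalancerConfig to lookups of the "sample-service" service ID.
import org.springframework.cloud.loadbalancer.annotation.LoadBalancerClient;
import org.springframework.context.annotation.Configuration;

@Configuration
@LoadBalancerClient(name = "sample-service", configuration = LoadBalancerConfig.class)
public class SampleServiceLoadBalancerWiring {
}
```

Alternatively, `@LoadBalancerClients(defaultConfiguration = LoadBalancerConfig.class)` applies the same supplier to every service the client calls.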
@Bean
@LoadBalanced
public RestTemplate restTemplate() {
return new RestTemplate();
}
// Usage in controller
String url = "http://sample-service/info"; // Service name instead of host:port
Map<String, String> response = restTemplate.getForObject(url, Map.class);

The `@LoadBalanced` annotation enables:
- Service discovery via Kubernetes
- Client-side load balancing
- Zone-aware routing when configured
kubernetes-loadbalancer/
├── sample-service/                  # Target service (provides /info endpoint)
│   ├── src/main/java/.../controller/
│   │   └── InfoController.java
│   ├── Dockerfile
│   └── pom.xml
│
├── client-service/                  # Custom Client (Pod label queries)
│   ├── src/main/java/.../
│   │   ├── config/
│   │   │   ├── LoadBalancerConfig.java
│   │   │   └── KubernetesClientConfig.java
│   │   ├── loadbalancer/
│   │   │   └── CustomZonePreferenceServiceInstanceListSupplier.java
│   │   └── controller/TestController.java
│   ├── Dockerfile
│   └── pom.xml
│
├── simple-client-service/           # Simple Client (podMetadata) - RECOMMENDED
│   ├── src/main/java/.../config/
│   │   ├── SimpleLoadBalancerConfig.java
│   │   └── LoggingServiceInstanceListSupplier.java   # Key implementation
│   ├── Dockerfile
│   └── pom.xml
│
├── slice-client-service/            # Slice Client (EndpointSlices API)
│   ├── src/main/java/.../config/
│   │   ├── SliceLoadBalancerConfig.java
│   │   └── EndpointSliceZoneServiceInstanceListSupplier.java
│   ├── Dockerfile
│   └── pom.xml
│
├── mp-browse/                       # Production app integration (user provides JAR)
│   ├── Dockerfile                   # Docker config for your browse-webapp
│   ├── README.md                    # Detailed setup instructions
│   ├── .gitignore                   # Excludes app.jar from git
│   └── app.jar                      # (Not in git - you copy your JAR here)
│
├── k8s/                             # Kubernetes manifests
│   ├── namespace.yaml
│   ├── rbac.yaml                    # Includes endpointslices permissions
│   ├── sample-service.yaml
│   ├── client-service.yaml
│   ├── simple-client-service.yaml
│   ├── slice-client-service.yaml
│   └── mp-browse.yaml               # Your production app deployment
│
├── scripts/                         # Helper scripts
│   ├── setup-kind-cluster.sh        # Create Kind cluster
│   ├── build-and-deploy.sh          # Build/deploy custom client
│   ├── build-and-deploy-simple.sh   # Build/deploy simple client
│   ├── build-and-deploy-slice.sh    # Build/deploy slice client
│   ├── build-and-deploy-mp-browse.sh # Build/deploy your production app
│   ├── test-loadbalancing.sh        # Compare all implementations
│   ├── debug-simple-client.sh       # Remote debugging setup
│   ├── port-forward.sh
│   ├── logs.sh
│   ├── cleanup.sh
│   ├── cleanup-all.sh
│   └── destroy-cluster.sh
│
├── SPRING_CLOUD_ISSUE.md            # Ready-to-submit GitHub issue
├── FINDINGS_SUMMARY.md              # Complete investigation summary
├── ISSUE_SUBMISSION_GUIDE.md        # How to submit the issue
├── SOLUTION.md                      # Detailed solution documentation
├── DEBUG_GUIDE.md                   # Remote debugging instructions
└── pom.xml                          # Parent POM
Each client demonstrates a different approach to accessing zone information:
- `client-service/CustomZonePreferenceServiceInstanceListSupplier.java`
  - Queries the Kubernetes API for pod details by IP
  - Extracts the zone from pod labels
- `simple-client-service/LoggingServiceInstanceListSupplier.java` - Recommended
  - Accesses `DefaultKubernetesServiceInstance.podMetadata()` directly
  - Reads the zone from the pod labels structure
- `slice-client-service/EndpointSliceZoneServiceInstanceListSupplier.java`
  - Uses the Kubernetes EndpointSlices API
  - Builds an IP-to-zone cache from `endpoint.getZone()` (see the sketch after this list)
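As a rough sketch of the EndpointSlice idea (not the repository's exact code): the listing call is omitted here, and the `V1EndpointSlice`/`V1Endpoint` model classes from the official Kubernetes Java client are an assumption about the types involved. The core of the approach is an IP-to-zone cache plus a same-zone filter with a fallback:

```java
// Sketch: build an IP -> zone map from discovery.k8s.io/v1 EndpointSlices and use it to
// keep only same-zone instances. How the slices are listed (and cached/refreshed) is omitted.
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import io.kubernetes.client.openapi.models.V1Endpoint;
import io.kubernetes.client.openapi.models.V1EndpointSlice;

import org.springframework.cloud.client.ServiceInstance;

public class EndpointSliceZoneCache {

    /** Maps each endpoint address (pod IP) to the zone reported in its EndpointSlice. */
    public Map<String, String> buildIpToZoneMap(List<V1EndpointSlice> slices) {
        Map<String, String> ipToZone = new HashMap<>();
        for (V1EndpointSlice slice : slices) {
            if (slice.getEndpoints() == null) {
                continue;
            }
            for (V1Endpoint endpoint : slice.getEndpoints()) {
                if (endpoint.getZone() == null || endpoint.getAddresses() == null) {
                    continue;
                }
                for (String address : endpoint.getAddresses()) {
                    ipToZone.put(address, endpoint.getZone());
                }
            }
        }
        return ipToZone;
    }

    /** Keeps only instances whose pod IP maps to the client's zone, falling back to all. */
    public List<ServiceInstance> preferSameZone(List<ServiceInstance> instances,
            Map<String, String> ipToZone, String clientZone) {
        List<ServiceInstance> sameZone = instances.stream()
                .filter(instance -> clientZone.equals(ipToZone.get(instance.getHost())))
                .collect(Collectors.toList());
        return sameZone.isEmpty() ? instances : sameZone;
    }
}
```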
Three cleanup options available:
- `./scripts/cleanup.sh` - Immediately deletes the Kind cluster.
- `./scripts/destroy-cluster.sh` - Asks for confirmation before deleting the cluster.
- `./scripts/cleanup-all.sh` - Deletes the cluster and removes all Docker images for this project.
See CLEANUP_GUIDE.md for detailed information on each cleanup option.
From the Spring Cloud Kubernetes Documentation:
<!-- For Kubernetes Java Client Implementation -->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-kubernetes-client-loadbalancer</artifactId>
</dependency>
<!-- Required for ReactiveDiscoveryClient -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webflux</artifactId>
</dependency>

Important: The `spring-boot-starter-webflux` dependency is required to provide the `ReactiveDiscoveryClient` bean that Spring Cloud LoadBalancer uses for service discovery, even in non-reactive applications.
- POD vs SERVICE Mode
  - POD mode: Uses DiscoveryClient to find pod endpoints (enables zone-aware routing)
  - SERVICE mode: Uses Kubernetes service DNS (simpler, but no zone awareness)
- Zone Preference
  - Spring Cloud LoadBalancer checks the `spring.cloud.loadbalancer.zone` property
  - Matches it against the `topology.kubernetes.io/zone` label on pods
  - Prefers pods in the same zone, but can fall back to other zones if needed
- Fast Development
  - Kind allows running Kubernetes locally without cloud resources
  - Loading images into Kind is much faster than pushing to a registry
  - Rolling restarts allow testing without a full redeployment
Error:
required a bean of type 'org.springframework.cloud.client.discovery.ReactiveDiscoveryClient' that could not be found
Solution: Add the spring-boot-starter-webflux dependency to your pom.xml:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webflux</artifactId>
</dependency>

This is required because Spring Cloud LoadBalancer uses reactive components internally, even in non-reactive applications.
# Check pod status
kubectl get pods -n lb-demo
# Check logs
kubectl logs -n lb-demo <pod-name>
# Describe pod for events
kubectl describe pod -n lb-demo <pod-name>

# Check if discovery is enabled
kubectl exec -n lb-demo <client-pod> -- env | grep KUBERNETES
# Verify RBAC permissions
kubectl get clusterrolebinding spring-cloud-kubernetes-role-binding
# Check Spring Cloud logs
kubectl logs -n lb-demo <client-pod> | grep "LoadBalancer\|Discovery"

- Verify pod labels: `kubectl get pods -n lb-demo -L topology.kubernetes.io/zone`
- Check the ZONE environment variable in the pods
- Ensure `ZonePreferenceServiceInstanceListSupplier` is configured
Issue: After rebuilding and redeploying, Kubernetes doesn't pick up the new images.
Cause: When using the `latest` tag with local Kind images, Kubernetes doesn't detect image changes.
Solution 1 - Automatic (Recommended):
The build-and-deploy.sh script now automatically restarts deployments after loading new images.
Solution 2 - Manual restart:
# Force restart all deployments
./scripts/restart-deployments.sh
# Or restart individual services
kubectl rollout restart deployment/client-service-zone-a -n lb-demo
kubectl rollout restart deployment/client-service-zone-b -n lb-demo

Solution 3 - Delete and recreate pods:
kubectl delete pods -n lb-demo -l app=client-service
kubectl delete pods -n lb-demo -l app=sample-service

Note: The manifests use `imagePullPolicy: Never` for local Kind development, which tells Kubernetes to only use images already present in the cluster.
- Service Mesh Alternative: Consider Istio or Linkerd for production zone-aware routing
- Metrics: Add Prometheus metrics to track cross-zone traffic
- Fallback Strategy: Configure fallback behavior when no same-zone instances are available
- Health Checks: Ensure proper health checks to avoid routing to unhealthy pods
- Testing: Use this local setup to test zone failure scenarios
Feel free to modify and extend this demo for your needs. Common extensions:
- Add circuit breakers with Resilience4j
- Add tracing with Spring Cloud Sleuth
- Implement weighted load balancing
- Add chaos engineering tests (kill pods in one zone)
Happy coding!
If you have questions or improvements, feel free to open an issue or PR.