good practice for gateway peer loadbalancing #257

Closed
davidkel opened this issue Oct 19, 2021 · 10 comments
@davidkel
Contributor

We should provide guidance on this as part of the samples/documentation. Some considerations are:

  • load balancers in a K8s environment
  • client handling the balancing itself

Please feel free to add more thoughts and information worth capturing. Where appropriate, artifacts demonstrating this should be created:

  • e.g. samples, docs
@bestbeforetoday added the "documentation" label (Improvements or additions to documentation) on Oct 19, 2021
@bestbeforetoday
Member

These pieces of gRPC documentation may be relevant:

We should definitely have some documentation on deployment patterns and best practices with Fabric Gateway. Where appropriate, we should reference existing documentation (like the gRPC ones above) instead of duplicating or recreating that information.

@davidkel
Contributor Author

Load balancing client connections will not guarantee that the initial endorsement is balanced: the gateway decides which peer to use based on block height, although it does favour the gateway peer. Under load, the gateway's information on block height could be stale.

@davidkel
Contributor Author

davidkel commented Nov 8, 2021

@mbwhite @jkneubuh This might be a good place to capture any relevant info

@jkneubuh

jkneubuh commented Nov 8, 2021

On the Kubernetes front there are a few different approaches that we can employ to shape traffic between the gateway client and peers. All of these will provide some level of HA, failover, and traffic distribution across a set of peers.

Gateway load balancing within Kubernetes can be accomplished by establishing a Service whose label selector matches the Pods of multiple peer Deployments. Provided that the TLS certificates for the peers have been issued with a common SAN in the signing request, a single k8s Service can act as a front-end to multiple peers using a common DNS name. When a TCP connection is established from the client to the gateway peer, Kubernetes uses the Service to dynamically route it to one of the Pods bound to the Service.
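As a rough illustration of the Service described above, the sketch below builds a Service manifest as a plain object and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The field names (`selector`, `ports`, `sessionAffinity`) are standard Service fields; the Service name, the `fabric-gateway: org1` label, and port 7051 are assumptions made for the example, not values taken from this thread.

```typescript
// Sketch only: the Service name, label key/value and port are hypothetical.
type SessionAffinity = "None" | "ClientIP";

interface ServiceManifest {
  apiVersion: "v1";
  kind: "Service";
  metadata: { name: string };
  spec: {
    // Pods from every peer Deployment carrying these labels back the Service.
    selector: Record<string, string>;
    ports: { name: string; port: number; targetPort: number }[];
    sessionAffinity: SessionAffinity;
  };
}

function buildGatewayService(
  name: string,
  selector: Record<string, string>,
  port: number,
  sessionAffinity: SessionAffinity = "None",
): ServiceManifest {
  return {
    apiVersion: "v1",
    kind: "Service",
    metadata: { name },
    spec: {
      selector,
      ports: [{ name: "grpc", port, targetPort: port }],
      sessionAffinity,
    },
  };
}

// One DNS name (the Service name) in front of all peers that share the label.
const service = buildGatewayService("org1-peer-gateway", { "fabric-gateway": "org1" }, 7051);
console.log(JSON.stringify(service, null, 2));
```

The Service name here would also need to appear as a SAN in each peer's TLS certificate, as noted above, so that the client's TLS hostname check succeeds whichever peer it lands on.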

One downside of using the Kubernetes Service routing is that any finer-grain message routing, e.g. at the gRPC message layer, is not possible. Kubernetes can help with the initial assignment of a TCP connection to one of the backing peers, but once a client connection is established it will be maintained for the duration of the socket.

By default, a Kubernetes Service will use iptables to bind a client connection to a peer pod using random assignment. The pods backing a Service can be monitored via Readiness Probes, ensuring that only "ready" pods receive gRPC handshakes from the gateway client SDK.

As an alternative to iptables routing, kube-proxy can also run in IPVS mode to further shape how connections are assigned to pods:

IPVS provides more options for balancing traffic to backend Pods; these are:

rr: round-robin
lc: least connection (smallest number of open connections)
dh: destination hashing
sh: source hashing
sed: shortest expected delay
nq: never queue

In addition, a Service has a sessionAffinity attribute which can be set to "ClientIP", ensuring that connections from a particular client are routed to the same peer pod.
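To make the difference between two of the schedulers listed above concrete, here is a toy model of `rr` and `lc` selection. It is only an illustration of the selection rules, not how IPVS is implemented in the kernel, and the `Backend` type and function names are invented for the example.

```typescript
// Toy model only: Backend and both function names are hypothetical.
interface Backend {
  name: string;
  openConnections: number;
}

// rr: hand out backends in a fixed rotation, ignoring their current load.
function makeRoundRobin(backends: Backend[]): () => Backend {
  if (backends.length === 0) {
    throw new Error("no backends");
  }
  let next = 0;
  return () => {
    const chosen = backends[next % backends.length];
    if (chosen === undefined) {
      throw new Error("no backends");
    }
    next += 1;
    return chosen;
  };
}

// lc: pick the backend with the fewest open connections (first one wins ties).
function pickLeastConnections(backends: Backend[]): Backend {
  if (backends.length === 0) {
    throw new Error("no backends");
  }
  return backends.reduce((best, b) =>
    b.openConnections < best.openConnections ? b : best,
  );
}

const peers: Backend[] = [
  { name: "peer0", openConnections: 12 },
  { name: "peer1", openConnections: 3 },
  { name: "peer2", openConnections: 7 },
];

const nextPeer = makeRoundRobin(peers);
console.log(nextPeer().name, nextPeer().name, nextPeer().name, nextPeer().name);
console.log(pickLeastConnections(peers).name);
```

Because a gateway client connection stays on one peer for the life of the socket, a connection-count-aware scheduler such as `lc` may spread long-lived gRPC connections more evenly than a pure rotation, though that depends on the workload.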

Building on the Kubernetes defaults, we could consider an additional layer of traffic shaping by co-deploying a Fabric network within a service mesh, as provided by Ambassador, Istio, or Linkerd. This approach may be a little more involved and not generally applicable to all environments running a Fabric network on K8s.

In addition to finding a home on the Fabric docs site, I like the idea of including a reference deployment within the Kubernetes test network. Do we have a reference example app showing the best practices for authoring a Fabric application using the Gateway SDK? (e.g. fabric-rest-sample, but using the Gateway SDK)

@mbwhite
Member

mbwhite commented Nov 10, 2021

@jkneubuh I can create a PR on the test-network-k8s that includes the changes along these lines. Ironically they are relatively minor changes, but I guess that shows the power of K8S.

(see hyperledger/fabric-samples#532)

I believe there is an example planned for the gateway; meanwhile, I'll refer you to the LedgerMessaging IBM example.

@bestbeforetoday
Member

@mbwhite @jkneubuh Have we done all we need (or realistically plan to do for now) on this issue? If so, I'll close it. If not, what needs doing and what's the outlook on that?

@mbwhite
Member

mbwhite commented Mar 18, 2022

@bestbeforetoday I believe we've done all we can at the moment.

@jkneubuh

I still have a draft PR (lingering) open for the peer load balancing in k8s. It's got some good info but is still too "kube specific" in the context where it's currently anchored in the docs.

Mark, please leave this one open. I will connect with Josh (H) on finding the correct page / site for the doc content.

@bestbeforetoday
Member

bestbeforetoday commented May 20, 2022

Another potentially useful snippet of information on how to configure client-side load balancing over a set of IP addresses using the Node gRPC client:

grpc/grpc-node#1307 (comment)

A key piece of information is that grpc-js now supports ipv4: and ipv6: address schemes, which allow multiple target IP addresses to be specified for a client connection.

https://github.com/grpc/grpc/blob/master/doc/naming.md

This should probably be included in a sample somewhere, or at least just in the API docs.
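A minimal sketch of what such a sample might contain follows. The `ipv4:addr:port,addr:port` target syntax comes from the gRPC naming document linked above, and `grpc.service_config` and `grpc.ssl_target_name_override` are existing gRPC channel options; the helper names, IP addresses, port, and host name are hypothetical. The block deliberately avoids importing `@grpc/grpc-js` so that it runs on its own.

```typescript
// Sketch only: helper names, addresses and host name are hypothetical.
interface PeerAddress {
  host: string; // IPv4 literal, e.g. "10.0.0.11"
  port: number;
}

// Build a gRPC target of the form "ipv4:addr:port,addr:port,...".
function buildIpv4Target(peers: PeerAddress[]): string {
  if (peers.length === 0) {
    throw new Error("at least one peer address is required");
  }
  return "ipv4:" + peers.map((p) => `${p.host}:${p.port}`).join(",");
}

// Channel options asking gRPC to round-robin across the listed addresses.
// Connecting by IP means the TLS host name check needs an explicit name,
// assumed here to be a SAN shared by all of the peers' certificates.
function buildChannelOptions(tlsHostName: string): {
  "grpc.service_config": string;
  "grpc.ssl_target_name_override": string;
} {
  return {
    "grpc.service_config": JSON.stringify({
      loadBalancingConfig: [{ round_robin: {} }],
    }),
    "grpc.ssl_target_name_override": tlsHostName,
  };
}

const target = buildIpv4Target([
  { host: "10.0.0.11", port: 7051 },
  { host: "10.0.0.12", port: 7051 },
]);
const options = buildChannelOptions("peer.org1.example.com");
console.log(target);
console.log(options);

// Assumed usage with @grpc/grpc-js and @hyperledger/fabric-gateway:
//   const client = new grpc.Client(target, grpc.credentials.createSsl(tlsRootCert), options);
//   const gateway = connect({ client, identity, signer });
```

Without the `round_robin` service config, gRPC falls back to its default `pick_first` policy, which gives fail-over across the listed addresses but does not spread requests between them.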

@bestbeforetoday
Member

Best-practice recommendations for production deployment, which discuss load balancing / fail-over, are published in the full-stack-asset-transfer-guide sample.
