gRPC streaming keepAlive doesn't work with docker swarm #2549

Closed
npuichigo opened this issue Jan 6, 2019 · 7 comments

@npuichigo npuichigo commented Jan 6, 2019

Recently, I used the grpc-gateway to transcode HTTP/JSON to grpc. The gRPC-Gateway uses a gRPC client to talk to the backend, but in a docker swarm setup using overlay networks, idle connections between grpc client and backend service will end up in a broken state after 15 minutes. (moby/moby#31208)

So I tried the following two ways. Firstly, I set net.ipv4.tcp_keepalive_time to less than 900 seconds, to make sure the TCP connection between grpc-gateway and IPVS doesn't expire. Secondly, I configured grpc.WithKeepaliveParams to set the ping time interval to 30 seconds. However, neither method works.

Here is the code snippet of my client:

ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()

mux := runtime.NewServeMux()
opts := []grpc.DialOption{
	grpc.WithInsecure(),
	grpc.WithKeepaliveParams(keepalive.ClientParameters{
		Time:                30 * time.Second,
		Timeout:             20 * time.Second,
		PermitWithoutStream: true,
	}),
}
err := gw.RegisterMyHandlerFromEndpoint(ctx, mux, *echoEndpoint, opts)

Is there any way to configure the grpc-go client to support docker swarm?

@mastersingh24 mastersingh24 commented Jan 6, 2019

Did you also set https://godoc.org/google.golang.org/grpc/keepalive#EnforcementPolicy on the server? The default permitted value for time between client pings is 5 minutes ... if you set your client to ping every 30 seconds you'll need to set a matching policy on the gRPC server as well.
We've been using the gRPC keepalives on both the client and server in production for quite some time now. The main thing is making sure you get the settings correct on both sides.

@npuichigo npuichigo commented Jan 6, 2019

Thanks for the reminder, but my gRPC server is written in C++. Is there a counterpart of EnforcementPolicy in gRPC C++?

@mastersingh24 mastersingh24 commented Jan 6, 2019

I believe the C++ server still wraps gRPC core. You can find the settings here:
https://github.com/grpc/grpc/blob/master/doc/keepalive.md

Make sure to pay attention to the GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS, GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA, and GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS settings.

@npuichigo npuichigo commented Jan 6, 2019

I tried the following configuration on the C++ server and the Go client, so that the server's minimum permitted ping interval (5 minutes by default) is smaller than the client's ping interval, but it still doesn't work.

grpc::ServerBuilder builder;
// Listen on the given address without any authentication mechanism.
builder.AddListeningPort(FLAGS_address, grpc::InsecureServerCredentials());
// Enable to send HTTP2 keepalive pings over the transport.
builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);
builder.AddChannelArgument(GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA, 0);

ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()

mux := runtime.NewServeMux()
opts := []grpc.DialOption{
	grpc.WithInsecure(),
	grpc.WithKeepaliveParams(keepalive.ClientParameters{
		Time:                10 * time.Minute,
		Timeout:             20 * time.Second,
		PermitWithoutStream: true,
	}),
}
err := gw.RegisterMyHandlerFromEndpoint(ctx, mux, *echoEndpoint, opts)

@mastersingh24 mastersingh24 commented Jan 6, 2019

You definitely need to ping more often than every 10 minutes when using Docker / Docker Swarm. The Docker proxies time out after 10 minutes. We use 5 * time.Minute for the client keepalive time; you can also try 6. Anything less than 10 should keep idle connections open with Docker Swarm (or just Docker in general).

@npuichigo npuichigo closed this Jan 8, 2019

@mastersingh24 mastersingh24 commented Jan 9, 2019

@npuichigo - hopefully you were able to resolve this?

@npuichigo npuichigo commented Jan 9, 2019

@mastersingh24 - Yes, the problem has been solved. Thank you for your concern.

@lock lock bot locked as resolved and limited conversation to collaborators Jul 8, 2019