Proxy doesn't clean up after buffer sees a client close? #899

Closed
sfroment opened this issue May 4, 2018 · 9 comments


sfroment commented May 4, 2018

Hello,

I'm getting "turning buffer closed into 500" errors. With my limited understanding of Conduit, the flow of my request is: svc-b -> conduit_proxy of svc-b -> conduit_proxy of svc-a -> svc-a.
1st request, before the kill:

DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamRequestOpen(Request { id: 37, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.165:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-qgzf4\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"service": "svc-a", "pod_template_hash": "3003398764", "namespace": "internal", "deployment": "svc-a", "pod": "svc-a-74477fdcb8-qgzf4"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(100) } }
DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamRequestEnd(Request { id: 37, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.165:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-qgzf4\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"service": "svc-a", "pod_template_hash": "3003398764", "namespace": "internal", "deployment": "svc-a", "pod": "svc-a-74477fdcb8-qgzf4"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }, StreamRequestEnd { since_request_open: Duration { secs: 0, nanos: 139097 } }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(99) } }
DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamResponseOpen(Response { request: Request { id: 37, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.165:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-qgzf4\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"service": "svc-a", "pod_template_hash": "3003398764", "namespace": "internal", "deployment": "svc-a", "pod": "svc-a-74477fdcb8-qgzf4"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }, status: 200 }, StreamResponseOpen { since_request_open: Duration { secs: 4, nanos: 800338855 } }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(100) } }
DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamResponseEnd(Response { request: Request { id: 37, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.165:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-qgzf4\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"service": "svc-a", "pod_template_hash": "3003398764", "namespace": "internal", "deployment": "svc-a", "pod": "svc-a-74477fdcb8-qgzf4"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }, status: 200 }, StreamResponseEnd { grpc_status: Some(0), since_request_open: Duration { secs: 4, nanos: 800367189 }, since_response_open: Duration { secs: 0, nanos: 15201 }, bytes_sent: 1081, frames_sent: 1 }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(99) } }

2nd request, after the kill:

DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamRequestOpen(Request { id: 53, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.4.103:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-t4l9k\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"namespace": "internal", "pod_template_hash": "3003398764", "pod": "svc-a-74477fdcb8-t4l9k", "service": "svc-a", "deployment": "svc-a"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(100) } }
DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamRequestEnd(Request { id: 53, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.4.103:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-t4l9k\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"namespace": "internal", "pod_template_hash": "3003398764", "pod": "svc-a-74477fdcb8-t4l9k", "service": "svc-a", "deployment": "svc-a"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }, StreamRequestEnd { since_request_open: Duration { secs: 0, nanos: 119650 } }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(99) } }
DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamResponseOpen(Response { request: Request { id: 53, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.4.103:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-t4l9k\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"namespace": "internal", "pod_template_hash": "3003398764", "pod": "svc-a-74477fdcb8-t4l9k", "service": "svc-a", "deployment": "svc-a"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }, status: 500 }, StreamResponseOpen { since_request_open: Duration { secs: 0, nanos: 1028917 } }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(100) } }
DBUG conduit_proxy::telemetry::tap "controller-client", inspect event=StreamResponseEnd(Response { request: Request { id: 53, uri: http://svc-a.internal.svc.cluster.local/Login, method: POST, server: Server { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.1.166:34804), local: V4(127.0.0.1:4140), orig_dst: Some(V4(10.20.0.212:9001)), protocol: Http }, client: Client { proxy: Outbound(Process { scheduled_namespace: "internal", start_time: SystemTime { tv_sec: 1525372698, tv_nsec: 663213849 } }), remote: V4(10.20.4.103:9001), protocol: Http, dst_labels: Some(Watch { shared: Shared { value: RwLock { data: Some(DstLabels { formatted: "dst_namespace=\"internal\",dst_service=\"svc-a\",dst_pod=\"svc-a-74477fdcb8-t4l9k\",dst_deployment=\"svc-a\",dst_pod_template_hash=\"3003398764\"", original: {"namespace": "internal", "pod_template_hash": "3003398764", "pod": "svc-a-74477fdcb8-t4l9k", "service": "svc-a", "deployment": "svc-a"} }) }, version: AtomicUsize(0), watchers: Mutex { data: Watchers { next_id: 3, watchers: {2: WatchInner { task: AtomicTask }} } }, cancel: AtomicTask }, inner: WatchInner { task: AtomicTask }, id: 2, ver: 0 }) } }, status: 500 }, StreamResponseEnd { grpc_status: None, since_request_open: Duration { secs: 0, nanos: 1039716 }, since_response_open: Duration { secs: 0, nanos: 0 }, bytes_sent: 0, frames_sent: 0 }) with tap=Tap { match_: All([]), tx: Sender { capacity: AtomicUsize(99) } }

In the conduit proxy logs of svc-a:

DBUG tokio_core::reactor loop poll - Duration { secs: 9, nanos: 287922607 }
DBUG tokio_core::reactor loop time - Instant { tv_sec: 729258, tv_nsec: 537438748 }
DBUG tokio_core::reactor notifying a task handle
DBUG tokio_core::reactor loop process - 1 events, Duration { secs: 0, nanos: 112986 }
DBUG tokio_core::reactor loop poll - Duration { secs: 0, nanos: 8245 }
DBUG tokio_core::reactor loop time - Instant { tv_sec: 729258, tv_nsec: 537575112 }
DBUG h2::codec::framed_read received; frame=Frame::Headers(Headers { stream_id: StreamId(21), stream_dep: None, flags: HeadersFlag { end_stream: false, end_headers: true, padded: false, priority: false } })
DBUG h2::codec::framed_read received; frame=Frame::Data(Data { stream_id: StreamId(21), flags: DataFlags { end_stream: true }, pad_len: None })
DBUG tokio_core::reactor scheduling direction for: 37
DBUG tokio_core::reactor blocking
DBUG tokio_core::reactor scheduling direction for: 37
DBUG tokio_core::reactor blocking
DBUG tokio_core::reactor loop process - 1 events, Duration { secs: 0, nanos: 461020 }
DBUG tokio_core::reactor loop poll - Duration { secs: 0, nanos: 3092 }
DBUG tokio_core::reactor loop time - Instant { tv_sec: 729258, tv_nsec: 538058390 }
ERR! conduit_proxy turning buffer closed into 500

You can reproduce the error with the demo app: just scale the voting pods to 2.
I've changed the server.go file a bit so that it runs a gRPC client that makes many requests:

package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"syscall"

	pb "github.com/runconduit/conduit-examples/emojivoto/emojivoto-web/gen/proto"
	"google.golang.org/grpc"
)

var (
	webPort              = os.Getenv("WEB_PORT")
	emojisvcHost         = "emoji-svc.emojivoto.svc.cluster.local:8080"
	votingsvcHost        = "voting-svc.emojivoto.svc.cluster.local:8080"
	indexBundle          = os.Getenv("INDEX_BUNDLE")
	webpackDevServerHost = os.Getenv("WEBPACK_DEV_SERVER")
)

func main() {
	votingSvcConn := openGrpcClientConnection(votingsvcHost)
	votingClient := pb.NewVotingServiceClient(votingSvcConn)
	defer votingSvcConn.Close()

	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)
	log.Printf("Start voting")
	ctx := context.Background()
	// Hammer the voting service from 4 goroutines in a tight loop; RPC
	// errors are deliberately ignored so the load keeps flowing.
	for i := 0; i < 4; i++ {
		go func() {
			for {
				voteRequest := &pb.VoteRequest{}
				votingClient.Vote100(ctx, voteRequest)
			}
		}()
	}
	<-c
}

func openGrpcClientConnection(host string) *grpc.ClientConn {
	log.Printf("Connecting to [%s]", host)
	conn, err := grpc.Dial(host, grpc.WithInsecure())
	if err != nil {
		panic(err)
	}
	return conn
}

Then just run the program and you'll rapidly see errors on voting.

@sfroment changed the title from "error buffer closed after pod get killed" to "error buffer closed" on May 4, 2018

sfroment commented May 4, 2018

Potentially related to #887.


olix0r commented May 4, 2018

Thanks for the helpful bug report, @sfroment!

@olix0r added this to the 0.5.0 milestone on May 4, 2018

sfroment commented May 5, 2018

The voting-svc error:

goroutine 121 [running]:
runtime.throw(0x86f4ed, 0x15)
	/usr/local/Cellar/go/1.9/libexec/src/runtime/panic.go:605 +0x95 fp=0xc420199780 sp=0xc420199760 pc=0x42a315
runtime.mapassign_faststr(0x7e6f00, 0xc4200597a0, 0x865d5d, 0x5, 0xb02600)
	/usr/local/Cellar/go/1.9/libexec/src/runtime/hashmap_fast.go:685 +0x4da fp=0xc420199800 sp=0xc420199780 pc=0x40cbea
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/voting.(*inMemoryPoll).Vote(0xc42000c058, 0x865d5d, 0x5, 0xc420199938, 0x410517)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/voting/poll.go:34 +0x250 fp=0xc420199878 sp=0xc420199800 pc=0x4d0110
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api.(*PollServiceServer).vote(0xc42000eff0, 0x865d5d, 0x5, 0x7e95a0, 0xc420163a01, 0xb01868)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api/api.go:17 +0x4b fp=0xc4201998c0 sp=0xc420199878 pc=0x77a1bb
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api.(*PollServiceServer).Vote100(0xc42000eff0, 0xa9df60, 0xc4201e3650, 0xb01868, 0xc42000eff0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api/api.go:406 +0x40 fp=0xc420199900 sp=0xc4201998c0 pc=0x77cc70
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/gen/proto._VotingService_Vote100_Handler(0x864b60, 0xc42000eff0, 0xa9df60, 0xc4201e3650, 0xc420163ab0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/gen/proto/Voting.pb.go:2968 +0x276 fp=0xc420199978 sp=0xc420199900 pc=0x773196
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e8f00, 0xc420059920, 0xadeb60, 0x0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:900 +0x9d3 fp=0xc420199d90 sp=0xc420199978 pc=0x7565c3
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).handleStream(0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e8f00, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:1122 +0x1528 fp=0xc420199f80 sp=0xc420199d90 pc=0x75a6d8
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc420160020, 0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e8f00)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:617 +0x9f fp=0xc420199fb8 sp=0xc420199f80 pc=0x75c0cf
runtime.goexit()
	/usr/local/Cellar/go/1.9/libexec/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc420199fc0 sp=0xc420199fb8 pc=0x457ca1
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:615 +0xa1

goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7fe463023f70, 0x72, 0xffffffffffffffff)
	/usr/local/Cellar/go/1.9/libexec/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc420134018, 0x72, 0xc420037b00, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_poll_runtime.go:85 +0xae
internal/poll.(*pollDesc).waitRead(0xc420134018, 0xffffffffffffff00, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Accept(0xc420134000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_unix.go:334 +0x1e2
net.(*netFD).accept(0xc420134000, 0xc420142000, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc42000c060, 0x754212, 0x4547c0, 0xc420037db8)
	/usr/local/Cellar/go/1.9/libexec/src/net/tcpsock_posix.go:136 +0x2e
net.(*TCPListener).Accept(0xc42000c060, 0x885c08, 0xc420001380, 0xaa02c0, 0xc420142000)
	/usr/local/Cellar/go/1.9/libexec/src/net/tcpsock.go:247 +0x49
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).Serve(0xc420001380, 0xa9d420, 0xc42000c060, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:495 +0x196
main.main()
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/cmd/server.go:33 +0x25f

goroutine 17 [IO wait]:
internal/poll.runtime_pollWait(0x7fe463023eb0, 0x72, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc42013e018, 0x72, 0xffffffffffffff00, 0xa997e0, 0xa95348)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_poll_runtime.go:85 +0xae
internal/poll.(*pollDesc).waitRead(0xc42013e018, 0xc42014c000, 0x8000, 0x8000)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Read(0xc42013e000, 0xc42014c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_unix.go:125 +0x18a
net.(*netFD).Read(0xc42013e000, 0xc42014c000, 0x8000, 0x8000, 0x404050, 0xc42014bd38, 0xc4201e3950)
	/usr/local/Cellar/go/1.9/libexec/src/net/fd_unix.go:202 +0x52
net.(*conn).Read(0xc420142000, 0xc42014c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/net/net.go:176 +0x6d
bufio.(*Reader).Read(0xc42014a000, 0xc42015c038, 0x9, 0x9, 0xc4201480c0, 0x10000c420146000, 0x886610)
	/usr/local/Cellar/go/1.9/libexec/src/bufio/bufio.go:213 +0x30b
io.ReadAtLeast(0xa96a20, 0xc42014a000, 0xc42015c038, 0x9, 0x9, 0x9, 0x743716, 0xc4201e3980, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/io/io.go:309 +0x86
io.ReadFull(0xa96a20, 0xc42014a000, 0xc42015c038, 0x9, 0x9, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/io/io.go:327 +0x58
github.com/runconduit/conduit-examples/emojivoto/vendor/golang.org/x/net/http2.readFrameHeader(0xc42015c038, 0x9, 0x9, 0xa96a20, 0xc42014a000, 0x0, 0x1000100000000, 0xdd00000005, 0x500010001)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/golang.org/x/net/http2/frame.go:237 +0x7b
github.com/runconduit/conduit-examples/emojivoto/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc42015c000, 0xc420140090, 0xc420140090, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/golang.org/x/net/http2/frame.go:492 +0xa4
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.(*http2Server).HandleStreams(0xc420146300, 0xc420140210, 0x885c40)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/http2_server.go:390 +0x3a
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams(0xc420001380, 0xaa00e0, 0xc420146300)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:613 +0x140
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).handleRawConn(0xc420001380, 0xaa02c0, 0xc420142000)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:574 +0x5e0
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).Serve
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:535 +0x632

goroutine 19 [runnable]:
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.loopyWriter(0xa9dea0, 0xc420148080, 0xc420140180, 0xc420039fb8)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/transport.go:719 +0x3c6
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.newHTTP2Server.func1(0xc420146300)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/http2_server.go:256 +0x5e
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.newHTTP2Server
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/http2_server.go:255 +0xe3f

goroutine 20 [select]:
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.(*http2Server).keepalive(0xc420146300)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/http2_server.go:951 +0x264
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.newHTTP2Server
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/http2_server.go:259 +0xe64

goroutine 120 [runnable]:
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.(*recvBufferReader).read(0xc4201bdef0, 0xc4201d5170, 0x5, 0x5, 0x429144, 0xc42003b798, 0x4292a8)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/transport.go:133 +0x28b
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.(*recvBufferReader).Read(0xc4201bdef0, 0xc4201d5170, 0x5, 0x5, 0x3, 0xc420019300, 0xc420019370)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/transport.go:122 +0x67
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.(*transportReader).Read(0xc4201e34a0, 0xc4201d5170, 0x5, 0x5, 0x46135e, 0xc4201e8d78, 0xc420148d80)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/transport.go:387 +0x55
io.ReadAtLeast(0xa974e0, 0xc4201e34a0, 0xc4201d5170, 0x5, 0x5, 0x5, 0xc420146300, 0xc4201e8c80, 0x5)
	/usr/local/Cellar/go/1.9/libexec/src/io/io.go:309 +0x86
io.ReadFull(0xa974e0, 0xc4201e34a0, 0xc4201d5170, 0x5, 0x5, 0x0, 0x0, 0xc42003b938)
	/usr/local/Cellar/go/1.9/libexec/src/io/io.go:327 +0x58
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport.(*Stream).Read(0xc4201e8c80, 0xc4201d5170, 0x5, 0x5, 0x60000000000, 0x8, 0x18)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/transport/transport.go:371 +0xbf
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*parser).recvMsg(0xc4201d5160, 0x400000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/rpc_util.go:285 +0x65
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e8c80, 0xc420059920, 0xadeb60, 0x0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:823 +0x2a2
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).handleStream(0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e8c80, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:1122 +0x1528
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc420160020, 0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e8c80)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:617 +0x9f
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:615 +0xa1

goroutine 122 [runnable]:
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc420160020, 0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e9180)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:615
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:615 +0xa1

goroutine 123 [runnable]:
syscall.Syscall(0x1, 0x2, 0xc42018c000, 0x4a, 0x4a, 0x4a, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/syscall/asm_linux_amd64.s:18 +0x5
syscall.write(0x2, 0xc42018c000, 0x4a, 0x80, 0xc420175501, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/syscall/zsyscall_linux_amd64.go:1029 +0x55
syscall.Write(0x2, 0xc42018c000, 0x4a, 0x80, 0xc420175698, 0x410517, 0x32)
	/usr/local/Cellar/go/1.9/libexec/src/syscall/syscall_unix.go:181 +0x49
internal/poll.(*FD).Write(0xc4200600a0, 0xc42018c000, 0x4a, 0x80, 0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/internal/poll/fd_unix.go:218 +0x14f
os.(*File).write(0xc42000c020, 0xc42018c000, 0x4a, 0x80, 0xc4201756d0, 0x4494a8, 0xc4201756f8)
	/usr/local/Cellar/go/1.9/libexec/src/os/file_unix.go:233 +0x4e
os.(*File).Write(0xc42000c020, 0xc42018c000, 0x4a, 0x80, 0xae2120, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/os/file.go:140 +0x72
log.(*Logger).Output(0xc4200600f0, 0x2, 0xc42005ccc0, 0x35, 0x0, 0x0)
	/usr/local/Cellar/go/1.9/libexec/src/log/log.go:176 +0x20b
log.Printf(0x87be96, 0x33, 0xc420175848, 0x2, 0x2)
	/usr/local/Cellar/go/1.9/libexec/src/log/log.go:299 +0x80
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/voting.(*inMemoryPoll).Vote(0xc42000c058, 0x865d5d, 0x5, 0xc420175938, 0x410517)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/voting/poll.go:36 +0x1f7
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api.(*PollServiceServer).vote(0xc42000eff0, 0x865d5d, 0x5, 0x7e95a0, 0xc42018f801, 0xb01868)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api/api.go:17 +0x4b
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api.(*PollServiceServer).Vote100(0xc42000eff0, 0xa9df60, 0xc4201e3a10, 0xb01868, 0xc42000eff0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/api/api.go:406 +0x40
github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/gen/proto._VotingService_Vote100_Handler(0x864b60, 0xc42000eff0, 0xa9df60, 0xc4201e3a10, 0xc42018f810, 0x0, 0x0, 0x0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/emojivoto-voting-svc/gen/proto/Voting.pb.go:2968 +0x276
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e9400, 0xc420059920, 0xadeb60, 0x0, 0x0, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:900 +0x9d3
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).handleStream(0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e9400, 0x0)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:1122 +0x1528
github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc420160020, 0xc420001380, 0xaa00e0, 0xc420146300, 0xc4201e9400)
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:617 +0x9f
created by github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
	/Users/mars/workspace/go/src/github.com/runconduit/conduit-examples/emojivoto/vendor/google.golang.org/grpc/server.go:615 +0xa1

Then it restarts fine, but Conduit doesn't communicate with the new server.

@seanmonstar
Contributor

The "buffer closed" error is this one right here:https://github.com/tower-rs/tower/blob/master/tower-buffer/src/lib.rs#L129

We wrap the h2 client service to the endpoint in a Buffer, and it has noticed that the wrapped client has been disconnected. Beneath that, we also wrap the service in a Reconnect, which should handle the disconnect and connect again automatically. Looking at that code, though, it looks like it will fail out if the new reconnect task fails. A rough sketch of the resulting failure mode follows.
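To make that concrete, here is a hypothetical, self-contained Go sketch of the behavior being described (not the proxy's actual Rust/tower code; the buffer and client types are invented for illustration). Once the wrapped client reports a disconnect, the buffer latches a "closed" state and every later request comes back as an error, which the proxy then turns into a 500:

package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBufferClosed stands in for the tower-buffer "Closed" error that the
// proxy turns into an HTTP 500.
var errBufferClosed = errors.New("buffer closed")

// client is a stand-in for the wrapped h2 client connection to the endpoint.
type client struct {
	mu   sync.Mutex
	dead bool
}

func (c *client) call(req string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.dead {
		return "", errors.New("connection reset by peer")
	}
	return "200 OK for " + req, nil
}

// buffer wraps the client. Once it observes a failed call it latches a
// closed flag and rejects every subsequent request -- nothing ever resets
// it, which is the behavior described in this issue.
type buffer struct {
	inner  *client
	mu     sync.Mutex
	closed bool
}

func (b *buffer) call(req string) (string, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.closed {
		return "", errBufferClosed
	}
	resp, err := b.inner.call(req)
	if err != nil {
		b.closed = true // latch: nothing below us ever clears this
		return "", errBufferClosed
	}
	return resp, nil
}

func main() {
	c := &client{}
	b := &buffer{inner: c}

	fmt.Println(b.call("/Login")) // 200 OK
	c.dead = true                 // the endpoint pod is killed
	fmt.Println(b.call("/Login")) // buffer latches closed -> proxy returns 500
	c.dead = false                // a replacement pod becomes ready
	fmt.Println(b.call("/Login")) // still "buffer closed": the proxy never recovers
}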


wmorgan commented May 7, 2018

@seanmonstar so to confirm, this is a bug? If so, can you update the title with a better description, and we can add it to the 0.5.0 milestone?

@seanmonstar
Contributor

I actually suspect there is a bug here: while it's fine for the connect task to error, and expected if the other service isn't accepting connections, I think we might not properly clean up on our end when that happens.

When this error shows itself, does it keep erroring forever? That would mean we saw the closure, and the closed buffer binding is still sitting in the proxy yelling at everyone. If it only appears for a tiny window of time, then it's probably not a bug: the proxy does clean up, and in the interim it has to say something...

@seanmonstar changed the title from "error buffer closed" to "Proxy doesn't clean up after buffer sees a client close?" on May 8, 2018

sfroment commented May 9, 2018

Yep, for this pod the proxy will still be yelling; I need to kill it to get it working again.

@seanmonstar
Contributor

So we probably need to do 2 things here:

  1. Better surface the connect error, since the buffer error message is near worthless ;)
  2. Determine the correct behavior here. If there is a connect error, what should be done? We likely want to keep trying to connect in the future, but we'll also eventually need to reply to the queued HTTP requests about the failures. Hm, perhaps @olix0r can check me on this, but: I think we may want to just keep trying to connect, without trashing the buffer or destroying its queue of requests. We'd log the connect errors for ourselves, and keep trying as long as the buffer has requests to fulfill. From outside the buffer, we'd just apply a timeout to the requests, so they don't sit there forever (see the sketch below). Is that reasonable?
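A rough Go sketch of that proposed behavior, under the assumption that the connect loop (rather than the buffer) owns retries and the caller bounds the wait with a deadline; dial, the attempt count, and the timing values are invented for illustration:

package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dial is a made-up connect function; it fails until the replacement
// endpoint becomes ready (here, on the third attempt).
var attempts int

func dial() error {
	attempts++
	if attempts < 3 {
		return errors.New("connection refused")
	}
	return nil
}

// callWithReconnect keeps retrying the connection instead of tearing down
// the buffer and failing its queued requests. The context deadline plays
// the role of the proxy's bind/request timeout: only when it expires does
// the caller see a failure.
func callWithReconnect(ctx context.Context, req string) (string, error) {
	backoff := 100 * time.Millisecond
	for {
		if err := dial(); err == nil {
			return "200 OK for " + req, nil
		}
		// Connect failed: log it for ourselves and wait before retrying,
		// unless the request's deadline has already passed.
		select {
		case <-ctx.Done():
			return "", fmt.Errorf("request timed out while reconnecting: %v", ctx.Err())
		case <-time.After(backoff):
			backoff *= 2
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	fmt.Println(callWithReconnect(ctx, "/Vote100"))
}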


olix0r commented May 9, 2018

@seanmonstar Yeah, that's my expectation too: connection failures should result in backpressure until a connection can be satisfied or the bind timeout is reached.

@olix0r added this to "In progress" in Production Readiness on May 14, 2018
Production Readiness automation moved this from In progress to Done May 17, 2018
github-actions bot locked as resolved and limited conversation to collaborators on Jul 18, 2021