Minor refactoring on the keepAliveListener and keepAliveConn #14356

Merged on Aug 17, 2022 (2 commits)

ahrtr (Member) commented Aug 17, 2022

Two issues:

  1. When etcd is started with ./etcd --socket-reuse-address --socket-reuse-port, the calls to SetKeepAlive or SetKeepAlivePeriod actually fail, but the current code ignores the errors. This can happen on both Linux and Mac.
  2. When running the functional test on Mac, etcd always hits the panic below. This only happens on Mac:
panic: interface conversion: transport.timeoutConn is not transport.keepAliveConn: missing method SetKeepAlive

goroutine 295 [running]:
go.etcd.io/etcd/client/pkg/v3/transport.(*keepaliveListener).Accept(0x1f)
        /Users/wachao/go/src/go.etcd.io/etcd/client/pkg/transport/keepalive_listener.go:53 +0x51
github.com/soheilhy/cmux.(*cMux).Serve(0xc00057c3c0)
        /Users/wachao/go/gopath/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:170 +0xa9
go.etcd.io/etcd/server/v3/embed.(*serveCtx).serve(0xc0001b51f0, 0xc00037c300, 0xc0001a6868, {0x1d789c0, 0xc0003a2400}, 0xc0002f9250, {0xc0002a4280, 0x2, 0x2})
        /Users/wachao/go/src/go.etcd.io/etcd/server/embed/serve.go:220 +0x191d
go.etcd.io/etcd/server/v3/embed.(*Etcd).serveClients.func1(0xc000280ba0)
        /Users/wachao/go/src/go.etcd.io/etcd/server/embed/etcd.go:735 +0xa8
created by go.etcd.io/etcd/server/v3/embed.(*Etcd).serveClients
        /Users/wachao/go/src/go.etcd.io/etcd/server/embed/etcd.go:734 +0x5f6
,"--socket-reuse-address=true","--socket-reuse-port=true"]}

Analysis and solution

Currently etcd defines a keepAliveConn interface, and
limitListenerConn implements it.

The high-level relationship is roughly as shown in the diagram below (excluding unix(s) and TLS for simplicity):
[Figure: etcd_listener diagram]

There are three issues:

  1. Only net.TCPConn supports SetKeepAlive and SetKeepAlivePeriod by default. So the customized conns (previously the limitListenerConn, now the keepAliveConn) use Go's type assertion to convert the net.Conn to net.TCPConn, and then rely on the net.TCPConn to set the keepalive and keepalive period. If there are multiple layers of net.Listener wrappers, the keepalive listener must be the one closest to the original net.Listener implementation, namely TCPListener. Otherwise, the type assertion fails. This is exactly the cause of the first issue above.
  2. For some reason FDLimit isn't supported on Mac (Darwin), so the net.Listener chain doesn't include the limitListener, which is the only one that implements the keepAliveConn interface. This is exactly the cause of the failure on Mac.
  3. It doesn't make sense for limitListenerConn to support SetKeepAlive and SetKeepAlivePeriod, because it should focus on the functionality it was designed for (rate limiting).

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

ahrtr (Member Author) commented Aug 17, 2022

cc @spzala

spzala (Member) left a comment

@ahrtr great, thank you! I have a few nits inline.

client/pkg/transport/keepalive_listener.go (outdated, resolved)
client/pkg/transport/limit_listen.go (resolved)
client/pkg/transport/keepalive_listener.go (outdated, resolved)
client/pkg/transport/listener.go (outdated, resolved)

The proxy must wait for etcd to be running, but the current
implementation hard-codes the waiting time as 5 seconds. The improvement
is to dynamically check whether etcd is running, and start the
proxy once the etcd port is reachable.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
Only `net.TCPConn` supports `SetKeepAlive` and `SetKeepAlivePeriod`
by default, so if you want to wrap multiple layers of `net.Listener`,
the `keepaliveListener` should be the one closest to the
original `net.Listener` implementation, namely `TCPListener`.

Signed-off-by: Benjamin Wang <wachao@vmware.com>

ahrtr (Member Author) commented Aug 17, 2022

Thanks @spzala for the quick review, and I resolved all the comments. PTAL, thx

codecov-commenter commented

Codecov Report

Merging #14356 (7450673) into main (ff6b85d) will decrease coverage by 0.18%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main   #14356      +/-   ##
==========================================
- Coverage   75.44%   75.26%   -0.19%     
==========================================
  Files         457      457              
  Lines       37122    37139      +17     
==========================================
- Hits        28007    27952      -55     
- Misses       7361     7418      +57     
- Partials     1754     1769      +15     
Flag    Coverage Δ
all     75.26% <ø> (-0.19%) ⬇️

Impacted Files                                      Coverage Δ
server/embed/etcd.go                                75.58% <ø> (+0.24%) ⬆️
client/pkg/v3/transport/keepalive_listener.go       69.38% <0.00%> (-22.92%) ⬇️
client/pkg/v3/transport/limit_listen.go             57.14% <0.00%> (-21.43%) ⬇️
client/v3/experimental/recipes/priority_queue.go    56.66% <0.00%> (-10.00%) ⬇️
server/proxy/grpcproxy/register.go                  69.76% <0.00%> (-9.31%) ⬇️
client/pkg/v3/fileutil/lock_linux.go                72.22% <0.00%> (-8.34%) ⬇️
raft/rafttest/node.go                               95.00% <0.00%> (-5.00%) ⬇️
server/storage/wal/file_pipeline.go                 90.69% <0.00%> (-4.66%) ⬇️
client/v3/concurrency/session.go                    88.63% <0.00%> (-4.55%) ⬇️
server/etcdserver/api/rafthttp/msgappv2_codec.go    69.56% <0.00%> (-3.48%) ⬇️
... and 21 more

spzala (Member) commented Aug 17, 2022

> Thanks @spzala for the quick review, and I resolved all the comments. PTAL, thx

Thank you @ahrtr for quickly addressing the comments.

@spzala spzala merged commit ba0c7c3 into etcd-io:main Aug 17, 2022
ahrtr added a commit to ahrtr/etcd that referenced this pull request Aug 20, 2022
ahrtr added a commit to ahrtr/etcd that referenced this pull request Aug 21, 2022
…eListener and keepAliveConn

Refer to etcd-io#14366,
Also refer to etcd-io#14356

Signed-off-by: Benjamin Wang <wachao@vmware.com>
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/etcd that referenced this pull request Oct 7, 2022
tjungblu pushed a commit to tjungblu/etcd that referenced this pull request Jul 26, 2023