Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC WS refactor #811

Closed
wants to merge 1 commit into from
Closed

RPC WS refactor #811

wants to merge 1 commit into from

Conversation

baum
Copy link
Contributor

@baum baum commented Dec 8, 2021

RPC WS refactor

- Use channel/goroutine for sending messages
- Rework connect/reconnect logic
- Remove WS specific logic from common RPC layer

Description

Abstract

RPCConnWS implements RPCConn interface used by the RPC layer in terms of nhooyr.io/websocket transport. RPC messages are dispatched according to address URL protocol in GetConnection().

States

RPCConnWS is a WebSocket connection that is shared and multiplexed
for all concurrent requests to the same address. An RPCConnWS instance is constructed in a connected state and ready for executing RPC Calls. nil is returned if connection refused or within reconnect delay. RPCConnWS detects peer disconnect on send and receives paths enhanced by heartbeat Ping goroutine and closes gracefully.

Connection, reconnection map

NewRPCConnWS uses connection and reconnection tables mapped by the connection's address URL. The state shared by the construction and closing logic is guarded by the mutex, unlocking is insured by deferring unlock.

IO

Threads

An open connection is served by three goroutines two of those pump messages between the WebSocket and Call and Reply channels, while one is used for WS control ping-pong to improve reaction time to peer disconnect.

  1. SendMessages() pumps request messages from the Call channel into the WebSocket connection
  2. ReadMessages() handles incoming messages and dispatches responses into Reply channel by HandleResponse()
  3. SendPings() sends pings periodically to improve detection of a peer disconnect. nhooyr.io/websocket library responds to pongs automatically independently of a reading goroutine, but RPCConnWS must send pings itself.

Shared state

IO "in-flight" state is tracked using RPCConnWS PendingRequests mapped by the request's ID. Request ID counter increment, the state shared by the IO read/send/close threads is guarded by the connection mutex, unlocking is insured by deferring unlock.

@baum baum force-pushed the rpc_refactor branch 12 times, most recently from d917f82 to 3149675 Compare December 8, 2021 17:20
- Use channel/goroutine for sending messages
- Rework connect/reconnect logic
- Remove WS specific logic from common RPC layer

Signed-off-by: Alexander Indenbaum <aindenba@redhat.com>
@baum baum marked this pull request as ready for review December 8, 2021 19:59
logrus.Infof("RPC: no connection to %v, reconnect in %v at %v", address, time.Until(*reconn), reconn)
return nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you handle deleting items from reconnMap?

var (
connMap = make(map[string]*RPCConnWS)
reconnMap = make(map[string]*time.Time)
connMapLock = sync.Mutex{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something I was wondering about, do we need another lock for the reconnMap?

@@ -18,7 +18,7 @@ type RPCConnHTTP struct {
}

// NewRPCConnHTTP returns a new http connection
func NewRPCConnHTTP(r *RPC, address string) *RPCConnHTTP {
func NewRPCConnHTTP(r *RPC, address string) RPCConn {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not keep using RPCConnHTTP?

@guymguym guymguym removed their request for review April 13, 2022 07:57
@baum baum mentioned this pull request Aug 17, 2022
@dannyzaken
Copy link
Contributor

Since this PR is changing a very infrastructural module and is not addressing a specific bug, I want to avoid unintended consequences.
@baum please keep these changes and let's revisit them if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants