kvserver: add x-region, x-zone Raft msg metrics to Store
Previously, there were no metrics to observe cross-region or cross-zone traffic
in Raft message requests sent or received at each store.

To address this, this commit adds six new store metrics:

```
"raft.rcvd.bytes"
"raft.sent.bytes"
"raft.rcvd.cross_region.bytes"
"raft.sent.cross_region.bytes"
"raft.rcvd.cross_zone.bytes"
"raft.sent.cross_zone.bytes"
```

The first two metrics track the total bytes of Raft messages received and sent
by a store. The other four track the aggregate byte count of cross-region and
cross-zone Raft messages sent and received by the store.

Note that these metrics capture the byte count of requests immediately upon
message reception and just prior to message transmission. In the case of
messages containing heartbeats or heartbeat_resps, they capture the byte count
of requests with coalesced heartbeats.
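
As a rough illustration of how the six counters relate, here is a minimal sketch. The `localityComparison` and `raftByteMetrics` types and the `recordSent`/`recordRcvd` helpers are hypothetical names, not the ones used in the commit. The sketch also assumes that a cross-region message is counted only under the cross-region metric, and that the cross-zone metric covers same-region, cross-zone messages.

```go
package main

// localityComparison is a hypothetical result of comparing the localities of
// the sending and receiving nodes.
type localityComparison int

const (
	sameRegionSameZone localityComparison = iota
	sameRegionCrossZone
	crossRegion
)

// raftByteMetrics is a hypothetical stand-in for the six store counters.
type raftByteMetrics struct {
	rcvdBytes, sentBytes                       int64
	rcvdCrossRegionBytes, sentCrossRegionBytes int64
	rcvdCrossZoneBytes, sentCrossZoneBytes     int64
}

// add bumps the total counter and, depending on the locality comparison,
// at most one of the two cross-locality counters.
func add(total, crossRegionCtr, crossZoneCtr *int64, cmp localityComparison, n int64) {
	*total += n
	switch cmp {
	case crossRegion:
		*crossRegionCtr += n
	case sameRegionCrossZone:
		*crossZoneCtr += n
	}
}

// recordSent accounts for a message of n bytes just before transmission.
func (m *raftByteMetrics) recordSent(cmp localityComparison, n int64) {
	add(&m.sentBytes, &m.sentCrossRegionBytes, &m.sentCrossZoneBytes, cmp, n)
}

// recordRcvd accounts for a message of n bytes immediately upon reception.
func (m *raftByteMetrics) recordRcvd(cmp localityComparison, n int64) {
	add(&m.rcvdBytes, &m.rcvdCrossRegionBytes, &m.rcvdCrossZoneBytes, cmp, n)
}
```

Under these assumptions, every message contributes to the total counter, so the total is always at least the sum of the two cross-locality counters for the same direction.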

To facilitate metrics updating, this commit also introduces a new raft message
handler interface `OutgoingRaftMessageHandler`. This interface captures outgoing
messages right before they are sent to `raftSendQueue`. Note that the message
may not be successfully queued if the outgoing queue is full.

Resolves: #103983

Release note (ops change): Six new store metrics have been added:
"raft.rcvd.bytes"
"raft.sent.bytes"
"raft.rcvd.cross_region.bytes"
"raft.sent.cross_region.bytes"
"raft.rcvd.cross_zone.bytes"
"raft.sent.cross_zone.bytes"

These metrics are accurate only if the following assumptions hold:
- Region and zone tier keys are configured consistently across nodes.
- Within a node's locality, the region and zone tier keys are unique.
- The configuration of region and zone tiers stays consistent across nodes.
- Cross-region, same-zone traffic is impossible.
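
To see why these assumptions matter, consider a minimal sketch of a locality comparison. The `tier` type, the `lookup` and `compareLocalities` helpers, and the literal tier keys `"region"` and `"zone"` are all hypothetical, chosen for illustration; they are not the implementation in this commit.

```go
package main

// tier is a hypothetical key/value pair from a node's locality, for example
// {key: "region", value: "us-east"}.
type tier struct{ key, value string }

// lookup returns the value of the first tier with the given key. If a key
// appeared twice within one locality, only the first would be seen, which is
// why unique tier keys are assumed.
func lookup(tiers []tier, key string) (string, bool) {
	for _, t := range tiers {
		if t.key == key {
			return t.value, true
		}
	}
	return "", false
}

// compareLocalities classifies traffic between two nodes. It reports
// "undefined" when a tier key is missing on either side, which is what
// inconsistent configuration across nodes would produce.
func compareLocalities(a, b []tier) string {
	ra, okA := lookup(a, "region")
	rb, okB := lookup(b, "region")
	if !okA || !okB {
		return "undefined"
	}
	if ra != rb {
		return "cross-region"
	}
	za, okA := lookup(a, "zone")
	zb, okB := lookup(b, "zone")
	if !okA || !okB {
		return "undefined"
	}
	if za != zb {
		return "cross-zone"
	}
	return "same"
}
```

In this sketch the zone is compared only when the regions match, so two nodes in different regions that happen to share a zone name are still classified as cross-region.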
wenyihu6 committed Jun 20, 2023
1 parent 35676af commit 7c1d8ec
Showing 6 changed files with 747 additions and 9 deletions.
44 changes: 44 additions & 0 deletions pkg/kv/kvserver/client_raft_helpers_test.go
@@ -172,6 +172,50 @@ func (h *unreliableRaftHandler) HandleDelegatedSnapshot(
	return h.IncomingRaftMessageHandler.HandleDelegatedSnapshot(ctx, req)
}

type filterRaftHandlerFuncs struct {
	// If set to nil, all incoming messages are dropped. If non-nil, returning
	// true can prevent the message from being dropped.
	filterReq func(*kvserverpb.RaftMessageRequest) bool
	// If set to nil, no additional processing is applied to outgoing messages
	// from the sender's end. If non-nil, returning true can allow the additional
	// processing.
	filterReqSent func(*kvserverpb.RaftMessageRequest) bool
}

// filterRaftHandler applies the filter functions within filterRaftHandlerFuncs
// to the incoming and outgoing messages. It ensures that only messages with
// filters that evaluate to true are forwarded to the underlying store
// interface.
type filterRaftHandler struct {
	kvserver.IncomingRaftMessageHandler
	kvserver.OutgoingRaftMessageHandler
	filterRaftHandlerFuncs
}

var _ kvserver.IncomingRaftMessageHandler = &filterRaftHandler{}
var _ kvserver.OutgoingRaftMessageHandler = &filterRaftHandler{}

func (f *filterRaftHandler) HandleRaftRequest(
	ctx context.Context,
	req *kvserverpb.RaftMessageRequest,
	respStream kvserver.RaftMessageResponseStream,
) *kvpb.Error {
	if f.filterReq == nil || !f.filterReq(req) {
		return nil
	}

	return f.IncomingRaftMessageHandler.HandleRaftRequest(ctx, req, respStream)
}

func (f *filterRaftHandler) HandleRaftRequestSent(
	ctx context.Context, req *kvserverpb.RaftMessageRequest,
) {
	if f.filterReqSent == nil || !f.filterReqSent(req) {
		return
	}
	f.OutgoingRaftMessageHandler.HandleRaftRequestSent(ctx, req)
}

// testClusterStoreRaftMessageHandler exists to allow a store to be stopped and
// restarted while maintaining a partition using an unreliableRaftHandler.
type testClusterStoreRaftMessageHandler struct {