Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add more labels to RequestRecorder #12727

Merged
merged 5 commits into from
Apr 12, 2022
Merged

Conversation

acpana
Copy link
Contributor

@acpana acpana commented Apr 8, 2022

overview

This PR adds more labels to the RequestRecorder. The following are mandatory labels:

  • leader = {true | false | unreported}

The optional labels, given duck typing, are:

  • blocking = {true | false}
  • target_datacenter = {string}
  • locality = {forwarded | local}

These labels allow further observability for the new rpc metric(s).

todos left:

Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
agent/consul/server.go Outdated Show resolved Hide resolved
Comment on lines +69 to +70
labelsArr := flattenLabels(labels)
r.Logger.Trace(requestLogName, labelsArr...)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i did my best to not add another helper func here -- let me know if there's a easier method to enumerate keys, values as []interface{} in go 🙈

@acpana acpana requested a review from a team April 8, 2022 00:11
Copy link
Contributor

@eculver eculver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a few non-blocking suggestions.

agent/rpc/middleware/interceptors_test.go Show resolved Hide resolved
agent/consul/server.go Outdated Show resolved Hide resolved
agent/consul/server.go Outdated Show resolved Hide resolved
@@ -1344,11 +1347,13 @@ func TestServer_RPC_MetricsIntercept(t *testing.T) {
{Name: "errored", Value: "false"},
{Name: "request_type", Value: "read"},
{Name: "rpc_type", Value: "test"},
{Name: "server_role", Value: "unreported"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this get wired up with the test's (*Server).IsLeader func and thus not be "unreported"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, that's a fair point; the way i was thinking is that this kind of test needs to happen at the middleware layer 🤔

also agent/metrics_test.go should take care of the server_role/ e2e.

Copy link
Collaborator

@dhiaayachi dhiaayachi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this @FFMMM. I have few comments related to the use of IsLeader mostly.

@@ -425,6 +404,29 @@ func NewServer(config *Config, flat Deps, publicGRPCServer *grpc.Server) (*Serve
fsm: newFSMFromConfig(flat.Logger, gc, config),
}

var recorder *middleware.RequestRecorder
if flat.NewRequestRecorderFunc != nil {
recorder = flat.NewRequestRecorderFunc(serverLogger, s.IsLeader, s.config.Datacenter)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what are the implication of passing IsLeader to the request recorder in here. IsLeader go as deep as atomically reading the raft state to be able to tell and that could have sides effects that are hard to reason about (specially performance effects)
I would recommend to have a separate state for the recorder that we can update when we acquire or loose leadership.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to have a separate state for the recorder that we can update when we acquire or loose leadership.

Would you mind chatting thru this some more with me? When I thought about this, I started to think that it can get too complex keeping tabs on the leadership state for a server but I may be wrong here 🤔


On another note, I see this "pattern" (of passing the func/ a smaller interface) being loosely used in gateway code here:

type serverDelegate interface {
blockingQuery(queryOpts blockingQueryOptions, queryMeta blockingQueryResponseMeta, fn queryFn) error
IsLeader() bool
LeaderLastContact() time.Time
setDatacenterSupportsFederationStates()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were chatting offline and believe that the underlying raft.getState() per RPC call would not adversely impact performance.

We are still interested in coming up with a different way of the server telling the request recorder about its leadership state. I will backlog an issue for this.

agent/consul/server.go Outdated Show resolved Hide resolved
agent/rpc/middleware/interceptors.go Show resolved Hide resolved
}

key := keyMakingFunc(middleware.OneTwelveRPCSummary[0].Name, expectedLabels)

if _, ok := storage[key]; !ok {
// the compound key will look like: "rpc+server+call+Status.Ping+false+read+test+unreported"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can simplify the content of the key and use assertion to verify the values in here. I think in the test you can most of the time infer the right entry without having all of this in the key.

Copy link
Contributor Author

@acpana acpana Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need the name of the rpc call (otherwise the key gets overwritten in the map).

The labels help distinguish the happy path for us so we don't "fail silently" on the calls and mark the test as passed.


labels := []metrics.Label{
{Name: "method", Value: requestName},
{Name: "errored", Value: strconv.FormatBool(respErrored)},
{Name: "request_type", Value: reqType},
{Name: "rpc_type", Value: rpcType},
{Name: "server_role", Value: serverRole},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit/bikeshed:

Suggested change
{Name: "server_role", Value: serverRole},
{Name: "leader", Value: strconv.FormatBool(isLeader)},

RPC metrics will only be emitted by servers so there is no need to prefix the metric name with server. Secondly since there are only two possible values of leader and follower I think the metric might be more intuitive for someone exploring their grafana or prometheus to have the name be leader and the value be a boolean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes a lot of sense! and that's what I am changing for a new commit ⌨️

I think it would be nice to keep unreported, and subsequently keep this as a tri state

leader: {`true`, `false`, `unreported`}

for a defense in depth kind of approach. If the function is not passed to the RequestRecorder, I wouldn't think it useful or accurate to report leader: false

@acpana acpana added the backport-inactive/1.12 This release series is no longer active label Apr 11, 2022
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
@vercel vercel bot temporarily deployed to Preview – consul April 11, 2022 20:18 Inactive
@vercel vercel bot temporarily deployed to Preview – consul-ui-staging April 11, 2022 20:18 Inactive
@acpana acpana requested a review from dhiaayachi April 11, 2022 20:23
@acpana acpana mentioned this pull request Apr 11, 2022
@hc-github-team-consul-core
Copy link
Collaborator

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/635231.

@hc-github-team-consul-core
Copy link
Collaborator

🍒✅ Cherry pick of commit a46bbe8 onto release/1.12.x succeeded!

hc-github-team-consul-core pushed a commit that referenced this pull request Apr 12, 2022
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-inactive/1.12 This release series is no longer active
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants