Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

non-leader metrics-prometheus-collector eagerly computes metrics #42

Merged
merged 1 commit into from
Apr 10, 2023

Conversation

pwittrock
Copy link
Contributor

  • sampler pushes metrics to all collector replicas
  • non-leader collector computes and calculates metrics, but doesn't publish

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 31, 2023
@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 31, 2023
@pwittrock pwittrock force-pushed the startup branch 2 times, most recently from 4114c28 to f86192b Compare March 31, 2023 22:40
Copy link
Contributor

@ndipebot ndipebot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 31, 2023
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 31, 2023
@pwittrock pwittrock force-pushed the startup branch 2 times, most recently from c7f834b to f76c72d Compare April 3, 2023 15:59
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2023
@pwittrock pwittrock force-pushed the startup branch 4 times, most recently from ecf0bf5 to 0917237 Compare April 4, 2023 22:47
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 4, 2023
@pwittrock pwittrock force-pushed the startup branch 4 times, most recently from ebe9d62 to d958175 Compare April 4, 2023 23:39
@logicalhan
Copy link

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 6, 2023
Comment on lines 203 to 206
if options.LeaderOptions.LeaderElection {
metrics.Col.IsLeaderElected.Store(false)
} else {
metrics.Col.IsLeaderElected.Store(true)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if options.LeaderOptions.LeaderElection {
metrics.Col.IsLeaderElected.Store(false)
} else {
metrics.Col.IsLeaderElected.Store(true)
}
metrics.Col.IsLeaderElected.Store(!options.LeaderOptions.LeaderElection)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2023
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 10, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 10, 2023
@pwittrock pwittrock force-pushed the startup branch 2 times, most recently from 00854f1 to cbffb8b Compare April 10, 2023 18:54
@pwittrock
Copy link
Contributor Author

Had to significantly re-work this after thoroughly testing it in a cluster. We should follow up with some e2e tests that run for ~hours and do things like rolling updates to ensure a proper handoff. That is a non-trivial task though, and shouldn't block fixing the issue.

ms.doLeaderElection(ctx)
} else {
electedMetric.WithLabelValues(os.Getenv("POD_NAME")).Set(1)
ms.Col.IsLeaderElected.Store(true)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line seems redundant, since we already set this to !options.LeaderElection.

pkg/api/samplerserverv1alpha1/types.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ndipebot ndipebot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved with nits

cmd/metrics-prometheus-collector/cmd/cmd.go Outdated Show resolved Hide resolved
cmd/metrics-prometheus-collector/cmd/cmd.go Outdated Show resolved Hide resolved
pkg/api/samplerserverv1alpha1/types.go Outdated Show resolved Hide resolved
pkg/collector/collector.go Show resolved Hide resolved
* sampler pushes metrics to all collector replicas
  * samplers query DNS and headless service at startup and periodically -- fix periodic registration delay from collectors
  * collectors explicitly register themselves at startup and periodically -- fix DNS propagation delay
* sampler sets pod name, namespace and container name from container labels when using containerd
* non-leader collector computes and calculates metrics, but doesn't publish to prometheus
@ndipebot
Copy link
Contributor

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 10, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ndipebot, pwittrock

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 72d3176 into kubernetes-sigs:main Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants