Remove dependency for kube-rbac-proxy and add secure-metrics feature #3833

Merged: 12 commits into main from remove/rbac-proxy on Mar 8, 2024

Collaborator

@super-harsh super-harsh commented Feb 28, 2024

Closes #3741

What this PR does / why we need it:

This PR removes the dependency on kube-rbac-proxy for securely serving metrics.
Instead, ASO now has a secure-metrics feature flag that users can toggle to serve metrics over HTTPS.

Bonus: pprof handlers have been added as well.

If applicable:

  • this PR contains documentation
  • this PR contains tests
  • this PR contains YAML Samples


codecov-commenter commented Feb 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 53.40%. Comparing base (1650813) to head (99969f3).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3833      +/-   ##
==========================================
- Coverage   53.42%   53.40%   -0.03%     
==========================================
  Files        1521     1521              
  Lines      546189   546189              
==========================================
- Hits       291781   291665     -116     
- Misses     209457   209573     +116     
  Partials    44951    44951              


Member

@theunrepentantgeek theunrepentantgeek left a comment


Looks good, but needs a couple minor tweaks. Approving so you can merge when they're done.

@@ -7,7 +7,7 @@ The metrics exposed fall into two groups: Azure based metrics, and reconciler me

## Toggling the metrics

-By default, metrics for ASOv2 are turned on and can be toggled by the following options:
+By default, secure metrics for ASOv2 are turned on and can be toggled by the following options:
Member


Are they really turned on, given the option below says default: false ?


- Use the settings below in your deployment:

- #### ASOv2 Helm Chart
Member


I don't think this bullet point needs to be a heading - and if it were a heading, it would be 3rd level, not 4th.

Suggest changing to plain text.

Comment on lines 45 to 46
#grep -E $KUBE_RBAC_PROXY "$GEN_FILES_DIR"/*_deployment_* > /dev/null # Ensure that what we're about to try to replace actually exists (if it doesn't we want to fail)
#sed -i "s@$KUBE_RBAC_PROXY.*@{{.Values.image.kubeRBACProxy}}@g" "$GEN_FILES_DIR"/*_deployment_*
Member


Commented out - either restore or delete.

@@ -92,6 +92,9 @@ image:
# 'address' field defines the metrics binding address on which metrics
metrics:
  enable: true
  # secureMetrics controls whether metrics should be served via 'http' or 'https'.
  # Flagging secureMetrics as true would use https
  secureMetrics: false
Member


The comment in this yaml file reinforces my earlier question about whether secure metrics are turned on by default.

SecureServing: true,
FilterProvider: filters.WithAuthenticationAndAuthorization,
// Note that pprof endpoints are meant to be sensitive and shouldn't be exposed publicly.
ExtraHandlers: map[string]http.Handler{
Member


We probably should have a specific cmdline flag that enables this and it should default to off.

This is what apiserver does via the --profiling argument, though it defaults it to true.

I'd argue we should have both metrics.secure and metrics.profiling flags in Helm, and two cmdline args in ASO to control these two bits.

Secure metrics shouldn't (IMO) require that you expose pprof.
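One way to read the suggestion above is that the pprof handlers are only built when profiling is explicitly requested, independent of whether metrics are served securely. A minimal sketch, assuming a hypothetical `profilingHandlers` helper; the `net/http/pprof` functions are standard library, while the wiring of the returned map into controller-runtime's `ExtraHandlers` is left out:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/pprof"
)

// profilingHandlers is a hypothetical helper (not from the PR). It returns
// the pprof handlers only when profiling was explicitly requested, and nil
// otherwise, so secure metrics alone never expose pprof.
// Note: importing net/http/pprof also registers handlers on
// http.DefaultServeMux; this sketch never serves that mux.
func profilingHandlers(enabled bool) map[string]http.Handler {
	if !enabled {
		return nil
	}
	return map[string]http.Handler{
		"/debug/pprof/":        http.HandlerFunc(pprof.Index),
		"/debug/pprof/cmdline": http.HandlerFunc(pprof.Cmdline),
		"/debug/pprof/profile": http.HandlerFunc(pprof.Profile),
		"/debug/pprof/symbol":  http.HandlerFunc(pprof.Symbol),
		"/debug/pprof/trace":   http.HandlerFunc(pprof.Trace),
	}
}

func main() {
	// Print how many extra handlers each setting would register.
	fmt.Println("profiling off:", len(profilingHandlers(false)))
	fmt.Println("profiling on: ", len(profilingHandlers(true)))
}
```

Returning nil when profiling is off keeps the caller simple: a nil map can be assigned to an options field without any further checks.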

func getMetricsOpts(flags Flags) server.Options {
var metricsOptions server.Options

if !flags.SecureMetrics {
Member


This code doesn't seem right: it doesn't return here, so it falls through and then overwrites the metricsOptions below?

I think you want something like:

var metricsOptions
if secure {
    // secure
} else {
    // insecure
}

if profiling {
    // profiling
}

return
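The shape outlined above could look roughly like the following in Go. This is a sketch only: `Flags` and `Options` are simplified stand-ins for the PR's `Flags` type and controller-runtime's metrics `server.Options`, and the `Profiling` field, the `ExtraPaths` field, and the `:8443` address are assumptions made for illustration.

```go
package main

import "fmt"

// Flags is a simplified stand-in for the controller's parsed flags.
type Flags struct {
	MetricsAddr   string
	SecureMetrics bool
	Profiling     bool // hypothetical field for this sketch
}

// Options is a simplified stand-in for controller-runtime's metrics
// server options; ExtraPaths stands in for the ExtraHandlers map.
type Options struct {
	BindAddress   string
	SecureServing bool
	ExtraPaths    []string
}

// getMetricsOpts picks exactly one of the secure/insecure branches, then
// layers profiling on top, and returns once at the end so that no branch
// can be overwritten by code that follows it.
func getMetricsOpts(flags Flags) Options {
	var opts Options
	if flags.SecureMetrics {
		opts = Options{BindAddress: flags.MetricsAddr, SecureServing: true}
	} else {
		opts = Options{BindAddress: flags.MetricsAddr, SecureServing: false}
	}

	if flags.Profiling {
		opts.ExtraPaths = append(opts.ExtraPaths, "/debug/pprof/")
	}

	return opts
}

func main() {
	opts := getMetricsOpts(Flags{MetricsAddr: ":8443", SecureMetrics: true, Profiling: true})
	fmt.Printf("%+v\n", opts)
}
```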

Member


Or if you want to only enable profiling if secure, put that bit into the secure bit, and/or (maybe better) add a check that secure is not off with profiling on and if so return an error and crash the pod.
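The validation option mentioned above might be sketched as follows. The `Flags` struct, the `validateMetricsFlags` name, and the error text are hypothetical; the PR's actual types and wording may differ.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Flags is a simplified, hypothetical stand-in for the controller's flags.
type Flags struct {
	SecureMetrics bool
	Profiling     bool
}

// errInsecureProfiling is returned when profiling is requested without
// secure metrics.
var errInsecureProfiling = errors.New("profiling requires secure metrics to be enabled")

// validateMetricsFlags rejects the one combination that would expose
// pprof over plain HTTP: profiling on while secure metrics is off.
func validateMetricsFlags(flags Flags) error {
	if flags.Profiling && !flags.SecureMetrics {
		return errInsecureProfiling
	}
	return nil
}

func main() {
	flags := Flags{SecureMetrics: true, Profiling: true}
	if err := validateMetricsFlags(flags); err != nil {
		// Exiting non-zero at startup makes the misconfiguration visible
		// as a crashing pod instead of a silently ignored flag.
		fmt.Fprintln(os.Stderr, "invalid flags:", err)
		os.Exit(1)
	}
	fmt.Println("flags are valid")
}
```

Failing fast at startup trades a little convenience for a configuration error that cannot go unnoticed.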

Collaborator Author


Ah, good catch. I missed the return there. I'll update.

# Flagging secure as 'true' would use https
# Refer to https://azure.github.io/azure-service-operator/guide/metrics/ for more information
secure: true
# profiling endpoints are only enabled when serving metrics securely
Member


minor: Say what profiling is (it enables /pprof endpoints), in addition to its restrictions. Though, see below about this restriction because I am not sure it is needed?

@@ -54,6 +60,9 @@ func ParseFlags(args []string) (Flags, error) {

// default here for 'MetricsAddr' is set to "0", which sets metrics to be disabled if 'metrics-addr' flag is omitted.
flagSet.StringVar(&metricsAddr, "metrics-addr", "0", "The address the metric endpoint binds to.")
flagSet.BoolVar(&secureMetrics, "secure-metrics", true, "Enable secure metrics. This secures the pprof and metrics endpoints via Kubernetes RBAC and HTTPS")
flagSet.BoolVar(&profilingMetrics, "profiling-metrics", true, "Enable pprof metrics, only enabled in conjunction with secure-metrics. This will enable serving pprof metrics endpoints")
Member


Default should probably be false
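For illustration, the suggested default could look like this. The sketch uses the standard library `flag` package as a stand-in (the flag library used in the PR is not shown in the excerpt), reuses the flag names from the diff, and shortens the help text; the `Flags` struct and `parseFlags` function are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// Flags is a simplified, hypothetical stand-in for the controller's flags.
type Flags struct {
	MetricsAddr      string
	SecureMetrics    bool
	ProfilingMetrics bool
}

// parseFlags mirrors the flag names from the diff. The only behavioral
// change in this sketch is that profiling-metrics defaults to false, so
// pprof endpoints are opt-in.
func parseFlags(args []string) (Flags, error) {
	var f Flags
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	// "0" disables metrics when metrics-addr is omitted, as in the diff.
	fs.StringVar(&f.MetricsAddr, "metrics-addr", "0", "The address the metric endpoint binds to.")
	fs.BoolVar(&f.SecureMetrics, "secure-metrics", true, "Serve metrics over HTTPS.")
	fs.BoolVar(&f.ProfilingMetrics, "profiling-metrics", false, "Serve pprof endpoints.")
	if err := fs.Parse(args); err != nil {
		return Flags{}, err
	}
	return f, nil
}

func main() {
	f, err := parseFlags([]string{"--metrics-addr=:8443"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", f)
}
```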

Member

@matthchr matthchr left a comment


Approved with one minor style comment


Follow the steps below to scrape metrics securely.

### ASOv2 Helm Chart
Member


very minor: Could use one of the table-selector thingies here, like we do for Windows versus Linux in the installation instructions?

@super-harsh super-harsh added this pull request to the merge queue Mar 8, 2024
Merged via the queue into main with commit 034c9e3 Mar 8, 2024
9 checks passed
@super-harsh super-harsh deleted the remove/rbac-proxy branch March 8, 2024 03:49

Successfully merging this pull request may close these issues.

Feature: Securely serve metrics using controller-runtime, instead of kube-rbac-proxy
4 participants