Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Router stats-port=0 error #16621

Merged
merged 1 commit into from Oct 7, 2017

Conversation

pecameron
Copy link
Contributor

@pecameron pecameron commented Sep 29, 2017

When type=haproxy-router, stats-port must not be 0.

Fixes bug: 1466133
https://bugzilla.redhat.com/show_bug.cgi?id=1466133

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 29, 2017
@pecameron
Copy link
Contributor Author

@knobunc PTAL

pecameron added a commit to pecameron/openshift-docs that referenced this pull request Oct 2, 2017
STATS_PORT=0 when ROUTER_METRICS_TYPE=haproxy is not supported.

Change STATS_PORT description, add ROUTER_METRICS_TYPE description.

Fixes bug 1466133
https://bugzilla.redhat.com/show_bug.cgi?id=1466133

origin PR 16621
openshift/origin#16621
@@ -248,6 +248,9 @@ func (o *TemplateRouterOptions) Run() error {
statsPort := o.StatsPort
switch {
case o.MetricsType == "haproxy":
if statsPort == 0 {
return fmt.Errorf("haproxy metrics requires a nonzero StatsPort")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-zero

@@ -64,10 +64,7 @@ var (
%[1]s %[2]s router-west --replicas=2

# Use a different router image
%[1]s %[2]s region-west --images=myrepo/somerouter:mytag

# Run the router with a hint to the underlying implementation to _not_ expose statistics.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update this to say how to not expose stats.

@knobunc
Copy link
Contributor

knobunc commented Oct 2, 2017

We should make --stats-port work and it should make the stats come from the router controller.

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 2, 2017
@pecameron pecameron force-pushed the bz1466133 branch 2 times, most recently from ea3ed93 to 6451486 Compare October 2, 2017 20:27
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 2, 2017
@pecameron
Copy link
Contributor Author

@knobunc PTAL
Reverted previous changes an docs PR. All changes are now in pkg/router/metrics/metrics.go

@smarterclayton
Copy link
Contributor

ROUTER_METRICS_TYPE: haproxy and stats_port: 0 is not a supported combination.

@smarterclayton
Copy link
Contributor

The router should error out rather than let you get into this state.

@smarterclayton
Copy link
Contributor

If you don't want to expose stats, you should be setting stats port to zero and clearing metrics_type

@pecameron
Copy link
Contributor Author

@smarterclayton what is the relationship between healthz and stats. Shouldn't helathz always be supported whether or not we support stats? If so do we do that on the controller or haproxy? Are you saying that when stats are not wanted, healthz will be supported by haproxy?

ROUTER_METRICS_TYPE is an internal environment variable (not controlled on the command line) and stats-port is from a command line arg (oc adm router). How does the admin create a router that doesn't support stats?

@smarterclayton
Copy link
Contributor

ROUTER_METRICS_TYPE is a flag metrics-type. It dominates stats port. You cannot turn on metrics-type and disable stats port. If you have metrics-type off, then stats port 0 behaves as you expect. When metrics-type is on, we always start a listener, and it always handles health and metrics checking. Metrics and stats are effectively the same thing.

@pecameron
Copy link
Contributor Author

@smarterclayton the current changes start healthz but not metrics. So in the next change there is going to be no healthz without stats.

@knobunc
Copy link
Contributor

knobunc commented Oct 4, 2017

Hold on... oadm router has a --type argument. That sets ROUTER_METRICS_TYPE in the dc that gets generated which populates the --metrics-type arg to the router. Does --type mean "metrics type"? Or should we add a different knob for that (and default it based on the router type)?

I want to use --type to set things correctly for F5 and potentially nginx in the future. At the moment, we jam a bunch of F5-specific env into the dc because there is no way to tell what type of router we are making.

Are you advocating we add --metrics-type to oadm router? Or are you saying that --type should be overloaded?

If we add --metrics-type (and default it when --type is set) then when the metrics are set to haproxy we should forbid stats port 0, and forbid --expose-metrics=true then I think that is reasonable.

@smarterclayton Thoughts?

@knobunc
Copy link
Contributor

knobunc commented Oct 4, 2017

Or... perhaps we don't add --metrics-type to oadm rotuer and we change the meaning of --expose-metrics for router types of haproxy and use that to determine whether to set ROUTER_METRICS_TYPE and change the default for --expose-metrics to true.

Maybe if --metrics-image is set, then we set the ROUTER_METRICS_TYPE and use the specified image for metrics.

Is that reasonable?

@smarterclayton
Copy link
Contributor

smarterclayton commented Oct 4, 2017 via email

@knobunc
Copy link
Contributor

knobunc commented Oct 4, 2017

Actually, I just realized there is no way to use --expose-metrics on master and have it work. Should we yank them now?

The short version is that --expose-metrics sets the scrape url to:
http://$(STATS_USERNAME):$(STATS_PASSWORD)@localhost:$(STATS_PORT)/haproxy?stats;csv

And that won't work when the metrics type is haproxy. BUT the only time the sidecar is emitted is when the type is haproxy-router.

@knobunc
Copy link
Contributor

knobunc commented Oct 4, 2017

So... can we take over --expose-metrics and make it toggle whether we expose metrics vs. just health checks? I really would rather not have to set --type="" --stats-port=0 if I want to use the haproxy router, but disable stats.

@smarterclayton
Copy link
Contributor

smarterclayton commented Oct 4, 2017 via email

@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 5, 2017
@pecameron
Copy link
Contributor Author

@knobunc @smarterclayton PTAL

@@ -251,6 +251,9 @@ func (o *TemplateRouterOptions) Run() error {
if len(o.StatsUsername) == 0 || len(o.StatsPassword) == 0 {
glog.Warningf("Metrics were requested but no username or password has been provided - the metrics endpoint will not be accessible to prevent accidental security breaches")
}
if statsPort == 0 {
return fmt.Errorf("When type is haprocy-router, STATS_PORT must not be 0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case, typo, and not the right flag:

when --metrics-type is haproxy --stats-port may not be set to 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton This message comes up when you run oc adm router --stats-port=0
--metrics-type is an invalid 'oc adm router' parameter so message will just confuse the user.
Made the requested changes.

@@ -279,7 +279,7 @@ func NewCmdRouter(f *clientcmd.Factory, parentName, name string, out, errout io.
},
}

cmd.Flags().StringVar(&cfg.Type, "type", "haproxy-router", "The type of router to use - if you specify --images this flag may be ignored.")
cmd.Flags().StringVar(&cfg.Type, "type", "haproxy-router", "The type of router to use, default is haproxy-router - if you specify --images this flag may be ignored.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to describe defaults here, because the default is already printed when you run help

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@knobunc
Copy link
Contributor

knobunc commented Oct 5, 2017

Ah, got it.

@@ -293,11 +293,11 @@ func NewCmdRouter(f *clientcmd.Factory, parentName, name string, out, errout io.
cmd.Flags().StringVar(&cfg.DefaultCertificate, "default-cert", cfg.DefaultCertificate, "Optional path to a certificate file that be used as the default certificate. The file should contain the cert, key, and any CA certs necessary for the router to serve the certificate. Does not apply to external appliance based routers (e.g. F5)")
cmd.Flags().StringVar(&cfg.Selector, "selector", cfg.Selector, "Selector used to filter nodes on deployment. Used to run routers on a specific set of nodes.")
cmd.Flags().StringVar(&cfg.ServiceAccount, "service-account", cfg.ServiceAccount, "Name of the service account to use to run the router pod.")
cmd.Flags().IntVar(&cfg.StatsPort, "stats-port", cfg.StatsPort, "If the underlying router implementation can provide statistics this is a hint to expose it on this port. Specify 0 if you want to turn off exposing the statistics.")
cmd.Flags().IntVar(&cfg.StatsPort, "stats-port", cfg.StatsPort, "If the underlying router implementation can provide statistics this is a hint to expose it on this port. Specify type= and stats-port=0 to turn off exposing the statistics.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last sentence To disable all statistics, you must disable --metrics-type and set --stats-port to zero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton --metrics-type is not a valid option to "oc adm router". BTW: how do you set --metrics-type? Is it in the DC as an arg to the router controller?
Changed to suggested message.

cmd.Flags().BoolVar(&cfg.ExposeMetrics, "expose-metrics", cfg.ExposeMetrics, "If true, attempts to run an extra container in the pod to expose metrics - the image will either be set depending on the router implementation or provided with --metrics-image. Not useful where comprehensive metrics are available through the stats-port (e.g. haproxy router)")
cmd.Flags().StringVar(&cfg.MetricsImage, "metrics-image", cfg.MetricsImage, "If --expose-metrics is specified this is the image to use to run a sidecar container in the pod exposing metrics. If not set and --expose-metrics is true the image will depend on router implementation. Not useful where comprehensive metrics are available through the stats-port (e.g. haproxy router)")
cmd.Flags().BoolVar(&cfg.ExposeMetrics, "expose-metrics", cfg.ExposeMetrics, "DEPRECATED If true, attempts to run an extra container in the pod to expose metrics - the image will either be set depending on the router implementation or provided with --metrics-image. Not useful where comprehensive metrics are available through the stats-port (e.g. haproxy router)")
cmd.Flags().StringVar(&cfg.MetricsImage, "metrics-image", cfg.MetricsImage, "DEPRECATED If --expose-metrics is specified this is the image to use to run a sidecar container in the pod exposing metrics. If not set and --expose-metrics is true the image will depend on router implementation. Not useful where comprehensive metrics are available through the stats-port (e.g. haproxy router)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not how flags are marked deprecated. There is an attribute on the flag to set it as deprecated.

Copy link
Contributor Author

@pecameron pecameron Oct 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted "DEPRECATED". This was not part of this bug. It will get fixed in another bug.

@@ -683,6 +683,9 @@ func RunCmdRouter(f *clientcmd.Factory, cmd *cobra.Command, out, errout io.Write
}
// automatically start the internal metrics agent if we are handling a known type
if cfg.Type == "haproxy-router" {
if cfg.StatsPort == 0 {
return fmt.Errorf("When --type is haproxy-router, --stats-port must not be 0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, type has nothing to do with stats port.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton This is your code:
db5b0fb pkg/cmd/admin/router/router.go (Clayton Coleman 2017-03-10 01:43:10 -0500 685) if cfg.Type == "haproxy-router" {
6e9c0ada pkg/oc/admin/router/router.go (Phil Cameron 2017-09-29 15:39:40 -0400 686) if cfg.StatsPort == 0 {
6e9c0ada pkg/oc/admin/router/router.go (Phil Cameron 2017-09-29 15:39:40 -0400 687) return fmt.Errorf("When --typ
e is haproxy-router, --stats-port must not be 0")
6e9c0ada pkg/oc/admin/router/router.go (Phil Cameron 2017-09-29 15:39:40 -0400 688) }
ff44d23 pkg/cmd/admin/router/router.go (Clayton Coleman 2017-06-20 21:38:36 -0400 689) env["ROUTER_LISTEN_ADDR"] = fmt.Sprin
tf("0.0.0.0:%d", cfg.StatsPort)
db5b0fb pkg/cmd/admin/router/router.go (Clayton Coleman 2017-03-10 01:43:10 -0500 690) env["ROUTER_METRICS_TYPE"] = "haproxy
"
db5b0fb pkg/cmd/admin/router/router.go (Clayton Coleman 2017-03-10 01:43:10 -0500 691) }

You use cfg.StatsPort to set up the ROUTER_LISTEN_ADDR - 0.0.0.0:0 is the error that causes the router to not come up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, wrong package. If stats-port is 0 AND type is haproxy-router, then in this code path you should avoid setting ROUTER_METRICS_TYPE and listen addr, not return this error. This is defaulting metrics on, and stats-port 0 would mean defaulting metrics off.

if cfg.Type == "haproxy-router" && cfg.StatsPort != 0 {
  env["ROUTER_LISTEN_ADDR"] = fmt.Sprintf("0.0.0.0:%d", cfg.StatsPort)
  env["ROUTER_METRICS_TYPE"] = "haproxy"
}

@pecameron
Copy link
Contributor Author

@smarterclayton could you please take a moment and address the questions that @knobunc and I have raised in the comments? I am not on the same page as you and would like to get there. What you have been saying is not how you coded this.

stats-port=0 properly disables statistics.

Fixes bug: 1466133
https://bugzilla.redhat.com/show_bug.cgi?id=1466133
@openshift-ci-robot openshift-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 6, 2017
@pecameron
Copy link
Contributor Author

@knobunc @smarterclayton We have come full circle, stats-port=0 disables router statistics. PTAL

@smarterclayton
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 6, 2017
@smarterclayton
Copy link
Contributor

/retest

@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 6, 2017
Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc, pecameron, smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@knobunc knobunc added the kind/bug Categorizes issue or PR as related to a bug. label Oct 6, 2017
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 7, 2017

@pecameron: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/origin/verify a2e5501 link /test origin-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit 0c514c9 into openshift:master Oct 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants