oc adm router --expose-metrics generates DC which is incompatible with the latest prom/haproxy-exporter image #15982

Closed
jperville opened this issue Aug 25, 2017 · 7 comments
Labels: component/routing, lifecycle/rotten, priority/P2

Comments

@jperville

The oc adm router --expose-metrics command generates a router DC that runs the prom/haproxy-exporter:latest image as a sidecar to the main router container, as documented in https://docs.openshift.com/container-platform/3.5/install_config/router/default_haproxy_router.html#exposing-the-router-metrics .

Recently, a new version of prom/haproxy-exporter was released that declares its options in an incompatible way (prefixed with a double dash instead of a single one). As a consequence, oc adm router --expose-metrics can no longer safely use prom/haproxy-exporter:latest, since the argument syntax now depends on which version that tag resolves to. We must specify the image version explicitly and adjust the entrypoint args manually.

With prom/haproxy-exporter:v0.7.1, the option to declare the scrape URI is called -haproxy.scrape-uri (with a single dash).
As of prom/haproxy-exporter:v0.8.0, the same option is now prefixed with a double dash ( --haproxy.scrape-uri).
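
The version split above can be sketched as a small shell helper. This is only an illustration: `flag_prefix` is a hypothetical name, it handles only tags of the form vMAJOR.MINOR.PATCH (not `latest`), and it assumes that releases after v0.8.0 keep the double-dash form.

```shell
# Hypothetical helper: print the flag prefix ("-" or "--") to use for a
# given prom/haproxy-exporter tag, based on the v0.7.1 / v0.8.0 split
# described in this issue. Assumes later releases keep the "--" form.
flag_prefix() {
  ver=${1#v}          # strip the leading "v": v0.8.0 -> 0.8.0
  major=${ver%%.*}    # text before the first dot
  rest=${ver#*.}      # text after the first dot
  minor=${rest%%.*}   # text before the next dot
  if [ "$major" -gt 0 ] || [ "$minor" -ge 8 ]; then
    printf '%s\n' '--'
  else
    printf '%s\n' '-'
  fi
}

# Build the option name for a pinned image tag.
tag=v0.8.0
printf '%shaproxy.scrape-uri\n' "$(flag_prefix "$tag")"
```

Pinning an explicit tag, rather than relying on `latest`, is what makes a lookup like this meaningful.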

As a consequence, starting the metrics-exporter container with -haproxy.scrape-uri=http://$(STATS_USERNAME):$(STATS_PASSWORD)@localhost:$(STATS_PORT)/haproxy?stats;csv as the first entrypoint argument no longer works, and the pod crashes with the following message in the metrics-exporter container log:

[root@origin-centos-72 ~]# oc logs -f router-1-kkd6m -c metrics-exporter
haproxy_exporter: error: unknown short flag '-a', try --help

As a workaround, I tried generating the DC with --metrics-image passed explicitly, but in that case the generated DC does not include the entrypoint argument: the haproxy exporter image boots but cannot contact its scrape URI (since it was not passed any argument).

My final workaround was to:

Version
[root@master ~]# oc version
oc v1.5.1
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.33.220.xip.io:8443
openshift v1.5.1
kubernetes v1.5.2+43a9be4
Steps To Reproduce
oadm router --expose-metrics
# wait until the router pod starts
wget --no-proxy -qO- http://$(oc get pod -n default -l router=router -o jsonpath="{ .items[*].status.podIP }"):9101/metrics
Current Result

The router pod enters a crash loop, then the deployment fails.

Expected Result

The router pod should start and serve metrics on port 9101.

@knobunc
Contributor

knobunc commented Aug 29, 2017

I suspect that now that the router supports metrics, we should deprecate this option. This needs some investigation.

@ingcsmoreno

I'm running an OO v1.4 cluster and still have this issue. The "integrated" router metrics mentioned above don't work for me since they get constantly reset.

As a workaround, I deployed the router as @jperville did, then modified the router deploymentconfig and added the missing "-" in the args section, changing it from:

args:
            - >-
              -haproxy.scrape-uri=http://$(STATS_USERNAME):$(STATS_PASSWORD)@localhost:$(STATS_PORT)/haproxy?stats;csv

to:

args:
            - >-
              --haproxy.scrape-uri=http://$(STATS_USERNAME):$(STATS_PASSWORD)@localhost:$(STATS_PORT)/haproxy?stats;csv
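
The same one-character fix can be sketched as a small shell helper. This is a minimal illustration, not part of the original workaround: `to_double_dash` is a hypothetical name, and the single quotes keep the `$(STATS_...)` placeholders literal so that they are expanded later by the pod spec rather than by the shell.

```shell
# Hypothetical helper: rewrite a single-dash exporter flag to the
# double-dash form expected by prom/haproxy-exporter v0.8.0.
# Arguments that already start with "--", or that are not flags,
# are printed unchanged.
to_double_dash() {
  case "$1" in
    --*) printf '%s\n' "$1" ;;
    -*)  printf '%s\n' "-$1" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}

# The argument generated by the router DC, kept literal with single quotes.
old_arg='-haproxy.scrape-uri=http://$(STATS_USERNAME):$(STATS_PASSWORD)@localhost:$(STATS_PORT)/haproxy?stats;csv'
new_arg=$(to_double_dash "$old_arg")
printf '%s\n' "$new_arg"
```

The resulting string is what goes into the args section of the metrics-exporter container, for example when editing the deploymentconfig by hand with `oc edit dc/router`.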

@smarterclayton
Contributor

We no longer support this path (we announced deprecation with innate metrics). The recommended path going forward is to manage your router config (plus image) as a config object.

@pecameron
Contributor

Removing --expose-metrics and --metrics-image is tracked by:
https://github.com/openshift/origin/issues/1499026

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 16, 2018
pecameron added a commit to pecameron/origin that referenced this issue Apr 19, 2018
bug 1499026
https://bugzilla.redhat.com/show_bug.cgi?id=1499026
1499026 - Deprecate oc adm router --expose-metrics and --metrics-image

origin issue: 15982
openshift#15982

OCP 3.11

Signed-off-by: Phil Cameron <pcameron@redhat.com>
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 16, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
