Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jsonnet,pkg: watch kubelet CA #250

Merged
merged 2 commits into from
Feb 18, 2019

Conversation

squat
Copy link
Contributor

@squat squat commented Feb 15, 2019

This commit addds a new configmap-reloader container to the Prometheus
K8s pods that watch the configmap holding the CAs used to sign the
Kubelet CSRs. These CSRs are signed by the CSR controller. The CA is
rolled frequently and so must be watched by the
cluster-monitoring-operator and mirrored into the openshift-monitoring
namespace whenever it changes.

These changes fix scraping of all kubelets on worker nodes, however, scraping
master kubelets will be broken until
openshift/cluster-kube-apiserver-operator#247 lands and
makes it into the installer. Once that is in, we can change the CA configmap to
kubelet-serving-ca.

cc @s-urbaniak @deads2k @brancz

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 15, 2019
@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 15, 2019
@squat squat force-pushed the kubelet-ca branch 2 times, most recently from 20ba7a5 to af5723a Compare February 16, 2019 01:49
@squat squat changed the title [WIP] jsonnet,pkg: watch kubelet CA jsonnet,pkg: watch kubelet CA Feb 16, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 16, 2019
@squat
Copy link
Contributor Author

squat commented Feb 16, 2019

/retest

2 similar comments
@squat
Copy link
Contributor Author

squat commented Feb 16, 2019

/retest

@squat
Copy link
Contributor Author

squat commented Feb 16, 2019

/retest

@squat
Copy link
Contributor Author

squat commented Feb 17, 2019

/test e2e-aws-operator

Copy link
Contributor

@brancz brancz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that I think we can skip, otherwise lgtm

@@ -89,6 +90,18 @@ spec:
- --label=namespace
image: quay.io/coreos/prom-label-proxy:v0.1.0
name: prom-label-proxy
- args:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary. Prometheus reloads the scape managers every 5 seconds, which reloads all scrape pools, and reloading a scrape pool re-creates the http client, which also re-reads the TLS configuration. 5 seconds delay is acceptable I think as kubelets reloading their certificates and the secret being re-mounted is racy anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, let's persist this fact in a comment inside jsonnet, otherwise it is not obvious why a configmap reloader is not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and done

This commit addds a new configmap-reloader container to the Prometheus
K8s pods that watch the configmap holding the CAs used to sign the
Kubelet CSRs. These CSRs are signed by the CSR controller. The CA is
rolled frequently and so must be watched by the
cluster-monitoring-operator and mirrored into the openshift-monitoring
namespace whenever it changes.
@@ -612,6 +613,18 @@ func (f *Factory) PrometheusK8sServingCertsCABundle() (*v1.ConfigMap, error) {
return c, nil
}

func (f *Factory) PrometheusK8sCSRControllerCABundle(data map[string]string) (*v1.ConfigMap, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mini-nit: i think we are unit-testing all those factory methods. let's add a test for that one too.

@brancz
Copy link
Contributor

brancz commented Feb 18, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 18, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, squat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 47f30f2 into openshift:master Feb 18, 2019
@sjenning
Copy link
Contributor

kubelet metrics targets are working once again 👍 thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants