Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvement/use node exporter pkg #113

Merged
merged 6 commits into from
Jul 5, 2018

Conversation

Zempashi
Copy link

@Zempashi Zempashi commented Jul 4, 2018

Fixes: #31
Fixes: #56

repo_gpgcheck: True
state: present

# https://github.com/ansible/ansible/issues/20711
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ballot-scality I'm not sure how to detect that the key has been accepted or not.

@NicolasT NicolasT added kind:enhancement New feature or request topic:operations Operations-related issues labels Jul 4, 2018
@NicolasT NicolasT added this to the MetalK8s 0.1.0 milestone Jul 4, 2018
@NicolasT NicolasT requested review from a team and ballot-scality July 4, 2018 16:00
@@ -103,3 +105,17 @@ EOF
assert_fails 'make_shell kubectl get pv -o jsonpath={.items[*].status.phase} | grep Released > /dev/null' \
"PVs in Released state found"
}

test_prometheus_node_exporter_metrics() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test rely on being on the server itself (yum and accessing prometheus cluster IP directly) but test are better than no test. As already discussed, we might want to move to a more feature-full framework (list of features to be defined)

@Zempashi Zempashi force-pushed the improvement/use_node_exporter_pkg branch from 43af70d to 85d7952 Compare July 4, 2018 16:06
metal-k8s.yml Outdated
- role: kube_heapster
tags: ['heapster']
- role: kube_elasticsearch
tags: ['elasticsearch']
- role: kube_metrics_server
tags: ['metrics-server']

- hosts: kube-node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go on kube-node, kube-master (so actually k8s-cluster) and etcd?

- name: 'make assertions'
assert:
that:
- prometheus_addon_dir is defined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assertion comes after first usage (above), which is odd?

- role: kube_heapster
tags: ['heapster']
- role: kube_elasticsearch
tags: ['elasticsearch']
when: metalk8s_elasticsearch_enabled|bool
- role: kube_metrics_server
tags: ['metrics-server']

- hosts: kube-node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be k8s-cluster:etcd?

baseurl: https://packagecloud.io/prometheus-rpm/release/el/$releasever/$basearch
description: prometheus-rpm_release
enabled: True
gpgcheck: False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but repo_gpgcheck: True (which is an oddity specific to packagecloud.io)

name: prometheus-rpm_release
baseurl: https://packagecloud.io/prometheus-rpm/release/el/$releasever/$basearch
description: prometheus-rpm_release
enabled: True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep disabled by default and explicitly enable when installing node_exporter?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For update sake ? Talking to @ballot-scality, there is some plugin to install yum-version-lock. Might be better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update would all of a sudden install 0.16 :) I'm not too fond of version-locking myself. I believe Kubespray even uses a separate yum.conf for Docker installs to keep things separated. You can enable repos on yum invocation (so likely also through the yum module)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repo docker are enabled (and yes install docker-ce will install 18.03, which is bad).
We must avoid doing the same mistake of kubespray. Version locking could fix docker problem (because disabling docker repo after kubespray will not prevent kubespray itself to reenable it).
It will also allow some upgrade and prevent what we have on the ring multiple times: "oh the package is now on epel so the system decide to take the version of epel instead of our repo."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough 😃

@Zempashi Zempashi force-pushed the improvement/use_node_exporter_pkg branch from 85d7952 to 0c1446b Compare July 4, 2018 16:12
@Zempashi Zempashi force-pushed the improvement/use_node_exporter_pkg branch from 0c1446b to 8ac8b42 Compare July 4, 2018 16:14
NicolasT
NicolasT previously approved these changes Jul 4, 2018
Copy link
Contributor

@NicolasT NicolasT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't test, but LGTM on-sight...

Looking into making the packagecloud.io repo not enabled by default could be nice, but could be a follow-up PR as far as I'm concerned.

Also, a bit surprised you need the makecache, I thought yum does that automatically when it finds new repositories.

@NicolasT
Copy link
Contributor

NicolasT commented Jul 4, 2018

CI seems unhappy 🤖

jq '.data.activeTargets |[.[]|select(.labels.job == $job)]|length' \
--arg job "node-exporter")
echo "Found ${NB_TARGET} targets"
assert [ "${NB_TARGET}" -gt "0" ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future:

assert_not_equals 0 ${NB_TARGET}

@NicolasT NicolasT force-pushed the improvement/use_node_exporter_pkg branch from 7d4c9ab to 1c15210 Compare July 4, 2018 17:20
@NicolasT
Copy link
Contributor

NicolasT commented Jul 4, 2018

Fixed CI by applying #113 (comment)

NicolasT
NicolasT previously approved these changes Jul 4, 2018
@Zempashi Zempashi force-pushed the improvement/use_node_exporter_pkg branch 2 times, most recently from d478cdd to 2d35169 Compare July 4, 2018 20:13
NicolasT
NicolasT previously approved these changes Jul 4, 2018
Copy link
Contributor

@NicolasT NicolasT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this has been tested, and existing Grafana dashboards still work 😃

NicolasT added a commit that referenced this pull request Jul 5, 2018
@Zempashi
Copy link
Author

Zempashi commented Jul 5, 2018

I've checked that when reverting from 0.16 to 0.15 on one node, some dashboards where back. I'll double check

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:enhancement New feature or request topic:operations Operations-related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants