Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add node-exporter #13

Merged
merged 24 commits into from
Sep 10, 2020
Merged

Add node-exporter #13

merged 24 commits into from
Sep 10, 2020

Conversation

akshaymankar
Copy link
Member

@akshaymankar akshaymankar commented Aug 28, 2020

@akshaymankar akshaymankar marked this pull request as ready for review August 31, 2020 10:17
* Add role to create CA and client certificates

This role will likely only be used only for boostrapping an environment. It
doesn't take care of certificate rotation yet, but it can be made to do so as
and when required.
Copy link

@arianvp arianvp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

Would be nicer if we had some more central internal CA; and that we dont store it in S3; but babysteps. We have 3150 days to fix it :P

Copy link
Contributor

@lucendio lucendio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some comments and have some questions:

Since you included everything related to metrics into this role, instead of putting it in a separate one, would it make sense to make it optional?

Can you elaborate on why the monitoring-certs role came into existence?

roles/sft-server/templates/sftd.vhost.conf.j2 Show resolved Hide resolved
roles/sft-server/tasks/traffic.yml Outdated Show resolved Hide resolved
roles/sft-server/tasks/traffic.yml Show resolved Hide resolved
roles/sft-server/tasks/assert.yml Show resolved Hide resolved
roles/sft-server/molecule/default/verify.yml Outdated Show resolved Hide resolved
roles/sft-server/meta/main.yml Show resolved Hide resolved
roles/monitoring-certs/tasks/main.yml Outdated Show resolved Hide resolved
roles/monitoring-certs/tasks/main.yml Outdated Show resolved Hide resolved
roles/sft-server/tasks/configure.yml Outdated Show resolved Hide resolved
Remove documentation for obsolete 'metrics_fqdn'
Also:
* comment out usage of use_common_name_for_san as it is not available in
ansible-2.7
In ansible 2.7, when openssl_csr sees and empty file, it panics.
They don't work so nicely with ansible 2.7, they need exact timestamp and
calculating that is not fun. Fortunately they default to 10 years anyways.
Copy link
Contributor

@lucendio lucendio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you included everything related to metrics into this role, instead of putting it in a separate one, would it make sense to make it optional?

Can you elaborate on why the sft-monitoring-certs role came into existence?

Minor comments. Also, the two questions above remain.

roles/sft-monitoring-certs/tasks/main.yml Outdated Show resolved Hide resolved
roles/sft-monitoring-certs/tasks/ca-cert.yml Outdated Show resolved Hide resolved

# NOTE: defaults to 'inherit' which leads to error messages not
# appearing if nginx.conf is set to 'error_log: stderr'
StandardError=journal
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we sleeping? Is it to work around the bug that is linked? (I don't see why) or is it causing the bug that is linked?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised inherit doesn't work StandardOutput= defaults to journal and StandardError=inherit just means "Connect to the same fd as StandardOutput" Honestly sounds like a systemd bug to me. @flokli thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't added in this PR, but I fixed it anyways :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akshaymankar can you elaborate on why you removed this? The issue that you and I linked closes with Confirmed → Won't Fix. I experienced the issue just last week, which was when I introduced the suggested workaround.

@arianvp the line you commented on, was introduced by #18. Happy to discuss anything related over there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understood from the launchpad thread, this is marked Won't Fix for the ubuntu package only. It was fixed in the upstream according to this comment from over a year ago.

I could be wrong though. I did run the role and make sure it works. But as it is a race condition problem, I should've probably run this many times, which I didn't, so maybe I could run systemctl restart nginx a few times to make sure it works. To be sure, what version of nginx did you face this issue with?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nginx/1.18.0. The comment also seems to refer to the Ubuntu package, no? The role installs from Nginx directly. Did you spot a link to a commit/change in the thread, so we can see that the actual fix was?

This reverts commit 86b11d3.

To be discussed and done in a separate PR
Copy link
Contributor

@lucendio lucendio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last change I'd like to ask for: move the Prometheus collector port 8443 into a role variable in defaults/main.yml. Otherwise, LGTM.

roles/sft-server/tasks/traffic.yml Outdated Show resolved Hide resolved
Comment on lines 4 to 5
listen 8443 ssl http2;
listen [::]:8443 ssl http2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to a variable in defaults/main.yml.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it to a constant. Let's make it overridable as we need to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

Moved it to a constant.

I'm the last one disagreeing with an extensive use of constants ;) I just had this line in the back of my head, which could be - when configured as part of the environment variables - feed into both, Terraform and the Ansible role. Anyway, later.

@akshaymankar akshaymankar merged commit 3f4f804 into develop Sep 10, 2020
@akshaymankar akshaymankar deleted the akshaymankar/node-exporter branch September 10, 2020 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants