feat: add monitor to operator, fix monitoring setup #256

Merged: mjnagel merged 65 commits into main from operator-monitor-magic on Apr 26, 2024

mjnagel (Contributor) commented Mar 13, 2024

Description

This adds a new section to the Package spec, monitor. The functionality is briefly documented in docs/MONITOR.md (https://github.com/defenseunicorns/uds-core/blob/operator-monitor-magic/docs/MONITOR.md). Also included is a mutation that configures all service monitors for Istio mTLS. This PR also fixes some missing dashboards and enables all default dashboards from the upstream kube-prometheus-stack chart (this may be excessive?).
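For readers unfamiliar with what a service monitor mutation for Istio mTLS might involve, here is a minimal sketch. It rewrites every endpoint of a ServiceMonitor-shaped object to scrape over HTTPS with a client certificate. The type names, the function name `applyIstioMtls`, and the `/etc/prom-certs` mount path are assumptions for illustration (the paths follow the pattern used in Istio's Prometheus integration docs); this is not the code from this PR.

```typescript
// Hypothetical, trimmed-down shapes; the real ServiceMonitor CRD has many more fields.
interface TlsConfig {
  caFile?: string;
  certFile?: string;
  keyFile?: string;
  insecureSkipVerify?: boolean;
}

interface Endpoint {
  port?: string;
  path?: string;
  scheme?: string;
  tlsConfig?: TlsConfig;
}

interface ServiceMonitorLike {
  metadata: { name: string; namespace?: string };
  spec: { endpoints?: Endpoint[] };
}

// Assumed directory where the Istio sidecar writes certs for Prometheus.
const CERT_DIR = "/etc/prom-certs";

// TLS settings applied to every endpoint. Verification is skipped because
// Prometheus does not understand Istio's workload identity naming.
const ISTIO_TLS: TlsConfig = {
  caFile: `${CERT_DIR}/root-cert.pem`,
  certFile: `${CERT_DIR}/cert-chain.pem`,
  keyFile: `${CERT_DIR}/key.pem`,
  insecureSkipVerify: true,
};

// Returns a copy of the monitor with each endpoint switched to HTTPS + mTLS.
// The input object is left untouched.
function applyIstioMtls(sm: ServiceMonitorLike): ServiceMonitorLike {
  const endpoints = (sm.spec.endpoints ?? []).map((ep) => ({
    ...ep,
    scheme: "https",
    tlsConfig: { ...ISTIO_TLS },
  }));
  return { ...sm, spec: { ...sm.spec, endpoints } };
}

// Usage: a made-up monitor with one plain-HTTP endpoint.
const exampleMonitor: ServiceMonitorLike = {
  metadata: { name: "example-metrics", namespace: "example" },
  spec: { endpoints: [{ port: "http-metrics", path: "/metrics" }] },
};

console.log(JSON.stringify(applyIstioMtls(exampleMonitor).spec.endpoints, null, 2));
```

Returning a copy rather than editing in place keeps the sketch easy to test; an admission-time mutation would more likely modify the incoming object directly.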

Currently monitored:

  • Prometheus stack (operator, self, alertmanager)
  • Loki
  • kube-system things (kubelet, coredns, apiserver)
  • Promtail
  • Metrics-server
  • Velero
  • Keycloak
  • Grafana
  • All istio envoy proxies (podmonitor)

Not added here:

  • NeuVector: Currently has limited config options with regard to auth. We are keeping this disabled in anticipation of SSO on NeuVector, with a desire to contribute upstream to enable our use case of SSO-only auth.

In addition, this PR switches single-package testing to use/add Istio. This appears to add around 1 minute to each pipeline run, but it:

  • allows us to test the Istio VirtualService (VS) endpoints in single-package checks (there are some quirks with this in the current state due to SSO redirects, but it will allow us to do e2e testing in those pipelines in the future)
  • allows us to always assume Istio and not build bespoke Pepr code for the specific no-Istio scenario (e.g. Prometheus can assume certs)

Someone could still run some packages locally without Istio, un-inject sidecars, experiment with mTLS, etc., which may still be useful for identifying where Istio is causing problems. This only switches our CI posture so that we always assume Istio.

Related Issue

Fixes #17

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Other (security config, docs update, etc)

Checklist before merging

mjnagel self-assigned this Mar 13, 2024
UnicornChance previously approved these changes Apr 24, 2024
mjnagel merged commit bf67722 into main Apr 26, 2024
21 checks passed
mjnagel deleted the operator-monitor-magic branch April 26, 2024 02:07
mjnagel pushed a commit that referenced this pull request Apr 30, 2024
🤖 I have created a release *beep* *boop*
---


## [0.21.0](v0.20.0...v0.21.0) (2024-04-30)


### Features

* add `monitor` to operator, fix monitoring setup
([#256](#256))
([bf67722](bf67722))


### Bug Fixes

* loki s3 overrides
([#365](#365))
([3545066](3545066))
* update neuvector values for least privilege
([#373](#373))
([7f4de4f](7f4de4f))


### Miscellaneous

* add debug logging to endpointslice watch
([#359](#359))
([da3eb5a](da3eb5a))
* **deps:** update grafana to v7.3.9
([#353](#353))
([4a70f40](4a70f40))
* **deps:** update istio to v1.21.2
([#258](#258))
([51c6540](51c6540))
* **deps:** update keycloak
([#349](#349))
([2ef1813](2ef1813))
* **deps:** update keycloak to v0.4.2
([#375](#375))
([b0bb8e4](b0bb8e4))
* **deps:** update zarf to v0.33.1
([#368](#368))
([296e547](296e547))
* move api service watch to reconcile
([#362](#362))
([1822bca](1822bca))
* refactor promtail extraScrapeConfigs into scrapeConfigs
([#367](#367))
([2220272](2220272))
* trigger eks nightly when related files are updated
([#366](#366))
([6d6e4e0](6d6e4e0))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
rjferguson21 pushed a commit that referenced this pull request Jul 11, 2024
## Related Issue

Fixes #17

## Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Other (security config, docs update, etc)

## Checklist before merging

- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide Steps](https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md#submitting-a-pull-request) followed

---------

Co-authored-by: Tristan Holaday <40547442+TristanHoladay@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
rjferguson21 pushed a commit that referenced this pull request Jul 11, 2024

Successfully merging this pull request may close these issues.

Handle Prometheus metrics + Istio mTLS
4 participants