markgoddard
Contributor

  • Ubuntu: bump OVS and OVN packages
  • Don't use interactive docker cmds in rabbitmq-reset.yml
  • Add Ubuntu image tags
  • Enable Ubuntu AIO CI
  • Add SMART Monitoring with dash and alerts
  • Increase job timeout for kolla image build GHA
  • Fix oom-killer graph
  • Rephrase the match logic for interfaces monitored for package drops
  • Add docs for SMART Monitoring
  • Add note on enabling standard configuration
  • Docs edit
  • Fail if the controller clocks are not synced
  • Bump cloudkitty tag
  • docs: Add current_series replacement and extlinks
  • docs: overview
  • Xena: batched release notes
  • docs: Add release notes into main docs
  • reno: Match on version-specific tag
  • docs: overcloud host image
  • docs: LVM
  • docs: swap
  • docs: Current branch variable
  • docs: update ci-aio & ci-builder prerequisites
  • README: link to rtd.io
  • docs: improve release train docs
  • docs: add info on how to build docs
  • docs: move environments index
  • docs: add info on generating release notes

markgoddard and others added 30 commits October 19, 2022 13:24
The Open vSwitch and OVN packages in Ubuntu Wallaby UCA repository are
quite old - 2.15 and 20.12 respectively. Pull in these packages from the
Yoga UCA, which are 2.17 and 22.03, to more closely match the CentOS
packages.
Remove the --interactive and --tty args from docker exec commands in rabbitmq-reset.yml.
Ubuntu: always use kolla-ansible to install docker-ce repo
…cker

Don't use interactive docker cmds in rabbitmq-reset.yml
Enabled the textfile collector in node exporter via kolla/globals.yml.

Added the smartmon script as-is from the prometheus-community GitHub
organisation, then removed NVMe support from it in favour of the nvme-cli
script, which has also been added. This is because the nvme-cli script provides
better metrics than the smartmon script does. The script also adds the serial
number of the disk as a label to all SMART metrics.

Added a Kayobe custom playbook to easily deploy the scripts and the associated
cron job. This playbook installs smartmontools and nvme-cli, copies the scripts
over to the hosts, and sets up a cron job which runs the scripts and stores
the metrics in the Docker volume for node exporter. The playbook changes
the way the metrics are saved to a file by making use of the mv command,
as it is atomic. This was needed because Prometheus would at times read a
partially written file.
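
The write-then-rename pattern described above can be sketched in Python. This is a minimal illustration, not the playbook's actual implementation (which uses the mv command from a cron job); the function name and the `.tmp` suffix are assumptions made for the example. The rename is only atomic when the temporary file and the destination are on the same filesystem, which is why the temporary file is created in the destination directory.

```python
import os
import tempfile


def write_metrics_atomically(metrics_text: str, dest_path: str) -> None:
    """Write metrics to a temporary file, then rename it over dest_path.

    Hypothetical helper for illustration only. A reader of dest_path sees
    either the old complete file or the new complete file, never a
    partially written one.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temporary file next to the destination so that the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write(metrics_text)
        # mkstemp creates the file with mode 0600; relax it so that another
        # process is able to read the result.
        os.chmod(tmp_path, 0o644)
        # os.replace maps to rename(2) on POSIX, the same call mv uses
        # within a single filesystem.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The node exporter textfile collector only parses files ending in `.prom`, so a temporary file with a different suffix in the same directory is ignored until it is renamed into place.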

Added a Prometheus alert that fires when a drive is reported as not healthy
for more than 10 minutes.

Added a Grafana dashboard to display the number of healthy and unhealthy
drives reported in Prometheus.

(cherry picked from commit d83ecde)

Add docs for SMART Monitoring

(cherry picked from commit 595429a)

Update doc/source/configuration/monitoring.rst

Fix kayobe command

Co-authored-by: Will Szumski <will@stackhpc.com>
(cherry picked from commit 9a5fc53)

Update doc/source/configuration/monitoring.rst

Fix Spelling

Co-authored-by: Will Szumski <will@stackhpc.com>
(cherry picked from commit ef25d6f)

Add release note

(cherry picked from commit 3d4d011)

Amend docs and add release note

(cherry picked from commit b6cb511)

Move SMART prometheus alert to own file

(cherry picked from commit b353fd3)

Fix typo

(cherry picked from commit 611f2fb)

fixup
Changes the oom-killer graph from a smoothed irate to a discrete delta
function.

Change-Id: I2e4a8576c628610409ade4aad2bd98754bec3860
(cherry picked from commit ef1a449)
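
To illustrate why a discrete delta suits a counter of OOM kills better than a smoothed rate, here is a small Python sketch. It is not the dashboard query and not Prometheus's own `delta` implementation; the function name and the sample values are assumptions for the example. It turns successive counter readings into whole event counts per scrape interval, so a single kill shows up as exactly 1 rather than as a small fractional per-second rate.

```python
def discrete_deltas(samples):
    """Return the increase between each pair of consecutive counter samples.

    Hypothetical helper for illustration only. `samples` is a list of
    readings from a counter that normally only goes up.
    """
    deltas = []
    for previous, current in zip(samples, samples[1:]):
        if current < previous:
            # The counter went backwards, so it was reset (for example by a
            # reboot); everything counted since the reset is new.
            deltas.append(current)
        else:
            deltas.append(current - previous)
    return deltas


# Example readings of a hypothetical OOM kill counter, one per scrape.
print(discrete_deltas([0, 0, 2, 2, 3]))
```
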
OVS bridge interfaces drop packets during normal operation.  Change
the regex to filter out interfaces that don't matter for packet
drops.

(cherry picked from commit 9c3f15a)
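
The idea of excluding interfaces by name can be sketched as follows. The regex used in the actual commit is not shown in this conversation, so the pattern below is purely hypothetical: it assumes, for illustration, that the interfaces to ignore are `ovs-system` and anything whose name starts with `br-`.

```python
import re

# Hypothetical pattern, NOT the regex from the commit: it excludes the
# ovs-system device and any interface named like an OVS bridge (br-*).
IGNORED_INTERFACES = re.compile(r"ovs-system|br-.*")


def monitored_interfaces(names):
    """Return the interface names that should be checked for packet drops."""
    return [name for name in names if not IGNORED_INTERFACES.fullmatch(name)]


print(monitored_interfaces(["eth0", "br-int", "ovs-system", "bond0"]))
```

Using `fullmatch` anchors the pattern to the whole name, so an interface such as `eth0-br-1` would still be monitored.
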
The cloudkitty image was missing our latest backports. Only Ubuntu
images have been rebuilt so far.
Xena: Add note on standard monitoring config usage
This allows us to reference the current release series in the documentation.
This seems to help reno collect the right notes.
Xena: include release notes in docs
markgoddard requested a review from a team as a code owner December 21, 2022 14:34
markgoddard self-assigned this Dec 21, 2022
markgoddard merged commit 3027ffc into stackhpc/yoga Dec 21, 2022
markgoddard deleted the yoga-xena-merge branch December 21, 2022 14:46
8 participants