From d421c851456e599d73676a743624883e7c46627a Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Thu, 1 Dec 2022 11:29:01 -0500 Subject: [PATCH 01/17] Add procedure to disable services on OSP side (#407) (#410) * Add procedure to disable services on OSP side Add a procedure that disables the services provisioned when enabling STF. Resolves: rhbz#2096853 * Add warning to not use procedure with gnocchi Add a warning to not use the disable procedure when making use of the Sending metrics to Gnocchi and Service Telemetry Framework procedure since not all dependencies are provided as part of that instruction set, since they are a super-set of the STF deployment instructions. In the future we should probably just remove the Gnocchi deployment instructions since we've re-written the autoscaling guide and that is the one procedure that would provide Gnocchi deployments, and would contain all the necessary dependencies in the provided THT environment files. * Apply suggestions from code review Small changes from jof * Update doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Update doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Update doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> (cherry picked from commit 20d6f11497c0b27053ac9a96369a225bb4187d37) --- ...mbly_completing-the-stf-configuration.adoc | 1 + ...ling-openstack-services-used-with-stf.adoc | 57 +++++++++++++++++++ 2 files changed, 58 insertions(+) create mode 100644 
doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc index 86271296..79fab438 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc @@ -37,6 +37,7 @@ include::../modules/proc_creating-the-base-configuration-for-stf.adoc[leveloffse include::../modules/proc_configuring-the-stf-connection-for-the-overcloud.adoc[leveloffset=+2] include::../modules/proc_deploying-the-overcloud.adoc[leveloffset=+2] include::../modules/proc_validating-clientside-installation.adoc[leveloffset=+2] +include::../modules/proc_disabling-openstack-services-used-with-stf.adoc[leveloffset=+1] //Sending metrics to Gnocchi and to STF ifdef::include_when_16[] diff --git a/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc b/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc new file mode 100644 index 00000000..374e9864 --- /dev/null +++ b/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc @@ -0,0 +1,57 @@ +[id="disabling-openstack-services-used-with-stf_{context}"] += Disabling {OpenStack} services used with {Project} + +[role="_abstract"] +Disable the services used when deploying {OpenStack} ({OpenStackShort}) and connecting it to {Project} ({ProjectShort}). There is no removal of logs or generated configuration files as part of the disablement of the services. + +[WARNING] +Do not use this procedure when also using the xref:sending-metrics-to-gnocchi-and-to-stf_assembly-completing-the-stf-configuration[] procedure because the `gnocchi-connectors.yaml` does not contain all dependencies required. 
If you want to remove {ProjectShort}-related services on {OpenStackShort}, ensure that you update your environment to enable data collection and data storage dependencies. + +.Procedure + +. Log in to the {OpenStackShort} undercloud as the `stack` user. + +. Source the authentication file: ++ +[source,bash] +---- +[stack@undercloud-0 ~]$ source stackrc + +(undercloud) [stack@undercloud-0 ~]$ +---- + +. Create the `disable-stf.yaml` environment file: ++ +[source,yaml,options="nowrap"] +---- +(undercloud) [stack@undercloud-0]$ cat > $HOME/disable-stf.yaml <_ +--templates /usr/share/openstack-tripleo-heat-templates \ + --environment-file /home/stack/disable-stf.yaml + --environment-file __ \ +---- From 660228df33ddad959355cce5caf756837fce838a Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Thu, 1 Dec 2022 11:32:47 -0500 Subject: [PATCH 02/17] Reference stable-1.5 channel and nightly-1.5 tags (#408) Update the upstream portions of the documentation to refer to the new stable-1.5 channels which are part of the new infrawatch-catalog index image with the nightly-1.5 tag, which references built artifacts created from the stable-1.5 branch of the various STF components. 
--- .../proc_deploying-stf-to-the-openshift-environment.adoc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc b/doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc index 9819e6c1..2448a8eb 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc @@ -46,7 +46,7 @@ metadata: namespace: openshift-marketplace spec: displayName: InfraWatch Operators - image: quay.io/infrawatch-operators/infrawatch-catalog:nightly + image: quay.io/infrawatch-operators/infrawatch-catalog:nightly-1.5 publisher: InfraWatch sourceType: grpc updateStrategy: @@ -254,7 +254,7 @@ metadata: name: smart-gateway-operator namespace: service-telemetry spec: - channel: unstable + channel: stable-1.5 installPlanApproval: Automatic name: smart-gateway-operator source: infrawatch-operators @@ -275,7 +275,7 @@ metadata: name: service-telemetry-operator namespace: service-telemetry spec: - channel: unstable + channel: stable-1.5 installPlanApproval: Automatic name: service-telemetry-operator source: infrawatch-operators From 087551972db763b4b64e5eb6200057dacc8133c8 Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Thu, 1 Dec 2022 15:03:10 -0500 Subject: [PATCH 03/17] Build and push documentation for STF 1.5 (#409) * Build and push documentation for STF 1.5 Update scripts and path references for STF 1.5 so that we can have upstream documentation for this release. Going forward major changes will happen in the main branch in preparation for STF 2.0. It will be good to have a current version of documentation for the stable-1.5 branch which will reference the new release automation changes that provides an upstream nightly build for stable-1.5 in its own index image. 
* Don't attempt to move non-existent file --- build_tools/ci.sh | 15 ++++++--------- doc-Service-Telemetry-Framework/Makefile | 8 ++++---- 2 files changed, 10 insertions(+), 13 deletions(-) diff --git a/build_tools/ci.sh b/build_tools/ci.sh index 44775009..dcf5338e 100755 --- a/build_tools/ci.sh +++ b/build_tools/ci.sh @@ -9,7 +9,7 @@ echo "--- installing dependencies" dnf install findutils git make ruby rubygems -y gem install --no-document --minimal-deps asciidoctor -# get the current working branch, if we're master, we'll end up pushing new docs +# get the current working branch, if we're stable-1.5, we'll end up pushing new docs echo "--- current working branch is $BRANCH" echo "--- building documentation" @@ -37,21 +37,18 @@ rm -rf images/ echo "--- moving built files into the top-level directory" touch .nojekyll mv build/doc-Service-Telemetry-Framework/* ./ -mv index-upstream.html index.html rm -rf build/ -# Add everything, get ready for commit. But only do it if we're on -# master. If you want to deploy on different branches, you can change -# this. -if [[ "$BRANCH" =~ ^master$|^[0-9]+\.[0-9]+\.X$ ]]; then - echo "Branch is master, so pushing docs to gh-pages" +# Build this for stable-1.5 branch and push custom paths to gh-pages +if [[ "$BRANCH" =~ ^stable-1\.5$ ]]; then + echo "Branch is stable-1.5, so pushing docs to gh-pages" git add --all - git commit -am '[ci skip] publishing updated documentation...' + git commit -am '[ci skip] publishing updated documentation for STF 1.5...' 
git remote rm origin git remote add origin https://$GH_NAME:$GH_TOKEN@github.com/infrawatch/documentation.git git push origin gh-pages else - echo "Not on master, so won't push doc" + echo "Not on stable-1.5, so won't push doc" fi diff --git a/doc-Service-Telemetry-Framework/Makefile b/doc-Service-Telemetry-Framework/Makefile index 961db74b..d91623ef 100644 --- a/doc-Service-Telemetry-Framework/Makefile +++ b/doc-Service-Telemetry-Framework/Makefile @@ -3,10 +3,10 @@ BUILD_DIR = ../build ROOTDIR = $(realpath .) NAME = $(notdir $(ROOTDIR)) DEST_DIR = $(BUILD_DIR)/$(NAME) -DEST_HTML = $(DEST_DIR)/index-$(BUILD).html -DEST_HTML_170 = $(DEST_DIR)/index-$(BUILD)-170.html -DEST_HTML_162 = $(DEST_DIR)/index-$(BUILD)-162.html -DEST_HTML_13 = $(DEST_DIR)/index-$(BUILD)-13.html +DEST_HTML = $(DEST_DIR)/index-1-5-$(BUILD).html +DEST_HTML_170 = $(DEST_DIR)/index-1-5-$(BUILD)-170.html +DEST_HTML_162 = $(DEST_DIR)/index-1-5-$(BUILD)-162.html +DEST_HTML_13 = $(DEST_DIR)/index-1-5-$(BUILD)-13.html DEST_PDF = $(BUILD_DIR)/$(NAME)-$(BUILD).pdf IMAGES_DIR = $(DEST_DIR)/images IMAGES_TS = $(DEST_DIR)/.timestamp-images From eff778c83257bcd1b0df5c21c34cc6bd0f54f5e4 Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Fri, 2 Dec 2022 16:23:19 -0500 Subject: [PATCH 04/17] Don't remove existing 'master' generated files (#411) --- build_tools/ci.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/build_tools/ci.sh b/build_tools/ci.sh index dcf5338e..b642fc2b 100755 --- a/build_tools/ci.sh +++ b/build_tools/ci.sh @@ -28,8 +28,8 @@ git config --global user.email "$GH_EMAIL" > /dev/null 2>&1 git config --global user.name "$GH_NAME" > /dev/null 2>&1 # Remove all files that are not in the .git dir -echo "--- removing all files" -find . -maxdepth 1 -not -wholename ".git/*" -type f -delete +echo "--- removing all files related to stable-1.5" +find . 
-maxdepth 1 -not -wholename ".git/*" -type f -not -wholename "./index.html" -not -wholename "./index-upstream*" -delete rm -rf images/ # We need this empty file for git not to try to build a jekyll project. From 5c89edf53936ba5bdd64f766e405887ce32c173e Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Wed, 7 Dec 2022 08:30:32 -0500 Subject: [PATCH 05/17] Minor updates to dashboarding guide (#413) (#415) * Minor updates to dashboarding guide Perform some minor updates to the dashboarding guide, referencing existing dashboards we have for virtual machine and memcached views. Update the ServiceTelemetry manifest to reference the rhel8/grafana:7 container image which should provide more consistency in how things are deployed, helping avoid a situation where newer versions of Grafana out of hub.docker.com no longer interface with the version of Elasticsearch that would be used when enabling events support by default. Update the path to the dashboards being created to reference the 'stf-1/' directory to reduce confusion by no longer referencing stf-1.3 in the links. Depends-On: https://github.com/infrawatch/dashboards/pull/50 * Update doc-Service-Telemetry-Framework/modules/con_dashboards.adoc plural to single * Update doc-Service-Telemetry-Framework/modules/proc_setting-up-grafana-to-host-the-dashboard.adoc * Add note about STF 1.3 and revert path change Revert path changes to the dashboards as use of symlinks as intended is not possible. Instead, add a note stating that the reference to STF 1.3 is the earliest version compatible with the dashboards, and can be used with STF versions 1.3 through 1.5. 
Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> (cherry picked from commit e23a1b360e3981645bebc8048f67d7bae33279ac) --- .../modules/con_dashboards.adoc | 12 +++++++++--- .../modules/proc_importing-dashboards.adoc | 2 ++ ...roc_setting-up-grafana-to-host-the-dashboard.adoc | 3 ++- 3 files changed, 13 insertions(+), 4 deletions(-) diff --git a/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc b/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc index 4d929e1c..3381a761 100644 --- a/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc +++ b/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc @@ -2,13 +2,13 @@ = Dashboards in {Project} [role="_abstract"] -Use the third-party application, Grafana, to visualize system-level metrics that collectd and Ceilometer gathers for each individual host node. +Use the third-party application, Grafana, to visualize system-level metrics that the data collectors collectd and Ceilometer gather for each individual host node. -For more information about configuring collectd, see xref:configuring-red-hat-openstack-platform-overcloud-for-stf_assembly-completing-the-stf-configuration[]. +For more information about configuring data collectors, see xref:configuring-red-hat-openstack-platform-overcloud-for-stf_assembly-completing-the-stf-configuration[]. ifdef::include_when_16[] //TODO: can re-work this once we have OSP13 dashboard(s) to show. Can't use container health checks or monitoring in OSP13. -You can use two dashboards to monitor a cloud: +You can use dashboards to monitor a cloud: Infrastructure dashboard:: Use the infrastructure dashboard to view metrics for a single node at a time. Select a node from the upper left corner of the dashboard. @@ -17,4 +17,10 @@ Cloud view dashboard:: Use the cloud view dashboard to view panels to monitor service resource usage, API stats, and cloud events. 
You must enable API health monitoring and service monitoring to provide the data for this dashboard. API health monitoring is enabled by default in the {ProjectShort} base configuration. For more information, see xref:creating-the-base-configuration-for-stf_assembly-completing-the-stf-configuration[]. ** For more information about API health monitoring, see xref:container-health-and-api-status_assembly-advanced-features[]. ** For more information about {OpenStackShort} service monitoring, see xref:resource-usage-of-openstack-services_assembly-advanced-features[]. + +Virtual machine view dashboard:: +Use the virtual machine view dashboard to view panels to monitor virtual machine infrastructure usage. Select a cloud and project from the upper left corner of the dashboard. + +Memcached view dashboard:: +Use the memcached view dashboard to view panels to monitor connections, availability, system metrics and cache performance. Select a cloud from the upper left corner of the dashboard. endif::include_when_16[] diff --git a/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc b/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc index e13bebf0..d9f7dc29 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc @@ -7,6 +7,8 @@ The Grafana Operator can import and manage dashboards by creating `GrafanaDashbo .Procedure +NOTE: The paths and dashboard names refer to STF 1.3, which is the earliest version of STF that the dashboards are compatible with. You can use the dashboards with STF versions 1.3 through 1.5. + . 
Import the infrastructure dashboard: + [source,bash,options="nowrap"] diff --git a/doc-Service-Telemetry-Framework/modules/proc_setting-up-grafana-to-host-the-dashboard.adoc b/doc-Service-Telemetry-Framework/modules/proc_setting-up-grafana-to-host-the-dashboard.adoc index bf6d58bf..949cccec 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_setting-up-grafana-to-host-the-dashboard.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_setting-up-grafana-to-host-the-dashboard.adoc @@ -45,7 +45,7 @@ NAME DISPLAY VERSION REPLACES grafana-operator.v4.6.0 Grafana Operator 4.6.0 grafana-operator.v4.5.1 Succeeded ---- -. To launch a Grafana instance, create or modify the `ServiceTelemetry` object. Set `graphing.enabled` and `graphing.grafana.ingressEnabled` to `true`: +. To launch a Grafana instance, create or modify the `ServiceTelemetry` object. Set `graphing.enabled` and `graphing.grafana.ingressEnabled` to `true`. Optionally, set the value of `graphing.grafana.baseImage` to the Grafana workload container image that will be deployed: + [source,bash] ---- @@ -60,6 +60,7 @@ spec: enabled: true grafana: ingressEnabled: true + baseImage: 'registry.redhat.io/rhel8/grafana:7' ---- . Verify that the Grafana instance deployed: From fb33ce2832e3da215439aa22bb3d512d92a18e99 Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Wed, 7 Dec 2022 14:35:04 -0500 Subject: [PATCH 06/17] Fix syntax error in certificate renewal module (#416) (#417) * Fix syntax error in certificate renewal module Fix a syntax error in the certificate renewal module. Fixing this results in another issue that was hidden, whereby extra source lines are shown when building for version 17.0 due to loose version ranges. Unfortunately asciidoc doesn't provide an AND function in an ifeval or ifdef so we need separate parameters defined and nested to perform what is effectively a greater-than AND less-than evaluation. 
This was caught by QE when identifying a link that didn't have a corresponding section being built. The syntax error on the endif resulted in everything after that not being built, but did not result in a build error oddly enough. With this fix, everything is working as intended and included assemblies after this one are now visible. * Update ifdef to use AND syntax Update ifdef to use AND syntax per https://docs.asciidoctor.org/asciidoc/latest/directives/ifdef-ifndef/#checking-multiple-attributes (cherry picked from commit 1a6cecf44316f9a944c654b28f78101a75b08370) --- common/global/stf-attributes.adoc | 4 ++++ .../proc_updating-the-amq-interconnect-ca-certificate.adoc | 7 +++---- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/common/global/stf-attributes.adoc b/common/global/stf-attributes.adoc index 5dab06f5..84064f5d 100644 --- a/common/global/stf-attributes.adoc +++ b/common/global/stf-attributes.adoc @@ -16,6 +16,10 @@ ifeval::[{vernum} < 16.0] :include_when_13: endif::[] +ifeval::[{vernum} < 17.0] +:include_before_17: +endif::[] + ifeval::[{vernum} >= 17.0] :include_when_17: endif::[] diff --git a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc index 7d5c4ef6..2ee8529a 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc @@ -41,13 +41,12 @@ ifdef::include_when_13[] [stack@undercloud-0 ~]$ tripleo-ansible-inventory --static-yaml-inventory ./tripleo-ansible-inventory.yaml [stack@undercloud-0 ~]$ ansible -i tripleo-ansible-inventory.yaml allovercloud -m shell -a "sudo podman restart metrics_qdr" ---- -endif:include_when_13[] -ifdef::include_when_16[] +endif::include_when_13[] +ifdef::include_when_16+include_before_17[] ---- [stack@undercloud-0 ~]$ 
ansible -i tripleo-ansible-inventory.yaml allovercloud -m shell -a "sudo podman restart metrics_qdr" ---- -endif::include_when_16[] - +endif::include_when_16+include_before_17[] ifdef::include_when_17[] ---- [stack@undercloud-0 ~]$ ansible -i overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml allovercloud -m shell -a "sudo podman restart metrics_qdr" From bcf1a21990230408d3ae4dd0cfe13c6eb72b8804 Mon Sep 17 00:00:00 2001 From: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Date: Wed, 7 Dec 2022 20:16:01 +0000 Subject: [PATCH 07/17] =?UTF-8?q?Removed=20sending-metrics-to-gnocchi-and-?= =?UTF-8?q?to-stf=20m=E2=80=A6=20(#414)=20(#419)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Removed xrefs and include for sending-metrics-to-gnocchi-and-to-stf module * New folder named departure to hold files before deletion, updated gitignore file * I removed an assembly xref from additional resources section so that build is 100% * Removed upgrade section * Added link at assembly level re disabling services Co-authored-by: Leif Madsen Co-authored-by: Leif Madsen --- .gitignore | 1 + ...mbly_completing-the-stf-configuration.adoc | 9 +-- ...installing-the-core-components-of-stf.adoc | 2 +- doc-Service-Telemetry-Framework/master.adoc | 2 +- ...eating-the-base-configuration-for-stf.adoc | 4 -- ...ling-openstack-services-used-with-stf.adoc | 3 - ...sending-metrics-to-gnocchi-and-to-stf.adoc | 71 ------------------- 7 files changed, 4 insertions(+), 88 deletions(-) delete mode 100644 doc-Service-Telemetry-Framework/modules/proc_sending-metrics-to-gnocchi-and-to-stf.adoc diff --git a/.gitignore b/.gitignore index 567609b1..0a9f24ca 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1,2 @@ build/ +departure/ diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc index 
79fab438..301c42b2 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc @@ -19,9 +19,7 @@ To collect metrics, events, or both, and to send them to the {Project} ({Project ** To deploy data collection and transport to {ProjectShort} on {OpenStackShort} cloud nodes that employ routed L3 domains, such as distributed compute node (DCN) or spine-leaf, see xref:deploying-to-non-standard-network-topologies_assembly-completing-the-stf-configuration[]. -ifdef::include_when_16[] -** To send metrics to both Gnocchi and {ProjectShort}, see xref:sending-metrics-to-gnocchi-and-to-stf_assembly-completing-the-stf-configuration[]. -endif::include_when_16[] +** To disable the data collector services, see xref:disabling-openstack-services-used-with-stf_assembly-completing-the-stf-configuration[]. ifdef::include_when_13[] ** If you synchronized container images to a local registry, you must create an environment file and include the paths to the container images. For more information, see xref:adding-container-images-to-the-undercloud_assembly-completing-the-stf-configuration[]. 
@@ -39,11 +37,6 @@ include::../modules/proc_deploying-the-overcloud.adoc[leveloffset=+2] include::../modules/proc_validating-clientside-installation.adoc[leveloffset=+2] include::../modules/proc_disabling-openstack-services-used-with-stf.adoc[leveloffset=+1] -//Sending metrics to Gnocchi and to STF -ifdef::include_when_16[] -include::../modules/proc_sending-metrics-to-gnocchi-and-to-stf.adoc[leveloffset=+1] -endif::include_when_16[] - // Gather information for deployment in non-standard network topologies in the OSP overcloud include::../modules/proc_deploying-to-non-standard-network-topologies.adoc[leveloffset=+1] diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_installing-the-core-components-of-stf.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_installing-the-core-components-of-stf.adoc index b31756b5..0413d78e 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_installing-the-core-components-of-stf.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_installing-the-core-components-of-stf.adoc @@ -39,7 +39,7 @@ endif::[] .Additional resources * For more information about Operators, see the https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/operators/understanding/olm-what-operators-are.html[_Understanding Operators_] guide. -* For more information about how to remove {ProjectShort} from the {OpenShift} environment, see xref:assembly-removing-stf-from-the-openshift-environment_assembly[]. +//* For more information about how to remove {ProjectShort} from the {OpenShift} environment, see xref:assembly-removing-stf-from-the-openshift-environment_{}[]. 
include::../modules/proc_deploying-stf-to-the-openshift-environment.adoc[leveloffset=+1] include::../modules/proc_creating-a-servicetelemetry-object-in-openshift.adoc[leveloffset=+1] diff --git a/doc-Service-Telemetry-Framework/master.adoc b/doc-Service-Telemetry-Framework/master.adoc index 1a7764d0..e836b7a4 100644 --- a/doc-Service-Telemetry-Framework/master.adoc +++ b/doc-Service-Telemetry-Framework/master.adoc @@ -38,7 +38,7 @@ include::assemblies/assembly_advanced-features.adoc[leveloffset=+1] include::assemblies/assembly_renewing-the-amq-interconnect-certificate.adoc[leveloffset=+1] // upgrading to 1.4 -include::assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-4.adoc[leveloffset=+1] +//include::assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-4.adoc[leveloffset=+1] // removing include::assemblies/assembly_removing-stf-from-the-openshift-environment.adoc[leveloffset=+1] diff --git a/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc b/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc index 79dbe600..c52e383d 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc @@ -13,10 +13,6 @@ To configure the base parameters to provide a compatible data collection and tra [IMPORTANT] ==== Setting `EventPipelinePublishers` and `PipelinePublishers` to empty lists results in no event or metric data passing to {OpenStackShort} telemetry components, such as Gnocchi or Panko. If you need to send data to additional pipelines, the Ceilometer polling interval of 30 seconds, as specified in `ExtraConfig`, might overwhelm the {OpenStackShort} telemetry components, and you must increase the interval to a larger value, such as `300`. 
Increasing the value to a longer polling interval results in less telemetry resolution in {ProjectShort}. - -ifdef::include_when_16[] -To enable collection of telemetry with {ProjectShort} and Gnocchi, see xref:sending-metrics-to-gnocchi-and-to-stf_assembly-completing-the-stf-configuration[] -endif::include_when_16[] ==== + .enable-stf.yaml diff --git a/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc b/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc index 374e9864..5f6edaf1 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc @@ -4,9 +4,6 @@ [role="_abstract"] Disable the services used when deploying {OpenStack} ({OpenStackShort}) and connecting it to {Project} ({ProjectShort}). There is no removal of logs or generated configuration files as part of the disablement of the services. -[WARNING] -Do not use this procedure when also using the xref:sending-metrics-to-gnocchi-and-to-stf_assembly-completing-the-stf-configuration[] procedure because the `gnocchi-connectors.yaml` does not contain all dependencies required. If you want to remove {ProjectShort}-related services on {OpenStackShort}, ensure that you update your environment to enable data collection and data storage dependencies. - .Procedure . Log in to the {OpenStackShort} undercloud as the `stack` user. 
diff --git a/doc-Service-Telemetry-Framework/modules/proc_sending-metrics-to-gnocchi-and-to-stf.adoc b/doc-Service-Telemetry-Framework/modules/proc_sending-metrics-to-gnocchi-and-to-stf.adoc deleted file mode 100644 index 321cb527..00000000 --- a/doc-Service-Telemetry-Framework/modules/proc_sending-metrics-to-gnocchi-and-to-stf.adoc +++ /dev/null @@ -1,71 +0,0 @@ -[id="sending-metrics-to-gnocchi-and-to-stf_{context}"] -= Sending metrics to Gnocchi and {Project} - -[role="_abstract"] - -To send metrics to {Project} ({ProjectShort}) and Gnocchi simultaneously, you must include an environment file in your deployment to enable an additional publisher. - -[WARNING] -If you need to send data to additional pipelines, the Ceilometer polling interval of 30 seconds, as specified in `ExtraConfig`, might overwhelm the {OpenStackShort} telemetry components, and you must increase the interval to a larger value, such as `300`. Increasing the value to a longer polling interval results in less telemetry resolution in {ProjectShort}. - -.Prerequisites - -* You have created a file that contains the connection configuration of the {MessageBus} for the overcloud to {ProjectShort}. For more information, see xref:configuring-the-stf-connection-for-the-overcloud_assembly-completing-the-stf-configuration[]. - -.Procedure - -. Create an environment file named `gnocchi-connectors.yaml` in the `/home/stack` directory. 
-+ -[source,yaml,options="nowrap",subs="none"] ----- -resource_registry: - OS::TripleO::Services::GnocchiApi: /usr/share/openstack-tripleo-heat-templates/deployment/gnocchi/gnocchi-api-container-puppet.yaml - OS::TripleO::Services::GnocchiMetricd: /usr/share/openstack-tripleo-heat-templates/deployment/gnocchi/gnocchi-metricd-container-puppet.yaml - OS::TripleO::Services::GnocchiStatsd: /usr/share/openstack-tripleo-heat-templates/deployment/gnocchi/gnocchi-statsd-container-puppet.yaml - OS::TripleO::Services::AodhApi: /usr/share/openstack-tripleo-heat-templates/deployment/aodh/aodh-api-container-puppet.yaml - OS::TripleO::Services::AodhEvaluator: /usr/share/openstack-tripleo-heat-templates/deployment/aodh/aodh-evaluator-container-puppet.yaml - OS::TripleO::Services::AodhNotifier: /usr/share/openstack-tripleo-heat-templates/deployment/aodh/aodh-notifier-container-puppet.yaml - OS::TripleO::Services::AodhListener: /usr/share/openstack-tripleo-heat-templates/deployment/aodh/aodh-listener-container-puppet.yaml - -parameter_defaults: - CeilometerEnableGnocchi: true - CeilometerEnablePanko: false - GnocchiArchivePolicy: 'high' - GnocchiBackend: 'rbd' - GnocchiRbdPoolName: 'metrics' - - EventPipelinePublishers: ['gnocchi://?filter_project=service'] - PipelinePublishers: ['gnocchi://?filter_project=service'] ----- - -. Add the environment file `gnocchi-connectors.yaml` to the deployment command. Replace __ with files that are applicable to your environment. 
-+ -[source,bash,options="nowrap",subs="+quotes"] ----- -$ openstack overcloud deploy __ ---templates /usr/share/openstack-tripleo-heat-templates \ - --environment-file _<...other_environment_files...>_ \ - --environment-file /usr/share/openstack-tripleo-heat-templates/environments/metrics/ceilometer-write-qdr.yaml \ - --environment-file /usr/share/openstack-tripleo-heat-templates/environments/metrics/collectd-write-qdr.yaml \ - --environment-file /usr/share/openstack-tripleo-heat-templates/environments/metrics/qdr-edge-only.yaml \ - --environment-file /home/stack/enable-stf.yaml \ - --environment-file /home/stack/stf-connectors.yaml \ - --environment-file /home/stack/gnocchi-connectors.yaml ----- - -. To ensure that the configuration was successful, verify the content of the file `/var/lib/config-data/puppet-generated/ceilometer/etc/ceilometer/pipeline.yaml` on a Controller node. Ensure that the `publishers` section of the file contains information for both `notifier` and `Gnocchi`. -+ -[source,yaml,options="nowrap"] ----- -sources: - - name: meter_source - meters: - - "*" - sinks: - - meter_sink -sinks: - - name: meter_sink - publishers: - - gnocchi://?filter_project=service - - notifier://172.17.1.35:5666/?driver=amqp&topic=metering ----- From e5dcfab39c51804a5177eb03c12ad8b0704e5880 Mon Sep 17 00:00:00 2001 From: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Date: Mon, 12 Dec 2022 13:51:58 +0000 Subject: [PATCH 08/17] Jof mas minor edits 1.5 (#421) (#422) * changes to Primary parameters of the ServiceTelemetry object * Minor edits mostly reducing future tense * Apply suggestions from code review Routers -> dispatch routers --- ...rimary-parameters-of-the-servicetelemetry-object.adoc | 9 ++------- ...g-for-an-expired-amq-interconnect-ca-certificate.adoc | 7 ++++--- ...roc_updating-the-amq-interconnect-ca-certificate.adoc | 9 +++++---- 3 files changed, 11 insertions(+), 14 deletions(-) diff --git 
a/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc b/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc index bdae3db3..fcf223c1 100644 --- a/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc +++ b/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc @@ -14,11 +14,6 @@ The `ServiceTelemetry` object comprises the following primary configuration para You can configure each of these configuration parameters to provide different features in an {ProjectShort} deployment. -[IMPORTANT] -==== -Support for `servicetelemetry.infra.watch/v1alpha1` was removed from {ProjectShort} 1.3. -==== - [id="backends_{context}"] [discrete] == The backends parameter @@ -129,7 +124,7 @@ Use the `pvcStorageRequest` parameter to define the minimum required volume size .Procedure -* List the available storage classes: +. List the available storage classes: + [source,bash,options="nowrap"] ---- @@ -140,7 +135,7 @@ standard (default) kubernetes.io/cinder Delete WaitForFirstCons standard-csi cinder.csi.openstack.org Delete WaitForFirstConsumer true 20h ---- -* Configure the `ServiceTelemetry` object: +. 
Configure the `ServiceTelemetry` object: + [source,yaml] ---- diff --git a/doc-Service-Telemetry-Framework/modules/proc_checking-for-an-expired-amq-interconnect-ca-certificate.adoc b/doc-Service-Telemetry-Framework/modules/proc_checking-for-an-expired-amq-interconnect-ca-certificate.adoc index c11bd787..f8e9c1ed 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_checking-for-an-expired-amq-interconnect-ca-certificate.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_checking-for-an-expired-amq-interconnect-ca-certificate.adoc @@ -2,7 +2,7 @@ = Checking for an expired {MessageBus} CA certificate [role="_abstract"] -When the CA certificate expires the {MessageBus} connections will remain up, but will be unable to reconnect if they are interrupted. Eventually you will find that some or all of the connections from your {Openstack} ({OpenStackShort}) Routers have failed, showing errors on both sides, and the expiry (or "Not After") field in your CA certificate will be in the past. +When the CA certificate expires, the {MessageBus} connections remain up, but cannot reconnect if they are interrupted. Eventually, some or all of the connections from your {Openstack} ({OpenStackShort}) dispatch routers fail, showing errors on both sides, and the expiry or *Not After* field in your CA certificate is in the past. .Procedure @@ -14,7 +14,7 @@ When the CA certificate expires the {MessageBus} connections will remain up, but $ oc project service-telemetry ---- -. Check that some or all Router connections have failed: +. Verify that some or all dispatch router connections have failed: + [source,bash,options="nowrap"] ---- @@ -32,7 +32,8 @@ $ oc logs -l application=default-interconnect | tail ---- . Log into your {OpenStackShort} undercloud. -. Check for this error in the {OpenStackShort}-hosted {MessageBus} logs of a node where the connection has failed: + +. 
Check for this error in the {OpenStackShort}-hosted {MessageBus} logs of a node with a failed connection: + [source,bash,options="nowrap"] ---- diff --git a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc index 2ee8529a..22cb96c5 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc @@ -2,7 +2,7 @@ = Updating the {MessageBus} CA certificate [role="_abstract"] -To update the {MessageBus} certificate, you will need to export it from {OpenShift} and copy it to your {OpenStack} ({OpenStackShort}) nodes. +To update the {MessageBus} certificate, you must export it from {OpenShift} and copy it to your {OpenStack} ({OpenStackShort}) nodes. .Procedure @@ -21,13 +21,14 @@ $ oc project service-telemetry $ oc get secret/default-interconnect-selfsigned -o jsonpath='{.data.ca\.crt}' | base64 -d > STFCA.pem ---- -. Copy STFCA.pem to your {OpenStackShort} undercloud. +. Copy `STFCA.pem` to your {OpenStackShort} undercloud. . Log into your {OpenStackShort} undercloud. -. Edit the stf-connectors.yaml file to contain the new caCertFileContent (For more information, see xref:configuring-the-stf-connection-for-the-overcloud_assembly-completing-the-stf-configuration[]) +. Edit the `stf-connectors.yaml` file to contain the new caCertFileContent. For more information, see xref:configuring-the-stf-connection-for-the-overcloud_assembly-completing-the-stf-configuration[]. ++ [NOTE] You do not need to perform an overcloud deploy after performing the steps below. We edit the stf-connectors.yaml file only to make sure that future deployments will not overwrite the new CA certificate. -. Copy the STFCA.pem file to each {OpenStackShort} overcloud node: +. 
Copy the `STFCA.pem` file to each {OpenStackShort} overcloud node: + [source,bash,options="nowrap"] ---- From b8699020fd613cd7bcbad4c9de3661d97a627574 Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Thu, 15 Dec 2022 09:00:25 -0500 Subject: [PATCH 09/17] Update link to STF life cycle page (#423) (#424) Update the link to the STF life cycle page and remove reference to supporting the two most recent versions of STF. This has recently changed. STF 1.4 will be supported until the EOL of OpenShift 4.8 at which point STF 1.4 will also be EOL. In the meantime STF 1.4 is now in maintenance mode (CVE fixes only, no backport of features). STF 1.5 is supported as of 4.10 and will EOL with RHOSP 17.1. (cherry picked from commit 2e3fd4b819e2877468c1a2501c6a620d1691caa7) --- .../modules/con_support-for-stf.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc-Service-Telemetry-Framework/modules/con_support-for-stf.adoc b/doc-Service-Telemetry-Framework/modules/con_support-for-stf.adoc index c32b43e9..bfb3ebbc 100644 --- a/doc-Service-Telemetry-Framework/modules/con_support-for-stf.adoc +++ b/doc-Service-Telemetry-Framework/modules/con_support-for-stf.adoc @@ -2,8 +2,8 @@ = Support for {Project} [role="_abstract"] -Red Hat supports the two most recent versions of {Project} ({ProjectShort}). Earlier versions are not supported. For more information, see the https://access.redhat.com/articles/5662081[{Project} Supported Version Matrix]. - Red Hat supports the core Operators and workloads, including {MessageBus}, Service Telemetry Operator, and Smart Gateway Operator. Red Hat does not support the community Operators or workload components, such as Elasticsearch, Prometheus, Alertmanager, Grafana, and their Operators. You can only deploy {ProjectShort} in a fully connected network environment. You cannot deploy {ProjectShort} in {OpenShift}-disconnected environments or network proxy environments. 
+ +For more information about {ProjectShort} life cycle and support status, see the https://access.redhat.com/node/6225361[{Project} Supported Version Matrix]. From 9a1654a43d5709aedb127e9e6cf76299834511cf Mon Sep 17 00:00:00 2001 From: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Date: Fri, 13 Jan 2023 14:28:03 +0000 Subject: [PATCH 10/17] updated link (#429) --- .../assemblies/assembly_introduction-to-stf.adoc | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_introduction-to-stf.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_introduction-to-stf.adoc index 298a7fc7..006a62a5 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_introduction-to-stf.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_introduction-to-stf.adoc @@ -37,10 +37,9 @@ endif::[] .Additional resources -* For more information about how to deploy {OpenShift}, see the https://access.redhat.com/documentation/en-us/openshift_container_platform/{NextSupportedOpenShiftVersion}/[{OpenShift} product documentation]. -* You can install {OpenShift} on cloud platforms or on bare metal. For more information about {ProjectShort} performance and scaling, see https://access.redhat.com/articles/4907241. - -* You can install {OpenShift} on bare metal or other supported cloud platforms. For more information about installing {OpenShift}, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/welcome/index.html#cluster-installer-activities[OpenShift Container Platform {NextSupportedOpenShiftVersion} Documentation]. 
+* https://access.redhat.com/documentation/en-us/openshift_container_platform/{NextSupportedOpenShiftVersion}/[{OpenShift} product documentation] +* https://access.redhat.com/articles/4907241[Service Telemetry Framework Performance and Scaling] +* https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/welcome/index.html#cluster-installer-activities[OpenShift Container Platform {NextSupportedOpenShiftVersion} Documentation] From 0e75679441a93f0e11ff9e1fbc0d09b360742cb8 Mon Sep 17 00:00:00 2001 From: Chris Sibbitt Date: Thu, 19 Jan 2023 16:22:12 -0500 Subject: [PATCH 11/17] Eliminate mentions of sensubility in OSP13 (#431) (#435) (cherry picked from commit 251a2c20c281bdb98518eba54f03453d7051731b) --- ...-parameters-of-the-servicetelemetry-object.adoc | 14 ++++++++++++++ .../proc_configuring-observability-strategy.adoc | 2 ++ .../proc_configuring-openshift-monitoring.adoc | 14 ++++++++++++++ ...uring-the-stf-connection-for-the-overcloud.adoc | 5 +++++ ...ing-a-servicetelemetry-object-in-openshift.adoc | 4 ++++ 5 files changed, 39 insertions(+) diff --git a/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc b/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc index fcf223c1..f5df5cb0 100644 --- a/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc +++ b/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc @@ -163,9 +163,16 @@ spec: Use the `clouds` parameter to define which Smart Gateway objects deploy, thereby providing the interface for multiple monitored cloud environments to connect to an instance of {ProjectShort}. If a supporting back end is available, then metrics and events Smart Gateways for the default cloud configuration are created. By default, the Service Telemetry Operator creates Smart Gateways for `cloud1`. 
+ifndef::include_when_13[] You can create a list of cloud objects to control which Smart Gateways are created for the defined clouds. Each cloud consists of data types and collectors. Data types are `metrics` or `events`. Each data type consists of a list of collectors, the message bus subscription address, and a parameter to enable debugging. Available collectors for metrics are `collectd`, `ceilometer`, and `sensubility`. Available collectors for events are `collectd` and `ceilometer`. Ensure that the subscription address for each of these collectors is unique for every cloud, data type, and collector combination. The default `cloud1` configuration is represented by the following `ServiceTelemetry` object, which provides subscriptions and data storage of metrics and events for collectd, Ceilometer, and Sensubility data collectors for a particular cloud instance: +endif::[] +ifdef::include_when_13[] +You can create a list of cloud objects to control which Smart Gateways are created for the defined clouds. Each cloud consists of data types and collectors. Data types are `metrics` or `events`. Each data type consists of a list of collectors, the message bus subscription address, and a parameter to enable debugging. Available collectors are `collectd` and `ceilometer`. Ensure that the subscription address for each of these collectors is unique for every cloud, data type, and collector combination.
+ +The default `cloud1` configuration is represented by the following `ServiceTelemetry` object, which provides subscriptions and data storage of metrics and events for collectd and Ceilometer data collectors for a particular cloud instance: +endif::[] [source,yaml] ---- @@ -183,9 +190,11 @@ spec: subscriptionAddress: collectd/telemetry - collectorType: ceilometer subscriptionAddress: anycast/ceilometer/metering.sample +ifndef::include_when_13[] - collectorType: sensubility subscriptionAddress: sensubility/telemetry debugEnabled: false +endif::[] events: collectors: - collectorType: collectd @@ -194,7 +203,12 @@ spec: subscriptionAddress: anycast/ceilometer/event.sample ---- +ifndef::include_when_13[] Each item of the `clouds` parameter represents a cloud instance. A cloud instance consists of three top-level parameters: `name`, `metrics`, and `events`. The `metrics` and `events` parameters represent the corresponding back end for storage of that data type. The `collectors` parameter specifies a list of objects made up of two required parameters, `collectorType` and `subscriptionAddress`, and these represent an instance of the Smart Gateway. The `collectorType` parameter specifies data collected by either collectd, Ceilometer, or Sensubility. The `subscriptionAddress` parameter provides the {MessageBus} address to which a Smart Gateway subscribes. +endif::[] +ifdef::include_when_13[] +Each item of the `clouds` parameter represents a cloud instance. A cloud instance consists of three top-level parameters: `name`, `metrics`, and `events`. The `metrics` and `events` parameters represent the corresponding back end for storage of that data type. The `collectors` parameter specifies a list of objects made up of two required parameters, `collectorType` and `subscriptionAddress`, and these represent an instance of the Smart Gateway. The `collectorType` parameter specifies data collected by either collectd or Ceilometer.
The `subscriptionAddress` parameter provides the {MessageBus} address to which a Smart Gateway subscribes. +endif::[] You can use the optional Boolean parameter `debugEnabled` within the `collectors` parameter to enable additional console debugging in the running Smart Gateway pod. diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc index bb781061..ebf9536e 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc @@ -31,7 +31,9 @@ $ oc get pods NAME READY STATUS RESTARTS AGE default-cloud1-ceil-meter-smartgateway-59c845d65b-gzhcs 3/3 Running 0 132m default-cloud1-coll-meter-smartgateway-75bbd948b9-d5phm 3/3 Running 0 132m +ifndef::include_when_13[] default-cloud1-sens-meter-smartgateway-7fdbb57b6d-dh2g9 3/3 Running 0 132m +endif::[] default-interconnect-668d5bbcd6-57b2l 1/1 Running 0 132m interconnect-operator-b8f5bb647-tlp5t 1/1 Running 0 47h service-telemetry-operator-566b9dd695-wkvjq 1/1 Running 0 156m diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-openshift-monitoring.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-openshift-monitoring.adoc index aac00d81..a5090250 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-openshift-monitoring.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-openshift-monitoring.adoc @@ -25,7 +25,13 @@ metadata: + [source,bash,options="nowrap"] ---- +ifndef::include_when_13[] $ for collector_type in ceil coll sens; do oc apply -f <(sed -e "s/<>/${collector_type}/g" << EOF +endif::[] +ifdef::include_when_13[] +$ for collector_type in ceil coll; do oc apply -f <(sed -e "s/<>/${collector_type}/g" << EOF +endif::[] + apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: @@ -64,7 +70,9 @@ 
EOF ); done servicemonitor.monitoring.coreos.com/default-cloud1-ceil-meter configured servicemonitor.monitoring.coreos.com/default-cloud1-coll-meter configured +ifndef::include_when_13[] servicemonitor.monitoring.coreos.com/default-cloud1-sens-meter configured +endif::[] ---- . To verify the successful configuration of openshift-monitoring, ensure that Smart Gateway metrics appear in Prometheus. . Retrieve the route for the openshift-monitoring prometheus: @@ -81,14 +89,18 @@ prometheus-k8s prometheus-k8s-openshift-monitoring.apps.infra.watch p . Verify that the following targets are visible under the `Status -> Targets` tab: ** service-telemetry/default-cloud1-ceil-meter/0 ** service-telemetry/default-cloud1-coll-meter/0 +ifndef::include_when_13[] ** service-telemetry/default-cloud1-sens-meter/0 +endif::[] + If there are problems with the configuration, find them on this page. . Issue the following queries on the `Graph` tab: ** `sg_total_collectd_metric_decode_count` ** `sg_total_ceilometer_metric_decode_count` +ifndef::include_when_13[] ** `sg_total_sensubility_metric_decode_count` +endif::[] . 
There should be one result from each Smart Gateway, as shown in the following example: + @@ -97,4 +109,6 @@ If the values returned are 0, it means that {ProjectShort} is not receiving that + ** `sg_total_collectd_metric_decode_count{container="sg-core", endpoint="prom-https", service="default-cloud1-coll-meter", source="SG"}` ** `sg_total_ceilometer_metric_decode_count{container="sg-core", endpoint="prom-https", service="default-cloud1-ceil-meter", source="SG"}` +ifndef::include_when_13[] ** `sg_total_sensubility_metric_decode_count{container="sg-core", endpoint="prom-https", service="default-cloud1-sens-meter", source="SG"}` +endif::[] diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-the-stf-connection-for-the-overcloud.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-the-stf-connection-for-the-overcloud.adoc index af45cf7c..c591538d 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-the-stf-connection-for-the-overcloud.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-the-stf-connection-for-the-overcloud.adoc @@ -18,7 +18,12 @@ endif::include_when_13,include_when_17[] . Create a configuration file called `stf-connectors.yaml` in the `/home/stack` directory. +ifndef::include_when_13[] . In the `stf-connectors.yaml` file, configure the `MetricsQdrConnectors` address to connect the {MessageBus} on the overcloud to the {ProjectShort} deployment. You configure the topic addresses for Sensubility, Ceilometer, and collectd in this file to match the defaults in {ProjectShort}. For more information about customizing topics and cloud configuration, see xref:configuring-multiple-clouds_assembly-completing-the-stf-configuration[]. +endif::[] +ifdef::include_when_13[] +. In the `stf-connectors.yaml` file, configure the `MetricsQdrConnectors` address to connect the {MessageBus} on the overcloud to the {ProjectShort} deployment. 
You configure the topic addresses for Ceilometer and collectd in this file to match the defaults in {ProjectShort}. For more information about customizing topics and cloud configuration, see xref:configuring-multiple-clouds_assembly-completing-the-stf-configuration[]. +endif::[] * The `resource_registry` configuration directly loads the collectd service because you do not include the `collectd-write-qdr.yaml` environment file for multiple cloud deployments. * Replace the `host` parameter with the value of `HOST/PORT` that you retrieved in xref:retrieving-the-qdr-route-address_assembly-completing-the-stf-configuration[]. diff --git a/doc-Service-Telemetry-Framework/modules/proc_creating-a-servicetelemetry-object-in-openshift.adoc b/doc-Service-Telemetry-Framework/modules/proc_creating-a-servicetelemetry-object-in-openshift.adoc index b125ac7f..46e7eb6e 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_creating-a-servicetelemetry-object-in-openshift.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_creating-a-servicetelemetry-object-in-openshift.adoc @@ -102,9 +102,11 @@ spec: - collectorType: ceilometer debugEnabled: false subscriptionAddress: anycast/ceilometer/cloud1-metering.sample +ifndef::include_when_13[] - collectorType: sensubility debugEnabled: false subscriptionAddress: sensubility/cloud1-telemetry +endif::[] name: cloud1 graphing: enabled: false @@ -154,7 +156,9 @@ NAME READY STATUS REST alertmanager-default-0 2/2 Running 0 17m default-cloud1-ceil-meter-smartgateway-6484b98b68-vd48z 2/2 Running 0 17m default-cloud1-coll-meter-smartgateway-799f687658-4gxpn 2/2 Running 0 17m +ifndef::include_when_13[] default-cloud1-sens-meter-smartgateway-c7f4f7fc8-c57b4 2/2 Running 0 17m +endif::[] default-interconnect-54658f5d4-pzrpt 1/1 Running 0 17m elastic-operator-66b7bc49c4-sxkc2 1/1 Running 0 52m interconnect-operator-69df6b9cb6-7hhp9 1/1 Running 0 50m From 10d11f031ed72bc21f735da19abde072abd07ff2 Mon Sep 17 00:00:00 2001 From: Chris Sibbitt Date: 
Thu, 19 Jan 2023 16:22:28 -0500 Subject: [PATCH 12/17] Updated path to match PR#52 in dashboard repo (#427) (#432) (cherry picked from commit eb0a98da238fef61062032975f32ca150ea3c1ed) --- .../modules/proc_importing-dashboards.adoc | 26 +++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc b/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc index d9f7dc29..17d316ab 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc @@ -13,9 +13,9 @@ NOTE: The paths and dashboards names refer to STF 1.3 which is the earliest vers + [source,bash,options="nowrap"] ---- -$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1.3/rhos-dashboard.yaml +$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1/rhos-dashboard.yaml -grafanadashboard.integreatly.org/rhos-dashboard-1.3 created +grafanadashboard.integreatly.org/rhos-dashboard-1 created ---- . Import the cloud dashboard: + @@ -24,15 +24,15 @@ For some panels in the cloud dashboard, you must set the value of the collectd ` + [source,bash,options="nowrap"] ---- -$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1.3/rhos-cloud-dashboard.yaml +$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1/rhos-cloud-dashboard.yaml -grafanadashboard.integreatly.org/rhos-cloud-dashboard-1.3 created +grafanadashboard.integreatly.org/rhos-cloud-dashboard-1 created ---- . 
Import the cloud events dashboard: + [source,bash,options="nowrap"] ---- -$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1.3/rhos-cloudevents-dashboard.yaml +$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1/rhos-cloudevents-dashboard.yaml grafanadashboard.integreatly.org/rhos-cloudevents-dashboard created ---- @@ -40,17 +40,17 @@ grafanadashboard.integreatly.org/rhos-cloudevents-dashboard created + [source,bash,options="nowrap"] ---- -$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1.3/virtual-machine-view.yaml +$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1/virtual-machine-view.yaml -grafanadashboard.integreatly.org/virtual-machine-view-1.3 configured +grafanadashboard.integreatly.org/virtual-machine-view-1 configured ---- . Import the memcached dashboard: + [source,bash,options="nowrap"] ---- -$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1.3/memcached-dashboard.yaml +$ oc apply -f https://raw.githubusercontent.com/infrawatch/dashboards/master/deploy/stf-1/memcached-dashboard.yaml -grafanadashboard.integreatly.org/memcached-dashboard-1.3 created +grafanadashboard.integreatly.org/memcached-dashboard-1 created ---- . Verify that the dashboards are available: @@ -60,11 +60,11 @@ grafanadashboard.integreatly.org/memcached-dashboard-1.3 created $ oc get grafanadashboards NAME AGE -memcached-dashboard-1.3 115s -rhos-cloud-dashboard-1.3 2m12s +memcached-dashboard-1 115s +rhos-cloud-dashboard-1 2m12s rhos-cloudevents-dashboard 2m6s -rhos-dashboard-1.3 2m17s -virtual-machine-view-1.3 2m +rhos-dashboard-1 2m17s +virtual-machine-view-1 2m ---- . 
Retrieve the Grafana route address: From 1be92a8d89e1f6c69b129017da38cf123a1210d0 Mon Sep 17 00:00:00 2001 From: Chris Sibbitt Date: Thu, 19 Jan 2023 16:22:57 -0500 Subject: [PATCH 13/17] Fixed alertmanager verification command (#430) (#434) (cherry picked from commit f7200112a0d0b801e6faab27c07aa31d763c42c6) --- ...lert-route-with-templating-in-alertmanager.adoc | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/doc-Service-Telemetry-Framework/modules/proc_creating-an-alert-route-with-templating-in-alertmanager.adoc b/doc-Service-Telemetry-Framework/modules/proc_creating-an-alert-route-with-templating-in-alertmanager.adoc index 19303fe6..8a7fcca1 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_creating-an-alert-route-with-templating-in-alertmanager.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_creating-an-alert-route-with-templating-in-alertmanager.adoc @@ -94,27 +94,17 @@ route: ---- -. Run the `curl` command against the `alertmanager-proxy` service to retrieve the status and `configYAML` contents, and verify that the supplied configuration matches the configuration in Alertmanager: +. 
Run the `wget` command from the prometheus pod against the `alertmanager-proxy` service to retrieve the status and `configYAML` contents, and verify that the supplied configuration matches the configuration in Alertmanager: + [source,bash,options="nowrap"] ---- -$ oc run curl -it --serviceaccount=prometheus-k8s --restart='Never' --image=radial/busyboxplus:curl -- sh -c "curl -k -H \"Content-Type: application/json\" -H \"Authorization: Bearer \$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)\" https://default-alertmanager-proxy:9095/api/v1/status" +$ oc exec -it prometheus-default-0 -c prometheus -- /bin/sh -c "wget --header \"Authorization: Bearer \$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)\" https://default-alertmanager-proxy:9095/api/v1/status -q -O -" {"status":"success","data":{"configYAML":"...",...}} ---- . Verify that the `configYAML` field contains the changes you expect. - -. To clean up the environment, delete the `curl` pod: -+ -[source,bash] ----- -$ oc delete pod curl - -pod "curl" deleted ----- - .Additional resources * For more information about the {OpenShift} secret and the Prometheus operator, see https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/alerting.md[Prometheus user guide on alerting]. 
From b9665045c835b3d412b60c2dce68a9a78ea7bdd5 Mon Sep 17 00:00:00 2001 From: mickogeary Date: Fri, 20 Jan 2023 13:56:47 +0000 Subject: [PATCH 14/17] mg_master_2161659_minor-style-edit changed note text and position (#437) (#438) Co-authored-by: Michael Geary Co-authored-by: Michael Geary --- .../proc_updating-the-amq-interconnect-ca-certificate.adoc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc index 22cb96c5..c395c8dc 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc @@ -24,9 +24,6 @@ $ oc get secret/default-interconnect-selfsigned -o jsonpath='{.data.ca\.crt}' | . Copy `STFCA.pem` to your {OpenStackShort} undercloud. . Log into your {OpenStackShort} undercloud. . Edit the `stf-connectors.yaml` file to contain the new caCertFileContent. For more information, see xref:configuring-the-stf-connection-for-the-overcloud_assembly-completing-the-stf-configuration[]. -+ -[NOTE] -You do not need to perform an overcloud deploy after performing the steps below. We edit the stf-connectors.yaml file only to make sure that future deployments will not overwrite the new CA certificate. . Copy the `STFCA.pem` file to each {OpenStackShort} overcloud node: + @@ -53,3 +50,6 @@ ifdef::include_when_17[] [stack@undercloud-0 ~]$ ansible -i overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml allovercloud -m shell -a "sudo podman restart metrics_qdr" ---- endif::include_when_17[] ++ +[NOTE] +You do not need to deploy the overcloud after you copy the `STFCA.pem` file and restart the `metrics_qdr` container. You edit the `stf-connectors.yaml` file so that future deployments do not overwrite the new CA certificate. 
From 711eb3b751b4b4f5937ae7b5568b04faf06ae79b Mon Sep 17 00:00:00 2001 From: Leif Madsen Date: Thu, 9 Feb 2023 15:26:10 -0500 Subject: [PATCH 15/17] Bump base image for building to Fedora 37 (#445) (#446) * Bump base image for building to Fedora 37 Bump the base image for building in CI to Fedora 37 because Fedora 33 is now EOL and has been removed from the quay repository. * Set /docs directory to being marked safe * Bump actions/checkout v2 (deprecated) to v3 --- .github/workflows/main.yml | 4 ++-- build_tools/ci.sh | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 70e223db..d5f440ad 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -6,7 +6,7 @@ jobs: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v2 + - uses: actions/checkout@v3 - name: Test and Publish env: @@ -14,4 +14,4 @@ jobs: GH_EMAIL: ${{ secrets.GH_EMAIL }} GH_TOKEN: ${{ secrets.GH_TOKEN }} run: | - docker run -eBRANCH="${GITHUB_REF##*/}" -eGH_NAME -eGH_EMAIL -eGH_TOKEN --network host -i --volume $PWD:/docs:z --workdir /docs quay.io/fedora/fedora:33-x86_64 /bin/bash -c './build_tools/ci.sh' + docker run -eBRANCH="${GITHUB_REF##*/}" -eGH_NAME -eGH_EMAIL -eGH_TOKEN --network host -i --volume $PWD:/docs:z --workdir /docs quay.io/fedora/fedora:37-x86_64 /bin/bash -c './build_tools/ci.sh' diff --git a/build_tools/ci.sh b/build_tools/ci.sh index b642fc2b..52e0f0fd 100755 --- a/build_tools/ci.sh +++ b/build_tools/ci.sh @@ -17,6 +17,7 @@ make clean html # Checkout our gh-pages branch, remove everything but .git echo "--- switching to gh-pages" +git config --global --add safe.directory /docs git fetch --all git checkout gh-pages git pull origin gh-pages From a70b357dcb68c2e4c34f2a427c1cbd281c307672 Mon Sep 17 00:00:00 2001 From: mickogeary Date: Fri, 17 Feb 2023 14:32:28 +0000 Subject: [PATCH 16/17] =?UTF-8?q?mg=5Fmaster=5F2168184=5Fadding=20section?= 
=?UTF-8?q?=20with=20procedures=20for=20upgrade=20from=201.4=E2=80=A6=20(#?= =?UTF-8?q?444)=20(#448)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * mg_master_2168184_adding section with procedures for upgrade from 1.4 to 1.5 * Bump base image for building to Fedora 37 (#445) * Bump base image for building to Fedora 37 Bump the base image for building in CI to Fedora 37 because Fedora 33 is now EOL and has been removed from the quay repository. * Set /docs directory to being marked safe * Bump actions/checkout v2 (deprecated) to v3 * Update doc-Service-Telemetry-Framework/assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc * commit 2, incorporating feedback mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 * commit 3, fixed internal link to Grafana section mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 * mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 commit Feb 14th * Add ifdef wrappers for certificate parts (#447) * Add ifdef wrappers for certificate documentation parts which do not apply to OSP16 (only 13 and 17). * Add a couple of clean up items for consistency and visual bits. 
* mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 commit Feb 17 Rebased --------- Co-authored-by: Leif Madsen --- ...ce-telemetry-framework-to-version-1-5.adoc | 58 +++++++ doc-Service-Telemetry-Framework/master.adoc | 6 +- ...ice-telemetry-framework-1-5-operators.adoc | 163 ++++++++++++++++++ ...-the-amq-certificate-manager-operator.adoc | 59 +++++++ .../proc_removing-the-grafana-operator.adoc | 59 +++++++ ...ice-telemetry-framework-1-4-operators.adoc | 24 +++ ...moving-the-service-telemetry-operator.adoc | 67 +++++++ ...c_removing-the-smart-gateway-operator.adoc | 69 ++++++++ ...ificate-on-red-hat-openstack-platform.adoc | 20 +++ ...-openshift-container-platform-to-4-10.adoc | 32 ++++ 10 files changed, 554 insertions(+), 3 deletions(-) create mode 100644 doc-Service-Telemetry-Framework/assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_installing-the-service-telemetry-framework-1-5-operators.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_removing-the-amq-certificate-manager-operator.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_removing-the-grafana-operator.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_removing-the-service-telemetry-framework-1-4-operators.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_removing-the-service-telemetry-operator.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_removing-the-smart-gateway-operator.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate-on-red-hat-openstack-platform.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_upgrading-red-hat-openshift-container-platform-to-4-10.adoc diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc 
b/doc-Service-Telemetry-Framework/assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc new file mode 100644 index 00000000..0ad4b5e0 --- /dev/null +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc @@ -0,0 +1,58 @@ +ifdef::context[:parent-context-of-upgrading-service-telemetry-framework-to-version-1-5: {context}] + +//// +file name: assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc +ID: [id="assembly_upgrading-service-telemetry-framework-to-version-1-5_{context}"] +Title: = Upgrading Service Telemetry Framework to version 1.5 +//// + +:_content-type: ASSEMBLY + +ifndef::context[] +[id="upgrading-service-telemetry-framework-to-version-1-5"] +endif::[] +ifdef::context[] +[id="upgrading-service-telemetry-framework-to-version-1-5_{context}"] +endif::[] + += Upgrading {Project} to version 1.5 + +:context: upgrading-service-telemetry-framework-to-version-1-5 + +To upgrade {Project} ({ProjectShort}) 1.4 to {ProjectShort} 1.5, you must complete the following steps: + +* Replace AMQ Certificate Manager with Certificate Manager. +* Remove the `ClusterServiceVersion` and `Subscription` objects for Smart Gateway Operator and Service Telemetry Operator in the `service-telemetry` namespace on your {OpenShift} environment. + +* Upgrade {OpenShift} from 4.8 to 4.10. + +* Re-enable the operators that you removed. +ifdef::include_when_13,include_when_17[* Update the {MessageBus} CA Certificate on {OpenStack} ({OpenStackShort}).] + +.Prerequisites + +* You have backed up your data. There is an outage during the {OpenShift} upgrade. You cannot reconfigure the `ServiceTelemetry` and `SmartGateway` objects during the Operators replacement. +* You have prepared your environment for upgrade from {OpenShift} 4.8 to the supported version, 4.10. +* The {OpenShift} cluster is fully connected. {ProjectShort} does not support disconnected or restricted-network clusters.
+ +include::../modules/proc_removing-the-service-telemetry-framework-1-4-operators.adoc[leveloffset=+1] + +// This is "sub-procedure" of proc_removing-the-service-telemetry-framework-v1-4-operators.adoc +include::../modules/proc_removing-the-service-telemetry-operator.adoc[leveloffset=+2] +// This is "sub-procedure" of proc_removing-the-service-telemetry-framework-v1-4-operators.adoc +include::../modules/proc_removing-the-smart-gateway-operator.adoc[leveloffset=+2] +// This is "sub-procedure" of proc_removing-the-service-telemetry-framework-v1-4-operators.adoc +include::../modules/proc_removing-the-amq-certificate-manager-operator.adoc[leveloffset=+2] +// This is "sub-procedure" of proc_removing-the-service-telemetry-framework-v1-4-operators.adoc +include::../modules/proc_removing-the-grafana-operator.adoc[leveloffset=+2] + +include::../modules/proc_upgrading-red-hat-openshift-container-platform-to-4-10.adoc[leveloffset=+1] + +include::../modules/proc_installing-the-service-telemetry-framework-1-5-operators.adoc[leveloffset=+1] + +ifdef::include_when_13,include_when_17[] +include::../modules/proc_updating-the-amq-interconnect-ca-certificate-on-red-hat-openstack-platform.adoc[leveloffset=+1] +endif::include_when_13,include_when_17[] + +ifdef::parent-context-of-upgrading-service-telemetry-framework-to-version-1-5[:context: {parent-context-of-upgrading-service-telemetry-framework-to-version-1-5}] +ifndef::parent-context-of-upgrading-service-telemetry-framework-to-version-1-5[:!context:] diff --git a/doc-Service-Telemetry-Framework/master.adoc b/doc-Service-Telemetry-Framework/master.adoc index e836b7a4..9a045a40 100644 --- a/doc-Service-Telemetry-Framework/master.adoc +++ b/doc-Service-Telemetry-Framework/master.adoc @@ -37,10 +37,10 @@ include::assemblies/assembly_advanced-features.adoc[leveloffset=+1] //certificate renewal include::assemblies/assembly_renewing-the-amq-interconnect-certificate.adoc[leveloffset=+1] -// upgrading to 1.4 
-//include::assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-4.adoc[leveloffset=+1] - // removing include::assemblies/assembly_removing-stf-from-the-openshift-environment.adoc[leveloffset=+1] //collectd plugins + +// upgrading to 1.5 +include::assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc[leveloffset=+1] diff --git a/doc-Service-Telemetry-Framework/modules/proc_installing-the-service-telemetry-framework-1-5-operators.adoc b/doc-Service-Telemetry-Framework/modules/proc_installing-the-service-telemetry-framework-1-5-operators.adoc new file mode 100644 index 00000000..0295448e --- /dev/null +++ b/doc-Service-Telemetry-Framework/modules/proc_installing-the-service-telemetry-framework-1-5-operators.adoc @@ -0,0 +1,163 @@ +//// +* file name: proc_installing-the-service-telemetry-framework-1-5-operators.adoc +* ID: [id="proc_installing-the-service-telemetry-framework-1-5-operators_{context}"] +* Title: = Installing the Service Telemetry Framework 1.5 Operators +//// + +:_content-type: PROCEDURE + +[id="installing-the-service-telemetry-framework-1-5-operators_{context}"] += Installing the {Project} 1.5 Operators + +Install the {Project} ({ProjectShort}) 1.5 Operators and the Certificate Manager for OpenShift Operator on your {OpenShift} 4.10 environment. {ProjectShort} 1.5 supports only {OpenShift} 4.10. Installing {ProjectShort} into disconnected or restricted network environments is unsupported. + +ifdef::include_when_13,include_when_17[] +[NOTE] +After a successful {ProjectShort} 1.5 installation, you must retrieve and apply the {MessageBus} CA certificate to the {OpenStack} environment, or the transport layer and telemetry data become unavailable. + +For more information about updating the {MessageBus} CA certificate, see xref:updating-the-amq-interconnect-ca-certificate-on-red-hat-openstack-platform_upgrading-service-telemetry-framework-to-version-1-5[].
+endif::include_when_13,include_when_17[] + +.Prerequisites + +* You have upgraded your {OpenShift} environment to 4.10. +For more information about upgrading {OpenShift}, see xref:upgrading-red-hat-openshift-container-platform-to-4-10_upgrading-service-telemetry-framework-to-version-1-5[]. +* Your {OpenShift} environment network is fully connected. + +.Procedure + +. Change to the `service-telemetry` project: ++ +[source,bash] +---- +$ oc project service-telemetry +---- + +. Create a `namespace` for the `cert-manager` Operator: ++ +[source,bash] +---- +$ oc create -f - < Date: Wed, 29 Mar 2023 10:29:57 -0400 Subject: [PATCH 17/17] Import changes for STF 1.5.1 (#459) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Add procedure to disable services on OSP side (#407) * Add procedure to disable services on OSP side Add a procedure that disables the services provisioned when enabling STF. Resolves: rhbz#2096853 * Add warning to not use procedure with gnocchi Add a warning to not use the disable procedure when making use of the Sending metrics to Gnocchi and Service Telemetry Framework procedure since not all dependencies are provided as part of that instruction set, since they are a super-set of the STF deployment instructions. In the future we should probably just remove the Gnocchi deployment instructions since we've re-written the autoscaling guide and that is the one procedure that would provide Gnocchi deployments, and would contain all the necessary dependencies in the provided THT environment files.
* Apply suggestions from code review Small changes from jof * Update doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Update doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Update doc-Service-Telemetry-Framework/modules/proc_disabling-openstack-services-used-with-stf.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Don't remove existing 'stable-1.5' generated files (#412) * Minor updates to dashboarding guide (#413) * Minor updates to dashboarding guide Perform some minor updates to the dashboarding guide, referencing existing dashboards we have for virtual machine and memcached views. Update the ServiceTelemetry manifest to reference the rhel8/grafana:7 container image which should provide more consistency in how things are deployed, helping avoid a situation where newer versions of Grafana out of hub.docker.com no longer interface with the version of Elasticsearch that would be used when enabling events support by default. Update the path to the dashboards being created to reference the 'stf-1/' directory to reduce confusion by no longer referencing stf-1.3 in the links. Depends-On: https://github.com/infrawatch/dashboards/pull/50 * Update doc-Service-Telemetry-Framework/modules/con_dashboards.adoc plural to single * Update doc-Service-Telemetry-Framework/modules/proc_setting-up-grafana-to-host-the-dashboard.adoc * Add note about STF 1.3 and revert path change Revert path changes to the dashboards as use of symlinks as intended is not possible. 
Instead, add a note stating that the reference to STF 1.3 is the earliest version compatible with the dashboards, and can be used with STF versions 1.3 through 1.5. Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Fix syntax error in certificate renewal module (#416) * Fix syntax error in certificate renewal module Fix a syntax error in the certificate renewal module. Fixing this results in another issue that was hidden, whereby extra source lines are shown when building for version 17.0 due to loose version ranges. Unfortunately asciidoc doesn't provide an AND function in an ifeval or ifdef so we need separate parameters defined and nested to perform what is effectively a greater-than AND less-than evaluation. This was caught by QE when identifying a link that didn't have a corresponding section being built. The syntax error on the endif resulted in everything after that not being built, but did not result in a build error oddly enough. With this fix, everything is working as intended and included assemblies after this one are now visible. 
* Update ifdef to use AND syntax Update ifdef to use AND syntax per https://docs.asciidoctor.org/asciidoc/latest/directives/ifdef-ifndef/#checking-multiple-attributes * Removed sending-metrics-to-gnocchi-and-to-stf m… (#414) * Removed xrefs and include for sending-metrics-to-gnocchi-and-to-stf module * New folder named departure to hold files before deletion, updated gitignore file * I removed an assembly xref from additional resources section so that build is 100% * Removed upgrade section * Added link at assembly level re disabling services Co-authored-by: Leif Madsen * Jof mas minor edits 1.5 (#421) * changes to Primary parameters of the ServiceTelemetry object * Minor edits mostly reducing future tense * Apply suggestions from code review Routers -> dispatch routers * Update link to STF life cycle page (#423) Update the link to the STF life cycle page and remove reference to supporting the two most recent versions of STF. This has recently changed. STF 1.4 will be supported until the EOL of OpenShift 4.8 at which point STF 1.4 will also be EOL. In the meantime STF 1.4 is now in maintenance mode (CVE fixes only, no backport of features). STF 1.5 is supported as of 4.10 and will EOL with RHOSP 17.1. 
* updated link (#428) * Updated path to match PR#52 in dashboard repo (#427) * Fixed alertmanager verification command (#430) * Eliminate mentions of sensubility in OSP13 (#431) * A list of low hanging docs changes from our feature testing (#426) * A list of low hanging docs changes from our feature testing Items below are from STF-1167 * OSP Connection 1a Docs: Check other containers, not just metrics_qdr * Dashboards 1c: Docs: Rewording or change object we source creds from * Dashboards 1d: Docs: Wait for grafana restart * Metrics retention 1b: Docs: Tell customers how to verify * Alerts 2: Docs: More explicit example of how to construct this config * HA 2 Docs: Improve alertmanager config docs and re-test * Continues to work for me * Ephemeral Storage Docs: Document this only upstream and/or dev docs * Observability Strategy Docs: Add a note of which objects to delete * Docs: Double check that libpodstats and sensubility are not mentioned in OSP 13 (they shouldn't be) * I checked here: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/service_telemetry_framework_1.5/index * libpodstats is not mentioned, but sensubility is mentioned in several places * Is this a mistake? I don't have an OSP 13 handy and haven't dug through the artifacts to figure it out. 
* If changes are needed, there are several affected locations, so I'll use a dedicated PR * Certificate Handling (Issue 15.1 or sooner): Set 7.5yr expiry on all certs * https://github.com/infrawatch/service-telemetry-operator/pull/389 * Update doc-Service-Telemetry-Framework/modules/proc_creating-an-alert-route-with-templating-in-alertmanager.adoc Co-authored-by: Leif Madsen * Apply suggestions from code review Co-authored-by: Leif Madsen * Include more ceilometer agents in verification * Source block for container verif Co-authored-by: Leif Madsen * mg_master_2161659_minor-style-edit changed note text and position (#437) Co-authored-by: Michael Geary * Remove note from importing dashboards procedure (#439) Remove the note from the importing dashboards procedure since the paths no longer refer to STF 1.3. * Adjust network polling meter for ceilometer (#440) Adjust the broad ceilometer polling meter configuration to be a bit more specific so that we aren't attempting to poll APIs that no longer exist. Closes rhbz#2129729 * Eliminate vestiges of "stf-default" (#442) * I think this used to be the default name long-long ago? * I made a deployment by cut/pasting the manifest from the ephemeral storage * And then all the names mismatched the examples and documented commands * Link to the amqp1 plugin header directly (#443) * Bump base image for building to Fedora 37 (#445) * Bump base image for building to Fedora 37 Bump the base image for building in CI to Fedora 37 because Fedora 33 is now EOL and has been removed from the quay repository. 
* Set /docs directory to being marked safe * Bump actions/checkout v2 (deprecated) to v3 * mg_master_2168184_adding section with procedures for upgrade from 1.4… (#444) * mg_master_2168184_adding section with procedures for upgrade from 1.4 to 1.5 * Bump base image for building to Fedora 37 (#445) * Bump base image for building to Fedora 37 Bump the base image for building in CI to Fedora 37 because Fedora 33 is now EOL and has been removed from the quay repository. * Set /docs directory to being marked safe * Bump actions/checkout v2 (deprecated) to v3 * Update doc-Service-Telemetry-Framework/assemblies/assembly_upgrading-service-telemetry-framework-to-version-1-5.adoc * commit 2, incorporating feedback mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 * commit 3, fixed internal link to Grafana section mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 * mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 commit Feb 14th * Add ifdef wrappers for certificate parts (#447) * Add ifdef wrappers for certificate documentation parts which do not apply to OSP16 (only 13 and 17). * Add a couple of clean up items for consistency and visual bits. * mg_master_2168184_procedures-to-upgrade-from-1.4-to-1.5 commit Feb 17 Rebased --------- Co-authored-by: Leif Madsen * Fix alertmanager verification command * See https://github.com/infrawatch/documentation/pull/430 * I missed this second instance * Revert "Fix alertmanager verification command" Accidental push to master. I will go protect the branch... This reverts commit a9aa45051737a4a8db4c16877abfaadb9dacffda. * Fix alertmanager verification command (#450) * See https://github.com/infrawatch/documentation/pull/430 * I missed this second instance * Reference event enablement for virtual machine view (#451) * Reference event enablement for virtual machine view Reference the event enablement for the virtual machine dashboard which uses the es_ceilometer datasource. 
Closes: rhbz#2173856 * Soften language to make event enablement optional Soften language to make event enablement optional as suggested by Chris. * Expand supported OCP range through to 4.12 (#452) * Adjust path to triple-ansible-inventory file (#454) Adjust the path to the Ansible inventory file across RHOSP versions during documentation generation. Also fix inclusion of AMQ certificate renewal procedures in RHOSP 16 where certificate distribution does not exist. Closes: rhbz#2158178 * Remove DCN related configuration artifacts (#455) Remove DCN related configuration artifacts because it's not clear that this is helpful guidance, and there is information provided it may actually be harmful to getting a working environment. Closes: rhbz#2023902 * Add SNMP trap configuration parameters (#449) * Add SNMP trap configuration parameters Add SNMP trap configuration parameters that are being exposed via the ServiceTelemetry object. Create a concept module that provides a better overview of the OID configuration and link to location of the MIB definition. Also provide an example prometheus rule that shows how to override the OID value on a per-alert rule definition. 
Closes: STF-1257 * Clean up some wording * Add an example configuration manifest * Update doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> * Update doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc Changing a style comment to a suggestion which I can implement * Update doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc * Update doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc * Update doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc * Update doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc * Update doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc * Update doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc * Update doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc --------- Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Co-authored-by: mickogeary * Expose ability to set certificate renewal target times (#453) * Expose ability to set certificate renewal target times Expose the duration value in STO to allow better control of the default certificate renewal times. * Fix format errors and enhance redaction * ElasticSearch to Elasticsearch * Mispelling in default values * Inclusion of assembly for OSP 13 and OSP 16 * AQDR to {MessageBus} * Minor typo fix * Apply suggestions from code review Co-authored-by: mickogeary --------- Co-authored-by: Leif Madsen Co-authored-by: mickogeary * [OSP13] Replacing "allovercloud" with "overcloud" in ansible command (#456) Co-authored-by: root * [OSP13] Replacing podman with docker in ansible command. (#457) * [OSP13] Replacing podman with docker in ansible command. 
* Apply suggestions from code review Co-authored-by: Leif Madsen --------- Co-authored-by: root Co-authored-by: Leif Madsen * Fix improper merge conflict for build_tools --------- Co-authored-by: JoanneOFlynn2018 <45287002+JoanneOFlynn2018@users.noreply.github.com> Co-authored-by: Chris Sibbitt Co-authored-by: mickogeary Co-authored-by: Michael Geary Co-authored-by: Victoria Martinez de la Cruz Co-authored-by: Leonid Natapov Co-authored-by: root --- common/global/stf-attributes.adoc | 4 +- .../assembly_advanced-features.adoc | 11 ++- ...mbly_completing-the-stf-configuration.adoc | 8 +- ...wing-the-amq-interconnect-certificate.adoc | 3 +- .../modules/con_dashboards.adoc | 2 +- ...meters-of-the-servicetelemetry-object.adoc | 2 +- .../modules/con_snmp-traps.adoc | 93 +++++++++++++++++++ .../con_tls-certificates-duration.adoc | 65 +++++++++++++ .../proc_configuring-ephemeral-storage.adoc | 2 +- ...oc_configuring-observability-strategy.adoc | 7 ++ ...-openstack-platform-overcloud-for-stf.adoc | 2 +- .../modules/proc_configuring-snmp-traps.adoc | 69 ++++++++++++-- ...-the-stf-connection-for-the-overcloud.adoc | 2 +- ...configuring-tls-certificates-duration.adoc | 54 +++++++++++ ...eating-an-alert-route-in-alertmanager.adoc | 8 +- ...route-with-templating-in-alertmanager.adoc | 56 +++++------ ...-environment-file-for-multiple-clouds.adoc | 2 +- ...eating-the-base-configuration-for-stf.adoc | 6 +- ...period-in-service-telemetry-framework.adoc | 15 ++- ...and-setting-grafana-login-credentials.adoc | 17 +++- ...g-the-amq-interconnect-ca-certificate.adoc | 20 +++- ...oc_validating-clientside-installation.adoc | 19 +++- 22 files changed, 405 insertions(+), 62 deletions(-) create mode 100644 doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/con_tls-certificates-duration.adoc create mode 100644 doc-Service-Telemetry-Framework/modules/proc_configuring-tls-certificates-duration.adoc diff --git 
a/common/global/stf-attributes.adoc b/common/global/stf-attributes.adoc index 84064f5d..bf94c480 100644 --- a/common/global/stf-attributes.adoc +++ b/common/global/stf-attributes.adoc @@ -42,7 +42,7 @@ ifeval::["{build}" == "upstream"] :ProjectShort: STF :MessageBus: Apache{nbsp}Qpid{nbsp}Dispatch{nbsp}Router :SupportedOpenShiftVersion: 4.10 -:NextSupportedOpenShiftVersion: 4.10 +:NextSupportedOpenShiftVersion: 4.12 :CodeReadyContainersVersion: 2.6.0 endif::[] @@ -60,5 +60,5 @@ ifeval::["{build}" == "downstream"] :ProjectShort: STF :MessageBus: AMQ{nbsp}Interconnect :SupportedOpenShiftVersion: 4.10 -:NextSupportedOpenShiftVersion: 4.10 +:NextSupportedOpenShiftVersion: 4.12 endif::[] diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_advanced-features.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_advanced-features.adoc index 3adc609d..cb00819c 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_advanced-features.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_advanced-features.adoc @@ -43,7 +43,14 @@ include::../modules/proc_creating-an-alert-route-in-alertmanager.adoc[leveloffse include::../modules/proc_creating-an-alert-route-with-templating-in-alertmanager.adoc[leveloffset=+2] //SNMP Traps -include::../modules/proc_configuring-snmp-traps.adoc[leveloffset=+1] +include::../modules/con_snmp-traps.adoc[leveloffset=+1] +include::../modules/proc_configuring-snmp-traps.adoc[leveloffset=+2] + +//TLS Certificates duration +ifdef::include_when_13,include_when_17[] +include::../modules/con_tls-certificates-duration.adoc[leveloffset=+1] +include::../modules/proc_configuring-tls-certificates-duration.adoc[leveloffset=+2] +endif::include_when_13,include_when_17[] //High availability include::../modules/con_high-availability.adoc[leveloffset=+1] @@ -51,7 +58,9 @@ include::../modules/proc_configuring-high-availability.adoc[leveloffset=+2] //Configuring ephemeral storage 
include::../modules/con_ephemeral-storage.adoc[leveloffset=+1] +ifeval::["{build}" == "upstream"] include::../modules/proc_configuring-ephemeral-storage.adoc[leveloffset=+2] +endif::[] //Observability strategy include::../modules/con_observability-strategy.adoc[leveloffset=+1] diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc index 301c42b2..79d35a72 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_completing-the-stf-configuration.adoc @@ -16,9 +16,8 @@ To collect metrics, events, or both, and to send them to the {Project} ({Project * To plan your {OpenStackShort} installation and configuration {ProjectShort} for multiple clouds, see xref:configuring-multiple-clouds_assembly-completing-the-stf-configuration[]. * As part of an {OpenStackShort} overcloud deployment, you might need to configure additional features in your environment: - -** To deploy data collection and transport to {ProjectShort} on {OpenStackShort} cloud nodes that employ routed L3 domains, such as distributed compute node (DCN) or spine-leaf, see xref:deploying-to-non-standard-network-topologies_assembly-completing-the-stf-configuration[]. - +// NOTE: removing this for now because it's not clear that this is necessary, and that recommendations here may actually be harmful. See RHBZ#2023902. +//** To deploy data collection and transport to {ProjectShort} on {OpenStackShort} cloud nodes that employ routed L3 domains, such as distributed compute node (DCN) or spine-leaf, see xref:deploying-to-non-standard-network-topologies_assembly-completing-the-stf-configuration[]. ** To disable the data collector services, see xref:disabling-openstack-services-used-with-stf_assembly-completing-the-stf-configuration[]. 
ifdef::include_when_13[] @@ -38,7 +37,8 @@ include::../modules/proc_validating-clientside-installation.adoc[leveloffset=+2] include::../modules/proc_disabling-openstack-services-used-with-stf.adoc[leveloffset=+1] // Gather information for deployment in non-standard network topologies in the OSP overcloud -include::../modules/proc_deploying-to-non-standard-network-topologies.adoc[leveloffset=+1] +// NOTE: removing this for now because it's not clear that this is necessary, and that recommendations here may actually be harmful. See RHBZ#2023902. +//include::../modules/proc_deploying-to-non-standard-network-topologies.adoc[leveloffset=+1] ifdef::include_when_13[] // If you synchronized container images to a local registry, create an environment file and include the paths to the container images diff --git a/doc-Service-Telemetry-Framework/assemblies/assembly_renewing-the-amq-interconnect-certificate.adoc b/doc-Service-Telemetry-Framework/assemblies/assembly_renewing-the-amq-interconnect-certificate.adoc index f2e3f0ed..a4dea970 100644 --- a/doc-Service-Telemetry-Framework/assemblies/assembly_renewing-the-amq-interconnect-certificate.adoc +++ b/doc-Service-Telemetry-Framework/assemblies/assembly_renewing-the-amq-interconnect-certificate.adoc @@ -1,5 +1,5 @@ +ifdef::include_when_13,include_when_17[] ifdef::context[:parent-context: {context}] - [id="assembly-renewing-the-amq-interconnect-certificate_{context}"] = Renewing the {MessageBus} certificate @@ -18,3 +18,4 @@ include::../modules/proc_updating-the-amq-interconnect-ca-certificate.adoc[level //reset the context ifdef::parent-context[:context: {parent-context}] ifndef::parent-context[:!context:] +endif::include_when_13,include_when_17[] diff --git a/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc b/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc index 3381a761..6e9cf0e9 100644 --- a/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc +++ 
b/doc-Service-Telemetry-Framework/modules/con_dashboards.adoc @@ -19,7 +19,7 @@ Use the cloud view dashboard to view panels to monitor service resource usage, A ** For more information about {OpenStackShort} service monitoring, see xref:resource-usage-of-openstack-services_assembly-advanced-features[]. Virtual machine view dashboard:: -Use the virtual machine view dashboard to view panels to monitor virtual machine infrastructure usage. Select a cloud and project from the upper left corner of the dashboard. +Use the virtual machine view dashboard to view panels to monitor virtual machine infrastructure usage. Select a cloud and project from the upper left corner of the dashboard. You must enable event storage if you want to enable the event annotations on this dashboard. For more information, see xref:creating-a-servicetelemetry-object-in-openshift_assembly-installing-the-core-components-of-stf[]. Memcached view dashboard:: Use the memcached view dashboard to view panels to monitor connections, availability, system metrics and cache performance. Select a cloud from the upper left corner of the dashboard. 
diff --git a/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc b/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc index f5df5cb0..90ebf7cd 100644 --- a/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc +++ b/doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc @@ -179,7 +179,7 @@ endif::[] apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: - name: stf-default + name: default namespace: service-telemetry spec: clouds: diff --git a/doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc b/doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc new file mode 100644 index 00000000..faead485 --- /dev/null +++ b/doc-Service-Telemetry-Framework/modules/con_snmp-traps.adoc @@ -0,0 +1,93 @@ +[id="snmp-traps_{context}"] += Sending alerts as SNMP traps + +[role="_abstract"] +To enable SNMP traps, modify the `ServiceTelemetry` object and configure the `snmpTraps` parameters. SNMP traps are sent using version 2c. + +[id="configuration-parameters-for-snmptraps_{context}"] +== Configuration parameters for snmpTraps + +The `snmpTraps` parameter contains the following sub-parameters for configuring the alert receiver: + +enabled:: Set the value of this sub-parameter to true to enable the SNMP trap alert receiver. The default value is false. +target:: Target address to send SNMP traps. Value is a string. Default is `192.168.24.254`. +port:: Target port to send SNMP traps. Value is an integer. Default is `162`. +community:: Target community to send SNMP traps to. Value is a string. Default is `public`. +retries:: SNMP trap retry delivery limit. Value is an integer. Default is `5`. +timeout:: SNMP trap delivery timeout defined in seconds. Value is an integer. Default is `1`. +alertOidLabel:: Label name in the alert that defines the OID value to send the SNMP trap as. 
Value is a string. Default is `oid`. +trapOidPrefix:: SNMP trap OID prefix for variable bindings. Value is a string. Default is `1.3.6.1.4.1.50495.15`. +trapDefaultOid:: SNMP trap OID when no alert OID label has been specified with the alert. Value is a string. Default is `1.3.6.1.4.1.50495.15.1.2.1`. +trapDefaultSeverity:: SNMP trap severity when no alert severity has been set. Value is a string. Defaults to an empty string. + +Configure the `snmpTraps` parameter as part of the `alerting.alertmanager.receivers` definition in the `ServiceTelemetry` object: + +[source,yaml,options="nowrap"] +---- +apiVersion: infra.watch/v1beta1 +kind: ServiceTelemetry +metadata: + name: default + namespace: service-telemetry +spec: + alerting: + alertmanager: + receivers: + snmpTraps: + alertOidLabel: oid + community: public + enabled: true + port: 162 + retries: 5 + target: 192.168.25.254 + timeout: 1 + trapDefaultOid: 1.3.6.1.4.1.50495.15.1.2.1 + trapDefaultSeverity: "" + trapOidPrefix: 1.3.6.1.4.1.50495.15 +... +---- + +[id="overview-of-the-mib-definition_{context}"] +== Overview of the MIB definition + +Delivery of SNMP traps uses object identifier (OID) value `1.3.6.1.4.1.50495.15.1.2.1` by default. The management information base (MIB) schema is available at https://github.com/infrawatch/prometheus-webhook-snmp/blob/master/PROMETHEUS-ALERT-CEPH-MIB.txt. + +The OID number is comprised of the following component values: +* The value `1.3.6.1.4.1` is a global OID defined for private enterprises. +* The next identifier `50495` is a private enterprise number assigned by IANA for the Ceph organization. +* The other values are child OIDs of the parent. 
+ +15:: prometheus objects +15.1:: prometheus alerts +15.1.2:: prometheus alert traps +15.1.2.1:: prometheus alert trap default + +The prometheus alert trap default is an object comprised of several other sub-objects to OID `1.3.6.1.4.1.50495.15` which is defined by the `alerting.alertmanager.receivers.snmpTraps.trapOidPrefix` parameter: + +.1.1.1:: alert name +.1.1.2:: status +.1.1.3:: severity +.1.1.4:: instance +.1.1.5:: job +.1.1.6:: description +.1.1.7:: labels +.1.1.8:: timestamp +.1.1.9:: rawdata + +The following is example output from a simple SNMP trap receiver that outputs the received trap to the console: + +[source,options="nowrap"] +---- + SNMPv2-MIB::snmpTrapOID.0 = OID: SNMPv2-SMI::enterprises.50495.15.1.2.1 + SNMPv2-SMI::enterprises.50495.15.1.1.1 = STRING: "TEST ALERT FROM PROMETHEUS PLEASE ACKNOWLEDGE" + SNMPv2-SMI::enterprises.50495.15.1.1.2 = STRING: "firing" + SNMPv2-SMI::enterprises.50495.15.1.1.3 = STRING: "warning" + SNMPv2-SMI::enterprises.50495.15.1.1.4 = "" + SNMPv2-SMI::enterprises.50495.15.1.1.5 = "" + SNMPv2-SMI::enterprises.50495.15.1.1.6 = STRING: "TEST ALERT FROM " + SNMPv2-SMI::enterprises.50495.15.1.1.7 = STRING: "{\"cluster\": \"TEST\", \"container\": \"sg-core\", \"endpoint\": \"prom-https\", \"prometheus\": \"service-telemetry/default\", \"service\": \"default-cloud1-coll-meter\", \"source\": \"SG\"}" + SNMPv2-SMI::enterprises.50495.15.1.1.8 = Timeticks: (1676476389) 194 days, 0:52:43.89 + SNMPv2-SMI::enterprises.50495.15.1.1.9 = STRING: "{\"status\": \"firing\", \"labels\": {\"cluster\": \"TEST\", \"container\": \"sg-core\", \"endpoint\": \"prom-https\", \"prometheus\": \"service-telemetry/default\", \"service\": \"default-cloud1-coll-meter\", \"source\": \"SG\"}, \"annotations\": {\"action\": \"TESTING PLEASE ACKNOWLEDGE, NO FURTHER ACTION REQUIRED ONLY A TEST\"}, \"startsAt\": \"2023-02-15T15:53:09.109Z\", \"endsAt\": \"0001-01-01T00:00:00Z\", \"generatorURL\": 
\"http://prometheus-default-0:9090/graph?g0.expr=sg_total_collectd_msg_received_count+%3E+1&g0.tab=1\", \"fingerprint\": \"feefeb77c577a02f\"}" +---- + + diff --git a/doc-Service-Telemetry-Framework/modules/con_tls-certificates-duration.adoc b/doc-Service-Telemetry-Framework/modules/con_tls-certificates-duration.adoc new file mode 100644 index 00000000..a804b4ad --- /dev/null +++ b/doc-Service-Telemetry-Framework/modules/con_tls-certificates-duration.adoc @@ -0,0 +1,65 @@ +[id="tls-certificates-duration_{context}"] += Configuring the duration for the TLS certificates + +[role="_abstract"] +To configure the duration of the TLS certificates that you use for the connections with +Elasticsearch and {MessageBus} in {Project} ({ProjectShort}), +modify the `ServiceTelemetry` object and configure the `certificates` parameters. + +[id="configuration-parameters-for-tls-certificates-duration_{context}"] +== Configuration parameters for the TLS certificates + +You can configure the duration of the certificate with the following sub-parameters of the `certificates` parameter: + +endpointCertDuration:: The requested 'duration' or lifetime of the endpoint Certificate. +Minimum accepted duration is 1 hour. Value must be in units accepted by Go time.ParseDuration https://golang.org/pkg/time/#ParseDuration. +The default value is `70080h`. +caCertDuration:: The requested 'duration' or lifetime of the CA Certificate. +Minimum accepted duration is 1 hour. Value must be in units accepted by Go time.ParseDuration https://golang.org/pkg/time/#ParseDuration. +Default value is `70080h`. + +NOTE:: The default duration of certificates is long, because you usually copy a subset of them in the {OpenStack} deployment when the certificates renew. 
For more information about the QDR CA Certificate renewal process, see xref:assembly-renewing-the-amq-interconnect-certificate_assembly[]. + +The `certificates` parameter for Elasticsearch is part of the `backends.events.elasticsearch` definition and is configured in the `ServiceTelemetry` object: + +[source,yaml,options="nowrap"] +---- +apiVersion: infra.watch/v1beta1 +kind: ServiceTelemetry +metadata: + name: default + namespace: service-telemetry +spec: +... + backends: + ... + events: + elasticsearch: + enabled: true + version: 7.16.1 + certificates: + endpointCertDuration: 70080h + caCertDuration: 70080h +... +---- + +You can configure the `certificates` parameter for QDR that is part of the `transports.qdr` definition in the `ServiceTelemetry` object: + +[source,yaml,options="nowrap"] +---- +apiVersion: infra.watch/v1beta1 +kind: ServiceTelemetry +metadata: + name: default + namespace: service-telemetry +spec: +... + transports: + ... + qdr: + enabled: true + certificates: + endpointCertDuration: 70080h + caCertDuration: 70080h +... 
+---- diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-ephemeral-storage.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-ephemeral-storage.adoc index d6983771..ef316756 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-ephemeral-storage.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-ephemeral-storage.adoc @@ -28,7 +28,7 @@ $ oc edit stf default apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: - name: stf-default + name: default namespace: service-telemetry spec: alerting: diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc index ebf9536e..b9fe20fb 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-observability-strategy.adoc @@ -23,6 +23,13 @@ spec: EOF ---- + +. Delete the leftover objects that are managed by community operators: ++ +[source,bash] +---- +$ for o in alertmanager/default prometheus/default elasticsearch/elasticsearch grafana/default lokistack/lokistack; do oc delete $o; done +---- ++ . 
To verify that all workloads are operating correctly, view the pods and the status of each pod: + [source,bash,options="nowrap"] diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-red-hat-openstack-platform-overcloud-for-stf.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-red-hat-openstack-platform-overcloud-for-stf.adoc index 20c59407..99fa950b 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-red-hat-openstack-platform-overcloud-for-stf.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-red-hat-openstack-platform-overcloud-for-stf.adoc @@ -17,5 +17,5 @@ endif::include_when_13,include_when_17[] ifdef::include_when_16_1[] .Additional resources -* To collect data through {MessageBus}, see https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/{vernum}/html/operational_measurements/collectd-plugins_assembly[the amqp1 plug-in]. +* To collect data through {MessageBus}, see https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/{vernum}/html/operational_measurements/collectd-plugins_assembly#collectd_plugin_amqp1[the amqp1 plug-in]. endif::include_when_16_1[] diff --git a/doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc b/doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc index 0fa197b1..1abbff32 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_configuring-snmp-traps.adoc @@ -2,23 +2,28 @@ [id="configuring-snmp-traps_{context}"] = Configuring SNMP traps -[role="_abstract"] -You can integrate {Project} ({ProjectShort}) with an existing infrastructure monitoring platform that receives notifications through SNMP traps. To enable SNMP traps, modify the `ServiceTelemetry` object and configure the `snmpTraps` parameters. - -For more information about configuring alerts, see xref:alerts_assembly-advanced-features[]. 
- .Prerequisites -* Know the IP address or hostname of the SNMP trap receiver where you want to send the alerts +* Ensure that you know the IP address or hostname of the SNMP trap receiver where you want to send the alerts to. .Procedure +. Log in to {OpenShift}. + +. Change to the `service-telemetry` namespace: ++ +[source,bash] +---- +$ oc project service-telemetry +---- + . To enable SNMP traps, modify the `ServiceTelemetry` object: + [source,bash] ---- $ oc edit stf default ---- + . Set the `alerting.alertmanager.receivers.snmpTraps` parameters: + [source,yaml] @@ -37,3 +42,55 @@ spec: ---- . Ensure that you set the value of `target` to the IP address or hostname of the SNMP trap receiver. + +.Additional Information + +For more information about available parameters for `snmpTraps`, see xref:configuration-parameters-for-snmptraps_assembly-advanced-features[]. + +[id="creating-alerts-for-snmp-traps_{context}"] += Creating alerts for SNMP traps + +You can create alerts that are configured for delivery by SNMP traps by adding labels that are parsed by the prometheus-webhook-snmp middleware to define the trap information and delivered object identifiers (OID). Adding the `oid` or `severity` labels is only required if you need to change the default values for a particular alert definition. + +NOTE:: When you set the oid label, the top-level SNMP trap OID changes, but the sub-OIDs remain defined by the global `trapOidPrefix` value plus the child OID values `.1.1.1` through `.1.1.9`. For more information about the MIB definition, see xref:overview-of-the-mib-definition_{context}[]. + +.Procedure + +. Log in to {OpenShift}. + +. Change to the `service-telemetry` namespace: ++ +[source,bash] +---- +$ oc project service-telemetry +---- + +. 
Create a `PrometheusRule` object that contains the alert rule and an `oid` label that contains the SNMP trap OID override value: ++ +[source,bash] +---- +$ oc apply -f - < alertmanager.yaml < +receivers: + - name: slack + slack_configs: + - channel: #stf-alerts + title: |- + ... + text: >- + ... +route: + group_by: ['job'] + group_wait: 30s + group_interval: 5m + repeat_interval: 12h + receiver: 'slack' +EOF ---- - -. To deploy a custom Alertmanager route with {ProjectShort}, you must pass an `alertmanagerConfigManifest` parameter to the Service Telemetry Operator that results in an updated secret that is managed by the Prometheus Operator: +. Generate the config manifest and add it to the `ServiceTelemetry` object for your {ProjectShort} deployment: + -[source,yaml,options="nowrap"] +[source,bash,options="nowrap"] ---- -apiVersion: infra.watch/v1beta1 -kind: ServiceTelemetry -metadata: - name: default - namespace: service-telemetry -spec: - backends: - metrics: - prometheus: - enabled: true - alertmanagerConfigManifest: | - apiVersion: v1 - kind: Secret - metadata: - name: 'alertmanager-default' - namespace: 'service-telemetry' - type: Opaque - data: - alertmanager.yaml: Z2xvYmFsOgogIHJlc29sdmVfdGltZW91dDogMTBtCiAgc2xhY2tfYXBpX3VybDogPHNsYWNrX2FwaV91cmw+CnJlY2VpdmVyczoKICAtIG5hbWU6IHNsYWNrCiAgICBzbGFja19jb25maWdzOgogICAgLSBjaGFubmVsOiAjc3RmLWFsZXJ0cwogICAgICB0aXRsZTogfC0KICAgICAgICAuLi4KICAgICAgdGV4dDogPi0KICAgICAgICAuLi4Kcm91dGU6CiAgZ3JvdXBfYnk6IFsnam9iJ10KICBncm91cF93YWl0OiAzMHMKICBncm91cF9pbnRlcnZhbDogNW0KICByZXBlYXRfaW50ZXJ2YWw6IDEyaAogIHJlY2VpdmVyOiAnc2xhY2snCg== +$ CONFIG_MANIFEST=$(oc create secret --dry-run=client generic alertmanager-default --from-file=alertmanager.yaml -o json) +$ oc patch stf default --type=merge -p '{"spec":{"alertmanagerConfigManifest":'"$CONFIG_MANIFEST"'}}' ---- - . 
Verify that the configuration has been applied to the secret: + +[NOTE] +There is a short delay while the operators update each object. ++ [source,bash,options="nowrap"] ---- $ oc get secret alertmanager-default -o go-template='{{index .data "alertmanager.yaml" | base64decode }}' diff --git a/doc-Service-Telemetry-Framework/modules/proc_creating-openstack-environment-file-for-multiple-clouds.adoc b/doc-Service-Telemetry-Framework/modules/proc_creating-openstack-environment-file-for-multiple-clouds.adoc index 7c8f3ebc..7417c18e 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_creating-openstack-environment-file-for-multiple-clouds.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_creating-openstack-environment-file-for-multiple-clouds.adoc @@ -74,7 +74,7 @@ resource_registry: parameter_defaults: MetricsQdrConnectors: - - host: stf-default-interconnect-5671-service-telemetry.apps.infra.watch + - host: default-interconnect-5671-service-telemetry.apps.infra.watch port: 443 role: edge verifyHostname: false diff --git a/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc b/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc index c52e383d..29744283 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_creating-the-base-configuration-for-stf.adoc @@ -55,7 +55,8 @@ parameter_defaults: - image.* - memory - memory.* - - network.* + - network.services.vpn.* + - network.services.firewall.* - perf.* - port - port.* @@ -138,7 +139,8 @@ parameter_defaults: - image.* - memory - memory.* - - network.* + - network.services.vpn.* + - network.services.firewall.* - perf.* - port - port.* diff --git a/doc-Service-Telemetry-Framework/modules/proc_editing-the-metrics-retention-time-period-in-service-telemetry-framework.adoc 
b/doc-Service-Telemetry-Framework/modules/proc_editing-the-metrics-retention-time-period-in-service-telemetry-framework.adoc index ce5e0fa4..e8b75d6a 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_editing-the-metrics-retention-time-period-in-service-telemetry-framework.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_editing-the-metrics-retention-time-period-in-service-telemetry-framework.adoc @@ -33,7 +33,7 @@ If you set a long retention period, retrieving data from heavily populated Prome apiVersion: infra.watch/v1beta1 kind: ServiceTelemetry metadata: - name: stf-default + name: default namespace: service-telemetry spec: ... @@ -48,6 +48,19 @@ spec: ---- . Save your changes and close the object. +. Wait for Prometheus to restart with the new settings: ++ +[source,bash] +---- +$ oc get po -l app.kubernetes.io/name=prometheus -w +---- +. Verify the new retention setting by checking the command-line arguments used in the pod: ++ +[source,bash] +---- +$ oc describe po prometheus-default-0 | grep retention.time + --storage.tsdb.retention.time=24h +---- .Additional resources diff --git a/doc-Service-Telemetry-Framework/modules/proc_retrieving-and-setting-grafana-login-credentials.adoc b/doc-Service-Telemetry-Framework/modules/proc_retrieving-and-setting-grafana-login-credentials.adoc index b3376803..7d986a74 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_retrieving-and-setting-grafana-login-credentials.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_retrieving-and-setting-grafana-login-credentials.adoc @@ -2,7 +2,9 @@ = Retrieving and setting Grafana login credentials [role="_abstract"] -{Project} ({ProjectShort}) sets default login credentials when Grafana is enabled. You can override the credentials in the `ServiceTelemetry` object. +When Grafana is enabled, you can log in by using OpenShift authentication, or with the default username and password set by the Grafana Operator. 
+ +You can override the credentials in the `ServiceTelemetry` object to have {Project} ({ProjectShort}) set the username and password for Grafana instead. .Procedure @@ -13,7 +15,7 @@ ---- $ oc project service-telemetry ---- -. Retrieve the default username and password from the {ProjectShort} object: +. Retrieve the existing username and password from the {ProjectShort} object: + [source,bash] ---- @@ -21,3 +23,14 @@ $ oc get stf default -o jsonpath="{.spec.graphing.grafana['adminUser','adminPass ---- . To modify the default values of the Grafana administrator username and password through the ServiceTelemetry object, use the `graphing.grafana.adminUser` and `graphing.grafana.adminPassword` parameters. ++ +[source,bash] +---- +$ oc edit stf default +---- +. Wait for the Grafana pod to restart with the new credentials in place: ++ +[source,bash] +---- +$ oc get po -l app=grafana -w +---- \ No newline at end of file diff --git a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc index c395c8dc..035d26a6 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_updating-the-amq-interconnect-ca-certificate.adoc @@ -25,19 +25,35 @@ $ oc get secret/default-interconnect-selfsigned -o jsonpath='{.data.ca\.crt}' | . Log into your {OpenStackShort} undercloud. . Edit the `stf-connectors.yaml` file to contain the new caCertFileContent. For more information, see xref:configuring-the-stf-connection-for-the-overcloud_assembly-completing-the-stf-configuration[]. +ifdef::include_when_13[] +. Generate an inventory file: ++ +[source,bash,options="nowrap"] +---- +[stack@undercloud-0 ~]$ tripleo-ansible-inventory --static-yaml-inventory ./tripleo-ansible-inventory.yaml +---- +endif::include_when_13[] + . 
Copy the `STFCA.pem` file to each {OpenStackShort} overcloud node: + [source,bash,options="nowrap"] +ifdef::include_when_13[] +---- +[stack@undercloud-0 ~]$ ansible -i tripleo-ansible-inventory.yaml overcloud -b -m copy -a "src=STFCA.pem dest=/var/lib/config-data/puppet-generated/metrics_qdr/etc/pki/tls/certs/CA_sslProfile.pem" +---- +endif::include_when_13[] +ifdef::include_when_17[] ---- [stack@undercloud-0 ~]$ ansible -i overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml allovercloud -b -m copy -a "src=STFCA.pem dest=/var/lib/config-data/puppet-generated/metrics_qdr/etc/pki/tls/certs/CA_sslProfile.pem" ---- +endif::include_when_17[] + . Restart the metrics_qdr container on each {OpenStackShort} overcloud node: + [source,bash,options="nowrap"] ifdef::include_when_13[] ---- -[stack@undercloud-0 ~]$ tripleo-ansible-inventory --static-yaml-inventory ./tripleo-ansible-inventory.yaml -[stack@undercloud-0 ~]$ ansible -i tripleo-ansible-inventory.yaml allovercloud -m shell -a "sudo podman restart metrics_qdr" +[stack@undercloud-0 ~]$ ansible -i tripleo-ansible-inventory.yaml overcloud -m shell -a "sudo {containerbin} restart metrics_qdr" ---- endif::include_when_13[] ifdef::include_when_16+include_before_17[] diff --git a/doc-Service-Telemetry-Framework/modules/proc_validating-clientside-installation.adoc b/doc-Service-Telemetry-Framework/modules/proc_validating-clientside-installation.adoc index 31d4dd99..1d18c638 100644 --- a/doc-Service-Telemetry-Framework/modules/proc_validating-clientside-installation.adoc +++ b/doc-Service-Telemetry-Framework/modules/proc_validating-clientside-installation.adoc @@ -10,15 +10,26 @@ TIP: Some telemetry data is available only when {OpenStackShort} has active work . Log in to an overcloud node, for example, controller-0. -. Ensure that the `metrics_qdr` container is running on the node: +. 
Ensure that the `metrics_qdr` and collection agent containers are running on the node: + [source,bash,options="nowrap",subs="attributes"] ---- -$ sudo {containerbin} container inspect --format '{{.State.Status}}' metrics_qdr - +$ sudo {containerbin} container inspect --format '{{.State.Status}}' metrics_qdr collectd ceilometer_agent_notification ceilometer_agent_central +running +running +running running ---- - ++ +[NOTE] +==== +Use this command on compute nodes: +[source,bash,options="nowrap",subs="attributes"] +----- +$ sudo {containerbin} container inspect --format '{{.State.Status}}' metrics_qdr collectd ceilometer_agent_compute +----- +==== ++ . Return the internal network address on which {MessageBus} is running, for example, `172.17.1.44` listening on port `5666`: + [source,bash,options="nowrap",subs="attributes"]