Add troubleshooting docs #638

Draft · wants to merge 7 commits into base: master
6 changes: 3 additions & 3 deletions docs/Development.md
@@ -15,7 +15,7 @@ This track is focused around the development of custom [Prometheus exporters](ht

We use [Helm](https://helm.sh) to provide an automated deployment and configuration experience for Pelorus. We are always doing work to cover more and more complex use cases with our helm charts. In order to be able to effectively contribute to these charts, you'll need a cluster that satisfies all of the installation prerequisites for Pelorus.

See the [Install guide](Install.md) for more details on that.
See the [installation guide](GettingStarted.md#installation) for more details.

Currently we have two charts:

@@ -292,7 +292,7 @@ Checkout the PR on top of your fork.

1. [Checkout](#checkout) the PR on top of your fork.

2. [Install Pelorus](Install.md) from checked out fork/branch.
2. [Install Pelorus](GettingStarted.md) from checked out fork/branch.

**NOTE:**

@@ -341,7 +341,7 @@ Each PR runs exporter tests in the CI systems, however those changes can be test

### Helm Install changes

For testing changes to the helm chart, you should just follow the [standard install process](Install.md), then verify that:
For testing changes to the helm chart, you should just follow the [standard install process](GettingStarted.md), then verify that:

* All expected pods are running and healthy
* Any expected behavior changes mentioned in the PR can be observed.
48 changes: 41 additions & 7 deletions docs/Install.md → docs/GettingStarted.md
@@ -1,9 +1,29 @@
# Getting Started

# Installation
## Basic Concepts

Pelorus presents various [_measures_](dashboards/SoftwareDeliveryPerformance.md#measures) to you, such as Lead Time for Change (how long it takes for a commit to wind up in production).

These measures are calculated from various _metrics_.
For example, the Lead Time for Change measure is calculated as the difference between the time of a deployment including that commit (`deploy_time`) and the time that commit was made (`commit_time`). A commit made at 09:00 and first deployed at 15:00 has a lead time of six hours.

These metrics are collected by _exporters_, which gather information from various sources.
For example, the `deploytime` exporter looks for running pods in OpenShift. The `committime` exporter looks for OpenShift Builds and correlates them with git commit information from various _providers_, such as GitHub or Bitbucket.
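
As a preview, here is a sketch of how exporter instances are declared in the exporters chart's values file. The structure shown is an assumption for illustration; the [Configuration Guide](Configuration.md) is authoritative.

```yaml
# Hypothetical values-file sketch declaring two exporter instances;
# verify the key names against the Pelorus exporters chart before use.
exporters:
  instances:
  - app_name: deploytime-exporter
    exporter_type: deploytime
  - app_name: committime-exporter
    exporter_type: committime
```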

### Preparing Your Data

<!-- TODO: explain how app names work in the first place -->

To properly collect various metrics, Pelorus needs to find certain metadata. In common cases, this metadata may already be there! If not, you will need to adjust how the relevant resources are created in OpenShift.

For now, we'll focus on deploy time: to capture deployments, `Pod`s and their `ReplicationController`s must be labeled with the _app name_. This is `app.kubernetes.io/name` by default, but can be [customized](Configuration.md#labels).

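For example, here is a minimal sketch of a `ReplicationController` whose pods carry the default app label (the name `my-app` and the image are hypothetical):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-app-1
  labels:
    app.kubernetes.io/name: my-app   # the app name on the ReplicationController
spec:
  replicas: 1
  selector:
    app.kubernetes.io/name: my-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: my-app   # ...and on the Pods it creates
    spec:
      containers:
      - name: my-app
        image: quay.io/example/my-app:latest
```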

## Installation

The following steps walk through deploying Pelorus.

## Prerequisites
### Prerequisites

Before deploying the tooling, you must have the following prepared:

@@ -14,7 +34,7 @@ Before deploying the tooling, you must have the following prepared
* jq
* git

## Initial Deployment
### Initial Deployment

Pelorus is installed via three Helm charts. The first deploys the operators on which Pelorus depends, the second deploys the core Pelorus stack, and the third deploys the exporters that gather the data. By default, the instructions below install into a namespace called `pelorus`, but you can choose any name you wish.

@@ -40,9 +60,7 @@ In a few seconds, you will see a number of resourced get created. The above comm
* The following exporters:
* Deploy Time

From here, some additional configuration is required in order to deploy other exporters, and make the Pelorus

See the [Configuration Guide](Configuration.md) for more information on exporters.
From here, some additional [configuration](Configuration.md) and [data preparation](#preparing-your-data-details) are required in order to deploy other exporters.

You may additionally want to enable other features for the core stack. Read on to understand those options.

@@ -107,7 +125,7 @@ If you don't have an object storage provider, we recommend [NooBaa](https://www.

By default, this tool will pull in data from the cluster in which it is running. The tool also supports collecting data across multiple OpenShift clusters: the Thanos sidecar can be configured to read from a shared S3 bucket across clusters. See [Pelorus Multi-Cluster Architecture](Architecture.md) for details. You define exporters for the desired metrics in each of the clusters whose metrics will be evaluated. The main cluster's Grafana dashboard will then display a combined view of the metrics collected in the shared S3 bucket via Thanos.

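As a rough sketch, the storage-related values for such a setup look like the following. The key names here are assumptions for illustration; check the [Configuration Guide](Configuration.md) for the authoritative names.

```yaml
# Hypothetical sketch of shared-bucket values for multi-cluster collection;
# verify the key names against the Pelorus chart before use.
thanos_bucket_name: <bucket name>
bucket_access_point: <S3 endpoint, shared across clusters>
bucket_access_key: <s3 access key>
bucket_secret_access_key: <s3 secret access key>
```
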
#### Configure Production Cluster.
### Configure Production Cluster

The following production configuration example uses one deploytime exporter, an AWS S3 bucket, and an AWS volume for Prometheus, and tracks deployments to production:

@@ -135,7 +153,23 @@ exporters:
    - pelorus-config
    - deploytime-config
```

## Preparing Your Data: Details

#### Commit Time

`Build`s must have a commit hash and repository URL associated with them.

The commit hash comes from either the build's `spec.revision.git.commit` (populated in Source to Image builds), or falls back to the [annotation](./Configuration.md#annotations-and-local-build-support) `io.openshift.build.commit.id`.

The repository URL comes from either the build's `spec.source.git.uri` (populated in Source to Image builds), or falls back to the [annotation](./Configuration.md#annotations-and-local-build-support) `io.openshift.build.source-location`.

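For example, a minimal sketch of a `Build` carrying this metadata through the fallback annotations (all names other than the two annotation keys are hypothetical):

```yaml
apiVersion: build.openshift.io/v1
kind: Build
metadata:
  name: my-app-build-1
  labels:
    app.kubernetes.io/name: my-app
  annotations:
    # used when spec.revision.git.commit is not populated:
    io.openshift.build.commit.id: cae392a
    # used when spec.source.git.uri is not populated:
    io.openshift.build.source-location: https://github.com/example/my-app
```
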
The commit time exporter(s) must be [configured](./Configuration.md#commit-time-exporter) to point to the proper git provider(s).

<!-- TODO: info about image exporter -->

#### Failure Time

Metadata from issue trackers is provider-specific. See [Failure Time Exporter Configuration](./Configuration.md#failure-time-exporter) for details.

## Uninstalling

4 changes: 2 additions & 2 deletions docs/Noobaa.md
@@ -2,7 +2,7 @@

NooBaa is a software-driven data service that provides an S3 object-storage interface, which we use for testing and development of the Pelorus project.

The following is a walkthrough for deploying NooBaa Operator on OpenShift and then configuring Pelorus to consume it as a [Long Term Storage](Install.md#configure-long-term-storage-recommended) solution.
The following is a walkthrough for deploying NooBaa Operator on OpenShift and then configuring Pelorus to consume it as a [Long Term Storage](GettingStarted.md#configure-long-term-storage-recommended) solution.

## Install NooBaa Operator CLI

@@ -105,7 +105,7 @@ noobaa bucket status thanos --namespace pelorus

## Update Pelorus Configuration

To update our Pelorus stack, follow the instructions provided in the [Long Term Storage](Install.md#configure-long-term-storage-recommended).
To update our Pelorus stack, follow the instructions provided in the [Long Term Storage](GettingStarted.md#configure-long-term-storage-recommended) section.

Ensure that the `<s3 access key>`, `<s3 secret access key>`, and `<bucket name>` from the [Deploy NooBaa](#deploy-noobaa) step are used, and that `s3.pelorus.svc:443` (the `S3 InternalDNS Address` from the `noobaa status --namespace pelorus` command) is used as the bucket access point, as in the example:
10 changes: 10 additions & 0 deletions docs/Troubleshooting.md
@@ -0,0 +1,10 @@
# Troubleshooting

## Information not showing up in dashboard

We've included a troubleshooting script to check whether your data is labeled correctly,
as required for [the deploy time exporter](GettingStarted.md#preparing-your-data)
and [the other exporters](GettingStarted.md#preparing-your-data-details).

With a [local dev environment](Development.md#dev-environment-setup) set up,
run `./scripts/troubleshooting/missing_labels -h` for information about how to use it.
4 changes: 3 additions & 1 deletion docs/dashboards/SoftwareDeliveryPerformance.md
@@ -2,6 +2,9 @@

_Software Delivery Performance_ is a measure of an organization's ability to effectively deliver software-based products they have built to their customers. It comprises four _measures_ that provide a balanced perspective, taking both speed to market and stability into account. Tracking _Software Delivery Performance_ over time provides IT organizations with data they can use to make smarter investments in their internal tools and processes, optimizing their delivery processes based on the types of products they are delivering. This outcome provides a bridge between development, operations, and leadership, allowing them to better communicate about whether proposed work on infrastructure improvements or process development is in line with the overall vision and financial goals of the organization at large.


## Measures

![Software Delivery Performance dashboard](../img/sdp-dashboard.png)

The Pelorus _Software Delivery Performance_ dashboard tracks the four primary measures of software delivery:
@@ -13,7 +16,6 @@ The Pelorus _Software Delivery Performance_ dashboard tracks the four primary me

For more information about Software Delivery Performance, check out the book [Accelerate](https://itrevolution.com/book/accelerate/) by Forsgren, Kim and Humble.

## Measures

![Exporter relationship diagram](../img/exporter-relationship-diagram.png)

5 changes: 4 additions & 1 deletion mkdocs.yml
@@ -4,16 +4,19 @@ theme: readthedocs
markdown_extensions:
- def_list
- tables
- toc:
    permalink: True
nav:
- Introduction:
  - Welcome to Pelorus: index.md
  - Our Philosophy: Philosophy.md
- Using Pelorus:
  - Getting Started: GettingStarted.md
  - Architecture: Architecture.md
  - Installation: Install.md
  - Demo: Demo.md
  - Configuration: Configuration.md
  - NooBaa for Long Term Storage: Noobaa.md
  - Troubleshooting: Troubleshooting.md
- Dashboards:
  - Dashboard Summary: Dashboards.md
  - Software Delivery Performance: dashboards/SoftwareDeliveryPerformance.md
20 changes: 20 additions & 0 deletions scripts/troubleshooting/missing_labels.py
@@ -15,6 +15,10 @@
import pelorus.utils
from pelorus.utils import paginate_resource

DOCS_BASE_URL = "https://pelorus.readthedocs.io/en/stable/"
DEPLOYTIME_PREPARE_DATA_URL = DOCS_BASE_URL + "GettingStarted#preparing-your-data"
COMMITTIME_PREPARE_DATA_URL = DOCS_BASE_URL + "GettingStarted#commit-time"

# A NOTE ON TERMINOLOGY:
# what you might call a "resource" in openshift is called a ResourceInstance by the client.
# to the client, a Resource is its "type definition".
@@ -197,6 +201,10 @@ class DeploytimeTroubleshootingReport:
    pods_missing_app_label: list[PodId]
    replicators_missing_app_label: dict[ReplicatorId, OwnedPods]

    @property
    def anything_to_report(self):
        return self.pods_missing_app_label or self.replicators_missing_app_label

    def _print_pods(self):
        if not self.pods_missing_app_label:
            print("No pods were missing the app label", self.app_label)
@@ -216,10 +224,16 @@ def _print_replicators(self):
        for replicator in self.replicators_missing_app_label:
            print(" ", replicator.kind_, replicator.name)

    def _print_suggestion(self):
        print(f"Add the label {self.app_label}.")
        print("See", DEPLOYTIME_PREPARE_DATA_URL)

    def print_human_readable(self):
        self._print_pods()
        print()
        self._print_replicators()
        if self.anything_to_report:
            self._print_suggestion()

    def to_json(self) -> dict:
        pods_missing_label = [pod.name for pod in self.pods_missing_app_label]
@@ -253,6 +267,10 @@ class CommittimeTroubleshootingReport:

    builds_missing_app_label: list[BuildId]

    @property
    def anything_to_report(self):
        return bool(self.builds_missing_app_label)

    def print_human_readable(self):
        if not self.builds_missing_app_label:
            print("No builds were missing the app label", self.app_label)
@@ -262,6 +280,8 @@ def print_human_readable(self):
        for build in self.builds_missing_app_label:
            print(build.name)

    # TODO: app label committime docs?

    def to_json(self) -> dict:
        return dict(
            namespace=namespace,