jsonnet/telemeter: Record a "subscribed cluster" metric #297

Merged

Commits on Jan 16, 2020

  1. jsonnet/telemeter: Add recording rule for cluster availability

    We have defined the OpenShift SLI based on whether the cluster
    version operator reports a failure over a period of time. To better
    support queries that aggregate availability over classes of
    clusters, create a recording rule that is 1 when a cluster is
    available and 0 when the cluster reports unhealthy. A cluster that
    is not reporting any metrics has no data at that point in time.
    Include the current version of the cluster, since it is known at
    the time the recording rule is evaluated.
    
    This can be used in more complex aggregates by joining on cluster
    id.
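
    As a rough illustration of the kind of aggregate this enables, the
    queries below are a sketch only. They assume the recorded series is
    named `id_version:cluster_available` (the name used in the next
    commit) and that it carries a `version` label; the 7d window is an
    arbitrary choice.

    ```
    # Fraction of currently reporting clusters that are available, per version.
    avg by (version) (id_version:cluster_available)

    # Fraction of reported samples over the last 7 days in which each
    # cluster was available. Gaps with no data are not counted as downtime.
    avg_over_time(id_version:cluster_available[7d])
    ```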
    smarterclayton committed Jan 16, 2020
    be10a02
  2. jsonnet/telemeter: Record a "subscribed cluster" metric

    There are a number of key dimensions joined against the subscription
    data that almost all queries are based on.
    
    1. Is it subscribed?
    2. Is it internal?
    3. Is it OpenShift Dedicated?
    4. Is it in a range of versions?
    5. Is it currently running? (subscription_labels does not check that)
    
    This recording rule collapses those into a single metric with
    cardinality O(clusters) that should dramatically reduce the cost of
    queries.
    
    Combined with the previous rule (available), the extremely common
    query "which subscribed clusters are unhealthy" becomes:
    
    ```
    (id_version:cluster_available == 0) + on(_id) max by (_id) (id_version_ebs_account_internal:cluster_subscribed == 1)
    ```
    
    This also reduces the data that must be searched for almost all other
    joins and some dashboards.
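
    A further sketch of how the two recorded series might be combined,
    here counting unhealthy subscribed clusters per version. It assumes
    both series carry an `_id` label and that the availability series
    carries a `version` label; `and on(_id)` is used as a filter, so
    the result keeps the labels of the left-hand side.

    ```
    # Number of subscribed clusters currently reporting unhealthy, by version.
    count by (version) (
      (id_version:cluster_available == 0)
      and on(_id)
      (id_version_ebs_account_internal:cluster_subscribed == 1)
    )
    ```

    When no subscribed cluster is unhealthy, this returns no series
    rather than 0.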
    smarterclayton committed Jan 16, 2020
    9349f2a