REST endpoints

This section describes the REST API that monitoring agents use to retrieve the collected metrics. Java methods mentioned here refer to the respective objects in the Java API. See also app-programming-model.adoc.

JSON format

  • When using JSON format, the REST API will respond to GET requests with data formatted in a tree-like fashion, with sub-trees for the sub-resources. A sub-tree that does not contain data must be omitted.

  • A 'shadow tree' that responds to OPTIONS will provide the metadata and tags associated with a metric name.

Translation rules for metric names and handling of tags

The following rules apply only to GET requests:

  • Tags are appended to the leaf element of the metric’s JSON tree.

  • For metrics with tags, the metric name must be followed by a semicolon ; and a semicolon-separated list of tag key/value pairs.

  • For compound metrics (those with child JSON attributes) with tags, only the "leaf" metric names are decorated with tags.

  • Semicolons ; present in tag values must be converted to underscores _ in JSON output.

For example:

{
 "carsCounter;colour=red": 0,
 "carsCounter;car=sedan;colour=blue": 0,
 "carsSpeed": {
    "count;colour=red": 324,
    "percentile;colour=red;_p=50": 110,
    "percentile;colour=red;_p=75": 122,
    "percentile;colour=red;_p=95": 135,
    "percentile;colour=red;_p=98": 138,
    "percentile;colour=red;_p=99": 141,
    "percentile;colour=red;_p=99.9": 155,
    "count;colour=blue": 199,
    "percentile;colour=blue;_p=50": 105,
    "percentile;colour=blue;_p=75": 118,
    "percentile;colour=blue;_p=95": 133,
    "percentile;colour=blue;_p=98": 139,
    "percentile;colour=blue;_p=99": 140,
    "percentile;colour=blue;_p=99.9": 152
 }
}

The following rules apply to both GET and OPTIONS requests:

  • Each tag is a key-value-pair in the format of <key>=<value>. The list of tags must be sorted alphabetically by key name.

  • If the metric name or a tag value contains reserved JSON characters, these characters must be escaped in the JSON response.

If the metric has no tags, the semicolon ; must be omitted.

For example,

{
  "metricWithoutTags": 192
}
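The naming rules above can be sketched as a small helper. This is a minimal illustration, not part of the Java API: the class name JsonMetricKey and its method are hypothetical, and escaping of reserved JSON characters is assumed to be left to whatever JSON serializer writes the key.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class JsonMetricKey {

    private JsonMetricKey() {
    }

    /**
     * Builds the JSON key for a metric, or for a leaf of a compound metric.
     * Tags are appended in key order; a metric without tags gets no semicolon.
     */
    static String of(String name, Map<String, String> tags) {
        StringBuilder key = new StringBuilder(name);
        // TreeMap iterates in natural (lexicographic) key order.
        for (Map.Entry<String, String> tag : new TreeMap<>(tags).entrySet()) {
            key.append(';')
               .append(tag.getKey())
               .append('=')
               // Semicolons in tag values would break the separator, so map them to underscores.
               .append(tag.getValue().replace(';', '_'));
        }
        return key.toString();
    }
}
```

With the tags colour=blue and car=sedan, this yields carsCounter;car=sedan;colour=blue, matching the example earlier in this section.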

REST-API Objects

API objects MAY include one or more metrics, as in

{
  "thread.count": 33,
  "thread.max.count": 47,
  "memory.maxHeap": 3817863211,
  "memory.usedHeap": 16859081,
  "memory.committedHeap": 64703546
}

or

{
  "hitCount;type=yes": 45
}

If /metrics is requested, the data for each scope is wrapped in the scope name:

{
  "application": {
    "hitCount": 45
  },
  "base": {
     "thread.count": 33,
     "thread.max.count": 47
  },
  "vendor": {...},
  "someCustomScope": {
    "myCounter": 22
  }
}

If there is a scope that contains no metrics, then it can be either present with an empty object as its value, or it can be omitted completely.
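The scope wrapping can be sketched as follows. The class ScopedMetricsJson is hypothetical and hand-rolls the JSON only to keep the example self-contained; a real implementation would more likely use a JSON library. The sketch chooses to omit empty scopes, which is one of the two options allowed above, and it does not handle non-finite numbers.

```java
import java.util.Map;

/** Hypothetical renderer for illustration; not part of the MicroProfile Metrics API. */
final class ScopedMetricsJson {

    private ScopedMetricsJson() {
    }

    /** Renders {"scope":{"key":value,...},...}, skipping scopes without metrics. */
    static String render(Map<String, Map<String, Number>> metricsByScope) {
        StringBuilder json = new StringBuilder("{");
        boolean firstScope = true;
        for (Map.Entry<String, Map<String, Number>> scope : metricsByScope.entrySet()) {
            if (scope.getValue().isEmpty()) {
                continue; // an empty scope may be omitted completely
            }
            if (!firstScope) {
                json.append(',');
            }
            firstScope = false;
            json.append(quote(scope.getKey())).append(":{");
            boolean firstMetric = true;
            for (Map.Entry<String, Number> metric : scope.getValue().entrySet()) {
                if (!firstMetric) {
                    json.append(',');
                }
                firstMetric = false;
                json.append(quote(metric.getKey())).append(':').append(metric.getValue());
            }
            json.append('}');
        }
        return json.append('}').toString();
    }

    /** Quotes a string, escaping the characters that JSON reserves inside strings. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }
}
```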

Gauge JSON Format

The value of the gauge must be equivalent to a call to the Gauge instance’s getValue(). The JSON leaf is named <metric-name>[';'<tag-name>'='<tag-value>]+ with tags in alphabetical order.

Example Gauge JSON GET Response
{
  "responsePercentage": 48.45632,
  "responsePercentage;servlet=two": 26.23654,
  "responsePercentage;servlet=three;store=webshop": 29.24554
}

Counter JSON Format

The value of the counter must be equivalent to a call to the Counter instance’s getCount(). The JSON leaf is named <metric-name>[';'<tag-name>'='<tag-value>]+ with tags in alphabetical order.

Example Counter JSON GET Response
{
  "hitCount": 45,
  "hitCount;servlet=two": 3,
  "hitCount;servlet=three;store=webshop": 4
}

Histogram JSON Format

Histogram is a complex metric type composed of multiple key/value pairs. The format is specified by the table below. The JSON node is named <metric-name>. The JSON leaves are named <key>[';'<tag-name>'='<tag-value>]+ with tags in alphabetical order and keys according to the table below.

Table 1. JSON mapping for a Histogram metric
JSON Key | Value (Equivalent Histogram method)
count    | getCount()
sum      | getSum()
min      | getSnapshot().getMin()
max      | getSnapshot().getMax()
p50      | getSnapshot().getMedian()
p75      | getSnapshot().get75thPercentile()
p95      | getSnapshot().get95thPercentile()
p98      | getSnapshot().get98thPercentile()
p99      | getSnapshot().get99thPercentile()
p999     | getSnapshot().get999thPercentile()

Example Histogram JSON GET Response
{
  "daily_value_changes": {
    "count": 2,
    "sum": -1598,
    "min": -1624,
    "max": 26,
    "p50": 26.0,
    "p75": 26.0,
    "p95": 26.0,
    "p98": 26.0,
    "p99": 26.0,
    "p999": 26.0,
    "count;servlet=two": 2,
    "sum;servlet=two": -1598,
    "min;servlet=two": -1624,
    "max;servlet=two": 26,
    "p50;servlet=two": 26.0,
    "p75;servlet=two": 26.0,
    "p95;servlet=two": 26.0,
    "p98;servlet=two": 26.0,
    "p99;servlet=two": 26.0,
    "p999;servlet=two": 26.0
  }
}
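The leaf layout of a Histogram node can be sketched as below. The class HistogramJsonLeaves and its parameters are hypothetical; the values would come from the Histogram methods listed in the table, and the tag suffix is assumed to be pre-built (empty for an untagged metric).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class HistogramJsonLeaves {

    private static final String[] PERCENTILE_KEYS = {"p50", "p75", "p95", "p98", "p99", "p999"};

    private HistogramJsonLeaves() {
    }

    /**
     * Returns the leaves of one histogram in table order.
     * tagSuffix is "" for an untagged metric, or for example ";servlet=two".
     * percentiles holds the p50, p75, p95, p98, p99 and p999 values in that order.
     */
    static Map<String, Number> leaves(String tagSuffix, long count, long sum,
                                      long min, long max, double[] percentiles) {
        if (percentiles.length != PERCENTILE_KEYS.length) {
            throw new IllegalArgumentException("expected 6 percentile values");
        }
        // LinkedHashMap keeps insertion order, so the leaves stay in table order.
        Map<String, Number> leaves = new LinkedHashMap<>();
        leaves.put("count" + tagSuffix, count);
        leaves.put("sum" + tagSuffix, sum);
        leaves.put("min" + tagSuffix, min);
        leaves.put("max" + tagSuffix, max);
        for (int i = 0; i < PERCENTILE_KEYS.length; i++) {
            leaves.put(PERCENTILE_KEYS[i] + tagSuffix, percentiles[i]);
        }
        return leaves;
    }
}
```

Calling leaves once per tag combination and merging the results under the <metric-name> node gives a structure like the example response above.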

Timer JSON Format

Timer is a complex metric type composed of multiple key/value pairs. The format is specified by the table below. The JSON node is named <metric-name>. The JSON leaves are named <key>[';'<tag-name>'='<tag-value>]+ with tags in alphabetical order and keys according to the table below.

Table 2. JSON mapping for a Timer metric
JSON Key    | Value (Equivalent Timer method)
count       | getCount()
elapsedTime | getElapsedTime()
min         | getSnapshot().getMin()
max         | getSnapshot().getMax()
p50         | getSnapshot().getMedian()
p75         | getSnapshot().get75thPercentile()
p95         | getSnapshot().get95thPercentile()
p98         | getSnapshot().get98thPercentile()
p99         | getSnapshot().get99thPercentile()
p999        | getSnapshot().get999thPercentile()

Example Timer JSON GET Response
{
  "responseTime": {
    "count": 29382,
    "elapsedTime": 25608694,
    "min": 169916,
    "max": 5608694,
    "p50": 293324.0,
    "p75": 344914.0,
    "p95": 543647.0,
    "p98": 2706543.0,
    "p99": 5608694.0,
    "p999": 5608694.0,
    "count;servlet=two": 29382,
    "elapsedTime;servlet=two": 25608694,
    "min;servlet=two": 169916,
    "max;servlet=two": 5608694,
    "p50;servlet=two": 293324.0,
    "p75;servlet=two": 344914.0,
    "p95;servlet=two": 543647.0,
    "p98;servlet=two": 2706543.0,
    "p99;servlet=two": 5608694.0,
    "p999;servlet=two": 5608694.0
  }
}

Metadata

Metadata is exposed in a tree-like fashion with sub-trees for the sub-resources mentioned previously. Tags from metrics associated with the metric name are also included. The 'tags' attribute is an array of nested arrays which hold tags from different metrics that are associated with the metadata. Tags in each inner array are in alphabetical order.

Example:

If GET /metrics/base/fooVal exposes:

{
  "fooVal;store=webshop": 12345
}

then OPTIONS /metrics/base/fooVal will expose:

{
  "fooVal": {
    "unit": "milliseconds",
    "type": "gauge",
    "description": "The size of foo after each request",
    "displayName": "Size of foo",
    "tags": [
      [
        "store=webshop"
      ]
    ]
  }
}

If GET /metrics/base exposes multiple values like this:

Example of exposed metrics data
{
  "fooVal;store=webshop": 12345,
  "barVal;component=backend;store=webshop": 42,
  "barVal;component=frontend;store=webshop": 63
}

then OPTIONS /metrics/base exposes:

Example of JSON output of Metadata
{
  "fooVal": {
    "unit": "milliseconds",
    "type": "gauge",
    "description": "The average duration of foo requests during last 5 minutes",
    "displayName": "Duration of foo",
    "tags": [
      [
        "store=webshop"
      ]
    ]
  },
  "barVal": {
    "unit": "megabytes",
    "type": "gauge",
    "tags": [
      [
        "component=backend",
        "store=webshop"
      ],
      [
        "component=frontend",
        "store=webshop"
      ]
    ]
  }
}
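One way to derive the nested 'tags' array is to group the GET keys by metric name, as sketched below. The class MetadataTags is hypothetical. The sketch assumes the keys already follow the GET naming rules (tags sorted, semicolons in values replaced), and it assumes an untagged metric contributes no inner array, which this section does not settle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class MetadataTags {

    private MetadataTags() {
    }

    /** Maps each metric name to the list of tag sets seen for that name. */
    static Map<String, List<List<String>>> group(List<String> jsonKeys) {
        Map<String, List<List<String>>> tagsByName = new LinkedHashMap<>();
        for (String key : jsonKeys) {
            // "barVal;component=backend;store=webshop" -> ["barVal", "component=backend", "store=webshop"]
            String[] parts = key.split(";");
            List<List<String>> tagSets = tagsByName.computeIfAbsent(parts[0], n -> new ArrayList<>());
            if (parts.length > 1) {
                tagSets.add(new ArrayList<>(Arrays.asList(parts).subList(1, parts.length)));
            }
        }
        return tagsByName;
    }
}
```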

Prometheus / OpenMetrics formats

The REST API must respond to GET requests with data formatted according to the Prometheus text-based exposition format, version 0.0.4 (hereafter Prometheus format). For details of how to format metrics data in this format, see Prometheus format.

Implementations may additionally provide the ability to respond to GET requests with data formatted according to the OpenMetrics exposition format, version 1.0 (hereafter OpenMetrics format). For details on how to format metrics data in this format, see OpenMetrics format.

This section provides the details of how to map from the Gauge, Counter, Timer and Histogram types defined in this specification into appropriate fields in the Prometheus format.

Details of how to format metric names, including conventions, special character mapping and placement of the unit (if provided) in the name, are as described by the Prometheus format and OpenMetrics format documentation.

Quantile values, as used in Histogram and Timer output, should represent recent values (typically from the last 5-10 minutes). If no data is available from that timeframe, the value must be set to NaN.
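The NaN rule can be sketched as follows. The class RecentQuantile is hypothetical, the caller is assumed to pass only the samples from the recent time window, and the nearest-rank method used here is an illustrative choice; this specification does not prescribe how quantiles are estimated.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class RecentQuantile {

    private RecentQuantile() {
    }

    /**
     * Returns the q-quantile (0 <= q <= 1) of the recent samples using the
     * nearest-rank method, or NaN when the window holds no samples.
     */
    static double quantile(double[] recentValues, double q) {
        if (recentValues.length == 0) {
            return Double.NaN; // no data in the timeframe
        }
        double[] sorted = recentValues.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.length);
        // Clamp so that q = 0 maps to the smallest sample and q = 1 to the largest.
        int index = Math.min(Math.max(rank - 1, 0), sorted.length - 1);
        return sorted[index];
    }
}
```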

Gauge

Example Gauge with unit celsius in Prometheus format.
# HELP current_temperature_celsius The current temperature. (1)
# TYPE current_temperature_celsius gauge (2)
current_temperature_celsius{scope="application",server="front_office"} 36.2 (3)
  1. The description of the gauge, from the getDescription() method of the Metadata associated with the gauge, must be provided in the HELP line.

  2. The type of the metric, in this case gauge, must be shown in the TYPE line

  3. The value specified must be the value of the gauge’s getValue() method. Tags, if provided, are included in curly braces, separated by commas.
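The three lines of the gauge example can be produced by a sketch like the one below. The class PrometheusGauge is hypothetical, the metric name is assumed to be already formatted (unit included), and labels are written in the iteration order of the map passed in. Label values are escaped the way the Prometheus text format requires (backslash, double quote and line feed).

```java
import java.util.Map;

/** Hypothetical renderer for illustration; not part of the MicroProfile Metrics API. */
final class PrometheusGauge {

    private PrometheusGauge() {
    }

    /** Renders the HELP line, the TYPE line and one sample line for a gauge. */
    static String render(String name, String help, Map<String, String> labels, double value) {
        StringBuilder out = new StringBuilder();
        out.append("# HELP ").append(name).append(' ').append(help).append('\n');
        out.append("# TYPE ").append(name).append(" gauge\n");
        out.append(name);
        if (!labels.isEmpty()) {
            out.append('{');
            boolean first = true;
            for (Map.Entry<String, String> label : labels.entrySet()) {
                if (!first) {
                    out.append(',');
                }
                first = false;
                out.append(label.getKey()).append("=\"")
                   .append(escapeLabelValue(label.getValue())).append('"');
            }
            out.append('}');
        }
        return out.append(' ').append(value).append('\n').toString();
    }

    /** Escapes backslash, double quote and line feed in a label value. */
    static String escapeLabelValue(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }
}
```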

Counter

Example Counter with unit events in Prometheus format.
# HELP messages_processed_events_total Number of messages handled (1)
# TYPE messages_processed_events_total counter (2)
messages_processed_events_total{scope="application"} 1.0 (3)
  1. The description of the counter must be provided in the HELP line

  2. The type of the metric, in this case counter, must be shown in the TYPE line

  3. The value specified must be the value of the counter’s getCount() method. Tags, if provided, are included in curly braces, separated by commas. By convention, _total should be added to the end of the counter name.
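The naming seen in the counter example (name, then unit, then _total) can be sketched as below. The class CounterNames is hypothetical, and the sketch assumes the base name has already been mapped to valid Prometheus characters; that mapping is defined by the Prometheus format documentation and is not shown here.

```java
/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class CounterNames {

    private CounterNames() {
    }

    /** Builds <baseName>[_<unit>]_total; a null or empty unit is skipped. */
    static String prometheusName(String baseName, String unit) {
        StringBuilder name = new StringBuilder(baseName);
        if (unit != null && !unit.isEmpty()) {
            name.append('_').append(unit);
        }
        return name.append("_total").toString();
    }
}
```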

Histogram

Example Histogram with unit meters in Prometheus format.
# HELP distance_to_hole_meters_max Distance of golf ball to hole (1)
# TYPE distance_to_hole_meters_max gauge (2)
distance_to_hole_meters_max{scope="golf_stats"} 12.722726616315509 (3)
# HELP distance_to_hole_meters Distance of golf ball to hole (1)
# TYPE distance_to_hole_meters summary (2)
distance_to_hole_meters{scope="golf_stats",quantile="0.5"} 2.8748779296875 (3)
distance_to_hole_meters{scope="golf_stats",quantile="0.75"} 4.4998779296875 (3)
distance_to_hole_meters{scope="golf_stats",quantile="0.95"} 7.9998779296875 (3)
distance_to_hole_meters{scope="golf_stats",quantile="0.98"} 9.4998779296875 (3)
distance_to_hole_meters{scope="golf_stats",quantile="0.99"} 11.9998779296875 (3)
distance_to_hole_meters{scope="golf_stats",quantile="0.999"} 12.9998779296875 (3)
distance_to_hole_meters_count{scope="golf_stats"} 487.0 (3)
distance_to_hole_meters_sum{scope="golf_stats"} 1569.3785694223322 (3)

Histogram output consists of a maximum section and a summary section.

  1. The description of the histogram must be provided on the HELP lines for the maximum and the summary.

  2. The type of each metric, in this case gauge for the maximum and summary for the summary, must be shown on the TYPE lines. The summary type consists of the count, the sum and multiple quantile values.

  3. The value of each metric included in the output is described in the table below. Tags, if provided, are included in curly braces, separated by commas. Percentile metrics include a quantile label that is merged with the metric’s tags.

Table 3. Prometheus format mapping for a Histogram metric
Suffix{label}             | TYPE    | Value (Histogram method)            | Units
<units>_max               | Gauge   | getSnapshot().getMax()              | <units>
<units>{quantile="0.5"}   | Summary | getSnapshot().getMedian()           | <units>
<units>{quantile="0.75"}  | Summary | getSnapshot().get75thPercentile()   | <units>
<units>{quantile="0.95"}  | Summary | getSnapshot().get95thPercentile()   | <units>
<units>{quantile="0.98"}  | Summary | getSnapshot().get98thPercentile()   | <units>
<units>{quantile="0.99"}  | Summary | getSnapshot().get99thPercentile()   | <units>
<units>{quantile="0.999"} | Summary | getSnapshot().get999thPercentile()  | <units>
<units>_count             | Summary | getCount()                          | <units>
<units>_sum               | Summary | getSum()                            | <units>
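The two sections of the histogram output, including the merge of the quantile label with the metric's tags, can be sketched as below. The class PrometheusHistogram and its parameters are hypothetical; the name is assumed to already include the unit, the values are assumed to come from the Histogram methods in the table, and label-value escaping is omitted for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical renderer for illustration; not part of the MicroProfile Metrics API. */
final class PrometheusHistogram {

    private PrometheusHistogram() {
    }

    /** quantiles maps a quantile label such as "0.5" to its value, in output order. */
    static String render(String name, String help, Map<String, String> tags,
                         double max, Map<String, Double> quantiles,
                         long count, double sum) {
        StringBuilder out = new StringBuilder();
        // Maximum section: a gauge named <name>_max.
        out.append("# HELP ").append(name).append("_max ").append(help).append('\n');
        out.append("# TYPE ").append(name).append("_max gauge\n");
        out.append(sample(name + "_max", tags, max));
        // Summary section: quantile samples, then count and sum.
        out.append("# HELP ").append(name).append(' ').append(help).append('\n');
        out.append("# TYPE ").append(name).append(" summary\n");
        for (Map.Entry<String, Double> quantile : quantiles.entrySet()) {
            // The quantile label is merged with a copy of the metric's tags.
            Map<String, String> labels = new LinkedHashMap<>(tags);
            labels.put("quantile", quantile.getKey());
            out.append(sample(name, labels, quantile.getValue()));
        }
        out.append(sample(name + "_count", tags, count));
        out.append(sample(name + "_sum", tags, sum));
        return out.toString();
    }

    /** Renders one sample line: name{key="value",...} value */
    static String sample(String name, Map<String, String> labels, double value) {
        StringBuilder line = new StringBuilder(name);
        if (!labels.isEmpty()) {
            line.append('{');
            boolean first = true;
            for (Map.Entry<String, String> label : labels.entrySet()) {
                if (!first) {
                    line.append(',');
                }
                first = false;
                line.append(label.getKey()).append("=\"").append(label.getValue()).append('"');
            }
            line.append('}');
        }
        return line.append(' ').append(value).append('\n').toString();
    }
}
```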

Timer

Example Timer in Prometheus format. Timers use seconds as the unit.
# HELP myClass_myMethod_seconds duration of myMethod (1)
# TYPE myClass_myMethod_seconds summary (2)
myClass_myMethod_seconds{scope="vendor",quantile="0.5"} 0.0524288 (3)
myClass_myMethod_seconds{scope="vendor",quantile="0.75"} 0.0524288 (3)
myClass_myMethod_seconds{scope="vendor",quantile="0.95"} 0.054525952 (3)
myClass_myMethod_seconds{scope="vendor",quantile="0.98"} 0.054525952 (3)
myClass_myMethod_seconds{scope="vendor",quantile="0.99"} 0.054525952 (3)
myClass_myMethod_seconds{scope="vendor",quantile="0.999"} 0.054525952 (3)
myClass_myMethod_seconds_count{scope="vendor"} 100.0 (3)
myClass_myMethod_seconds_sum{scope="vendor"} 5.310349419 (3)
# HELP myClass_myMethod_seconds_max duration of myMethod (1)
# TYPE myClass_myMethod_seconds_max gauge (2)
myClass_myMethod_seconds_max{scope="vendor"} 0.05507899 (3)

Timer output consists of a maximum section and a summary section.

  1. The description of the timer must be provided on the HELP lines for the maximum and the summary.

  2. The type of each metric, in this case gauge for the maximum and summary for the summary, must be shown on the TYPE lines. The summary type consists of the count, the sum and multiple quantile values.

  3. The value of each metric included in the output is described in the table below. Tags, if provided, are included in curly braces, separated by commas. Percentile metrics include a quantile label that is merged with the metric’s tags.

Table 4. Prometheus format mapping for a Timer metric
Suffix{label}             | TYPE    | Value (Timer method)                | Units
seconds_max               | Gauge   | getSnapshot().getMax()              | SECONDS¹
seconds{quantile="0.5"}   | Summary | getSnapshot().getMedian()           | SECONDS¹
seconds{quantile="0.75"}  | Summary | getSnapshot().get75thPercentile()   | SECONDS¹
seconds{quantile="0.95"}  | Summary | getSnapshot().get95thPercentile()   | SECONDS¹
seconds{quantile="0.98"}  | Summary | getSnapshot().get98thPercentile()   | SECONDS¹
seconds{quantile="0.99"}  | Summary | getSnapshot().get99thPercentile()   | SECONDS¹
seconds{quantile="0.999"} | Summary | getSnapshot().get999thPercentile()  | SECONDS¹
seconds_count             | Summary | getCount()                          | SECONDS¹
seconds_sum               | Summary | getElapsedTime()                    | SECONDS¹

¹ The implementation is expected to convert the result returned by the Timer into seconds.
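The conversion into seconds can be sketched as below. The class TimerSeconds is hypothetical, and the sketch assumes the Timer reports either a java.time.Duration or a raw nanosecond value; check the Java API for the actual return types.

```java
import java.time.Duration;

/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class TimerSeconds {

    private static final double NANOS_PER_SECOND = 1_000_000_000.0;

    private TimerSeconds() {
    }

    /** Converts a Duration into fractional seconds. */
    static double fromDuration(Duration duration) {
        return duration.toNanos() / NANOS_PER_SECOND;
    }

    /** Converts a nanosecond value (assumed unit) into fractional seconds. */
    static double fromNanos(double nanos) {
        return nanos / NANOS_PER_SECOND;
    }
}
```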

Security

It must be possible to secure the endpoints via the usual means. In this version of the specification, the definition of 'usual means' is implementation-specific.

If the endpoint is secured, accessing /metrics without valid credentials must return a 401 Unauthorized response.

A server SHOULD implement TLS encryption by default.

Implementations may ignore security for trusted origins (e.g. localhost).
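The access decision described above can be sketched as a single function. The class MetricsAccess is hypothetical, how credentials are validated is implementation-specific and is reduced to a boolean here, and treating the loopback address as the only trusted origin is an assumption made for illustration.

```java
import java.net.InetAddress;

/** Hypothetical helper for illustration; not part of the MicroProfile Metrics API. */
final class MetricsAccess {

    private MetricsAccess() {
    }

    /** Returns the HTTP status for a request to /metrics: 200 (OK) or 401 (Unauthorized). */
    static int statusFor(boolean secured, boolean validCredentials, InetAddress remote) {
        if (!secured || validCredentials) {
            return 200;
        }
        // Optional relaxation: trust requests that originate from localhost.
        if (remote != null && remote.isLoopbackAddress()) {
            return 200;
        }
        return 401;
    }
}
```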