fix(Data Shape): Updated draft to follow our style guide more closely
rhetoric101 committed Oct 26, 2021
1 parent 947322a commit 605103b
Showing 8 changed files with 105 additions and 106 deletions.
@@ -0,0 +1,105 @@
---
title: Prometheus Remote Write data and billed bytes
tags:
- Telemetry Data Platform
- Prometheus Remote Write
- Sent bytes vs. billed bytes
- Prometheus integration
metaDescription: "Explanation for the difference in bytes sent vs. bytes stored/billed for Prometheus Remote Write data."
---

The size of the billed bytes from Prometheus Remote Write can be higher than the bytes sent to New Relic. To make sure you're not surprised by the difference, take a look at how data compression affects billed bytes.

## Data compression [#data-compression]

Prometheus Remote Write data is sent to New Relic [compressed](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations) for faster, lossless transmission. When ingested, that data is uncompressed and decorated so that it can be used with New Relic features, such as [entity synthesis](/docs/new-relic-one/use-new-relic-one/core-concepts/what-entity-new-relic/#entity-synthesis). A difference between the compressed and uncompressed byte rates is expected, but for Prometheus Remote Write data the size of that difference matters because of New Relic’s [billing model](/docs/accounts/accounts-billing/new-relic-one-pricing-billing/new-relic-one-pricing-billing/#usage-calculation).

You are billed based on the computational effort needed to ingest your data, as well as the size of the data stored within New Relic. Together, decompression and the data transformations can result in the final uncompressed bytes stored being around 15x the size of their compressed counterpart.

For example, based on a sampling of time-series data we gathered when simulating real-world traffic, you might see something like this:

```
~124 GB/day compressed data sent = ~1.86TB uncompressed data stored
```
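The arithmetic behind this example can be sketched in a few lines of Python. This is only a rough estimate: the ~15x factor is the approximate figure from the sampled data above, the function name is hypothetical, and decimal units (1 TB = 1000 GB) are assumed.

```python
# Rough estimate only: the ~15x factor comes from the sampled data described
# above and will vary with label density and metric type.
APPROX_EXPANSION_FACTOR = 15  # assumed average, not a guaranteed ratio


def estimate_stored_gb(compressed_gb_per_day: float,
                       expansion_factor: float = APPROX_EXPANSION_FACTOR) -> float:
    """Estimate uncompressed, decorated GB stored per day from compressed GB sent."""
    return compressed_gb_per_day * expansion_factor


stored_gb = estimate_stored_gb(124)
stored_tb = stored_gb / 1000  # decimal units: 1 TB = 1000 GB
print(f"~124 GB/day sent -> ~{stored_tb:.2f} TB/day stored")
```

Passing a different `expansion_factor` lets you see how sensitive the estimate is to the ratio you actually observe for your own data.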

Below is a simulation of the byte rate changes as Prometheus Remote Write data moves through our system. In this case, metrics were generated by ingesting a local Prometheus server’s remote write scrape of a local node-exporter:

![Byte rate estimate total comparison](./images/byte-rate-estimate-total-comparison.png "Byte rate estimate total comparison")

Note how the Prometheus sent byte rate closely matches the Remote Write compressed bytes count that we record on our end just before uncompressing the data point(s). We can attribute the increased variance of the Remote Write compressed byte rate to the nature of processing the data through our distributed systems:

![Sent vs. compressed bytes comparison](./images/sent-vs-compressed-bytes-comparison.png "Sent vs. compressed bytes comparison")

As the data points are uncompressed, decompression alone expands the data by a factor of roughly 5-10x. This is reflected in the difference between the Remote Write compressed bytes rate and the Remote Write uncompressed bytes rate, which are measured right before and right after data decompression.

![Uncompressed vs. compressed bytes comparison](./images/uncompressed-vs-compressed-bytes-comparison.png "Uncompressed vs. compressed bytes comparison")

Finally, as the data is transformed and enriched, the difference between the Remote Write uncompressed bytes and the `bytecountestimate()` can be seen below. The `bytecountestimate()` listed is a measure of the byte count of the final state of the data before being stored.

![Bytecountestimate() vs. uncompressed bytes comparison](./images/bytecountestimate-vs-uncompressed-bytes-comparison.png "Bytecountestimate() vs. uncompressed bytes comparison")

To give a better understanding of the possible data transformations and additions that Prometheus Remote Write data can go through, below is a comparison of the `prometheus_remote_storage_bytes_total` metric, a measure reported by the Prometheus server, as represented by Prometheus and by New Relic.

This is an example comparison for illustrative purposes, so it's lightly decorated. The final comparison for other, more densely labelled or featured metrics will be different. The representation as given by Prometheus comes first, followed by its NRQL query counterpart:

*Prometheus Server Representation*

```
prometheus_remote_storage_bytes_total{
  instance="localhost:9090",
  job="prometheus",
  remote_name="5dfb33",
  url="https://staging-metric-api.newrelic.com/prometheus/v1/write?prometheus_server=foobarbaz"
} 23051
```

*New Relic Query Representation*

```
{
  "endTimestamp": 1631305327668,
  "instance": "localhost:9090",
  "instrumentation.name": "remote-write",
  "instrumentation.provider": "prometheus",
  "instrumentation.source": "foobarbaz",
  "instrumentation.version": "0.0.2",
  "job": "prometheus",
  "metricName": "prometheus_remote_storage_bytes_total",
  "newrelic.source": "prometheusAPI",
  "prometheus_remote_storage_bytes_total": {
    "type": "count",
    "count": 23051
  },
  "prometheus_server": "foobarbaz",
  "remote_name": "5dfb33",
  "timestamp": 1631305312668,
  "url": "https://staging-metric-api.newrelic.com/prometheus/v1/write?prometheus_server=foobarbaz"
}
```
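To make the decoration visible, the following Python sketch lists which keys appear in the stored data point but not among the labels Prometheus reported. The key sets are copied by hand from the two examples above, and the variable names are hypothetical. It only counts keys and does not reproduce how New Relic measures stored bytes.

```python
# Labels as reported by the Prometheus server (from the first example).
prometheus_labels = {
    "instance": "localhost:9090",
    "job": "prometheus",
    "remote_name": "5dfb33",
    "url": "https://staging-metric-api.newrelic.com/prometheus/v1/write?prometheus_server=foobarbaz",
}

# Top-level keys of the stored data point (from the second example).
new_relic_keys = {
    "endTimestamp", "instance", "instrumentation.name",
    "instrumentation.provider", "instrumentation.source",
    "instrumentation.version", "job", "metricName", "newrelic.source",
    "prometheus_remote_storage_bytes_total", "prometheus_server",
    "remote_name", "timestamp", "url",
}

# Keys present after ingestion that were not labels on the original sample.
added = sorted(new_relic_keys - set(prometheus_labels))
print(f"{len(prometheus_labels)} labels sent, {len(new_relic_keys)} keys stored")
for key in added:
    print("added:", key)
```

Even for this lightly decorated metric, most of the stored keys are additions such as timestamps and `instrumentation.*` attributes, which contributes to the growth in stored bytes.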

## NRQL queries [#nrql-queries]

Try these queries to gather byte count information:

*Viewing estimated byte count stored at New Relic*

```
FROM Metric SELECT rate(bytecountestimate(), 1 minute) AS 'bytecountestimate()' WHERE prometheus_server = 'INSERT_PROMETHEUS_SERVER_NAME' SINCE 1 hour ago TIMESERIES AUTO
```

*Prometheus monitoring of bytes sent to New Relic*

```
FROM Metric SELECT rate(sum(prometheus_remote_storage_samples_bytes_total), 1 minute) AS 'Prometheus sent bytes rate' WHERE prometheus_server = 'INSERT_PROMETHEUS_SERVER_NAME' SINCE 1 hour ago TIMESERIES AUTO
```
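If you generate these queries for several Prometheus servers, a small helper can fill in the server name. The sketch below is an assumption-laden convenience, not part of any New Relic tooling: the constant and function names are hypothetical, and the value is wrapped in single quotes as an NRQL string literal. Names containing quotes or backslashes are rejected rather than escaped.

```python
PLACEHOLDER = "INSERT_PROMETHEUS_SERVER_NAME"

# Query text taken from the two examples above, with the placeholder unquoted.
STORED_BYTES_QUERY = (
    "FROM Metric SELECT rate(bytecountestimate(), 1 minute) AS 'bytecountestimate()' "
    f"WHERE prometheus_server = {PLACEHOLDER} SINCE 1 hour ago TIMESERIES AUTO"
)
SENT_BYTES_QUERY = (
    "FROM Metric SELECT rate(sum(prometheus_remote_storage_samples_bytes_total), 1 minute) "
    "AS 'Prometheus sent bytes rate' "
    f"WHERE prometheus_server = {PLACEHOLDER} SINCE 1 hour ago TIMESERIES AUTO"
)


def with_server(query: str, server_name: str) -> str:
    """Return the query with the placeholder replaced by a quoted server name."""
    if "'" in server_name or "\\" in server_name:
        # Escaping rules are not handled in this sketch, so refuse such names.
        raise ValueError("server name must not contain quotes or backslashes")
    return query.replace(PLACEHOLDER, f"'{server_name}'")


print(with_server(STORED_BYTES_QUERY, "foobarbaz"))
print(with_server(SENT_BYTES_QUERY, "foobarbaz"))
```

Charting the two resulting queries over the same time window gives a direct view of sent bytes versus estimated stored bytes for one server.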

## References [#references]

Here are some links to clarify compression and encoding:

* [Prometheus referencing Snappy Compression being used in encoding](https://prometheus.io/docs/prometheus/latest/storage/#overview): The read and write protocols both use a snappy-compressed protocol buffer encoding over HTTP. The protocols are not considered as stable APIs yet and may change to use gRPC over HTTP/2 in the future, when all hops between Prometheus and the remote storage can safely be assumed to support HTTP/2.

* [Prometheus Protobuf Reference](https://github.com/prometheus/prometheus/blob/main/prompb/types.proto#L58-L64)

This file was deleted.
