Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote read/write interfaces use Snappy when they probably shouldn't #2692

Closed
engelrob opened this issue May 9, 2017 · 11 comments
Closed

Remote read/write interfaces use Snappy when they probably shouldn't #2692

engelrob opened this issue May 9, 2017 · 11 comments

Comments

@engelrob
Copy link

engelrob commented May 9, 2017

What did you do?

I implemented a remote_read interface for Prometheus in Java.

What did you expect to see?

I expected to receive Protobuf-encoded remote ReadRequests from Prometheus, but instead...

What did you see instead? Under which circumstances?

Instead of the protobuf binary format, Prometheus' remote read requests are Snappy-compressed data.

First, this is not stated in the Content-type, which says "application/x-protobuf" indicating that it is protobuf binary format, not Snappy.

Second, I am currently not aware of any Java implementation of the Snappy codec that can successfully decode golang-snappy encoded data (I tried both xerial-snappy and dain/snappy and could not make it decompress the golang-snappy encoded data).

  • Prometheus version:

    1.6.1

  • Prometheus configuration file:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

remote_read:
  - url: "http://localhost:9201/read"
@tomwilkie
Copy link
Member

Hi there - we chose snappy intentionally, as a trade off between compression ratio and cpu usage. It was out intention that this should work across languages.

First, this is not stated in the Content-type, which says "application/x-protobuf" indicating that it is protobuf binary format, not Snappy.

This was suggested in issue #2522. I think this header is correct - the body is protobuf - but we should probably also send Content-Encoding: snappy as well.

Second, I am currently not aware of any Java implementation of the Snappy codec that can successfully decode golang-snappy encoded data (I tried both xerial-snappy and dain/snappy and could not make it decompress the golang-snappy encoded data).

This is concerning. Snappy is supposed to be cross language! Do you some code I can use to reproduce this? if this is indeed true then its a bug in the upstream snappy library IMO.

@tomwilkie
Copy link
Member

It seems you are not the first person to notice this: xerial/snappy-java#175

I've reproduced the issue with a pair of simple test programs here: https://github.com/weaveworks-experiments/prometheus-benchmarks/tree/master/cmd/snappy

@tomwilkie
Copy link
Member

https://github.com/eapache/go-xerial-snappy seems to imply the problem is on xerials end.

@tomwilkie
Copy link
Member

Right, I think the problem is that snappy doesn't specify a streaming format, so each library has their own. I can't find a java library that is compatible with golang/snappy's framing format.

I suggest we remove the framing format (we have all the data in memory) and just send the unframed snappy.

@tomwilkie
Copy link
Member

And the content-encoding header has already been added: 3f23aa2

@juliusv
Copy link
Member

juliusv commented May 9, 2017

I didn't know about Snappy's framing format. https://github.com/google/snappy/blob/master/framing_format.txt says,

Implementation of the framing format is optional for Snappy compressors and decompressor; it is not part of the Snappy core specification.

...so I think you're right that we have to omit the framing and use snappy.Encode() / snappy.Decode() instead of snappy.Writer and snappy.Reader to achieve that. A bit of a bummer, as I predict this will cause quite a bit of confusion down the road for users that expect that the direct encode/decode would do the same thing as the reader/writer variants, with different interfaces.

@tomwilkie
Copy link
Member

Yeah its a bit... weird. Even worse, the golang reader/writer can't read the non-framed encoding. So we'll need to bump the remote read/write version header, and detect it using that. PR incoming.

@engelrob
Copy link
Author

engelrob commented May 9, 2017

Thanks for addressing this. Other than the fix for cross-lang Snappy, it will also help that the Content-encoding header is now going to say it's Snappy (in my case it took me a couple of hours to figure out why my protobuf deserializer was not able to read the bytestream). Maybe it would be good to also mention in the docs that it's Snappy encoded.

@tomwilkie
Copy link
Member

tomwilkie commented May 9, 2017

the content-encoding header has already been added: 3f23aa2

(edit) actually looks like I added it ages ago! d838792#diff-ae8db9d16d8057358e49d694522e7186R82

@tomwilkie
Copy link
Member

Looks like its not there on read though, and it should be.

mattbostock added a commit to mattbostock/timbala that referenced this issue Jun 26, 2017
Prometheus switched to use unframed Snappy encoding as of version 1.7.0.

Upgrade Prometheus in the integration test environment to 1.7.1 and
update the remote read API endpoint to accept unframed Snappy-encoded
data from Prometheus, and update the acceptance tests to match.

See:
prometheus/prometheus#2692
https://github.com/prometheus/prometheus/tree/v1.7.0
mattbostock added a commit to mattbostock/timbala that referenced this issue Jun 26, 2017
Prometheus switched to use unframed Snappy encoding as of version 1.7.0.

Upgrade Prometheus in the integration test environment to 1.7.1 and
update the remote read API endpoint to accept unframed Snappy-encoded
data from Prometheus, and update the acceptance tests to match.

See:
prometheus/prometheus#2692
https://github.com/prometheus/prometheus/tree/v1.7.0
@lock
Copy link

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants