Remote storage #10

Closed
juliusv opened this Issue Jan 4, 2013 · 169 comments

Member

juliusv commented Jan 4, 2013

Prometheus needs to be able to interface with a remote and scalable data store for long-term storage/retrieval.

bernerdschaefer added a commit that referenced this issue Apr 9, 2014

Rename test helper files to helpers_test.go
This ensures that these files are properly included only in testing.

[Fixes #10]
johann8384 commented Feb 5, 2015

Is there anyone planning to work on this? Is the work done in the opentsdb-integration branch still valid or has the rest of the code-base moved past that?

Member

beorn7 commented Feb 5, 2015

The opentsdb-integration branch is indeed completely outdated (still using the old storage backend etc.). Personally, I'm a great fan of the OpenTSDB integration, but where I work, there is not an urgent enough requirement to justify a high priority from my side...

Member

juliusv commented Feb 5, 2015

To be clear, the outdated "opentsdb-integration" was only for the
proof-of-concept read-back support (querying OpenTSDB through Prometheus).

Writing into OpenTSDB should be experimentally supported in master, but
the last time we tried it was a year ago on a single-node OpenTSDB.

You initially asked on #10:

"I added the storage.remote.url command line flag, but as far as I can tell
Prometheus doesn't attempt to store any metrics there."

A couple of questions:

  • Did you enable the OpenTSDB option "tsd.core.auto_create_metrics"?
    It is false by default, and without it OpenTSDB won't auto-create
    metrics for you. See
    http://opentsdb.net/docs/build/html/user_guide/configuration.html
  • If you run Prometheus with -logtostderr, do you see any relevant log
    output? If there is an error sending samples to TSDB, it should be logged
    (glog.Warningf("error sending %d samples to TSDB: %s", len(s), err)).
  • Prometheus itself also exports metrics about sending to OpenTSDB. On
    /metrics of your Prometheus server, you should find the counter metrics
    "prometheus_remote_storage_sent_errors_total" and
    "prometheus_remote_storage_sent_samples_total". What do these say?

Cheers,
Julius


Contributor

sammcj commented Feb 11, 2015

I cannot +1 this enough

Contributor

mwitkow commented Mar 5, 2015

Is InfluxDB on the cards in any way? :)

Member

beorn7 commented Mar 5, 2015

Radio Yerevan: "In principle yes." (Please forgive that Eastern European digression... ;)

Contributor

mwitkow commented Mar 5, 2015

:D That was slightly before my time ;)

Member

juliusv commented Mar 5, 2015

See also: https://twitter.com/juliusvolz/status/569509228462931968

We're just waiting for InfluxDB 0.9.0, which has a new data model which
should be more compatible with Prometheus's.


pires commented Apr 7, 2015

We're just waiting for InfluxDB 0.9.0, which has a new data model which
should be more compatible with Prometheus's.

Can I say awesome more than once? Awesome!

Member

fabxc commented Apr 7, 2015

Unfortunately, @juliusv ran some tests with 0.9, and InfluxDB consumed 14x more storage than Prometheus.

Previously the overhead was 11x, but Prometheus has reduced its storage size significantly since then - so in reality InfluxDB has apparently improved in that regard.
Nonetheless, InfluxDB has not yet turned out to be the eventual answer for long-term storage.

Member

beorn7 commented Apr 7, 2015

At least experimental write support is in master as of today, so anybody can play with InfluxDB receiving Prometheus metrics. Quite possibly somebody will find the reason for the blow-up in storage space, and everything will be unicorns and rainbows in the end...

pires commented Apr 7, 2015

@beorn7 that's great. TBH I'm not concerned about disk space; it's the cheapest resource on the cloud, after all. Not to mention, I'm expecting to hold data with a very small TTL, i.e. a few weeks.

Member

beorn7 commented Apr 7, 2015

@pires In that case, why not just run two identically configured Prometheis with a reasonably large disk?
A few weeks or months is usually fine as retention time for Prometheus. (Default is 15d for a reason... :) The only problem is that if your disk breaks, your data is gone, but for that, you have the other server.

Member

fabxc commented Apr 7, 2015

@pires do you have a particular reason to hold the data in another database for that time? "A few weeks" does not seem to require a long-term storage solution. Prometheus's default retention time is 15 days - increasing that to 30 or even 60 days should not be a problem.

pires commented Apr 7, 2015

@beorn7 @fabxc I am currently using a proprietary & very specific solution that writes monitoring metrics into InfluxDB. This can eventually be replaced with Prometheus.

The thing is, I have some tailored apps that read metrics from InfluxDB in order to reactively scale up/down; those would need to be rewritten to read from Prometheus instead. Also, I use continuous queries. Does Prometheus deliver such a feature?

Member

brian-brazil commented Apr 7, 2015

http://prometheus.io/docs/querying/rules/#recording-rules are the equivalent to InfluxDB's continuous queries.
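For anyone mapping one onto the other: a recording rule continuously evaluates a PromQL expression and stores the result as a new time series, much as a continuous query materializes a query result. A minimal, hypothetical rule in the 1.x rule-file syntax (the metric and label names here are invented for illustration):

```
# Precompute the per-job 5-minute request rate and store it as a new series.
job:http_requests:rate5m = sum(rate(http_requests_total[5m])) by (job)
```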

dever860 commented Jul 1, 2015

+1

drawks commented Jul 31, 2015

👍

fabxc removed this from the Small Scale Mission Critical Monitoring Use Cases milestone Sep 21, 2015

blysik commented Oct 8, 2015

How does remote storage as currently implemented interact with PromDash or grafana?

I have a use case where I want to run Prometheus in a 'heroku-like' environment, where the instances could conceivably go away at any time.

Then I would configure a remote, traditional influxdb cluster to store data in.

Could this configuration function normally?

Contributor

matthiasr commented Oct 9, 2015

This depends on your definition of "normally", but mostly, no.

Remote storage as it is is write-only; from Prometheus you would only get what it has locally.

To get at older data, you need to query OpenTSDB or InfluxDB directly, using their own interfaces and query languages. With PromDash you're out of luck in that regard; AFAIK Grafana knows all of them.

You could build your dashboards fully based on querying them and leave Prometheus to be a collection and rule evaluation engine, but you would miss out on its query language for ad hoc drilldowns over extended time spans.

Contributor

matthiasr commented Oct 9, 2015

Also note that both InfluxDB and OpenTSDB support are somewhat experimental, under-exercised on our side, and in flux.

Contributor

mattkanwisher commented Oct 21, 2015

We're kicking around the idea of a flat file exporter, so we can start storing long-term data now, and then once the bulk import issue (#535) is done we can use that to import it. Would you guys be open to a PR around this?

Member

juliusv commented Oct 21, 2015

For #535 take a look at my way outdated branch import-api, where I once added an import API as a proof-of-concept: https://github.com/prometheus/prometheus/commits/import-api. It's from March, so it doesn't apply to master anymore, but it just shows that in principle adding such an API using the existing transfer formats would be trivial. We just need to agree that we want this (it's a contentious issue, /cc @brian-brazil) and whether it should use the same sample transfer format as we use for scraping. The issue with this transfer format is that it's optimized for the many-series-one-sample (scrape) case, while with batch imports you often care more about importing all samples of a series at once, without having to repeat the metric name and labels for each sample (massive overhead). But maybe we don't care about efficiency in the (rare?) bulk import case, so the existing format could be fine.

For the remote storage part, there was this discussion
https://groups.google.com/forum/#!searchin/prometheus-developers/json/prometheus-developers/QsjXwQDLHxI/Cw0YWmevAgAJ about decoupling the remote storage in some generic way, but some details haven't been resolved yet. The basic idea was that Prometheus could send all samples in some well-defined format (JSON, protobuf, or whatever) to a user-specified endpoint which could then do anything it wants with it (write it to a file, send it to another system, etc.).

So it might be ok to add a flat file exporter as a remote storage backend directly to Prometheus, or resolve that discussion above and use said well-defined transfer format and an external daemon.

Member

brian-brazil commented Oct 21, 2015

I think for flat file we'd be talking the external daemon, as it's not something we can ever read back from.

Contributor

mattkanwisher commented Oct 26, 2015

So the more I think about it, it would be nice to have this /import-api (a raw data API), so we can have backup nodes mirroring the data from the primary Prometheus. Would there be appetite for a PR for this, and for the corresponding piece inside of Prometheus to import the data? So you can essentially have read slaves?

Member

brian-brazil commented Oct 26, 2015

For that use case we generally recommend running multiple identical Prometheus servers. Remote storage is about long term data, not redundancy or scaling.

Contributor

mattkanwisher commented Oct 26, 2015

I think running multiple scrapers is not a good solution because the data won't match, and there is also no way to backfill data. So we have an issue where I need to spin up some redundant nodes and now they are missing a month of data. If you had an API to raw-import the data, you could at least catch them up. The same interface could also be used for backups.

Member

brian-brazil commented Oct 26, 2015

So we have issue where I need to spin up some redundant nodes and now they are missing a month of data. If you have an api to raw import the data you could at least catch them up. Also the same interface could be used for backups

This is the use case for remote storage: you pull the older data from remote storage rather than depending on Prometheus being stateful. Similarly, in such a setup there's no need for backups, as Prometheus doesn't have any notable state.

Member

juliusv commented Mar 10, 2017

@iksaif ok yeah, so you're not talking about the retention time, but the earliest sample timestamp in Prometheus's current database. We don't track that currently, but that would generally make more sense, yeah.

Member

brian-brazil commented Mar 10, 2017

Question is whether we'll want the per-query / rule LTS determination in v1. It could influence the design a bit, so I'll think about it at least.

It's not v1, but it's something we should have before we start getting into production usage as it's critical for reliability.

Member

juliusv commented Mar 10, 2017

@brian-brazil For sure!

Member

brian-brazil commented Mar 10, 2017

The other thing to consider is having the API allow for multiple vector selectors (each with their own timestamps due to offset), so that LTSes can optimise and do better throttling/abuse handling.

Member

juliusv commented Mar 10, 2017

@brian-brazil Oh yeah, I have multiple vector selector sets, but great point about different offsets!

pilhuhn commented Mar 10, 2017

A simple static duration is not sufficient, as the remote storage may not be caught up that far yet or Prometheus may have retention going further back. I think this is something we'll have to figure out

I don't think Prometheus having retention going further back is really an issue here, as long as the remote can (already) provide the data. Worst case is with downsampling that you lose granularity.

Member

juliusv commented Mar 10, 2017

@pilhuhn I meant it the other way around: if you have a Prometheus retention of 15d and you query only data older than 15d from the remote storage, it doesn't necessarily mean that Prometheus will already have all data younger than 15d (due to storage wipe or whatever).

Well, for a first iteration we're just going to query all time ranges from everywhere.

juliusv referenced this issue Mar 15, 2017: Remote Read #2499 (merged)

Member

juliusv commented Mar 15, 2017

There's a WIP PR for the remote read integration here for anyone who would like to take a look early: #2499

tjboring commented Apr 15, 2017

I'm trying to use the remote_storage_adapter to send metrics from prometheus to opentsdb. But I'm getting these errors in the logs:

WARN[0065] cannot send value NaN to OpenTSDB, skipping sample &model.Sample{Metric:model.Metric{"instance":"localhost:9090", "job":"prometheus", "monitor":"codelab-monitor", "location":"archived", "quantile":"0.5", "__name__":"prometheus_local_storage_maintain_series_duration_seconds"}, Value:NaN, Timestamp:1492267735191}  source=client.go:78
WARN[0065] Error sending samples to remote storage       err=invalid character 'p' after top-level value num_samples=100 source=main.go:281 storage=opentsdb

I've also tried using influxdb instead of opentsdb, with similar results:

EBU[0001] cannot send value NaN to InfluxDB, skipping sample &model.Sample{Metric:model.Metric{"job":"prometheus", "instance":"localhost:9090", "scrape_job":"ns1-web-pinger", "quantile":"0.99", "__name__":"prometheus_target_sync_length_seconds", "monitor":"codelab-monitor"}, Value:NaN, Timestamp:1492268550191}  source=client.go:76

Here's how I'm starting the remote_storage_adapter:

# this is just for influxdb, i make the appropriate changes if trying to use opentsdb
./remote_storage_adapter -influxdb-url=http://138.197.107.211:8086 -influxdb.database=prometheus -influxdb.retention-policy=autogen -log.level debug

Here's the Prometheus config:

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

remote_write:
  url: "http://localhost:9201/write"

Is there something I'm misunderstanding about how to configure the remote_storage_adapter?

Member

juliusv commented Apr 15, 2017

@tjboring Neither OpenTSDB nor InfluxDB support float64 NaN (not a number) values, so these samples are skipped when sending samples to them. We have mentioned this problem to InfluxDB, and if we're lucky, they will support NaN values sometime in the future, or maybe we can find another workaround.

OpenTSDB issue: OpenTSDB/opentsdb#183
InfluxDB issue: influxdata/influxdb#4089

I am not sure where the invalid character 'p' after top-level value error comes from though.

@tjboring commented Apr 15, 2017

@juliusv Thanks for the pointers to the opentsdb/influxdb issues. I was just seeing the error messages on the console and thought nothing was being written, not realizing those are just samples that are being skipped. I've since confirmed that samples are indeed making it to the remote storage db. :)

@mattbostock (Contributor) commented Apr 17, 2017

Now that remote read and write APIs are in place (albeit experimental), should this issue be closed in favour of raising more specific issues as they arise?

https://prometheus.io/docs/operating/configuration/#<remote_write>
https://prometheus.io/docs/operating/configuration/#<remote_read>
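For reference, wiring Prometheus up to a local adapter via both APIs looks roughly like this in the 1.6-era configuration. The `:9201` port matches the `remote_storage_adapter` invocation shown earlier in this thread; the `/read` path is an assumption about the adapter's read endpoint, so check your adapter's documentation:

```yaml
# Experimental remote read/write configuration (Prometheus 1.x syntax).
remote_write:
  url: "http://localhost:9201/write"

remote_read:
  url: "http://localhost:9201/read"
```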

@prasenforu commented Apr 21, 2017

Has anybody tried this with a container? If so, please paste your Dockerfile.

I am not able to find the "remote_storage_adapter" executable file in the "prom/prometheus" Docker image, version 1.6:

/prometheus # find / -name remote_storage_adapter
/prometheus #

@sorrowless commented Apr 21, 2017

@prasenforu I have built a docker image with remote_storage_adapter from current master code: gra2f/remote_storage_adapter, feel free to use it.

@juliusv I have a problem similar to @tjboring's:

time="2017-04-21T17:45:00Z" level=warning msg="cannot send value NaN to Graphite, skipping sample &model.Sample{Metric:model.Metric{"__name__":"prometheus_target_sync_length_seconds", "monitor":"codelab-monitor", "job":"prometheus", "instance":"localhost:9090", "scrape_job":"prometheus", "quantile":"0.9"}, Value:NaN, Timestamp:1492796695772}" source="client.go:90"

but I am using Graphite. Is that expected?

@tjboring commented Apr 21, 2017

@sorrowless

Do you see other metrics in Graphite that you know came from Prometheus?

In my case I verified this by connecting to the Influxdb server I was using, and running a query. It gave me back metrics, which confirmed that Prometheus was indeed writing metrics; it's just that some were being skipped, per the log message.

@sorrowless commented Apr 21, 2017

@tjboring yes, I can see some of the metrics in Graphite, and what's stranger is that I cannot understand why some are there and some are not. For example, per-CPU sy and us are stored in Graphite, but load average is not.

@prasenforu commented Apr 22, 2017

@sorrowless

I am not able to find the image; can you please share the URL?

Thanks in advance.

@sorrowless commented Apr 22, 2017

@prasenforu just run
$ docker pull gra2f/remote_storage_adapter
in your command line, that's all you need

@prasenforu commented Apr 22, 2017

@sorrowless

Thanks.

@juliusv (Member) commented Apr 24, 2017

@mattbostock As you suggested, I'm closing this issue. We should open more specific remote-storage related issues in the future.

Further usage questions are best asked on our mailing lists or IRC (https://prometheus.io/community/).

juliusv closed this Apr 24, 2017

@prasenforu commented Apr 27, 2017

@sorrowless

I was looking at the image and saw that the remote_storage_adapter file is in /usr/bin, but the rest of the Prometheus files and volumes are not there:

~ # find / -name remote_storage_adapter
/usr/bin/remote_storage_adapter
~ # find / -name prometheus.yml
~ # find / -name prometheus

Anyway, can you please send me the Dockerfile for "gra2f/remote_storage_adapter"?

@sorrowless commented Apr 30, 2017

@prasenforu
you do not need the main prometheus executable to use the remote storage adapter; use the prom/prometheus image for that.
As for the Dockerfile, all it does is copy the prebuilt remote_storage_adapter binary into an image and run it, that's all.
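Such a Dockerfile can be only a few lines. A sketch, assuming the binary was prebuilt as a static Linux executable (e.g. with `CGO_ENABLED=0 GOOS=linux go build`); the base image and paths here are illustrative, not the actual contents of the gra2f image:

```dockerfile
# Hypothetical minimal image for a prebuilt remote_storage_adapter binary.
FROM alpine:3.6
COPY remote_storage_adapter /usr/bin/remote_storage_adapter
ENTRYPOINT ["/usr/bin/remote_storage_adapter"]
```

Flags such as `-influxdb-url` can then be passed as arguments to `docker run`, since they are appended to the ENTRYPOINT.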

@gdmelloatpoints commented Aug 16, 2017

If anyone wants to test it out (like I need to), I wrote a small docker-compose based setup to get this up and running locally - https://github.com/gdmello/prometheus-remote-storage.

simonpasquier pushed a commit to simonpasquier/prometheus that referenced this issue Oct 12, 2017

Merge pull request prometheus#10 from brian-brazil/absent
Make documentation for absent() not, uhm, absent

cofyc added a commit to cofyc/prometheus that referenced this issue Jun 5, 2018

Merge pull request prometheus#10 from cofyc/revert_shared_informers
Revert "Share kubernetes informers in kubernetes discovery to improve performance."
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment