
remote_read - read_recent does not work as expected #3599

Closed
AndreaGiardini opened this Issue Dec 19, 2017 · 16 comments

@AndreaGiardini (Contributor) commented Dec 19, 2017

What did you do?

I tried to configure Prometheus with a remote_read endpoint.

What did you expect to see?

I expected Prometheus to query the remote endpoint only when the requested metrics are not available in local storage.

What did you see instead? Under which circumstances?

Prometheus always queries the remote endpoint, regardless of how it is configured.

Environment

  • System information:

    k8s 1.7.8

  • Prometheus version:

    both v2 and master

  • Prometheus configuration file:

    global:
      evaluation_interval: 1m
      scrape_interval: 1m
    remote_write:
      - url: "http://influxdb:8086/api/v1/prom/write?db=test"
        remote_timeout: 30s
        queue_config:
          capacity: 100000
          max_shards: 1000
          max_samples_per_send: 100
          batch_send_deadline: 5s
          max_retries: 10
          min_backoff: 30ms
          max_backoff: 100ms
    remote_read:
      - url: "http://influxdb:8086/api/v1/prom/read?db=test"
        remote_timeout: 30s
        read_recent: false

I believe the default for read_recent should be set to false here: https://github.com/prometheus/prometheus/blob/master/config/config.go#L200

From the InfluxDB logs I can see that, no matter what, Prometheus always queries the remote endpoint. Help is appreciated.
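
For context, this is the behavior I expect from read_recent, written out as a minimal Go sketch (my own simplification, not the actual Prometheus source; shouldQueryRemote is a made-up helper):

    // shouldQueryRemote (hypothetical) decides whether a remote read is needed,
    // given the earliest timestamp available locally. Timestamps are in
    // milliseconds since the epoch.
    func shouldQueryRemote(readRecent bool, localStartTime, queryMinTime int64) bool {
        if readRecent {
            // read_recent: true always consults the remote endpoint.
            return true
        }
        // read_recent: false should only hit the remote endpoint when the query
        // reaches back before the data held in local storage.
        return queryMinTime < localStartTime
    }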

@grobie (Member) commented Dec 19, 2017

I looked into this and found that the tsdb.StartTime() function causes this behavior.

  1. Before the first persisted blocks exist, the function always returns the current timestamp. As this is evaluated on the server after the query is created, it will always be equal to or newer than the query timestamp.

    // StartTime implements the Storage interface.
    func (a adapter) StartTime() (int64, error) {
        var startTime int64

        if len(a.db.Blocks()) > 0 {
            startTime = a.db.Blocks()[0].Meta().MinTime
        } else {
            startTime = int64(time.Now().Unix() * 1000)
        }

        // Add a safety margin as it may take a few minutes for everything to spin up.
        return startTime + a.startTimeMargin, nil
    }

  2. The startTimeMargin is added on top of that timestamp, moving the start time 4 hours into the future before the first blocks are persisted.

    startTimeMargin := int64(2 * time.Duration(cfg.tsdb.MinBlockDuration).Seconds() * 1000)
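
With the default min-block-duration of 2h, that margin works out to 4 hours (a quick arithmetic check assuming the default value; fmt and time are the only imports needed):

    minBlockDuration := 2 * time.Hour // default storage.tsdb.min-block-duration
    startTimeMargin := int64(2 * minBlockDuration.Seconds() * 1000)
    fmt.Println(startTimeMargin)                                   // 14400000 (ms)
    fmt.Println(time.Duration(startTimeMargin) * time.Millisecond) // 4h0m0s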

I have to admit I'm not sure I understand the reason for the current implementation.

  • Wouldn't the server start time be a better candidate than the current time until the first blocks have been persisted?
  • I don't know the purpose of the safety margin, but I assume it shouldn't be added to the current time, as it'll move the start time well into the future. Are the side effects of this margin known and expected?

@AndreaGiardini (Contributor, Author) commented Dec 21, 2017

I think I found the problem.

My Prometheus is started with the parameter storage.tsdb.retention=2h. I use such a low retention because I would like to keep as little as possible in local storage and rely on remote_read / remote_write for everything else.

Setting this value so low triggers a bug in the way the block durations are calculated. If I go to the /flags endpoint I can see storage.tsdb.max-block-duration = 12m and storage.tsdb.min-block-duration = 2h. As you can see, max-block-duration < min-block-duration.

Forcing the min/max block durations with the following options makes the bug disappear, and the reads are processed correctly by Prometheus:

storage.tsdb.retention=2h
storage.tsdb.min-block-duration=30m
storage.tsdb.max-block-duration=1h

The problem lies in the way Prometheus calculates the defaults for min/max-block-duration:

  • min-block-duration has a default of 2h
  • max-block-duration has a default of 10% of retention time

So, accordingly:

  • retention < 20h causes min-block-duration > max-block-duration (wrong)
  • retention > 20h causes min-block-duration < max-block-duration (correct)

IMHO this bug can be fixed by calculating min-block-duration as, let's say, 5% of the retention time? Maybe @fabxc has some recommendations?

Specifying min/max-block-duration without relying on the defaults fixes the problem.
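
To make the inversion concrete, here is a small Go sketch of the default calculation as I understand it (defaultBlockDurations is a made-up helper, not Prometheus source; the 2h and 10% figures are the defaults described above):

    package main

    import (
        "fmt"
        "time"
    )

    // defaultBlockDurations mimics the documented defaults: min-block-duration
    // is fixed at 2h, while max-block-duration defaults to 10% of the retention.
    func defaultBlockDurations(retention time.Duration) (min, max time.Duration) {
        return 2 * time.Hour, retention / 10
    }

    func main() {
        for _, retention := range []time.Duration{2 * time.Hour, 20 * time.Hour, 30 * time.Hour} {
            min, max := defaultBlockDurations(retention)
            fmt.Printf("retention=%v min=%v max=%v inverted=%v\n", retention, min, max, min > max)
        }
    }

For retention=2h this prints max=12m0s and inverted=true, matching the 12m I see on /flags.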

@AndreaGiardini (Contributor, Author) commented Dec 21, 2017

Actually... I just found out that whenever min-block-duration > max-block-duration, the storage layer silently falls back to min-block-duration = max-block-duration, as you can see in https://github.com/prometheus/prometheus/blob/master/storage/tsdb/tsdb.go#L129
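
In effect it behaves along these lines (a simplified sketch of the behavior as I read it, not the literal source):

    // Simplified sketch: when the configured max block duration is smaller
    // than the min block duration, the min value wins, so the storage
    // effectively runs with a single block range of min-block-duration.
    if maxBlockDuration < minBlockDuration {
        maxBlockDuration = minBlockDuration
    }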

@brian-brazil removed the kind/bug label Dec 21, 2017

@brian-brazil (Member) commented Dec 21, 2017

With those settings, we basically have to query the LTS (long-term storage) on every query, as there's insufficient local state.

Why do you wish to set the retention so low?

@AndreaGiardini (Contributor, Author) commented Dec 22, 2017

Why do you think that there is insufficient storage state?

I had the feeling that every query reaching less than 2h into the past would use the local storage (for example, memory usage over the past hour).

@iksaif (Contributor) commented Dec 26, 2017

cc: @Thib17, who worked on some of this code.

@brian-brazil (Member) commented Dec 26, 2017

> Why do you think that there is insufficient storage state?

Retention is only 2 hours; it takes more than that for read_recent to start to work. This is all working as expected.
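
To spell that out against the StartTime() code quoted earlier (a rough illustration with made-up values, not actual source):

    // Before any block is persisted, StartTime() reports now + startTimeMargin
    // (4h with the defaults), so local storage claims to start in the future
    // and every query overlaps the remote-read window.
    now := time.Now().Unix() * 1000            // ms since the epoch
    localStartTime := now + 4*60*60*1000       // StartTime() result with 4h margin
    queryMinTime := now - 60*60*1000           // e.g. memory usage over the past hour
    fmt.Println(queryMinTime < localStartTime) // true: the remote endpoint is queried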

@AndreaGiardini (Contributor, Author) commented Dec 31, 2017

> I had the feeling that every query reaching less than 2h into the past would use the local storage (for example, memory usage over the past hour).

Is this statement correct? That's the behavior I recall.

> Retention is only 2 hours; it takes more than that for read_recent to start to work

Is this documented somewhere? I can't find anything relevant.

@sathappan commented Feb 1, 2018

I'm facing the same issue as @AndreaGiardini, except in my case the retention is the default 15d, min-block-duration is 2h, and max-block-duration is 36h. All defaults. I have Prometheus 2.1.0 and use InfluxDB for the LTS. All read queries fire a query to the LTS regardless of read_recent. This made all my irate calls return no datapoints. Only rate queries worked.

@brian-brazil (Member) commented Feb 1, 2018

> This made all my irate calls return no datapoints.

That's not right. read_recent is an optimisation; it shouldn't have any semantic effect.

@sathappan commented Feb 1, 2018

When I changed the min-block-duration to 119m instead of the default 120m, all my irate queries started working correctly and the calls were not sent to the LTS. Can somebody explain this behavior? @brian-brazil @fabxc

@sathappan commented Feb 1, 2018

That is what I thought. But as I mentioned in my last comment, whenever I left the min block duration at 2h, my graphs clearly said "No datapoints". When I ran the query that Prometheus fired against InfluxDB, I was able to see the samples there. So maybe it's only the query processing that's ignoring the samples? This happened only for irate, though.

@brian-brazil (Member) commented Feb 1, 2018

That does not sound like the issue being reported here; can you file a new bug?

@sathappan commented Feb 1, 2018

@brian-brazil Okay, it started working now. Silly me! I missed giving some information earlier: after I integrated the LTS, I wiped my data directory. Once I did that, all my reads were being sent to the LTS. That's why, when I set lower min-block-duration values earlier, it worked by getting data from local storage, while for higher ones like 2h or more it was sending my queries to the LTS. I didn't know that read_recent was dependent on min-block-duration. I guess it makes sense to document this behavior.

@brian-brazil (Member) commented Feb 1, 2018

It's an implementation detail that I don't believe should be exposed to users as it'll lead to confusion (i.e. they'll start messing around with block durations).

In any case, this is all working as designed.

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited the conversation to collaborators Mar 22, 2019
