
Disable local storage #2729

Closed
h7kanna opened this Issue May 16, 2017 · 10 comments

h7kanna commented May 16, 2017

Provide an option to configure Prometheus in disk-less storage mode in 2.x series.

From the discussion
https://groups.google.com/forum/#!topic/prometheus-users/kJ1knCitxQs

I learnt this can be easily implemented in 1.x series which I am trying to do.
But the suggestion was to ask someone working on 2.x series to consider this as a feature request.

Thanks,
Harsha

fabxc (Member) commented May 16, 2017

This should be pretty straightforward, but it has very low priority for me right now. Happy to provide guidance and review if someone wants to implement that feature for 2.x.

I am curious about your use case though – am I correct in assuming that you still want to keep a certain amount of history in memory for querying, or is it just about proxying data via remote write?

The easy mode is to set a short retention and point the data dir to tmpfs. Truly disabling anything persistence-related will probably not give notable performance gains in 2.x.
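As a sketch, the "short retention plus tmpfs" setup could look like this (mount point, size, and retention value are illustrative; the flag names are from the Prometheus 2.x CLI):

```shell
# Back the data directory with tmpfs (size is illustrative).
mount -t tmpfs -o size=2g tmpfs /prometheus-data

# Run Prometheus 2.x with a short retention pointed at it.
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus-data \
  --storage.tsdb.retention=6h
```

When the host or container goes away, the tmpfs contents go with it, which gives the "disk-less" behavior without any Prometheus code changes.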

h7kanna (Author) commented May 16, 2017

Yes, the use case is keeping metrics in memory for querying, but completely ephemeral. Persistence to disk would be optional or happen only via remote write.

The reason is not so much performance as deploying Prometheus as a stateless container.

fabxc (Member) commented May 16, 2017

We basically do this all the time with Kubernetes. Prometheus writes its data to the file system as usual, but it is backed by tmpfs and if the container goes down, the data is simply dropped. Anything speaking against that solution?
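For reference, the Kubernetes pattern described here is typically an emptyDir volume with medium: Memory, which is tmpfs-backed (a minimal sketch; names, image tag, and retention are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: prometheus                      # illustrative name
spec:
  containers:
    - name: prometheus
      image: prom/prometheus:v2.2.1     # illustrative tag
      args:
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --storage.tsdb.retention=6h
      volumeMounts:
        - name: data
          mountPath: /prometheus
  volumes:
    - name: data
      emptyDir:
        medium: Memory                  # tmpfs-backed; data is dropped with the pod
```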

h7kanna (Author) commented May 16, 2017

Currently trying to deploy on an internal PaaS platform which only supports stateless containers.
We are not yet on Kubernetes (adoption is in the works, though). Will try to deploy as you suggested.
Thanks @fabxc

fabxc (Member) commented Jun 6, 2017

Closing here as this seems to be a viable solution.

fabxc closed this Jun 6, 2017

AndreaGiardini (Contributor) commented Dec 18, 2017

I would like to add my findings and doubts to this issue:

  • I have Prometheus with remote read/write (InfluxDB)
  • I set the retention period to 2 days and mounted the /prometheus folder as tmpfs, as suggested by @fabxc

My doubts are:

  • It looks like this technique does not provide any in-memory caching (every request still goes to the InfluxDB backend). Am I missing something? I would expect every query that includes data less than 2d old to be processed locally... or maybe the metrics are deleted from local storage as soon as they are shipped to InfluxDB?

  • Kubernetes does not allow you to set a size for the tmpfs volume. This means that whoever implements this solution needs to pay particular attention to the retention policy, since their pod might get OOM-killed.

sevagh commented May 16, 2018

It looks like this technique does not provide any in-memory caching (every request still goes to the InfluxDB backend). Am I missing something? I would expect every query that includes data less than 2d old to be processed locally... or maybe the metrics are deleted from local storage as soon as they are shipped to InfluxDB?

I think this is intended behavior. When doing some basic testing, I put a log statement in my remote storage adapter on the /read endpoint, and every query/graph was hitting remote storage. I believe the remote storage is always used when it's defined.

Relevant info from the docs:

Note that on the read path, Prometheus only fetches raw series data for a set of label selectors and time ranges from the remote end. All PromQL evaluation on the raw data still happens in Prometheus itself. This means that remote read queries have some scalability limit, since all necessary data needs to be loaded into the querying Prometheus server first and then processed there. However, supporting fully distributed evaluation of PromQL was deemed infeasible for the time being.

juliusv (Member) commented May 17, 2018

@AndreaGiardini

It looks like this technique does not provide any in-memory caching (every request still goes to the InfluxDB backend). Am I missing something? I would expect every query that includes data less than 2d old to be processed locally... or maybe the metrics are deleted from local storage as soon as they are shipped to InfluxDB?

For queries that don't touch data older than the start time of the local TSDB, remote reads should only occur when the read_recent: true flag is set in a remote_read config.
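For reference, the corresponding remote_read stanza would look something like this (the URL is illustrative; read_recent defaults to false, so queries fully covered by the local TSDB skip the remote end):

```yaml
remote_read:
  - url: http://influxdb.example.com:8086/api/v1/prom/read?db=prometheus  # illustrative
    read_recent: false  # default: ranges answerable by the local TSDB stay local
```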

Relevant code:

// PreferLocalStorageFilter returns a QueryableFunc which creates a NoopQuerier
// if requested timeframe can be answered completely by the local TSDB, and
// reduces maxt if the timeframe can be partially answered by TSDB.
func PreferLocalStorageFilter(next storage.Queryable, cb startTimeCallback) storage.Queryable {
	return storage.QueryableFunc(func(ctx context.Context, mint, maxt int64) (storage.Querier, error) {
		localStartTime, err := cb()
		if err != nil {
			return nil, err
		}
		cmaxt := maxt
		// Avoid queries whose timerange is later than the first timestamp in local DB.
		if mint > localStartTime {
			return storage.NoopQuerier(), nil
		}
		// Query only samples older than the first timestamp in local DB.
		if maxt > localStartTime {
			cmaxt = localStartTime
		}
		return next.Querier(ctx, mint, cmaxt)
	})
}
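In isolation, the clamping decision reads like this (a standalone sketch with a hypothetical helper name, not the actual Prometheus API):

```go
package main

import "fmt"

// clampRemoteRange mirrors the range-clamping decision in
// PreferLocalStorageFilter: it returns skip=true when the requested range
// starts after the local TSDB's first timestamp (so the remote read is a
// no-op), and otherwise clamps maxt so the remote read only covers samples
// older than the local data.
func clampRemoteRange(mint, maxt, localStartTime int64) (skip bool, cmaxt int64) {
	if mint > localStartTime {
		// Whole range is covered locally; remote read becomes a no-op.
		return true, maxt
	}
	if maxt > localStartTime {
		// Only fetch the part that predates local data.
		return false, localStartTime
	}
	return false, maxt
}

func main() {
	skip, _ := clampRemoteRange(200, 300, 100)
	fmt.Println(skip) // range starts after local data begins: no remote read

	skip, cmaxt := clampRemoteRange(50, 300, 100)
	fmt.Println(skip, cmaxt) // partial overlap: remote read clamped to local start
}
```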

juliusv (Member) commented May 17, 2018

(at least now, not sure if it was already true when your comment was written :))

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019
