
Support remote_read shard remote storage #4062

Closed
Wing924 opened this Issue Apr 8, 2018 · 9 comments

Wing924 commented Apr 8, 2018

We need to horizontally partition the storage backend in case the metrics volume grows too large for a single storage backend to handle. Currently, we can achieve this with the remote read feature by setting multiple remote read endpoints. But the remote read function sends queries to all backends, even when we know in advance that a backend has no matching metrics.
I suggest adding a shard option for better performance.

For example, say we have two backends: one stores production metrics, the other stores development metrics.

remote_read:
# only contains metrics with {env="prod"}
- url: http://prod-backend/api/v1/read
  shards:
    env: 
    - prod
# only contains metrics with {env="stg"} or {env="dev"}
- url: http://devel-backend/api/v1/read
  shards:
    env:
    - stg
    - dev 
  • If a query has {env="prod"}, we can skip devel-backend.
  • If a query has {env!="prod"}, we can skip prod-backend.
  • If a query has {env=~"stg|dev"}, we can skip prod-backend.
  • If a query has {env=~".*notexist.*"}, we can skip both.
  • If a query doesn't have an env label, we query both backends, as in the current implementation.
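
To make the proposed skip rule concrete, here is a minimal sketch in Go. It is only an illustration of the idea, not part of Prometheus; the function and variable names are made up. A backend can be skipped when none of its declared shard values satisfies the matcher.

package main

import (
	"fmt"
	"regexp"
)

// canSkip reports whether a backend can be skipped for a single matcher
// on the shard label: if no declared shard value satisfies the matcher,
// no stored series can match, so the backend need not be queried.
func canSkip(shardValues []string, matches func(string) bool) bool {
	for _, v := range shardValues {
		if matches(v) {
			return false // some stored series could still match
		}
	}
	return true
}

func main() {
	prod := []string{"prod"}        // shard values of prod-backend
	devel := []string{"stg", "dev"} // shard values of devel-backend

	// {env=~"stg|dev"}: regex matchers are fully anchored, as in PromQL.
	re := regexp.MustCompile(`^(?:stg|dev)$`)
	fmt.Println(canSkip(prod, re.MatchString))  // true: skip prod-backend
	fmt.Println(canSkip(devel, re.MatchString)) // false: query devel-backend

	// {env="prod"}: a plain equality matcher.
	eq := func(v string) bool { return v == "prod" }
	fmt.Println(canSkip(devel, eq)) // true: skip devel-backend
}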
gouthamve (Member) commented Apr 9, 2018

You can already do this using write_relabel_configs. I see this as a usage question rather than a bug.
It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.
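
For reference, here is a sketch of the write-side approach being pointed to, reusing the hypothetical backend hosts from the issue; the /api/v1/write paths are assumptions.

remote_write:
  # Only series with env="prod" are written to this backend.
  - url: http://prod-backend/api/v1/write
    write_relabel_configs:
      - source_labels: [env]
        regex: prod
        action: keep
  # Only series with env="stg" or env="dev" are written here.
  - url: http://devel-backend/api/v1/write
    write_relabel_configs:
      - source_labels: [env]
        regex: stg|dev
        action: keep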

Please feel free to re-open if you feel there is a feature missing or a bug here.

gouthamve closed this Apr 9, 2018

Wing924 (Author) commented Apr 9, 2018

@gouthamve
I'm sorry, the title is misleading.
I know we can use write_relabel_configs with remote_write to shard metrics.
My issue is about remote_read, not remote_write.
We need to combine all shards via remote_read to provide a single interface for Grafana etc.

This is not a question, it's a feature request.

Wing924 changed the title from "Support shard for remote storage" to "Support remote_read shard remote storage" Apr 9, 2018

gouthamve (Member) commented Apr 9, 2018

Sorry @Wing924 for closing this, you are right.

This has been discussed before and I remember @grobie looking into something similar.

gouthamve reopened this Apr 9, 2018

brian-brazil (Member) commented Apr 9, 2018

You're thinking of limiting by label, which is already implemented, though it is not useful for this use case as it's not intended for use with true remote storages.

You can have multiple remote read endpoints, and if the features offered by Prometheus are not sufficient, you can always implement a remote reader that goes in front of the rest with the logic you require.

What you're talking about is vertical sharding, not horizontal. For remote storage you want horizontal, which is not a concern of Prometheus.
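
As an illustration of the "remote reader that goes in front" suggestion, here is a minimal sketch of such a proxy in Go. It is not an existing tool: it reuses the hypothetical backend URLs from the issue, handles only a plain {env="..."} equality matcher, and forwards to a single backend without merging responses.

// Hypothetical routing proxy for Prometheus remote read.
// Remote read requests are snappy-compressed protobuf over HTTP POST.
package main

import (
	"bytes"
	"io"
	"io/ioutil"
	"log"
	"net/http"

	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
	"github.com/prometheus/prometheus/prompb"
)

const (
	prodBackend  = "http://prod-backend/api/v1/read"
	develBackend = "http://devel-backend/api/v1/read"
)

// pickBackend routes on a simple {env="..."} equality matcher; anything
// else falls back to the prod backend here. A real implementation would
// also handle regex and negative matchers and fan out to several backends.
func pickBackend(req *prompb.ReadRequest) string {
	for _, q := range req.Queries {
		for _, m := range q.Matchers {
			if m.Name == "env" && m.Type == prompb.LabelMatcher_EQ {
				if m.Value == "prod" {
					return prodBackend
				}
				return develBackend
			}
		}
	}
	return prodBackend
}

func handleRead(w http.ResponseWriter, r *http.Request) {
	compressed, err := ioutil.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	raw, err := snappy.Decode(nil, compressed)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	var req prompb.ReadRequest
	if err := proto.Unmarshal(raw, &req); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Relay the untouched request body to the chosen backend and stream
	// its (already snappy-compressed) response back to the caller.
	resp, err := http.Post(pickBackend(&req), "application/x-protobuf", bytes.NewReader(compressed))
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	w.Header().Set("Content-Type", "application/x-protobuf")
	w.Header().Set("Content-Encoding", "snappy")
	io.Copy(w, resp.Body)
}

func main() {
	http.HandleFunc("/api/v1/read", handleRead)
	log.Fatal(http.ListenAndServe(":9201", nil))
}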

Wing924 (Author) commented Apr 9, 2018

@brian-brazil
Sorry for not noticing this docs update. This is what I'm looking for.

# An optional list of equality matchers which have to be
# present in a selector to query the remote read endpoint.
required_matchers:
  [ <labelname>: <labelvalue> ... ]
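
For illustration, here is that option embedded in a remote_read block, reusing the hypothetical backends from the issue. Note that required_matchers accepts a single equality value per label, which matters for the stg|dev shard.

remote_read:
  - url: http://prod-backend/api/v1/read
    required_matchers:
      env: prod
  # Only a single equality value per label can be required, so the
  # stg|dev shard of devel-backend cannot be fully expressed this way.
  - url: http://devel-backend/api/v1/read
    required_matchers:
      env: stg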

brian-brazil (Member) commented Apr 9, 2018

That won't do what you want. This is something you need to implement inside your remote read endpoint.

Wing924 (Author) commented Apr 9, 2018

@brian-brazil

You're thinking of limiting by label, which is already implemented, though it is not useful for this use case as it's not intended for use with true remote storages.

I think you're talking about required_matchers, aren't you?

This is something you need to implement inside your remote read endpoint.

My remote storages are other Prometheus servers...

brian-brazil (Member) commented Apr 9, 2018

My remote storages are other Prometheus servers...

That is not a recommended way to use remote read. Like federation, remote read is not a way to have all your data accessible in one Prometheus.

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019
