Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query frontend: dynamic vertical shards number #5618

Open
yeya24 opened this issue Aug 20, 2022 · 10 comments
Open

Query frontend: dynamic vertical shards number #5618

yeya24 opened this issue Aug 20, 2022 · 10 comments
Labels

Comments

@yeya24
Copy link
Contributor

yeya24 commented Aug 20, 2022

Is your proposal related to a problem?

The idea was originally mentioned in #5342 (comment).

Right now the number of vertical query shards is a static value configured via the query frontend flag. However, different metrics may have different cardinalities, thus causing the static vertical shards value cannot fit well for all cases. Some queries are sufficient for 1 shard while some queries might need more shards.

Describe the solution you'd like

  1. Make the current vertical query shards value a fallback value in the query frontend.
  2. When decoding query requests at the query frontend, allow users to use the parameter vertical_shards or shards to specify the number of shards to use. If this value is specified, use it instead. The param is only valid in query frontend but not valid in querier (since only query frontend needs to set the shards).
  3. Add another flag for example query-frontend.max-vertical-shards to limit the max number of shards we allow.

Additional context

Ideally more dynamic sharding can be enabled by estimating the cardinality of the given queries and changing the shards.

One caveat:

  1. If we are going to support compaction time vertical block sharding, it might be hard to specify the shards because blocks are already sharded.
@GiedriusS
Copy link
Member

GiedriusS commented Aug 21, 2022

Mhm, how would the user know how many shards they need? In Grafana, for example, this would mean many data sources with different parameters. It would be pretty bad UX IMO.

I was thinking about this: how about making a cached call to /api/v1/series to see what series are available beforehand and then calculating the optimum number of shards according to the aggregating labels? So, the matchers would be the same as in query plus the aggregating labels should be not empty.

The downside of such solution is that it means an extra call and that this call could return a lot of info but perhaps we could limit the results to some number of series?

Just my 2 cents. But I'd rather avoid having many data-sources 🤔

Finally, we could still add the ability to override the number of shards but I don't think it should be the "final" solution.

@hanjm
Copy link
Member

hanjm commented Aug 21, 2022

typically it could be calculated by cpu cores and querier instance count?

@yeya24
Copy link
Contributor Author

yeya24 commented Aug 21, 2022

Mhm, how would the user know how many shards they need? In Grafana, for example, this would mean many data sources with different parameters. It would be pretty bad UX IMO.

I was thinking about this: how about making a cached call to /api/v1/series to see what series are available beforehand and then calculating the optimum number of shards according to the aggregating labels? So, the matchers would be the same as in query plus the aggregating labels should be not empty.

The downside of such solution is that it means an extra call and that this call could return a lot of info but perhaps we could limit the results to some number of series?

Just my 2 cents. But I'd rather avoid having many data-sources 🤔

Finally, we could still add the ability to override the number of shards but I don't think it should be the "final" solution.

I agree with you. Ideally the shard number needs to be calculated automatically by a component in query frontend and it knows the cardinality of metrics.

Right now, I am just thinking about exposing this and users can configure the shards from API call. They can choose num shards based on experiences first. Another use case is to disable sharding for some queries.

For Grafana, I would say it is a Grafana UX issue. Panels don't support custom Params overwriting.

@fpetkovski
Copy link
Contributor

I believe rulers also don't support custom URL parameters right now, which is something we can add to control shards per rule / ruler.

@yeya24
Copy link
Contributor Author

yeya24 commented Aug 22, 2022

We may need to find a better solution as adding num shards configuration to rule file is a breaking change. If our final goal is to determine the shards automatically then this field needs to be deprecated anyway.

@hanjm
Copy link
Member

hanjm commented Aug 29, 2022

For Grafana, I would say it is a Grafana UX issue. Panels don't support custom Params overwriting.

Could we use a query comment to specific the shard num?
like

count(up) # shard_num=3

The ideal comes from mysql sharding use sql comment.
https://www.percona.com/blog/2016/08/30/mysql-sharding-with-proxysql/

@yeya24
Copy link
Contributor Author

yeya24 commented Aug 29, 2022

Does promql support comment? And this would require the engine be shards aware as well.

@hanjm
Copy link
Member

hanjm commented Aug 29, 2022

@yeya24 promql support comment.

@stale
Copy link

stale bot commented Nov 13, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Nov 13, 2022
@yeya24
Copy link
Contributor Author

yeya24 commented Nov 13, 2022

Still valid

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants