store-gateway: report touched postings & series instead of fetched #4671

Merged
dimitarvdimitrov merged 2 commits into main from dimitar/store-gateway-index-response-stats
Apr 6, 2023


dimitarvdimitrov
Contributor

What this PR does

In its response to the querier, the store-gateway includes statistics
about the index bytes involved in each request. The query-frontend later
combines these statistics and includes them in the query stats logs.

My understanding is that the purpose of these stats is to
help gauge the cost of a query. Currently, the store-gateway reports
its fetched bytes. I propose to report touched bytes instead.

Fetched index bytes are based on the number of bytes fetched from
the bucket. This excludes bytes fetched from the cache and includes bytes overfetched
because of an incorrect size estimation or when joining adjacent ranges
from the index object.

Touched bytes are the sum of bytes that were directly necessary in
order to serve the request. This means that they exclude bytes that we
overfetched (e.g. when joining adjacent regions of
an object in the object store or when incorrectly estimating the size of
a series).

The number of touched bytes should generally not change between
executions of the same query. It gives a better picture of how expensive
the query was because it does not depend on cache hit rates or on the
adjacency of requested regions in the index.

Signed-off-by: Dimitar Dimitrov dimitar.dimitrov@grafana.com

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • [n/a] Tests updated
  • [n/a] Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Collaborator

@pracucci pracucci left a comment

LGTM

@dimitarvdimitrov dimitarvdimitrov force-pushed the dimitar/store-gateway-index-response-stats branch from 6c3601d to 30e6b68 Compare April 6, 2023 15:32
@dimitarvdimitrov dimitarvdimitrov merged commit 43b4c3d into main Apr 6, 2023
@dimitarvdimitrov dimitarvdimitrov deleted the dimitar/store-gateway-index-response-stats branch April 6, 2023 17:49