Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes a bug in the block cache code. #4302

Merged
merged 5 commits into from
Sep 10, 2021

Conversation

cyriltovena
Copy link
Contributor

Because chunks can overlap and that can causes multiple processing of the same chunk within batches, we decided to cache the
evaluation of it.

However there was a bug here causing incorrect query result, specially for metric queries. The bug would occur when we would only partially consume a block (overlapping block between multiple batches) and then try to replay it.
In which case it wouldn't actually replay it but keep consuming the underlaying iterator.

I guess I assumed that blocks would only be fully consumed, but that's not correct for metric queries (forward). I added new tests that were failing with the previous code to ensure we're good now.

This took us 2 days to figure out, thank @dannykopping for helping along.

Signed-off-by: Cyril Tovena cyril.tovena@gmail.com

Because chunks can overlap and that can causes multiple processing of the same chunk within batches, we decided to cache the
evaluation of it.

However there was a bug here causing incorrect query result, specially for metric queries. The bug would occur when we would only partially consume a block (overlaping block between multiple batches) and then try to replay it.
In which case it wouldn't actually replay it but keep consuming the underlaying iterator.

This took us 2 days to figure out, thank Danny for helping along.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Copy link
Contributor

@dannykopping dannykopping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
A few minor suggestions, none of them should block merging

if it.base == nil {
return false
}
ok := it.base.Next()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not be using the Base() function instead of these direct member accesses?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might give us some leverage later, since we have this layer of indirection.

return it.base
}

func (it *cachedIterator) consumeNext() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we might find a better name for this; it's not super clear how this is different from Next() at a casual glance.

Perhaps consumeBase/consumeWrapped?
I'd personally love it if we renamed base and Base() to wrapped and Wrapped(); might be more clear?

pkg/iter/cache.go Show resolved Hide resolved
pkg/iter/cache_test.go Outdated Show resolved Hide resolved
pkg/iter/cache_test.go Show resolved Hide resolved
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@cyriltovena cyriltovena merged commit e81ca28 into grafana:main Sep 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants