Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: fix missing logs with batched chunk iterator #1299

Merged
merged 5 commits into from
Nov 26, 2019

Conversation

putrasattvika
Copy link
Contributor

@putrasattvika putrasattvika commented Nov 21, 2019

What this PR does / why we need it:
This PR addresses bugs that cause missing logs issue described in #1187:

  • Removed __name__ label matcher for queries against the chunk storage. The __name__ chunk label is removed from all chunks that are passed into buildHeapIterator(). Overlapping chunks (according to batchSize) will go through that function multiple times, and they will be filtered out on the second call since they no longer have the chunk label.
  • Reversed inclusivity of backward query iterators from [from, through) to (from, through]. This fixes an issue where the last log entry of the first chunk of a batch is not retrieved.
  • Fixed from/through for batched iterator with more than two batches

Which issue(s) this PR fixes:
Fixes #1187

Checklist

  • Tests updated

@cyriltovena cyriltovena self-requested a review November 24, 2019 23:26
Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple small questions, but the inclusivity work looks great.

for name, tt := range tests {
tt := tt
t.Run(fmt.Sprintf("large-batchsize/%s", name), func(t *testing.T) {
it := newBatchChunkIterator(context.Background(), tt.chunks, 1000, newMatchers(tt.matchers), nil, newQuery("", tt.start, tt.end, tt.direction))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the benefit of the large batchsize tests? Additionally, would it be better to use len(tt.chunks) instead of 1000 here for extensibility to ensure the the batch is always large enough to hold all the chunks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I don't know whether these tests are needed or not. I added these tests for debugging purposes since the bug described in #1187 don't happen when the batchsize is big enough to hold all of the chunks. I can remove those if deemed unnecessary.

pkg/storage/iterator.go Outdated Show resolved Hide resolved
@owen-d owen-d self-requested a review November 26, 2019 17:11
@owen-d
Copy link
Member

owen-d commented Nov 26, 2019

Nice work @putrasattvika!

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive @putrasattvika.

Thank you for effort of diving into it.

@cyriltovena cyriltovena merged commit c89df1a into grafana:master Nov 26, 2019
@putrasattvika putrasattvika deleted the chunk-iterator-fix branch November 27, 2019 02:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Logs disappear after the chunks were removed from memory
3 participants