Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve entry deduplication. #2302

Merged
merged 3 commits into from
Jul 8, 2020
Merged

Conversation

cyriltovena
Copy link
Contributor

This PR removes mostcommon and sort insert function in the heap iterator. I discovered while working on #2293 that those are actually not helping since we're deduping those lines anyways. There were no tests checking if deduping was correctly working so I did added those.

Bonus point this means deduping will run faster and the code is less complex. The only side effect is that the order of entries that are at the same timestamp, before the most common entry would appear first, now we keep the same order as we stored them, which I think is better.

I also change the label ordering, I think whether we are forward or backward we should keep the same alphabetical labels ordering not sure why direction was altering this before.

Signed-off-by: Cyril Tovena cyril.tovena@gmail.com

This PR removes mostcommon and sort insert function in the heap iterator. I discovered while working on grafana#2293 that those are actually not helping since we're deduping those lines anyways. There were no tests checking if deduping was correctly working  so I did added those.

Bonus point this means deduping will run faster and the code is less complex. The only side effect is that the order of entries that are at the same timestamp, before the most common entry would appear first, now we keep the same order as we stored them, which I think is better.

I also change the label ordering, I think whether we are forward or backward we should keep the same aphabetical labels ordering not sure why direction was altering this before.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit about a test, then LGTM.

NewStreamIterator(foo),
}, logproto.BACKWARD)
// first reverse streams, they should already be correctly ordered for the heap iterator to work.
for i, j := 0, len(foo.Entries)-1; i < j; i, j = i+1, j-1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually mutating the internal entries used by the stream iter as they reference the same slice. I think you'll want a function which creates the same []stream to guarantee we don't mutate the same underlying data. Basically the testware and the HeapIter should use identical but separate copies of the same data.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@codecov-commenter
Copy link

Codecov Report

Merging #2302 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2302      +/-   ##
==========================================
+ Coverage   61.03%   61.04%   +0.01%     
==========================================
  Files         158      158              
  Lines       12778    12751      -27     
==========================================
- Hits         7799     7784      -15     
+ Misses       4394     4379      -15     
- Partials      585      588       +3     
Impacted Files Coverage Δ
pkg/iter/iterator.go 66.85% <100.00%> (-1.31%) ⬇️
pkg/promtail/targets/tailer.go 73.86% <0.00%> (-2.28%) ⬇️
pkg/promtail/targets/filetarget.go 68.67% <0.00%> (-1.81%) ⬇️
pkg/logql/evaluator.go 92.13% <0.00%> (-0.41%) ⬇️
pkg/promtail/positions/positions.go 60.71% <0.00%> (+13.39%) ⬆️

@owen-d owen-d merged commit cd74043 into grafana:master Jul 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants