loki: Improve Tailer loop #877

Merged: 1 commit, Aug 12, 2019

Conversation

pracucci
Contributor

@pracucci pracucci commented Aug 9, 2019

What this PR does / why we need it:

The current Tailer.loop() implementation is hard to follow and has several issues, which this PR tries to address:

  1. Fix: isBlocked() always returned false because t.blocked was never set to true. It has been replaced by isResponseChanBlocked().
  2. Fix: when a dropped tailResponse contained multiple streams/entries, droppedEntries was populated only with the current entry instead of all of them.
  3. Fix: with a stuck client and a high message rate, droppedEntries could grow indefinitely. A hard cap, maxDroppedEntriesPerTailResponse, has been introduced.
  4. Refactoring: I refactored Tailer.loop() to make the code easier to follow (and hopefully easier to maintain in the future).
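
The first and third fixes can be illustrated with a minimal Go sketch. This is not the code from the PR: the names isChanFull, appendCapped, droppedEntry and the cap value of 3 are assumptions made for illustration only. It shows one way to detect a full buffered channel without a separate flag, and one way to stop a dropped-entries buffer from growing past a fixed limit.

```go
package main

import "fmt"

// droppedEntry is a hypothetical stand-in for the tailer's dropped-entry type.
type droppedEntry struct {
	Timestamp int64
	Labels    string
}

// maxDroppedEntries plays the role of maxDroppedEntriesPerTailResponse.
// The value 3 is illustrative, not the value used in Loki.
const maxDroppedEntries = 3

// isChanFull reports whether a buffered channel currently has no free slot.
// len() and cap() are safe to call on a channel from any goroutine, but the
// result is only a snapshot and is meaningful only for buffered channels.
func isChanFull(ch chan string) bool {
	return len(ch) == cap(ch)
}

// appendCapped appends entries to dropped until limit is reached and
// silently skips the rest, so the buffer cannot grow without bound.
func appendCapped(dropped, entries []droppedEntry, limit int) []droppedEntry {
	for _, e := range entries {
		if len(dropped) >= limit {
			break
		}
		dropped = append(dropped, e)
	}
	return dropped
}

func main() {
	ch := make(chan string, 1)
	fmt.Println("full before send:", isChanFull(ch))
	ch <- "response"
	fmt.Println("full after send:", isChanFull(ch))

	batch := []droppedEntry{
		{Timestamp: 1, Labels: "a"},
		{Timestamp: 2, Labels: "a"},
		{Timestamp: 3, Labels: "b"},
		{Timestamp: 4, Labels: "b"},
	}
	dropped := appendCapped(nil, batch, maxDroppedEntries)
	fmt.Println("dropped entries kept:", len(dropped))
}
```

Deriving the blocked state from the channel itself avoids a flag that must be kept in sync by hand, which is the kind of bug described in the first fix.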

Notes to reviewers:

This is a draft PR because I would like to hear whether you think the proposed changes make sense. If so, I will work on adding tests (we have no tests on the tailer right now; it's time to show it some ❤️).

Checklist

  • Documentation added
  • Tests updated

@sandeepsukhani
Contributor

@pracucci
I have been thinking about refactoring this but didn't get a chance to do so.
Thanks for the PR, changes look really good so far!
I had added tests as well, but removed them to revisit later: most of the code is async, so the tests were flaky and sometimes failed to see the expected result.

@pracucci
Contributor Author

pracucci commented Aug 9, 2019

Thanks @sandlis for your quick feedback. I will work on testing the Tailer in my spare time over the weekend and submit the PR for review as soon as it's done. I will take care to make sure the tests are not flaky 🙏

@pracucci pracucci marked this pull request as ready for review August 11, 2019 12:49
@pracucci
Contributor Author

@sandlis I've added tests and resolved the TODOs I initially left in the code. The PR is now ready for a full review. Could you take a look and share your thoughts, please?

Contributor

@sandeepsukhani sandeepsukhani left a comment

Thanks for all the improvements you have done! Changes look really good.
Please let me know what you think about my comments, otherwise, we can go ahead and merge this.

@@ -14,12 +14,22 @@ import (

const (
// if we are not seeing any response from ingester, how long do we want to wait by going into sleep
nextEntryWait = time.Second / 2
tailerWaitEntryThrottle = time.Second / 2
Contributor

I think we should move this to pkg/querier/querier.go since it is used in that file.

Contributor Author

Definitely. I was in doubt as well. You confirmed my doubt.

select {
case t.responseChan <- tailResponse:
droppedEntries = make([]droppedEntry, 0)
Contributor

Let's check the size of droppedEntries here before re-initializing it?
If we do this, we should also apply the same condition above, where droppedEntries is assigned to tailResponse, i.e. https://github.com/grafana/loki/pull/877/files#diff-6786afcc7f6bd0a7f048371a46467af0R152

Contributor Author

I read your feedback as "do it only if len(droppedEntries) > 0". If I understood your comment correctly, then definitely yes (done). If I misunderstood it, could you share more details, please?

Contributor

That is what I meant, thanks for fixing this!
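
The agreed change can be sketched roughly as follows. This is an illustrative reconstruction, not the merged code: trySend, the simplified tailResponse struct and its field names are assumptions. It shows a non-blocking send in which dropped entries are attached to the response only when there are any, and the buffer is re-initialized only when it was non-empty.

```go
package main

import "fmt"

// droppedEntry and tailResponse are simplified, hypothetical stand-ins
// for the real types used by the tailer.
type droppedEntry struct {
	Timestamp int64
}

type tailResponse struct {
	Entries        []string
	DroppedEntries []droppedEntry
}

// trySend attempts a non-blocking send of resp on ch.
// Dropped entries are attached only if there are any. On success the
// returned buffer is a fresh empty slice (only allocated if the old one
// was non-empty); on failure the buffer is returned unchanged.
func trySend(ch chan tailResponse, resp tailResponse, dropped []droppedEntry) ([]droppedEntry, bool) {
	if len(dropped) > 0 {
		resp.DroppedEntries = dropped
	}

	select {
	case ch <- resp:
		if len(dropped) > 0 {
			// Allocate a new slice instead of truncating with dropped[:0]:
			// the response just sent still refers to the old backing array.
			dropped = make([]droppedEntry, 0)
		}
		return dropped, true
	default:
		// The channel is full: the caller decides how to record the drop.
		return dropped, false
	}
}

func main() {
	ch := make(chan tailResponse, 1)
	dropped := []droppedEntry{{Timestamp: 1}}

	dropped, ok := trySend(ch, tailResponse{Entries: []string{"line"}}, dropped)
	fmt.Println("sent:", ok, "pending dropped:", len(dropped))

	// The buffer is now full, so a second send does not block and reports failure.
	dropped, ok = trySend(ch, tailResponse{Entries: []string{"line 2"}}, dropped)
	fmt.Println("sent:", ok, "pending dropped:", len(dropped))
}
```

Guarding both the assignment and the reset with the same length check avoids an allocation per successful send when nothing was dropped.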

@pracucci
Contributor Author

@sandlis Thanks for your review! I should have addressed both.

Contributor

@sandeepsukhani sandeepsukhani left a comment

LGTM

@sandeepsukhani sandeepsukhani merged commit b918532 into grafana:master Aug 12, 2019
@pracucci pracucci deleted the improve-tailer-loop branch August 13, 2019 14:11