[aggregator|tcpclient] Drop enqueued payloads for a flush after write fails #4116

Closed without merging; 2 commits.

@vdarulis commented Jun 2, 2022

…failed

What this PR does / why we need it:

If an instance is slow or dead and writes (with retries) fail, don't just abort the flush: drop all enqueued payloads (those captured at the time of Flush()). Otherwise the backlog keeps accumulating until the queue starts dropping upstream, which puts even more load on the target aggregator instance on the next flush cycle.
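
A minimal sketch of the behavior described above, not the actual diff. The `queue`, `writer`, and `flakyWriter` types and the `errUnavailable` error are hypothetical stand-ins, and retries are assumed to happen inside `Write`. The point is that Flush() captures the pending payloads once and, after the first failed write, discards the rest of that batch instead of carrying it into the next cycle.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errUnavailable simulates a write that still fails after the connection's retries.
var errUnavailable = errors.New("instance unavailable")

// writer is a hypothetical stand-in for the connection.
type writer interface {
	Write(payload []byte) error
}

// flakyWriter succeeds for the first okWrites calls, then fails every call.
type flakyWriter struct {
	okWrites int
	calls    int
}

func (f *flakyWriter) Write(_ []byte) error {
	f.calls++
	if f.calls > f.okWrites {
		return errUnavailable
	}
	return nil
}

// queue is a simplified, hypothetical queue (not the real type from this PR).
type queue struct {
	flushMu sync.Mutex // serializes Flush calls
	mu      sync.Mutex // guards pending
	pending [][]byte
	w       writer
}

func (q *queue) Enqueue(p []byte) {
	q.mu.Lock()
	q.pending = append(q.pending, p)
	q.mu.Unlock()
}

func (q *queue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.pending)
}

// Flush captures the pending payloads and writes them in order. On the first
// failed write, the failed payload and everything after it in the captured
// batch are dropped rather than re-queued.
func (q *queue) Flush() (written, dropped int, err error) {
	q.flushMu.Lock()
	defer q.flushMu.Unlock()

	q.mu.Lock()
	batch := q.pending
	q.pending = nil // the captured batch never returns to the queue
	q.mu.Unlock()

	for i, p := range batch {
		if err = q.w.Write(p); err != nil {
			return i, len(batch) - i, err
		}
	}
	return len(batch), 0, nil
}

func main() {
	q := &queue{w: &flakyWriter{okWrites: 2}}
	for i := 0; i < 5; i++ {
		q.Enqueue([]byte{byte(i)})
	}
	written, dropped, err := q.Flush()
	fmt.Println(written, dropped, err, q.Len()) // prints "2 3 instance unavailable 0"
}
```

The trade-off is deliberate data loss for the failed batch in exchange for a bounded backlog, so the next flush cycle starts from an empty buffer.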

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:


Does this PR require updating code package or user-facing documentation?:


n += processed
}

// drop any unconsumed messages
q.bufProcessing.reset()
Contributor

Sanity checking:

  • Do we have/need retries before we drop the buffer here?
  • Are these buffered elsewhere? AFAICT no, but double-checking.

Collaborator Author

The processing buffer is not used outside of Flush(), which is now thread-safe, and retries are handled by the "connection".
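
For illustration only, a sketch of what "retries are handled by the connection" could look like. The `writeWithRetry` helper, its parameters, and the fixed backoff are assumptions made for this example; the real connection's retry policy is not shown in this thread. Once a helper like this returns an error, the caller (Flush()) has nothing left to retry and can drop the buffer.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable simulates a failed write to a slow or dead instance.
var errUnavailable = errors.New("instance unavailable")

// writeWithRetry is a hypothetical helper: it calls write up to maxAttempts
// times, sleeping for backoff between attempts, and reports how many
// attempts were made.
func writeWithRetry(write func([]byte) error, payload []byte, maxAttempts int, backoff time.Duration) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = write(payload); err == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(backoff)
		}
	}
	return maxAttempts, fmt.Errorf("write failed after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Fails twice, then succeeds on the third call.
	flaky := func([]byte) error {
		calls++
		if calls < 3 {
			return errUnavailable
		}
		return nil
	}
	attempts, err := writeWithRetry(flaky, []byte("payload"), 3, time.Millisecond)
	fmt.Println(attempts, err)

	// A dead instance exhausts the retries and surfaces the wrapped error.
	_, err = writeWithRetry(func([]byte) error { return errUnavailable }, nil, 2, time.Millisecond)
	fmt.Println(errors.Is(err, errUnavailable))
}
```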

@vdarulis vdarulis closed this Oct 13, 2022