Make sure dequeued batches have a minimum size #495

Closed
radu-gheorghe opened this Issue Aug 21, 2015 · 1 comment

radu-gheorghe commented Aug 21, 2015

It seems that some outputs, like the Elasticsearch one (though clearly not all of them), would benefit if rsyslog made sure it batches multiple messages instead of sending many small batches under light load. The worst-case scenario is many rsyslog instances on many machines hammering a small ES cluster with 1-doc bulks.

@rgerhards says it can be done in the queue engine, so that we get the best of both worlds: send as fast as we can - like we do now - for some outputs, and ensure a minimum batch size (or maybe some other solution?) for outputs like omelasticsearch. This issue is more of a reminder for him :) Reference mailing list thread: http://search-devops.com/m/PamuZV4TVQ1M0AJg&subj=Re+rsyslog+Can+we+have+a+minimum+bulk+size+for+omelasticsearch+

rgerhards commented Aug 21, 2015

Rough idea: add config params to set a minimum number of messages m and a timeout t. In the queue dequeue operation, iterate until the number of messages pulled, n, is at least m. If no data is present while n < m, wait on the notempty signal (releasing the queue mutex) with timeout t. If t has expired, re-check whether the queue is empty: if so, finish the batch, else continue iterating. We need to compute t once at the beginning of the function, so that successive iterations do not increase the overall timeout.
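
To make the rough idea concrete, here is a minimal Python sketch of the dequeue loop. It is only an illustration, not rsyslog's queue engine (which is written in C): the class `MinBatchQueue`, the method `dequeue_batch`, and the `max_size` cap are hypothetical names and assumptions that do not appear in this issue. `min_size` and `timeout` stand in for m and t.

```python
import threading
import time
from collections import deque


class MinBatchQueue:
    """Toy in-memory queue illustrating a minimum dequeue batch size.

    Hypothetical helper for illustration only; not rsyslog code.
    """

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()
        # Plays the role of the "notempty" signal tied to the queue mutex.
        self._not_empty = threading.Condition(self._lock)

    def enqueue(self, msg):
        with self._not_empty:
            self._items.append(msg)
            self._not_empty.notify()

    def dequeue_batch(self, min_size, max_size, timeout):
        # Compute the deadline once, so that repeated waits cannot
        # stretch the overall wait beyond `timeout` seconds.
        deadline = time.monotonic() + timeout
        batch = []
        with self._not_empty:
            while len(batch) < max_size:
                if self._items:
                    batch.append(self._items.popleft())
                    continue
                if len(batch) >= min_size:
                    break  # queue is empty and the minimum is reached
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break  # timed out and still empty: finish the batch
                # wait() releases the lock while blocked and re-acquires
                # it before returning; the loop then re-checks the queue.
                self._not_empty.wait(remaining)
        return batch
```

With `min_size=1` and `timeout=0`, this never waits once a message has been pulled, which roughly corresponds to the current "send as fast as we can" behavior; a larger `min_size` trades some latency, bounded by `timeout`, for fuller batches.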

@rgerhards rgerhards added this to the v8.33 milestone Nov 26, 2017

@rgerhards rgerhards modified the milestones: v8.33, v8.34 Feb 15, 2018

@rgerhards rgerhards modified the milestones: v8.34, v8.35 Apr 3, 2018

@rgerhards rgerhards modified the milestones: v8.35, v8.36 May 14, 2018

@rgerhards rgerhards modified the milestones: v8.36, v8.37 Jun 25, 2018

@rgerhards rgerhards modified the milestones: v8.37, v8.39 Aug 3, 2018

@rgerhards rgerhards modified the milestones: v8.39, v8.40 Oct 26, 2018

@rgerhards rgerhards self-assigned this Oct 30, 2018

@rgerhards rgerhards added the do_first label Dec 9, 2018

@rgerhards rgerhards modified the milestones: v8.40, v8.41 Dec 9, 2018

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 10, 2019

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 11, 2019

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 11, 2019

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 11, 2019

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 14, 2019

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 14, 2019

rgerhards added a commit to rgerhards/rsyslog that referenced this issue Jan 18, 2019