Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update broker/pipeline setup #4650

Merged
merged 1 commit into from Jul 12, 2017

Conversation

Projects
None yet
5 participants
@urso
Copy link
Collaborator

commented Jul 11, 2017

  • remove settings: filebeat.spooler, filebeat.publish_async, filebeat.idle_timeout, queue_size, bulk_queue_size
  • introduce new setting: queue.<queue_type>
  • expose memory queue (type = 'mem') settings: events, flush.min_events, flush.timeout

Note: docs will be updated in a separate PR.

@urso urso added the in progress label Jul 11, 2017

# Do not modify this value.
#bulk_queue_size: 0
# Internal queue configuration for buffering events to be published.
#queue:

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 11, 2017

Collaborator

It's a bit strange that we call it here queue and in the code broker. This will need some docs and changelog entry. But can happen in a follow up PR. Does this have to be in this PR?

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

These are the user-visible settings. I decided to rename the broker package to queue, but first I want to get the changes to the settings in and documented, so we won't introduce BC-breaking changes in case package updates PR gets merged late.

@urso urso force-pushed the urso:pipeline/config branch 2 times, most recently from fe27bb3 to cc174ea Jul 11, 2017

@ruflin

ruflin approved these changes Jul 11, 2017

# Hints the minimum number of events stored in the queue,
# before providing a batch of events to the outputs.
# A value of 0 (the default) ensures events are immediately available
# to be send to the outputs.

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 11, 2017

Collaborator

s/send/sent/

This comment has been minimized.

Copy link
@dedemorton

dedemorton Jul 11, 2017

Contributor

"to be sent" is correct

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

Thanks!

# Queue type by name (default 'mem')
#mem:
# Max number of events to store in the queue.
#events: 4096

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 11, 2017

Collaborator

What happens when this is reached? Flush happens? Thinking if there should be flush.min, flush.max and flush.timeout or similar?

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

Semantics is a little different from old spooler here. The queue/broker does not flush its buffers. By default it basically stream the events right through to the outputs (if output takes somewhat longer it will be presented with all events available - up to bulk_max_size).

By configuring min_events, the queue will block the output until:

  • at least min events is reached
  • timeout is trigger

The min events is more of a hint, as the output might still receive a smaller buffer.

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

The events setting is the maximum number of events the queue can hold. It's independent of min_events and flush timer. Only min_events must be < events.

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

Naming is a little weird here... as it's more like 'block-processing' settings...

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 11, 2017

Collaborator

@urso what happens when events is reached? I assume it will block from adding more events? In case of metricbeat assuming sending events is slower then generating events but still all events are acked. What will happen?

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

once events is reached, the queue will block (no more space). Upon ACK of N events, another N events can be stored in the queue (N producers will be unblocked).

}

p, err := pipeline.New(brokerFactory, out, settings)
brokerFactory := broker.FindFactory(brokerType)

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 11, 2017

Collaborator

As quickly discussed, later we should probably rename broker package to queue to make the relation to the config more obvious.

func NewBroker(eventer broker.Eventer, sz int, waitOnClose bool) *Broker {
func NewBroker(
eventer broker.Eventer,
sz int,

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 11, 2017

Collaborator

Seems like the 3 params could be a config struct for borkers.

# Do not modify this value.
#bulk_queue_size: 0
# Internal queue configuration for buffering events to be published.
#queue:

This comment has been minimized.

Copy link
@andrewkroh

andrewkroh Jul 11, 2017

Member

I'll open a PR to delete this block of code from Winlogbeat so that queue isn't rejected.

This comment has been minimized.

Copy link
@urso

urso Jul 11, 2017

Author Collaborator

hehe, thanks.

This comment has been minimized.

Copy link
@urso

urso Jul 12, 2017

Author Collaborator

this PR is adjusting winlogbeat.

@urso urso force-pushed the urso:pipeline/config branch 3 times, most recently from 57756c6 to b3f1b82 Jul 11, 2017

@urso urso changed the title [WIP] update broker/pipeline setup Update broker/pipeline setup Jul 12, 2017

@urso urso force-pushed the urso:pipeline/config branch from b3f1b82 to 61f5b7e Jul 12, 2017

@urso urso referenced this pull request Jul 12, 2017

Closed

Publisher Pipeline #4598

22 of 22 tasks complete

@urso urso added needs_docs review and removed in progress labels Jul 12, 2017

@ruflin

ruflin approved these changes Jul 12, 2017

@@ -19,6 +19,7 @@ https://github.com/elastic/beats/compare/v6.0.0-alpha2...master[Check the HEAD d
- The `scripts/import_dashboards` is removed from packages. Use the `setup` command instead. {pull}4586[4586]
- Change format of the saved kibana dashboards to have a single JSON file for each dashboard {pull}4413[4413]
- Rename `configtest` command to `test config`. {pull}4590[4590]
- Remove setting `queue_size` and `bulk_queue_size`. {pull}4650[4650]

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 12, 2017

Collaborator

Should we add a message to the 5.6 release that these values will change?

@@ -81,6 +83,7 @@ https://github.com/elastic/beats/compare/v6.0.0-alpha2...master[Check the HEAD d
- Add udp prospector type. {pull}4452[4452]
- Enabled Cgo which means libc is dynamically compiled. {pull}4546[4546]
- Add Beta module config reloading mechanism {pull}4566[4566]
- Remove spooler and publisher components and settings. {pull}4644[4644]

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 12, 2017

Collaborator

Same here, we should probably have a message in 5.6 or even better, not let the beat start in 6.0 if these values are set?

# new batches.
#pipelining: 0
#pipelining: 5

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 12, 2017

Collaborator

Is there a reason we changed the default to 5? Should probably be mentioned in the changelog?

This comment has been minimized.

Copy link
@urso

urso Jul 12, 2017

Author Collaborator
  • pipeline can operate complete async -> take advantage of this fact
  • with default settings we have a high chance to publish very small batches (no flush timeout -> reduced latency) if producers don't have a high enough rate. In Logstash case we can easily compensate for this by pushing more batches on the link

This comment has been minimized.

Copy link
@urso

urso Jul 12, 2017

Author Collaborator

Updated changelog.

@@ -20,9 +20,6 @@ const (

type Config struct {
Prospectors []*common.Config `config:"prospectors"`
SpoolSize uint64 `config:"spool_size" validate:"min=1"`
PublishAsync bool `config:"publish_async"`

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 12, 2017

Collaborator

Glad to see this one removed ;-)

type config struct {
Events int `config:"events" validate:"min=32"`
Events int `config:"events" validate:"min=32"`

This comment has been minimized.

Copy link
@ruflin

ruflin Jul 12, 2017

Collaborator

Kind of surprised to see a minimum here.

This comment has been minimized.

Copy link
@urso

urso Jul 12, 2017

Author Collaborator

well, we need to buffer some data for pushing to the outputs. Without queue the pipeline won't be able to operate -> enforce minimum so users don't use 1 or 0.

Update broker setup
- remove setting queue_size
- remove unused setting bulk_queue_size
- introduce settings namespace `queue` with default `mem.events: 4096`
- add settings queue.mem.flush.min_events and queue.mem.flush.timeout
- change default output.logstash.pipelining to 5
- remove spooler settings from filebeat config object
- Update winlogbeat config validate check

@urso urso force-pushed the urso:pipeline/config branch from 61f5b7e to cd45c0a Jul 12, 2017

@urso

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 12, 2017

Translating old configurations:

  • rename queue_size -> queue.mem.events * 2
  • remove bulk_queue_size
  • rename or remove filebeat.spool_size -> queue.mem.flush.min_events
  • rename or remove filebeat.idle_timeout -> queue.mem.flush.timeout

@ruflin ruflin merged commit 7a6a5dd into elastic:master Jul 12, 2017

6 checks passed

CLA Commit author is a member of Elasticsearch
Details
beats-ci Build finished.
Details
codecov/patch 60.27% of diff hit (within 100% threshold of 62.85%)
Details
codecov/project 62.88% (+0.03%) compared to c43b2fd
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@monicasarbu monicasarbu added the libbeat label Jul 17, 2017

@urso urso deleted the urso:pipeline/config branch Jul 19, 2017

@tsg tsg referenced this pull request Jul 24, 2017

Closed

Document breaking changes in 6.0 #4737

28 of 28 tasks complete

@dedemorton dedemorton referenced this pull request Jul 25, 2017

Closed

Beats doc updates in 6.0 #4540

42 of 42 tasks complete

ramon-garcia added a commit to ramon-garcia/beats that referenced this pull request Dec 5, 2017

Update broker setup (elastic#4650)
- remove setting queue_size
- remove unused setting bulk_queue_size
- introduce settings namespace `queue` with default `mem.events: 4096`
- add settings queue.mem.flush.min_events and queue.mem.flush.timeout
- change default output.logstash.pipelining to 5
- remove spooler settings from filebeat config object
- Update winlogbeat config validate check
@dedemorton

This comment has been minimized.

Copy link
Contributor

commented Jan 22, 2018

I'm removing the needs_docs label here because I believe we've covered the doc requirements with the topic https://www.elastic.co/guide/en/beats/filebeat/current/configuring-internal-queue.html and the removed settings that are documented in the breaking changes for the release.

@dedemorton dedemorton removed the needs_docs label Jan 22, 2018

athom added a commit to athom/beats that referenced this pull request Jan 25, 2018

Update broker setup (elastic#4650)
- remove setting queue_size
- remove unused setting bulk_queue_size
- introduce settings namespace `queue` with default `mem.events: 4096`
- add settings queue.mem.flush.min_events and queue.mem.flush.timeout
- change default output.logstash.pipelining to 5
- remove spooler settings from filebeat config object
- Update winlogbeat config validate check
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.