Multiline support in Filebeat #461

Closed
tbragin opened this issue Dec 7, 2015 · 1 comment
Assignees
Labels
enhancement, Filebeat, in progress

Comments

@tbragin
Contributor

tbragin commented Dec 7, 2015

See this issue for more details on motivations: https://github.com/elastic/filebeat/issues/89

See this issue for proposed implementation: https://github.com/elastic/filebeat/issues/301

tsg assigned tsg and unassigned ruflin Dec 16, 2015
tsg added the "in progress" label Dec 16, 2015
@tsg
Contributor

tsg commented Dec 16, 2015

Copying over and summarizing the result of the discussion from https://github.com/elastic/filebeat/issues/301:

Configuration

The configuration is strongly inspired by the Logstash multiline codec, but expressed in YAML, with the "what" parameter renamed to "match" and its options extended:

multiline:
  pattern: a regexp
  negate: true or false (default false)
  match: one of "before" or "after"

For example, the following appends lines that start with whitespace to the previous line (common in exception stack traces):

multiline:
  pattern: '^\s'
  match: after

Note that "after" is equivalent to "previous" in the Logstash (LS) config, and "before" is equivalent to "next".
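
To make the "after"/"before" semantics concrete, here is a minimal Python sketch of how lines could be grouped. It is not the Filebeat implementation: the function name `group_lines` is hypothetical, Python's `re` stands in for whatever regexp engine Filebeat ends up using, and the handling of a leading continuation line is an assumption.

```python
import re


def group_lines(lines, pattern, negate=False, match="after"):
    """Group raw lines into multiline events (illustrative sketch only)."""
    regex = re.compile(pattern)
    events = []
    current = []
    for line in lines:
        # With negate, lines that do NOT match the pattern count as matches.
        matched = (regex.search(line) is not None) != negate
        if match == "after":
            # A matching line is appended after the event that is already open.
            if matched and current:
                current.append(line)
            else:
                if current:
                    events.append("\n".join(current))
                current = [line]
        elif match == "before":
            # A matching line comes before, and is joined with, the next line.
            current.append(line)
            if not matched:
                events.append("\n".join(current))
                current = []
        else:
            raise ValueError("match must be 'before' or 'after'")
    if current:
        events.append("\n".join(current))
    return events


trace = [
    "Exception in thread main",
    "    at Foo.bar(Foo.java:1)",
    "    at Baz.qux(Baz.java:2)",
    "next message",
]
for event in group_lines(trace, r"^\s", match="after"):
    print(repr(event))
```

With `match="before"` and a pattern such as `\\$`, the same function would join a line ending in a backslash with the line that follows it.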

As another example, the following config appends to the previous line all lines that don't start with a timestamp (the same example can be found in the LS docs):

multiline:
  pattern: '(\d{4})-(\d{2})-(\d{2})T(\d{2})\:(\d{2})\:(\d{2})\+(\d{2})\:(\d{2})'
  negate: true
  match: after
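
The effect of `negate: true` can be checked with a small Python sketch. The helper name `is_continuation` and the sample log lines are made up for illustration, and Python's `re` is only a stand-in for the regexp engine Filebeat would use.

```python
import re

# Same timestamp pattern as in the config above.
TIMESTAMP = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2})\:(\d{2})\:(\d{2})\+(\d{2})\:(\d{2})"
)


def is_continuation(line, negate=True):
    """Return True if the line should be joined to the previous one."""
    matched = TIMESTAMP.search(line) is not None
    # negate flips the result: lines WITHOUT a timestamp become continuations.
    return matched != negate


sample = [
    "2015-12-16T10:15:30+01:00 ERROR something failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Main.run(Main.java:42)",
    "2015-12-16T10:15:31+01:00 INFO recovered",
]
flags = [is_continuation(line) for line in sample]
print(flags)
```

Only the two lines without a timestamp are flagged as continuations, so the stack trace stays attached to the ERROR line.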

Configuration (phase two)

An extended version also accepts an array. This allows setting multiple patterns at once, which makes the option more powerful.

multiline:
  patterns:
    - 
      pattern: regexp
      negate: true or false
      match: one of ["start", "end", "before", "after"]
    -
      pattern: regexp
      negate: true or false
      match: one of ["start", "end", "before", "after"]

Note that "start" and "end" are new and can be used for matching things like multiline JSON or multiline XML.

For example, the following would match a pretty-printed JSON object:

multiline:
  patterns:
    - pattern: "^{$"
      match: "start"
    -
      pattern: "^}$"
      match: "end"
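
The exact behavior of "start" and "end" is not spelled out here, so the following Python sketch rests on an assumption: a line matching the "start" pattern opens an event, every following line is added to it, and a line matching the "end" pattern closes it. The function name `group_start_end` is hypothetical, and the braces are escaped in the Python patterns to keep them unambiguous.

```python
import re


def group_start_end(lines, start_pattern, end_pattern):
    """Group lines between a start marker and an end marker (sketch only)."""
    start = re.compile(start_pattern)
    end = re.compile(end_pattern)
    events = []
    current = None  # None means no multiline event is open
    for line in lines:
        if current is None:
            if start.search(line):
                current = [line]
            else:
                # Lines outside a start/end pair are emitted on their own.
                events.append(line)
        else:
            current.append(line)
            if end.search(line):
                events.append("\n".join(current))
                current = None
    if current is not None:
        # Unterminated event at end of input: emit what was collected.
        events.append("\n".join(current))
    return events


lines = ["{", '  "a": 1,', '  "b": 2', "}", "plain line", "{", '  "c": 3', "}"]
for event in group_start_end(lines, r"^\{$", r"^\}$"):
    print(repr(event))
```

Because both patterns are anchored, an indented closing brace of a nested object does not end the event early.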

Limits

The following configuration options will set limits on multiline matching (again inspired by LS):

  • max_lines - Flush after this many lines have been joined together. Default 500
  • max_bytes - Flush after this many bytes have been joined together. Default 10 MB
  • timeout - Flush after this duration has passed without seeing further lines that match the pattern. This option is not present in LS, so it has lower priority.
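
A rough Python sketch of how the three limits could interact is shown below. The class `MultilineBuffer` and its methods are hypothetical, the clock is passed in explicitly so the example stays deterministic, and treating "10 MB" as 10 * 1024 * 1024 bytes is an assumption.

```python
class MultilineBuffer:
    """Collects lines of one multiline event and flushes when a limit is hit."""

    def __init__(self, max_lines=500, max_bytes=10 * 1024 * 1024, timeout=None):
        self.max_lines = max_lines
        self.max_bytes = max_bytes
        self.timeout = timeout  # seconds without new lines, None disables it
        self.lines = []
        self.size = 0
        self.last_seen = None

    def add(self, line, now):
        """Add a line; return the flushed event if a size limit was reached."""
        self.lines.append(line)
        self.size += len(line.encode("utf-8"))
        self.last_seen = now
        if len(self.lines) >= self.max_lines or self.size >= self.max_bytes:
            return self.flush()
        return None

    def poll(self, now):
        """Return the flushed event if no line arrived within the timeout."""
        if (
            self.timeout is not None
            and self.lines
            and now - self.last_seen >= self.timeout
        ):
            return self.flush()
        return None

    def flush(self):
        """Emit the buffered lines as one event and reset the buffer."""
        if not self.lines:
            return None
        event = "\n".join(self.lines)
        self.lines, self.size, self.last_seen = [], 0, None
        return event


buf = MultilineBuffer(max_lines=3, max_bytes=100, timeout=5)
buf.add("line 1", now=0)
buf.add("line 2", now=1)
print(repr(buf.poll(now=6)))  # idle for 5 seconds, so the buffer is flushed
```

In this sketch, max_lines and max_bytes are checked on every added line, while the timeout needs a separate periodic check because it has to fire when no lines arrive at all.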
