arielnetworks/embulk-parser-grok

Grok parser plugin for Embulk

An Embulk parser plugin that parses log records with grok patterns.

Overview

  • Plugin type: parser
  • Guess supported: yes

Configuration

for run

  • grok_pattern: The grok pattern that matches a log record. (string, required)
  • first_line_pattern: The grok pattern that matches the first line of a log record when a record spans multiple lines. (string, default: null)
  • grok_pattern_files: Paths of the grok pattern files to load. (string[], required)
  • timestamp_parser: The parser used for timestamp columns. (string, default: ruby)
    • ruby: Use JRuby's timestamp parser.
    • SimpleDateFormat: Use Java's SimpleDateFormat.
    • epoch: Treat the value as milliseconds since the epoch.
  • stop_on_invalid_record: Stop the bulk load transaction if a file contains an invalid record. (boolean, default: false)
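
As an illustration of the timestamp_parser option, the fragment below sketches how the timestamp column of the Apache example might look with Java's SimpleDateFormat instead of the default ruby parser. It assumes that the column's format is then read as a SimpleDateFormat pattern; the pattern shown is the SimpleDateFormat counterpart of '%d/%b/%Y:%T %z' and has not been verified against the plugin. Other parser keys are omitted for brevity.

```yaml
  parser:
    type: grok
    # Select Java's SimpleDateFormat instead of the default ruby parser
    timestamp_parser: SimpleDateFormat
    columns:
    # Assumption: with SimpleDateFormat selected, `format` uses Java pattern letters.
    # 'MMM' matches month abbreviations such as "Oct" only in an English locale.
    - {name: timestamp, format: 'dd/MMM/yyyy:HH:mm:ss Z', type: timestamp}
```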

for guess

  • guess_patterns: The grok patterns to try when guessing the log format. (string[], required)

Example

parse apache log file

in:
  type: file
  path_prefix: src/test/resources/apache.log
  parser:
    type: grok
    grok_pattern_files:
      - pattern/grok-patterns
      - pattern/my-patterns
    timestamp_parser: ruby
    grok_pattern: '%{COMBINEDAPACHELOG}'
    stop_on_invalid_record: false
    charset: UTF-8
    newline: CRLF
    columns:
    - {name: request, type: string}
    - {name: agent, type: string}
    - {name: COMMONAPACHELOG, type: string}
    - {name: auth, type: string}
    - {name: ident, type: string}
    - {name: verb, type: string}
    - {name: referrer, type: string}
    - {name: bytes, type: long}
    - {name: response, type: long}
    - {name: clientip, type: string}
    - {name: COMBINEDAPACHELOG, type: string}
    - {name: httpversion, type: string}
    - {name: rawrequest, type: string}
    - {name: timestamp, format: '%d/%b/%Y:%T %z', type: timestamp}

parse multiline file

in:
  type: file
  path_prefix: src/test/resources/multiline.log
  parser:
    type: grok
    grok_pattern_files:
      - pattern/grok-patterns
      - pattern/my-patterns
    timestamp_parser: ruby
    first_line_pattern: '%{MULTILINELOG_FIRSTLINE}'
    grok_pattern: '%{MULTILINELOG}'
    charset: UTF-8
    newline: CRLF
    columns:
    - {name: timestamp, format: '%Y-%m-%d %H:%M:%S.%N %z', type: timestamp}
    - {name: log_level, type: string}
    - {name: message, type: string}
    - {name: stack_trace, type: string}

guess

in:
  type: file
  path_prefix: src/test/resources/apache.log
  parser:
    charset: UTF-8
    newline: CRLF
    type: grok
    grok_pattern_files:
      - pattern/grok-patterns
      - pattern/my-patterns
    guess_patterns:
      - "%{COMBINEDAPACHELOG}"
      - "%{COMMONAPACHELOG}"
    timestamp_parser: ruby

$ embulk install embulk-parser-grok
$ embulk guess -g grok config.yml -o guessed.yml

Build

$ ./gradlew gem
