Multiple prospectors on the same file #3046

Closed
joshuaspence opened this issue Nov 22, 2016 · 2 comments

@joshuaspence
commented Nov 22, 2016

We are using Filebeat to parse a log file that contains two different types of logs: JSON-encoded logs and PHP logs. We are migrating all of our application code to the JSON-encoded format, but until then we are using the following Filebeat configuration in an attempt to parse the same log file twice, once with a json document type and once with a php document type. The Filebeat configuration looks like this:

---
filebeat:
  spool_size: 2048
  idle_timeout: '5s'
  prospectors:
    - input_type: 'log'
      paths:
        - '/mnt/logs/php.log'
      encoding: 'plain'
      include_lines:
        - '^{'
      document_type: 'json'
      scan_frequency: '10s'
      harvester_buffer_size: 16384
      max_bytes: 10485760
      tail_files: false
      backoff: '1s'
      max_backoff: '10s'
      backoff_factor: 2
    - input_type: 'log'
      paths:
        - '/mnt/logs/php.log'
      encoding: 'plain'
      exclude_lines:
        - '^{'
      document_type: 'php'
      scan_frequency: '10s'
      harvester_buffer_size: 16384
      max_bytes: 10485760
      multiline:
        pattern: '^(\[[0-9]{2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-[0-9]{4}|{)'
        negate: true
        match: 'after'
      tail_files: false
      backoff: '1s'
      max_backoff: '10s'
      backoff_factor: 2
output:
  console:

I am noticing that sometimes (it doesn't seem to happen consistently) the log formats get mixed up. For example, I saw a message in Kibana that looks like this:

PHP Warning: REDACTED
Stack trace:
#0 REDACTED
#1 REDACTED
#2 REDACTED
#3 REDACTED
#4 REDACTED
#5 REDACTED
#6 REDACTED
#7 REDACTED
#8 REDACTED
#9 REDACTED
#10 REDACTED
#11 REDACTED
#12 REDACTED
#13 {main}
{... JSON message ...}

Looking at the registry file, there is only a single entry for source == "/mnt/logs/php.log", which made me wonder... is this supposed to work? It's possible that the problem lies in our multiline regex, but the fact that the registry contains only a single entry, despite there being multiple prospectors for this file, makes me suspect that this setup is not supported.

@ruflin

Collaborator

commented Nov 22, 2016

Harvesting the same file in two different prospectors is not supported. This messes up the state internally and can lead to strange behaviour. For your use case you should use two instances of Filebeat and make sure they use different registry files. I'm closing this issue as it is not a bug, but I'm happy to discuss it further on the forum: https://discuss.elastic.co/c/beats/filebeat
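
A minimal sketch of the two-instance setup suggested above, assuming a Filebeat version that supports the `registry_file` option under `filebeat:`. The config file names and registry paths are hypothetical; the prospector settings are copied from the original configuration, with the tuning options left out for brevity.

```yaml
# Hypothetical file: filebeat-json.yml (first instance, JSON lines only)
filebeat:
  # Assumed path. Each instance needs its own registry so that read
  # offsets for /mnt/logs/php.log are tracked independently.
  registry_file: '/var/lib/filebeat/registry-json'
  prospectors:
    - input_type: 'log'
      paths:
        - '/mnt/logs/php.log'
      include_lines:
        - '^{'
      document_type: 'json'
output:
  console:
---
# Hypothetical file: filebeat-php.yml (second instance, PHP lines only)
filebeat:
  # Assumed path. Must differ from the registry of the first instance.
  registry_file: '/var/lib/filebeat/registry-php'
  prospectors:
    - input_type: 'log'
      paths:
        - '/mnt/logs/php.log'
      exclude_lines:
        - '^{'
      document_type: 'php'
      multiline:
        pattern: '^(\[[0-9]{2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-[0-9]{4}|{)'
        negate: true
        match: 'after'
output:
  console:
```

Each instance would then be started separately with its own config file, for example `filebeat -c filebeat-json.yml` and `filebeat -c filebeat-php.yml`.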

@ruflin ruflin closed this Nov 22, 2016
@rahulghanate


commented Sep 7, 2017

Filebeat should allow a prospector definition to be overridden.
I ran into this issue because I had enabled one generic prospector for *.log in a specific directory.
Later I added a specialized prospector in config_dir for one log file from that directory, with different parsing details.

What I actually want is for the generic prospector to pick up a file only if no specialized prospector is defined for it.
Instead, Filebeat does not even start. It should at least ignore that prospector config so I don't miss logs.
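
One possible workaround, sketched here under assumptions rather than as a supported override mechanism, is to keep the two prospectors from overlapping by listing the specialized file in the generic prospector's `exclude_files` option (a list of regular expressions matched against file paths). The directory, file name, and document type below are hypothetical.

```yaml
filebeat:
  prospectors:
    # Generic prospector: every *.log file in the directory except the
    # one handled by the specialized prospector below.
    - input_type: 'log'
      paths:
        - '/var/log/myapp/*.log'       # hypothetical directory
      exclude_files:
        - '/special\.log$'             # hypothetical file name
    # Specialized prospector: only the excluded file, with its own settings.
    - input_type: 'log'
      paths:
        - '/var/log/myapp/special.log'
      document_type: 'special'         # hypothetical type
```

With this layout each file is claimed by exactly one prospector, but the exclusion has to be maintained by hand whenever a new specialized prospector is added.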
