filebeat wildcard for directories #2084

Closed
runningman84 opened this Issue Jul 22, 2016 · 25 comments

runningman84 commented Jul 22, 2016

It looks like filebeat does not support wildcards in directories like this:

/opt/codedeploy-agent/deployment-root/*/*/logs/scripts.log

ruflin Jul 25, 2016

Collaborator

This is related to elastic/filebeat#68. Wildcards for directories should already be supported. What is the behaviour you were expecting? Are you looking for **, which crawls all subdirectories (and is not supported at the moment)?

runningman84 Jul 25, 2016

My folders look like this

/opt/codedeploy-agent/deployment-root/4b74562c-cee3-40f5-8d36-6f588eeed802/d-KYVV5AYKF/logs/scripts.log
/opt/codedeploy-agent/deployment-root/b41f722c-80e8-41e9-866e-1f11228f5ab3/d-UIIXR9K9F/logs/scripts.log

This config does not seem to work:

'/opt/codedeploy-agent/deployment-root/*/*/logs/scripts.log'

Do I need to change the config to this?

'/opt/codedeploy-agent/deployment-root/**/logs/scripts.log'

ruflin Jul 25, 2016

Collaborator

I would expect it to work, but TBH so far I have only tested examples like /opt/*/scripts.log, meaning only one directory with a * pattern. Could you briefly test whether /opt/codedeploy-agent/deployment-root/4b74562c-cee3-40f5-8d36-6f588eeed802/*/logs/scripts.log works for you? Which Filebeat version are you using?

runningman84 Jul 25, 2016

yes
/opt/codedeploy-agent/deployment-root/e8be8394-bb4f-403e-abd8-03045480217d/*/logs/scripts.log
works

ruflin Jul 26, 2016

Collaborator

OK, so it seems to work with one wildcard directory but not with two nested ones. We use Glob from Golang directly for the pattern (https://golang.org/pkg/path/filepath/#Glob). Currently, all patterns supported by Glob are supported by Filebeat.

runningman84 Jul 26, 2016

do you know a pattern which would work here?

andrewkroh Jul 26, 2016

Member

@ruflin I'm thinking we should list the patterns supported (https://golang.org/pkg/path/filepath/#Match) in our documentation or link to it. Also we should put some info in there to clarify that only one wildcard is supported (assuming I understand correctly)?

If you agree I can open a new issue to add this to the docs.

ruflin Jul 27, 2016

Collaborator

@andrewkroh Agree. TBH so far it wasn't clear to me that multiple * do not work. It also doesn't seem to be stated in the Golang docs (or I haven't found it yet).

wjoel Sep 12, 2016

The lack of multiple wildcards means it's not possible to have a setup as described in Log Management with ELK for Mesos, where paths have the format /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout

From golang/go#11862 it seems Golang will not support this any time soon, and the discussion ends with a reference to go-zglob. Would it be possible to use that for the glob patterns in filebeat? According to this commit it looks like a simple change.

ruflin Sep 13, 2016

Collaborator

It's definitely worth a discussion. But it seems to me we are discussing two things here:

  • Support for **, which can descend into multiple subdirectories
  • Replacing just one directory level with *, but multiple times

I'm not so much worried about the implementation work itself as about the potential side effects we are not aware of yet, some of which are also discussed in the Golang issue.

I especially see the need to support multiple *, but I'm a little sceptical about ** support. I assume zglob does both? To keep the default setup stable and reliable, I would suggest that, instead of replacing the current implementation with zglob or something similar, we make it a config option that has to be turned on explicitly. This would make it easier for us to identify problems related to it, and we could test it first.

Are there alternatives to zglob? Could this be done in a few lines in filebeat itself to not have an additional dependency and be able to fix bugs directly?

runningman84 Sep 13, 2016

** is not needed for my use case; this would be enough:

'/opt/codedeploy-agent/deployment-root/*/*/logs/scripts.log'

cjgeode Oct 6, 2016

Potential workaround:

Currently, unknown depth of subdirectories is not supported.

However, if the depth is known, say exactly 4, you could use wildcards to do something like:
['path/*/*/*/*']

If the depth can be 2, 3 or 4, use ['path/*/*', 'path/*/*/*', 'path/*/*/*/*']

You can also create a shared variable to point to the root path, so your config might look like:
['${fb.watch.path}/*/*', '${fb.watch.path}/*/*/*', '${fb.watch.path}/*/*/*/*']
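
For a wider range of depths, the pattern list in this workaround could be generated rather than typed by hand. The following is a minimal Go sketch, not anything Filebeat provides: the function name depthPatterns and its parameters are hypothetical, and it only builds the strings you would then paste into the paths setting.

```go
package main

import (
	"fmt"
	"strings"
)

// depthPatterns (hypothetical helper) returns one glob pattern per depth from
// minDepth to maxDepth. Each pattern is root followed by that many "/*"
// segments. Depths below 1 are skipped.
func depthPatterns(root string, minDepth, maxDepth int) []string {
	var patterns []string
	for d := minDepth; d <= maxDepth; d++ {
		if d < 1 {
			continue
		}
		patterns = append(patterns, root+strings.Repeat("/*", d))
	}
	return patterns
}

func main() {
	// Reproduces the three patterns from the comment above.
	for _, p := range depthPatterns("path", 2, 4) {
		fmt.Println(p)
	}
}
```

Calling depthPatterns("path", 2, 4) yields the same three patterns listed in the comment: path/*/*, path/*/*/* and path/*/*/*/*.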

tsg Oct 9, 2016

Collaborator

Multiple * are already supported. I re-confirmed this with 5.0.0-rc1, but it should also work with 1.3. ** is not supported.

@runningman84 perhaps something else was wrong in your test, can you re-try it, please?
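
Since Filebeat is said earlier in the thread to pass patterns straight to Go's filepath.Glob, this can be checked independently of Filebeat. The sketch below assumes a throwaway directory tree shaped like the one reported above; the directory names id-a, d-1 and so on, and the helper countMatches, are hypothetical stand-ins. It compares a pattern with two * segments against one with **.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// countMatches (hypothetical helper) creates a temporary tree of the form
// <root>/<id>/<deployment>/logs/scripts.log with two such files, then returns
// how many paths filepath.Glob finds for the given pattern relative to <root>.
func countMatches(pattern string) (int, error) {
	root, err := os.MkdirTemp("", "globdemo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)

	// Made-up directory names standing in for the real deployment IDs.
	for _, dir := range []string{"id-a/d-1/logs", "id-b/d-2/logs"} {
		full := filepath.Join(root, filepath.FromSlash(dir))
		if err := os.MkdirAll(full, 0o755); err != nil {
			return 0, err
		}
		if err := os.WriteFile(filepath.Join(full, "scripts.log"), nil, 0o644); err != nil {
			return 0, err
		}
	}

	matches, err := filepath.Glob(filepath.Join(root, filepath.FromSlash(pattern)))
	if err != nil {
		return 0, err
	}
	return len(matches), nil
}

func main() {
	for _, p := range []string{"*/*/logs/scripts.log", "**/logs/scripts.log"} {
		n, err := countMatches(p)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%-24s %d match(es)\n", p, n)
	}
}
```

On this tree the first pattern should find both files. The second should find none, because the standard library's matcher gives ** no special meaning and treats it like a single *, which covers only one directory level.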

runningman84 Oct 10, 2016

Yes it does work for me now.

nostrebor Oct 28, 2016

This is not working for me either. A single * works, but a pattern of the form:
- \\network-share\subdir\*\*\*\*.log
yields no output. The same behavior is seen in Logstash. If I replace it with only a single wildcard:
- \\network-share\1\2\3\4\*.log
files are discovered by both Logstash and Filebeat.

raiusa Jan 18, 2017

Any update on support of * * in Filebeat?
I have logs in the following directory structure:

/opt/cloudera/yarn/container-logs/application_*/container_*/*.log

raiusa Jan 18, 2017

Sorry, somehow it's stripping the * from the directories application_* and container_*:
/opt/cloudera/yarn/container-logs/application_/container_/*.log

ruflin Jan 19, 2017

Collaborator

@raiusa I updated your post to have it posted as code. There is no update yet on this.

zkf9971 Jan 25, 2017

My log files are under a path like /data/*/_data/*/*/*/*/*/. Is it possible for Filebeat to get the real values matched by the * wildcards in the path? The * segments carry important information. For instance, for the log file /data/containers/_data/container_1/serverid_123/*/*/*/serverid_123.log, can Filebeat get a value like serverid_123? Thanks a lot.
The following is a snippet of my filebeat.yml for these log files:

filebeat.prospectors:
- input_type: log
  paths:
    - /data/*/_data/*/*/*/*/*/*.log

ruflin Jan 26, 2017

Collaborator

@zkf9971 As far as I understand, you want to add the path names as fields to the event? If yes, this would be a different feature request than this one.

7AC self-assigned this Mar 16, 2017

7AC added a commit to 7AC/beats that referenced this issue Mar 22, 2017

7AC removed their assignment Apr 5, 2017

7AC added a commit to 7AC/beats that referenced this issue Apr 10, 2017

filebeat: expand double wildcards in prospector
Expand double wildcards into standard glob patterns, up to a maximum
depth of 16 levels after the wildcard.

Resolves elastic#2084

7AC added a commit to 7AC/beats that referenced this issue Apr 25, 2017

filebeat: expand double wildcards in prospector
Expand double wildcards into standard glob patterns, up to a maximum
depth of 8 levels after the wildcard.

Resolves elastic#2084

ruflin closed this in #3980 Apr 26, 2017

ruflin added a commit that referenced this issue Apr 26, 2017

filebeat: expand double wildcards in prospector (#3980)
Expand double wildcards into standard glob patterns, up to a maximum
depth of 8 levels after the wildcard.

Resolves #2084
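
The commit message describes rewriting a ** pattern into a set of ordinary glob patterns with a bounded depth. The following Go sketch illustrates that general idea only; it is not the code from #3980. The function name expandDoubleStar is hypothetical, and two behaviours are assumptions made for the example: the zero-level case (no directory in place of **) is included, and only the first ** segment is expanded.

```go
package main

import (
	"fmt"
	"strings"
)

// expandDoubleStar (hypothetical helper) rewrites the first "**" path segment
// in pattern into a list of plain glob patterns, replacing "**" with
// 0..maxDepth "*" segments. A pattern without a "**" segment is returned
// unchanged as a single-element list.
func expandDoubleStar(pattern string, maxDepth int) []string {
	if maxDepth < 0 {
		maxDepth = 0
	}
	segments := strings.Split(pattern, "/")

	// Locate the first segment that is exactly "**".
	idx := -1
	for i, s := range segments {
		if s == "**" {
			idx = i
			break
		}
	}
	if idx < 0 {
		return []string{pattern}
	}

	prefix, suffix := segments[:idx], segments[idx+1:]
	out := make([]string, 0, maxDepth+1)
	for depth := 0; depth <= maxDepth; depth++ {
		parts := make([]string, 0, len(segments)+depth)
		parts = append(parts, prefix...)
		for i := 0; i < depth; i++ {
			parts = append(parts, "*")
		}
		parts = append(parts, suffix...)
		out = append(out, strings.Join(parts, "/"))
	}
	return out
}

func main() {
	// A depth of 2 keeps the output short; the commit message mentions 8.
	for _, p := range expandDoubleStar("/opt/codedeploy-agent/deployment-root/**/logs/scripts.log", 2) {
		fmt.Println(p)
	}
}
```

Each resulting pattern contains only single * segments, so it can be handed to a standard glob matcher such as filepath.Glob. The depth bound keeps the number of generated patterns fixed and small.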

ruflin Apr 26, 2017

Collaborator

@nostrebor @zkf9971 @raiusa @runningman84 #3980 was just merged into master. It would be great if you could check whether it works for your use case. The snapshot builds can be found here: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/

Subtalime Nov 2, 2017

Am I understanding correctly, that this is intended not to work (Filebeat 5.1.1-1.x86_64)?
Prospector Pattern:
logs/app/archive/**/*.log
Dir:
logs/app/archive/dir1/test.log
logs/app/archive/dir1/dir2/another.log

ruflin Nov 6, 2017

Collaborator

@Subtalime This change is only in the 6.x releases. For further questions please use discuss.

athom added a commit to athom/beats that referenced this issue Jan 25, 2018

filebeat: expand double wildcards in prospector (elastic#3980)
Expand double wildcards into standard glob patterns, up to a maximum
depth of 8 levels after the wildcard.

Resolves elastic#2084

amomchilov pushed a commit to amomchilov/Filebeat that referenced this issue Apr 19, 2018

filebeat: expand double wildcards in prospector (#3980)
Expand double wildcards into standard glob patterns, up to a maximum
depth of 8 levels after the wildcard.

Resolves elastic/beats#2084

hsluoyz Sep 27, 2018

Hi @ruflin

This is related to elastic/filebeat#68 Wildcards for directories should already be supported. What is the behaviour you were expecting? Are you looking for ** which crawls all subdirectories (and is not supported at the moment)?

https://github.com/elastic/filebeat/issues/68 is 404 now. What is the latest link?

andrewkroh Sep 28, 2018

Member

That repo no longer exists, but you can read about glob support in the Filebeat documentation.

https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#input-paths
