add filter option ala 'grep -E' #216

Closed
geolaw opened this issue Oct 26, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

geolaw commented Oct 26, 2023

Summary

Would be great to be able to filter log items based on a regex

Current behavior

s4 will merge all lines

Suggested behavior

Using the '-E' option like on grep, pass s4 a regex to filter the logs on

Describe the feature change.

s4 -E 'one|two|three|four' log1 log2 log3 > merged

Will only result in lines from log1, log2 and log3 containing 'one', 'two', 'three' or 'four'

Other

I was planning on just post-processing the output with a pipe to grep -E, but then I realized, given the other RFE that I submitted (#215), that building this right into s4 would be ideal.

Author

geolaw commented Oct 26, 2023

RTFM :)

Maybe it's possible to do this via stdin?

The --help output shows:

Arguments:
  <PATHS>...  Path(s) of log files or directories.
              Directories will be recursed. Symlinks will be followed.
              Paths may also be passed via STDIN, one per line. The user must
              supply argument "-" to signify PATHS are available from STDIN.

So if I wanted to run grep -E 'one|two|three|four' on a series of input files (e.g. /path/to/log1 /path/to/log2 /path/to/log3) and then pass that on to s4, what would the syntax be?
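
Taken literally, running grep before s4 can only narrow down which files s4 reads, since the help text above describes STDIN as carrying paths, one per line. A minimal sketch of that reading, assuming `grep -l` (list matching file names) is the selection step; `matching_logs` is a hypothetical helper name, not part of s4:

```shell
# Hypothetical helper (not part of s4): print the path of every file that
# contains at least one line matching the extended regex.
matching_logs() {
    ml_pattern=$1
    shift
    # -l prints only names of files with a match; -E selects extended regex
    grep -lE -- "$ml_pattern" "$@"
}

# Usage (paths from the question; "-" tells s4 to read PATHS from STDIN):
#   matching_logs 'one|two|three|four' /path/to/log1 /path/to/log2 /path/to/log3 | s4 -
```

Note that this selects whole files, so s4 would still print every line of each selected file; filtering individual lines requires grep after s4.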

Owner

jtmoon79 commented Nov 4, 2023

Thanks for the suggestion @geolaw and the write-up with an example. That really helps :-)

> Using the '-E' option like on grep, pass s4 a regex to filter the logs on

This is where process piping should be used.
While I could add grep-like -E filtering, there are already implementations of grep (or Select-String in PowerShell) that will work just as well.
Additionally, while it might seem like building regular expression matching into s4 would be fast, I'm fairly certain it wouldn't be. Simply put, Unix has had very fast process pipelining from the start, and GNU grep (and BSD grep) are extremely fast. Whatever I write wouldn't be much better than piping to those trusty old grep programs.
It wouldn't surprise me if my implementation were slower; I'm using the Rust regex library, which is relatively new (and probably slow) compared to GNU grep (or BSD grep), which has been optimized over decades by many people (it's really fast).

> So if I wanted to run grep -E 'one|two|three|four' on a series of input files (e.g. /path/to/log1 /path/to/log2 /path/to/log3) and then pass that on to s4, what would the syntax be?

I think what you want is to pipe data from s4 to grep? Yes, you can pass a file listing via stdin (one path per line) or on the command-line and then pipe to grep, e.g.

$ s4 log1 log2 log3 | grep -E 'one|two|three|four' > merged

or

$ echo "log1
log2
log3" | s4 - | grep -E 'one|two|three|four' > merged
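
If typing the pipe each time is the annoyance, a small shell function can give roughly the requested `-E` ergonomics without changing s4. This is only a sketch: `s4grep` and the `S4_BIN` override are hypothetical names introduced here, not part of s4.

```shell
# Hypothetical wrapper (not part of s4): first argument is the extended regex,
# remaining arguments are passed to s4 unchanged.
# S4_BIN is an assumed override for the command to run; it defaults to "s4".
s4grep() {
    sg_pattern=$1
    shift
    # Merge with s4, then keep only the lines matching the regex
    "${S4_BIN:-s4}" "$@" | grep -E -- "$sg_pattern"
}

# Usage:
#   s4grep 'one|two|three|four' log1 log2 log3 > merged
```

Because grep works line by line, a log message that spans several lines would keep only its matching lines.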

jtmoon79 closed this as completed Nov 4, 2023
jtmoon79 added the enhancement (New feature or request) label Nov 4, 2023