Read files to scan from stdin to use find for excluding of files, folders and mount points #42

@beckerr-rzht

Description

It would be great if the files to be scanned could be read from stdin.
This would open up a whole new set of possibilities together with find.

Example:

find / -xdev -type f | java -jar log4j-detector-2021.12.16.jar --stdin

This would scan all files in the local root filesystem but omit /dev, /proc, etc. and all NFS mounts, because -xdev keeps find from descending into other mounted filesystems.

Using find, the following issues would be easy to solve: #11, #39 and #40.

Activity

zhurkin commented on Dec 17, 2021

@zhurkin

On large volumes, find just freezes. It would be better to add an explicit exclusion option to the program itself.

beckerr-rzht commented on Dec 17, 2021

@beckerr-rzht
Author

I haven't run into such problems with find; I only want to scan files on local filesystems.

For example, these are the find options I'm currently using:

find  / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) \
    -type f -print | java -jar log4j-detector-2021.12.17.jar --stdin
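
The prune list above grows quickly, so here is a minimal sketch of how it could be built from a list of filesystem types instead of being typed by hand. It assumes a find that supports -fstype (GNU find does); `list_local_files` is a hypothetical helper name and not part of log4j-detector.

```shell
# Hypothetical helper: print regular files under ROOT, pruning directories
# that live on any of the listed filesystem types.
# Usage: list_local_files ROOT [FSTYPE...]
list_local_files() {
    root=$1; shift
    if [ "$#" -eq 0 ]; then
        # Nothing to exclude: plain file listing
        find "$root" -type f -print
        return
    fi
    # Rebuild "$@" as: -fstype a -o -fstype b ...
    # (the for-loop word list is expanded once, before "set --" changes it)
    first=1
    for fs in "$@"; do
        if [ "$first" -eq 1 ]; then
            set -- -fstype "$fs"
            first=0
        else
            set -- "$@" -o -fstype "$fs"
        fi
    done
    find "$root" -type d \( "$@" \) -prune -o -type f -print
}
```

With the build from this thread it could be used as `list_local_files / autofs fuse.sshfs nfs proc sshfs sysfs tmpfs | java -jar log4j-detector-2021.12.17.jar --stdin`.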

beckerr-rzht commented on Dec 18, 2021

@beckerr-rzht
Author

A precompiled build of version 2021.12.17 with --stdin support is available here:
https://github.com/beckerr-rzht/log4j-detector/raw/master/log4j-detector-2021.12.17.jar

juergenhoetzel commented on Dec 19, 2021

@juergenhoetzel

You can build and execute command lines from standard input using xargs:

find / \( -type d \( -fstype autofs -o -fstype fuse.sshfs -o -fstype nfs -o -fstype proc -o -fstype sshfs -o -fstype sysfs -o -fstype tmpfs \) -prune -o -type f \) \
    -type f -name "*.jar" | xargs java -jar log4j-detector-2021.12.17.jar

beckerr-rzht commented on Dec 19, 2021

@beckerr-rzht
Author

Note the following when using xargs:
It can be slower when many files are passed, because the java process may have to be started several times.

In the worst case, xargs may use only 4096 bytes for arguments and environment variables together. The environment of root takes up around 2000 bytes (depending on operating system and configuration).
A "medium" installation of Ubuntu Desktop has about 400000 files.

Assuming an average path length of roughly 50 bytes, so that about 40 paths fit into one invocation, this results in the following comparison:

  • with --stdin the java process is started exactly once.
  • without --stdin xargs starts the java process about 10000 times.

Of course, this is only the worst case, which should rarely occur.
The actual limits on a particular system are reported by xargs --show-limits.
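
The worst-case estimate above can be reproduced with a few lines of shell arithmetic. This is only a rough sketch: the average path length of 52 bytes is an assumption chosen to match the "about 10000" figure, `estimate_xargs_runs` is a hypothetical helper name, and real xargs also accounts for the command name and some per-argument overhead.

```shell
# Estimate how many times xargs must start the command.
# Usage: estimate_xargs_runs LIMIT_BYTES ENV_BYTES FILE_COUNT AVG_PATH_BYTES
estimate_xargs_runs() {
    limit=$1 env_size=$2 files=$3 avg_len=$4
    budget=$((limit - env_size))      # bytes left for arguments
    per_run=$((budget / avg_len))     # whole paths that fit into one call
    if [ "$per_run" -lt 1 ]; then
        echo "no room for even one path" >&2
        return 1
    fi
    echo $(( (files + per_run - 1) / per_run ))   # round up
}

# Worst case from this comment: 4096-byte limit, 2000-byte environment,
# 400000 files, assumed 52 bytes per path
estimate_xargs_runs 4096 2000 400000 52   # prints 10000
```

Replacing 4096 with the output of `getconf ARG_MAX` gives a figure closer to what a given system would actually do.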

However, xargs does have one advantage:
the -P option runs several processes in parallel.
For example:

find / -xdev -type f | xargs -rn100 -P8 java -jar log4j-detector-2021.12.17.jar

... will run up to 8 processes scanning in parallel. Here -r prevents the command from being started when there is no input, and -n100 passes at most 100 arguments per invocation.

Provided you have enough CPU, this could speed up the detector scan.
However, in such cases GNU parallel should be preferred, because it is much more flexible.
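
File names containing spaces or newlines would be split into several arguments by the pipeline above. A minimal sketch of a more robust variant, assuming an xargs that supports -0, -r and -P (GNU xargs does); `run_batched` is a hypothetical wrapper name, and the input must be NUL-delimited, e.g. from find -print0.

```shell
# Hypothetical helper: read NUL-delimited paths on stdin and run CMD on them
# in batches. Usage: run_batched BATCH_SIZE JOBS CMD [ARG...]
run_batched() {
    batch=$1
    jobs=$2
    shift 2
    # -0: split input on NUL bytes, so spaces and newlines in names survive
    # -r: do not run CMD at all when the input is empty
    # -n: pass at most BATCH_SIZE paths per invocation
    # -P: run up to JOBS invocations at the same time
    xargs -0 -r -n "$batch" -P "$jobs" "$@"
}
```

For the scan discussed here this would be called as `find / -xdev -type f -print0 | run_batched 100 8 java -jar log4j-detector-2021.12.17.jar`.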

Regardless, I hope that my pull request #43 will be accepted.

changed the title from "Read file to scan from stdin" to "Read files to scan from stdin to use `find` for excluding of files, folders and mount points" on Dec 19, 2021

juliusmusseau commented on Dec 20, 2021

@juliusmusseau
Contributor

I did this in my own way. See v2021.12.20 which adds a new --stdin flag.

        Read files to scan from stdin to use `find` for excluding of files, folders and mount points · Issue #42 · mergebase/log4j-detector