-
Notifications
You must be signed in to change notification settings - Fork 66
[ML] Add system call restrictions to the ML processes #98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Yes, I think also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could imagine this being info as it might be useful for customers to know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. Upgrading this would mean customers running on CentOS 6 would be spammed every time an ML process starts up. And normalize
can run pretty frequently...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It started as info level but as Dave points out every time normalize starts the message is logged and this can spam the log files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we could log this message in autodetect main only. It is a useful message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember ES itself will have logged this once at startup. Logging in autodetect will still log 100 times if 100 jobs are started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider using boost for that, it handles some more cases afaik and can be easily replaced with std once we switch to C++17
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean is there a boost version of getenv? I'm using the temp dir set by elasticsearch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean boost::filesystem::temp_directory_path()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider boost::filesystem::unique_path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mkstemps
is the safest way to do this as it avoids a race condition between creating and opening the file. I pushed a change to do this, unfortunately it uses strlcpy
and strlcat
and file descriptors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left quite a few comments about low level things but the fundamental structure of this is exactly right in my opinion.
One more thing is that some of the formatting looks a little different to what clang-format would choose, so please run dev-tools/clang-format.sh
over the whole codebase.
include/seccomp/CSystemCallFilter.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use a static method like installSystemCallFilter()
rather than the constructor. Using the constructor implies an RAII idiom where the destructor reverses what the constructor has done. But that's impossible for system call filters.
Similar classes would be CCrashHandler
and CProcessPriority
and they use a static method rather than the constructor.
include/seccomp/CSystemCallFilter.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could say the intention is to prevent starting new processes and, where possible, preventing network access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything from here down to the end of canUseSeccompBpf()
should be in an unnamed namespace to enforce internal linkage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: kernal -> kernel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this and the two below should be const std::uint32_t
(plus #include <cstdint>
). The struct member they're used with is explicitly 32 bits, and whilst unsigned int
is 32 bits on all the compilers we're currently using it's best practice to specify the size exactly where the consuming code does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: kernals -> kernels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: double //
lib/seccomp/unittest/Makefile
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need this as lib/core
has it and only lib/core
is using the Boost regex lib directly. (If you do need it then it's a sign something isn't working as expected.)
lib/seccomp/unittest/Makefile
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency the linker flags should be added here too, i.e. LDFLAGS:=$(ML_SECCOMP_LDFLAGS)
mk/stdcpptest.mk
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're not linking the seccomp library to every unit test program I think the linker flags should also be added just in the one unit test program that tests it.
docs/CHANGELOG.asciidoc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
proceses -> processes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
free after return? But goes away with boost anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few more minor comments/suggestions.
docs/CHANGELOG.asciidoc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think the official name is "macOS" with a captial S at the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary to declare variables static
within an unnamed namespace - simply being inside the unnamed namespace gives them internal linkage. (I can see a few files where I've unnecessarily added static
in this situation in the past, so I'll remove those in a follow-up.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nicer if the 3
was a symbolic constant (e.g. FILE_NAME_TEMPLATE_SUFFIX_LEN
) defined adjacent to FILE_NAME_TEMPLATE
. Then anyone changing FILE_NAME_TEMPLATE
would be more likely to see they also needed to change the 3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could avoid the C-style string functions by doing something like this:
std::string profileFilename = getTempDir() + FILE_NAME_TEMPLATE;
int fd = mkstemps(&profileFilename[0], FILE_NAME_TEMPLATE_SUFFIX_LEN);
Also, the documentation for mkstemps()
states that it can fail and set errno
, so maybe log the error and return an empty string in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that safe? Is &profileFilename[0]
null terminated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is in C++11 and above, but not previous standards.
https://stackoverflow.com/a/11752722 summarizes the situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be SCheckedHandle
as our structs begin with S
include/seccomp/CSystemCallFilter.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use a static method like installSystemCallFilter()
rather than the constructor. Using the constructor implies an RAII idiom where the destructor reverses what the constructor has done. But that's impossible for system call filters.
CCrashHandler
and CProcessPriority
are conceptually similar and they use a static method rather than the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making all those changes.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To have internal linkage these 4 const char*
constants should really be const char* const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make this a const std::string
, as that's what we usually use for string constants unless there's some horrible circular dependency in linkage or static initialization.
But if you stick with a const char*
then it really should be const char* const
to result in internal linkage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit that inverts the BPF approach on linux so instead of listing banned syscalls I list all the ones that can be used. This has the advantage that we don't have to think of every syscall that should be blocked and covers new functions (e.g.
To get the full list I change to filter to return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked over the Linux sections. 👍
Restrict the ability of the autodetect, autoconfig, categorize and normalize programs to make system calls. Implemented for Linux (Seccomp BPF), macOS (sandbox) and Windows (Job Groups)
Restrict the ability of the autodetect, autoconfig, categorize and normalize programs to make system calls. Implemented for Linux (Seccomp BPF), macOS (sandbox) and Windows (Job Groups)
Add system call restrictions to the ML processes (#98) Restrict the ability of the autodetect, autoconfig, categorize and normalize programs to make system calls. Implemented for Linux (Seccomp BPF), macOS (sandbox) and Windows (Job Groups) * [ML] Add comment explaining changes required if Linux BPF are modified.
The
autodetect
,autoconfig
,categorize
andnormalize
programs all require the ability to create a named pipe (mkfifo/CreateNamedPipe) open a connection to a named pipe and read & write to those pipes. All other system calls which are not required such as fork/exec should be restricted using platform specific APIs.The code here is similar to SystemCallFilter.java.
LibSeccomp
This PR adds a new static library
libseccomp
containing a single classCSystemCallFilter
the library must be linked in the 4 main programs listed above.Due to static linkage remember to
make relink
the unit test and binaries after making a change.macOS
macOS has the most configurable sandboxing functionality the approach here is to blacklist everything by default but allow
read*
andwrite*
for open, mkfifo, read and write. This imposes harder limits than the SystemCallFilter.java which only denies fork and exec.Linux
Seccomp BPF was added to
prctl()
in kernal 3.5, no filters will be installed in older kernals. The dedicatedseccomp()
function was added in kernal 3.17 and provides the same the functionality asprctl()
but with the SECCOMP_FILTER_FLAG_TSYNC option that is not required here if the filters are installed by the main thread before any other threads are spawned. Because of this we use the older more compatibleprctl()
function.SystemCallFilter.java tries to use
seccomp()
and falls back toprctl()
Windows
The Windows implementation uses Job Objects to prevent the process spawning another.
Release note: Secures the machine learning processes by preventing system calls such as fork
and exec. The Linux implementation uses Seccomp BPF (secure computing with
Berkeley Packet Filters) to intercept system calls and is available in kernels
since 3.5. On Windows, Job Objects prevent new processes being created and macOS
uses the sandbox functionality