Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StandardInput=data, StandardInput=file:... and more #7198

Merged
merged 23 commits into from
Nov 19, 2017

Conversation

poettering
Copy link
Member

Let's substantially beef up how stdin/stdout/stderr can be configured. Specifically, this adds

StandardInput=data

for configuring literal data to pass into stdin of an invoked service. The data itself is configured via StandardInputData= and StandardInputText=.

This also adds

StandardInput=file:/some/path
StandardOutput=file:/some/path
StandardError=file:/some/path

for connecting service stdin/stdout/stderr to some file (or in fact AF_UNIX socket) in the file system

Fixes #3991

@yuwata
Copy link
Member

yuwata commented Oct 28, 2017

The CIs xenial-i386 and xenial-s390x are failed in test-execute (exec-stdin-data.service):

exec-stdin-data.service: Passing 0 fds to service
exec-stdin-data.service: About to execute: /bin/sh -x -c 'd=$$(mktemp -d -p /tmp); echo -e "this is a test
and this is more
something encoded!
exec-stdin-data.service: Executing: /bin/sh -x -c 'd=$(mktemp -d -p /tmp); echo -e "this is a test
and this is more
something encoded!
something   in multiple lines
something   in multiple lines
and some more
and a more bas64 data
something with strange
embedded	characters
and something with a exec-stdin-data.service specifier" > $d/text ; cmp $d/text'
exec-stdin-data.service: Forked /bin/sh as 20534
exec-stdin-data.service: Changed dead -> start
and some more
and a more bas64 data
something with strange
embedded	characters
and something with a exec-stdin-data.service specifier" > $d/text ; cmp $d/text'
+ mktemp -d -p /tmp
+ d=/tmp/tmp.vCko762m1Y
+ echo -e this is a test
and this is more
something encoded!
something   in multiple lines
and some more
and a more bas64 data
something with strange
embedded	characters
and something with a exec-stdin-data.service specifier
+ cmp /tmp/tmp.vCko762m1Y/text
/tmp/tmp.vCko762m1Y/text - differ: byte 1, line 1
(snip)
Received SIGCHLD from PID 20534 (sh).
Child 20534 (sh) died (code=exited, status=1/FAILURE)
exec-stdin-data.service: Child 20534 belongs to exec-stdin-data.service
exec-stdin-data.service: Main process exited, code=exited, status=1/FAILURE
exec-stdin-data.service: Failed with result 'exit-code'.
exec-stdin-data.service: Changed start -> failed
exec-stdin-data.service: Unit entered failed state.

@yuwata yuwata added the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Oct 28, 2017
@yuwata
Copy link
Member

yuwata commented Oct 28, 2017

Also, the CI xenial-amd64 is also failed with the same error.

Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the proposed semantics are very good. The way StandardInputData and StandardInputText can be combined is flexible yet clear. I find no issues in the implementation. The docs could use some tweaks.

sz = strlen(resolved);
if (c->stdin_data_size + sz + 1 < c->stdin_data_size || /* check for overflow */
c->stdin_data_size + sz + 1 > EXEC_STDIN_DATA_MAX) {
log_syntax(unit, LOG_ERR, filename, line, r, "Standard input data too large (%zu), maximum of %zu permitted, ignoring.", c->stdin_data_size + sz, (size_t) EXEC_STDIN_DATA_MAX);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we must make any errors in input data fatal (either in unescaping, or in specifier expansion, or in size). Running the service with invalid input data is potentially very dangerous. (Imagine the case of this being used for a partition reformatter, and a part of the sfdisk input being lost because an invalid specifier was used for a partition uuid, yikes).


delete_chars(cleaned, WHITESPACE);

r = unbase64mem(cleaned, (size_t) -1, &decoded, &sz);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to add a mode to unbase64mem where it ignores white space? Copying the data just to ignore spaces seems a bit over the top.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but that'd be some non-trivial refactoring (because the code as it is calculates the destination index of the decoding from the source index, which doesn't work if whitespace is in the stream), which I didn't also want to cram into this PR too. It's already hard to digest with the assorted stuff it contains.

Other uses of unbase64mem also delete the whitespace first, hence this is also means reworking a lot of other callers. but yes, we should do that, but i figure in a later commit

if (n < 0)
return log_oom();

print_prop(name, "%s", h);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, it would be nice to be smart about this, and use text mode if the data is printable. Let's leave that question for later though.

Description=Test for StandardInputText= and StandardInputData=

[Service]
ExecStart=/bin/sh -x -c 'd=$$(mktemp -d -p /tmp); echo -e "this is a test\nand this is more\nsomething encoded!\nsomething in multiple lines\nand some more\nand a more bas64 data\nsomething with strange\nembedded\tcharacters\nand something with a exec-stdin-data.service specifier" > $d/text ; cmp $d/text'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tempory directory $d should be deleted.


r = unit_full_printf(u, n, &resolved);
if (r < 0) {
log_syntax(unit, LOG_ERR, filename, line, r, "Failed to resolve unit specifiers on %s, ignoring: %m", n);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think we should ignore the failure here. Not strictly backwards compatible, but I'd make this fatal.


<para><varname>StandardInputText=</varname> accepts arbitrary textual data. C-style escapes for special
characters as well as the usual <literal>%</literal>-specifiers are resolved. Each time this setting is used a
new line is appended to the per-unit data buffer, followed by a newline character. Note that leading and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new line ... newline character (either/or).

characters as well as the usual <literal>%</literal>-specifiers are resolved. Each time this setting is used a
new line is appended to the per-unit data buffer, followed by a newline character. Note that leading and
trailing whitespace of lines configured with this option is removed. If an empty line is specified the per-unit
data buffer is cleared (hence, in order to insert an empty line, add an additional <literal>\n</literal> to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just say "If an empty line is specified, the list is reset (hence ...)", following the usual style.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it's not a list, but a buffer, and I think "clear" is more appropriate for buffers, than "reset"? dunno? anyway, i dropped the "per-unit data" bit now


<para><varname>StandardInputData=</varname> accepts arbitrary binary data, encoded in <ulink
url="https://tools.ietf.org/html/rfc2045#section-6.8">Base64</ulink>. No escape sequences or specifiers are
resolved. Any whitespace is ignored during decoding. Each time this setting is used the decoded binary data is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This last sentence can be deleted, the next paragraph says the same thing.

dSwgZGVuayBpY2ssIGljayBkZW5rIG5hbnUhCkpldHogaXNzZSB1ZmYsIGVyc2NodCB3YXIgc2Ug
enUhCkljayBqZWhlIHJhdXMgdW5kIGJsaWNrZSDigJQKdW5kIHdlciBzdGVodCBkcmF1w59lbj8g
SWNrZSEK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, you talk about backslashes, but there are no backslashes here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems they got lost during my copy/paste orgies...;-)

TODO Outdated
@@ -44,6 +44,12 @@ Features:
taken if multiple dirs are configured. Maybe avoid setting the env vars in
that case?

* introduce SuccessAction= that permits shutting down the system when a service
suceeds. This is useful to replace "ExecPost=/usr/bin/systemctl poweroff" and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

succeeds.

Copy link
Member

@lucaswerkmeister lucaswerkmeister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments (no full review)


<listitem><para>Configures arbitrary textual or binary data to pass via file descriptor 0 (STDIN) to the
executed processes. These settings have no effect unless <varname>StandardInput=</varname> is set to
<option>data</option>. Use this option to embedd process input data directly in the unit file.</para>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

embedd

@@ -638,8 +638,6 @@ int bus_append_unit_property_assignment(sd_bus_message *m, const char *assignmen

Copy link
Member

@lucaswerkmeister lucaswerkmeister Oct 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment on the commit message: the client should not be altered by that, I assume? (edit: this refers to commit e305edd)

@poettering
Copy link
Member Author

I force pushed a new version now, with the issues @keszybz and @lucaswerkmeister pointed out addressed (except for the base64 rework, as mentioned above).

Also, I reworked the acquire_data_fd code a bit (removed a couple of redundant seeks, unset O_NONBLOCK if the fd refers to a pipe, and some other things). I am not entirely sure this fixes anything regarding the failed CI test @yuwata pointed out, but I'd like to see the CI results for this version before trying to track down the actual problem on this one.

@poettering poettering force-pushed the stdin-stdout branch 2 times, most recently from 33b7f2d to 35d83e4 Compare November 13, 2017 11:27
@poettering poettering removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Nov 13, 2017
@poettering poettering force-pushed the stdin-stdout branch 3 times, most recently from e657d95 to 3e14211 Compare November 15, 2017 12:07
@poettering poettering force-pushed the stdin-stdout branch 2 times, most recently from 4ee0114 to a6a7d38 Compare November 16, 2017 13:37
@keszybz
Copy link
Member

keszybz commented Nov 16, 2017

Ha, systemd-run --user -p SystemCallFilter=~memfd_create:ENOSYS strace -ememfd_create $PWD/build/test-fd-util is absolutely great for testing. Thanks @yuwata.

@poettering
Copy link
Member Author

Ah, neat... Now, if I wasn't as lazy as I am I'd use that as hint to prep a functional test for the test/TEST-XY/ suite doing what you just did, after all it's a really neat functional test that tests a lot of different bits and pieces all at the same time

return r;

try_dev_shm_without_o_tmpfile:
fd = mkostemp_safe(pattern);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I force falling back to this code, test-fd-util fails:

systemd-run --user -p SystemCallFilter='~memfd_create:ENOSYS' strace `pwd`/build/test-fd-util
Nov 16 15:13:59 strace[24961]: memfd_create("test", MFD_CLOEXEC)       = -1 ENOSYS (Function not implemented)
Nov 16 15:13:59 strace[24961]: openat(AT_FDCWD, "/tmp", O_RDWR|O_EXCL|O_DIRECTORY|O_CLOEXEC|O_TMPFILE, 0600) = 3
Nov 16 15:13:59 strace[24961]: write(3, "test\n", 5)                   = 5
Nov 16 15:13:59 strace[24961]: close(3)                                = 0
Nov 16 15:13:59 strace[24961]: memfd_create("data-fd", MFD_CLOEXEC|MFD_ALLOW_SEALING) = -1 ENOSYS (Function not implemented)
Nov 16 15:13:59 strace[24961]: openat(AT_FDCWD, "/dev/shm", O_RDWR|O_DIRECTORY|O_CLOEXEC|O_TMPFILE, 000) = 3
Nov 16 15:13:59 strace[24961]: write(3, "foo", 3)                      = 3
Nov 16 15:13:59 strace[24961]: openat(AT_FDCWD, "/proc/self/fd/3", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
Nov 16 15:13:59 strace[24961]: close(3)                                = 0
Nov 16 15:13:59 strace[24961]: writev(2, [{iov_base="Assertion 'fd >= 0' failed at .."..., iov_len=104}, {iov_base="\n", iov_len=1}], 2Assertion 'fd >= 0' failed at ../src/test/test-fd-util.c:115, function test_acquire_data_fd(). Aborting.
Nov 16 15:13:59 strace[24961]: ) = 105
Nov 16 15:13:59 strace[24961]: rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
Nov 16 15:13:59 strace[24961]: rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
Nov 16 15:13:59 strace[24961]: getpid()                                = 24964
Nov 16 15:13:59 strace[24961]: gettid()                                = 24964
Nov 16 15:13:59 strace[24961]: tgkill(24964, 24964, SIGABRT)           = 0
Nov 16 15:13:59 strace[24961]: rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
Nov 16 15:13:59 strace[24961]: --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=24964, si_uid=1000} ---
Nov 16 15:13:59 strace[24961]: +++ killed by SIGABRT (core dumped) +++
Nov 16 15:13:59 systemd-coredump[24982]: Process 24964 (test-fd-util) of user 1000 dumped core.
                                         
                                         Stack trace of thread 24964:
                                         #0  0x00007f44a1b9c69b raise (libc.so.6)
                                         #1  0x00007f44a1b9e3b1 abort (libc.so.6)
                                         #2  0x00007f44a201b497 n/a (/home/zbyszek/src/systemd-work/build/src/shared/libsystemd-shared-235.so)
                                         #3  0x0000559a3a7d583a n/a (/home/zbyszek/src/systemd-work/build/test-fd-util)
                                         #4  0x0000559a3a7d5b6b n/a (/home/zbyszek/src/systemd-work/build/test-fd-util)
                                         #5  0x00007f44a1b8603a __libc_start_main (libc.so.6)
                                         #6  0x0000559a3a7d4caa n/a (/home/zbyszek/src/systemd-work/build/test-fd-util)
Nov 16 15:13:59 systemd-coredump[25011]: Process 24961 (strace) of user 1000 dumped core.

I'm not sure if this is somehow caused by the way I run the code or what.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, /proc/self/fd/3 is a symlink like /dev/shm/#3865430 (deleted), so it cannot be re-opened.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe acquire_data_fd should be structured as a series of functions, so that it's actually possible to test the fallback code. Another option would be to install a seccomp filter to force the fallback code, but that's only easy with memfd_create. Either way, I think we need to have a way to automatically test the paths on different kernels.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/proc/self/fd/3 is not a real symlink, it's magic. In my tests I had no problem to reopen deleted files like that. My own tests where done by adding a "goto" to the right label at the beginning of the fucntion and recompiling...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I did too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird stuff. will look into it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah! it appears to be a perms thing! i.e. i specify the mode 0000 on the fd and unprivileged processes can't reopen it hence

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat. a one byte fix... 0000 → 0500 and it works fine as unprivileged user too

@poettering
Copy link
Member Author

Force pushed a new version now. Two changes: fixed the O_TMPFILE file mask, so that reopening works always. And I beefed up the test for that code, so that we test all relevant codepaths individually (for that I added a flags parameter to acquire_data_fd(), that can be used to disabled specific fd types...

Thanks a million for testing this so thoroughly and tracking this down! Much appreciated!

PTAL!

Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks almost good.

strncpy(sa.un.sun_path, path, sizeof(sa.un.sun_path));
if (connect(fd, &sa.sa, SOCKADDR_UN_LEN(sa.un)) < 0) {
safe_close(fd);
return errno == EINVAL ? -ENXIO : -errno; /* Propagate initial error if we get ENXIO, i.e. we have
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment talk about ENXIO, but the code looks for EINVAL.

if (errno != ENXIO) /* ENXIO is returned when we try to open() an AF_UNIX file system socket on Linux */
return -errno;
if (strlen(path) > sizeof(sa.un.sun_path)) /* Too long, can't be a UNIX socket */
return -errno;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd write this as return -ENXIO, just for clarity.

r = unit_full_printf(u, n, &resolved);
if (r < 0) {
log_syntax(unit, LOG_ERR, filename, line, r, "Failed to resolve unit specifiers on %s, ignoring: %m", n);
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return log_syntax, below too.

@poettering
Copy link
Member Author

Thanks for the review! Pushed another version, with all points you raised adressed. PTAL!

We are using the same pattern at various places: call dup2() on an fd,
and close the old fd, usually in combination with some O_CLOEXEC
fiddling. Let's add a little helper for this, and port a few obvious
cases over.
Given that Linux assigns the same ioctl numbers ot multiple subsystems,
we should be careful when invoking ioctls, so that we don't end up
calling something we wouldn't want to call.
Mostly coding style fixes, but most importantly, initialize
c->std_input only after we know the free_and_strdup() invocation
succeeded, so that we don't leave half-initialized fields around on
failure.
All this function does is place some data in an in-memory read-only fd,
that may be read back to get the original data back.

Doing this in a way that works everywhere, given the different kernels
we support as well as different privilege levels is surprisingly
complex.
…INVAL

This is not only more technically correct, but also shortens our code
quite a bit.
If the string length is specified as (size_t) -1, let's use that as
indicator for determining the length on our own. This makes it
slightlier shorter to invoke these APIs for a very common case.

Also, do some minor other coding style updates, and add assert()s here
and there.
We do a logic like that at various other places, let's do it here too,
to make this as little surprising as possible.
Let's not mix function calls and variable declarations, as well as
assignments and comparison in one expression.
…putText=

Both permit configuring data to pass through STDIN to an invoked
process. StandardInputText= accepts a line of text (possibly with
embedded C-style escapes as well as unit specifiers), which is appended
to the buffer to pass as stdin, followed by a single newline.
StandardInputData= is similar, but accepts arbitrary base64 encoded
data, and will not resolve specifiers or C-style escapes, nor append
newlines.

This may be used to pass input/configuration data to services, directly
in-line from unit files, either in a cooked or in a more raw format.
Whether seccomp is supported or not is a server implementation detail,
the client should not be altered by that, and clients should be able to talk
to servers configured differently than the client, hence drop the
HAVE_SECCOMP ifdeffery here.

(This would be different if we'd need libseccomp or so to implement the
client, but we don't)
It's the flags parameter we propagate here, not the mode parameter,
hence let's name it properly, and use the right type.
…() into one

property_get_output_fdname() already had two different control flows for
stdout and stderr, it might as well handle stdin too, thus shortening
our code a bit.
In some cases we checked for fd validity already explicitly, let's do
this for all our fds.
… friends

Let's make sure to process the fdname first, before changing the actual
input/output setting, since the fdname part can fail due to OOM.

This way we don't leave half-initialized bits around.
These new settings permit specifiying arbitrary paths as
stdin/stdout/stderr locations. We try to open/create them as necessary.
Some special magic is applied:

1) if the same path is specified for both input and output/stderr, we'll
   open it only once O_RDWR, and duplicate them fd instead.

2) If we an AF_UNIX socket path is specified, we'll connect() to it,
   rather than open() it. This allows invoking systemd services with
   stdin/stdout/stderr connected to arbitrary foreign service sockets.

Fixes: systemd#3991
This was missing for using fdnames as stdio, let's add support for
fdnames as well as file paths in one go.
These three settings only make sense within the context of actual unit
files, hence filter this out when applied to the per-manager default,
and generate a log message about it.
Already, path_is_safe() refused paths container the "." dir. Doing that
isn't strictly necessary to be "safe" by most definitions of the word.
But it is necessary in order to consider a path "normalized". Hence,
"path_is_safe()" is slightly misleading a name, but
"path_is_normalize()" is more descriptive, hence let's rename things
accordingly.

No functional changes.
@keszybz keszybz added the good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed label Nov 17, 2017
@poettering
Copy link
Member Author

I'll take the liberty to merge this now, even though the Ubuntu CI hasn't fully finished yet. But it has been 3days now, and at least the s390 CI did finish successfully, hence I think we should be good, given that s390 is traditionally at least the most picky one...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed pid1
Development

Successfully merging this pull request may close these issues.

None yet

4 participants