Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small improvements for systemd unit file and install paths #1037

Merged
merged 8 commits into from Apr 12, 2017

Conversation

Projects
None yet
4 participants
@grondo
Copy link
Contributor

grondo commented Apr 11, 2017

A few small tweaks here to improve the default systemd service unit file included with flux.

  • Use AC variable "runstatedir" instead of $localstatedir/run for both "default" (i.e. system) FLUX_URI and the broker.rundir passed to the systemd initiated flux broker.
  • If --with-systemdsystemunitdir= is not explicitly used, always prepend ${prefix} to this PATH, so that the unit file is not installed outside of the chosen $prefix. This could get us into trouble if systemd pkg-config file already has $prefix in the systemdsystemunitdir variable, but I don't think this is an issue in practice.
  • Don't install systemd unit file with executable bits
  • Add Delegate=yes to systemd unit to delegate cgroup for flux.service to flux user.

When installing flux-core to /usr/local on your desktop for example, the service unit file will not be found by systemd by default. Make a link from /etc/systemd/systemd/flux.service -> /usr/local/lib/systemd/system/flux.service and that should activate the service.

Someone should give this a try and make sure nothing breaks on at least one other system.

Fixes #1024
Fixes #1033

grondo added some commits Apr 11, 2017

build: use runstatedir instead of ${localstatedir}/run
$runstatedir is a better fit for the use here, and using a separate
variable allows packagers to easily specify --runstatedir=/run for
systemd systems, instead of requiring

  --localstatedir=/

which could have unintended consequences.
systemd: set rundir to $runstatedir in flux.service
For systemd unit file, use the value of runstatedir for the
broker.rundir attribute, instead of using systemd's idea of the
runtime directory with `%t`.

This ensures that the "default" FLUX_URI calculated by the installed
flux binary agrees with the actual flux session started by systemd.

This also allows *both* the rundir used for the systemd initiated
flux instance, and the one used for the default URI calculation to
be set by a single option at configure time, thus they will never
be out of sync., e.g.

 ./configure --runstatedir=/run ...

for a more standard systemd install.
build: install systemd unit file as DATA not SCRIPTS
Change target name for systemdsystemunit_SCRIPTS to
systemdsystemunit_DATA -- systemd doesn't like it when the unit
files are executable.
systemd: Add Delegate=yes to flux.service unit file
Add Delegate=yes to flux Service so that cgroup control is by
default delegated to the flux user, and no longer under systemd
control.
build: don't install systemd unit file outside of prefix
When --with-systemdsystemunitdir is not set explicitly, prepend
the pkg-config systemdsystemunitdir with ${prefix}, so that the
unit file is not installed outside of --prefix.

Distro packages should therefore use --with-systemdsystemunitdir
explicitly, in the case where the unit file directory *is* outside
of prefix (e.g. /lib/systemd/system when --prefix=/usr).
@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 11, 2017

Coverage Status

Coverage decreased (-0.01%) to 78.203% when pulling cf3fc77 on grondo:systemd-tweaks into 6fbe380 on flux-framework:master.

1 similar comment
@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 11, 2017

Coverage Status

Coverage decreased (-0.01%) to 78.203% when pulling cf3fc77 on grondo:systemd-tweaks into 6fbe380 on flux-framework:master.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Apr 11, 2017

Codecov Report

Merging #1037 into master will increase coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1037      +/-   ##
==========================================
+ Coverage   77.92%   77.94%   +0.02%     
==========================================
  Files         150      150              
  Lines       25494    25603     +109     
==========================================
+ Hits        19865    19957      +92     
- Misses       5629     5646      +17
Impacted Files Coverage Δ
src/common/libflux/info.c 76.47% <0%> (-2.64%) ⬇️
src/common/libflux/response.c 83.76% <0%> (-0.86%) ⬇️
src/modules/kvs/cache.c 93.49% <0%> (-0.72%) ⬇️
src/common/libutil/nodeset.c 83.64% <0%> (-0.33%) ⬇️
src/common/libflux/message.c 80.79% <0%> (-0.24%) ⬇️
src/common/libflux/dispatch.c 83.19% <0%> (-0.23%) ⬇️
src/cmd/builtin/nodeset.c 4.91% <0%> (-0.09%) ⬇️
src/modules/kvs/kvs.c 80.88% <0%> (+0.01%) ⬆️
src/broker/broker.c 72.7% <0%> (+0.01%) ⬆️
src/common/libutil/veb.c 98.31% <0%> (+0.01%) ⬆️
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6fbe380...069208f. Read the comment docs.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Apr 12, 2017

Just removed all vestiges of old flux and systemd unit files from my Ubuntu 16.04.2 LTS box and installed your branch with

./configure
make -j16
sudo make install DESTDIR=/
(verified that default rundir is /usr/local/var/run/flux)
sudo ln -s  /usr/local/lib/systemd/system/flux.service /etc/systemd/system/flux.service

then

sudo systemctl start flux
sudo journalctl | tail -100
Apr 11 17:22:07 jimbo systemd[1]: Started Flux message broker.
Apr 11 17:22:07 jimbo flux[10805]: flux-broker: could not initialize rundir: No such file or directory
Apr 11 17:22:07 jimbo flux[10805]: flux-broker: bootstrap failed
Apr 11 17:22:07 jimbo flux[10805]: E: (flux-broker) 17-04-11 17:22:07 dangling 'PAIR' socket created at src/zsys.c:379
Apr 11 17:22:07 jimbo flux[10805]: E: (flux-broker) 17-04-11 17:22:07 dangling 'PAIR' socket created at src/zsys.c:380
Apr 11 17:22:07 jimbo flux[10805]: E: (flux-broker) 17-04-11 17:22:07 dangling 'REP' socket created at src/zauth.c:77
Apr 11 17:22:07 jimbo flux[10805]: E: (flux-broker) 17-04-11 17:22:07 dangling sockets: cannot terminate ZMQ safely
Apr 11 17:22:07 jimbo systemd[1]: flux.service: Main process exited, code=exited, status=1/FAILURE
Apr 11 17:22:07 jimbo systemd[1]: flux.service: Unit entered failed state.
Apr 11 17:22:07 jimbo systemd[1]: flux.service: Failed with result 'exit-code'.

Oh, strange: /usr/local/var was there before, but it is gone now. Will pick this up again tomorrow, have to run.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Apr 12, 2017

Oops, I think I may have been confused about /usr/local/var being there before (sorry). I ran a mkdir -p /usr/local/var/run/flux, fixed up permissions so user flux could write to it, and systemctl start flux worked just fine.

Is the idea that a user would need to configure with --runstatedir=run in order for systemd to take charge of this directory? (As opposed to maybe using ExecStartPre with a configured path)

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Apr 12, 2017

Is the idea that a user would need to configure with --runstatedir=run in order for systemd to take charge of this directory? (As opposed to maybe using ExecStartPre with a configured path)

Yes, and (see comment in 8252dac) it is was the only sane way to allow package maintainers to set both the default broker.rundir and the default FLUX_URI with a single ./configure option so that they don't get out of sync. (So these issues are only for make install use case, all packages should "just work" (I think)).

That being said, it would be ideal to use /run even for the make install to /usr/local case, since then systemd can properly create and remove the ephemeral directory as designed using RuntimeDirectory= (this setting cannot be absolute path and is always relative to /run). I shied away from this because it didn't seem at ./configure time you could be sure what the runtime directory of systemd would be (i.e. what %t will expand to for system units). However, if we're guaranteed that this will always be /run -- this would be an easy fix, or if we use a search path for default FLUX_URI we could perhaps search common locations when FLUX_URI is not in environment?

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Apr 12, 2017

For completeness, two other issues came up in hallway discussion:

  • It would be nice if we had a way to differentiate single-user and multi-user instance via command line options passed to flux-start and/or flux-broker. The systemd unit file could enable multi-user by default and rc scripts could use this to enable multi-user behavior (e.g. userdb loaded with default-rolemask, etc.) (@garlick could a new "runlevel" be used for this?)

  • If we do move default rundir to /run, this is a tmpfs and currently content backing store is under broker.rundir. This could be a problem for a long-running system instance, so systemd unit file should direct this somewhere else probably.

Let me know if the first bullet above should be a new issue.

build: default runstatedir to /run
If --runstatedir is not explicitly set on ./configure command line,
then use a default runstatedir of `/run`, which will work better
for systemd managed systems.
@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Apr 12, 2017

Ok, just pushed a commit that uses a default $runstatedir of /run when --runstatedir isn't explicitly used on the command line. I think this should be safe, and will work with ./configure && make install as well as in the packaging case(s).

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Apr 12, 2017

I'm good with hardwiring /run. We can always change it later if it proves to be a bad choice.

I like the multi-user run level idea, but possibly it's more churn than we want right now, and just to get our simple system instance working quickly, maybe it would be better to do something simple like:

in unit file:

Environment=FLUX_USERDB_OPTIONS=--default-rolemask=user

and then in rc1

flux module load -r 0 userdb ${FLUX_USERDB_OPTIONS}

I think we can put off content persistence for the time being since we don't yet have an option to start flux with existing content/kvs from a previous run.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Apr 12, 2017

I like the multi-user run level idea, but possibly it's more churn than we want right now, and just to get our simple system instance working quickly, maybe it would be better to do something simple like:

Sounds good to me.

I think we can put off content persistence for the time being since we don't yet have an option to start flux with existing content/kvs from a previous run.

I wasn't worried about persistence so much as filling up the /run tmpfs with content backing store data... this is real concern if system instance is started at boot and persists for the uptime of the machine. On my system /run is 1.6G (about 10% of RAM).

It seems like there should be a way to divert the backing store outside of broker.rundir without necessarily persisting it?

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 12, 2017

Coverage Status

Coverage decreased (-0.03%) to 78.184% when pulling 0d356ee on grondo:systemd-tweaks into 6fbe380 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Apr 12, 2017

Good point! Maybe for the time being we should drop the persist-directory and persist-fs attributes and a) always clean up the .db file (since we cant' do anything with it) and b) provide a module option to redirect it? I can open a bug on that one.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Apr 12, 2017

Environment=FLUX_USERDB_OPTIONS=--default-rolemask=user

This will work for now, but environment variables for the rc scripts is a tricky one. In general these variables will persist into the initial program and thus (possibly) eventually into any subsequent sub-instances that are started via this session.

For the current implementation of the system instance this works because the initial program doesn't actually start anything else, but in general environment variables may be a bad idea, or there must be some naming "rules" so they can be sanitized before starting the initial program.

systemd: load userdb with --default-rolemask=user
For the current incarnation of the system instance as started by
systemd, it is convenient to have all users in /etc/passwd with
a default rolemask of FLUX_ROLE_USER.

As suggested by @garlick, add a ${FLUX_USERDB_OPTIONS} parameter to
the userdb load command in rc1, and set the aforementioned option
for systemd via the Environment setting for the service.
@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Apr 12, 2017

Ok, I implemented @garlick's suggestion for FLUX_USERDB_OPTIONS and set --default-rolemask=user for the systemd "system" instance.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 12, 2017

Coverage Status

Coverage decreased (-0.008%) to 78.207% when pulling 069208f on grondo:systemd-tweaks into 6fbe380 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Apr 12, 2017

Looks good, merging.

@garlick garlick merged commit f915009 into flux-framework:master Apr 12, 2017

4 checks passed

codecov/patch Coverage not affected when comparing 6fbe380...069208f
Details
codecov/project 77.94% (+0.02%) compared to 6fbe380
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.008%) to 78.207%
Details

@grondo grondo deleted the grondo:systemd-tweaks branch Apr 13, 2017

@grondo grondo referenced this pull request Aug 23, 2017

Closed

0.8.0 Release #1160

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.