Alias service unit reported as inactive while the aliased service unit is active #7875

Closed
nuxwin opened this Issue Jan 14, 2018 · 21 comments

nuxwin commented Jan 14, 2018

In our environment (Debian Stretch), we have the mariadb service unit that provides two aliases: mysql and mysqld (both are symlinks to the mariadb service unit). Lately we have found that the mysql.service service unit (alias) is reported as inactive while the mariadb.service service unit (aliased) is reported as active:

systemctl is-active mysql
inactive
systemctl is-active mariadb
active

However, running:

systemctl status mysql

reports the service as active, as expected, and also seems to solve the problem, because running

systemctl is-active mysql

right afterwards reports the mysql service as active.

Environment

root@srv01:/etc/systemd/system# lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 9.2 (stretch)
Release:	9.2
Codename:	stretch
root@srv01:/etc/systemd/system# systemctl --version
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

See also: https://jira.mariadb.org/browse/MDEV-14944

Author

nuxwin commented Jan 14, 2018

How to reproduce

1. Start the mariadb service:

root@srv01:~# systemctl start mariadb.service

2. Check terse runtime status information of the mariadb.service unit:

root@srv01:~# systemctl status mariadb.service
● mariadb.service - MariaDB database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-01-14 22:07:30 UTC; 20s ago
  Process: 17865 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 17862 ExecStartPost=/etc/mysql/debian-start (code=exited, status=0/SUCCESS)
  Process: 17644 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= ||   VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ]   && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exit
  Process: 17640 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
  Process: 17637 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (code=exited, status=0/SUCCESS)
 Main PID: 17834 (mysqld)
   Status: "Taking your SQL requests now..."
    Tasks: 29 (limit: 4915)
   CGroup: /system.slice/mariadb.service
           └─17834 /usr/sbin/mysqld

Jan 14 22:07:27 srv01 systemd[1]: Starting MariaDB database server...
Jan 14 22:07:28 srv01 mysqld[17834]: 2018-01-14 22:07:28 140228024779328 [Note] /usr/sbin/mysqld (mysqld 10.1.26-MariaDB-0+deb9u1) starting as process 17834 ...
Jan 14 22:07:30 srv01 systemd[1]: Started MariaDB database server.

3. Enable the mariadb service unit (this is what our installation script does to ensure that the service is enabled; basically, it resolves the mysql alias to the real mariadb.service unit and enables it):

root@srv01:~# systemctl enable mariadb.service

4. Check status of aliased service units:

root@srv01:~# systemctl is-active mysql.service mysqld.service
inactive
inactive

As you can see, both are reported as inactive, while we expect them to be reported as active.

Now the weird part:

5. Check terse runtime status information of aliased units:

root@srv01:~# systemctl status mysql.service mysqld.service
● mariadb.service - MariaDB database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-01-14 22:07:30 UTC; 4min 18s ago
 Main PID: 17834 (mysqld)
   Status: "Taking your SQL requests now..."
   CGroup: /system.slice/mariadb.service
           └─17834 /usr/sbin/mysqld

Jan 14 22:07:27 srv01 systemd[1]: Starting MariaDB database server...
Jan 14 22:07:28 srv01 mysqld[17834]: 2018-01-14 22:07:28 140228024779328 [Note] /usr/sbin/mysqld (mysqld 10.1.26-MariaDB-0+deb9u1) starting as process 17834 ...
Jan 14 22:07:30 srv01 systemd[1]: Started MariaDB database server.

● mariadb.service - MariaDB database server
   Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-01-14 22:07:30 UTC; 4min 18s ago
 Main PID: 17834 (mysqld)
   Status: "Taking your SQL requests now..."
   CGroup: /system.slice/mariadb.service
           └─17834 /usr/sbin/mysqld

Jan 14 22:07:27 srv01 systemd[1]: Starting MariaDB database server...
Jan 14 22:07:28 srv01 mysqld[17834]: 2018-01-14 22:07:28 140228024779328 [Note] /usr/sbin/mysqld (mysqld 10.1.26-MariaDB-0+deb9u1) starting as process 17834 ...
Jan 14 22:07:30 srv01 systemd[1]: Started MariaDB database server.

So here both are shown as active (running).

6. Again, check status of aliased units:

root@srv01:~# systemctl is-active mysql mysqld
active
active

Author

nuxwin commented Jan 14, 2018

Same problem with Systemd 235

Member

yuwata commented Jan 15, 2018

This seems similar to #7370. The systemctl status command loads units. I have not confirmed it, but my guess is the following.
In Step 3, systemctl enable mariadb.service creates the alias, but the alias is not loaded yet. So, in Step 4, systemctl is-active returns the wrong result. In Step 5, the alias unit is loaded by the status command, and thus in Step 6, is-active returns the correct result.

Author

nuxwin commented Jan 15, 2018

@yuwata

My knowledge of systemd (the code) is very limited, so... From my point of view, an alias (here mysql) should be reported as active if the unit that defines it (here mariadb.service) is active.

This issue caused some of our SSL certificates not to be renewed, because our cron task checks that the mysql service is running before doing anything else and exits if the unit is not active.

For now, as a temporary fix, we will patch our systemd provider so that it operates on the real units by resolving them first. But by definition, an alias should work like the unit that defines it, shouldn't it?
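
To illustrate the "resolve first" idea, here is a minimal shell sketch. It is not the actual i-MSCP patch: the function name `resolve_unit` is hypothetical, and it assumes that aliases are plain symlinks in a single unit directory (by default /etc/systemd/system, as in this report) and that GNU `readlink -f` is available.

```shell
# resolve_unit NAME [UNIT_DIR]
# Hypothetical helper: print the real unit name behind an alias symlink.
# Assumes aliases are symlinks in UNIT_DIR (default: /etc/systemd/system).
resolve_unit() {
    unit=$1
    dir=${2:-/etc/systemd/system}
    path=$dir/$unit
    if [ -L "$path" ]; then
        # Follow the symlink chain and keep only the file name of the target.
        target=$(readlink -f "$path") || return 1
        basename "$target"
    else
        # Not a symlink: treat the name as the real unit.
        printf '%s\n' "$unit"
    fi
}
```

A caller could then run something like `systemctl is-active "$(resolve_unit mysql.service)"`. Note that a masked unit (a symlink to /dev/null) would need separate handling; this sketch does not cover it.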

Member

yuwata commented Jan 15, 2018

systemd.unit(5) says:

The system and service manager loads a unit's configuration automatically when a unit is referenced for the first time.

The alias units have not been referenced, so they are not loaded. Thus, is-active returns inactive.

Member

yuwata commented Jan 15, 2018

So, a workaround is to run status or show on the alias units first.
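
A minimal sketch of that workaround as a shell function, assuming (as described in this thread) that `systemctl show` references the unit and thereby loads it before the state is queried. The function name `is_active_after_load` is made up for this example.

```shell
# is_active_after_load UNIT
# Hypothetical wrapper: reference the unit once so that the manager loads it,
# then ask for its active state.
is_active_after_load() {
    unit=$1
    # "show" references the unit; its output is not needed here.
    systemctl show -p Id "$unit" >/dev/null 2>&1
    # Prints the state; the exit status is that of "systemctl is-active".
    systemctl is-active "$unit"
}
```

In a cron job this could be used as `is_active_after_load mysql.service >/dev/null || exit 0`, so that the check no longer depends on whether the alias happened to be loaded already.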

Author

nuxwin commented Jan 15, 2018

@yuwata

OK, but for me that is wrong behavior. Let's imagine the following scenario:

  1. You have a distribution package which installs the mariadb.service unit, which itself defines the mysql.service alias...
  2. You have software (in our case, a control panel for shared hosting management) that makes sure that the service is enabled by executing: systemctl enable mariadb.service
  3. The problem, as stated above, is that after doing this, the mysql.service unit, if it is an alias, will be reported as inactive when executing systemctl is-active mysql.service, while mariadb.service is active. This is a real problem for our 3rd-party scripts that check the service by executing: systemctl is-active mysql.service

Well, of course, there are always possible workarounds (loading the unit, or always working with the real unit by resolving alias units as shown previously). But wouldn't it be better to change the current behavior? Why exactly are the alias units not pre-loaded?

It is all about consistency here. The mysql.service unit, if it is an alias, should work like the aliased unit and report an identical state over time. If it walks like a duck and quacks like a duck, then it's a duck. For me, the primary purpose of an alias is to provide compatibility.

Member

yuwata commented Jan 15, 2018

Yeah, yeah, I understand and agree with your point. I just want to say that this is documented behavior, and thus this is an RFE.

yuwata added the RFE 🎁 label Jan 15, 2018

Author

nuxwin commented Jan 15, 2018

@yuwata

Thanks for your understanding ;)

yuwata added a commit to yuwata/systemd that referenced this issue Jan 16, 2018

nuxwin added a commit to i-MSCP/imscp that referenced this issue Jan 17, 2018

Fixed: Make sure that units are not masked when enabling them (Systemd service provider)

Fixed: Resolve units before acting on them due to systemd/systemd#7875 (Systemd service provider)
Updated: Service providers according interface changes

Member

poettering commented Jan 22, 2018

This should actually already work the same, we should always load everything implicitly when needed. I'd claim this is a bug actually, not just an RFE

Member

poettering commented Jan 22, 2018

Is this reproducible with current upstream versions?

What exactly does the symlink chain look like?

poettering added this to the v237 milestone Jan 22, 2018

Author

nuxwin commented Jan 22, 2018

@poettering

What exactly does the symlink chain look like?

root@stretch:/usr/local/src/imscp# ls -la /etc/systemd/system/mysql.service 
lrwxrwxrwx 1 root root 35 janv. 17 23:56 /etc/systemd/system/mysql.service -> /lib/systemd/system/mariadb.service
root@stretch:/usr/local/src/imscp# ls -la /etc/systemd/system/mysqld.service 
lrwxrwxrwx 1 root root 35 janv. 17 23:56 /etc/systemd/system/mysqld.service -> /lib/systemd/system/mariadb.service
root@stretch:/usr/local/src/imscp# readlink /etc/systemd/system/mysql.service
/lib/systemd/system/mariadb.service
root@stretch:/usr/local/src/imscp# ls -la /lib/systemd/system/mariadb.service
-rw-r--r-- 1 root root 4505 août  10 21:07 /lib/systemd/system/mariadb.service
root@stretch:/usr/local/src/imscp# cat /lib/systemd/system/mariadb.service
#
# /etc/systemd/system/mariadb.service
#
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
#
# Thanks to:
# Daniel Black
# Erkan Yanar
# David Strauss
# and probably others

[Unit]
Description=MariaDB database server
After=network.target

[Install]
WantedBy=multi-user.target
Alias=mysql.service
Alias=mysqld.service


[Service]

##############################################################################
## Core requirements
##

Type=notify

# Setting this to true can break replication and the Type=notify settings
# See also bind-address mysqld option.
PrivateNetwork=false

##############################################################################
## Package maintainers
##

User=mysql
Group=mysql

# To allow memlock to be used as non-root user if set in configuration
CapabilityBoundingSet=CAP_IPC_LOCK

# Prevent writes to /usr, /boot, and /etc
ProtectSystem=full

# Doesn't yet work properly with SELinux enabled
# NoNewPrivileges=true

PrivateDevices=true

# Prevent accessing /home, /root and /run/user
ProtectHome=true

# Execute pre and post scripts as root, otherwise it does it as User=
PermissionsStartOnly=true

ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld

# Perform automatic wsrep recovery. When server is started without wsrep,
# galera_recovery simply returns an empty string. In any case, however,
# the script is not expected to return with a non-zero status.
# It is always safe to unset _WSREP_START_POSITION environment variable.
# Do not panic if galera_recovery script is not available. (MDEV-10538)
ExecStartPre=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"
ExecStartPre=/bin/sh -c "[ ! -e /usr/bin/galera_recovery ] && VAR= || \
 VAR=`/usr/bin/galera_recovery`; [ $? -eq 0 ] \
 && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1"

# Needed to create system tables etc.
# ExecStartPre=/usr/bin/mysql_install_db -u mysql

# Start main service
# MYSQLD_OPTS here is for users to set in /etc/systemd/system/mariadb.service.d/MY_SPECIAL.conf
# Use the [service] section and Environment="MYSQLD_OPTS=...".
# This isn't a replacement for my.cnf.
# _WSREP_NEW_CLUSTER is for the exclusive use of the script galera_new_cluster
ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION
ExecStartPost=/etc/mysql/debian-start

# Unset _WSREP_START_POSITION environment variable.
ExecStartPost=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"

KillMode=process
KillSignal=SIGTERM

# Don't want to see an automated SIGKILL ever
SendSIGKILL=no

# Restart crashed server only, on-failure would also restart, for example, when
# my.cnf contains unknown option
Restart=on-abort
RestartSec=5s

UMask=007

##############################################################################
## USERs can override
##
##
## by creating a file in /etc/systemd/system/mariadb.service.d/MY_SPECIAL.conf
## and adding/setting the following will override this file's settings.

# Useful options not previously available in [mysqld_safe]

# Kernels like killing mysqld when out of memory because its big.
# Lets temper that preference a little.
# OOMScoreAdjust=-600

# Explicitly start with high IO priority
# BlockIOWeight=1000

# If you don't use the /tmp directory for SELECT ... OUTFILE and
# LOAD DATA INFILE you can enable PrivateTmp=true for a little more security.
PrivateTmp=false

##
## Options previously available to be set via [mysqld_safe]
## that now needs to be set by systemd config files as mysqld_safe
## isn't executed.
##

# Number of files limit. previously [mysqld_safe] open-file-limit
LimitNOFILE=16364

# Maximium core size. previously [mysqld_safe] core-file-size
# LimitCore=

# Nice priority. previously [mysqld_safe] nice
# Nice=-5

# Timezone. previously [mysqld_safe] timezone
# Environment="TZ=UTC"

# Library substitutions. previously [mysqld_safe] malloc-lib with explicit paths
# (in LD_LIBRARY_PATH) and library name (in LD_PRELOAD).
# Environment="LD_LIBRARY_PATH=/path1 /path2" "LD_PRELOAD=

# Flush caches. previously [mysqld_safe] flush-caches=1
# ExecStartPre=sync
# ExecStartPre=sysctl -q -w vm.drop_caches=3

# numa-interleave=1 equalivant
# Change ExecStart=numactl --interleave=all /usr/sbin/mysqld......

# crash-script equalivent
# FailureAction=
root@stretch:/usr/local/src/imscp#

Member

poettering commented Jan 22, 2018

So the initial alias symlink is in /etc instead of /usr/lib? Well, we don't follow symlinks in /etc/ during "systemctl enable", "systemctl link" and friends for the simple reason that those commands manage those symlinks, they are the ones that create them. And generally it's not a good idea if tools generate output based on stuff that is dependent on their own output. That way you create an awful feedback loop...

"systemctl link", "systemctl enable" and friends do follow symlinks in /usr/lib these days, as those are vendor supplied aliases that the tools won't manage, but which are shipped with your RPM or deb packages.

So, I am very sure we shouldn't really change this behaviour, as it would do more harm than good: programs should not have such feedback loops and not use as configuration the stuff they are supposed to create...

I hope that makes some sense? Closing.

poettering closed this Jan 22, 2018

poettering reopened this Jan 22, 2018

Member

poettering commented Jan 22, 2018

oops, mixed up this and your other bug report...

we should still report the "is-active" stuff correctly in this case, that's a bug...

Author

nuxwin commented Jan 22, 2018

@poettering

The symlinks are created by systemd ;) I don't create them myself, and they are not created by the systemd Debian package maintenance scripts either... The aliases are defined in the /usr/lib/systemd/system/mariadb.service unit, and when you enable that unit, the symlinks are automatically created by systemd in the /etc/systemd/system directory. I've posted the "how to reproduce" steps above...

BTW

root@stretch:/usr/local/src/imscp# systemctl --version
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

Author

nuxwin commented Jan 22, 2018

@poettering

So the initial alias symlink is in /etc instead of /usr/lib? Well, we don't follow symlinks in /etc/ during "systemctl enable", "systemctl link" and friends for the simple reason that those commands manage those symlinks, they are the ones that create them. And generally it's not a good idea if tools generate output based on stuff that is dependent on their own output. That way you create an awful feedback loop...

You're making assumptions without reading my reports carefully, right? The symlinks are not created by the Debian package. They are created automatically by YOUR software when you enable a unit which defines aliases... They are created by YOUR software in /etc...

See above...

Member

yuwata commented Jan 22, 2018

Note that this is reported on 232, but I confirm the situation is the same on the current git snapshot.

Member

yuwata commented Jan 22, 2018

Also, when this is fixed, then please revert c7612b2.

Author

nuxwin commented Jan 22, 2018

@yuwata

Sorry for the work I've given to you for nothing... That being said, your involvement was much appreciated.

poettering added a commit to poettering/systemd that referenced this issue Jan 25, 2018

systemctl: load unit if needed in "systemctl is-active"
Previously, we'd explicitly use "GetUnit()" on the server side to
convert a unit name into a bus path, as that function will return an
error if the unit is not currently loaded. If we'd convert the path on
the client side, and access the unit this way directly the unit would be
loaded automatically in the background.

The old logic was done in order to minimize the effect of "is-active" on
the system, i.e. that a monitoring command does not itself alter the
state of the system.

however, this is problematic as this can lead to confusing results if
the queried unit name is an alias that currently is not loaded: we'd
claim the unit wasn't active even though this isn't strictly true: the
unit the name is an alias for might be.

Hence, let's simplify the code, and accept that we might end up loading
a unit briefly here, and let's make "systemctl is-active" skip the
GetUnit() thing and calculate the unit path right away.

Fixes: systemd#7875
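
The client-side conversion mentioned in the commit message can be illustrated with a small shell sketch. It is not systemctl's own code. It assumes the escaping rule used for systemd unit object paths (every byte that is not an ASCII letter or digit, and a digit in first position, becomes `_` followed by its two-digit lowercase hex code) and the `/org/freedesktop/systemd1/unit/` prefix. The function name `unit_dbus_path` is made up for this example, and the sketch expects a non-empty ASCII unit name.

```shell
# unit_dbus_path NAME
# Hypothetical helper: compute the D-Bus object path of a unit from its name,
# without asking the manager first. Runs in a subshell so variables stay local.
unit_dbus_path() (
    LC_ALL=C            # make the [a-zA-Z] and [0-9] ranges plain ASCII
    name=$1
    out=
    first=1
    while [ -n "$name" ]; do
        rest=${name#?}          # everything after the first character
        c=${name%"$rest"}       # the first character itself
        name=$rest
        case $c in
            [a-zA-Z])
                out=$out$c ;;
            [0-9])
                # A digit is kept as-is unless it is the very first character.
                if [ "$first" -eq 1 ]; then
                    out=$out$(printf '_%02x' "'$c")
                else
                    out=$out$c
                fi ;;
            *)
                # Anything else is escaped as "_" plus its hex code.
                out=$out$(printf '_%02x' "'$c") ;;
        esac
        first=0
    done
    printf '/org/freedesktop/systemd1/unit/%s\n' "$out"
)
```

For example, `unit_dbus_path mysql.service` prints `/org/freedesktop/systemd1/unit/mysql_2eservice`. Such a path could then be queried directly, for instance with `busctl get-property org.freedesktop.systemd1 "$(unit_dbus_path mysql.service)" org.freedesktop.systemd1.Unit ActiveState`, which, per the commit message above, lets the manager load the unit in the background if needed.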


Author

nuxwin commented Jan 27, 2018

@poettering

Thank you for the job done there. That is much appreciated.

Author

nuxwin commented Jan 27, 2018

@keszybz Did you remember to revert the changes made in c7612b2?

Edit: It seems so; it was done in 71c9f49.

nuxwin added a commit to i-MSCP/imscp that referenced this issue Nov 29, 2018

Fixed: Mask/Unmask units after/prior disabling/enabling them (Systemd service provider)

Fixed: Missing support for various systemd unit files such as device, mount point, swap file... (Systemd init provider)
Fixed: Resolve units before acting on them due to systemd/systemd#7875 (Systemd service provider)