Keepalived ignorant of systemd-networkd removing VIPs #836

Closed
lkarsten opened this Issue Apr 16, 2018 · 2 comments

lkarsten commented Apr 16, 2018

Hi.

Issue: When systemd is restarted (for example by an apt-get upgrade), systemd-networkd appears to remove the VIPs set up by keepalived. Keepalived does not notice this, and failover to the backup node does not happen. The VIP is now down, and so is the service.

Expected: When the managed IP address is removed on the system, keepalived should notice it and log it. It should then either try to restore the address or go into FAULT mode so that the backup node can take over.

Attempted workaround: Add functionality to check_script target that does "ip addr | grep " and fails if it isn't there.
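
A minimal sketch of such a check in Python, assuming the script is handed the VIP and the interface as arguments. The file name `check_vip.py`, the function names, and the 192.0.2.x addresses are hypothetical. It parses `ip -o addr show` and compares addresses exactly, which avoids the substring matches a plain grep can give (192.0.2.1 also matches 192.0.2.10).

```python
"""Report whether a given address is configured on an interface."""
import ipaddress
import subprocess
import sys


def vip_present(ip_output, vip):
    """Return True if `vip` is listed as an inet/inet6 address in `ip -o addr` output."""
    wanted = ipaddress.ip_address(vip)
    for line in ip_output.splitlines():
        fields = line.split()
        # Walk adjacent token pairs, looking for "inet <addr>/<prefix>".
        for keyword, value in zip(fields, fields[1:]):
            if keyword in ("inet", "inet6"):
                try:
                    # Compare the address part only, ignoring the prefix length.
                    if ipaddress.ip_interface(value).ip == wanted:
                        return True
                except ValueError:
                    continue
    return False


def main(argv):
    """Return 0 if the address is present, 1 if it is missing, 2 on bad usage."""
    if len(argv) != 2:
        print("usage: check_vip.py ADDRESS INTERFACE", file=sys.stderr)
        return 2
    vip, dev = argv
    result = subprocess.run(
        ["ip", "-o", "addr", "show", "dev", dev],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return 1
    return 0 if vip_present(result.stdout, vip) else 1
```

Saved as a script whose entry point calls `sys.exit(main(sys.argv[1:]))`, this could be referenced from a `vrrp_script` block and tracked with `track_script`, so that a missing VIP makes the check fail.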

I'm running keepalived 1.3.9 on Ubuntu 18.04 (bionic). This is a VRRP-only configuration, no ipvs.

This was reported to Debian about a year ago: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=860151 . I was not able to find anything about it in any closed issues here.

pqarmitage (Collaborator) commented Apr 16, 2018

As Alexander Wirt wrote at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=860151#15, I think the fundamental issue here is that systemd-networkd is removing addresses that don't 'belong' to it (I wonder if it also removes ip routes and ip rules). However, I agree that keepalived should detect this situation and handle it appropriately.

The beta branch of keepalived does detect the removal of VIPs and will make the VRRP instance transition to backup mode (transitioning to FAULT mode would not work since there would never be an event for the VRRP instance to recover from fault mode). I will soon be adding to the beta code the detection and handling of ip routes and rules being deleted. See issue #534 for what I propose to implement; any feedback on the ideas there would be welcome.

lkarsten (Author) commented Apr 17, 2018

Thanks for replying.

I've built the beta branch, and can confirm that it reconfigured the VIP when it was removed.

This is commit 8ecbb59, built with "autoreconf -vf && ./configure && make && make install DESTDIR=/opt/keepalived/" on Ubuntu 18.04 (bionic).

# /opt/keepalived/usr/local/sbin/keepalived -l -P -n -D -f /etc/keepalived/keepalived.conf
Tue Apr 17 16:15:02 2018: Starting Keepalived v2.0.0 (04/11,2018)
Tue Apr 17 16:15:02 2018: WARNING - keepalived was build for newer Linux 4.15.15, running on Linux 4.15.0-15-generic #16-Ubuntu SMP Wed Apr 4 13:58:14 UTC 2018
Tue Apr 17 16:15:02 2018: Opening file '/etc/keepalived/keepalived.conf'.
Tue Apr 17 16:15:02 2018: Starting VRRP child process, pid=4698
Tue Apr 17 16:15:02 2018: Registering Kernel netlink reflector
Tue Apr 17 16:15:02 2018: Registering Kernel netlink command channel
Tue Apr 17 16:15:02 2018: Opening file '/etc/keepalived/keepalived.conf'.
Tue Apr 17 16:15:02 2018: Assigned address *MASKED* for interface ens192
Tue Apr 17 16:15:02 2018: Assigned address *MASKED* for interface ens192
Tue Apr 17 16:15:02 2018: Registering gratuitous ARP shared channel
Tue Apr 17 16:15:02 2018: (vrrpinstance_165) removing protocol VIPs.
Tue Apr 17 16:15:02 2018: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(8,9)]
Tue Apr 17 16:15:02 2018: VRRP_Script(check_services) succeeded
Tue Apr 17 16:15:02 2018: (vrrpinstance_165) Entering BACKUP STATE
Tue Apr 17 16:15:06 2018: (vrrpinstance_165) Entering MASTER STATE
Tue Apr 17 16:15:06 2018: (vrrpinstance_165) setting protocol VIPs.
Tue Apr 17 16:15:06 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:06 2018: (vrrpinstance_165) Sending/queueing gratuitous ARPs on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:06 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:06 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:06 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:06 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Send [vrrpinstance_165] TSM transtition : [1,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Tue Apr 17 16:15:11 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:11 2018: (vrrpinstance_165) Sending/queueing gratuitous ARPs on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:11 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:11 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:11 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:11 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Tue Apr 17 16:15:14 2018: Netlink reflector reports IP *VIP_IP_MASKED* removed from ens192
Tue Apr 17 16:15:14 2018: (vrrpinstance_165) Entering BACKUP STATE
Tue Apr 17 16:15:14 2018: (vrrpinstance_165) sent 0 priority
Tue Apr 17 16:15:14 2018: (vrrpinstance_165) removing protocol VIPs.
Tue Apr 17 16:15:15 2018: (vrrpinstance_165) Entering MASTER STATE
Tue Apr 17 16:15:15 2018: (vrrpinstance_165) setting protocol VIPs.
Tue Apr 17 16:15:15 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:15 2018: (vrrpinstance_165) Sending/queueing gratuitous ARPs on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:15 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:15 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:15 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Tue Apr 17 16:15:15 2018: Sending gratuitous ARP on ens192 for *VIP_IP_MASKED*
Send [vrrpinstance_165] TSM transtition : [1,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
Send [vrrpinstance_165] TSM transtition : [2,2] Wantstate = [2]
^CTue Apr 17 16:15:18 2018: Stopping
Tue Apr 17 16:15:18 2018: (vrrpinstance_165) sent 0 priority
Tue Apr 17 16:15:18 2018: (vrrpinstance_165) removing protocol VIPs.
Tue Apr 17 16:15:19 2018: Stopped
Tue Apr 17 16:15:19 2018: Stopped Keepalived v2.0.0 (04/11,2018)

Method used for removing: ip addr del VIP_IP_MASKED/32 dev ens192
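
The "Netlink reflector reports IP ... removed" line in the log above suggests the removal is picked up from kernel netlink notifications. As an illustration only, and not keepalived's actual code, here is a minimal Python sketch of that general technique: subscribe to the rtnetlink IPv4 address group and decode `RTM_NEWADDR` / `RTM_DELADDR` messages. The function names are hypothetical, the numeric constants are the values from the Linux headers, and only IPv4 is handled.

```python
import socket
import struct

# Values from <linux/rtnetlink.h> and <linux/if_addr.h>.
RTM_NEWADDR = 20
RTM_DELADDR = 21
IFA_ADDRESS = 1
IFA_LOCAL = 2
RTMGRP_IPV4_IFADDR = 0x10

NLMSG_HDR = struct.Struct("=IHHII")  # len, type, flags, seq, pid
IFADDRMSG = struct.Struct("=BBBBI")  # family, prefixlen, flags, scope, ifindex
RTATTR = struct.Struct("=HH")        # len, type


def _align4(n):
    """Round up to the 4-byte boundary used by netlink."""
    return (n + 3) & ~3


def parse_addr_events(data):
    """Yield (event, ifindex, address) for each IPv4 address message in `data`."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        msg_len, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if msg_len < NLMSG_HDR.size or offset + msg_len > len(data):
            break  # truncated or malformed buffer
        end = offset + msg_len
        body = offset + NLMSG_HDR.size
        if msg_type in (RTM_NEWADDR, RTM_DELADDR) and body + IFADDRMSG.size <= end:
            family, _plen, _fl, _scope, ifindex = IFADDRMSG.unpack_from(data, body)
            # Collect the routing attributes that follow the ifaddrmsg header.
            attrs = {}
            pos = body + IFADDRMSG.size
            while pos + RTATTR.size <= end:
                rta_len, rta_type = RTATTR.unpack_from(data, pos)
                if rta_len < RTATTR.size:
                    break
                attrs[rta_type] = data[pos + RTATTR.size:pos + rta_len]
                pos += _align4(rta_len)
            # Prefer IFA_LOCAL; IFA_ADDRESS is the peer on point-to-point links.
            raw = attrs.get(IFA_LOCAL) or attrs.get(IFA_ADDRESS)
            if family == socket.AF_INET and raw and len(raw) == 4:
                event = "removed" if msg_type == RTM_DELADDR else "added"
                yield event, ifindex, socket.inet_ntoa(raw)
        offset += _align4(msg_len)


def watch():
    """Print IPv4 address changes until interrupted (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    with sock:
        sock.bind((0, RTMGRP_IPV4_IFADDR))  # (pid, multicast groups)
        while True:
            for event, ifindex, addr in parse_addr_events(sock.recv(65535)):
                print(f"{addr} {event} on ifindex {ifindex}")
```

Running `watch()` in one terminal and issuing the `ip addr del` command above in another should print a "removed" event for the address; a daemon can react to that event instead of polling.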

As far as I can tell, this solves my immediate problem. Thank you.
