VIPs with keepalived with systemd-networkd and ipv6 #1170

Closed
chr4 opened this issue Mar 18, 2019 · 23 comments

@chr4
commented Mar 18, 2019

I came across several issues when using virtual floating IPs (VIPs) on Linux with systemd. Most of the tutorials online describing VIP setups on Linux do not work properly when systemd-networkd is used.

I'm here for advice on the best practices for running VIPs with systemd on Linux distributions like Ubuntu 18.04 LTS (and alike), especially on IPv6 setups.

Ubuntu's default network configuration for servers relies on netplan.io, which itself defaults to systemd-networkd as a renderer. When running netplan apply or when restarting systemd-networkd, it purges unknown IPs (like VIPs attached by keepalived) from the interfaces. To make the problem worse, that also happens on DHCP renewals.
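One rough way to observe the pruning described above is to compare the address list before and after a restart. The sketch below assumes iproute2's one-line output format (`ip -o addr show`). The helper name `addr_listed`, the interface `ens3` and the address `192.0.2.10` are placeholders for illustration, not part of keepalived or systemd.

```shell
#!/bin/sh
# addr_listed ADDR: read `ip -o addr show` output on stdin and succeed
# if ADDR (given without prefix length) is among the listed addresses.
# Hypothetical helper; the comparison is a plain string match, so an IPv6
# address must be written exactly as `ip` prints it.
addr_listed() {
    awk -v want="$1" '
        # Field 4 of the one-line format is "address/prefixlen".
        { split($4, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit !found }
    '
}

# Example (run as root on a test machine; names are placeholders):
#   ip -o addr show dev ens3 | addr_listed 192.0.2.10 && echo "VIP present"
#   systemctl restart systemd-networkd
#   ip -o addr show dev ens3 | addr_listed 192.0.2.10 || echo "VIP was pruned"
```

Reading from stdin keeps the helper independent of any particular interface, so the same check can be pointed at a dummy or macvlan interface later.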

I've tried using a dummy interface for the VIP instead of the primary interface. This works quite nicely on ipv4, details in this blog post.

But: This doesn't work for ipv6, as keepalived by default sends the unsolicited neighbour adverts via the same dummy interface in vrrp_ndisc.c:

VRRP: Error sending ndisc unsolicited neighbour advert on keepalived0

I believe that a setup like this is quite common, and I was wondering if there are stable best practises given the situation with systemd-networkd and ipv6.

A possible solution might be (although I'm not sure if this would even work) to have a configuration option that makes it possible to send out the neighbour adverts via a different interface than the one specified with dev in virtual_ipaddress.

I'm aware of #836, but I believe this issue is different. I don't want keepalived to failover when systemd-networkd is restarted. This seems to be a feature of systemd - but before I open an issue there, I'd like to get some feedback on whether I'm missing a fundamental best practice here.

Thanks in advance!

@pqarmitage

Collaborator

commented Mar 19, 2019

I am not familiar with using systemd-networkd, but if it cannot be configured to allow other processes to manage IP addresses, then it is a problem with systemd-networkd. It is absolutely not right that systemd-networkd should interfere with other processes and the addresses they manage. I would suggest raising an issue against systemd, but with nearly 1000 open issues, it isn't likely to get anywhere any time soon; nevertheless I think an issue should be raised there. You could also raise an issue against systemd-networkd on Ubuntu's launchpad.

You state in your blog post:

Unfortunately, Ubuntu ships with keepalived-1.3.9, and the keepalived developers do not provide an
official repository for more recent versions. After considering providing packages myself, I came to
the conclusion that this actually wouldn’t fit the actual problem, as keepalived would just note the
removed VIP and failover to another machine. But I wanted to fix the underlying problem itself,
instead of just coping with the symptoms.

There are a couple of points regarding the above. First Ubuntu ships a snap of keepalived, and the version in the stable channel is currently v2.0.10.

The second point is that if a VIP is removed, the vrrp instance owning the VIP will transition to backup state and the highest priority vrrp instance will become the master within 1 advert interval. That should be the vrrp instance whose VIP was removed and that has just transitioned to backup (since it was previously the master due to being the highest priority), and so it should not be another machine that takes over as master. Behaviour could be different if nopreempt is configured on the vrrp instance, or if systemd-networkd takes down the interface for a short time. The reason that I implemented recovery from a VIP being deleted by making the vrrp instance transition to backup state was that it would handle all of VIPs, virtual routes and virtual rules being deleted and reinstated, and also that it would not normally cause any other system to become master.

In your blog you also refer to using a dummy interface, but say that its state is always down. It is possible to bring up a dummy interface (ip link set keepalived0 up), and ip link show keepalived0 then shows its state as UNKNOWN. I can't think why systemd-networkd leaves the interface in a down state.

You refer above to IPv6 not working with a dummy interface, since it fails to send the NA packets. Looking at the code for IPv4 (function send_arp() in vrrp_arp.c), it also appears to use the ifindex that the address is configured on; it just seems not to report an error. Are you able to confirm whether or not gratuitous ARPs are actually sent for IPv4? I can't see how the kernel could work out an interface to send them on since keepalived specifies the dummy interface.

Does configuring keepalived to use vmacs (macvlans) make a difference? If systemd-networkd only removes IP addresses from the interfaces that it has configured, that might work. Or is systemd-networkd so intrusive that it removes the macvlans?

Does setting CriticalConnection=true in the systemd-networkd configuration on the interfaces keepalived is using make a difference?
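As a minimal sketch of what setting that option could look like, the helper below writes a systemd-networkd drop-in. It assumes the systemd.network layout of the systemd version shipped with Ubuntu 18.04, where `CriticalConnection=` lives in the `[DHCP]` section, and that drop-in directories named after the `.network` file are honoured. The function name, the file name `10-critical.conf` and the example path are made up for illustration. Later in this thread the option is reported to protect DHCP addresses only, not the VIP.

```shell
#!/bin/sh
# write_critical_dropin DIR: create DIR and write a drop-in that sets
# CriticalConnection=true. Hypothetical helper; the [DHCP] section is an
# assumption based on the systemd.network man page of that era.
write_critical_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-critical.conf" <<'EOF'
[DHCP]
CriticalConnection=true
EOF
}

# Example (assumed drop-in path for a netplan-generated unit for ens3):
#   write_critical_dropin /etc/systemd/network/10-netplan-ens3.network.d
#   systemctl restart systemd-networkd
```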

It would be possible to add a configuration option against a vip such as garp-dev or gna-dev but I would want to do further testing first to make sure there aren't any other unintended consequences of using a dummy interface. I do feel though that this is a rather ugly workaround to a problem entirely of systemd-networkd's making, and ideally another option such as setting CriticalConnection to make systemd-networkd behave properly is the right way to go.

I have just tested the latest version of keepalived using dummy interfaces, and it now detects that the dummy interfaces do not support multicast, and hence since the GARP/GNA messages cannot be sent, the vrrp instances go to fault state:

(VI_4) interface keepalived0 does not support multicast, specify unicast peers - disabling
(VI_4) disabling ARP since interface does not support it
(VI_6) interface keepalived0 does not support multicast, specify unicast peers - disabling
(VI_6) disabling ARP since interface does not support it
(VI_4) entering FAULT state
(VI_6) entering FAULT state

(VI_4 is an IPv4 instance, VI_6 is an IPv6 instance).
We should probably add an option to allow no GARP/GNA messages to be sent, in which case keepalived wouldn't need to check for multicast.

@chr4

Author

commented Mar 20, 2019

Thank you for your fast and detailed answer!

I am not familiar with using systemd-networkd, but if it cannot be configured to allow other processes to manage IP addresses, then it is a problem with systemd-networkd. It is absolutely not right that systemd-networkd should interfere with other processes and the addresses they manage.

I've created an issue on systemd systemd/systemd#12050

There are a couple of points regarding the above. First Ubuntu ships a snap of keepalived, and the version in the stable channel is currently v2.0.10.

Thanks for the hint! I've just updated to keepalived-2 on a staging cluster, and it works well so far.

I've noticed that the snap doesn't support reloading the daemon?

systemctl reload snap.keepalived.daemon

Failed to reload snap.keepalived.daemon.service: Job type reload is not applicable for unit snap.keepalived.daemon.service.
See system logs and 'systemctl status snap.keepalived.daemon.service' for details.

With keepalived-1, I've used reload by default for minor config changes, as restart resulted in re-elections and failovers.

The second point is that if a VIP is removed, the vrrp instance owning the VIP will transition to backup state and the highest priority vrrp instance will become the master within 1 advert interval.

Thanks for the explanation. I can confirm that this is the case - but it nevertheless causes a small downtime when restarting systemd-networkd.

In your blog you also refer to using a dummy interface, but say that its state is always down. It is possible to bring up a dummy interface (ip link set keepalived0 up), and ip link show keepalived0 then shows its state as UNKNOWN. I can't think why systemd-networkd leaves the interface in a down state.

I can confirm that keepalived-2 will go into FAULT state because of this - I've also created an issue on systemd for this: systemd/systemd#12051

Does configuring keepalived to use vmacs (macvlans) make a difference? If systemd-networkd only removes IP addresses from the interfaces that it has configured, that might work. Or is systemd-networkd so intrusive that it removes the macvlans?

I have not looked into this more deeply. The default renderer on Ubuntu doesn't implement macvlans yet, but it's on their feature list.

Does setting CriticalConnection=true in the systemd-networkd configuration on the interfaces keepalived is using make a difference?

This prevents deletion of DHCP addresses on renewals, but doesn't prevent pruning of the VIP.

I have just tested the latest version of keepalived using dummy interfaces, and it now detects that the dummy interfaces do not support multicast, and hence since the GARP/GNA messages cannot be sent, the vrrp instances go to fault state:

I should've clarified that I'm using unicast. But you're right anyway: The dummy interface can't properly send out GARP/NA. This basically renders my workaround with a dummy interface unusable.

My workaround so far:

  • Using the keepalived snap, as it reduces the downtime to 1 advert_int
  • Waiting for systemd/Ubuntu to fix this
  • Considering migrating to another distribution/operating system
@pqarmitage

Collaborator

commented Mar 28, 2019

@chr4 Is there anything more that is needed on this issue from a keepalived perspective?

@chr4

Author

commented Mar 28, 2019

If this is within scope of this repository, adding a reload option to the snap.keepalived.daemon systemd unit file would be nice.

Otherwise, only the decision on whether it makes sense to add a garp-dev or gna-dev option remains - I'm not sure, though, whether using dummy interfaces should become a best practice?

There's no feedback so far from systemd...

If this is out of scope and you've decided against the additional options, feel free to close.

@pqarmitage

Collaborator

commented Mar 28, 2019

There does appear to be a response to systemd/systemd#12051, suggesting "dropping in a .network file matching the dummy device".

Regarding systemd/systemd#12050 the issue that other packages can legitimately manage IP addresses appears to have been completely overlooked (?ignored) - I'm really not clear where the idea of leaving old configuration on an interface comes from.

It might be worth asking the specific question against issue 12050: how can a system with systemd-networkd be configured so that keepalived can work without having its addresses removed?

It is not within the scope of this repository to modify the snap file, although I might be able to do that in relation to some other work I am doing.

It would be really helpful to know if using macvlans can work. If you add use_vmac in the vrrp_instance block, keepalived will create the necessary macvlans - it doesn't need any support from Ubuntu. Since systemd-networkd won't be managing the macvlans, it is possible that it won't remove the addresses keepalived configures on them.

I think I am not in favour of suggesting the use of dummy interfaces, since it requires certain specific interface configuration (is it RP_FILTER?), and if a system doesn't use the relevant setting, it won't work.

I'm happy to consider garp-dev/gna-dev configuration options, but I would like to know first if configuring keepalived to use macvlans gets around the problem of systemd-networkd removing keepalived's IP addresses.

@chr4

Author

commented Mar 28, 2019

There does appear to be a response to systemd/systemd#12051, suggesting "dropping in a .network file matching the dummy device".

Yes. This helps - but then, systemd takes over management of the interface and prunes the address upon reload. Therefore, it doesn't help as a workaround for the underlying problem.

It might be worth asking the specific question against issue 12050: how can a system with systemd-networkd be configured so that keepalived can work without having its addresses removed?

I thought I did that with systemd/systemd#12050 (comment) - maybe I should've been more detailed?

It is not within the scope of this repository to modify the snap file, although I might be able to do that in relation to some other work I am doing.

I'd appreciate that! If it belongs to an open source repository, I might also be able to contribute on this. As a workaround, I'm running killall -HUP keepalived at the moment, which works nicely in the meantime.
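A possible way to get `systemctl reload` working without touching the snap itself is a local systemd drop-in that sends SIGHUP, the same signal used by the killall workaround. This is an untested sketch: the helper name and the file name `reload.conf` are made up, and it assumes that `$MAINPID` of snap.keepalived.daemon.service is the keepalived parent process, which depends on how the snap starts the daemon.

```shell
#!/bin/sh
# write_reload_dropin [DIR]: write a drop-in adding an ExecReload= line that
# sends SIGHUP to the unit's main process. Hypothetical helper; whether
# $MAINPID is the keepalived parent process under the snap is an assumption.
write_reload_dropin() {
    dir="${1:-/etc/systemd/system/snap.keepalived.daemon.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/reload.conf" <<'EOF'
[Service]
ExecReload=/bin/kill -HUP $MAINPID
EOF
}

# Example (as root):
#   write_reload_dropin
#   systemctl daemon-reload
#   systemctl reload snap.keepalived.daemon
```

A drop-in stays in place across snap refreshes only if the unit name is unchanged, so it is worth re-checking after an update.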

It would be really helpful to know if using macvlans can work. If you add use_vmac in the vrrp_instance block, keepalived will create the necessary macvlans - it doesn't need any support from Ubuntu.

That's a good hint! I wasn't actually aware of use_vmac and just had a look at the systemd macvlan configuration. Give me a few days to look into it.

@chr4

Author

commented Apr 3, 2019

Concerning use_vmac:

In my tests, systemd-networkd indeed leaves the created vmac interface alone. I haven't created any interfaces with systemd-networkd, so I'm not sure if there's more setup involved, e.g. configuring a bridge and attaching the base interface to it? (And if so, if that might lead to systemd-networkd pruning the interface).

With the current vmac setup, I'm having issues with ARPs though. The base interface can't announce the VIP. (I set the sysctls and the vmac_xmit_base directive according to this doc)

Even when manually sending arping -B -S $VIP -i $BASE_INTERFACE, the connections work, but seem shaky.

This might not be directly related to vmac.

Also, documentation for using use_vmac and ipv6 (concerning e.g. sysctl settings) is missing? At least I can't find any.

@pqarmitage

Collaborator

commented Apr 3, 2019

Other than specifying use_vmac in the vrrp_instance configuration block, you shouldn't need to do anything, since keepalived manages creating and deleting the interface.

The document you found is VERY old, and rather out of date now (we know we need to update it, but writing code is more interesting :).

You don't need to set any sysctls nowadays since keepalived does that itself (see vrrp_if_config.c), for both IPv4 and IPv6.

I have never found that I need to use vmac_xmit_base; it might be that the combination of setting that and you setting sysctl variables is stopping ARP working. Do you get any log entries in relation to the ARP messages not working?

I presume that the interface you are specifying in the vrrp_instance is ens3 (from your blog post). Assuming that is the case, it would be helpful if you could post the content of all the 'files' in the following directories (keepalived will need to be running when you do this):
/proc/sys/net/ipv4/conf/all
/proc/sys/net/ipv4/conf/ens3
/proc/sys/net/ipv4/conf/VMAC_IF_NAME
/proc/sys/net/ipv6/conf/all
/proc/sys/net/ipv6/conf/ens3
/proc/sys/net/ipv6/conf/VMAC_IF_NAME

It would also be helpful to see your keepalived configuration, the keepalived log entries from the time it starts, and the output of keepalived -v.

@chr4

Author

commented Apr 4, 2019

I've just started with two fresh instances, trying use_vmac with IPv6.

As soon as I enable use_vmac, both instances promote themselves to master. Without use_vmac they agree nicely on who's going to be the boss.

NOTE: I've allowed all ipv6 VRRP traffic and ICMP on their firewalls, not set any sysctls and I'm not using vmac_xmit_base.

When changing one instance to use_vmac on a working cluster, the other node says this repeatedly:

(VI_1) Received advert from [...]:f::15 with lower priority 50, ours 50, forcing new election

After changing the second instance to use_vmac as well, both promote themselves to master. But there seems to be some communication going on between the nodes:

13:46:10.064033 IP6 [...]:f::16 > [...]:f::15: ip-proto-112 24
13:46:10.182018 IP6 [...]:f::15 > [...]:f::16: ip-proto-112 24

Here's the configuration I'm using (unicast peer and src addresses are flipped on the other node):

global_defs {
    vrrp_version 3
    vrrp_check_unicast_src
    enable_script_security
    script_user root
}

vrrp_script chk_manual_failover {
    script "/usr/lib/keepalived/checks/chk_manual_failover"
    interval 5
    weight 200
}

vrrp_instance VI_1 {
    interface ens3
    priority 50
    state BACKUP
    virtual_router_id 51
    advert_int 1
    use_vmac

    accept
    unicast_src_ip [...]:f::16
    unicast_peer {
        [...]:f::15
    }
    virtual_ipaddress {
        [...]:f::1111:1
    }
    track_script {
        chk_manual_failover

    }

    notify "/usr/lib/keepalived/ha-notify"
}

Here are the files in /proc you've requested:

/proc/sys/net/ipv4/conf/all
/proc/sys/net/ipv4/conf/all/accept_local
/proc/sys/net/ipv4/conf/all/accept_redirects
/proc/sys/net/ipv4/conf/all/accept_source_route
/proc/sys/net/ipv4/conf/all/arp_accept
/proc/sys/net/ipv4/conf/all/arp_announce
/proc/sys/net/ipv4/conf/all/arp_filter
/proc/sys/net/ipv4/conf/all/arp_ignore
/proc/sys/net/ipv4/conf/all/arp_notify
/proc/sys/net/ipv4/conf/all/bootp_relay
/proc/sys/net/ipv4/conf/all/disable_policy
/proc/sys/net/ipv4/conf/all/disable_xfrm
/proc/sys/net/ipv4/conf/all/drop_gratuitous_arp
/proc/sys/net/ipv4/conf/all/drop_unicast_in_l2_multicast
/proc/sys/net/ipv4/conf/all/force_igmp_version
/proc/sys/net/ipv4/conf/all/forwarding
/proc/sys/net/ipv4/conf/all/igmpv2_unsolicited_report_interval
/proc/sys/net/ipv4/conf/all/igmpv3_unsolicited_report_interval
/proc/sys/net/ipv4/conf/all/ignore_routes_with_linkdown
/proc/sys/net/ipv4/conf/all/log_martians
/proc/sys/net/ipv4/conf/all/mc_forwarding
/proc/sys/net/ipv4/conf/all/medium_id
/proc/sys/net/ipv4/conf/all/promote_secondaries
/proc/sys/net/ipv4/conf/all/proxy_arp
/proc/sys/net/ipv4/conf/all/proxy_arp_pvlan
/proc/sys/net/ipv4/conf/all/route_localnet
/proc/sys/net/ipv4/conf/all/rp_filter
/proc/sys/net/ipv4/conf/all/secure_redirects
/proc/sys/net/ipv4/conf/all/send_redirects
/proc/sys/net/ipv4/conf/all/shared_media
/proc/sys/net/ipv4/conf/all/src_valid_mark
/proc/sys/net/ipv4/conf/all/tag
/proc/sys/net/ipv4/conf/ens3
/proc/sys/net/ipv4/conf/ens3/accept_local
/proc/sys/net/ipv4/conf/ens3/accept_redirects
/proc/sys/net/ipv4/conf/ens3/accept_source_route
/proc/sys/net/ipv4/conf/ens3/arp_accept
/proc/sys/net/ipv4/conf/ens3/arp_announce
/proc/sys/net/ipv4/conf/ens3/arp_filter
/proc/sys/net/ipv4/conf/ens3/arp_ignore
/proc/sys/net/ipv4/conf/ens3/arp_notify
/proc/sys/net/ipv4/conf/ens3/bootp_relay
/proc/sys/net/ipv4/conf/ens3/disable_policy
/proc/sys/net/ipv4/conf/ens3/disable_xfrm
/proc/sys/net/ipv4/conf/ens3/drop_gratuitous_arp
/proc/sys/net/ipv4/conf/ens3/drop_unicast_in_l2_multicast
/proc/sys/net/ipv4/conf/ens3/force_igmp_version
/proc/sys/net/ipv4/conf/ens3/forwarding
/proc/sys/net/ipv4/conf/ens3/igmpv2_unsolicited_report_interval
/proc/sys/net/ipv4/conf/ens3/igmpv3_unsolicited_report_interval
/proc/sys/net/ipv4/conf/ens3/ignore_routes_with_linkdown
/proc/sys/net/ipv4/conf/ens3/log_martians
/proc/sys/net/ipv4/conf/ens3/mc_forwarding
/proc/sys/net/ipv4/conf/ens3/medium_id
/proc/sys/net/ipv4/conf/ens3/promote_secondaries
/proc/sys/net/ipv4/conf/ens3/proxy_arp
/proc/sys/net/ipv4/conf/ens3/proxy_arp_pvlan
/proc/sys/net/ipv4/conf/ens3/route_localnet
/proc/sys/net/ipv4/conf/ens3/rp_filter
/proc/sys/net/ipv4/conf/ens3/secure_redirects
/proc/sys/net/ipv4/conf/ens3/send_redirects
/proc/sys/net/ipv4/conf/ens3/shared_media
/proc/sys/net/ipv4/conf/ens3/src_valid_mark
/proc/sys/net/ipv4/conf/ens3/tag
/proc/sys/net/ipv4/conf/vrrp.51
/proc/sys/net/ipv4/conf/vrrp.51/accept_local
/proc/sys/net/ipv4/conf/vrrp.51/accept_redirects
/proc/sys/net/ipv4/conf/vrrp.51/accept_source_route
/proc/sys/net/ipv4/conf/vrrp.51/arp_accept
/proc/sys/net/ipv4/conf/vrrp.51/arp_announce
/proc/sys/net/ipv4/conf/vrrp.51/arp_filter
/proc/sys/net/ipv4/conf/vrrp.51/arp_ignore
/proc/sys/net/ipv4/conf/vrrp.51/arp_notify
/proc/sys/net/ipv4/conf/vrrp.51/bootp_relay
/proc/sys/net/ipv4/conf/vrrp.51/disable_policy
/proc/sys/net/ipv4/conf/vrrp.51/disable_xfrm
/proc/sys/net/ipv4/conf/vrrp.51/drop_gratuitous_arp
/proc/sys/net/ipv4/conf/vrrp.51/drop_unicast_in_l2_multicast
/proc/sys/net/ipv4/conf/vrrp.51/force_igmp_version
/proc/sys/net/ipv4/conf/vrrp.51/forwarding
/proc/sys/net/ipv4/conf/vrrp.51/igmpv2_unsolicited_report_interval
/proc/sys/net/ipv4/conf/vrrp.51/igmpv3_unsolicited_report_interval
/proc/sys/net/ipv4/conf/vrrp.51/ignore_routes_with_linkdown
/proc/sys/net/ipv4/conf/vrrp.51/log_martians
/proc/sys/net/ipv4/conf/vrrp.51/mc_forwarding
/proc/sys/net/ipv4/conf/vrrp.51/medium_id
/proc/sys/net/ipv4/conf/vrrp.51/promote_secondaries
/proc/sys/net/ipv4/conf/vrrp.51/proxy_arp
/proc/sys/net/ipv4/conf/vrrp.51/proxy_arp_pvlan
/proc/sys/net/ipv4/conf/vrrp.51/route_localnet
/proc/sys/net/ipv4/conf/vrrp.51/rp_filter
/proc/sys/net/ipv4/conf/vrrp.51/secure_redirects
/proc/sys/net/ipv4/conf/vrrp.51/send_redirects
/proc/sys/net/ipv4/conf/vrrp.51/shared_media
/proc/sys/net/ipv4/conf/vrrp.51/src_valid_mark
/proc/sys/net/ipv4/conf/vrrp.51/tag
/proc/sys/net/ipv6/conf/all
/proc/sys/net/ipv6/conf/all/accept_dad
/proc/sys/net/ipv6/conf/all/accept_ra
/proc/sys/net/ipv6/conf/all/accept_ra_defrtr
/proc/sys/net/ipv6/conf/all/accept_ra_from_local
/proc/sys/net/ipv6/conf/all/accept_ra_min_hop_limit
/proc/sys/net/ipv6/conf/all/accept_ra_mtu
/proc/sys/net/ipv6/conf/all/accept_ra_pinfo
/proc/sys/net/ipv6/conf/all/accept_ra_rt_info_max_plen
/proc/sys/net/ipv6/conf/all/accept_ra_rt_info_min_plen
/proc/sys/net/ipv6/conf/all/accept_ra_rtr_pref
/proc/sys/net/ipv6/conf/all/accept_redirects
/proc/sys/net/ipv6/conf/all/accept_source_route
/proc/sys/net/ipv6/conf/all/addr_gen_mode
/proc/sys/net/ipv6/conf/all/autoconf
/proc/sys/net/ipv6/conf/all/dad_transmits
/proc/sys/net/ipv6/conf/all/disable_ipv6
/proc/sys/net/ipv6/conf/all/disable_policy
/proc/sys/net/ipv6/conf/all/drop_unicast_in_l2_multicast
/proc/sys/net/ipv6/conf/all/drop_unsolicited_na
/proc/sys/net/ipv6/conf/all/enhanced_dad
/proc/sys/net/ipv6/conf/all/force_mld_version
/proc/sys/net/ipv6/conf/all/force_tllao
/proc/sys/net/ipv6/conf/all/forwarding
/proc/sys/net/ipv6/conf/all/hop_limit
/proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
/proc/sys/net/ipv6/conf/all/keep_addr_on_down
/proc/sys/net/ipv6/conf/all/max_addresses
/proc/sys/net/ipv6/conf/all/max_desync_factor
/proc/sys/net/ipv6/conf/all/mc_forwarding
/proc/sys/net/ipv6/conf/all/mldv1_unsolicited_report_interval
/proc/sys/net/ipv6/conf/all/mldv2_unsolicited_report_interval
/proc/sys/net/ipv6/conf/all/mtu
/proc/sys/net/ipv6/conf/all/ndisc_notify
/proc/sys/net/ipv6/conf/all/ndisc_tclass
/proc/sys/net/ipv6/conf/all/proxy_ndp
/proc/sys/net/ipv6/conf/all/regen_max_retry
/proc/sys/net/ipv6/conf/all/router_probe_interval
/proc/sys/net/ipv6/conf/all/router_solicitation_delay
/proc/sys/net/ipv6/conf/all/router_solicitation_interval
/proc/sys/net/ipv6/conf/all/router_solicitation_max_interval
/proc/sys/net/ipv6/conf/all/router_solicitations
/proc/sys/net/ipv6/conf/all/seg6_enabled
/proc/sys/net/ipv6/conf/all/seg6_require_hmac
/proc/sys/net/ipv6/conf/all/stable_secret
/proc/sys/net/ipv6/conf/all/suppress_frag_ndisc
/proc/sys/net/ipv6/conf/all/temp_prefered_lft
/proc/sys/net/ipv6/conf/all/temp_valid_lft
/proc/sys/net/ipv6/conf/all/use_oif_addrs_only
/proc/sys/net/ipv6/conf/all/use_tempaddr
/proc/sys/net/ipv6/conf/ens3
/proc/sys/net/ipv6/conf/ens3/accept_dad
/proc/sys/net/ipv6/conf/ens3/accept_ra
/proc/sys/net/ipv6/conf/ens3/accept_ra_defrtr
/proc/sys/net/ipv6/conf/ens3/accept_ra_from_local
/proc/sys/net/ipv6/conf/ens3/accept_ra_min_hop_limit
/proc/sys/net/ipv6/conf/ens3/accept_ra_mtu
/proc/sys/net/ipv6/conf/ens3/accept_ra_pinfo
/proc/sys/net/ipv6/conf/ens3/accept_ra_rt_info_max_plen
/proc/sys/net/ipv6/conf/ens3/accept_ra_rt_info_min_plen
/proc/sys/net/ipv6/conf/ens3/accept_ra_rtr_pref
/proc/sys/net/ipv6/conf/ens3/accept_redirects
/proc/sys/net/ipv6/conf/ens3/accept_source_route
/proc/sys/net/ipv6/conf/ens3/addr_gen_mode
/proc/sys/net/ipv6/conf/ens3/autoconf
/proc/sys/net/ipv6/conf/ens3/dad_transmits
/proc/sys/net/ipv6/conf/ens3/disable_ipv6
/proc/sys/net/ipv6/conf/ens3/disable_policy
/proc/sys/net/ipv6/conf/ens3/drop_unicast_in_l2_multicast
/proc/sys/net/ipv6/conf/ens3/drop_unsolicited_na
/proc/sys/net/ipv6/conf/ens3/enhanced_dad
/proc/sys/net/ipv6/conf/ens3/force_mld_version
/proc/sys/net/ipv6/conf/ens3/force_tllao
/proc/sys/net/ipv6/conf/ens3/forwarding
/proc/sys/net/ipv6/conf/ens3/hop_limit
/proc/sys/net/ipv6/conf/ens3/ignore_routes_with_linkdown
/proc/sys/net/ipv6/conf/ens3/keep_addr_on_down
/proc/sys/net/ipv6/conf/ens3/max_addresses
/proc/sys/net/ipv6/conf/ens3/max_desync_factor
/proc/sys/net/ipv6/conf/ens3/mc_forwarding
/proc/sys/net/ipv6/conf/ens3/mldv1_unsolicited_report_interval
/proc/sys/net/ipv6/conf/ens3/mldv2_unsolicited_report_interval
/proc/sys/net/ipv6/conf/ens3/mtu
/proc/sys/net/ipv6/conf/ens3/ndisc_notify
/proc/sys/net/ipv6/conf/ens3/ndisc_tclass
/proc/sys/net/ipv6/conf/ens3/proxy_ndp
/proc/sys/net/ipv6/conf/ens3/regen_max_retry
/proc/sys/net/ipv6/conf/ens3/router_probe_interval
/proc/sys/net/ipv6/conf/ens3/router_solicitation_delay
/proc/sys/net/ipv6/conf/ens3/router_solicitation_interval
/proc/sys/net/ipv6/conf/ens3/router_solicitation_max_interval
/proc/sys/net/ipv6/conf/ens3/router_solicitations
/proc/sys/net/ipv6/conf/ens3/seg6_enabled
/proc/sys/net/ipv6/conf/ens3/seg6_require_hmac
/proc/sys/net/ipv6/conf/ens3/stable_secret
/proc/sys/net/ipv6/conf/ens3/suppress_frag_ndisc
/proc/sys/net/ipv6/conf/ens3/temp_prefered_lft
/proc/sys/net/ipv6/conf/ens3/temp_valid_lft
/proc/sys/net/ipv6/conf/ens3/use_oif_addrs_only
/proc/sys/net/ipv6/conf/ens3/use_tempaddr
/proc/sys/net/ipv6/conf/vrrp.51
/proc/sys/net/ipv6/conf/vrrp.51/accept_dad
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_defrtr
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_from_local
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_min_hop_limit
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_mtu
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_pinfo
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_rt_info_max_plen
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_rt_info_min_plen
/proc/sys/net/ipv6/conf/vrrp.51/accept_ra_rtr_pref
/proc/sys/net/ipv6/conf/vrrp.51/accept_redirects
/proc/sys/net/ipv6/conf/vrrp.51/accept_source_route
/proc/sys/net/ipv6/conf/vrrp.51/addr_gen_mode
/proc/sys/net/ipv6/conf/vrrp.51/autoconf
/proc/sys/net/ipv6/conf/vrrp.51/dad_transmits
/proc/sys/net/ipv6/conf/vrrp.51/disable_ipv6
/proc/sys/net/ipv6/conf/vrrp.51/disable_policy
/proc/sys/net/ipv6/conf/vrrp.51/drop_unicast_in_l2_multicast
/proc/sys/net/ipv6/conf/vrrp.51/drop_unsolicited_na
/proc/sys/net/ipv6/conf/vrrp.51/enhanced_dad
/proc/sys/net/ipv6/conf/vrrp.51/force_mld_version
/proc/sys/net/ipv6/conf/vrrp.51/force_tllao
/proc/sys/net/ipv6/conf/vrrp.51/forwarding
/proc/sys/net/ipv6/conf/vrrp.51/hop_limit
/proc/sys/net/ipv6/conf/vrrp.51/ignore_routes_with_linkdown
/proc/sys/net/ipv6/conf/vrrp.51/keep_addr_on_down
/proc/sys/net/ipv6/conf/vrrp.51/max_addresses
/proc/sys/net/ipv6/conf/vrrp.51/max_desync_factor
/proc/sys/net/ipv6/conf/vrrp.51/mc_forwarding
/proc/sys/net/ipv6/conf/vrrp.51/mldv1_unsolicited_report_interval
/proc/sys/net/ipv6/conf/vrrp.51/mldv2_unsolicited_report_interval
/proc/sys/net/ipv6/conf/vrrp.51/mtu
/proc/sys/net/ipv6/conf/vrrp.51/ndisc_notify
/proc/sys/net/ipv6/conf/vrrp.51/ndisc_tclass
/proc/sys/net/ipv6/conf/vrrp.51/proxy_ndp
/proc/sys/net/ipv6/conf/vrrp.51/regen_max_retry
/proc/sys/net/ipv6/conf/vrrp.51/router_probe_interval
/proc/sys/net/ipv6/conf/vrrp.51/router_solicitation_delay
/proc/sys/net/ipv6/conf/vrrp.51/router_solicitation_interval
/proc/sys/net/ipv6/conf/vrrp.51/router_solicitation_max_interval
/proc/sys/net/ipv6/conf/vrrp.51/router_solicitations
/proc/sys/net/ipv6/conf/vrrp.51/seg6_enabled
/proc/sys/net/ipv6/conf/vrrp.51/seg6_require_hmac
/proc/sys/net/ipv6/conf/vrrp.51/stable_secret
/proc/sys/net/ipv6/conf/vrrp.51/suppress_frag_ndisc
/proc/sys/net/ipv6/conf/vrrp.51/temp_prefered_lft
/proc/sys/net/ipv6/conf/vrrp.51/temp_valid_lft
/proc/sys/net/ipv6/conf/vrrp.51/use_oif_addrs_only
/proc/sys/net/ipv6/conf/vrrp.51/use_tempaddr

The logs are this:

Apr 04 13:50:09 1.keepalived systemd[1]: Started Service for snap application keepalived.daemon.
Apr 04 13:50:09 1.keepalived Keepalived[28217]: Starting VRRP child process, pid=28221
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: Registering Kernel netlink reflector
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: Registering Kernel netlink command channel
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: Opening file '/etc/keepalived/keepalived.conf'.
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: (VI_1): Success creating VMAC interface vrrp.51
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: Registering gratuitous NDISC shared channel
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: (VI_1) Entering BACKUP STATE (init)
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: Script `chk_manual_failover` now returning 1
Apr 04 13:50:09 1.keepalived Keepalived_vrrp[28221]: VRRP_Script(chk_manual_failover) failed (exited with status 1)
Apr 04 13:50:09 1.keepalived keepalived[28233]: Transition to state 'BACKUP' on VRRP instance 'VI_1'.
Apr 04 13:50:13 1.keepalived Keepalived_vrrp[28221]: (VI_1) Entering MASTER STATE
Apr 04 13:50:13 1.keepalived Keepalived_vrrp[28221]: (VI_1) using locally configured advertisement interval (1000 milli-sec)
Apr 04 13:50:13 1.keepalived keepalived[28241]: Transition to state 'MASTER' on VRRP instance 'VI_1'.

$ keepalived -v

Keepalived v2.0.10 (11/12,2018)

Copyright(C) 2001-2018 Alexandre Cassen, <acassen@gmail.com>

Built with kernel headers for Linux 4.15.18
Running on Linux 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019

configure options: --prefix= --enable-snmp --enable-snmp-rfc LDFLAGS= -L/root/parts/keepalived/install/lib -L/root/parts/keepalived/install/usr/lib -L/root/parts/keepalived/install/lib/x86_64-linux-gnu -L/root/parts/keepalived/install/usr/lib/x86_64-linux-gnu

Config options:  LIBIPSET_DYNAMIC LVS VRRP VRRP_AUTH OLD_CHKSUM_COMPAT FIB_ROUTING SNMP_V3_FOR_V2 SNMP_VRRP SNMP_CHECKER SNMP_RFCV2 SNMP_RFCV3

System options:  PIPE2 SIGNALFD INOTIFY_INIT1 VSYSLOG EPOLL_CREATE1 IPV4_DEVCONF LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_NEWDST RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_SUPPRESS_IFGROUP FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTEXT_FILTER_SKIP_STATS FRA_L3MDEV FRA_UID_RANGE RTAX_FASTOPEN_NO_COOKIE RTA_VIA FRA_OIFNAME RTA_TTL_PROPAGATE IFA_FLAGS IP_MULTICAST_ALL LWTUNNEL_ENCAP_MPLS LWTUNNEL_ENCAP_ILA LIBIPTC LIBIPSET_PRE_V7 LIBIPVS_NETLINK IPVS_DEST_ATTR_ADDR_FAMILY IPVS_SYNCD_ATTRIBUTES IPVS_64BIT_STATS VRRP_VMAC SOCK_NONBLOCK SOCK_CLOEXEC O_PATH GLOB_BRACE INET6_ADDR_GEN_MODE VRF SO_MARK SCHED_RT SCHED_RESET_ON_FORK
@pqarmitage

Collaborator

commented Apr 4, 2019

Many thanks for the info. I'm sorry I wasn't clear about what I wanted from the proc files. Could you please run the following and post the output:

for v in ipv4 ipv6
do
    for i in all vrrp.51 ens3
    do
        for f in /proc/sys/net/$v/conf/$i/*
        do
            echo "$f = $(cat "$f")"
        done
    done
done

I've now realised that use_vmac cannot work with unicast, since the unicast_peer addresses would need to be configured on the vmac interfaces, and keepalived doesn't (at the moment) have the ability to add static addresses to vmac interfaces. Is it possible for you to use multicasting rather than unicast?

@chr4

Author

commented Apr 10, 2019

I've now realised that use_vmac cannot work with unicast, since the unicast_peer addresses would need to be configured on the vmac interfaces, and keepalived doesn't (at the moment) have the ability to add static addresses to vmac interfaces.

Oh, that explains things! I must have overlooked this in the documentation.

Unfortunately, our hosting provider doesn't allow multicast at the moment, so we're stuck with unicast :(

To summarize: Am I right that the best solution would be to use use_vmac together with multicast (I assume this would then work with systemd-networkd), and that if unicast is required, the VIP has to be attached to the primary interface (a dummy interface won't work)? Therefore, as long as systemd-networkd doesn't fix systemd/systemd#12050, it's not possible to use keepalived with systemd-networkd reliably.

Any additions?

@pqarmitage

Collaborator

commented Apr 11, 2019

If systemd-networkd doesn't interfere with IP addresses on macvlan interfaces that aren't created by it, then there might be one possibility of making this work.

If you manually create macvlan interfaces on your ens3 interfaces, and assign IP addresses to them, prior to keepalived running, then you should be able to configure keepalived to use the macvlan interfaces with the IP addresses you have assigned to the macvlan interfaces. In this case, you would not configure use_vmac.
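
For illustration, a minimal sketch of that manual setup. The interface name macvlan0 and the address (taken from the 2001:db8::/32 documentation range) are placeholders, and bridge mode is an assumption. By default the script only prints the commands; running them for real needs root.

```shell
#!/bin/sh
# Sketch only: create a macvlan on top of ens3 and give it a static address
# before keepalived starts. "macvlan0" and the address are placeholders.

# Print each command by default; set DRY_RUN=0 to actually run it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_macvlan() {
    base_if=$1   # existing parent interface, e.g. ens3
    vlan_if=$2   # name for the new macvlan interface
    addr=$3      # static address with prefix length

    run ip link add link "$base_if" name "$vlan_if" type macvlan mode bridge
    run ip addr add "$addr" dev "$vlan_if"
    run ip link set "$vlan_if" up
}

setup_macvlan ens3 macvlan0 2001:db8::10/64
```

With something like this in place, keepalived would be pointed at macvlan0 through its interface keyword, without use_vmac.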

@chr4

Author

commented Apr 11, 2019

Hmm, manually creating a macvlan (in bridge mode) works, but the VIP is not reachable, most likely because my provider only allows one MAC address per interface (?).

When using ipvlan instead, it seems to work fine though, including the unsolicited neighbour adverts (ipv6). Is there a reason not to use ipvlan in production? The general recommendation seems to be to use macvlan.

The ipvlan configuration:

ip link add link ens3 keepalived0 type ipvlan mode l2
ip link set keepalived0 up

interface ens3

# Doesn't seem to make a difference whether or not to set this
# vmac_xmit_base

virtual_ipaddress {
    [...]:1111:1 dev keepalived0
}
@pqarmitage

Collaborator

commented Apr 11, 2019

I can see no reason why using an ipvlan should not work. The reason I suggested macvlan is that I have used them, but I haven't used ipvlans and didn't think of doing so.

vmac_xmit_base is for when use_vmac is specified, which causes keepalived to create a macvlan; with vmac_xmit_base set, the adverts are then sent on the base interface.

If you have this working now, can the issue be closed?

@chr4

Author

commented Apr 11, 2019

Yes, I think I can manage to sleep well at night now :)

Thanks so much for the detailed and fast help! Keep up the good work 🥇

@chr4 chr4 closed this Apr 11, 2019

@chr4

Author

commented Apr 18, 2019

Short followup:
I'm currently thinking about implementing something like a use_ipvlan flag that works similarly to use_vmac but uses ipvlan instead. Would that be worth working on a pull request for?

That would simplify the setup quite a bit for people that can't use VMACs, as a lot of carriers do not allow multiple MAC addresses per interface.

@pqarmitage

Collaborator

commented Apr 18, 2019

@chr4 I had thought about that as an option myself, so yes, I think it is a good idea. Providing a patch via a pull request would be very helpful; I would be quite happy to provide feedback on any patch if necessary.

A couple of differences to consider for ipvlans vs macvlans:

  • If an ipvlan is created on top of a macvlan/macvtap, that macvlan/macvtap becomes its parent, whereas if a macvlan is created on top of a macvlan/macvtap, its parent becomes the parent of that macvlan/macvtap.
  • The ipvlan interface will need to have an IP address configured and assigned to it.
@chr4

Author

commented Apr 24, 2019

Thanks for the input. Some further research suggests that IPVLAN only works properly when attaching interfaces to a specific network namespace.

In my tests it works quite well with ipv6, but communication isn't possible with ipv4 (pings are received by the parent interface, but never replied to). When I attach the interface to a namespace and use keepalived's net_namespace, pinging works, but I cannot access the HA service in the default namespace. Also, both nodes go into MASTER mode and apparently VRRP doesn't work, as the interface needs to be set to the IPVLAN interface inside the namespace.

Documentation seems to be really scarce? The kernel documentation only provides examples including namespaces.

@pqarmitage

Collaborator

commented Apr 25, 2019

I have manually created an ipvlan interface in each of two systems, not using network namespaces, and am successfully running keepalived over the ipvlan interfaces.

The commands I used to create the interfaces were:

ip link add link BASE_IF ipvlan0 type ipvlan mode l2
ip addr add 2.3.4.n/24 brd + dev ipvlan0
ip link set ipvlan0 up

I then specify interface ipvlan0 in the keepalived config (and don't specify use_vmac since it is not possible to create a macvlan on top of an ipvlan), and keepalived works as normal.
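
For illustration, a minimal sketch of what the keepalived side of that could look like. The function only prints a vrrp_instance block that points at ipvlan0 and leaves out use_vmac. The instance name VI_1, router ID 51 and 1-second advert interval are borrowed from the logs and interface names earlier in this thread; the state, priority and the VIP 2.3.4.100/24 are made-up placeholders.

```shell
#!/bin/sh
# Sketch only: print a vrrp_instance block that runs VRRP over a manually
# created ipvlan interface. No use_vmac here, since a macvlan cannot be
# created on top of an ipvlan.

print_vrrp_instance() {
    vrrp_if=$1   # the manually created ipvlan interface, e.g. ipvlan0
    vip=$2       # virtual IP with prefix length (placeholder value below)

    cat <<EOF
vrrp_instance VI_1 {
    state BACKUP
    interface $vrrp_if
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        $vip
    }
}
EOF
}

print_vrrp_instance ipvlan0 2.3.4.100/24
```

The printed block is meant to be merged into an existing keepalived.conf by hand; the script does not write any files.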

@chr4

Author

commented Apr 29, 2019

I've been using the exact same commands, even copied and pasted yours, but I can't get it to work. The interface is created successfully, and the VIP is assigned correctly. But incoming ICMPs (let alone other packets) are just not replied to. I've double-checked the /proc/sys/net entries, but they all look normal. Not sure what's happening here.

I'm currently testing in an OpenStack environment. I thought the IPVLAN approach was transparent to the underlying network hardware, but there might be something I've missed? What's weird is that I can see the incoming packets with tcpdump (ICMP as well as the TCP SYNs), but the kernel somehow decides not to answer them (no idea why it works when using namespaces). Well, I thought there might be some other limitation I'm not aware of...

@johannbg

commented May 8, 2019

Just for information's sake, for upstream and other interested parties:

The issue with networkd pruning floating virtual IPs for high-availability environments is being addressed in [1].

Note downstreams will need to backport this fix once it has landed.

  1. systemd/systemd#12511

pqarmitage added a commit to pqarmitage/keepalived that referenced this issue May 13, 2019

Add support for use_ipvlan (use an ipvlan i/f similar to use_vmac)
Issue acassen#1170 identified that use_vmac didn't work with systemd-networkd
since systemd-networkd was removing IP addresses created by keepalived
(and any other application). It was discovered that systemd-networkd
did not remove IP addresses from ipvlans.

This commit adds support for ipvlans, both to work around the problem,
and because it might have other uses.

Systemd commit - systemd/systemd#12511 has added
configuration options to stop systemd-networkd removing IP addresses
added by other applications, but it is not merged yet, and it will be a
while before all the distros merge it.

Signed-off-by: Quentin Armitage <quentin@armitage.org.uk>
@pqarmitage

Collaborator

commented May 13, 2019

Commit 897690b adds support for IPVLANs, which may help until systemd/systemd#12511 is merged into the distros.

I have tested this on kernels from 4.16 through to 5.0 with both IPv4 and IPv6 and it works successfully for me. @chr4 if you try using this and have problems making it work, please log them in this issue, and if necessary we will reopen the issue.

@johannbg

commented May 13, 2019

Note that adding a [DHCP] section and setting CriticalConnection=yes in that [DHCP] section (even if you do not use DHCP) should work as a workaround. (I have not managed to find the time to test this myself, though, so someone here might be able to confirm or deny whether it works for them.)
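
For illustration, a minimal sketch of how that workaround could be written as a systemd-networkd drop-in. It is untested against a real keepalived setup. The drop-in file name keepalived-vip.conf and the example target directory are assumptions; the directory has to match the name of the .network file that actually configures the interface on the host. By default the script writes into a scratch directory so it is safe to run as-is.

```shell
#!/bin/sh
# Sketch only: write a systemd-networkd drop-in carrying the
# CriticalConnection=yes workaround. File and directory names are assumptions.

write_dropin() {
    dropin_dir=$1   # e.g. /etc/systemd/network/<name>.network.d
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/keepalived-vip.conf" <<'EOF'
[DHCP]
CriticalConnection=yes
EOF
}

# Default to a scratch directory. On a real host the target might be
# something like /etc/systemd/network/10-netplan-ens3.network.d
# (hypothetical name; check which .network file exists first).
target=${DROPIN_DIR:-$(mktemp -d)}
write_dropin "$target"
cat "$target/keepalived-vip.conf"
```

After writing the drop-in on a real host, systemd-networkd would need to be restarted for it to take effect.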
