IPv6 and IPv4 not working together on Keepalived 1.3.2 #497

Closed
amitesh88 opened this Issue Jan 17, 2017 · 13 comments

@amitesh88

amitesh88 commented Jan 17, 2017

We need to configure multiple IPv4 and IPv6 addresses on keepalived, which is why we switched over to Keepalived 1.3.2. I am running Keepalived 1.3.2 on CentOS 7.2. On this version I am facing an issue when IPv4 and IPv6 are configured together: it always assigns the IPv4 addresses only, and not the IPv6 ones.
However, whenever I use only IPv4 or only IPv6, keepalived works with proper failover.

Earlier, with keepalived 1.2.13, the server went into a hang state when configured with multiple IPv4 and IPv6 addresses; with a single IPv4 and a single IPv6 address keepalived worked perfectly fine.
Can anyone help?

@pqarmitage

Collaborator

pqarmitage commented Jan 17, 2017

It is not possible to configure both IPv4 and IPv6 addresses as virtual_ipaddresses in a single vrrp_instance; the reason is that the VRRP protocol doesn't support it. If you need to associate both IPv4 and IPv6 addresses with a single vrrp_instance, then configure the addresses of one family in a virtual_ipaddress_excluded block.
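
For illustration, a minimal sketch of that approach (the interface name, router ID and addresses here are placeholders, not taken from any real configuration):

vrrp_instance VI_1 {
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.10.100          # IPv4 addresses determine the family of the instance
    }
    virtual_ipaddress_excluded {
        2001:db8::100           # IPv6 addresses are added/removed with the instance but not advertised
    }
}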

Although earlier versions of keepalived didn't complain if both IPv4 and IPv6 addresses were configured, such a configuration didn't work properly.

Probably a better solution than using a virtual_ipaddress_excluded block is to configure two vrrp instances, one for IPv4 and one for IPv6, and if it helps, they can both use the same virtual_router_id since VRRP IPv4 and IPv6 instances are completely independent of each other. If you want to ensure that the IPv4 and IPv6 vrrp_instances are always in the same state as each other, then you can configure a vrrp_sync_group to include both the instances. Then if one of the instances goes into fault state, the other instance will be forced to fault state as well.
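
As a rough sketch of that arrangement (instance names, VRID and addresses are again only placeholders), the two instances and the optional sync group could look something like this:

vrrp_sync_group SG_1 {
    group {
        VI_4        # IPv4 instance
        VI_6        # IPv6 instance
    }
}

vrrp_instance VI_4 {
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.10.100
    }
}

vrrp_instance VI_6 {
    interface eth0
    virtual_router_id 51        # the same VRID is fine; IPv4 and IPv6 instances are independent
    priority 100
    virtual_ipaddress {
        2001:db8::100
    }
}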

If this doesn't resolve your issue, could you please post a copy of your configuration so that specific suggestions can be made?

@amitesh88

amitesh88 commented Jan 17, 2017

Thanks a lot pqarmitage for the quick and accurate reply.
We have added 4 IPv4 addresses in the virtual_ipaddress block and 4 IPv6 addresses in the virtual_ipaddress_excluded block,
and now CentOS 7.2 takes both IPv4 and IPv6 together and failover is also happening smoothly.
We will keep observing for some more time.
Thanks a lot 👍 🥇

@amitesh88

amitesh88 commented Jan 17, 2017

Hello pqarmitage, we have implemented the required changes and it was working fine, but now whenever we start or stop keepalived the server goes into a hang state. Attached are the keepalived config files of the two servers for your reference.
Services used:
systemctl start ipvsadm
systemctl start nginx
systemctl start keepalived
Kindly help.
Thanks in advance

check_nginx.txt
keepalived.conf.1.txt
keepalived.conf.2.txt

@pqarmitage

Collaborator

pqarmitage commented Jan 18, 2017

First of all, I am not clear what you mean by "the server goes in hang state". Is it a process that hangs, or is it the whole system? What logs are there that relate to the 'hang'? It would be helpful if you could post the log entries.

I note that the files you have attached are DOS format files, i.e. each line ends with a carriage return/line feed pair. Can you confirm whether or not the actual files (the config files and the check_nginx script) contain the carriage return character at the end of each line.

Looking at your configurations, there are a few changes that could be made:

  1. Remove the dev bond1 from the virtual IP addresses (including the excluded ones). It defaults to the interface of the vrrp instance (this won't make a difference, but the config looks simpler).
  2. Remove the /24s and /64s from the virtual IP addresses (including the excluded ones). You are adding the specific addresses, not creating the whole subnetwork (this can make a difference).
  3. There is no keyword lvs_sync_daemon_inteface. You should be specifying lvs_sync_daemon bond1 V41.
  4. The keyword version is not valid in the global_defs block (it should be vrrp_version).
  5. There is no need for state MASTER or state BACKUP.
  6. Authentication is not valid with VRRP version 3.
There may be other errors, but these are the ones I have noticed.
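
Pulling these points together, a minimal sketch of what the relevant parts might look like (the router ID, priority and addresses below are assumptions for illustration, not values from the attached files):

global_defs {
    vrrp_version 3                # point 4: 'vrrp_version', not 'version'
    lvs_sync_daemon bond1 V41     # point 3: replaces 'lvs_sync_daemon_inteface'
}

vrrp_instance V41 {
    interface bond1
    virtual_router_id 41          # assumed value
    priority 98                   # assumed value
    # points 5 and 6: no 'state MASTER'/'state BACKUP' and no authentication block
    virtual_ipaddress {
        192.168.10.100            # assumed address; no 'dev bond1', no '/24' (points 1 and 2)
    }
    virtual_ipaddress_excluded {
        2001:db8::100             # assumed address; no 'dev bond1', no '/64'
    }
}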

Since you have some invalid keywords configured, it would appear that you are not checking the keepalived logs to see what errors it is reporting. You need to check and resolve all the reported configuration errors.

The check_nginx script is interesting:

#!/bin/bash
counter=$(ps -C nginx --no-heading|wc -l)
if [ "${counter}" = "0" ]; then
/usr/bin/systemctl stop nginx
    sleep 1 
    counter=$(ps -C nginx --no-heading|wc -l)
    if [ "${counter}" = "0" ]; then
        /usr/bin/systemctl stop keepalived
    fi
fi

First of all, I would modify it as follows:

#!/bin/bash

counter=$(ps -C nginx --no-heading|wc -l)
if [[ ${counter} -eq 0 ]]; then
    /usr/bin/systemctl stop nginx
    sleep 1 
    counter=$(ps -C nginx --no-heading|wc -l)
    if [[ ${counter} -eq 0 ]]; then
        /usr/bin/systemctl stop keepalived
    fi
fi
exit 0

The main substance of the change is that it returns a specific exit code. keepalived uses the exit code to determine whether the script has succeeded or failed, so an explicit exit code should be returned. The other changes are a matter of semantics: $counter is a number rather than a character string, so it is compared numerically.

The next issue with the script is this: if the first value of $counter is 0 (which means nginx is not running), you attempt to stop the nginx service, but since it is not running it is presumably already stopped. The script then checks again whether nginx is running; since it wasn't running before, and all you have attempted to do is stop the (non-running) service, the second check of $counter will always find 0, and hence keepalived will always be stopped if nginx is not running.

What I think would be more conventional is, if nginx is not running, to return a non-zero exit code so the script fails; keepalived will then reduce the priority of the vrrp instances by 5, due to the weight -5 statement in the vrrp_script block of the config. As you have keepalived configured, if nginx fails on 192.168.10.90, the priority of V41 will reduce to 93, so 192.168.10.91 will take over as master (with priority 97), and 192.168.10.91 will remain master for V43. On the other hand, if nginx fails on 192.168.10.91, 192.168.10.90 will remain master for V41 (with 192.168.10.91 reducing its priority for V41 to 92); however, in this case 192.168.10.91 will also remain master for V43, since its priority will reduce to 90, which is still greater than the priority of V43 on 192.168.10.90.
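
For reference, the weight -5 mechanism being described corresponds to a vrrp_script/track_script configuration along these lines (the script path and interval below are assumptions, not values from your attached files):

vrrp_script chk_nginx {
    script "/etc/keepalived/check_nginx.sh"   # assumed path
    interval 2                                # assumed interval
    weight -5        # subtract 5 from the instance priority while the script fails
}

vrrp_instance V41 {
    # ... rest of the instance configuration ...
    track_script {
        chk_nginx
    }
}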

The check_nginx script would then become:

#!/bin/bash

counter=$(ps -C nginx --no-heading|wc -l)
if [[ ${counter} -eq 0 ]]; then
    /usr/bin/systemctl stop nginx
    exit 1
fi
exit 0

This way, keepalived remains running on both systems, but if nginx fails on 192.168.10.90, 192.168.10.91 will take over as master of V41, and 192.168.10.90 will become backup.

@amitesh88

amitesh88 commented Jan 20, 2017

check_nginx.txt
keepalived.conf.1.txt
keepalived.conf.2.txt
keepalived.conf.3.txt
keepalived.conf.4.txt

Hi pqarmitage,
Thank you so much for your valuable suggestions.
Let me first clear up the doubts you raised in your comment above.

  1. 'Server goes in hang state' means the physical server hangs.
  2. The last log entry we get in /var/log/messages before the server hangs is simply that the
    keepalived service stopped.
  3. Yes, the files I attached were converted to text format, and that is where they picked up the DOS format.
    The actual files are normal Linux files.

Now for our new implementation, which hopefully looks successful.
We made the changes below in sysctl.conf on the older version, i.e. keepalived 1.2.13, and it worked :)
net.ipv4.ip_forward = 1
net.ipv6.conf.bond1.forwarding = 1
net.ipv6.conf.bond1.accept_source_route = 1
net.ipv6.conf.bond1.accept_redirects = 1

After achieving this, we also made the changes you recommended, which you can find in the attached files. To be clear, the earlier configuration we shared was for the upgraded keepalived 1.3.2, and these changes are for the older keepalived version, i.e. 1.2.13.
We need your insights on this new configuration.

@pqarmitage

Collaborator

pqarmitage commented Jan 22, 2017

You still have invalid keywords in your config, which will be reported in the log file, but it appears that you are not checking that. You've added weight to global_defs, which is not valid, and you still haven't resolved lvs_sync_daemon_inteface, which I mentioned in a previous post. Since you don't appear to be using LVS, I suggest removing both keywords from the global_defs block.

  1. Since VRRP version 3 is the current standard, I would recommend adding vrrp_version 3 in global_defs, and removing the authentication blocks.

  2. Remove state BACKUP and state MASTER, they aren't achieving anything.

  3. Since you are running keepalived across 4 systems, and they are all on the same subnetwork, I would use multicasting rather than unicasting. To achieve this, simply delete the unicast_src_ip and unicast_peer statements/blocks. This will allow keepalived to operate in conformance with the VRRP RFC and makes the configuration simpler.

Since you haven't described what it is you are wanting to achieve, nor your environment, I cannot comment on the suitability of your configurations. From your configurations, it appears that under normal circumstances each system will be master for one of the vrrp instances, and backup for the other 3 instances, but if nginx is not running on a system, then that system will be backup for all vrrp instances. This will work fine unless nginx is not running on any of the systems, in which case the masters will be the same as if nginx is running on all the systems.

I'm not clear why you need 4 IPv4 addresses and 4 IPv6 addresses per vrrp instance, but that is presumably due to your local requirements.

It might be that you would be better off using the IPVS functionality of keepalived for what you are trying to achieve, but without knowing what you are trying to achieve overall I cannot say.

I hope the above helps.

@amitesh88

amitesh88 commented Feb 1, 2017

Hi pqarmitage,
We are using Nginx as a reverse proxy and load balancer, and keepalived is used to assign multiple IPs (IPv4 and IPv6). We intend our load balancers to have multiple IPs so that each server can handle multiple connections.
Right now we have 70 million users and we are expecting more in the future.
Currently we are using 8 Nginx load balancers with keepalived (a 4+4 cluster).
Below is the modified keepalived configuration on one of the servers.
Thanks for your help and guidance. Please suggest if something more needs to be polished in our configuration.
keepalived.conf.bkp.txt
check_nginx.sh.txt

@pqarmitage

Collaborator

pqarmitage commented Feb 13, 2017

In file keepalived.conf.bkp.txt at line 2, the keyword should be vrrp_version and not vrrp-version.

The check_nginx.sh script appears to check whether nginx has failed, and if so it stops the nginx service. The problem is that thereafter the check_nginx.sh script runs again once a second, and each time it will again attempt to stop the already stopped nginx service.

Are you finding that the nginx processes are unreliable and that you need this type of checking for the service having failed?

Do you have a separate process/procedure for restarting the nginx service after it has failed? Otherwise you'll eventually end up with all the keepalived instances in fault state.

@pqarmitage

Collaborator

pqarmitage commented Mar 19, 2017

Closing due to no response for over 1 month.

@pqarmitage pqarmitage closed this Mar 19, 2017

@deshui123

deshui123 commented May 18, 2018

Hi All,
We also need to configure multiple IPv4 and IPv6 on keepalived.

In keepalived 1.2.13 we use only IPv4, and the config is ipv4_keepalivbed.txt.
With this config it's multi-master mode: all VIPs are active and assigned to different nodes. If one node goes down, the VIPs on that node transfer to the other active nodes according to priority. In our testing the failover happened smoothly.

In order to configure multiple IPv4 and IPv6 addresses, we use keepalived 1.3.5. I want to achieve the same behaviour -- the smooth failover we had with IPv4 only.
Following your discussion, I want to add virtual_ipaddress_excluded to the conf, as in ipv4_6_keepalivbed.txt.

Could you please help check the new conf (ipv4_6_keepalivbed.txt)?
Does it work for multiple IPv4 and IPv6?

ipv4_6_keepalivbed.txt
ipv4_keepalivbed.txt

ENV: Redhat 3.10.0-693.17.1.el7.x86_64

@pqarmitage

Collaborator

pqarmitage commented May 18, 2018

I think it would be better to use separate vrrp instances for the IPv6 addresses, rather than the virtual_ipaddress_excluded blocks, so for example VI_5 could be split into VI_5 for IPv4 and VI6_5 for IPv6.
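
A rough sketch of that split (the interfaces, VRID and priorities here are assumptions; only the addresses come from this thread):

vrrp_instance VI_5 {
    interface eth1              # assumed interface
    virtual_router_id 55        # assumed VRID
    priority 100                # assumed priority
    virtual_ipaddress {
        192.16.1.89
    }
}

vrrp_instance VI6_5 {
    interface eth0
    virtual_router_id 55        # the VRID may be reused; IPv4 and IPv6 instances are independent
    priority 100
    virtual_ipaddress {
        fe80::f816:3eff:fede:4c1a
    }
}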

keepalived will track the interface that the vrrp_instance is configured on, so the track_interface blocks are unnecessary.

Since the IPv6 addresses are being added on eth0, would it be better for the IPv6 vrrp instances to use eth0? Also note that IPv6 instances use VRRP version 3 which doesn't support authentication.

Do you want the vrrp instances for IP address 192.16.1.89 and fe80::f816:3eff:fede:4c1a to be synchronised? If so, a sync group can be used, as I have done in ipv4_6_keepalivbed_sg.txt and ipv4_6_keepalivbed_sg1.txt.

The maximum authentication password length is 8 characters, so I have truncated them. Also, do you really want/need authentication? VRRP version 3 removed it since it wasn't seen to be of any benefit.

state BACKUP is unnecessary - it is the default, so I have removed them.

Labels do not work with IPv6 addresses, so I have removed them.

I have attached 3 updated versions of your configuration.

ipv4_6_keepalivbed_1.txt creates separate IPv6 vrrp instances for the IPv6 addresses.

ipv4_6_keepalivbed_sg.txt adds sync groups to synchronise the vrrp instances VI_5 and VI6_5 etc.

ipv4_6_keepalivbed_sg1.txt uses a single sync group for all 6 vrrp instances, since they are all tracking the same items (eth0, eth1, and the 4 track scripts).

If you were using the code from the beta branch, the specifications of the track scripts could be moved into the vrrp_sync_group definition to avoid specifying them against each vrrp instance.

I hope that helps.

@deshui123

deshui123 commented May 21, 2018

Thank you very much for your detailed and patient feedback.
My keepalived version is
[root@h-mul-worker-edge-01 ~]$ keepalived -v
Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2

Copyright(C) 2001-2017 Alexandre Cassen, acassen@gmail.com

Build options: PIPE2 LIBNL3 RTA_ENCAP RTA_EXPIRES FRA_OIFNAME FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK LIBIPTC LIBIPSET_DYNAMIC LVS LIBIPVS_NETLINK VRRP VRRP_AUTH VRRP_VMAC SOCK_NONBLOCK SOCK_CLOEXEC FIB_ROUTING INET6_ADDR_GEN_MODE SNMP_V3_FOR_V2 SNMP SNMP_KEEPALIVED SNMP_CHECKER SNMP_RFC SNMP_RFCV2 SNMP_RFCV3 SO_MARK
How do I hide the “Build options” output?
I will remove "track_interface", "authentication" and the label on the IPv6 addresses.
Still have some questions:

  1. Since VRRP IPv4 and IPv6 instances are completely independent of each other, the virtual_router_id of an IPv4 and an IPv6 instance can be the same or different, and it doesn't matter, right?
  2. For vrrp_instance VI6_5 in ipv4_6_keepalivbed_1.txt, if the interface of the IPv6 instance is eth0 and I want to track eth1, then the configuration would be something like:
    interface eth0 track_interface { eth1 } virtual_ipaddress { fe80::f816:3eff:fede:4c1a dev eth0 }
    If the IPv6 address is on eth0 and I want eth1 as the interface, then the configuration would be something like:
    interface eth1 virtual_ipaddress { fe80::f816:3eff:fede:4c1a dev eth0 }
    Am I right?
    Could you please share some knowledge about interface, track_interface and the "dev interface" part of virtual_ipaddress?
  3. Comparing ipv4_6_keepalivbed_sg.txt and ipv4_6_keepalivbed_sg1.txt, the difference is that there is only a single sync group in ipv4_6_keepalivbed_sg1.txt.
    But with all VIPs in the same vrrp_sync_group, if one instance goes into fault state the other instances will be forced into fault state as well, right? I want to achieve multi-master mode, so maybe a vrrp_sync_group is not needed in my case.
  4. Can I prepare separate config files for each instance and include these configs in keepalived.conf?
    It would be much better if you could provide such an example.

Thanks for your examples, they deepened my understanding of keepalived.

@pqarmitage

Collaborator

pqarmitage commented May 25, 2018

@deshui123 You have asked a number of questions; answers are below:

How do I hide the “Build options” output?
keepalived -v | grep -v "Build options"

1. Since VRRP IPv4 and IPv6 instances are completely independent of each other, the virtual_router_id of an IPv4 and an IPv6 instance can be the same or different, and it doesn't matter, right?
Yes, that is correct. As RFC5798 says:

Within a VRRP router, the virtual routers in each of the IPv4 and IPv6 address families are a domain unto themselves and do not overlap.

That is, for example, there could be an IPv4 vrrp instance using VRID 1 and an IPv6 vrrp instance using VRID 1 both running on eth0. There could also be an IPv4 vrrp instance using VRID 1 and an IPv6 vrrp instance using VRID 1 both running on eth1, as well as an IPv4 vrrp instance using VRID 2 and an IPv6 vrrp instance using VRID 2 both running on eth0. In other words, the pair of IP protocol and VRID must be unique on a given link-layer network (i.e. multicast domain).

2. Use of track_interface

interface eth0
track_interface {
    eth1
}
virtual_ipaddress {
    fe80::f816:3eff:fede:4c1a dev eth0
}

This means that the vrrp instance will run over eth0 (the interface eth0 statement), in other words the VRRP adverts will be sent over eth0 and adverts will be listened for on eth0. The vrrp instance will track both eth0 and eth1 (by default the vrrp instance will track its own interface). Tracking interfaces means that if any of the tracked interfaces goes down the vrrp instance will transition to FAULT state. The dev eth0 on the virtual_ipaddress entry is not needed, since keepalived will add virtual addresses to the interface of the vrrp instance by default.

interface eth1
virtual_ipaddress {
    fe80::f816:3eff:fede:4c1a dev eth0
}

This means that the vrrp instance will use eth1, but that the IP address will be added to eth0. You need to consider whether you really want this; if the IP address is to be added to eth0 would it be better to run VRRP over eth0 for this instance?

Could you please share some knowledge about interface, track_interface and the "dev interface" part of virtual_ipaddress?
interface is the interface over which the VRRP protocol will run, i.e. the interface over which VRRP adverts are sent and received for this instance. This interface will automatically be tracked by the VRRP instance (i.e. if the interface goes down the VRRP instance will go to fault state).
track_interface is to add additional interfaces for the vrrp instance to track. For example, if the virtual router is configuring a virtual address on eth0 but it is forwarding received packets over eth1, then the router won't work if eth1 goes down, so it would want to track that interface as well.
dev interface on a virtual_ipaddress specifies the interface that the virtual ipaddress is to be configured on if it is not the interface of the vrrp instance itself.

Comparing ipv4_6_keepalivbed_sg.txt and ipv4_6_keepalivbed_sg1.txt, the difference is that there is only a single sync group in ipv4_6_keepalivbed_sg1.txt.
But with all VIPs in the same vrrp_sync_group, if one instance goes into fault state the other instances will be forced into fault state as well, right? I want to achieve multi-master mode, so maybe a vrrp_sync_group is not needed in my case.

I'm not sure that generally it makes much sense to have IPv4 and IPv6 vrrp instances in the same sync group (although it will work) since IPv4 and IPv6 are completely separate addressing and routing domains.
An example of where you might want a sync group is as follows:

------------------- Network 1 -----------------  VRID1 network 1 VIP 10.0.1.254
       | eth1                        |  eth1
  ------------                  ------------
  | Router 1 |                  | Router 2 |
  ------------                  ------------
       | eth2                        | eth2
------------------- Network 2 -----------------   VRID1 network 2, VIP 10.0.2.254

If Router 2 is a backup for Router 1 and the routers are responsible for routing between Network 1 and Network 2, then if Router 1 loses its connection to Network 1 it can no longer forward traffic between the two networks. Router 1 will cease being master on Network 1 if eth1 goes down, but it must also stop being master on Network 2 so that Router 2 can take over both VIPs.

Configuration could be:

# Router 1

vrrp_sync_group SG_1 {
    group {
        VI_1
        VI_2
    }
}

vrrp_instance VI_1 {
    interface eth1
    virtual_router_id 1
    priority 250     # high priorities cause quicker transition to master
    virtual_ipaddress {
        10.0.1.254
    }
}

vrrp_instance VI_2 {
    interface eth2
    virtual_router_id 1
    priority 250     # high priorities cause quicker transition to master
    virtual_ipaddress {
        10.0.2.254
    }
}
# Router 2

vrrp_sync_group SG_1 {
    group {
        VI_1
        VI_2
    }
}

vrrp_instance VI_1 {
    interface eth1
    virtual_router_id 1
    priority 240     # high priorities cause quicker transition to master
    virtual_ipaddress {
        10.0.1.254
    }
}

vrrp_instance VI_2 {
    interface eth2
    virtual_router_id 1
    priority 240     # high priorities cause quicker transition to master
    virtual_ipaddress {
        10.0.2.254
    }
}

But with all VIPs in the same vrrp_sync_group, if one instance goes into fault state the other instances will be forced into fault state as well, right? I want to achieve multi-master mode, so maybe a vrrp_sync_group is not needed in my case.
Yes, if one instance goes to fault state, then all the instances in the same sync group will. For multi-master mode you don't want all vrrp_instances in one sync group. The way to think about it is:

If, whenever Instance A isn't master, Instance B must not be master, and whenever Instance B isn't
master, Instance A mustn't be master, then Instance A and Instance B should
be in the same sync group.

Can I prepare separate config files for each instance and include these configs in keepalived.conf?
It would be much better if you could provide such an example.

Suppose you have two routers with system names router0.domain.net and router1.domain.net; you can then use conditional configuration to use the same configuration file on both systems. The conditional names will default to router0 and router1, but these names can be overridden using the -i NAME command line option.
I've had to guess at how you want to configure it, but I've set up the configuration on the basis that, as first choice, router0 should be master for IPv4 and router1 should be master for IPv6. I have also assumed that you want all IPv4 instances to be synchronised, and likewise all IPv6 instances.
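
As a very small sketch of the conditional-configuration idea (assuming the '@config-id' line-prefix form, with addresses, VRIDs and priorities chosen purely for illustration), lines prefixed with @router0 or @router1 apply only on the matching system:

vrrp_instance VI_4 {
    interface eth0
    virtual_router_id 51
    @router0 priority 200      # router0 preferred master for IPv4
    @router1 priority 150
    virtual_ipaddress {
        10.0.0.254
    }
}

vrrp_instance VI_6 {
    interface eth0
    virtual_router_id 51
    @router0 priority 150
    @router1 priority 200      # router1 preferred master for IPv6
    virtual_ipaddress {
        2001:db8::254
    }
}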

I've attached an example configuration file that could be used on both router0 and router1 that uses templating and replaceable parameters. Some of the functionality such as the templating and track_scripts in sync groups needs functionality only available in the beta branch of keepalived.

ipv4_6_keepalivbed_sg.txt

This may not do what you want to achieve, but it should give you some ideas. To get a feeling for what the configuration generates, you could run keepalived with the -d option and look at the logs which will contain the generated configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment