VRRP HLD #475

Closed
wants to merge 10 commits into from

@dks19 dks19 commented Sep 30, 2019

Hi all,
This is the high-level design for the VRRP feature in SONiC. Currently, VRRPv2 along with interface tracking is supported. Please review and provide feedback/comments.

Thanks,
Dilip

@msftclas

msftclas commented Sep 30, 2019

CLA assistant check
All CLA requirements met.

The Virtual Router Redundancy Protocol (VRRP) functionality is designed to eliminate the single point of failure inherent in the static default routed environment. VRRP specifies an election protocol that dynamically assigns responsibility for the gateway router to a VRRP instance on one of the routers on a LAN. The VRRP instance controlling the IP address(es) associated with a virtual router is called the Master, and routes the traffic. The election process provides dynamic fail-over in the forwarding responsibility should the Master become unavailable. Any of the virtual router's IP addresses on a LAN can then be used as the default first hop router by end-hosts. The advantage gained from using VRRP is a higher availability default path without requiring configuration of dynamic routing or router discovery protocols on every end-host.

It would be good to list out the possible use cases for which a SONiC user would enable this feature. Can this work in data center MLAG-style deployments?

The use cases are networks that need a backup for the logical gateway. The network could be like one of those described below, or serve a similar use case:

  • A 3-tier network (access, distribution and core layers), where gateway redundancy is needed in the distribution layer.
  • A Layer 3 CLOS network, where redundancy is needed for the first hop gateways.

Generally a static anycast gateway is used in MLAG deployments, but yes, VRRP could also be used in an MLAG deployment.


The following requirements are beyond the scope of this release.

1. VRRPv3 (IPv6) support

What is preventing VRRPv3 support?
How is it different from FRR VRRP? Since FRR is part of SONiC stack, can't we stick to FRR VRRP (one stack)?

VRRPv3 (IPv6) support has now been added.

Since keepalived VRRP is robust, feature-rich and well maintained in the Linux community, keepalived VRRP was chosen over FRR VRRP.


### 3.2.6 Uplink Interface Tracking

Interfaces other than the VRRP instance interface can be tracked for up/down events. When interface tracking is enabled in the VRRP instance configuration, the tracked interface's operational status is monitored. When an operational-down event is detected on a tracked interface, the track-priority/weight is subtracted from the router’s current priority value. Similarly, when an operational-up event is detected on the tracked interface, the track-priority/weight is added to the router’s current priority value.
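
As a rough illustration of the arithmetic described above, the following sketch computes a router's effective priority from its configured priority and the state of its tracked interfaces. It follows the "subtract on down" wording of this section. The function and class names, the clamping to the 1..254 range, and the interface names are assumptions made for the example, not part of the HLD; the limit of 8 tracked interfaces is the one quoted in this review thread.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed bounds: RFC 3768 reserves 0 and 255, so a backup advertises 1..254.
MIN_PRIORITY = 1
MAX_PRIORITY = 254
MAX_TRACKED = 8  # per-instance limit quoted in this review thread


@dataclass
class TrackedInterface:
    name: str      # hypothetical interface name, e.g. "Ethernet0"
    weight: int    # track-priority/weight configured for this interface
    oper_up: bool  # current operational status


def effective_priority(configured: int, tracked: Sequence[TrackedInterface]) -> int:
    """Subtract the weight of every tracked interface that is down."""
    if len(tracked) > MAX_TRACKED:
        raise ValueError("at most %d interfaces can be tracked" % MAX_TRACKED)
    penalty = sum(t.weight for t in tracked if not t.oper_up)
    # Clamping is an assumption; the HLD does not say what happens at the bounds.
    return max(MIN_PRIORITY, min(MAX_PRIORITY, configured - penalty))


uplinks = [
    TrackedInterface("Ethernet0", weight=20, oper_up=True),
    TrackedInterface("Ethernet4", weight=30, oper_up=False),
]
print(effective_priority(100, uplinks))  # 100 - 30 = 70
```

With both uplinks up the router keeps its configured priority of 100; once the combined weight of the failed uplinks pushes it below a peer's priority, the peer can win the next election.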

How does uplink tracking work? For instance, if there are more than 8 uplink interfaces, how does that affect mastership?

Uplink tracking affects the overall priority of the VRRP router. If the tracked interface is in the 'UP' state, the VRRP priority of the router is increased by the weight associated with the tracked interface.

Currently only 8 interfaces can be tracked per VRRP instance. Please let us know if there is a real use case for tracking more than 8 interfaces per VRRP instance.

Note that multiple VRRP instances can be configured on a VRRP interface.


## 7 Warm Reboot Support

Currently, warm-reboot is not supported for VRRP. That is, warm-reboot will simply restart the VRRP docker without VRRP storing any data for warm restart.

We should prevent warm reboot if there is VRRP configuration in place.

In the new design, VRRP masters will relinquish mastership of their VRRP instances (by sending priority 0 advertisements, so that the standby can take over mastership almost immediately) before going into warm-reboot. During warm-reboot the neighboring routers will be master. After warm-reboot the rebooted router can claim back mastership.

This way forwarding is kept functional (and unchanged) during the warm-reboot process.
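
To put a number on how quickly a standby can take over, the sketch below applies the VRRPv2 timer definitions from RFC 3768: a backup normally waits for the Master_Down_Interval, but on receiving a priority 0 advertisement it shortens its timer to Skew_Time. This is only a model of the RFC timers; the function names are made up for the example, and keepalived's actual behaviour may differ in detail.

```python
def skew_time(priority: int) -> float:
    """Skew_Time in seconds, as defined in RFC 3768: (256 - Priority) / 256."""
    return (256 - priority) / 256.0


def master_down_interval(priority: int, adver_int: int = 1) -> float:
    """Master_Down_Interval in seconds: 3 * Advertisement_Interval + Skew_Time."""
    return 3 * adver_int + skew_time(priority)


def takeover_delay(backup_priority: int, adver_int: int = 1,
                   master_sent_priority_zero: bool = False) -> float:
    """Time a backup waits before becoming Master after the last advertisement.

    When the Master announces priority 0, the backup only waits Skew_Time
    instead of the full Master_Down_Interval.
    """
    if master_sent_priority_zero:
        return skew_time(backup_priority)
    return master_down_interval(backup_priority, adver_int)


# Backup with priority 100 and a 1-second advertisement interval.
print(takeover_delay(100))                                  # master simply disappears
print(takeover_delay(100, master_sent_priority_zero=True))  # graceful hand-over
```

Under these timers the graceful hand-over is sub-second, whereas an unannounced restart of the Master would leave the virtual router without a forwarder for more than three advertisement intervals.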


### vrrporchd

- Listens to VRRP_Table in APP_DB and adds virtual MAC entries in ASIC_DB for Master instances

This is unclear. Can you explain what vrrporch does?

vrrpsyncd listens to the kernel and, for each VRRP master instance, adds the interface name, VIP and VMAC to APP_DB.
vrrporch in turn listens to APP_DB and, for each entry in VRRP_TABLE, programs the VIP as my IP and the VMAC as my MAC (virtual RIF) in ASIC_DB.
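
The division of labour described in this reply can be pictured with a small in-memory model. Plain dictionaries stand in for VRRP_TABLE in APP_DB and for ASIC_DB; the function names, the key layout and the field names (`vmac`, `my_ip`, `my_mac`) are hypothetical and are not the real SONiC schema or API. Removing the entry when an instance leaves the Master state is also an assumption.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]             # (interface name, virtual IP)
Table = Dict[Key, Dict[str, str]]


def vrrpsyncd_on_master(app_db: Table, ifname: str, vip: str, vmac: str) -> None:
    """Producer side: publish a Master instance into the VRRP_TABLE stand-in."""
    app_db[(ifname, vip)] = {"vmac": vmac}


def vrrpsyncd_on_backup(app_db: Table, ifname: str, vip: str) -> None:
    """Producer side: withdraw the entry once the instance is no longer Master."""
    app_db.pop((ifname, vip), None)


def vrrporch_sync(app_db: Table, asic_db: Table) -> None:
    """Consumer side: mirror VRRP_TABLE entries as my IP / my MAC in ASIC_DB."""
    for key in list(asic_db):
        if key not in app_db:
            del asic_db[key]      # instance lost mastership: remove programming
    for (ifname, vip), fields in app_db.items():
        asic_db[(ifname, vip)] = {"my_ip": vip, "my_mac": fields["vmac"]}


app_db: Table = {}
asic_db: Table = {}
vrrpsyncd_on_master(app_db, "Vlan1000", "40.10.8.101", "00:00:5e:00:01:01")
vrrporch_sync(app_db, asic_db)
print(asic_db)
```

The point of the sketch is the direction of the data flow: vrrpsyncd only writes to APP_DB, and vrrporch only reads APP_DB and writes to ASIC_DB.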

between VRID and addresses must be coordinated among all VRRP routers on a LAN. However, there is
no restriction against reusing a VRID with a different address mapping on different LANs. The scope of
each virtual router is restricted to a single LAN.
To minimize network traffic, only the Master for each virtual router sends periodic VRRP Advertisement

break paragraph

Ok, will split it.


## 8 Unit Test cases

We need a detailed integration test plan.

Ok.


### 3.2.4 VRRP Advertisement Frame

VRRP control packets use IP protocol number 112 (reserved for VRRP) and are sent to the VRRP multicast address 224.0.0.18. The source MAC in VRRP control packets is the virtual MAC address, and the source IP is the interface IP.
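
For reference, the sketch below builds the payload of a VRRPv2 advertisement following the packet layout in RFC 3768 (version/type, VRID, priority, address count, auth type, advertisement interval, checksum, addresses, 8 bytes of authentication data) and derives the IPv4 virtual MAC for a VRID. It is a standalone illustration with made-up helper names, not code from keepalived or SONiC, and it does not send anything on the wire.

```python
import socket
import struct

VRRP_IP_PROTOCOL = 112           # IP protocol number carried in the IPv4 header
VRRP_MCAST_GROUP = "224.0.0.18"  # destination address of every advertisement


def virtual_mac(vrid: int) -> str:
    """IPv4 virtual router MAC from RFC 3768: 00-00-5E-00-01-{VRID}."""
    return "00:00:5e:00:01:%02x" % vrid


def inet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_vrrpv2_advert(vrid: int, priority: int, vips, adver_int: int = 1) -> bytes:
    """Return the VRRPv2 message (without IP/Ethernet headers)."""
    version_type = (2 << 4) | 1  # version 2, type 1 (ADVERTISEMENT)
    header = struct.pack("!BBBBBBH", version_type, vrid, priority,
                         len(vips), 0, adver_int, 0)  # auth type 0, checksum 0
    body = b"".join(socket.inet_aton(v) for v in vips) + b"\x00" * 8
    checksum = inet_checksum(header + body)
    return header[:6] + struct.pack("!H", checksum) + body


pkt = build_vrrpv2_advert(vrid=1, priority=100, vips=["40.10.8.101"])
print(len(pkt), virtual_mac(1))
```

A priority of 0 in the same structure is what a Master sends when it gives up mastership voluntarily.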

We need a SAI trap ID for this. Is one already defined in SAI? If yes, then add this as a SAI requirement.

The already defined SAI trap IDs, namely SAI_HOSTIF_TRAP_TYPE_VRRP and SAI_HOSTIF_TRAP_TYPE_VRRPV6, will be used to trap the VRRP packets.


Example:-

**Key**: VRRP_TABLE:Vlan1000:[40.10.8.101/32:ipv4]

Why not use the interface table? Why can't we extend intforch? Why do we need a separate vrrporch?

Intforch was not used in order to keep the use case separate. In VRRP, multiple my IPs and my MACs (virtual RIF) need to be programmed, and it can happen that multiple sources (different VRRP instances on the same interface) generate the same my IP.

As intforch cannot handle this as is, we are creating a vrrporch module to handle the functionality separately.


### vrrpsyncd

Listens to MACVLAN interface programming in kernel. Status of MACVLAN interface determines Master/Backup state of VRRP instances. VRRP_Table in APP_DB will be programmed with interface name and VIP for Master instances.

MACVLAN interfaces can be used for other use cases. Are we assuming that all MACVLAN interfaces are created by keepalived?

No, only MACVLANs whose interface names start with 'vrrp' in the kernel are associated with VRRP. keepalived adds MACVLANs with names starting with the 'vrrp' substring.
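
The selection rule in this reply amounts to a simple prefix filter. The sketch below applies it to a list of (name, link kind) pairs; the helper name, the input representation and the sample interface names are assumptions for illustration and do not reflect how vrrpsyncd actually receives netlink events.

```python
from typing import Iterable, List, Tuple

VRRP_MACVLAN_PREFIX = "vrrp"  # prefix stated in the reply above


def vrrp_macvlans(links: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the names of MACVLAN links that belong to VRRP.

    Each element of `links` is (interface name, link kind); both the tuple
    layout and the kind string "macvlan" are assumptions of this sketch.
    """
    return [name for name, kind in links
            if kind == "macvlan" and name.startswith(VRRP_MACVLAN_PREFIX)]


links = [
    ("vrrp.1", "macvlan"),    # hypothetical keepalived-created VMAC interface
    ("macvlan0", "macvlan"),  # MACVLAN created for some other purpose
    ("Vlan1000", "bridge"),   # not a MACVLAN at all
]
print(vrrp_macvlans(links))   # ['vrrp.1']
```

A filter like this relies on no other application creating MACVLANs whose names start with the same prefix.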

@disaster123

Does this one support active/active setups?

@dks19

dks19 commented Oct 30, 2019 via email

@disaster123

@dks19 any chance of implementing active/active or VRR? I'm willing to pay for it.

@yxieca yxieca force-pushed the master branch 2 times, most recently from 8498931 to 8837dc2 Compare April 15, 2022 16:51
@nser77

nser77 commented May 1, 2023

Quite interesting topic.

I personally see a lot of benefits in keepalived:

  1. keepalived is heavily focused on the VRRP protocol (v2 and v3) and supports many configurations around it.
  2. keepalived has many track controls to detect a failure: interfaces, scripts, processes and files.
  3. keepalived supports both iptables and nftables.
           # keepalived uses a firewall (either nftables or iptables) for two purposes:
           #  i)  To implement no_accept mode
           #  ii) To stop IGMP/MLD/Router-Solicit packets being sent on VMAC interfaces,
           #      and to move IGMP/MLD messages onto the underlying interface.
  4. keepalived supports VRRP synchronization group(s): a VRRP Sync Group is an extension to the VRRP protocol (man).
  5. keepalived supports static route creation - it may need some implementations like fpmsyncd with bgp (see above).
  6. keepalived is lightweight, robust and smart software.
  7. keepalived supports SNMP.
  8. keepalived works with namespaces.

Anyway, keepalived works quite close to the kernel, so in this case the main problem might be the netlink integration and how keepalived could work natively in the SONiC environment.

[...]
Same as IPVS wrapper. Keepalived work with its own network interface representation. 
IP address and interface flags are set and monitored through kernel Netlink channel. 
The Netlink messaging sub-system is used for setting VRRP VIPs. 
On the other hand, the Netlink kernel messaging broadcast capability is used to reflect 
into our userspace Keepalived internal data representation any events related to interfaces. 
So any other userspace (others program) netlink manipulation is reflected to our 
Keepalived data representation via Netlink Kernel broadcast (RTMGRP_LINK & RTMGRP_IPV4_IFADDR).
[...]

As per the SONiC architecture, it seems logical to manage the integration between keepalived and netlink through the producer/consumer architecture, using some of the processes that are already implemented for similar purposes:

[...]

Portsyncd: Listens to port-related netlink events. 
During boot-up, portsyncd obtains physical-port information by parsing system's hardware-profile config files.
In all these cases, portsyncd ends up pushing all the collected state into APPL_DB.
Attributes such as port-speeds, lanes and mtu are transferred through this channel.
Portsyncd also inject state into STATE_DB. Refer to next section for more details.

Intfsyncd: Listens to interface-related netlink events and push collected state into APPL_DB. 
Attributes such as new/changed ip-addresses associated to an interface are handled by this process.

[...]

Orchagent: The most critical component in the Swss subsystem. 
Orchagent contains logic to extract all the relevant state injected by *syncd daemons, 
process and message this information accordingly, and finally push it towards its south-bound interface.
This south-bound interface is yet again another database within the redis-engine (ASIC_DB), so as we can see, 
Orchagent operates both as a consumer (for example for state coming from APPL_DB),
and also as a producer (for state being pushed into ASIC_DB).

IntfMgrd: Reacts to state arriving from APPL_DB, CONFIG_DB and STATE_DB to configure interfaces in the linux kernel. 
This step is only accomplished if there is no conflicting or inconsistent state within any of the databases being monitored.
Refer to the above database-container section for examples of this undesired behavior.

VlanMgrd: Reacts to state arriving from APPL_DB, CONFIG_DB and STATE_DB to configure vlan-interfaces in the linux kernel.
As in IntfMgrd's case, this step will be only attempted if there is no dependent state/conditions being unmet.

[...]

As @dks19 suggested in his HLD, a vrrpsyncd process should also be added and, as per the SONiC architecture, it may need to act as a producer for kernel integration (routes, IP addresses, etc.), like fpmsyncd does for BGP, and also as a consumer for all of the required netlink messages in both directions. From my point of view, the integration between keepalived and SONiC should rely on the SAI ecosystem, and keepalived should not operate directly on the kernel. On the other hand, I'm not sure whether vrrporchd is required.

Remember that vrrp works over multicast, but keepalived also implements unicast.

The integration of keepalived in SONiC would be very useful if it were fully integrated into the producer/consumer architecture (that's the power of SONiC), but keepalived really works close to the kernel; so, before proceeding, a handshake with the maintainers of keepalived might be required.

At the moment, I'm quite interested but still skeptical about this integration (perhaps due to lack of knowledge). In any case, keepalived should not replace FRR in the SONiC architecture, but simply provide mission-critical VRRP support for the enterprise in a separate, dedicated container.

@linux-foundation-easycla

linux-foundation-easycla bot commented Jul 14, 2023

CLA Not Signed

@jefftant

jefftant commented Jul 15, 2023 via email

@adyeung

adyeung commented Apr 11, 2024

Converge VRRP proposal with #1446

@adyeung adyeung closed this Apr 11, 2024