Tips for Deployments
- Introduction
- Basic configuration
- Gatekeeper server configuration
- Grantor server configuration
- Exporting logs to InfluxDB with gkle
- Example Bird configuration
These are notes on a Gatekeeper deployment consisting of one Gatekeeper server and two Grantor servers. They assume Ubuntu 20.04 servers with Gatekeeper installed via packages.
This small deployment is meant to help new users to get started with Gatekeeper, so they can evaluate Gatekeeper, write their policy, and incrementally grow their deployment from this first step.
The network topology is shown below, where the Gatekeeper server has its front port connected to a data center uplink and its back port connected to a router. The router works as a gateway for a number of servers which provide services to the Internet via the external network, while the internal network is used for administrative purposes. The Gatekeeper server uses a routing daemon such as Bird to establish a full-routing BGP session with the uplink provider and an iBGP session with the router. In this setup, the Gatekeeper server has a single uplink connection, so it will be configured with a default route to the uplink provider's router, therefore reducing its memory usage. For more complex configurations, a patched Bird version can be used to feed learned prefixes into Gatekeeper. The Grantor servers have their front port connected to the external network. They do not have a back port configuration in Gatekeeper, and the internal network link is used solely for administrator access.
                                   external network
                    +-------------------+-----------+------------+
                    |                   |           | front      | front
              +-----+------+       +----+---+  +----+----+  +----+----+
              |            |       |        |  |         |  |         |
uplink -------+ gatekeeper +-------+ router |  | grantor |  | grantor |
        front |            | back  |        |  |         |  |         |
              +-----+------+       +----+---+  +----+----+  +----+----+
                    |                   |           |            |
                    +-------------------+-----------+------------+
                                   internal network
Gatekeeper front IPv4: 10.1.0.1/30
Gatekeeper front IPv6: 2001:db8:1::1/126
Uplink router IPv4: 10.1.0.2/30
Uplink router IPv6: 2001:db8:1::2/126
Gatekeeper back IPv4: 10.2.0.1/30
Gatekeeper back IPv6: fd00:2::1/126
Router IPv4 on Gatekeeper link: 10.2.0.2/30
Router IPv6 on Gatekeeper link: fd00:2::2/126
External network IPv4 CIDR: 1.2.3.0/24
External network IPv6 CIDR: 2001:db8:123::/48
Grantor front IPv4: 1.2.3.4 and 1.2.3.5
Grantor front IPv6: 2001:db8:123::4 and 2001:db8:123::5
Note that in this simple example, there is a single Gatekeeper server. In a deployment that requires handling high bandwidth attacks, multiple Gatekeeper servers can be used, and the router in front of them must be configured with ECMP, using a hash of the source and destination IP addresses to achieve load balancing.
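To illustrate the ECMP remark above, here is a minimal Lua sketch of hash-based next-hop selection. It is not how any particular router implements ECMP: the ecmp_pick function, the string hash, and the gk1/gk2/gk3 names are all hypothetical. It only shows the property that matters here, namely that a given pair of source and destination addresses always maps to the same Gatekeeper server.

```lua
-- Hypothetical sketch: real routers use their own hash functions,
-- usually implemented in hardware.
local function ecmp_pick(src_ip, dst_ip, next_hops)
    local key = src_ip .. "|" .. dst_ip
    local hash = 0
    for i = 1, #key do
        -- Simple polynomial string hash, kept bounded by a modulus.
        hash = (hash * 31 + key:byte(i)) % 2147483647
    end
    -- Lua tables are 1-indexed.
    return next_hops[hash % #next_hops + 1]
end

-- Hypothetical names for three Gatekeeper servers behind the router.
local gatekeepers = {"gk1", "gk2", "gk3"}

-- The same address pair is always sent to the same server.
local first = ecmp_pick("198.51.100.7", "1.2.3.4", gatekeepers)
local second = ecmp_pick("198.51.100.7", "1.2.3.4", gatekeepers)
print(first == second)  -- true
```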
These steps can be performed for both Gatekeeper and Grantor servers, with the caveat that Grantors only have a front port, so any references to the back port can be ignored.
- Enable IOMMU and huge pages
The Gatekeeper server in this deployment has 256 GB of RAM.
We reserve 16 GB for the kernel and allocate the remaining 240 GB in 1 GB huge pages.
To pass the appropriate command line arguments to the kernel, edit /etc/default/grub and set GRUB_CMDLINE_LINUX_DEFAULT, running update-grub afterwards.
Here we also enable IOMMU support via the intel_iommu argument.
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on default_hugepagesz=1G hugepagesz=1G hugepages=240"
- Rename front and back ports
It's useful to have friendly interface names in machines with many NICs.
We're going to call the Gatekeeper front and back ports, appropriately, "front" and "back".
This will be done with systemd link files.
In the link file, it's important to specify a Match section option that doesn't cause the kernel to rename the interface back once Gatekeeper has taken control of it.
For this deployment, we have used the PCI addresses of the interfaces, which can be obtained via udevadm:
# udevadm info /sys/class/net/<front port name> | grep ID_PATH=
E: ID_PATH=pci-0000:01:00.0
# udevadm info /sys/class/net/<back port name> | grep ID_PATH=
E: ID_PATH=pci-0000:02:00.0
Create systemd link files for the front and back interfaces (the latter only in the Gatekeeper server) and run update-initramfs -u afterwards.
An example using the output from the above udevadm commands is given below:
# /etc/systemd/network/10-front.link
[Match]
Property=ID_PATH=pci-0000:01:00.0
[Link]
Name=front
# /etc/systemd/network/10-back.link
[Match]
Property=ID_PATH=pci-0000:02:00.0
[Link]
Name=back
Once these two changes are in place, reboot the machine for them to take effect.
It's also important to remember that DPDK won't take over an interface that is in the UP state, so it's advised to remove the front and back interfaces from the operating system's network configuration (e.g. /etc/network/interfaces in Ubuntu).
The first step is to edit /etc/gatekeeper/envvars and set the GATEKEEPER_INTERFACES variable to the PCI addresses of the front and back interfaces:
GATEKEEPER_INTERFACES="01:00.0 02:00.0"
For the Gatekeeper server, set gatekeeper_server to true in /etc/gatekeeper/main_config.lua:
local gatekeeper_server = true
Gatekeeper is composed of multiple functional blocks, each one with its own Lua configuration script located in /etc/gatekeeper.
GK block: /etc/gatekeeper/gk.lua
In this file, the following variables have been set as below:
local log_level = staticlib.c.RTE_LOG_NOTICE
local flow_ht_size = 250000000
local max_num_ipv4_rules = 1024
local num_ipv4_tbl8s = 128
local max_num_ipv6_rules = 1024
local num_ipv6_tbl8s = 256
The flow_ht_size variable is set close to the largest number that enables Gatekeeper to boot up. The larger the flow table, the better Gatekeeper can deal with complex attacks, since it can keep state for more flows. To estimate how much memory a given value will consume, multiply flow_ht_size by the number of NUMA nodes, by two (the default number of GK block instances per NUMA node), and by 256 bytes. The Gatekeeper server in this deployment has two Intel Xeon processors, that is, two NUMA nodes, so our setting consumes 250000000 * 2 * 2 * 256 bytes ≈ 238 GiB. This value is an upper bound, so the actual memory consumption is lower than this estimate. Finally, it is worth pointing out that this setup tracks 250000000 * 2 * 2 = 1 billion flows.
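The estimate above can be reproduced with a few lines of plain Lua. This is only a worked version of the arithmetic in the text; the flow_table_bytes helper is a name introduced here for illustration and is not part of Gatekeeper.

```lua
-- Upper bound on flow table memory, following the formula in the text:
-- flow_ht_size * NUMA nodes * GK instances per node * bytes per entry.
local function flow_table_bytes(flow_ht_size, numa_nodes, gk_per_node, entry_bytes)
    return flow_ht_size * numa_nodes * gk_per_node * entry_bytes
end

-- Values from this deployment: two NUMA nodes, two GK instances per node.
local bytes = flow_table_bytes(250000000, 2, 2, 256)
local gib = bytes / 2^30
local flows = 250000000 * 2 * 2

print(string.format("%.1f GiB", gib))  -- 238.4 GiB
print(flows)
```

Comparing the result against the 240 huge pages of 1 GB reserved earlier gives a quick sanity check before booting Gatekeeper with a new flow_ht_size.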
The values for the max_num_ipv[46]_rules and num_ipv[46]_tbl8s variables have been set to small values, as we are configuring Gatekeeper with default routes.
If you are injecting BGP routes into Gatekeeper, the values for these variables depend on the size of the routing table.
To calculate them, we first generate IPv4 and IPv6 routing table dumps from full routing BGP sessions, creating, respectively, the ipv4-ranges and ipv6-ranges text files, each containing one CIDR per line.
The max_num_ipv[46]_rules and num_ipv[46]_tbl8s variables are then set to a round number above the values given by the gtctl estimate command, as described in the project's README file:
$ gtctl estimate -4 ipv4-ranges
ipv4: rules=1811522, tbl8s=1554
$ gtctl estimate -6 ipv6-ranges
ipv6: rules=313120, tbl8s=76228
In that case, we would set the following values:
local max_num_ipv4_rules = 2000000
local num_ipv4_tbl8s = 2000
local max_num_ipv6_rules = 400000
local num_ipv6_tbl8s = 100000
Solicitor block: /etc/gatekeeper/sol.lua
Up to version 1.1: by default, Gatekeeper limits the request bandwidth to 5% of the link capacity. In our deployment, we are using 10 Gbps interfaces for the Gatekeeper server and router, but the external network runs on 1 Gbps Ethernet. With this configuration, 5% of the link capacity would amount to 50% of the external network bandwidth, so we reduce the request bandwidth rate to 0.5% of the Gatekeeper link capacity:
local req_bw_rate = 0.005
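The reasoning behind the pre-1.2 value can be checked with a short calculation. The sketch below is plain Lua arithmetic, not Gatekeeper code; request_bw_share is a hypothetical helper that expresses the request bandwidth as a share of the destination network.

```lua
-- Request bandwidth in Gbps, and its share of the destination network.
local function request_bw_share(link_gbps, req_bw_rate, dest_gbps)
    local request_gbps = link_gbps * req_bw_rate
    return request_gbps, request_gbps / dest_gbps
end

-- Default rate (5%) on a 10 Gbps link in front of a 1 Gbps network.
local _, default_share = request_bw_share(10, 0.05, 1)
-- Reduced rate (0.5%) used in this deployment.
local _, reduced_share = request_bw_share(10, 0.005, 1)

print(string.format("%.1f%%", default_share * 100))  -- 50.0%
print(string.format("%.1f%%", reduced_share * 100))  -- 5.0%
```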
Starting at version 1.2: Gatekeeper needs to know the bandwidth of the destination (or protected) network to calculate the bandwidth of the request channel. In our deployment, we are using 10 Gbps interfaces for the Gatekeeper server and router, but the external network runs on 1 Gbps Ethernet. Thus, the bandwidth of the destination network is 1 Gbps:
local destination_bw_gbps = 1
Network block configuration: /etc/gatekeeper/net.lua
In this file, set the variables below according to your network setup.
Examples have been given below for a front port named front and a back port named back.
In this deployment, the front port belongs to a VLAN and uses LACP, so we set the appropriate VLAN tags for IPv4 and IPv6, and the bonding mode to staticlib.c.BONDING_MODE_8023AD.
In our environment, the back port is not in a VLAN, nor does it use link aggregation.
The back_mtu variable is set to a high value to account for IP-IP encapsulation in packets sent to the Grantor servers.
Note that the MTU for the network interfaces in the path from the Gatekeeper server's back port to the Grantor servers' front ports should be set to this value (other network interfaces in the network do not need to be reconfigured).
local user = "gatekeeper"
local front_ports = {"front"}
local front_ips = {"10.1.0.1/30", "2001:db8:1::1/126"}
local front_bonding_mode = staticlib.c.BONDING_MODE_8023AD
local front_ipv4_vlan_tag = 1234
local front_ipv6_vlan_tag = 1234
local front_vlan_insert = true
local front_mtu = 1500
local back_ports = {"back"}
local back_ips = {"10.2.0.1/30", "fd00:2::1/126"}
local back_bonding_mode = staticlib.c.BONDING_MODE_ROUND_ROBIN
local back_ipv4_vlan_tag = 0
local back_ipv6_vlan_tag = 0
local back_vlan_insert = false
local back_mtu = 2048
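As a rough check of the back_mtu choice, the sketch below computes the smallest MTU that still fits an encapsulated full-size packet. It assumes the only overhead is a single outer IP header with no options or extension headers (20 bytes for IPv4, 40 bytes for IPv6); min_back_mtu is a hypothetical helper, not a Gatekeeper function.

```lua
-- Size of the outer IP header added by IP-in-IP encapsulation,
-- assuming no IPv4 options and no IPv6 extension headers.
local OUTER_HEADER_BYTES = { ipv4 = 20, ipv6 = 40 }

-- Smallest back MTU that fits a full-size front packet plus the
-- larger of the two possible outer headers.
local function min_back_mtu(front_mtu)
    return front_mtu + math.max(OUTER_HEADER_BYTES.ipv4, OUTER_HEADER_BYTES.ipv6)
end

local front_mtu, back_mtu = 1500, 2048
assert(back_mtu >= min_back_mtu(front_mtu), "back_mtu too small for encapsulation")
print(min_back_mtu(front_mtu))  -- 1540
```

Under these assumptions, the 2048 used in this deployment leaves comfortable headroom above the 1540-byte minimum.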
In the remaining Lua configuration files, we simply set the log_level variable.
For production use, we specify the WARNING level:
local log_level = staticlib.c.RTE_LOG_WARNING
As mentioned in the introduction, the Gatekeeper server will be configured using the uplink provider's router as a default gateway.
We recommend the usage of a default gateway for Gatekeeper whenever possible, because this configuration reduces memory usage (which can instead be used for flow storage), improves CPU cache hits (allowing for a larger packet processing rate) and simplifies the deployment process because a stock BGP daemon such as Bird can be used.
Default gateways are set using Gatekeeper's dynamic configuration mechanism, which consists of writing a Lua script that performs the desired configuration and passing it to the gkctl tool.
The same mechanism is used to specify the Grantor servers used by Gatekeeper.
As illustrated in the network topology description, the uplink provider's router has IPv4 address 10.1.0.2 and IPv6 address 2001:db8:1::2, and the two Grantor servers have external IPv4 addresses 1.2.3.4 and 1.2.3.5 and external IPv6 addresses 2001:db8:123::4 and 2001:db8:123::5.
The router's addresses in the interface connected to the Gatekeeper server's back port are 10.2.0.2 and fd00:2::2, and the external network IPv4 and IPv6 CIDR blocks are, respectively, 1.2.3.0/24 and 2001:db8:123::/48.
Create the /etc/gatekeeper/init.lua file with the contents below:
require "gatekeeper/staticlib"
local dyc = staticlib.c.get_dy_conf()
-- IPv4 default gateway:
dylib.c.add_fib_entry('0.0.0.0/0', nil, '10.1.0.2', dylib.c.GK_FWD_GATEWAY_FRONT_NET, dyc.gk)
-- IPv6 default gateway:
dylib.c.add_fib_entry('::/0', nil, '2001:db8:1::2', dylib.c.GK_FWD_GATEWAY_FRONT_NET, dyc.gk)
-- IPv4 grantor configuration:
local addrs = {
{ gt_ip = '1.2.3.4', gw_ip = '10.2.0.2' },
{ gt_ip = '1.2.3.5', gw_ip = '10.2.0.2' },
}
dylib.add_grantor_entry_lb('1.2.3.0/24', addrs, dyc.gk)
-- IPv6 grantor configuration:
local addrs = {
{ gt_ip = '2001:db8:123::4', gw_ip = 'fd00:2::2' },
{ gt_ip = '2001:db8:123::5', gw_ip = 'fd00:2::2' },
}
dylib.add_grantor_entry_lb('2001:db8:123::/48', addrs, dyc.gk)
This script must be sent to Gatekeeper via the gkctl tool after startup.
The best way to do this is to configure a systemd override with an ExecStartPost command that runs gkctl, with a long enough timeout to account for the Gatekeeper startup delay.
Run systemctl edit gatekeeper and insert the following content:
[Service]
ExecStartPost=/usr/sbin/gkctl -t 300 /etc/gatekeeper/init.lua
TimeoutStartSec=300
If for some reason you cannot use a default gateway setup (e.g. if you have multiple uplinks connected to the Gatekeeper server), you need to deploy a patched Bird version which allows for communication between the two daemons.
In that case, remove the calls to dylib.c.add_fib_entry() that set the IPv4 and IPv6 default routes from the init.lua script above, and add the following block in your Bird configuration file:
protocol device {
...
port 0x6A7E;
...
}
protocol kernel kernel4 {
ipv4 {
...
port 0x6A7E;
...
}
}
protocol kernel kernel6 {
ipv6 {
...
port 0x6A7E;
...
}
}
The device and kernel settings allow Bird to interact with a userspace process listening on a socket identified by the given port ID, which must match the cps_conf.nl_pid setting in /etc/gatekeeper/cps.lua (0x6A7E is the default).
Simply start and enable Gatekeeper via systemd:
# systemctl start gatekeeper
# systemctl enable gatekeeper
For the Grantor server, set gatekeeper_server to false in /etc/gatekeeper/main_config.lua:
local gatekeeper_server = false
GT block: /etc/gatekeeper/gt.lua
In this file, the following variables have been set as below:
local n_lcores = 2
local lua_policy_file = "policy.lua"
local lua_base_directory = "/etc/gatekeeper"
Network block configuration: /etc/gatekeeper/net.lua
For Grantor servers, the network configuration is analogous to the one for the Gatekeeper servers, with the exception that there's no back port when running Gatekeeper in Grantor mode.
Here we assume no link aggregation and no VLAN configuration.
Notice the MTU configuration matching the Gatekeeper server's back_mtu value.
local user = "gatekeeper"
local front_ports = {"front"}
local front_ips = {"1.2.3.4/24", "2001:db8:123::4/48"}
local front_bonding_mode = staticlib.c.BONDING_MODE_ROUND_ROBIN
local front_ipv4_vlan_tag = 0
local front_ipv6_vlan_tag = 0
local front_vlan_insert = false
local front_mtu = 2048
In the remaining Lua configuration files, we simply set the log_level variable.
For production use, we specify the WARNING level:
local log_level = staticlib.c.RTE_LOG_WARNING
The Grantor configuration in gt.lua points to a Lua policy script, a fundamental element of the Gatekeeper architecture.
When a packet from a new flow arrives at the Gatekeeper server, it is forwarded to the Grantor server for a policy decision.
In the simplest case, this decision is a binary choice of granting or declining packets belonging to this flow, along with the maximum bandwidth for the granted flows and the duration of each decision.
However, the policy response is in fact a reference to a BPF program installed in the Gatekeeper server, which can not only accept or deny packets, but also control the bandwidth budget available to the flow and adapt its response according to changing traffic patterns.
Once a BPF program has been assigned to the flow, further packets will be handled directly by the Gatekeeper server, according to the rules encoded in the program, and no new requests will be sent to the Grantor server until the flow expires.
The entry point of the policy script is a function called lookup_policy, which receives two arguments: a packet information object, which allows policy decisions to be made based on layer 2, 3 and 4 header fields, and a policy object, which can be used to set bandwidth and duration limits on the policy decision.
This function must return a boolean value to indicate whether the policy decision is to grant or decline the flow.
In practice, we can use the decision_granted and decision_declined functions and their variations from the policylib Lua package to set the policy parameters (i.e. the BPF program index, the bandwidth budget and the duration of the decision) and return the appropriate boolean value.
These functions set the BPF program index field of the policy decision, respectively, to the granted and declined programs, which are bundled with a standard Gatekeeper installation.
In the example below, we will in fact use the decision_grantedv2 function, which is a simple wrapper for decision_grantedv2_will_full_params.
They set the BPF program index to the more flexible grantedv2 program, also included with Gatekeeper.
It supports negative and secondary bandwidth settings, allows for direct delivery to be selected and can also be reused in custom BPF programs.
We will also use the decision_web function, also a wrapper for decision_grantedv2_will_full_params, which selects the web BPF program that also comes with Gatekeeper.
This example BPF program allows ICMP packets and incoming TCP segments whose destination port is HTTP, HTTPS, SSH or an FTP-related port.
It also allows incoming TCP segments with source ports HTTP and HTTPS, as an example of how to allow replies to connections initiated from the server.
Finally, we will illustrate the use of the decision_tcpsrv function.
This function accepts lists of listening and remote ports and selects the tcp-services BPF program.
This is a generic BPF program that allows incoming TCP segments with a destination port matching the given listening ports or with a source port matching the given remote ports.
Apart from some idiosyncratic services like FTP, allowing inbound and outbound traffic to certain ports is all that needs to be done for most TCP-based protocols, so tcp-services greatly reduces the need to write custom BPF programs.
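The matching rule described for tcp-services can be modeled in a few lines of plain Lua. This is only an illustration of the rule as stated above: the real check runs as a BPF program in the Gatekeeper server, and the to_set and tcpsrv_allows helpers are hypothetical names introduced here.

```lua
-- Turn a list of ports into a set for constant-time membership tests.
local function to_set(ports)
    local set = {}
    for _, port in ipairs(ports) do
        set[port] = true
    end
    return set
end

-- Hypothetical model of the rule: allow a segment when its destination
-- port is a listening port or its source port is a remote port.
local function tcpsrv_allows(listening_ports, remote_ports, src_port, dst_port)
    local listening = to_set(listening_ports)
    local remote = to_set(remote_ports)
    return listening[dst_port] == true or remote[src_port] == true
end

-- A client reaching the mail submission port.
print(tcpsrv_allows({25, 587, 465}, {25}, 40000, 587))   -- true
-- A reply from a remote SMTP server to a connection we initiated.
print(tcpsrv_allows({25, 587, 465}, {25}, 25, 51000))    -- true
-- A segment to a port that is not listed.
print(tcpsrv_allows({25, 587, 465}, {25}, 40000, 3306))  -- false
```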
These functions have the following signatures:
function policylib.{decision_granted,decision_grantedv2,decision_web}(
policy, -- the policy object
tx_rate_kib_sec, -- maximum bandwidth in KiB/s
cap_expire_sec, -- policy decision (capability) duration, in seconds
next_renewal_ms, -- how long until sending a renewal request for this flow, in milliseconds
renewal_step_ms -- when sending renewal requests, don't send more than one per this duration, in milliseconds.
)
function policylib.decision_grantedv2_will_full_params(
program_index, -- corresponds to the index of the bpf_programs table in gk.lua in the Gatekeeper server.
policy, -- the policy object
tx1_rate_kib_sec, -- maximum primary bandwidth in KiB/s
tx2_rate_kib_sec, -- maximum secondary bandwidth in KiB/s
cap_expire_sec, -- policy decision (capability) duration, in seconds
next_renewal_ms, -- how long until sending a renewal request for this flow, in milliseconds
renewal_step_ms, -- when sending renewal requests, don't send more than one per this duration, in milliseconds.
direct_if_possible -- whether to enable direct delivery
)
function policylib.decision_declined(
policy, -- the policy object
expire_sec -- policy decision (capability) duration, in seconds
)
function policylib.decision_tcpsrv(
policy, -- the policy object
tx_rate_kib_sec, -- maximum primary bandwidth in KiB/s
cap_expire_sec, -- policy decision (capability) duration, in seconds
next_renewal_ms, -- how long until sending a renewal request for this flow, in milliseconds
renewal_step_ms, -- when sending renewal requests, don't send more than one per this duration, in milliseconds
ports -- the ports object, obtained by calling policylib.tcpsrv_ports(listening_ports, remote_ports)
)
As a practical example, we show below a policy script that is able to perform the following decisions:
- Grant or decline flows based on their source IPv4 addresses, based on labeled prefixes loaded from an external file;
- Grant or decline flows based on their destination IPv4 addresses, allowing traffic to a subrange containing web servers;
- Decline malformed packets;
- Grant packets not matching the rules above, with limited bandwidth.
We start by requiring the libraries policylib from Gatekeeper and ffi from LuaJIT.
Requiring policylib also gives us access to the lpmlib package, which contains functions to manipulate LPM (Longest Prefix Match) tables.
local policylib = require("gatekeeper/policylib")
local ffi = require("ffi")
Next, we define helper functions that represent our policy decisions.
These functions take a policy argument, which has type struct ggu_policy, but which can be considered as an opaque object for our purposes, as it's simply forwarded to the functions policylib.decision_grantedv2 or policylib.decision_declined, described above.
-- Decline flows with malformed packets.
local function decline_malformed_packet(policy)
return policylib.decision_declined(policy, 10)
end
-- Decline flows by policy decision.
local function decline(policy)
return policylib.decision_declined(policy, 60)
end
-- Grant flow by policy decision.
local function grant(policy)
return policylib.decision_grantedv2(
policy,
3072, -- tx_rate_kib_sec = 3 MiB/s
300, -- cap_expire_sec = 5 minutes
240000, -- next_renewal_ms = 4 minutes
3000 -- renewal_step_ms = 3 seconds
)
end
-- Grant flows destined to web servers by policy decision.
local function grant_web(policy)
return policylib.decision_web(
policy,
3072, -- tx_rate_kib_sec = 3 MiB/s
300, -- cap_expire_sec = 5 minutes
240000, -- next_renewal_ms = 4 minutes
3000 -- renewal_step_ms = 3 seconds
)
end
-- Returns a policy function that grants flows destined to listening_ports or coming from remote_ports.
local function grant_tcpsrv(listening_ports, remote_ports)
local ports = policylib.tcpsrv_ports(listening_ports, remote_ports)
return function(policy)
return policylib.decision_tcpsrv(
policy,
3072, -- tx_rate_kib_sec = 3 MiB/s
300, -- cap_expire_sec = 5 minutes
240000, -- next_renewal_ms = 4 minutes
3000, -- renewal_step_ms = 3 seconds
ports
)
end
end
-- Grant flow not matching any policy, with reduced bandwidth.
local function grant_unmatched(policy)
return policylib.decision_grantedv2(
policy,
1024, -- tx_rate_kib_sec = 1 MiB/s
300, -- cap_expire_sec = 5 minutes
240000, -- next_renewal_ms = 4 minutes
3000 -- renewal_step_ms = 3 seconds
)
end
We then define a Lua table that maps its indices to policy decisions. The indices in this table correspond to the label that is associated with a network prefix when it is inserted in the LPM (Longest Prefix Match) tables created below. Therefore, when inspecting a packet, we can perform a lookup for its source and/or destination IP addresses in these LPM tables, using the returned label to obtain the function that will grant or decline this flow.
In the table below, flows labeled 1 in the LPM table will be declined, while those labeled 2 and 3 will be granted, respectively, by the grantedv2 and web BPF programs.
Flows labeled 4 will be granted when destined to ports 25 (SMTP), 587 (Submission) and 465 (SMTPS), or when coming from port 25 (SMTP), while those labeled 5 will be granted when destined to port 3306 (MySQL), with no flows allowed from external services.
The grant_unmatched function is called directly and therefore is not referenced in the table.
local policy_decision_by_label = {
[1] = decline,
[2] = grant,
[3] = grant_web,
[4] = grant_tcpsrv({25, 587, 465}, {25}),
[5] = grant_tcpsrv({3306}, {}),
}
The policy script continues with the definition of the aforementioned LPM tables, with the use of the helper function new_lpm_from_file.
The fact that the src_lpm_ipv4 and dst_lpm_ipv4 variables are global (i.e. their declarations do not use the local keyword) is relevant, because it allows them to be accessed by other scripts.
This is useful, for example, to update an LPM table, or to print it for inspection.
The new_lpm_from_file function, given below, assumes the input file is in a two-column format, where the first column is a network prefix in CIDR notation, and the second column is its label.
It uses functions in the lpmlib package to create and populate the LPM table.
Given the policy_decision_by_label table above, the input file containing source address ranges should use label 1 for those we want to decline and label 2 for those we want to grant access to.
Similarly, the input file containing destination address ranges should attach the labels 3, 4 or 5 to its prefixes, according to the service each prefix hosts.
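-- Note: in the actual script, new_lpm_from_file (given below) must be
-- defined before these two calls are executed.
src_lpm_ipv4 = new_lpm_from_file("/path/to/lpm/source/addresses/file")
dst_lpm_ipv4 = new_lpm_from_file("/path/to/lpm/destination/addresses/file")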
function new_lpm_from_file(path)
-- Find minimum values for num_rules and num_tbl8s.
local num_rules = 0
local num_tbl8s = 0
local prefixes = {}
for line in io.lines(path) do
local prefix, label = string.match(line, "^(%S+)%s+(%d+)$")
if not prefix or not label then
error(path .. ": invalid line: " .. line)
end
-- Convert string in CIDR notation to IP address and prefix length.
local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
num_rules = num_rules + 1
num_tbl8s = num_tbl8s + lpmlib.lpm_add_tbl8s(ip_addr, prefix_len, prefixes)
end
-- Adjust parameters.
local scaling_factor_rules = 2
local scaling_factor_tbl8s = 2
num_rules = math.max(1, scaling_factor_rules * num_rules)
num_tbl8s = math.max(1, scaling_factor_tbl8s * num_tbl8s)
-- Create and populate LPM table.
local lpm = lpmlib.new_lpm(num_rules, num_tbl8s)
for line in io.lines(path) do
local prefix, label = string.match(line, "^(%S+)%s+(%d+)$")
if not prefix or not label then
error(path .. ": invalid line: " .. line)
end
-- Convert string in CIDR notation to IP address and prefix length.
local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
lpmlib.lpm_add(lpm, ip_addr, prefix_len, tonumber(label))
end
return lpm
end
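To see which input lines the pattern in new_lpm_from_file accepts, the sketch below applies the same string.match call to a few sample lines. It uses only standard Lua; parse_line is a hypothetical helper, and the sample prefixes are illustrative.

```lua
-- Same pattern as in new_lpm_from_file: a non-blank prefix, whitespace,
-- and a numeric label, with nothing else on the line.
local function parse_line(line)
    local prefix, label = string.match(line, "^(%S+)%s+(%d+)$")
    if not prefix then
        return nil
    end
    return prefix, tonumber(label)
end

-- Well-formed lines yield the prefix and its numeric label.
print(parse_line("192.0.2.0/24 1"))
print(parse_line("1.2.3.0/26\t3"))

-- Comments, blank lines and trailing whitespace do not match.
print(parse_line("# comment"))        -- nil
print(parse_line(""))                 -- nil
print(parse_line("192.0.2.0/24 1 "))  -- nil
```

Since new_lpm_from_file raises an error on any line that does not match, the input files must not contain comments, blank lines or trailing whitespace.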
Finally, we implement the lookup_policy function.
As described above, this is the entry point of the policy script, i.e., the function called by the Grantor server to obtain a policy decision for a given packet.
The function receives two arguments.
The first is pkt_info, which is a gt_packet_headers struct, accessible from the policy script via the ffi module.
These are the headers of the IP-in-IP encapsulated packet sent from Gatekeeper to Grantor.
The second argument is policy, which we will simply pass along to the policy decision functions.
The lookup_policy function starts by checking if the inner packet is an IPv4 packet.
In production we have IPv6-specific LPM tables and other policies, but for simplicity, in this example we will just apply the default policy for non-IPv4 traffic.
The function then proceeds with an LPM table lookup for the source address of the incoming packet, which, if successful, will return a policy decision function that is then applied.
Otherwise, the script attempts to obtain a policy by performing a lookup in the destination addresses LPM table.
These two steps are performed by the helper functions lookup_src_lpm_ipv4_policy and lookup_dst_lpm_ipv4_policy, respectively, which are given below.
Finally, if no policy is found, we apply the default policy decision function, grant_unmatched.
function lookup_policy(pkt_info, policy)
if pkt_info.inner_ip_ver ~= policylib.c.IPV4 then
return grant_unmatched(policy)
end
local fn = lookup_src_lpm_ipv4_policy(pkt_info)
if fn then
return fn(policy)
end
local fn = lookup_dst_lpm_ipv4_policy(pkt_info)
if fn then
return fn(policy)
end
return grant_unmatched(policy)
end
The lookup_src_lpm_ipv4_policy and lookup_dst_lpm_ipv4_policy functions perform lookups, respectively, on the src_lpm_ipv4 and dst_lpm_ipv4 tables, which were populated with network prefixes loaded from input files, as described above.
We use the ffi.cast function to obtain an IPv4 header, so that we can access the packet's source or destination IP address and look it up in the LPM table with lpmlib.lpm_lookup.
This function returns the matching label for the network prefix to which the address belongs, which will be used to obtain its associated policy decision function via the mapping in the policy_decision_by_label Lua table.
Note that lpmlib.lpm_lookup returns a negative number if no match is found, and since the policy_decision_by_label table has no negative indices, the table lookup will return nil, and the lookup_policy function will proceed without executing the code in the then branch of the if statements.
function lookup_src_lpm_ipv4_policy(pkt_info)
local ipv4_header = ffi.cast("struct rte_ipv4_hdr *", pkt_info.inner_l3_hdr)
local label = lpmlib.lpm_lookup(src_lpm_ipv4, ipv4_header.src_addr)
return policy_decision_by_label[label]
end
function lookup_dst_lpm_ipv4_policy(pkt_info)
local ipv4_header = ffi.cast("struct rte_ipv4_hdr *", pkt_info.inner_l3_hdr)
local label = lpmlib.lpm_lookup(dst_lpm_ipv4, ipv4_header.dst_addr)
return policy_decision_by_label[label]
end
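The fall-through behavior for unmatched lookups can be demonstrated without Gatekeeper. In the sketch below, the stub functions and the decide helper are hypothetical stand-ins that return strings so the outcome is visible; a negative label, like the one the text says lpmlib.lpm_lookup returns when there is no match, simply misses the table.

```lua
-- Hypothetical stand-ins for the policy decision functions.
local function stub_decline() return "declined" end
local function stub_grant() return "granted" end
local function stub_grant_unmatched() return "granted (unmatched)" end

local decision_by_label = {
    [1] = stub_decline,
    [2] = stub_grant,
}

-- Mirrors the shape of lookup_policy: use the labeled decision when
-- there is one, otherwise fall back to the default.
local function decide(label)
    local fn = decision_by_label[label]
    if fn then
        return fn()
    end
    return stub_grant_unmatched()
end

print(decide(1))   -- declined
print(decide(2))   -- granted
print(decide(-1))  -- granted (unmatched)
```

The same fall-through applies to a label that was loaded into an LPM table but has no entry in the decision table, so a typo in a label silently selects the default policy.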
Finally, we add four helper functions to the policy script.
These functions are not used by the policy itself, but by the dynamic configuration script that keeps the LPM tables up to date.
The add_src_v4_prefix and add_dst_v4_prefix functions take a prefix string in CIDR format and an integer label and insert them in the appropriate LPM table.
The del_src_v4_prefix and del_dst_v4_prefix functions take a prefix string in CIDR format and remove it from the appropriate LPM table.
More details about dynamically updating the LPM tables are given below.
function add_src_v4_prefix(prefix, label)
local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
lpmlib.lpm_add(src_lpm_ipv4, ip_addr, prefix_len, label)
end
function add_dst_v4_prefix(prefix, label)
local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
lpmlib.lpm_add(dst_lpm_ipv4, ip_addr, prefix_len, label)
end
function del_src_v4_prefix(prefix)
local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
lpmlib.lpm_del(src_lpm_ipv4, ip_addr, prefix_len)
end
function del_dst_v4_prefix(prefix)
local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
lpmlib.lpm_del(dst_lpm_ipv4, ip_addr, prefix_len)
end
The example policy script given above loads network prefixes and labels from a file. In practice, these prefixes are usually assembled from multiple online sources of unwanted source networks, such as Spamhaus' EDROP or Team Cymru's Bogon prefixes to decline flows whose source address belongs to these prefixes.
These online lists of unwanted prefixes are continuously updated, and may contain intersecting network blocks, so it makes sense to use a tool designed to fetch, merge and label them automatically, generating a file that can be consumed by the policy script. The Drib tool has been developed for this purpose.
This tool aggregates IP prefixes from configurable online and offline sources and allows each source to be labeled with its own "class", which is just an arbitrary string. Once the prefixes are aggregated, Drib can render a template, feeding it with the prefixes and their respective class. We use the source class configuration in Drib as the label to be associated with a prefix when inserted in the policy's LPM table.
Going back to the policy script, recall the definition of the policy_decision_by_label variable:
local policy_decision_by_label = {
[1] = decline,
[2] = grant,
[3] = grant_web,
[4] = grant_tcpsrv({25, 587, 465}, {25}),
[5] = grant_tcpsrv({3306}, {}),
}
This means prefixes labeled 1 will be declined, and those labeled 2-5 will be granted according to the respective BPF programs.
Below we show a Drib configuration file, /etc/drib/drib.yaml, that labels network blocks fetched from the EDROP and Bogons lists with a class value of 1.
To make the example more complete, we also add a static network block labeled with a class value of 2 as an "office" network from which we always want to accept traffic.
Finally, we add static network blocks for web servers with a class value of 3, to which we accept web-related traffic according to the rules in the web BPF program, SMTP servers with a class value of 4, to which we accept SMTP and mail submission traffic, and database servers with a class value of 5, to which we accept MySQL traffic.
Traffic related to the latter two network blocks will be governed by the same BPF program, tcp-services.
Note that Drib supports specifying a group-scoped kind setting, which is a tag shared by all prefixes in a given group.
We define the decline and grant groups with kind src for the source address prefixes and the servers group with kind dst for the destination address prefixes, and use the entry.kind field in templates that will generate Lua scripts that manipulate the src_lpm_ipv4 and dst_lpm_ipv4 LPM tables.
log_level: "warn"
bootstrap: {
input: "/etc/drib/bootstrap.tpl",
output: "/var/lib/drib/bootstrap_{proto}_{kind}",
}
ipv4: {
decline: {
priority: 30,
kind: "src",
edrop: {
remote: {
url: "https://www.spamhaus.org/drop/edrop.txt",
check_interval: "12h",
parser: {ranges: {one_per_line: {comment: ";"}}},
},
class: "1",
},
fullbogons: {
remote: {
url: "https://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt",
check_interval: "1d",
parser: {ranges: {one_per_line: {comment: "#"}}},
},
class: "1",
},
},
grant: {
priority: 30,
kind: "src",
office: {
range: "100.90.80.0/24",
class: "2",
},
},
servers: {
priority: 20,
kind: "dst",
web: {
range: "1.2.3.0/26",
class: "3",
},
smtp: {
range: "1.2.3.64/27",
class: "4",
},
mysql: {
range: "1.2.3.96/27",
class: "5",
},
},
}
Given this configuration, the following bootstrap template file, /etc/drib/bootstrap.tpl, is used to generate input files in the format expected by the policy script, that is, a two-column file with a network prefix in CIDR format in the first column, and an integer label in the second one:
{% for entry in ranges -%}
{{entry.range}} {{entry.class}}
{% endfor -%}
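As a quick way to sanity-check a rendered file, the two-column format can be parsed with a few lines of Python. This is a hypothetical validation helper, not part of Drib or Gatekeeper; the sample text is made up to match the format described above.

```python
import ipaddress

def parse_bootstrap(text):
    """Parse lines of the form '<CIDR prefix> <integer label>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        prefix, label = line.split()
        # ip_network() raises ValueError on malformed prefixes or set host bits.
        entries.append((ipaddress.ip_network(prefix), int(label)))
    return entries

sample = "100.90.80.0/24 2\n198.51.100.0/24 1\n"
entries = parse_bootstrap(sample)
print(entries)
```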
A cron job is set up to run the drib aggregate command, which will download the EDROP and Bogon prefixes, merge them, exclude the office network range from the resulting set, and save a serialization of the result in what is called an aggregate file.
We tie everything together by calling the drib bootstrap --no-download command in a systemd override ExecStartPre directive.
This will make Drib read an existing aggregate file (generated by the aforementioned cron job) and render the above template.
When Gatekeeper runs in Grantor mode, it will run the policy script, which will then read the freshly rendered file with the set of prefixes obtained from Drib.
The systemd override can be created with the systemctl edit gatekeeper command on the Grantor servers.
Add the following content to the override file:
[Service]
ExecStartPre=/usr/sbin/drib bootstrap --no-download
This ensures the policy script will load up-to-date data when Gatekeeper starts in Grantor mode.
The setup described above works well for the generation of an initial (bootstrap) list of prefixes on Gatekeeper startup. However, the EDROP and Bogons lists, as well as similar online unwanted prefix lists, are continually updated, and Gatekeeper's in-memory LPM tables should be kept up to date.
To do this, we use the gtctl tool.
It parses Drib's aggregate files (generated by the cron job mentioned in the previous section) and compares the latest one to an aggregate file saved from a previous run, generating sets of newly inserted and removed IP ranges.
These sets are used as inputs to render policy update scripts, which gtctl then feeds into Gatekeeper via its dynamic configuration mechanism.
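The comparison step can be pictured with the Python sketch below. It is an illustration under assumptions, not gtctl's code: aggregates are modeled as plain {prefix: class} dictionaries (the real aggregate file format is not shown here), and a prefix whose class changed is reported in both sets so that it is removed and then re-added.

```python
def diff_aggregates(old, new):
    """Compare two {prefix: class} mappings (hypothetical aggregate model).

    Returns (remove, insert): prefixes to delete from the LPM table and
    (prefix, class) pairs to add.
    """
    remove = sorted(p for p, c in old.items() if new.get(p) != c)
    insert = sorted((p, c) for p, c in new.items() if old.get(p) != c)
    return remove, insert

old = {"198.51.100.0/24": "1", "203.0.113.0/24": "1"}
new = {"203.0.113.0/24": "1", "192.0.2.0/24": "1"}
remove, insert = diff_aggregates(old, new)
print(remove, insert)
```

Only the differences are sent to the running Grantor, so an update that touches a handful of prefixes does not require rebuilding the whole table.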
The policy update template, /etc/gtctl/policy_update.lua.tpl, simply generates calls to the add_src_v4_prefix, add_dst_v4_prefix, del_src_v4_prefix and del_dst_v4_prefix functions defined in the policy script.
Note the usage of the entry.kind field in the template so that the appropriate function is called.
local function update_lpm_tables()
{%- for entry in ipv4.remove %}
del_{{entry.kind}}_v4_prefix("{{entry.range}}")
{%- endfor %}
{%- for entry in ipv4.insert %}
add_{{entry.kind}}_v4_prefix("{{entry.range}}", {{entry.class}})
{%- endfor %}
end
local dyc = staticlib.c.get_dy_conf()
dylib.update_gt_lua_states_incrementally(dyc.gt, update_lpm_tables, false)
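To see what the rendered function body looks like for concrete entries, the Python sketch below builds the same Lua text with plain string formatting. It only imitates the template's output for illustration; render_update and its tuple-based inputs are hypothetical, and gtctl's own template engine is what actually produces the script.

```python
def render_update(remove, insert):
    """Build update_lpm_tables() as Lua source text.

    `remove` holds (kind, range) pairs; `insert` holds (kind, range, class).
    """
    lines = ["local function update_lpm_tables()"]
    for kind, rng in remove:
        lines.append(f'\tdel_{kind}_v4_prefix("{rng}")')
    for kind, rng, cls in insert:
        lines.append(f'\tadd_{kind}_v4_prefix("{rng}", {cls})')
    lines.append("end")
    return "\n".join(lines)

script = render_update(
    remove=[("src", "198.51.100.0/24")],
    insert=[("dst", "1.2.3.64/27", "4")],
)
print(script)
```

The kind of each entry selects between the src and dst function families, which is why the Drib groups carry a kind setting.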
Depending on the number of updates, it might be necessary to create a new LPM table that is able to accommodate the new set of prefixes.
For this case, gtctl uses a policy replacement template, /etc/gtctl/policy_replace.lua.tpl, to generate the script:
{{lpm_table}} = nil
collectgarbage()
{{lpm_table}} = {{lpm_table_constructor}}({{params.num_rules}}, {{params.num_tbl8s}})
local function update_lpm_tables()
{%- for entry in ipv4.insert %}
add_{{entry.kind}}_v4_prefix("{{entry.range}}", {{entry.class}})
{%- endfor %}
end
local dyc = staticlib.c.get_dy_conf()
dylib.update_gt_lua_states_incrementally(dyc.gt, update_lpm_tables, false)
The template above mentions the params variable.
This variable is created by gtctl after running a parameters estimation script, /etc/gtctl/lpm_params.lua.tpl, which is also rendered from a template:
require "gatekeeper/staticlib"
require "gatekeeper/policylib"
local dyc = staticlib.c.get_dy_conf()
if dyc.gt == nil then
return "Gatekeeper: failed to run as Grantor server\n"
end
local function get_lpm_params()
local lcore = policylib.c.gt_lcore_id()
local num_rules, num_tbl8s = {{lpm_params_function}}({{lpm_table}})
return lcore .. ":" .. num_rules .. "," .. num_tbl8s .. "\n"
end
dylib.update_gt_lua_states_incrementally(dyc.gt, get_lpm_params, false)
Given these templates, the gtctl configuration file, /etc/gtctl/gtctl.yaml, which references them, is shown below.
log_level: "warn"
remove_rendered_scripts: true
socket: "/var/run/gatekeeper/dyn_cfg.socket"
state_dir: "/var/lib/gtctl"
replace: {
input: "/etc/gtctl/policy_replace.lua.tpl",
output: "/var/lib/gtctl/policy_replace_{proto}_{kind}.{2i}.lua",
max_ranges_per_file: 1500,
}
update: {
input: "/etc/gtctl/policy_update.lua.tpl",
output: "/var/lib/gtctl/policy_update_{proto}_{kind}.{2i}.lua",
max_ranges_per_file: 1500,
}
lpm: {
table_format: "{kind}_lpm_{proto}", # for this example's drib.yaml, yields "src_lpm_ipv4" and "dst_lpm_ipv4"
parameters_script: {
input: "/etc/gtctl/lpm_params.lua.tpl",
output: "/var/lib/gtctl/lpm_params_{proto}_{kind}.lua",
},
ipv4: {
lpm_table_constructor: "lpmlib.new_lpm",
lpm_get_params_function: "lpmlib.lpm_get_paras",
},
ipv6: {
lpm_table_constructor: "lpmlib.new_lpm6",
lpm_get_params_function: "lpmlib.lpm6_get_paras",
},
}
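Two settings in this file lend themselves to a small illustration: table_format, whose placeholders expand to the LPM table names, and max_ranges_per_file. The Python sketch below assumes, based only on the setting's name, that max_ranges_per_file caps how many ranges go into each rendered script; the helper names are hypothetical and this is not gtctl's code.

```python
def table_name(table_format, kind, proto):
    """Expand the {kind} and {proto} placeholders of table_format."""
    return table_format.format(kind=kind, proto=proto)

def chunk(ranges, max_per_file):
    """Split `ranges` into consecutive lists of at most `max_per_file` items."""
    return [ranges[i:i + max_per_file]
            for i in range(0, len(ranges), max_per_file)]

name = table_name("{kind}_lpm_{proto}", "src", "ipv4")
# 3200 placeholder ranges, split with the limit used in the configuration.
batches = chunk(list(range(3200)), 1500)
print(name, [len(b) for b in batches])
```

Under that assumption, a large update is delivered as several smaller scripts rather than one very large one.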
The only missing piece is a way to run gtctl once a new aggregate file has been generated by Drib.
Our current solution is to rely on our configuration management tool, Puppet, to detect this and trigger the gtctl execution:
file { '/var/lib/gtctl/aggregate.new':
ensure => 'present',
source => 'puppet:///drib/aggregate',
owner => 'root',
group => 'root',
mode => '0644',
notify => Exec['gtctl'],
}
exec { 'gtctl':
command => 'gtctl dyncfg -a /var/lib/gtctl/aggregate.new',
onlyif => 'systemctl is-active gatekeeper',
refreshonly => true,
}
Let's extend the example above with a new range of IPv4 addresses for recursive DNS servers. These are assumed to be for internal use only (i.e. they are used only by other servers, and not open to the Internet), and therefore should accept no external connections. However, in order to be able to perform recursive DNS queries, replies to packets sent to TCP and UDP port 53 must be allowed to reach the server. In other words, the BPF program must accept incoming packets with TCP and UDP source port 53.
We create a dns-recursive.c file with the following changes compared to the web.c file from the Gatekeeper repository.
- Include UDP headers: add the udp.h header file to the list of the program's includes near the top of the file:
#include <netinet/udp.h>
- Handle UDP traffic: this code grants access to UDP datagrams with source port 53, meaning they are sent by other DNS servers as replies to the queries made by our server. In the switch statement on the ctx->l4_proto field, we add the following case:
case IPPROTO_UDP: {
struct udphdr *udp_hdr;
if (ctx->fragmented)
goto secondary_budget;
if (unlikely(pkt->l4_len < sizeof(*udp_hdr))) {
/* Malformed UDP header. */
return GK_BPF_PKT_RET_DECLINE;
}
udp_hdr = rte_pktmbuf_mtod_offset(pkt, struct udphdr *,
pkt->l2_len + pkt->l3_len);
/* Authorized external services. */
switch (ntohs(udp_hdr->uh_sport)) {
case 53: /* DNS */
break;
default:
return GK_BPF_PKT_RET_DECLINE;
}
goto forward;
}
- In the TCP section (below the comment "Only TCP packets from here on") we remove the whole switch statement on the TCP destination port (tcp_hdr->th_dport), replacing it with a switch statement on the TCP source port, analogous to the UDP source port switch described in the previous step.
/* Authorized external services. */
switch (ntohs(tcp_hdr->th_sport)) {
case 53: /* DNS */
if (tcp_hdr->syn && !tcp_hdr->ack) {
/* No listening ports. */
return GK_BPF_PKT_RET_DECLINE;
}
break;
default:
return GK_BPF_PKT_RET_DECLINE;
}
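The decision logic of the two switch statements can be summarized in a short Python model, which may help when reasoning about which packets get through. It is a simplification for illustration only: it covers unfragmented UDP and TCP packets and leaves out ICMP handling, header length checks and the bandwidth budgets of the real BPF program. The function and constant names are hypothetical.

```python
FORWARD, DECLINE = "forward", "decline"

def dns_recursive_decision(proto, src_port, syn=False, ack=False):
    """Model the UDP and TCP source port checks for unfragmented packets."""
    if proto == "udp":
        return FORWARD if src_port == 53 else DECLINE
    if proto == "tcp":
        if src_port != 53:
            return DECLINE
        if syn and not ack:
            # A bare SYN would open a connection to our server: no listening ports.
            return DECLINE
        return FORWARD
    raise ValueError("this sketch only models udp and tcp")

print(dns_recursive_decision("udp", 53))
print(dns_recursive_decision("tcp", 53, syn=True, ack=False))
```

A SYN-ACK from port 53 is forwarded because it answers a connection our resolver initiated, while a bare SYN is declined even when it claims source port 53.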
To compile the program, it is necessary to build Gatekeeper by following the instructions in the Build from Source section of the README. Once Gatekeeper is compiled, run the following command:
$ GATEKEEPER_ROOT=/path/to/gatekeeper/repository
$ clang -O2 -target bpf \
-I"${GATEKEEPER_ROOT}/include" -I"${GATEKEEPER_ROOT}/bpf" -Wno-int-to-void-pointer-cast \
-o dns-recursive.bpf -c dns-recursive.c
The resulting dns-recursive.bpf file must be uploaded to the Gatekeeper server and installed alongside the other BPF programs, by default in /etc/gatekeeper/bpf.
Next, it must be added to the bpf_programs table in the gk.lua file.
We add it with an index of 100, as indices below that number are considered to be reserved.
In /etc/gatekeeper/gk.lua, the bpf_programs variable will look like this:
local bpf_programs = {
[0] = "granted.bpf",
[1] = "declined.bpf",
[2] = "grantedv2.bpf",
[3] = "web.bpf",
[4] = "tcp-services.bpf",
-- Add the line below:
[100] = "dns-recursive.bpf",
}
The new BPF program will be loaded when Gatekeeper is restarted, but it is possible to load it dynamically using gkctl.
Create the following Lua script in a file named insert-bpf-program.lua:
require "gatekeeper/staticlib"
local dyc = staticlib.c.get_dy_conf()
local path = "/etc/gatekeeper/bpf/dns-recursive.bpf"
local index = 100
local ret = dylib.c.gk_load_bpf_flow_handler(dyc.gk, index, path, true)
if ret < 0 then
return "gk: failed to load BPF program " .. path .. " (" .. index .. ") in runtime"
end
return "gk: done"
Then load it into a running Gatekeeper instance with gkctl:
# gkctl insert-bpf-program.lua
Now that we have a new BPF program installed in the Gatekeeper server, we can adapt our policy to use it.
First, add the grant_dns function:
local function grant_dns(policy)
return policylib.decision_grantedv2_will_full_params(
100, -- dns-recursive.bpf index in bpf_programs in gk.lua
policy,
10240, -- primary bandwidth limit = 10 MiB/s
512, -- secondary bandwidth limit (5% of primary bandwidth)
300, -- cap_expire_sec = 5 minutes
240000, -- next_renewal_ms = 4 minutes
3000, -- renewal_step_ms = 3 seconds
true -- direct_if_possible
)
end
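The numeric arguments follow from simple unit conversions, spelled out in the Python sketch below. It assumes the bandwidth limits are expressed in KiB/s, which is what the "10 MiB/s" comment next to the value 10240 implies; the variable names simply mirror the comments in the Lua code.

```python
KIB_PER_MIB = 1024

primary_kib_s = 10 * KIB_PER_MIB            # 10 MiB/s expressed in KiB/s
secondary_kib_s = primary_kib_s * 5 // 100  # 5% of the primary limit
cap_expire_sec = 5 * 60                     # 5 minutes in seconds
next_renewal_ms = 4 * 60 * 1000             # 4 minutes in milliseconds
renewal_step_ms = 3 * 1000                  # 3 seconds in milliseconds

print(primary_kib_s, secondary_kib_s, cap_expire_sec,
      next_renewal_ms, renewal_step_ms)
```

With these values the renewal is scheduled one minute before the capability expires.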
Next, add this function to the policy_decision_by_label table so that it looks like this:
local policy_decision_by_label = {
[1] = decline,
[2] = grant,
[3] = grant_web,
[4] = grant_tcpsrv({25, 587, 465}, {25}),
[5] = grant_tcpsrv({3306}, {}),
-- Add the line below:
[6] = grant_dns,
}
Copy the new policy.lua file to the Grantor servers, replacing the previous one in /etc/gatekeeper/policy.lua.
The new policy will be read when the gatekeeper service is restarted on the Grantor server, but we can also use gkctl to reload it on a running server.
If Gatekeeper was installed using the provided Debian packages, the script /usr/share/gatekeeper/reload_policy.lua should be available on the Grantor server.
Otherwise, it can be found in the gkctl/scripts directory in the Gatekeeper repository.
Simply run the command below.
# gkctl /usr/share/gatekeeper/reload_policy.lua
If Drib is being used to manage IP address ranges, add the recursive DNS IPv4 range to the servers block in /etc/drib/drib.yaml.
Note the use of class 6 to match the index added to the policy_decision_by_label variable in the policy script.
servers: {
# ...
dns: {
range: "1.2.3.64/29",
class: "6",
},
},
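When adding ranges to a group, it can be worth checking whether they overlap with existing ones. The Python sketch below does this for the sample values used in this page; the overlapping helper is hypothetical and unrelated to Drib. With these sample values the dns range falls inside the smtp range, so which class ends up applying to those addresses depends on how the overlap is resolved, and a real deployment would likely choose a range that does not collide.

```python
import ipaddress
from itertools import combinations

# Sample ranges from the servers group of this page's drib.yaml.
servers = {
    "web": "1.2.3.0/26",
    "smtp": "1.2.3.64/27",
    "mysql": "1.2.3.96/27",
    "dns": "1.2.3.64/29",
}

def overlapping(ranges):
    """Return the sorted name pairs whose networks overlap."""
    nets = {name: ipaddress.ip_network(r) for name, r in ranges.items()}
    return sorted((a, b) for a, b in combinations(sorted(nets), 2)
                  if nets[a].overlaps(nets[b]))

print(overlapping(servers))
```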
For completeness' sake, the full source code for the dns-recursive.c program can be found below.
#include <net/ethernet.h>
#include <netinet/tcp.h>
#include <netinet/udp.h>
#include "grantedv2.h"
#include "libicmp.h"
SEC("init") uint64_t
dns_init(struct gk_bpf_init_ctx *ctx)
{
return grantedv2_init_inline(ctx);
}
SEC("pkt") uint64_t
dns_pkt(struct gk_bpf_pkt_ctx *ctx)
{
struct grantedv2_state *state =
(struct grantedv2_state *)pkt_ctx_to_cookie(ctx);
struct rte_mbuf *pkt = pkt_ctx_to_pkt(ctx);
uint32_t pkt_len = pkt->pkt_len;
struct tcphdr *tcp_hdr;
uint64_t ret = grantedv2_pkt_begin(ctx, state, pkt_len);
if (ret != GK_BPF_PKT_RET_FORWARD) {
/* Primary budget exceeded. */
return ret;
}
/* Allowed L4 protocols. */
switch (ctx->l4_proto) {
case IPPROTO_ICMP:
ret = check_icmp(ctx, pkt);
if (ret != GK_BPF_PKT_RET_FORWARD)
return ret;
goto secondary_budget;
case IPPROTO_ICMPV6:
ret = check_icmp6(ctx, pkt);
if (ret != GK_BPF_PKT_RET_FORWARD)
return ret;
goto secondary_budget;
case IPPROTO_UDP: {
struct udphdr *udp_hdr;
if (ctx->fragmented)
goto secondary_budget;
if (unlikely(pkt->l4_len < sizeof(*udp_hdr))) {
/* Malformed UDP header. */
return GK_BPF_PKT_RET_DECLINE;
}
udp_hdr = rte_pktmbuf_mtod_offset(pkt, struct udphdr *,
pkt->l2_len + pkt->l3_len);
/* Authorized external services. */
switch (ntohs(udp_hdr->uh_sport)) {
case 53: /* DNS */
break;
default:
return GK_BPF_PKT_RET_DECLINE;
}
goto forward;
}
case IPPROTO_TCP:
break;
default:
return GK_BPF_PKT_RET_DECLINE;
}
/*
* Only TCP packets from here on.
*/
if (ctx->fragmented)
goto secondary_budget;
if (unlikely(pkt->l4_len < sizeof(*tcp_hdr))) {
/* Malformed TCP header. */
return GK_BPF_PKT_RET_DECLINE;
}
tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct tcphdr *,
pkt->l2_len + pkt->l3_len);
/* Authorized external services. */
switch (ntohs(tcp_hdr->th_sport)) {
case 53: /* DNS */
if (tcp_hdr->syn && !tcp_hdr->ack) {
/* No listening ports. */
return GK_BPF_PKT_RET_DECLINE;
}
break;
default:
return GK_BPF_PKT_RET_DECLINE;
}
goto forward;
secondary_budget:
ret = grantedv2_pkt_test_2nd_limit(state, pkt_len);
if (ret != GK_BPF_PKT_RET_FORWARD)
return ret;
forward:
return grantedv2_pkt_end(ctx, state);
}
Gatekeeper has a log exporter that aggregates and exports Gatekeeper's logs to an InfluxDB instance, making traffic information observable, for example via a Chronograf dashboard.
The log exporter can be installed via a Debian package available on the project's releases page.
Edit the exporter configuration file with your InfluxDB credentials and then start the gkle service with systemctl.
A Chronograf dashboard example can be found in the log exporter repository. It can be imported into Chronograf and provides a few graphs of the aggregated data collected from Gatekeeper.
The following log statistics are available:
- tot_pkts_num: total number of packets;
- tot_pkts_size: total size of packets;
- pkts_num_granted: number of granted packets;
- pkts_size_granted: size of granted packets;
- pkts_num_request: number of request packets;
- pkts_size_request: size of request packets;
- pkts_num_declined: number of declined packets;
- pkts_size_declined: size of declined packets;
- tot_pkts_num_dropped: total number of dropped packets;
- tot_pkts_size_dropped: total size of dropped packets;
- tot_pkts_num_distributed: total number of distributed packets;
- tot_pkts_size_distributed: total size of distributed packets;
- flow_table_occupancy: percentage of the flow table in use.
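These counters can be combined into derived metrics for a dashboard or an alert. The Python sketch below computes the share of declined packets from one sample; the function name and the sample numbers are made up for illustration, and how the fields are stored and queried in InfluxDB is not covered here.

```python
def declined_ratio(stats):
    """Fraction of packets that were declined, or 0.0 when there is no traffic."""
    total = stats["tot_pkts_num"]
    return stats["pkts_num_declined"] / total if total else 0.0

# Hypothetical sample of the exported statistics.
sample = {"tot_pkts_num": 2000, "pkts_num_declined": 50}
print(declined_ratio(sample))
```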
This section contains a sample Bird configuration file for the Gatekeeper server. It sets up BGP sessions with the uplink provider and with the router as described in the Introduction.
In this section we use ASN 64502 for the uplink provider and ASN 64501 for the AS where the Gatekeeper server is running.
log syslog { info, warning, error, auth, fatal, bug };
router id 10.1.0.1;
# The order in which the files are loaded is important.
protocol kernel kernel4 {
ipv4 {
export all;
};
}
protocol kernel kernel6 {
ipv6 {
export all;
};
}
protocol device {
scan time 10;
}
ipv4 table t_bgp4;
ipv6 table t_bgp6;
# Send routes learnt via BGP to the kernel through the master table.
protocol pipe bgp_into_master4 {
table master4;
peer table t_bgp4;
import all; # from table t_bgp4 into table master4
export none; # from table master4 into table t_bgp4
}
protocol pipe bgp_into_master6 {
table master6;
peer table t_bgp6;
import all; # from table t_bgp6 into table master6
export none; # from table master6 into table t_bgp6
}
#
# Functions used by filters, below.
#
function is_ipv4_martian()
prefix set martians;
{
martians = [
0.0.0.0/8+,
10.0.0.0/8+,
100.64.0.0/10+,
127.0.0.0/8+,
169.254.0.0/16+,
172.16.0.0/12+,
192.0.0.0/24+,
192.0.2.0/24+,
192.168.0.0/16+,
198.18.0.0/15+,
198.51.100.0/24+,
203.0.113.0/24+,
224.0.0.0/3+,
255.255.255.255/32+
];
return net ~ martians;
}
function ipv4_prefix_can_be_imported(prefix set my_prefixes; int peer_asn)
{
if net = 0.0.0.0/0 then {
print "rejecting ipv4 default route via BGP";
return false;
}
if is_ipv4_martian() then {
printn "ipv4 martian prefix: ";
print net;
return false;
}
if net ~ my_prefixes then {
printn "ipv4 prefix is owned by me: ";
print net;
return false;
}
if peer_asn > 0 && bgp_path.first != peer_asn then {
printn "next hop is not the BGP neighbor (";
printn bgp_path.first;
printn " is not ";
printn peer_asn;
printn "): ";
print net;
return false;
}
return true;
}
function ipv4_prefix_can_be_exported(prefix set my_prefixes)
{
return net ~ my_prefixes;
}
function is_ipv6_martian()
prefix set martians;
{
martians = [
::/128,
::1/128,
::ffff:0:0/96+,
100::/64+,
2001::/23,
2001:2::/48+,
2001:10::/28+,
2001:db8::/32+,
2002::/17+,
3ffe::/16+,
5f00::/8+,
fc00::/7,
fe80::/10,
ff00::/8+
];
return net ~ martians;
}
function ipv6_prefix_can_be_imported(prefix set my_prefixes; int peer_asn)
{
if net = ::/0 then {
print "rejecting ipv6 default route via BGP";
return false;
}
if is_ipv6_martian() then {
printn "ipv6 martian prefix: ";
print net;
return false;
}
if net ~ my_prefixes then {
printn "ipv6 prefix is owned by me: ";
print net;
return false;
}
if peer_asn > 0 && bgp_path.first != peer_asn then {
printn "next hop is not the BGP neighbor (";
printn bgp_path.first;
printn " is not ";
printn peer_asn;
printn "): ";
print net;
return false;
}
return true;
}
function ipv6_prefix_can_be_exported(prefix set my_prefixes)
{
return net ~ my_prefixes;
}
#
# Uplink provider IPv4 BGP session
#
ipv4 table t_uplink_ipv4;
# Prefixes we want to export to uplink_ipv4.
protocol static export_to_uplink_ipv4 {
ipv4 {
table t_uplink_ipv4;
import all;
};
# send routes to table t_uplink_ipv4.
route 1.2.3.0/24 reject;
}
# Import filters.
filter uplink_ipv4_in
int peer_asn;
prefix set my_prefixes;
{
peer_asn = 64502;
my_prefixes = [
1.2.3.0/24+
];
if ipv4_prefix_can_be_imported(my_prefixes, peer_asn) then
accept;
reject "prefix cannot be imported";
}
# Export filters.
filter uplink_ipv4_out
prefix set my_prefixes;
{
my_prefixes = [
1.2.3.0/24
];
if ipv4_prefix_can_be_exported(my_prefixes) then {
accept;
}
printn "configuration error: tried to export prefix ";
print net;
reject;
}
# The BGP session.
protocol bgp uplink_ipv4 {
description "uplink_ipv4";
local as 64501;
neighbor 10.1.0.2 as 64502;
source address 10.1.0.1;
ipv4 {
table t_uplink_ipv4;
igp table master4;
import filter uplink_ipv4_in;
export filter uplink_ipv4_out;
};
}
# Send all routes learnt in the BGP session above to the central bgp table.
protocol pipe uplink_ipv4_into_bgp {
table t_bgp4;
peer table t_uplink_ipv4;
import where proto = "uplink_ipv4";
export none;
}
#
# Uplink provider IPv6 BGP session
#
ipv6 table t_uplink_ipv6;
# Prefixes we want to export to uplink_ipv6.
protocol static export_to_uplink_ipv6 {
ipv6 {
table t_uplink_ipv6;
import all;
};
# send routes to table t_uplink_ipv6.
route 2001:db8:123::/48 reject;
}
# Import filters.
filter uplink_ipv6_in
int peer_asn;
prefix set my_prefixes;
{
peer_asn = 64502;
my_prefixes = [
2001:db8:123::/48+
];
if ipv6_prefix_can_be_imported(my_prefixes, peer_asn) then
accept;
reject "prefix cannot be imported";
}
# Export filters.
filter uplink_ipv6_out
prefix set my_prefixes;
{
my_prefixes = [
2001:db8:123::/48
];
if ipv6_prefix_can_be_exported(my_prefixes) then {
accept;
}
printn "configuration error: tried to export prefix ";
print net;
reject;
}
# The BGP session.
protocol bgp uplink_ipv6 {
description "uplink_ipv6";
local as 64501;
neighbor 2001:db8:1::2 as 64502;
source address 2001:db8:1::1;
ipv6 {
table t_uplink_ipv6;
igp table master6;
import filter uplink_ipv6_in;
export filter uplink_ipv6_out;
};
}
# Send all routes learnt in the BGP session above to the central bgp table.
protocol pipe uplink_ipv6_into_bgp {
table t_bgp6;
peer table t_uplink_ipv6;
import where proto = "uplink_ipv6";
export none;
}
#
# iBGP IPv4 session with our router
#
protocol bgp router_ipv4 {
description "router_ipv4";
direct;
local as 64501;
neighbor 10.2.0.2 as 64501;
source address 10.2.0.1;
ipv4 {
table t_bgp4;
igp table master4;
next hop self;
import none;
export all;
};
}
#
# iBGP IPv6 session with our router
#
protocol bgp router_ipv6 {
description "router_ipv6";
direct;
local as 64501;
neighbor fd00:2::2 as 64501;
source address fd00:2::1;
ipv6 {
table t_bgp6;
igp table master6;
next hop self;
import none;
export all;
};
}
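The import checks in the configuration above can be modeled in Python to test candidate prefixes offline. This is a sketch of the logic of is_ipv4_martian and ipv4_prefix_can_be_imported, not something Bird runs: the helper names are hypothetical, the AS path is represented as a plain list, and a pattern such as 10.0.0.0/8+ is modeled as "this prefix or any more specific one".

```python
import ipaddress

# 255.255.255.255/32 is omitted because 224.0.0.0/3 already covers it.
MARTIANS = [
    "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
    "169.254.0.0/16", "172.16.0.0/12", "192.0.0.0/24", "192.0.2.0/24",
    "192.168.0.0/16", "198.18.0.0/15", "198.51.100.0/24",
    "203.0.113.0/24", "224.0.0.0/3",
]

def matches_plus(net, pattern):
    """Model of a `prefix+` pattern: the prefix itself or a more specific one."""
    return ipaddress.ip_network(net).subnet_of(ipaddress.ip_network(pattern))

def can_import(net, as_path, my_prefixes, peer_asn):
    """Apply the same four checks as ipv4_prefix_can_be_imported()."""
    if net == "0.0.0.0/0":
        return False  # default route
    if any(matches_plus(net, m) for m in MARTIANS):
        return False  # martian prefix
    if any(matches_plus(net, p) for p in my_prefixes):
        return False  # our own address space
    if peer_asn > 0 and as_path[0] != peer_asn:
        return False  # first AS in the path is not the neighbor
    return True

print(can_import("8.8.8.0/24", [64502, 15169], ["1.2.3.0/24"], 64502))
```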