Commit 440a9f4

ovn: Add a case of policy based routing.
OVN currently supports multiple gateway routers (residing on different
chassis) connected to the same logical topology. When external traffic
enters the logical topology, it can enter through any gateway router and
reach its eventual destination. This is achieved with proper static routes
configured on the gateway routers.

But when traffic is initiated in the logical space by a logical port, we do
not have a good way to distribute that traffic across multiple gateway
routers.

This commit introduces one particular way to do it. Based on the source IP
address or source IP network of the packet, we can now jump to a specific
gateway router.

This is very useful for a specific use case of Kubernetes. When traffic is
initiated inside a container and is heading to the outside world, we want to
be able to send such traffic out through the gateway router residing on the
same host as the container. Since each host gets a specific subnet, we can
use source IP address based policy routing to decide on the gateway router.

Rationale for using the same routing table for both source and destination
IP address based routing:

Some hardware network vendors support policy routing in a different table
on arbitrary "match". And when a packet enters, if there is a match in the
policy based routing table, the default routing table is not consulted at
all.

In the case of OVN, we mainly want policy based routing for north-south
traffic. We want east-west traffic to flow as-is. Creating a separate
table for policy based routing complicates the configuration quite a bit.
For example, if we add a source IP network based rule that selects a
particular gateway router as the next hop, we would also have to add rules
at a higher priority for all the connected routes to make sure that
east-west traffic is not affected in the policy based routing table itself.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
1 parent 75fd74f commit 440a9f4

8 files changed

Lines changed: 334 additions & 39 deletions

NEWS

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ Post-v2.6.0
      * QoS is now implemented via egress shaping rather than ingress policing.
      * DSCP marking is now supported, via the new northbound QoS table.
      * IPAM now supports fixed MAC addresses.
+     * Support for source IP address based routing.
    - Fixed regression in table stats maintenance introduced in OVS
      2.3.0, wherein the number of OpenFlow table hits and misses was
      not accurate.

ovn/northd/ovn-northd.c

Lines changed: 18 additions & 6 deletions
@@ -3247,10 +3247,20 @@ find_lrp_member_ip(const struct ovn_port *op, const char *ip_s)
 static void
 add_route(struct hmap *lflows, const struct ovn_port *op,
           const char *lrp_addr_s, const char *network_s, int plen,
-          const char *gateway)
+          const char *gateway, const char *policy)
 {
     bool is_ipv4 = strchr(network_s, '.') ? true : false;
     struct ds match = DS_EMPTY_INITIALIZER;
+    const char *dir;
+    uint16_t priority;
+
+    if (policy && !strcmp(policy, "src-ip")) {
+        dir = "src";
+        priority = plen * 2;
+    } else {
+        dir = "dst";
+        priority = (plen * 2) + 1;
+    }

     /* IPv6 link-local addresses must be scoped to the local router port. */
     if (!is_ipv4) {
@@ -3260,7 +3270,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
             ds_put_format(&match, "inport == %s && ", op->json_key);
         }
     }
-    ds_put_format(&match, "ip%s.dst == %s/%d", is_ipv4 ? "4" : "6",
+    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
                   network_s, plen);

     struct ds actions = DS_EMPTY_INITIALIZER;
@@ -3284,7 +3294,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op,

     /* The priority here is calculated to implement longest-prefix-match
      * routing. */
-    ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, plen,
+    ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, priority,
                   ds_cstr(&match), ds_cstr(&actions));
     ds_destroy(&match);
     ds_destroy(&actions);
@@ -3397,7 +3407,9 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
         goto free_prefix_s;
     }

-    add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop);
+    char *policy = route->policy ? route->policy : "dst-ip";
+    add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop,
+              policy);

 free_prefix_s:
     free(prefix_s);
@@ -4031,13 +4043,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
             add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s,
                       op->lrp_networks.ipv4_addrs[i].network_s,
-                      op->lrp_networks.ipv4_addrs[i].plen, NULL);
+                      op->lrp_networks.ipv4_addrs[i].plen, NULL, NULL);
         }

         for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
             add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s,
                       op->lrp_networks.ipv6_addrs[i].network_s,
-                      op->lrp_networks.ipv6_addrs[i].plen, NULL);
+                      op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL);
         }
     }
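
The priority rule that this file's diff adds to add_route() can be tried outside of ovn-northd. The following is a minimal standalone sketch, not the ovn-northd code: the helper names route_priority() and route_match() are hypothetical, and snprintf() stands in for the ds_put_format() call. It assumes the same convention as the diff: a "src-ip" route gets priority 2 * plen, and anything else, including a NULL policy, is treated as "dst-ip" and gets 2 * plen + 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring the priority rule in add_route():
 * "src-ip" routes get 2 * plen; everything else (including a NULL
 * policy) is treated as "dst-ip" and gets 2 * plen + 1. */
static uint16_t
route_priority(int plen, const char *policy)
{
    assert(plen >= 0 && plen <= 128);
    if (policy && !strcmp(policy, "src-ip")) {
        return plen * 2;
    }
    return plen * 2 + 1;
}

/* Hypothetical helper that formats the logical flow match in the same
 * shape as the patched ds_put_format() call.  Returns what snprintf()
 * returns: the number of characters the full match string needs. */
static int
route_match(char *buf, size_t size, const char *network_s, int plen,
            const char *policy)
{
    bool is_ipv4 = strchr(network_s, '.') != NULL;
    const char *dir = policy && !strcmp(policy, "src-ip") ? "src" : "dst";

    return snprintf(buf, size, "ip%s.%s == %s/%d",
                    is_ipv4 ? "4" : "6", dir, network_s, plen);
}
```

Under this rule a /24 src-ip route (priority 48) outranks a /0 dst-ip default route (priority 1) but loses to a /24 dst-ip route (priority 49). That is what lets connected routes, which are added with a NULL policy, keep east-west traffic away from the source-based rule.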

ovn/ovn-nb.ovsschema

Lines changed: 6 additions & 2 deletions
@@ -1,7 +1,7 @@
 {
     "name": "OVN_Northbound",
-    "version": "5.4.0",
-    "cksum": "4176761817 11225",
+    "version": "5.4.1",
+    "cksum": "3773248894 11490",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -196,6 +196,10 @@
         "Logical_Router_Static_Route": {
             "columns": {
                 "ip_prefix": {"type": "string"},
+                "policy": {"type": {"key": {"type": "string",
+                                            "enum": ["set", ["src-ip",
+                                                             "dst-ip"]]},
+                                    "min": 0, "max": 1}},
                 "nexthop": {"type": "string"},
                 "output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
             "isRoot": false},

ovn/ovn-nb.xml

Lines changed: 28 additions & 0 deletions
@@ -1083,12 +1083,40 @@
         Each record represents a static route.
       </p>

+      <p>
+        When multiple routes match a packet, the longest-prefix match is chosen.
+        For a given prefix length, a <code>dst-ip</code> route is preferred over
+        a <code>src-ip</code> route.
+      </p>
+
       <column name="ip_prefix">
         <p>
           IP prefix of this route (e.g. 192.168.100.0/24).
         </p>
       </column>

+      <column name="policy">
+        <p>
+          If it is specified, this setting describes the policy used to make
+          routing decisions. This setting must be one of the following strings:
+        </p>
+        <ul>
+          <li>
+            <code>src-ip</code>: This policy sends the packet to the
+            <ref column="nexthop"/> when the packet's source IP address matches
+            <ref column="ip_prefix"/>.
+          </li>
+          <li>
+            <code>dst-ip</code>: This policy sends the packet to the
+            <ref column="nexthop"/> when the packet's destination IP address
+            matches <ref column="ip_prefix"/>.
+          </li>
+        </ul>
+        <p>
+          If not specified, the default is <code>dst-ip</code>.
+        </p>
+      </column>
+
       <column name="nexthop">
         <p>
           Nexthop IP address for this route. Nexthop IP address should be the IP
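
The selection rule documented in this file (longest prefix wins, and dst-ip beats src-ip at equal prefix length) can be sketched as a small lookup function. This is an illustration only, assuming IPv4 addresses in host byte order: struct route, the IPV4() macro, lookup_priority() and select_route() are hypothetical names, and real OVN evaluates these routes as prioritized logical flows rather than by scanning an array. When two matching routes have the same priority, this sketch simply keeps the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Builds a host-byte-order IPv4 address from four octets. */
#define IPV4(a, b, c, d) \
    (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16) \
     | ((uint32_t) (c) << 8) | (uint32_t) (d))

enum route_policy { POLICY_DST_IP, POLICY_SRC_IP };

struct route {
    uint32_t prefix;            /* Network address, host byte order. */
    int plen;                   /* Prefix length, 0 to 32. */
    enum route_policy policy;
};

static uint32_t
prefix_mask(int plen)
{
    assert(plen >= 0 && plen <= 32);
    return plen ? UINT32_MAX << (32 - plen) : 0;
}

/* Same ordering as the documentation: 2 * plen for src-ip routes,
 * 2 * plen + 1 for dst-ip routes. */
static int
lookup_priority(const struct route *r)
{
    return 2 * r->plen + (r->policy == POLICY_DST_IP ? 1 : 0);
}

/* Returns the index of the matching route with the highest priority,
 * or -1 if no route matches the packet. */
static int
select_route(const struct route *routes, size_t n,
             uint32_t src, uint32_t dst)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        const struct route *r = &routes[i];
        uint32_t mask = prefix_mask(r->plen);
        /* A src-ip route compares the source address, a dst-ip route
         * compares the destination address. */
        uint32_t addr = r->policy == POLICY_SRC_IP ? src : dst;

        if ((addr & mask) != (r->prefix & mask)) {
            continue;
        }
        if (best < 0 || lookup_priority(r) > lookup_priority(&routes[best])) {
            best = (int) i;
        }
    }
    return best;
}
```

With routes 10.0.0.0/24 (dst-ip), 9.16.1.0/24 (src-ip) and 0.0.0.0/0 (dst-ip), a packet from 9.16.1.5 to an external address selects the src-ip route, while a packet from 9.16.1.5 to 10.0.0.7 still selects the 10.0.0.0/24 dst-ip route.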

ovn/utilities/ovn-nbctl.8.xml

Lines changed: 7 additions & 1 deletion
@@ -380,7 +380,7 @@
     <h1>Logical Router Static Route Commands</h1>

     <dl>
-      <dt>[<code>--may-exist</code>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
+      <dt>[<code>--may-exist</code>] [<code>--policy</code>=<var>POLICY</var>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
       <dd>
         <p>
           Adds the specified route to <var>router</var>.
@@ -395,6 +395,12 @@
           on <var>nexthop</var>.
         </p>

+        <p>
+          <code>--policy</code> describes the policy used to make routing
+          decisions. This should be one of "dst-ip" or "src-ip". If not
+          specified, the default is "dst-ip".
+        </p>
+
         <p>
           It is an error if a route with <var>prefix</var> already exists,
           unless <code>--may-exist</code> is specified.
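
The handling of the --policy value described above (only "src-ip" or "dst-ip" accepted, "dst-ip" assumed when the option is absent) can be sketched as two small helpers. The names policy_is_valid() and effective_policy() are hypothetical and do not appear in ovn-nbctl, which performs the check inline and reports an error through ctl_fatal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if 'policy' is acceptable: NULL (option not given) or
 * exactly one of the two documented strings.  The comparison is
 * case-sensitive, as strcmp() is in the real check. */
static bool
policy_is_valid(const char *policy)
{
    return !policy
           || !strcmp(policy, "src-ip")
           || !strcmp(policy, "dst-ip");
}

/* Returns the policy that applies to a route, substituting the
 * documented default when none was specified. */
static const char *
effective_policy(const char *policy)
{
    assert(policy_is_valid(policy));
    return policy ? policy : "dst-ip";
}
```

Keeping the default out of the database and applying it only when reading is the same choice the commit makes: the policy column stays empty unless the option is given.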

ovn/utilities/ovn-nbctl.c

Lines changed: 32 additions & 11 deletions
@@ -377,7 +377,7 @@ Logical router port commands:\n\
                             ('enabled' or 'disabled')\n\
 \n\
 Route commands:\n\
-  lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
+  [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
                             add a route to ROUTER\n\
   lr-route-del ROUTER [PREFIX]\n\
                             remove routes from ROUTER\n\
@@ -2031,6 +2031,11 @@ nbctl_lr_route_add(struct ctl_context *ctx)
     lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
     char *prefix, *next_hop;

+    const char *policy = shash_find_data(&ctx->options, "--policy");
+    if (policy && strcmp(policy, "src-ip") && strcmp(policy, "dst-ip")) {
+        ctl_fatal("bad policy: %s", policy);
+    }
+
     prefix = normalize_prefix_str(ctx->argv[2]);
     if (!prefix) {
         ctl_fatal("bad prefix argument: %s", ctx->argv[2]);
@@ -2091,6 +2096,9 @@ nbctl_lr_route_add(struct ctl_context *ctx)
                 nbrec_logical_router_static_route_set_output_port(route,
                                                                   ctx->argv[4]);
             }
+            if (policy) {
+                nbrec_logical_router_static_route_set_policy(route, policy);
+            }
             free(rt_prefix);
             free(next_hop);
             free(prefix);
@@ -2104,6 +2112,9 @@ nbctl_lr_route_add(struct ctl_context *ctx)
     if (ctx->argc == 5) {
         nbrec_logical_router_static_route_set_output_port(route, ctx->argv[4]);
     }
+    if (policy) {
+        nbrec_logical_router_static_route_set_policy(route, policy);
+    }

     nbrec_logical_router_verify_static_routes(lr);
     struct nbrec_logical_router_static_route **new_routes
@@ -2457,7 +2468,7 @@ nbctl_lrp_get_enabled(struct ctl_context *ctx)
 }

 struct ipv4_route {
-    int plen;
+    int priority;
     ovs_be32 addr;
     const struct nbrec_logical_router_static_route *route;
 };
@@ -2468,8 +2479,8 @@ ipv4_route_cmp(const void *route1_, const void *route2_)
     const struct ipv4_route *route1p = route1_;
     const struct ipv4_route *route2p = route2_;

-    if (route1p->plen != route2p->plen) {
-        return route1p->plen > route2p->plen ? -1 : 1;
+    if (route1p->priority != route2p->priority) {
+        return route1p->priority > route2p->priority ? -1 : 1;
     } else if (route1p->addr != route2p->addr) {
         return ntohl(route1p->addr) < ntohl(route2p->addr) ? -1 : 1;
     } else {
@@ -2478,7 +2489,7 @@ ipv4_route_cmp(const void *route1_, const void *route2_)
 }

 struct ipv6_route {
-    int plen;
+    int priority;
     struct in6_addr addr;
     const struct nbrec_logical_router_static_route *route;
 };
@@ -2489,8 +2500,8 @@ ipv6_route_cmp(const void *route1_, const void *route2_)
     const struct ipv6_route *route1p = route1_;
     const struct ipv6_route *route2p = route2_;

-    if (route1p->plen != route2p->plen) {
-        return route1p->plen > route2p->plen ? -1 : 1;
+    if (route1p->priority != route2p->priority) {
+        return route1p->priority > route2p->priority ? -1 : 1;
     }
     return memcmp(&route1p->addr, &route2p->addr, sizeof(route1p->addr));
 }
@@ -2505,6 +2516,12 @@ print_route(const struct nbrec_logical_router_static_route *route, struct ds *s)
     free(prefix);
     free(next_hop);

+    if (route->policy) {
+        ds_put_format(s, " %s", route->policy);
+    } else {
+        ds_put_format(s, " %s", "dst-ip");
+    }
+
     if (route->output_port) {
         ds_put_format(s, " %s", route->output_port);
     }
@@ -2530,11 +2547,13 @@ nbctl_lr_route_list(struct ctl_context *ctx)
             = lr->static_routes[i];
         unsigned int plen;
         ovs_be32 ipv4;
+        const char *policy = route->policy ? route->policy : "dst-ip";
         char *error;
-
         error = ip_parse_cidr(route->ip_prefix, &ipv4, &plen);
         if (!error) {
-            ipv4_routes[n_ipv4_routes].plen = plen;
+            ipv4_routes[n_ipv4_routes].priority = !strcmp(policy, "dst-ip")
+                                                    ? (2 * plen) + 1
+                                                    : 2 * plen;
             ipv4_routes[n_ipv4_routes].addr = ipv4;
             ipv4_routes[n_ipv4_routes].route = route;
             n_ipv4_routes++;
@@ -2544,7 +2563,9 @@ nbctl_lr_route_list(struct ctl_context *ctx)
         struct in6_addr ipv6;
         error = ipv6_parse_cidr(route->ip_prefix, &ipv6, &plen);
         if (!error) {
-            ipv6_routes[n_ipv6_routes].plen = plen;
+            ipv6_routes[n_ipv6_routes].priority = !strcmp(policy, "dst-ip")
+                                                    ? (2 * plen) + 1
+                                                    : 2 * plen;
             ipv6_routes[n_ipv6_routes].addr = ipv6;
             ipv6_routes[n_ipv6_routes].route = route;
             n_ipv6_routes++;
@@ -2947,7 +2968,7 @@ static const struct ctl_command_syntax nbctl_commands[] = {

     /* logical router route commands. */
     { "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
-      nbctl_lr_route_add, NULL, "--may-exist", RW },
+      nbctl_lr_route_add, NULL, "--may-exist,--policy=", RW },
     { "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL, nbctl_lr_route_del,
       NULL, "--if-exists", RW },
     { "lr-route-list", 1, 1, "ROUTER", NULL, nbctl_lr_route_list, NULL,
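
The listing order produced by the new comparator (highest priority first, then ascending address) can be reproduced with a standalone qsort() call. This is a sketch under simplifying assumptions: struct listed_route, listed_route_cmp() and sort_routes() are hypothetical names, addresses are kept in host byte order so no ntohl() is needed, and the final "equal" branch returns 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct listed_route {
    int priority;       /* 2 * plen + 1 for dst-ip, 2 * plen for src-ip. */
    uint32_t addr;      /* IPv4 network address, host byte order. */
};

/* Sorts by descending priority, then by ascending address, following
 * the shape of ipv4_route_cmp() in the diff above. */
static int
listed_route_cmp(const void *a_, const void *b_)
{
    const struct listed_route *a = a_;
    const struct listed_route *b = b_;

    if (a->priority != b->priority) {
        return a->priority > b->priority ? -1 : 1;
    } else if (a->addr != b->addr) {
        return a->addr < b->addr ? -1 : 1;
    }
    return 0;
}

static void
sort_routes(struct listed_route *routes, size_t n)
{
    assert(routes || !n);
    qsort(routes, n, sizeof *routes, listed_route_cmp);
}
```

Applied to the four IPv4 routes used in tests/ovn-nbctl.at, the two /24 dst-ip routes (priority 49) sort first, the /24 src-ip route 9.16.1.0/24 (priority 48) comes next, and the /0 default route (priority 1) comes last, which is the order the expected test output lists them in.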

tests/ovn-nbctl.at

Lines changed: 23 additions & 19 deletions
@@ -657,20 +657,23 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1/64], [
 ])

 AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1])
+AT_CHECK([ovn-nbctl --policy=src-ip lr-route-add lr0 9.16.1.0/24 11.0.0.1])

 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-10.0.0.0/24 11.0.0.1
-10.0.1.0/24 11.0.1.1 lp0
-0.0.0.0/0 192.168.0.1
+10.0.0.0/24 11.0.0.1 dst-ip
+10.0.1.0/24 11.0.1.1 dst-ip lp0
+9.16.1.0/24 11.0.0.1 src-ip
+0.0.0.0/0 192.168.0.1 dst-ip
 ])

 AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 lp1])
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-10.0.0.0/24 11.0.0.1 lp1
-10.0.1.0/24 11.0.1.1 lp0
-0.0.0.0/0 192.168.0.1
+10.0.0.0/24 11.0.0.1 dst-ip lp1
+10.0.1.0/24 11.0.1.1 dst-ip lp0
+9.16.1.0/24 11.0.0.1 src-ip
+0.0.0.0/0 192.168.0.1 dst-ip
 ])

 dnl Delete non-existent prefix
@@ -680,11 +683,12 @@ AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.2.1/24], [1], [],
 AT_CHECK([ovn-nbctl --if-exists lr-route-del lr0 10.0.2.1/24])

 AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.1.1/24])
+AT_CHECK([ovn-nbctl lr-route-del lr0 9.16.1.0/24])

 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-10.0.0.0/24 11.0.0.1 lp1
-0.0.0.0/0 192.168.0.1
+10.0.0.0/24 11.0.0.1 dst-ip lp1
+0.0.0.0/0 192.168.0.1 dst-ip
 ])

 AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -698,17 +702,17 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])

 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv6 Routes
-2001:db8::/64 2001:db8:0:f102::1 lp0
-2001:db8:1::/64 2001:db8:0:f103::1
-::/0 2001:db8:0:f101::1
+2001:db8::/64 2001:db8:0:f102::1 dst-ip lp0
+2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
+::/0 2001:db8:0:f101::1 dst-ip
 ])

 AT_CHECK([ovn-nbctl lr-route-del lr0 2001:0db8:0::/64])

 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv6 Routes
-2001:db8:1::/64 2001:db8:0:f103::1
-::/0 2001:db8:0:f101::1
+2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
+::/0 2001:db8:0:f101::1 dst-ip
 ])

 AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -725,14 +729,14 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])

 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-10.0.0.0/24 11.0.0.1
-10.0.1.0/24 11.0.1.1 lp0
-0.0.0.0/0 192.168.0.1
+10.0.0.0/24 11.0.0.1 dst-ip
+10.0.1.0/24 11.0.1.1 dst-ip lp0
+0.0.0.0/0 192.168.0.1 dst-ip

 IPv6 Routes
-2001:db8::/64 2001:db8:0:f102::1 lp0
-2001:db8:1::/64 2001:db8:0:f103::1
-::/0 2001:db8:0:f101::1
+2001:db8::/64 2001:db8:0:f102::1 dst-ip lp0
+2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
+::/0 2001:db8:0:f101::1 dst-ip
 ])

 OVN_NBCTL_TEST_STOP
