ovn: Add a case of policy based routing.

OVN currently supports multiple gateway routers (residing on
different chassis) connected to the same logical topology.

When external traffic enters the logical topology, it can enter
through any gateway router and still reach its eventual destination.
This is achieved with proper static routes configured on the gateway
routers.

But when traffic is initiated in the logical space by a logical
port, we do not have a good way to distribute that traffic across
multiple gateway routers.

This commit introduces one particular way to do it. Based on the
source IP address or source IP network of the packet, we can now
jump to a specific gateway router.
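
As a rough illustration of the matching described above, the sketch below
formats the kind of match expression a route would use, keyed on either the
source or the destination address. The helper name format_route_match and
its signature are assumptions made for this example; the patch itself builds
the string with ds_put_format() inside add_route().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of ovn-northd: writes the match expression
 * for a route into 'buf' and returns what snprintf() returns.  'src_ip'
 * selects a source-address match, otherwise the destination is matched. */
int
format_route_match(char *buf, size_t size, const char *network, int plen,
                   bool src_ip)
{
    /* Same heuristic as the patch: a '.' in the string means IPv4. */
    bool is_ipv4 = strchr(network, '.') != NULL;

    assert(plen >= 0 && plen <= (is_ipv4 ? 32 : 128));
    return snprintf(buf, size, "ip%s.%s == %s/%d",
                    is_ipv4 ? "4" : "6", src_ip ? "src" : "dst",
                    network, plen);
}
```

Under these assumptions, a src-ip route for 9.16.1.0/24 yields the match
"ip4.src == 9.16.1.0/24", while a default dst-ip route for the same prefix
yields "ip4.dst == 9.16.1.0/24".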

This is very useful for a specific Kubernetes use case.
When traffic initiated inside a container is headed to the outside world,
we want to send it out through the gateway router residing on the same
host as the container. Since each host gets a specific subnet, we can use
source IP address based policy routing to select the gateway router.
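
The per-host selection can be modelled conceptually as below. This is only
a sketch of the idea: in OVN the decision is made by logical flows, not by
code like this, and the struct, the function names, and the gateway names
are all hypothetical. Addresses are IPv4 values in host byte order to keep
the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-host record: the subnet assigned to a host and the name
 * of the gateway router that lives on that host. */
struct host_gateway {
    uint32_t network;       /* Subnet address, host byte order. */
    int plen;               /* Prefix length, 0..32. */
    const char *gateway;    /* Illustrative gateway router name. */
};

/* Returns the IPv4 netmask for 'plen'.  A prefix length of 0 is handled
 * separately because shifting a 32-bit value by 32 is undefined. */
uint32_t
prefix_mask(int plen)
{
    assert(plen >= 0 && plen <= 32);
    return plen ? (uint32_t) (~UINT32_C(0) << (32 - plen)) : 0;
}

/* Returns the gateway whose host subnet contains 'src', or NULL if no
 * subnet matches.  Host subnets are assumed not to overlap, so the first
 * match is the only match. */
const char *
gateway_for_source(const struct host_gateway *hosts, size_t n, uint32_t src)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t mask = prefix_mask(hosts[i].plen);

        if ((src & mask) == (hosts[i].network & mask)) {
            return hosts[i].gateway;
        }
    }
    return NULL;
}
```

With two hosts owning 10.1.1.0/24 and 10.1.2.0/24, a packet sourced from
10.1.2.5 would select the second host's gateway, and a packet from an
address outside both subnets would select none.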

Rationale for using the same routing table for both source and
destination IP address based routing:

Some hardware network vendors support policy routing in a separate table
that can match on arbitrary fields.  When a packet enters and matches an
entry in the policy based routing table, the default routing table is not
consulted at all.  In the case of OVN, we mainly want policy based routing
for north-south traffic, and we want east-west traffic to flow as-is.
Creating a separate table for policy based routing complicates the
configuration quite a bit.  For example, if we add a source IP network
based rule that selects a particular gateway router as the next hop, we
must also add higher-priority rules for all the connected routes in the
policy based routing table itself, to make sure that east-west traffic is
not affected.
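
The way both kinds of route share one table can be sketched with the
priority arithmetic used in the add_route() change in this commit. The
function name route_priority and its boolean parameter are assumptions made
for the example; the patch computes the same values inline from the policy
string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the arithmetic in the patch: each prefix
 * length gets two adjacent priorities, and the dst-ip route takes the
 * higher one so that it wins over a src-ip route of the same length. */
uint16_t
route_priority(int plen, bool src_ip)
{
    assert(plen >= 0 && plen <= 128);
    return (uint16_t) (src_ip ? plen * 2 : plen * 2 + 1);
}
```

Under this scheme a /24 dst-ip route (priority 49) outranks a /24 src-ip
route (priority 48), so a connected /24 network is still reached directly
even when a src-ip rule of the same length also matches. Longest-prefix
match is otherwise preserved across policies: a /24 src-ip route (48) still
outranks a /16 dst-ip route (33).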

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
1 parent 75fd74f commit 440a9f4b32bead4fa82d93b1fdceed3e55c20b4b @shettyg shettyg committed Oct 6, 2016
Showing with 334 additions and 39 deletions.
  1. +1 −0 NEWS
  2. +18 −6 ovn/northd/ovn-northd.c
  3. +6 −2 ovn/ovn-nb.ovsschema
  4. +28 −0 ovn/ovn-nb.xml
  5. +7 −1 ovn/utilities/ovn-nbctl.8.xml
  6. +32 −11 ovn/utilities/ovn-nbctl.c
  7. +23 −19 tests/ovn-nbctl.at
  8. +219 −0 tests/ovn.at
@@ -4,6 +4,7 @@ Post-v2.6.0
* QoS is now implemented via egress shaping rather than ingress policing.
* DSCP marking is now supported, via the new northbound QoS table.
* IPAM now supports fixed MAC addresses.
+ * Support for source IP address based routing.
- Fixed regression in table stats maintenance introduced in OVS
2.3.0, wherein the number of OpenFlow table hits and misses was
not accurate.
@@ -3247,10 +3247,20 @@ find_lrp_member_ip(const struct ovn_port *op, const char *ip_s)
static void
add_route(struct hmap *lflows, const struct ovn_port *op,
const char *lrp_addr_s, const char *network_s, int plen,
- const char *gateway)
+ const char *gateway, const char *policy)
{
bool is_ipv4 = strchr(network_s, '.') ? true : false;
struct ds match = DS_EMPTY_INITIALIZER;
+ const char *dir;
+ uint16_t priority;
+
+ if (policy && !strcmp(policy, "src-ip")) {
+ dir = "src";
+ priority = plen * 2;
+ } else {
+ dir = "dst";
+ priority = (plen * 2) + 1;
+ }
/* IPv6 link-local addresses must be scoped to the local router port. */
if (!is_ipv4) {
@@ -3260,7 +3270,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
ds_put_format(&match, "inport == %s && ", op->json_key);
}
}
- ds_put_format(&match, "ip%s.dst == %s/%d", is_ipv4 ? "4" : "6",
+ ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
network_s, plen);
struct ds actions = DS_EMPTY_INITIALIZER;
@@ -3284,7 +3294,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
/* The priority here is calculated to implement longest-prefix-match
* routing. */
- ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, plen,
+ ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, priority,
ds_cstr(&match), ds_cstr(&actions));
ds_destroy(&match);
ds_destroy(&actions);
@@ -3397,7 +3407,9 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
goto free_prefix_s;
}
- add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop);
+ char *policy = route->policy ? route->policy : "dst-ip";
+ add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop,
+ policy);
free_prefix_s:
free(prefix_s);
@@ -4031,13 +4043,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s,
op->lrp_networks.ipv4_addrs[i].network_s,
- op->lrp_networks.ipv4_addrs[i].plen, NULL);
+ op->lrp_networks.ipv4_addrs[i].plen, NULL, NULL);
}
for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s,
op->lrp_networks.ipv6_addrs[i].network_s,
- op->lrp_networks.ipv6_addrs[i].plen, NULL);
+ op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL);
}
}
@@ -1,7 +1,7 @@
{
"name": "OVN_Northbound",
- "version": "5.4.0",
- "cksum": "4176761817 11225",
+ "version": "5.4.1",
+ "cksum": "3773248894 11490",
"tables": {
"NB_Global": {
"columns": {
@@ -196,6 +196,10 @@
"Logical_Router_Static_Route": {
"columns": {
"ip_prefix": {"type": "string"},
+ "policy": {"type": {"key": {"type": "string",
+ "enum": ["set", ["src-ip",
+ "dst-ip"]]},
+ "min": 0, "max": 1}},
"nexthop": {"type": "string"},
"output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
"isRoot": false},
@@ -1083,12 +1083,40 @@
Each record represents a static route.
</p>
+ <p>
+ When multiple routes match a packet, the longest-prefix match is chosen.
+ For a given prefix length, a <code>dst-ip</code> route is preferred over
+ a <code>src-ip</code> route.
+ </p>
+
<column name="ip_prefix">
<p>
IP prefix of this route (e.g. 192.168.100.0/24).
</p>
</column>
+ <column name="policy">
+ <p>
+ If it is specified, this setting describes the policy used to make
+ routing decisions. This setting must be one of the following strings:
+ </p>
+ <ul>
+ <li>
+ <code>src-ip</code>: This policy sends the packet to the
+ <ref column="nexthop"/> when the packet's source IP address matches
+ <ref column="ip_prefix"/>.
+ </li>
+ <li>
+ <code>dst-ip</code>: This policy sends the packet to the
+ <ref column="nexthop"/> when the packet's destination IP address
+ matches <ref column="ip_prefix"/>.
+ </li>
+ </ul>
+ <p>
+ If not specified, the default is <code>dst-ip</code>.
+ </p>
+ </column>
+
<column name="nexthop">
<p>
Nexthop IP address for this route. Nexthop IP address should be the IP
@@ -380,7 +380,7 @@
<h1>Logical Router Static Route Commands</h1>
<dl>
- <dt>[<code>--may-exist</code>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
+ <dt>[<code>--may-exist</code>] [<code>--policy</code>=<var>POLICY</var>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
<dd>
<p>
Adds the specified route to <var>router</var>.
@@ -396,6 +396,12 @@
</p>
<p>
+ <code>--policy</code> describes the policy used to make routing
+ decisions. This should be one of "dst-ip" or "src-ip". If not
+ specified, the default is "dst-ip".
+ </p>
+
+ <p>
It is an error if a route with <var>prefix</var> already exists,
unless <code>--may-exist</code> is specified.
</p>
@@ -377,7 +377,7 @@ Logical router port commands:\n\
('enabled' or 'disabled')\n\
\n\
Route commands:\n\
- lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
+ [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
add a route to ROUTER\n\
lr-route-del ROUTER [PREFIX]\n\
remove routes from ROUTER\n\
@@ -2031,6 +2031,11 @@ nbctl_lr_route_add(struct ctl_context *ctx)
lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
char *prefix, *next_hop;
+ const char *policy = shash_find_data(&ctx->options, "--policy");
+ if (policy && strcmp(policy, "src-ip") && strcmp(policy, "dst-ip")) {
+ ctl_fatal("bad policy: %s", policy);
+ }
+
prefix = normalize_prefix_str(ctx->argv[2]);
if (!prefix) {
ctl_fatal("bad prefix argument: %s", ctx->argv[2]);
@@ -2091,6 +2096,9 @@ nbctl_lr_route_add(struct ctl_context *ctx)
nbrec_logical_router_static_route_set_output_port(route,
ctx->argv[4]);
}
+ if (policy) {
+ nbrec_logical_router_static_route_set_policy(route, policy);
+ }
free(rt_prefix);
free(next_hop);
free(prefix);
@@ -2104,6 +2112,9 @@ nbctl_lr_route_add(struct ctl_context *ctx)
if (ctx->argc == 5) {
nbrec_logical_router_static_route_set_output_port(route, ctx->argv[4]);
}
+ if (policy) {
+ nbrec_logical_router_static_route_set_policy(route, policy);
+ }
nbrec_logical_router_verify_static_routes(lr);
struct nbrec_logical_router_static_route **new_routes
@@ -2457,7 +2468,7 @@ nbctl_lrp_get_enabled(struct ctl_context *ctx)
}
struct ipv4_route {
- int plen;
+ int priority;
ovs_be32 addr;
const struct nbrec_logical_router_static_route *route;
};
@@ -2468,8 +2479,8 @@ ipv4_route_cmp(const void *route1_, const void *route2_)
const struct ipv4_route *route1p = route1_;
const struct ipv4_route *route2p = route2_;
- if (route1p->plen != route2p->plen) {
- return route1p->plen > route2p->plen ? -1 : 1;
+ if (route1p->priority != route2p->priority) {
+ return route1p->priority > route2p->priority ? -1 : 1;
} else if (route1p->addr != route2p->addr) {
return ntohl(route1p->addr) < ntohl(route2p->addr) ? -1 : 1;
} else {
@@ -2478,7 +2489,7 @@ ipv4_route_cmp(const void *route1_, const void *route2_)
}
struct ipv6_route {
- int plen;
+ int priority;
struct in6_addr addr;
const struct nbrec_logical_router_static_route *route;
};
@@ -2489,8 +2500,8 @@ ipv6_route_cmp(const void *route1_, const void *route2_)
const struct ipv6_route *route1p = route1_;
const struct ipv6_route *route2p = route2_;
- if (route1p->plen != route2p->plen) {
- return route1p->plen > route2p->plen ? -1 : 1;
+ if (route1p->priority != route2p->priority) {
+ return route1p->priority > route2p->priority ? -1 : 1;
}
return memcmp(&route1p->addr, &route2p->addr, sizeof(route1p->addr));
}
@@ -2505,6 +2516,12 @@ print_route(const struct nbrec_logical_router_static_route *route, struct ds *s)
free(prefix);
free(next_hop);
+ if (route->policy) {
+ ds_put_format(s, " %s", route->policy);
+ } else {
+ ds_put_format(s, " %s", "dst-ip");
+ }
+
if (route->output_port) {
ds_put_format(s, " %s", route->output_port);
}
@@ -2530,11 +2547,13 @@ nbctl_lr_route_list(struct ctl_context *ctx)
= lr->static_routes[i];
unsigned int plen;
ovs_be32 ipv4;
+ const char *policy = route->policy ? route->policy : "dst-ip";
char *error;
-
error = ip_parse_cidr(route->ip_prefix, &ipv4, &plen);
if (!error) {
- ipv4_routes[n_ipv4_routes].plen = plen;
+ ipv4_routes[n_ipv4_routes].priority = !strcmp(policy, "dst-ip")
+ ? (2 * plen) + 1
+ : 2 * plen;
ipv4_routes[n_ipv4_routes].addr = ipv4;
ipv4_routes[n_ipv4_routes].route = route;
n_ipv4_routes++;
@@ -2544,7 +2563,9 @@ nbctl_lr_route_list(struct ctl_context *ctx)
struct in6_addr ipv6;
error = ipv6_parse_cidr(route->ip_prefix, &ipv6, &plen);
if (!error) {
- ipv6_routes[n_ipv6_routes].plen = plen;
+ ipv6_routes[n_ipv6_routes].priority = !strcmp(policy, "dst-ip")
+ ? (2 * plen) + 1
+ : 2 * plen;
ipv6_routes[n_ipv6_routes].addr = ipv6;
ipv6_routes[n_ipv6_routes].route = route;
n_ipv6_routes++;
@@ -2947,7 +2968,7 @@ static const struct ctl_command_syntax nbctl_commands[] = {
/* logical router route commands. */
{ "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
- nbctl_lr_route_add, NULL, "--may-exist", RW },
+ nbctl_lr_route_add, NULL, "--may-exist,--policy=", RW },
{ "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL, nbctl_lr_route_del,
NULL, "--if-exists", RW },
{ "lr-route-list", 1, 1, "ROUTER", NULL, nbctl_lr_route_list, NULL,
@@ -657,20 +657,23 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1/64], [
])
AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1])
+AT_CHECK([ovn-nbctl --policy=src-ip lr-route-add lr0 9.16.1.0/24 11.0.0.1])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
- 10.0.0.0/24 11.0.0.1
- 10.0.1.0/24 11.0.1.1 lp0
- 0.0.0.0/0 192.168.0.1
+ 10.0.0.0/24 11.0.0.1 dst-ip
+ 10.0.1.0/24 11.0.1.1 dst-ip lp0
+ 9.16.1.0/24 11.0.0.1 src-ip
+ 0.0.0.0/0 192.168.0.1 dst-ip
])
AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 lp1])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
- 10.0.0.0/24 11.0.0.1 lp1
- 10.0.1.0/24 11.0.1.1 lp0
- 0.0.0.0/0 192.168.0.1
+ 10.0.0.0/24 11.0.0.1 dst-ip lp1
+ 10.0.1.0/24 11.0.1.1 dst-ip lp0
+ 9.16.1.0/24 11.0.0.1 src-ip
+ 0.0.0.0/0 192.168.0.1 dst-ip
])
dnl Delete non-existent prefix
@@ -680,11 +683,12 @@ AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.2.1/24], [1], [],
AT_CHECK([ovn-nbctl --if-exists lr-route-del lr0 10.0.2.1/24])
AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.1.1/24])
+AT_CHECK([ovn-nbctl lr-route-del lr0 9.16.1.0/24])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
- 10.0.0.0/24 11.0.0.1 lp1
- 0.0.0.0/0 192.168.0.1
+ 10.0.0.0/24 11.0.0.1 dst-ip lp1
+ 0.0.0.0/0 192.168.0.1 dst-ip
])
AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -698,17 +702,17 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv6 Routes
- 2001:db8::/64 2001:db8:0:f102::1 lp0
- 2001:db8:1::/64 2001:db8:0:f103::1
- ::/0 2001:db8:0:f101::1
+ 2001:db8::/64 2001:db8:0:f102::1 dst-ip lp0
+ 2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
+ ::/0 2001:db8:0:f101::1 dst-ip
])
AT_CHECK([ovn-nbctl lr-route-del lr0 2001:0db8:0::/64])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv6 Routes
- 2001:db8:1::/64 2001:db8:0:f103::1
- ::/0 2001:db8:0:f101::1
+ 2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
+ ::/0 2001:db8:0:f101::1 dst-ip
])
AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -725,14 +729,14 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
- 10.0.0.0/24 11.0.0.1
- 10.0.1.0/24 11.0.1.1 lp0
- 0.0.0.0/0 192.168.0.1
+ 10.0.0.0/24 11.0.0.1 dst-ip
+ 10.0.1.0/24 11.0.1.1 dst-ip lp0
+ 0.0.0.0/0 192.168.0.1 dst-ip
IPv6 Routes
- 2001:db8::/64 2001:db8:0:f102::1 lp0
- 2001:db8:1::/64 2001:db8:0:f103::1
- ::/0 2001:db8:0:f101::1
+ 2001:db8::/64 2001:db8:0:f102::1 dst-ip lp0
+ 2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
+ ::/0 2001:db8:0:f101::1 dst-ip
])
OVN_NBCTL_TEST_STOP