Fix joold advertise
Had to rewrite kernel-side joold again. New, better design. Implements
joold advertise (because it somehow used to be a no-op), while keeping
busy looping and packet allocations outside of the spinlock.

Deprecates ss-max-payload in favor of ss-max-sessions-per-packet,
partly because the latter is more intuitive (hopefully), and partly
because the former was trickier with the new implementation.

Also, please note that the ss-capacity warning changed:

> joold: Too many sessions deferred! I need to drop some; sorry.

Also tweaked the documentation a little. For some reason, it was
parroting that the channel between joolds is TCP, when it's supposed to
be UDP. Also patched some broken links.

Fixes #410.
ydahhrk committed Aug 11, 2023
1 parent 07e6fd9 commit 4fcfe18
Showing 14 changed files with 880 additions and 373 deletions.
3 changes: 2 additions & 1 deletion docs/en/config-atomic.md
@@ -176,7 +176,8 @@ Also, `pool6` is mandatory and immutable (as normal). It must be set during inst
"<a href="usr-flags-global.html#ss-flush-asap">ss-flush-asap</a>": true,
"<a href="usr-flags-global.html#ss-flush-deadline">ss-flush-deadline</a>": 2000,
"<a href="usr-flags-global.html#ss-capacity">ss-capacity</a>": 512,
"<a href="usr-flags-global.html#ss-max-payload">ss-max-payload</a>": 1452
"<a href="usr-flags-global.html#ss-max-payload">ss-max-payload</a>": 1452,
"<a href="usr-flags-global.html#ss-max-sessions-per-packet">ss-max-sessions-per-packet</a>": 10
},

"<a href="usr-flags-pool4.html">pool4</a>": [
18 changes: 10 additions & 8 deletions docs/en/session-synchronization.md
@@ -84,14 +84,14 @@ Why are the daemons necessary? because kernel modules cannot open IP sockets; at

Synchronizing sessions is _all_ the daemons do; the traffic redirection part is delegated to other protocols. [Keepalived](http://www.keepalived.org/) is the implementation that takes care of this in the sample configuration below, but any other load balancer should also get the job done.

-In this proposed/inauguratory implementation, SS traffic is distributed through an IPv4 or IPv6 unencrypted TCP connection. You might want to cast votes on the issue tracker or propose code if you favor some other solution.
+In this proposed/inauguratory implementation, SS traffic is distributed through an IPv4 or IPv6 unencrypted UDP connection. You might want to cast votes on the issue tracker or propose code if you favor some other solution.

There are two operation modes in which SS can be used:

1. Active/Passive: One Jool instance serves traffic at any given time, the other ones serve as backup. The load balancer redirects traffic when the current active NAT64 dies.
2. Active/Active: All Jool instances serve traffic. The load balancer distributes traffic so no NAT64 is too heavily encumbered.

-Active/Active is discouraged because the session synchronization across Jool instances does not lock and is not instantaneous; if the translating traffic is faster, the session tables can end up desynchronized. Users will perceive this mainly as difficulties opening connections through the translators.
+> ![Warning!](../images/warning.svg) Active/Active is discouraged because the session synchronization across Jool instances does not lock and is not instantaneous; if the translating traffic is faster, the session tables can end up desynchronized. Users will perceive this mainly as difficulties opening connections through the translators.
It is also important to note that SS is relatively resource-intensive; its traffic is not only _extra_ traffic, but it must also do two full U-turns to userspace before reaching its destination:

@@ -167,6 +167,8 @@ This is generally usual boilerplate Jool mumbo jumbo. `2001:db8::4-5` and `192.0

It is important to note that every translator instance must have the same configuration as the other ones before SS is started. Make sure you've manually synchronized pool6, pool4, static BIB entries, the global variables and any other internal Jool configuration you might have.

+The clocks don't need to be synchronized.

### Jool Instance

Because forking SS sessions on every translated packet is not free (performance-wise), jool instances are not SS-enabled by default. The fact that the module and the daemon are separate binaries enhances the importance of this fact; starting the daemon is not, by itself, enough to get sessions synchronized.
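
SS is toggled per instance with the [`ss-enabled`](usr-flags-global.html#ss-enabled) flag. As a minimal sketch (the instance name `default` is illustrative, not part of this commit), each translator would opt in with something like:

```
$ jool -i default global update ss-enabled true
```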
@@ -334,7 +336,7 @@ vrrp_instance VI_1 {
2001:db8::1/96
}

-# J is our secondary NAT64; start in the "BACKUP" state.
+# K is our secondary NAT64; start in the "BACKUP" state.
state BACKUP
# Will only upgrade to master if this is the highest priority node that
# is alive.
@@ -472,11 +474,11 @@ That's all.

### `jool`

-1. [`ss-enabled`](usr-flags-global.html#--ss-enabled)
-2. [`ss-flush-asap`](usr-flags-global.html#--ss-flush-asap)
-3. [`ss-flush-deadline`](usr-flags-global.html#--ss-flush-deadline)
-4. [`ss-capacity`](usr-flags-global.html#--ss-capacity)
-5. [`ss-max-payload`](usr-flags-global.html#--ss-max-payload)
+1. [`ss-enabled`](usr-flags-global.html#ss-enabled)
+2. [`ss-flush-asap`](usr-flags-global.html#ss-flush-asap)
+3. [`ss-flush-deadline`](usr-flags-global.html#ss-flush-deadline)
+4. [`ss-capacity`](usr-flags-global.html#ss-capacity)
+5. [`ss-max-sessions-per-packet`](usr-flags-global.html#ss-max-sessions-per-packet)

### `joold`

47 changes: 36 additions & 11 deletions docs/en/usr-flags-global.md
@@ -46,6 +46,7 @@ title: global Mode
25. [`ss-flush-deadline`](#ss-flush-deadline)
26. [`ss-capacity`](#ss-capacity)
27. [`ss-max-payload`](#ss-max-payload)
+28. [`ss-max-sessions-per-packet`](#ss-max-sessions-per-packet)

## Description

@@ -636,7 +637,7 @@ In the Active/Active scenario in particular, this essentially means that every t

In the Active/Passive model, on the other hand, this level of compulsive replication is rather undesired. Since a single big packet is easier to send than the equivalent many smaller ones, it is preferable to queue the session updates and wrap a bunch of them in a single big multicast, thereby reducing the SS packet-to-translated packet ratio and CPU overhead. This is appropriate in Active/Passive mode because the backup NAT64s are not expected to receive traffic in the near future, and losing a few recent queued session updates on crash is no big deal when the sizeable rest of the database has already been dispatched.

-Sessions will be queued until the maximum packet size is reached or a timer expires. The maximum packet size is defined by [`ss-max-payload`](#ss-max-payload) and the duration of the timer is [`ss-flush-deadline`](#ss-flush-deadline).
+Sessions will be queued until the maximum packet size is reached or a timer expires. The maximum packet size is defined by [`ss-max-sessions-per-packet`](#ss-max-sessions-per-packet) and the duration of the timer is [`ss-flush-deadline`](#ss-flush-deadline).

As a rule of thumb, you might think of this option as an "Active/Active vs Active/Passive" switch; in the former this flag is practically mandatory, while in the latter it is needlessly CPU-taxing. (But still legal, which explains the default.)
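
As an illustration, an Active/Passive deployment would typically disable eager flushing and rely on the deadline timer instead. A sketch (the instance name and values are illustrative):

```
$ jool -i default global update ss-flush-asap false
$ jool -i default global update ss-flush-deadline 2000
```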

@@ -670,7 +671,7 @@ If SS cannot keep up with the amount of traffic it needs to multicast, this maxi

Watch out for this message in the kernel logs:

-Joold: Too many sessions deferred! I need to drop some; sorry.
+joold: Too many sessions deferred! I need to drop some; sorry.

### `ss-max-payload`

@@ -679,23 +680,47 @@ Watch out for this message in the kernel logs:
- Modes: Stateful NAT64 only
- Source: [Issue 113]({{ site.repository-url }}/issues/113)

-Number of bytes per packet Jool can fit SS content (header and sessions) in.
+Deprecated; does nothing as of Jool 4.1.11.

-`joold` (the daemon) is (aside from a few validations) just a bridge; it receives bytes from the kernel module, wraps them in a TCP packet and sends it to other daemons, who similarly pass the bytes untouched. They are not even aware that those bytes contain sessions.

+### `ss-max-sessions-per-packet`

+- Type: Integer
+- Default: 10
+- Modes: Stateful NAT64 only
+- Source: [Issue 113]({{ site.repository-url }}/issues/113), [issue 410]({{ site.repository-url }}/issues/410)

+`joold` (the daemon) is (aside from a few validations) just a bridge; it receives bytes from the kernel module, wraps them in a UDP packet and sends it to other daemons, who similarly pass the bytes untouched. They are not even aware that those bytes contain sessions.

-Since fragmentation is undesired, and since the kernel module is (to all intents and purposes) the one that's building the SS packets, it should not exceed the PMTU while doing so. The module has little understanding of the "multicast" network though, so it lacks fancy utilities to compute it. That's where this option comes in.

-The default value is based on 1500, the typical minimum MTU one can be forgiven to expect in a controlled network. (The SS "multicast" network.)
+`ss-max-sessions-per-packet` is the maximum number of sessions joold will transfer per packet. You want it as large as possible, while avoiding IPv4/IPv6 fragmentation.

+When `ss-flush-asap` is false, Jool will pretty much always wait until this number of sessions has been collected before sending a joold packet. When it's true, Jool will tend to send sessions more eagerly, but will still strictly cap itself at this maximum.

-The equation is
+The optimal value is `floor((M - I - U - R) / S)`, where

-	ss-max-payload = MTU - IP header size - UDP header size
+1. `M` is the MTU of the path between your joolds (usually 1500),
+2. `I` is the size of the header of the IP protocol your joolds will use to exchange sessions (40 for IPv6, 20 for IPv4),
+3. `U` is the size of the UDP header (8),
+4. `R` is the size of a Netlink attribute header,
+5. and `S` is the size of a serialized session.

-Since I don't know whether your network is IPv4 or IPv6, the default value was inferred from the following numbers:
+`R` should be constant (unless something has gone horribly wrong), but ultimately depends on your kernel version. `S` depends on your Jool version, and should only change between minor updates (i.e. when the first or second numbers of Jool's version change). One way to find both is by running Jool's `joold` unit test:

-	default = 1500 - max(20, 40) - 8 = 1452
+```
+$ cd /path/to/jool/source
+$ cd test/unit/joold
+$ make
+$ sudo make test | head
+...
+Jool: Netlink attribute header size: 4
+Jool: Serialized session size: 140
+...
+```

-In Jool 3.5.0, The joold header spans 16 bytes and each session is 64 bytes long, which means Jool will be able to fit 22 sessions per SS packet by default.
+So the default value came out of

-Feel free to adjust your MTU to reduce CPU overhead further in Active/Passive setups. (See [`ss-flush-asap`](#ss-flush-asap).)
+```
+floor((1500 - max(20, 40) - 8 - 4) / 140)
+```
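
which evaluates to `floor(1448 / 140) = 10`. If your SS network's MTU is smaller (say, IPv6's minimum of 1280), the same formula yields `floor((1280 - 40 - 8 - 4) / 140) = 8`, and the flag can be adjusted accordingly (sketch; the instance name is illustrative):

```
$ jool -i default global update ss-max-sessions-per-packet 8
```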

1 change: 1 addition & 0 deletions src/common/config.c
@@ -123,6 +123,7 @@ struct nla_policy nat64_globals_policy[JNLAG_COUNT] = {
[JNLAG_JOOLD_FLUSH_DEADLINE] = { .type = NLA_U32 },
[JNLAG_JOOLD_CAPACITY] = { .type = NLA_U32 },
[JNLAG_JOOLD_MAX_PAYLOAD] = { .type = NLA_U32 },
+[JNLAG_JOOLD_MAX_SESSIONS_PER_PACKET] = { .type = NLA_U32 },
};

int iname_validate(const char *iname, bool allow_null)
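
Note that this policy only validates the attribute's type; a handler is still expected to read the value out. A minimal sketch of the usual Netlink pattern (the `attrs` array and `cfg` pointer are assumptions, not shown in this diff):

```c
/* Sketch: consuming the new attribute after nla_parse() has validated
 * the request against nat64_globals_policy. */
if (attrs[JNLAG_JOOLD_MAX_SESSIONS_PER_PACKET])
	cfg->nat64.joold.max_sessions_per_pkt =
			nla_get_u32(attrs[JNLAG_JOOLD_MAX_SESSIONS_PER_PACKET]);
```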
14 changes: 9 additions & 5 deletions src/common/config.h
@@ -252,6 +252,7 @@ enum joolnl_attr_global {
JNLAG_JOOLD_FLUSH_DEADLINE,
JNLAG_JOOLD_CAPACITY,
JNLAG_JOOLD_MAX_PAYLOAD,
+JNLAG_JOOLD_MAX_SESSIONS_PER_PACKET,

/* Needs to be last */
JNLAG_COUNT,
@@ -309,6 +310,8 @@ struct joolnlhdr {
char iname[INAME_MAX_SIZE];
};

+#define JOOLNL_HDRLEN NLMSG_ALIGN(sizeof(struct joolnlhdr))

struct config_prefix6 {
bool set;
/** Please note that this could be garbage; see above. */
@@ -452,22 +455,23 @@ struct joold_config {
*/
__u32 capacity;

+/** Deprecated as of 4.1.11, does nothing. */
+__u32 max_payload;

/**
-* Maximum amount of bytes joold should send per packet, excluding
-* IP/UDP headers.
+* Maximum number of sessions joold should send per packet.
*
* This exists because userspace joold sends sessions via UDP. UDP is
* rather packet-oriented, as opposed to stream-oriented, so it doesn't
* discover PMTU and instead tends to fragment when we send too many
* sessions per packet. Which is bad.
*
-* So the user, after figuring out the MTU, can tweak this number to
-* prevent fragmentation.
+* So the user can tweak this number to prevent fragmentation.
*
* We should probably handle this ourselves but it sounds like a lot of
* code. (I guess I'm missing something.)
*/
-__u32 max_payload;
+__u32 max_sessions_per_pkt;
};

/**
19 changes: 19 additions & 0 deletions src/common/constants.h
@@ -94,6 +94,25 @@
*/
#define DEFAULT_JOOLD_MAX_PAYLOAD 1452

+/**
+ * This needs to be
+ *
+ *	floor(
+ *		(
+ *			Typical MTU
+ *			- max(IPv4 header size, IPv6 header size)
+ *			- UDP header size
+ *			- root attribute header size
+ *		) / (
+ *			serialized session size
+ *		)
+ *	)
+ *
+ * The root attribute header size and serialized session size need to be
+ * computed the hard way. Run the joold unit test to find them in dmesg.
+ */
+#define DEFAULT_JOOLD_MAX_SESSIONS_PER_PKT ((1500 - 40 - 8 - 4) / 140)

/* -- IPv6 Pool -- */

#define WELL_KNOWN_PREFIX "64:ff9b::/96"
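
If one wanted to enforce that arithmetic at build time, a compile-time check along these lines would do (a sketch, not part of this commit; the 4 and 140 come from the unit test output and can drift across kernel/Jool versions):

```c
/* Sketch: the default must fit a 1500-byte MTU, assuming a 40-byte IPv6
 * header, an 8-byte UDP header, a 4-byte root attribute header and
 * 140-byte serialized sessions. */
_Static_assert(40 + 8 + 4 + 140 * DEFAULT_JOOLD_MAX_SESSIONS_PER_PKT <= 1500,
		"joold default overshoots the assumed 1500-byte MTU");
```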
9 changes: 8 additions & 1 deletion src/common/global.c
Expand Up @@ -1000,9 +1000,16 @@ static const struct joolnl_global_meta globals_metadata[] = {
.id = JNLAG_JOOLD_MAX_PAYLOAD,
.name = "ss-max-payload",
.type = &gt_uint32,
.doc = "Maximum amount of bytes joold should send per packet.",
.doc = "Deprecated; does nothing as of Jool 4.1.11.",
.offset = offsetof(struct jool_globals, nat64.joold.max_payload),
.xt = XT_NAT64,
+}, {
+.id = JNLAG_JOOLD_MAX_SESSIONS_PER_PACKET,
+.name = "ss-max-sessions-per-packet",
+.type = &gt_uint32,
+.doc = "Maximum number of sessions to send, per joold packet.",
+.offset = offsetof(struct jool_globals, nat64.joold.max_sessions_per_pkt),
+.xt = XT_NAT64,
},
};

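
Once registered in `globals_metadata`, the flag rides the same generic plumbing as every other global, so it should surface through the usual userspace queries. A sketch (the exact output format is illustrative):

```
$ jool global display | grep ss-max
ss-max-payload: 1452
ss-max-sessions-per-packet: 10
```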
1 change: 1 addition & 0 deletions src/mod/common/db/global.c
@@ -110,6 +110,7 @@ int globals_init(struct jool_globals *config, xlator_type type,
config->nat64.joold.flush_deadline = 1000 * DEFAULT_JOOLD_DEADLINE;
config->nat64.joold.capacity = DEFAULT_JOOLD_CAPACITY;
config->nat64.joold.max_payload = DEFAULT_JOOLD_MAX_PAYLOAD;
+config->nat64.joold.max_sessions_per_pkt = DEFAULT_JOOLD_MAX_SESSIONS_PER_PKT;
break;

default:
