Prevent UDP NAT overload from exhausting all available connections #432

Merged
merged 9 commits into from
Nov 3, 2018

@djs55 djs55 commented Nov 1, 2018

There is a configurable global connection limit, which is useful on platforms (such as macOS) that have abnormally low maximum file descriptor limits.

The UDP NAT binds a socket per internal source address, and expires it after it has been idle for 60s.

Unfortunately, a protocol that sends UDP datagrams from random source ports and doesn't expect replies (such as the one in Envoy, as used by Kubeflow) can quickly exhaust the global limit, which then breaks other things, such as port forwards.

This PR adds a new limit on the total number of active UDP NAT table entries and sockets, and expires the oldest when the new limit is hit. This should keep the network usable even when using one of these chatty UDP protocols.
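
To illustrate the failure mode described above, here is a rough Python sketch (not the project's code) of a NAT table that creates one entry per internal source address and only expires entries after 60s of idleness. The `NatTable` class, its method names, and the `192.168.65.3` address are made up for illustration; each table entry stands in for one bound socket.

```python
IDLE_TIMEOUT = 60.0  # seconds, taken from the description above


class NatTable:
    """Hypothetical model: one entry (one socket) per internal source address."""

    def __init__(self, idle_timeout=IDLE_TIMEOUT):
        self.idle_timeout = idle_timeout
        self.flows = {}  # (src_ip, src_port) -> time of last use

    def send(self, src, now):
        # Creating or refreshing the entry models binding or reusing a socket.
        self.flows[src] = now

    def expire_idle(self, now):
        # Drop every entry that has been idle for at least idle_timeout seconds.
        idle = [s for s, t in self.flows.items() if now - t >= self.idle_timeout]
        for s in idle:
            del self.flows[s]
        return len(idle)


table = NatTable()
# A client sending 100 datagrams per second, each from a fresh source port,
# for 30 seconds. The address is an arbitrary example.
for i in range(3000):
    table.send(("192.168.65.3", 10000 + i), now=i / 100.0)

table.expire_idle(now=30.0)
print(len(table.flows))  # 3000: nothing has been idle for 60s yet
```

With only an idle timeout, the table grows with the rate of new source ports, so a low file descriptor limit can be reached long before the first entry expires.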

In a future patch we will iterate over the bindings in order to
expire old ones, treating it as an LRU cache.

Signed-off-by: David Scott <dave.scott@docker.com>
The `max_active_flows` setting limits the number of listening UDP sockets.
When the limit is hit, we expire the oldest 25% of listening sockets.
The limit is initially set to 1024.

This should prevent a client exhausting the overall connection limit
by sending UDP traffic to random addresses.

Signed-off-by: David Scott <dave.scott@docker.com>
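
A minimal Python sketch of the batch expiry described in this commit, under some assumptions: the check happens when a new flow is about to be added, "oldest" means least recently used, and the batch is a quarter of the current table size. Only `max_active_flows`, the 25% figure, and the 1024 default come from the commit message; `BoundedNatTable`, `touch`, and `expire_oldest` are hypothetical names.

```python
class BoundedNatTable:
    """Hypothetical model of a NAT table with a cap on active flows."""

    def __init__(self, max_active_flows=1024, expire_fraction=0.25):
        self.max_active_flows = max_active_flows
        self.expire_fraction = expire_fraction
        self.flows = {}  # source address -> time of last use

    def touch(self, src, now):
        # Assumption: the limit is checked before adding a brand-new flow.
        if src not in self.flows and len(self.flows) >= self.max_active_flows:
            self.expire_oldest()
        self.flows[src] = now

    def expire_oldest(self):
        # Expire a batch (at least one entry) of the least recently used flows.
        n = max(1, int(len(self.flows) * self.expire_fraction))
        oldest = sorted(self.flows, key=self.flows.get)[:n]
        for src in oldest:
            del self.flows[src]
        return oldest


table = BoundedNatTable()
for port in range(1024):
    table.touch(("10.0.0.1", port), now=float(port))

# The 1025th flow hits the limit: the 256 oldest entries are expired first.
table.touch(("10.0.0.1", 1024), now=1024.0)
print(len(table.flows))  # 769
```

Expiring in batches rather than one entry at a time means the sort only runs once per 256 new flows in this sketch, which is one plausible reason for the design.
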
Previously we would only touch `last_use` on the outgoing path, which
meant that a flow that was only receiving data would time out after 60s.

Signed-off-by: David Scott <dave.scott@docker.com>
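
The effect of this fix can be sketched in Python as follows. `Flow`, `on_outgoing`, `on_incoming`, and the `touch_on_reply` flag are hypothetical names used only to contrast the old and new behaviour; `last_use` and the 60s timeout come from the commit message.

```python
class Flow:
    """Hypothetical model of a single NAT flow and its idle timer."""

    def __init__(self, now):
        self.last_use = now

    def on_outgoing(self, now):
        # The outgoing path has always refreshed last_use.
        self.last_use = now

    def on_incoming(self, now, touch_on_reply=True):
        # touch_on_reply=False models the old behaviour, where replies
        # did not refresh last_use.
        if touch_on_reply:
            self.last_use = now

    def is_idle(self, now, timeout=60.0):
        return now - self.last_use >= timeout


def receive_only(touch_on_reply):
    # One hello at t=0, then a reply every 10 seconds up to t=90.
    flow = Flow(now=0.0)
    flow.on_outgoing(now=0.0)
    for t in range(10, 100, 10):
        flow.on_incoming(now=float(t), touch_on_reply=touch_on_reply)
    return flow.is_idle(now=90.0)


print(receive_only(touch_on_reply=False))  # True: the flow would be expired
print(receive_only(touch_on_reply=True))   # False: replies keep it alive
```
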
This allows test cases (and other diagnostic code) to introspect the
current NAT table.

Signed-off-by: David Scott <dave.scott@docker.com>
It's useful for test cases to know how big the NAT table is.

Signed-off-by: David Scott <dave.scott@docker.com>
It should be OK to log these messages at info level since

- they're in large batches rather than per-flow
- seeing too many of these messages probably means the protocol is
  NAT-unfriendly.

Signed-off-by: David Scott <dave.scott@docker.com>
When combined with `resend_all_replies`, this allows us to ask
a server to resend replies while we wait for new packets to appear.

Signed-off-by: David Scott <dave.scott@docker.com>
This allows us to simulate a UDP server which sends a constant
stream of replies after an initial hello message. Previously we
could only simulate request/response patterns.

Signed-off-by: David Scott <dave.scott@docker.com>
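
For readers unfamiliar with the pattern, a self-contained Python sketch of a UDP server that sends a stream of replies after a single hello message might look like this. It is not the test helper added in this commit: `streaming_server` and `run_demo` are hypothetical names, the stream is finite so the demo terminates, and it runs over loopback with the standard `socket` module.

```python
import socket
import threading


def streaming_server(sock, count):
    """Wait for one hello datagram, then send `count` replies to its sender."""
    _, client_addr = sock.recvfrom(1024)
    for i in range(count):
        sock.sendto(b"reply %d" % i, client_addr)


def run_demo(count=5):
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    worker = threading.Thread(target=streaming_server, args=(server, count))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(5.0)  # fail instead of hanging if a reply is lost
    # The client sends exactly one datagram and then only receives.
    client.sendto(b"hello", server.getsockname())
    replies = [client.recvfrom(1024)[0] for _ in range(count)]

    worker.join()
    client.close()
    server.close()
    return replies


if __name__ == "__main__":
    for reply in run_demo():
        print(reply.decode())
```

After the hello, the client in this sketch never sends again, which is the receive-only pattern that a request/response simulation cannot exercise.
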
- test that replies bump the `last_use` time
- test that batch expiry events don't touch a recently-opened
  flow

Signed-off-by: David Scott <dave.scott@docker.com>