Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SO_REUSEPORT support for EntryPoints #9834

Merged
merged 1 commit into from
Jan 30, 2024
Merged

Add SO_REUSEPORT support for EntryPoints #9834

merged 1 commit into from
Jan 30, 2024

Conversation

aofei
Copy link
Contributor

@aofei aofei commented Apr 11, 2023

What does this PR do?

Add SO_REUSEPORT support for EntryPoints.

Motivation

Attempts to solve the problem of canary deployment against Traefik itself.

Fixes #9823

More

  • Added/updated tests
  • Added/updated documentation

Additional Notes

This PR only focuses on the implementation. For discussion, please move to #9823.

@mloiseleur
Copy link
Contributor

I tested this PR.
It works like a charm.

@rtribotte rtribotte added this to the next milestone Dec 5, 2023
@rtribotte
Copy link
Member

Hello @aofei,

I can see that the update of the static configuration references files is missing:

  • docs/content/reference/static-configuration/file.toml
  • docs/content/reference/static-configuration/file.yaml

Can you please update them?

@aofei
Copy link
Contributor Author

aofei commented Dec 9, 2023

Hi @rtribotte!

I have just updated those two files and rebased my branch as well.

@aofei
Copy link
Contributor Author

aofei commented Dec 9, 2023

Also, should I change the base branch of this PR from master to v3.0?

@rtribotte
Copy link
Member

@aofei Thanks for the changes!

Also, should I change the base branch of this PR from master to v3.0?

For now, I don't know, I'm going to raise a discussion with other maintainers on that subject and let you know.

@rtribotte
Copy link
Member

rtribotte commented Dec 19, 2023

Hello @aofei,

I came across this article mentioning limitations associated with the SO_REUSEPORT directive, especially the closing of connections by the kernel when one of the processes using the port exits:

There are a number of reasons why a listening process might exit on a busy system. Perhaps it simply crashed. But, more likely, the server is being restarted to effect a configuration change or to switch to a new certificate. Such restarts can be phased across a pool of server processes so that they don't all exit at once; that should allow incoming connections to be handled without any apparent interruption of service. But when the above-described behavior comes into the picture, users can be turned away, which tends to have an unpleasant effect on their mood. The depressive effect on the operator of the site, who may have just lost the opportunity to learn that the would-be user is in the market for a new pair of socks, can be even worse.

I'm hoping to get your insights, but so far, these limitations could be a significant concern for Traefik users, at least I think this should be documented.

@rtribotte
Copy link
Member

rtribotte commented Dec 20, 2023

Also, should I change the base branch of this PR from master to v3.0?

@aofei So, as a follow-up, yes please, if you can rebase this PR on the v3.0 branch it would be great, we would like to have it for v3.

@rtribotte rtribotte modified the milestones: next, 3.0 Dec 20, 2023
@aofei
Copy link
Contributor Author

aofei commented Jan 2, 2024

@rtribotte Yes, you're right, thanks for bringing it up! It's a known bug in the SO_REUSEPORT implementation of the Linux kernel.

In case anyone else is interested, the problem mentioned by @rtribotte is that when a listening process exits, all incoming TCP connections assigned to it that are either in the middle of a 3WH (3-way handshake) or in the accept queue get closed, even if other processes could handle them. The root cause of this problem is that once the kernel assigns an incoming connection to a specific process (triggered by the arrival of the initial SYN packet of a 3WH), it won't reassign it, leading to connection failures (RST packets) when that process exits.

A common workaround is to pass the listener file descriptors between different processes without closing them. For example, Nginx does this, thanks to its concept of worker processes. Unfortunately, this doesn't work with Traefik, as Traefik is a single process.

While this is a problem, it doesn't make SO_REUSEPORT a bad feature. It's a kernel bug, until it's resolved or a workaround is found, I agree that it should be documented.

@aofei aofei changed the base branch from master to v3.0 January 2, 2024 19:52
@aofei
Copy link
Contributor Author

aofei commented Jan 3, 2024

@rtribotte I have updated the docs/content/routing/entrypoints.md documentation file and changed the base branch of this PR from master to v3.0. The failed checks don't seem relevant to this PR. Please let me know if I've got something wrong.

@nmengin nmengin added the priority/P2 need to be fixed in the future label Jan 8, 2024
@nmengin
Copy link
Contributor

nmengin commented Jan 16, 2024

Hey @aofei,

It seems that you have conflicts, could you fix them?
Do you need help on it?

@aofei
Copy link
Contributor Author

aofei commented Jan 17, 2024

Hi @nmengin! Thanks for letting me know about the conflicts. They occurred after my last commit, so I didn't notice. They have now been fixed.

@nmengin
Copy link
Contributor

nmengin commented Jan 17, 2024

Hey @aofei,

Thank you for your quick answer and action.

Copy link
Member

@kevinpollet kevinpollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution!

docs/content/routing/entrypoints.md Outdated Show resolved Hide resolved
docs/content/routing/entrypoints.md Outdated Show resolved Hide resolved
pkg/config/static/entrypoint_getlistenconfig_other.go Outdated Show resolved Hide resolved
pkg/config/static/entrypoint_getlistenconfig_other.go Outdated Show resolved Hide resolved
pkg/config/static/entrypoint_getlistenconfig_unix.go Outdated Show resolved Hide resolved
pkg/config/static/entrypoint_getlistenconfig_unix.go Outdated Show resolved Hide resolved
pkg/udp/conn.go Outdated Show resolved Hide resolved
pkg/udp/conn.go Outdated Show resolved Hide resolved
Copy link
Contributor

@nmengin nmengin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏

Thank you @aofei 👍

@aofei
Copy link
Contributor Author

aofei commented Jan 30, 2024

@kevinpollet @rtribotte I just made the changes you requested. Please let me know if I've got something wrong.

Copy link
Member

@kevinpollet kevinpollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

Copy link
Member

@rtribotte rtribotte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

@juliens juliens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@nmengin nmengin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/server kind/enhancement a new or improved feature. priority/P2 need to be fixed in the future size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add SO_REUSEPORT support for EntryPoints
8 participants