Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reload provider file configuration on SIGHUP #9993

Merged
merged 6 commits into from
Feb 1, 2024

Conversation

sokoide
Copy link
Contributor

@sokoide sokoide commented Jun 27, 2023

What does this PR do?

The patch is to reload a file config on SIGHUP.

Fixes #1188
Related to #3083

Motivation

If you 'touch' the file config, Traefik reloads it automatically. However, if it's created from a configmap in K8s, the file is read-only and can't touch it.
If cert files specified by the file config is updated by a cert agent (e.g. renewed to extend expiration date), we'd like the cert agent to force reload it.

More

  • Added/updated tests
  • Added/updated documentation

Additional Notes

The change is,

  1. Add SIGHUP handler
  2. On SIGHUP, it calls a new method watcher.ReloadFileConfig()
  3. watcher.ReloadFileConfig() gets a pointer of file provider from the providerAggregator and calls fileProficer.BuildConfiguration to get a configuration
  4. It sends a dynamic message with the configuration and ProviderName: "file" into the ConfigurationWatcher.allProvidersConfigs channel
  5. The existing file provider will receive the message and reload the config which reloads TLS certificates if it's updated

@gbolo

This comment was marked as off-topic.

@rtribotte
Copy link
Member

Hello @sokoide,

Thanks for this contribution!

I'll speak only on my behalf and it would be better to wait for other maintainers' opinions before making any changes, but I have a suggestion.

In my opinion, the way you managed to access the FileProvider to be able to produce a rebuild of its configuration (reload) is not fully satisfying as it needs a specific getter method for that provider on the ProviderAggregator.
Moreover, at a glance, before this PR the exported method BuildConfiguration of the file provider has no usage outside it's package, and this is maybe a mistake, as there are no exported BuildConfiguration methods in other providers.
Also, it's a bit off that ProviderAggregator pushes itself the file provider configuration in the allProvidersConfigs chan.

This PR is focused on the file provider but it seems to me that it could be a good thing to have a global reload method exposed by the ProviderAggregator, which could be actionable by the ConfigurationWatcher, and which would then be in its turn call the reload method on each "reloadable" providers (only the file provider for now).
To do that, we could introduce an interface like the throttled interface.
This would make the provider responsible for the actual reload operation.

@nmengin
Copy link
Contributor

nmengin commented Nov 30, 2023

Hello @sokoide,

Would you like some help to work on the PR?

@sokoide
Copy link
Contributor Author

sokoide commented Dec 1, 2023

@nmengin , thank you for the message! @sandy2008 and I will start looking at it early next week and get back.

@sokoide
Copy link
Contributor Author

sokoide commented Dec 5, 2023

@nmengin, @sandy2008 and I implemented the Reloadable interface as suggested. Would that be as expected? Any feedback would be appreciated!

@sandy2008
Copy link

@nmengin Hi! Would you mind take a look at this PR?

@sokoide
Copy link
Contributor Author

sokoide commented Dec 13, 2023

Hello @nmengin, the unit test failure in the actions is also reproducible in the latet master (0ee377b) w/o the PR. Is there anything we can do to make it pass (e.g. should we make a PR for v3.0.0-beta5 branch)?

2023/12/13 12:17:34 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Receive-Buffer-Size for details.
{"level":"error","error":"read tcp 127.0.0.1:39375->127.0.0.1:48252: i/o timeout","time":"2023-12-13T12:17:38Z","message":"Error while Peeking first byte"}
FAIL

@sandy2008
Copy link

Hey @sokoide,

Thanks for the rebase.

There is a linter issue: https://github.com/traefik/traefik/actions/runs/7418513873/job/20256267325?pr=9993. I've re-launched the unit tests too (there may be a flakiness)

Hi @nmengin! Thank you for your feedback.
We fixed the linting issues :)

nmengin
nmengin previously approved these changes Jan 29, 2024
Copy link
Contributor

@nmengin nmengin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏

Thank you @sokoide @sandy2008.
It works as expected and it will help lots of people dealing with the file provider 👍

sokoide and others added 2 commits January 31, 2024 08:51
Co-Authored-By: sandy2008 <Yuxuan.Chen@morganstanley.com>
@nmengin nmengin dismissed their stale review January 31, 2024 09:19

Last minute suggestion below

@nmengin
Copy link
Contributor

nmengin commented Jan 31, 2024

Hello @sokoide, @sandy2008,

After more thought, the current implementation, using an interface to address the use case globally, doesn't seem adapted.

Indeed, because of our providers' implementation, the ReloadProviders method forces you to add a specific statement for the File Provider, while it's the only use case that makes sense for you PR.

With the other maintainers, we want your PR embedded in the incoming v3.0 RC.
To speed up the review, we will commit a simplified implementation later today.

If the suggestion looks good to you, we'll merge it.

@sokoide
Copy link
Contributor Author

sokoide commented Jan 31, 2024

@nmengin , sounds great :). Thank you very much!

@rtribotte rtribotte changed the title feat: reload file config on SIGHUP Reload provider file configuration on SIGHUP Jan 31, 2024
@nmengin
Copy link
Contributor

nmengin commented Jan 31, 2024

Hello @sokoide, @sandy2008

As mentioned above, we have added a commit to move the logic into the File provider.
Could you tell me if it looks good to you?

Copy link
Member

@mmatur mmatur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nmengin
Copy link
Contributor

nmengin commented Jan 31, 2024

I add the label bot/no-merge to ensure we wait for @sokoide and @sandy2008 feedback before merging.

@sokoide
Copy link
Contributor Author

sokoide commented Feb 1, 2024

@nmengin , thank you. I'm testing this scenario. Please give me one more day.

If cert files specified by the file config is updated by a cert agent (e.g. renewed to extend expiration date), we'd like the cert agent to force reload it.

@sokoide
Copy link
Contributor Author

sokoide commented Feb 1, 2024

@nmengin, @rtribotte, I tested and confirmed both clientauth (mTLS) and server TLS (DNS validation for client) were fine. LGTM :)
Thank you very much again for making the change!

Copy link
Contributor

@nmengin nmengin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

Copy link
Member

@rtribotte rtribotte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

@traefiker traefiker merged commit d7ec0ce into traefik:v3.0 Feb 1, 2024
22 checks passed
@nmengin nmengin removed their assignment Feb 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/file kind/enhancement a new or improved feature. priority/P2 need to be fixed in the future size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a hook to have Traefik live-reload its configuration
8 participants