Manually reload tls certificates #5495

Open
Nepoxx opened this issue Sep 25, 2019 · 64 comments
Labels
area/provider/file area/tls contributor/wanted Participation from an external contributor is highly requested kind/enhancement a new or improved feature. priority/P3 maybe
Comments

@Nepoxx

Nepoxx commented Sep 25, 2019

Do you want to request a feature or report a bug?

Feature

What did you expect to see?

My tls certificates are generated with Let's Encrypt remotely and are used by Traefik through a glusterfs mount. For that reason, Traefik is unable to properly monitor file changes and thus never knows when certificates are renewed (so it will serve an expired certificate). Having a way to tell Traefik to reload new certificates (or file configs in general) would allow the user to circumvent cases when Traefik is unable to use inotify.

Traefik Version

2.0

Similar to #1623

@ldez
Contributor

ldez commented Sep 25, 2019

Hello,

could you give more information about your Traefik configuration?

@RomRider

Same issue for me, I have this configuration in place:

tls:
  certificates:
    - certFile: /etc/letsencrypt/live/DOMAIN/fullchain.pem
      keyFile: /etc/letsencrypt/live/DOMAIN/privkey.pem

  stores:
    default:
      defaultCertificate:
        certFile: /etc/letsencrypt/live/DOMAIN/fullchain.pem
        keyFile: /etc/letsencrypt/live/DOMAIN/privkey.pem

The path is an NFS volume, and the certificates are renewed outside of Traefik (because they are not used only by Traefik).

@ldez
Contributor

ldez commented Sep 26, 2019

could you give more information about your Traefik configuration (static and dynamic configuration)?

@RomRider

Sure:

global:
  checkNewVersion: false
  sendAnonymousUsage: false
serversTransport:
  insecureSkipVerify: true

providers:
  docker:
    endpoint: tcp://docker-socket-proxy2:2375
    exposedByDefault: false
    watch: true

  file:
    directory: /configs/
    watch: true

log:
  level: DEBUG
accessLog: {}

api:
  dashboard: true

entryPoints:
  https:
    address: :443
    forwardedHeaders:
      trustedIPs:
        - "127.0.0.1/32"
        - "x.x.x.x"
    proxyProtocol:
      trustedIPs:
        - "127.0.0.1/32"
        - "x.x.x.x"

  publicHttps:
    address: :8443
    forwardedHeaders:
      trustedIPs:
        - "127.0.0.1/32"
        - "x.x.x.x"
    proxyProtocol:
      trustedIPs:
        - "127.0.0.1/32"
        - "x.x.x.x"

TLS:

tls:
  certificates:
    - certFile: /etc/letsencrypt/live/DOMAIN/fullchain.pem
      keyFile: /etc/letsencrypt/live/DOMAIN/privkey.pem

  stores:
    default:
      defaultCertificate:
        certFile: /etc/letsencrypt/live/DOMAIN/fullchain.pem
        keyFile: /etc/letsencrypt/live/DOMAIN/privkey.pem

  options:
    default:
      minVersion: VersionTLS12

Dynamic example:

    labels:
      - traefik.enable=true
      - traefik.docker.network=${TRAEFIK_BACKEND_NETWORK}
      - traefik.http.routers.grafana.rule=Host(`grafana.${PRIVATE_DOMAIN}`)
      - traefik.http.routers.grafana.entryPoints=https
      - traefik.http.routers.grafana.tls=true
      - traefik.http.routers.grafana.tls.options=default
      - traefik.http.services.grafana.loadBalancer.server.port=3000

@ldez
Contributor

ldez commented Sep 26, 2019

How do you mount the file?

@RomRider

NFS on the host, then I map the folder in the container:

    volumes:
      - ./traefik.yml:/traefik.yml:ro
      - ./configs:/configs:ro
      - ${DOCKER_DATA_NFS_FOLDER}/certbot/etc-letsencrypt:/etc/letsencrypt:ro

@Nepoxx
Author

Nepoxx commented Sep 26, 2019

Stack.yml

version: "3.7"
services:
  traefik:
    image: traefik:v2.0
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
      - "8082:8082"
    volumes:
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
      - type: bind
        source: /mnt/gfs/docker/traefik/traefik.yml
        target: /etc/traefik/traefik.yml
      - type: bind
        source: /mnt/gfs/docker/traefik/certs
        target: /certs
      - type: bind
        source: /mnt/gfs/docker/traefik/config
        target: /config
      - type: bind
        source: /mnt/gfs/docker/traefik/acme
        target: /acme
    deploy:
      placement:
        constraints:
          - node.role == manager
    networks:
      traefik-net: {}

networks:
  traefik-net:
    name: traefik-net
    driver: overlay

Where /mnt/gfs is a gluster mount:

localhost:/gfs /mnt/gfs glusterfs defaults,_netdev,backupvolfile-server=venus 0 0

traefik.yml

entryPoints:
  http:
    address: ":80"

  https:
    address: ":443"

  metrics:
    address: ":8082"

providers:
  docker:
    swarmMode: true
    network: traefik-net
    exposedByDefault: false
  file:
    filename: /config/config.yml
    # inotify does not work with network mounts
    watch: false

# API and dashboard configuration
api:
  insecure: true
  debug: true

metrics:
  prometheus:
    entryPoint: metrics

config.yml (mentioned above):

tls:
  certificates:
    - certFile: /certs/_.example.com/_.example.com.cert
      keyFile: /certs/_.example.com/_.example.com.key
http:
  middlewares:
    redirect-to-https:
      redirectScheme:
        scheme: https
  routers:
    http-catchall:
      priority: 0
      entryPoints:
        - http
      middlewares:
        - redirect-to-https@file
      rule: 'hostregexp(`{host:.+}`)'
      service: noop
  services:
    noop:
      loadBalancer:
        servers:
          - url: 'http://127.0.0.1'

Replacing _.example.com.cert and _.example.com.key has no effect on Traefik

@dduportal dduportal added area/tls kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed. area/provider/file and removed status/0-needs-triage labels Sep 26, 2019
@ldez ldez added this to issues in v2 via automation Sep 27, 2019
@RomRider

An interesting way to handle that would be to provide an API endpoint to reload certs.

@dduportal dduportal added kind/proposal a proposal that needs to be discussed. and removed kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed. labels Sep 27, 2019
@gtmadev

gtmadev commented Sep 30, 2019

I just ran into this today as well when using a glusterfs volume for file configs. In my main config, I have it watch a directory in the glusterfs volume. That directory is where I store dynamic configurations. Some nodes in docker swarm seem to get the updated file config, and others don't, until I manually restart that instance.

It seems that Traefik is unable to know when files are changed or added (when using the watch option) and the directory is a glusterfs volume.

I wonder if this is what I am running into...

https://stackoverflow.com/questions/48877567/traefik-docker-hot-reload-configuration#comment97995443_48877567

@Nepoxx
Author

Nepoxx commented Oct 3, 2019

@gtmadev I tried bind-mounting the folder, and that does not help.

This is a pretty major blocker for me, and I'd like to help move this forward. My initial idea is to add support for a kill signal that would trigger a config reload, probably SIGHUP (prometheus does this).

Thoughts?
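
For anyone unfamiliar with the pattern, here is a minimal sketch of signal-triggered reloading, written in Python purely for illustration. It is not Traefik code: `ReloadableConfig`, `install_sighup_reload`, and the `load` callable are hypothetical names, and it assumes a POSIX system where SIGHUP exists.

```python
import os
import signal
import time


class ReloadableConfig:
    """Holds a configuration value and reloads it on demand (hypothetical helper)."""

    def __init__(self, load):
        self._load = load            # callable that returns the fresh configuration
        self.value = load()
        self.reload_count = 0

    def reload(self, signum=None, frame=None):
        # The (signum, frame) parameters match what signal.signal() passes to a handler.
        self.value = self._load()
        self.reload_count += 1


def install_sighup_reload(config):
    """Register config.reload as the SIGHUP handler (POSIX only, main thread only)."""
    signal.signal(signal.SIGHUP, config.reload)
```

A long-running process would call `install_sighup_reload(cfg)` once at startup; `kill -HUP <pid>` from the renewal job then triggers `cfg.reload` without a restart.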

@martine-stratdat

This is also something I'd like to see. While Traefik may feel this is covered through its Let's Encrypt solution, that does not cover all real-world cases:

  • Easy loading of purchased certificates
  • Cases where Traefik is just one component using TLS, including transitions to Traefik from other setups
  • Instances where rate limiting may become an issue

There are two solutions that would work here, as I see it:

  • An API endpoint
  • A watch on certificates or certificate entries

Given that this is a barrier to entry for Traefik, I would like to see a solution given priority.

@DrEsteban

We also desperately need this feature, as the current strategies for triggering certificate reload leave a lot to be desired.

@hezten

hezten commented Mar 6, 2020

I badly need a similar solution too, hope this is getting looked into :-)

@monotek

monotek commented Mar 20, 2020

Same here with traefik 1.x helm chart.
Using existing secret for bought tls certs in kubernetes and telling traefik via ingress rule what the secretname is.
Updating the secret has no effect at all.
The old cert is still used...

Edit:

It seems our problem was that we had several secrets with the same name, "my-tls-wildcard-secret", in different namespaces. We had to update all of them before Traefik used the new certificate.

@ybizeul

ybizeul commented Apr 19, 2020

Same issue here.

Traefik version 2.2.0 built on 2020-03-25T17:32:57Z with docker

Before changing certificates :

drwxr-xr-x    2 root     root          4096 Apr 19 15:56 .
drwxr-xr-x    3 root     root          4096 Apr 19 13:24 ..
-rw-r--r--    1 root     root          1062 Apr 19 15:57 nabox.crt
-rw-r--r--    1 root     root          1679 Apr 19 15:57 nabox.key
-rw-r--r--    1 root     root           127 Apr 19 15:53 traefik.yaml

When changing certificates :

time="2020-04-19T14:08:16Z" level=debug msg="Configuration received from provider file: {\"http\":{},\"tcp\":{},\"udp\":{},\"tls\":{\"stores\":{\"default\":{}}}}" providerName=file
time="2020-04-19T14:08:16Z" level=info msg="Skipping same configuration" providerName=file
time="2020-04-19T14:08:16Z" level=debug msg="Configuration received from provider file: {\"http\":{},\"tcp\":{},\"udp\":{},\"tls\":{\"stores\":{\"default\":{}}}}" providerName=file
time="2020-04-19T14:08:16Z" level=debug msg="Configuration received from provider file: {\"http\":{},\"tcp\":{},\"udp\":{},\"tls\":{\"stores\":{\"default\":{}}}}" providerName=file
time="2020-04-19T14:08:16Z" level=info msg="Skipping same configuration" providerName=file
time="2020-04-19T14:08:16Z" level=info msg="Skipping same configuration" providerName=file

And here is the directory :

drwxr-xr-x    2 root     root          4096 Apr 19 15:56 .
drwxr-xr-x    3 root     root          4096 Apr 19 13:24 ..
-rw-r--r--    1 root     root          1062 Apr 19 16:08 nabox.crt
-rw-r--r--    1 root     root          1675 Apr 19 16:08 nabox.key
-rw-r--r--    1 root     root           127 Apr 19 15:53 traefik.yaml

Traefik config :

      - "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.file.directory=/tmp/ssl/"  # file provider here
      - "--providers.file.watch=true"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.http.redirections.entrypoint.to=webssl"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.webssl.address=:443"
      - "--entrypoints.graphite.address=:2003"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/opt/conf/ssl:/tmp/ssl"  # Traefik config and SSL files

@MartinMuzatko

We generate certificates on request and move them into place. This replaces the inode, which (I believe) breaks Traefik's automatic discovery of certificates.
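
The inode replacement is easy to reproduce. Below is a small Python sketch (file names are made up, POSIX semantics assumed) showing that a "write new file, then move it into place" renewal leaves the same path pointing at a different inode. It only demonstrates the filesystem behavior; whether that is what trips up Traefik's watcher is the commenter's guess, not something this code proves.

```python
import os
import tempfile


def replace_and_compare_inodes(directory):
    """Write a file, atomically replace it, and return (old_inode, new_inode)."""
    target = os.path.join(directory, "cert.pem")       # hypothetical file name
    staging = os.path.join(directory, "cert.pem.new")  # hypothetical staging name

    with open(target, "w") as f:
        f.write("old certificate\n")
    old_inode = os.stat(target).st_ino

    # Typical "generate elsewhere, then move into place" renewal step.
    with open(staging, "w") as f:
        f.write("new certificate\n")
    os.replace(staging, target)                        # atomic rename on POSIX
    new_inode = os.stat(target).st_ino

    return old_inode, new_inode


with tempfile.TemporaryDirectory() as tmp:
    old, new = replace_and_compare_inodes(tmp)
    print("inode changed:", old != new)
```

A watch registered on the original file follows the old inode, so it would not report later writes to the replacement; watching the parent directory, or polling, avoids that.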

@majorgearhead

Any word on this possible bug? We programmatically generate our certificates from a known authority, and Traefik cannot see when they are replaced. We have to manually bounce Traefik to get it to pick up the new certificates.

@Opti98

Opti98 commented Aug 28, 2020

We generate our certificates from another authority too, and when we replace the certificates, Traefik doesn't see the change. I have discovered that if I touch the configuration file, it does reload the new certificates, i.e. "touch certificates.toml".

@Sarke
Contributor

Sarke commented Sep 2, 2020

I have discovered if I touch the configuration file it does reload the new certificates, ie "touch certificates.toml"

Really? That did not work for me last time I tried. I think Traefik caches the parsed config files and if it's still the same it doesn't re-apply them.
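
To make that hypothesis concrete, here is a sketch in Python of a provider that deduplicates on the parsed configuration. This is an assumption about the behavior, not Traefik's actual implementation; `Provider`, `publish`, and `config_fingerprint` are hypothetical names. The point is that the parsed config holds only file paths, so neither `touch` nor new certificate contents change the fingerprint.

```python
import hashlib
import json


def config_fingerprint(config):
    """Hash the parsed configuration (a plain dict), not the files it points to."""
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class Provider:
    """Hypothetical provider that skips configurations it has already applied."""

    def __init__(self):
        self._last = None
        self.applied = 0

    def publish(self, config):
        fingerprint = config_fingerprint(config)
        if fingerprint == self._last:
            return False             # nothing new: skip, as in "Skipping same configuration"
        self._last = fingerprint
        self.applied += 1
        return True


config = {"tls": {"certificates": [
    {"certFile": "/certs/example.crt", "keyFile": "/certs/example.key"},
]}}
provider = Provider()
provider.publish(config)   # applied
provider.publish(config)   # skipped: same paths, even if the files changed on disk
```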

@xzycn

xzycn commented Apr 2, 2022

Hi, is there any update on this?
I'm using cert-manager to manage certificates, and the file provider for Traefik (2.6.1). The certificate has been updated, but the previous one is still in use. I don't want to restart Traefik :)

@dginhoux

Hi,

I have the same problem here, with everything stored on an NFS share provided by NetApp.
It would be great to have an API endpoint that forces a refresh of all config.

Have a good day.

@bkraul

bkraul commented Nov 14, 2022

There is also the issue of docker swarm setup. Let's say you have 3 manager nodes, you can touch the cert config yml on one of them, and it will work great, but the other two will know nothing about the updated cert...hence, issues. Same with acme. Only thing resolving it is a restart of the service stack.

@Amunak

Amunak commented Mar 2, 2023

@Ajedi32 is spot on, also about the cert configured in default certs only not being enough.

What is absolutely baffling to me is that Traefik devs are so ignorant towards implementing a simple signal for reloading the configuration as pretty much every other daemon does. Sure, touching a dynamic config file works, but it's a horrible workaround - next time they'll implement configuration caching so only changes propagate and it'll stop working or something.

I'm on Traefik v3 now and #8243 doesn't seem to solve it for me (I have the certs mounted in a directory and not separately), a reload or restart is still needed. But my testing was limited as it's hard to test properly without accidentally reloading the configuration.

how do you do sighup via docker? wouldn't an API be better?

@MartinMuzatko you can do something like docker kill -s SIGUSR1 traefik. But it's not mutually exclusive either; I think it'd be nice if we got both an API and a signal trap. The API would almost certainly be more work though, especially since it'd be the first write API which would require some extra considerations to start with.

@Sarke
Contributor

Sarke commented Mar 2, 2023

There is also the issue of docker swarm setup. Let's say you have 3 manager nodes, you can touch the cert config yml on one of them, and it will work great, but the other two will know nothing about the updated cert...hence, issues. Same with acme. Only thing resolving it is a restart of the service stack.

@bkraul I ended up doing a touch on the config files, then rsync to the other nodes. Then they all pick up the changes.

@bkraul

bkraul commented Mar 2, 2023

Sounds legit...though a bit hacky. At that point I might just set up an Ansible playbook to do just what you described.

@hyst3ric41

hyst3ric41 commented Jan 19, 2024

Any updates on this? I'm struggling with the same problem, but the touch workaround didn't work for me. I touch /etc/traefik/traefik.yml on my manager node's Traefik instance, but none of the config seems to change. The only thing that works is reloading the service, which is not an option, because detecting changes automatically in the file provider directory is supposedly what Traefik is designed for!

@nmengin
Contributor

nmengin commented Feb 8, 2024

Hello,

We have recently merged a PR that allows reloading the configuration when Traefik receives a SIGHUP signal.
Such a feature allows Traefik users to reload manually (executing the command kill -1 <TraefikProcessId>) when the TLS certificates are updated.
This feature will be available in the next version v3.0.

Is it enough for your use cases?
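
For renewal jobs that are scripts rather than shell one-liners, the `kill -1 <pid>` step might look like the Python sketch below. `reload_after_renewal` is a hypothetical helper; finding the process ID is left to the caller, and whether a given Traefik build acts on SIGHUP is as described in the comment above, not something this code checks.

```python
import os
import signal


def reload_after_renewal(pid, cert_paths):
    """Send SIGHUP to `pid` only if every renewed file exists and is non-empty.

    Returns True if the signal was sent, False if a file check failed.
    """
    for path in cert_paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            return False             # do not trigger a reload onto a broken renewal
    os.kill(pid, signal.SIGHUP)      # same effect as `kill -1 <pid>` (POSIX only)
    return True
```

Checking the files first avoids asking the proxy to reload while a renewal is half-written.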

@nmengin nmengin removed their assignment Feb 8, 2024
@bkraul

bkraul commented Feb 8, 2024

Na man...endpoint call would have been nice.

@dginhoux

dginhoux commented Feb 8, 2024

Hi,

SIGHUP can be a great option for non-containerized instances.
But in a containerized environment, that's not the same...

If cert files are generated by other internal tools and pushed to a space (shared or not) that is also mounted in the Traefik container, how do you send a SIGHUP and keep everything safe and isolated between the Traefik process and the cert generators?

I also think an endpoint (with auth) in the Traefik API would be more secure.
Or something simpler: like service discovery, Traefik could watch each cert file itself, look for changes by comparing them (md5? serial, date...), and reload if they changed.

Have a good day.
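
The "compare the files" idea can be sketched in a few lines of Python. This is an external illustration, not a Traefik feature; the function names are hypothetical, and SHA-256 is used here in place of the md5 mentioned above.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(paths):
    """Map each path to its current content digest."""
    return {path: file_digest(path) for path in paths}


def changed_files(previous, current):
    """Return the sorted paths whose digest differs between two snapshots."""
    return sorted(p for p in current if previous.get(p) != current[p])
```

Comparing content rather than timestamps means a `touch` alone reports no change, while a renewed certificate is always detected, even on mounts where change notifications never arrive.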

@AndrewSav
Contributor

Not to mention that Windows does not have SIGHUP

@rtribotte rtribotte self-assigned this Feb 12, 2024
@ybizeul

ybizeul commented Mar 15, 2024

Just trying to sum this up :

  • SIGHUP not available until 3.0
  • Dynamic config requires an actual change to the config file; changing the certificates on disk won't trigger a refresh, and neither will touching the config files:
time="2024-03-15T17:29:33Z" level=debug msg="Skipping unchanged configuration." providerName=file

I'm sure that's useful for some use case, but definitely leaves a lot of people aside that were hoping for a simple monitoring of the cert files

@rtribotte
Member

rtribotte commented Mar 18, 2024

Hello,

A configurable HTTP endpoint to trigger configuration reloads of the file provider is a good enhancement.
We think it should be an internal service, like api@internal.
It would not be possible to expose it via an option (e.g.: api.insecure) but only with a router, and its enablement would be controlled by a static configuration option.

We would love some help from the community on this. If any community member would like to contribute, let us know, and we will work with you to make sure you have all the information needed before starting and that we are aligned, so we can move quickly through the review and merge process.

@rtribotte rtribotte added kind/enhancement a new or improved feature. contributor/wanted Participation from an external contributor is highly requested and removed kind/proposal a proposal that needs to be discussed. labels Mar 18, 2024
@rtribotte rtribotte removed their assignment Mar 18, 2024
@rtribotte rtribotte added the priority/P3 maybe label Mar 18, 2024
@dginhoux

dginhoux commented Mar 18, 2024

Yes, an endpoint can be great.

More simple... a flag !
Watch for a file "/certs/.need_to_update", if present, start update process and remove the flag.
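
As a rough sketch of the flag-file idea (Python, outside Traefik; `consume_flag` and the callback are hypothetical names):

```python
import os


def consume_flag(flag_path, on_update):
    """If the flag file exists, run on_update() and then remove the flag.

    Returns True if an update was triggered, False if no flag was present.
    """
    if not os.path.exists(flag_path):
        return False
    on_update()
    try:
        os.remove(flag_path)
    except FileNotFoundError:
        pass                         # someone else removed the flag first
    return True
```

Note that the flag is consumed by whoever sees it first: if several instances share the same directory, the ones that check after the flag is removed never learn about the update.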

@D0wn3r

D0wn3r commented Apr 4, 2024

Hi !
I'm looking for the same feature, one that properly reloads my certificates.
My dynamic configuration:

[http]
  [http.middlewares]
    [http.middlewares.vpn-ipwhitelist.ipWhiteList]
      sourceRange = ["10.0.0.0/8", "172.80.0.0/16", "172.90.0.0/16", "172.18.0.0/16"]
[tls.options]
  [tls.options.default]
    minVersion = "VersionTLS12"
    cipherSuites = [
      "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
      "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
    ]
[[tls.certificates]]
  certFile = "/etc/le/fullchain.pem"
  keyFile  = "/etc/le/privkey.pem"


[[tls.certificates]]
  certFile = "/etc/le/fullchain_pca12.pem"
  keyFile  = "/etc/le/privkey_pca12.pem"

When I change it or just touch the file, I get this log:

level=debug msg="Configuration received: {\"http\":{\"middlewares\":{\"vpn-ipwhitelist\":{\"ipWhiteList\":{\"sourceRange\":[\"10.0.0.0/8\",\"172.80.0.0/16\",\"172.90.0.0/16\",\"172.18.0.0/16\"]}}}},\"tcp\":{},\"udp\":{},\"tls\":{\"options\":{\"default\":{\"minVersion\":\"VersionTLS12\",\"cipherSuites\":[\"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256\",\"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256\",\"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384\",\"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384\",\"TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305\",\"TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305\"],\"clientAuth\":{},\"alpnProtocols\":[\"h2\",\"http/1.1\",\"acme-tls/1\"]}}}}" providerName=file
time="2024-04-04T17:26:40+02:00" level=debug msg="Skipping unchanged configuration." providerName=file

Is it normal that the TLS certificates are not mentioned in "Configuration received"?

@D0wn3r

D0wn3r commented Apr 4, 2024

Yes, an endpoint can be great.

More simple... a flag ! Watch for a file "/certs/.need_to_update", if present, start update process and remove the flag.

Not great with multiple Traefik instances using the same folder of certificates. Only one will be able to update.

@cdwiegand

I looked around the code, and I'm not super-golang-capable, but I think with the way Providers are separate from DynamicConfig that it'd be hard to implement a way to ask the FileProvider to reload itself from the api package, as the api doesn't have access to the AggregateProvider (or its internal FileProvider instance), and FileProvider doesn't expose a way to reload its configuration externally - just fsnotify and at load time.

I did consider adding a time.Ticker routine and reevaluate every watched file every x seconds, but didn't really like the resulting code.
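
For reference, the ticker idea looks roughly like this when written as an external Python loop rather than inside Traefik. All names are hypothetical, and the `sleep` and `max_checks` parameters exist only so the loop can be bounded and tested. Because each check calls `stat` explicitly, it does not depend on inotify, though on network mounts the result can lag behind by the client's attribute cache.

```python
import os
import time


def mtimes(paths):
    """Return {path: modification time in ns}, with None for a missing file."""
    result = {}
    for path in paths:
        try:
            result[path] = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            result[path] = None
    return result


def poll(paths, on_change, interval=30.0, max_checks=None, sleep=time.sleep):
    """Every `interval` seconds, call on_change(changed_paths) if any mtime moved.

    max_checks bounds the number of checks; None means run until interrupted.
    """
    seen = mtimes(paths)
    checks = 0
    while max_checks is None or checks < max_checks:
        sleep(interval)
        current = mtimes(paths)
        changed = [p for p in paths if current[p] != seen[p]]
        if changed:
            on_change(changed)
            seen = current
        checks += 1
```

The `on_change` callback is where a reload trigger would go, for example sending a signal to the proxy process.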

I also considered seeing if the RestProvider could "call in" with its own channel to AggregateProvider to "ask FileProvider, if valid, to reload itself" as AggregateProvider does have a reference to it (if using that provider), but I also felt that was very hacky and definitely didn't fit the way traefik is architected.

I ended up building a docker image with supervisord, with XML-RPC enabled, so I can call the endpoint there to SIGHUP the traefik process. It isn't ideal, but it's a valid workaround until someone can implement an API endpoint directly in traefik.
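
A similar sidecar can be sketched with nothing but the Python standard library. This is not the supervisord setup described above and not a Traefik API: `make_reload_server`, the `/reload` path, and the bearer-token check are all assumptions made for illustration.

```python
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_reload_server(address, token, on_reload):
    """Build an HTTP server whose POST /reload calls on_reload() if the token matches."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            supplied = self.headers.get("Authorization", "")
            expected = f"Bearer {token}"
            if self.path != "/reload":
                self.send_response(404)
            elif not hmac.compare_digest(supplied.encode(), expected.encode()):
                self.send_response(401)
            else:
                on_reload()          # e.g. send SIGHUP to the proxy process
                self.send_response(204)
            self.end_headers()

        def log_message(self, fmt, *args):
            pass                     # keep the sketch quiet

    return HTTPServer(address, Handler)
```

Running it next to the proxy would look like `make_reload_server(("127.0.0.1", 9000), token, callback).serve_forever()`, where the callback signals the Traefik process; the port and the token handling are placeholders.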
