ACME HTTP-01 challenge fails by timeout #2763

deargonaut · 2018-01-25T10:44:26Z

Do you want to request a feature or report a bug?

Bug

What did you do?

I am trying to fetch automatic certificates from Let's Encrypt with HTTP-01.

What did you expect to see?

Certificates being fetched as they were before the TLS-SNI problems.

What did you see instead?

No new certificates.

Possible problems / fixes

It looks like it has something to do with adding the HTTP route to each domain (domain.com/.well-known/acme-challenge/[token]). When I visit the same route over HTTPS I receive a 404 immediately, but over HTTP the request times out.

https://github.com/containous/traefik/blob/5140bbe99a79b45f98c27fbb8e9b6833194af4cb/acme/challenge_http_provider.go#L22

On Slack, someone (maverick) tried the same configuration as mine but with a Consul backend. Maybe it has something to do with that?

When checking the debug logs, it seems the token for that domain is cleaned up ("CleanUp") before the timeout is hit. Maybe it has something to do with that?

Output of `traefik version`: (What version of Traefik are you using?)

Traefik version v1.5.0 built on 2018-01-23_04:42:32PM

What is your environment & configuration (arguments, toml, provider, platform, ...)?

defaultEntryPoints = ["http", "https"]
debug = true
logLevel = "DEBUG"

[entryPoints]
  [entryPoints.http]
  address = ":80"
#    [entryPoints.http.redirect]
#    entryPoint = "https"
  compress = true
  [entryPoints.https]
    address = ":443"
    compress = true
    [entryPoints.https.tls]

[acme]
  email = "email@address.com"
  caServer = "https://acme-staging.api.letsencrypt.org/directory"
  # Tried it on production as well
  storage = "/etc/traefik/acme/acme.json"
  entryPoint = "https"
  OnHostRule = true
  acmeLogging = true
  [acme.httpChallenge]
    entryPoint = "http"

# Enable Docker configuration backend
[docker]
  endpoint = "unix:///var/run/docker.sock"
  domain = "sandbox.domain.com"
  watch = true
  swarmmode = true
  exposedbydefault = true

[api]
  entryPoint = "traefik"
  dashboard = true
  address = ":8080"

  [api.statistics]
    recentErrors = 10

docker-compose.yml

version: '3'
services:
  nginx:
    image: nginx:1.13
    volumes:
      - "../workspace:/srv"
      - "./nginx/default.conf:/etc/nginx/conf.d/default.conf"
    deploy:
      labels:
        - "traefik.backend=rest-api"
        - "traefik.port=80"
        - "traefik.frontend.rule=Host:rest-api.sandbox.domain.com"
        - "traefik.docker.network=frontend"
        - "traefik.backend.loadbalancer.method=drr"
    networks:
      - frontend
      - backend

  php:
    image: php-fpm:7.1
    volumes:
      - "../workspace:/srv"
    networks:
      - backend

networks:
  backend:
    external:
      name: rest-api
  frontend:
    external:
      name: frontend

If applicable, please paste the log output in debug mode (`--debug` switch)

logs

time="2018-01-25T10:05:56Z" level=debug msg="LoadCertificateForDomains [rest-api.sandbox.domain.com]..." 
time="2018-01-25T10:05:56Z" level=debug msg="Looking for provided certificate to validate [rest-api.sandbox.domain.com]..." 
time="2018-01-25T10:05:56Z" level=debug msg="No provided certificate found for domains [rest-api.sandbox.domain.com], get ACME certificate." 
time="2018-01-25T10:05:56Z" level=debug msg="Loading ACME certificates [rest-api.sandbox.domain.com]..." 
legolog: 2018/01/25 10:05:56 [INFO][rest-api.sandbox.domain.com] acme: Obtaining bundled SAN certificate
legolog: 2018/01/25 10:05:56 [INFO][rest-api.sandbox.domain.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/w3M__oDqozE[...]T_SPCiF7p5CYLFI
legolog: 2018/01/25 10:05:56 [INFO][rest-api.sandbox.domain.com] acme: Could not find solver for: dns-01
legolog: 2018/01/25 10:05:56 [INFO][rest-api.sandbox.domain.com] acme: Trying to solve HTTP-01
time="2018-01-25T10:05:56Z" level=debug msg="Challenge Present rest-api.sandbox.domain.com" 
time="2018-01-25T10:06:07Z" level=debug msg="Challenge CleanUp rest-api.sandbox.domain.com" 
time="2018-01-25T10:06:07Z" level=error msg="map[rest-api.sandbox.domain.com:acme: Error 400 - urn:acme:error:connection - Fetching http://rest-api.sandbox.domain.com/.well-known/acme-challenge/GECQ9JRWb4pA[...]Bc3rmeveJd611YowU: Timeout
Error Detail:
	Validation for rest-api.sandbox.domain.com:80
	Resolved to:
		***.***.***.***
		***:*:*:*::*
	Used: ***:*:*:*::*

]" 
time="2018-01-25T10:06:07Z" level=error msg="Error getting ACME certificates [rest-api.sandbox.domain.com] : cannot obtain certificates map[rest-api.sandbox.domain.com:acme: Error 400 - urn:acme:error:connection - Fetching http://rest-api.sandbox.domain.com/.well-known/acme-challenge/GECQ9JRWb4pA0OlC[...]eJd611YowU: Timeout
Error Detail:
	Validation for rest-api.sandbox.domain.com:80
	Resolved to:
		***.***.***.***
		***:*:*:*::*
	Used: ***:*:*:*::*

]" 
time="2018-01-25T10:06:07Z" level=debug msg="LoadCertificateForDomains []..." 
legolog: 2018/01/25 10:06:07 [INFO][exceptions.sandbox.domain.com] acme: Obtaining bundled SAN certificate
time="2018-01-25T10:06:07Z" level=debug msg="LoadCertificateForDomains [exceptions.sandbox.domain.com]..." 
time="2018-01-25T10:06:07Z" level=debug msg="Looking for provided certificate to validate [exceptions.sandbox.domain.com]..." 
time="2018-01-25T10:06:07Z" level=debug msg="No provided certificate found for domains [exceptions.sandbox.domain.com], get ACME certificate." 
time="2018-01-25T10:06:07Z" level=debug msg="Loading ACME certificates [exceptions.sandbox.domain.com]..." 
legolog: 2018/01/25 10:06:07 [INFO][exceptions.sandbox.domain.com] AuthURL: https://acme-staging.api.letsencrypt.org/acme/authz/oUlowLzxA9hKGib[...]MpTqEWA4ksu345xc
legolog: 2018/01/25 10:06:07 [INFO][exceptions.sandbox.domain.com] acme: Could not find solver for: dns-01
legolog: 2018/01/25 10:06:07 [INFO][exceptions.sandbox.domain.com] acme: Trying to solve HTTP-01
time="2018-01-25T10:06:07Z" level=debug msg="Challenge Present exceptions.sandbox.domain.com" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label traefik.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label payment_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label my_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label webfrontend_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label rest-api_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label order_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label catalog_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label price_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label notifications_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Filtering container without port and no traefik.port label exceptions_php.1 : strconv.Atoi: parsing "": invalid syntax" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.whitelistSourceRange labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.entryPoints labels" 
time="2018-01-25T10:06:09Z" level=debug msg="Could not load traefik.frontend.auth.basic labels"

nmengin · 2018-01-25T13:52:51Z

Hello @deargonaut.
Thanks for your interest in the project.

This kind of timeout is generated by LEGO (the Let's Encrypt Go library used by Træfik).
It happens when LE cannot reach Træfik to perform the HTTP challenge.

Even though the error appears after the CleanUp log line, it is generated earlier, during the challenge step, as you can see in the Træfik code.

Can you check whether:

- The subdomain rest-api.sandbox.domain.com is mapped to the host where Træfik is deployed
- Port 80 of the host where Træfik is deployed is reachable by LE

Thanks in advance.
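
Both checks above can be roughly scripted. The following is a minimal sketch, not part of Træfik or LEGO: the helper names (`group_by_family`, `can_connect`, `check_host`) are hypothetical, and it only tests plain TCP reachability on port 80 for every A and AAAA address the name resolves to. Run it from a machine outside your own network, since a check from the Træfik host itself says little about what LE can reach.

```python
import socket
from collections import defaultdict


def group_by_family(addrinfos):
    """Group socket.getaddrinfo() results into {'IPv4': [...], 'IPv6': [...]}."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    grouped = defaultdict(list)
    for family, _type, _proto, _canon, sockaddr in addrinfos:
        label = names.get(family)
        # sockaddr[0] is the address for both IPv4 and IPv6 tuples.
        if label and sockaddr[0] not in grouped[label]:
            grouped[label].append(sockaddr[0])
    return dict(grouped)


def can_connect(address, port=80, timeout=5.0):
    """Return True if a TCP connection to (address, port) succeeds in time."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


def check_host(hostname, port=80):
    """Map (family, address) -> reachable for every address of hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    report = {}
    for family, addresses in group_by_family(infos).items():
        for address in addresses:
            report[(family, address)] = can_connect(address, port)
    return report
```

Calling `check_host("rest-api.sandbox.domain.com")` returns one entry per resolved address. An address that resolves but reports `False` is a candidate for the timeout, whatever its family.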

deargonaut · 2018-01-25T14:06:28Z

Hi @nmengin.
Thanks for your prompt reply.

For this setup everything is deployed on one node.
Traefik is deployed on sandbox.domain.com and is reachable on port 80, as is rest-api.sandbox.domain.com.

It only times out (also in the browser) when I request the specific ACME hash, like: http://rest-api.sandbox.domain.com/.well-known/acme-challenge/GECQ9JRWb4pABc3rmeveJd611YowU.
When I type another hash, it immediately triggers a 404 from my application.

Does this give you enough information?

nmengin · 2018-01-26T09:50:12Z

Hello @deargonaut .

Would it be possible for you to continue the discussion with the team on our Slack?
@juliens created a thread.

I think this more interactive channel should make it easier to help you.

Thanks in advance

deargonaut · 2018-01-31T10:08:05Z

While debugging with Juliens we found a fix for this error.

It seemed that while trying to reach the .well-known/acme-challenge URL it always wanted to go via IPv6. When we removed the IPv6-interface and cleared it from DNS, the challenge was validated and I received my certificates.

The issue will remain open so that Julien can work out how to reproduce and maybe fix this.
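
The "Error Detail" block in the log already says which address the validator used, so the address family can be read from it directly. Below is a small sketch that does this; the function name `used_address_family` is hypothetical, the log layout is assumed to match the lines posted in this thread, and the sample addresses used when trying it out should be your own unmasked ones.

```python
import ipaddress
import re


def used_address_family(error_detail):
    """Return (address, 'IPv4' or 'IPv6') from the 'Used:' line, else None."""
    match = re.search(r"^\s*Used:\s*(\S+)\s*$", error_detail, re.MULTILINE)
    if match is None:
        return None
    try:
        address = ipaddress.ip_address(match.group(1))
    except ValueError:
        # The address was masked (e.g. with '*') or is otherwise not parseable.
        return None
    return str(address), f"IPv{address.version}"
```

If this reports IPv6 for a failing validation, the next thing to verify is whether port 80 is actually reachable over that IPv6 address from outside.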

schasse · 2018-02-07T16:36:37Z

Hi, I ran into the same issue and I am interested in the fix which @deargonaut described. I have two questions, though.

> it always wanted to go via IPv6

What is "it"? The Let's Encrypt client trying to reach the .well-known/acme-challenge URL?

> we removed the IPv6-interface

Where did you remove the IPv6 interface from? Did you remove it from the host?

deargonaut · 2018-02-07T20:39:59Z

Hi @schasse,

"It" does indeed refer to the ACME mechanism. The client used IPv6 while attempting the HTTP challenge.

I removed the IPv6 interface from the host, yes. I am running instances on OpenStack and removed the net-public-ipv6 interface. That released the IPv6 address on eth0 (in my case).

Does this make sense?

schasse · 2018-02-08T13:00:11Z

Makes sense. Thanks for clarifying!

glitchroy · 2018-02-28T12:01:19Z

EDIT: There was a problem on my end, port 80 was blocked by another firewall. It's opened now and the certificate was requested without a problem.

Hey, I have the same problem. I'm not using Docker Swarm or cluster mode, so there is only one instance of Traefik.
The output seems to be the same:

traefik    | time="2018-02-28T11:48:07Z" level=error msg="map[test.domain.com:acme: Error 400 - urn:acme:error:connection - Fetching http://test.domain.com/.well-known/acme-challenge/gA7GL[...]lhDA: Timeout
traefik    | Error Detail:
traefik    |    Validation for test.domain.com:80
traefik    |    Resolved to:
traefik    |            XXX.XXX.XXX.XX
traefik    |    Used: XXX.XXX.XXX.XX
traefik    |
traefik    | ]"

However, no IPv6 address is being reported, so I'm guessing that's not the problem.
I don't know if I should open a separate issue with my whole setup, because it's the same error after all. The .well-known path is not reachable in the browser. I usually have an Apache instance, among other things, on port 80 in a different docker-compose file, but it makes no difference if I take that down.

lawrencegripper · 2018-08-22T20:45:03Z

I saw this issue when using Traefik on Azure ACI. Moving from the standard scratch-based Docker image to the “1.7-alpine” tag resolved it for me. I can’t say why, but it may help others.

richsanram · 2019-04-17T05:34:19Z

I had this issue with the tag 2.0-alpine (I know it is still an alpha version), and the way I solved it was by replacing /etc/resolv.conf with a custom resolv.conf file containing 'nameserver 1.1.1.1'

After this, traefik works like a charm.

ghost · 2019-05-05T18:51:42Z

I have the same issue, and none of the above solved it. I don't have IPv6, ports are forwarded, still got the 400 timeout from Traefik, and 404 if I want to get the URL myself.

mephinet · 2019-05-06T17:21:07Z

I've been debugging an issue for a few days now: In a setup like https://docs.traefik.io/user-guide/examples/#onhostrule-option-and-provided-certificates-with-http-challenge where we have a default wildcard certificate and use Let's Encrypt for all other domains, Traefik constantly used the wildcard certificate, even for domains that the wildcard certificate did not match. The logs were repeatedly showing

level=error msg="Error getting challenge for token retrying in ...s"

I was able to solve the issue by temporarily disabling the HTTPS redirect (the [entryPoints.http.redirect] section).
Maybe someone who still has this issue can try to check whether this is in fact the root cause for the timeouts...

ghost · 2019-05-06T18:43:18Z

It turned out that in my case it was the router. The Linksys Velop (cursing the day I bought it) simply ignored my port forward on 80 so that it could show its admin page on the internal network and a custom 404 externally. I had to put my server in its DMZ (default forward target for all traffic) to register. On a _different_ page of its UI I could add the port forward, so I think it is okay now. Worst case, I need to manually put the server in the DMZ again once the certs expire.

pimjansen · 2019-09-18T21:51:51Z

I am also having this issue, but as far as I can see there is no IPv6 record in the DNS at least.

Is there any update or workaround for this?

jonkristian · 2020-02-16T21:48:25Z

@mephinet I too have this issue, but on 2.1. I feel it's quite similar because we also have a default wildcard cert for main domain, and use LE for other domains and I am seeing the same errors. I've created a post in discourse about it but for now I am still at a loss.

https://community.containo.us/t/cannot-retrieve-the-acme-challenge-for-token/4391/5

Did you ever figure out what was going on?

mephinet · 2020-02-16T22:14:16Z

> Did you ever figure out what was going on?

No, unfortunately I never figured it out and finally switched to https://kubernetes.github.io/ingress-nginx/ ...

jonkristian · 2020-02-16T23:10:53Z

That's too bad, thanks for replying :)

trajano · 2020-06-05T12:19:53Z

This just happened to me today. I have had a Traefik 1.7 setup for a while; I just did a reboot to test something out and now it's timing out.

trajano · 2020-06-05T12:21:21Z

> I had this issue with the tag 2.0-alpine (I know it is still an alpha version), and the way I solved it was by replacing /etc/resolv.conf with a custom resolv.conf file containing 'nameserver 1.1.1.1'

The /etc/resolv.conf in traefik or on the server itself?

wbsouza · 2020-10-10T00:37:12Z

I see the same error with v2.3.1 on AWS ECS running on Fargate.

After creating the cluster via Terraform, the HTTP proxy works fine, but when we call the app over HTTPS the browser reports the error code SSL_ERROR_RX_RECORD_TOO_LONG (Traefik is responding with HTTP rather than HTTPS).

You can reproduce the error with this code:
git clone https://github.com/wbsouza/traefik-ecs

What is causing it?
HTTPS does not work because Traefik times out when Let's Encrypt tries to validate the certificate.

Hypothetical domain: mycompany.com (replace it with a real domain in the variables.tf file)
app_hostname = "myrealdomain.com"

First I tried to use the hostname from AWS, and it seems that Let's Encrypt blocks it:

time="2020-10-10T00:10:29Z" level=error msg="Unable to obtain ACME certificate for domains \"api-126145927.sa-east-1.elb.amazonaws.com\": unable to generate a certificate for the domains [api-126145927.sa-east-1.elb.amazonaws.com]: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/new-order :: urn:ietf:params:acme:error:rejectedIdentifier :: Error creating new order :: Cannot issue for \"api-126145927.sa-east-1.elb.amazonaws.com\": The ACME server refuses to issue a certificate for this domain name, because it is forbidden by policy, url: " providerName=le.acme routerName=whoami-secure@ecs rule="Host('api-126145927.sa-east-1.elb.amazonaws.com')"

My next attempt was to add a CNAME entry to my DNS to map my FQDN to the AWS address.
Initially I got an error because the DNS had not been updated yet:

time="2020-10-09T23:42:59Z" level=error msg="Unable to obtain ACME certificate for domains \"app.mycompany.com\": unable to generate a certificate for the domains [app.mycompany.com]: error: one or more domains had a problem:\n[app.mycompany.com] acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up A for app.mycompany.com - check that a DNS record exists for this domain, url: \n" rule="Host('app.mycompany.com')" providerName=le.acme routerName=whoami-secure@ecs

But later, once the DNS had been updated, I got another error:

time="2020-10-09T23:57:59Z" level=error msg="Unable to obtain ACME certificate for domains \"api.mycompany.com\": unable to generate a certificate for the domains [api.mycompany.com]: error: one or more domains had a problem:\n[api.mycompany.com] acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: Invalid response from http://api.mycompany.com/.well-known/acme-challenge/STzh6jw8b6p7ZiyAiZAIe9IVYzXwInsnOYCs1hw0U_I [18.229.176.157]: 404, url: \n" routerName=whoami-secure@ecs rule="Host('app.mycompany.com')"

pascalgross · 2020-12-18T07:40:04Z

I see the same error on multiple stacks. Obtaining LE certificates worked on others, with almost identical traefik config.

version: '3.7'

volumes:
    prometheus_data: {}
    grafana_data: {}

networks:
  monitor-network:
    driver: overlay
    name: inbound
  traefik-public:
    external: true

services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus/:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    networks:
      - monitor-network
      - traefik-public
    deploy:
      placement:
        constraints:
          - node.role==manager
      labels:
          - traefik.enable=true
          - traefik.docker.network=traefik-public
          - traefik.constraint-label=traefik-public
          - traefik.http.routers.prometheus-http.rule=Host(`prometheus.mycompany.com`)
          - traefik.http.routers.prometheus-http.entrypoints=web
          - traefik.http.routers.prometheus-http.middlewares=redirecttls
#          - traefik.http.routers.prometheus-http.middlewares=auth
          - traefik.http.routers.prometheus-https.rule=Host(`prometheus.mycompany.com`)
          - traefik.http.routers.prometheus-https.entrypoints=websecure
          - traefik.http.routers.prometheus-https.tls=true
          - traefik.http.routers.prometheus-https.tls.certresolver=letsencrypt
          - traefik.http.services.prometheus.loadbalancer.server.port=9090
          - traefik.http.routers.prometheus-https.middlewares=auth
      restart_policy:
        condition: on-failure

My traefik yaml file looks as follows:

version: '3'

services:
  reverse_proxy:
    image: traefik:v2.3.4
    command:
      # Docker swarm configuration
      - "--providers.docker.endpoint=unix:///var/run/docker.sock"
      - "--providers.docker.swarmMode=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.docker.network=traefik-public"
      # Configure entrypoint
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      # SSL configuration
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
#      - "--certificatesresolvers.letsencrypt.acme.tlschallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.email=ssl@mycompany.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
#      - "--certificatesresolvers.letsencrypt.acme.caServer=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--api=true"
      - "--api.dashboard=true"
      - "--accesslog=true"
      - "--accesslog.filepath=/logs/access.log"
      - "--metrics.prometheus=true"
      - "--entryPoints.metrics.address=:8082"
      - "--metrics.prometheus.entryPoint=metrics"
      - "--metrics.prometheus.buckets=0.1,0.3,1.2,5.0"
      - "--pilot.token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    ports:
      - 80:80
      - 443:443
    volumes:
      # To persist certificates
      - traefik-certificates:/letsencrypt
      - traefik-logs:/logs
      # So that Traefik can listen to the Docker events
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - traefik-public
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.traefik.service=api@internal"
        - "traefik.http.routers.traefik.rule=Host(`traefik.mycompany.com`)"
        - "traefik.http.routers.traefik.entrypoints=web"
        - "traefik.http.routers.traefik.middlewares=redirecttls"
        - "traefik.http.services.traefik.loadbalancer.server.port=80"
        - "traefik.http.middlewares.redirecttls.redirectscheme.scheme=https"
        - "traefik.http.routers.traefiktls.service=api@internal"
        - "traefik.http.routers.traefiktls.rule=Host(`traefik.mycompany.com`)"
        - "traefik.http.routers.traefiktls.entrypoints=websecure"
        - "traefik.http.routers.traefiktls.tls.certresolver=letsencrypt"
        - "traefik.http.routers.traefiktls.middlewares=auth"
        - "traefik.http.services.traefik.loadbalancer.server.port=443"
        - "traefik.http.middlewares.auth.basicauth.users=abc:$$apr1$$xyz"
volumes:
  traefik-certificates:
  traefik-logs:
networks:
  traefik-public:
    external: true

Traefik logs the following error:

time="2020-12-18T06:46:02Z" level=error msg="Unable to obtain ACME certificate for domains "prometheus.mycompany.com": unable to generate a certificate for the domains [prometheus.mycompany.com]: error: one or more domains had a problem:\n[prometheus.mycompany.com] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Fetching http://prometheus.mycompany.com/.well-known/acme-challenge/W1CkrdBQ552lsSo9H9rfQhb8rxuVDlorhGx-VLbC3jY: Timeout after connect (your server may be slow or overloaded), url: \n" routerName=prometheus-https@docker rule="Host(prometheus.mycompany.com)" providerName=letsencrypt.acme

The domain prometheus.mycompany.com resolves via a CNAME to an A and an AAAA record.

dig prometheus.mycompany.com A prometheus.mycompany.com AAAA @8.8.8.8 +short
srv03.mycompany.com.
123.45.67.89
srv03.mycompany.com.
2aaa:f88:123:1234::2

pascalgross · 2020-12-19T11:14:59Z

Removing the AAAA IPv6 IP from the srv03.mycompany.com resolves the problem. How can that be?

trajano · 2020-12-19T17:17:03Z

@pascalgross can you confirm that accessing your server from the outside using the IPV6 address works correctly? Maybe that's why it failed.

pascalgross · 2020-12-19T17:54:41Z

@trajano I can ping the server using IPv6 and I can SSH using IPv6, but accessing a web server (e.g. the Traefik instance) using IPv6 fails. So I guess there is either a) a configuration error or b) a bug in Traefik.

trajano · 2020-12-19T21:10:49Z

But can you access the HTTP port using IPv6 (not just HTTPS)? Something like curl -v http://ipv6address, I guess.
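
One detail behind the "somehow": an IPv6 literal in a URL has to be wrapped in square brackets. The sketch below builds such a URL and, as an assumption-laden illustration rather than a definitive test, requests a path from one specific address while sending the real hostname in the Host header so that the proxy can route it. The function names are hypothetical; only Python's standard library is used.

```python
import http.client
import ipaddress


def bracketed(address):
    """Wrap IPv6 literals in [] as URLs require; leave IPv4 untouched."""
    ip = ipaddress.ip_address(address)
    return f"[{ip}]" if ip.version == 6 else str(ip)


def http_url(address, path="/"):
    """Build a plain-HTTP URL that targets one specific IP address."""
    return f"http://{bracketed(address)}{path}"


def get_status(address, hostname, path="/", timeout=10.0):
    """GET path from one address on port 80, sending hostname as Host."""
    conn = http.client.HTTPConnection(address, 80, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": hostname})
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, `get_status(<your AAAA address>, "prometheus.mycompany.com")` run from an external IPv6-capable machine either returns a status code or raises a timeout or connection error, which separates "port 80 is unreachable over IPv6" from "Traefik answers but the challenge path fails".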

CarlQLange · 2021-02-06T13:42:27Z

I don't know if this helps anybody, but in Azure AKS, I needed to set "Outbound source network address translation" to "Outbound and inbound use the same IP. SNAT port exhaustion may occur." in the load balancer that pointed to Traefik. Otherwise I had this timeout issue.

SmallhillCZ · 2021-12-24T15:47:39Z

Just spent a day on this one, so to summarize for anyone with a similar problem:

- If you have an AAAA DNS record (i.e. IPv6) on your domain, Let's Encrypt will always use it for ACME verification instead of IPv4
- Docker Swarm doesn't bind published ports on IPv6 interfaces

=> Let's Encrypt will not be able to access the verification token at domain.com/.well-known/acme-challenge/[token]

Solutions:

- Delete the AAAA DNS record on your domain and postpone IPv6 support until better times (if Docker Swarm ever implements it)
- Set your Traefik to run on a manager node and publish the port using the long syntax with mode: host; this will also bind the port on IPv6

rtribotte · 2022-03-22T14:43:57Z

Hello,

For v2, the challenge mechanism was rewritten in v2.4.0 by PR #7458.
As the original author seems to have a fix, and since we think v2 is not affected by the bug, we are closing this issue.

Feel free to reopen if you can reproduce the issue with the latest v2 version.

ldez added status/0-needs-triage area/acme labels Jan 25, 2018

nmengin added the contributor/waiting-for-feedback label Jan 25, 2018

ldez removed the contributor/waiting-for-feedback label Jan 25, 2018

ldez assigned juliens Jan 30, 2018

emilevauge added kind/bug/possible, priority/P2 and removed status/0-needs-triage labels Feb 1, 2018

juliens removed their assignment Feb 1, 2018

02JanDal mentioned this issue Jul 15, 2019

Traefik not getting SSL certificates for some domains, Let's Encrypt giving 400 error #5103

Closed

2 tasks

ddtmachado assigned rtribotte and unassigned mmatur Mar 22, 2022

rtribotte closed this as completed Mar 22, 2022

rtribotte added resolution/declined and removed priority/P2 labels Mar 22, 2022

traefik locked and limited conversation to collaborators Apr 22, 2022

traefiker added the status/5-frozen-due-to-age label Apr 22, 2022
