Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

version 0.9.1 regression causes a systematic crash (rolled back to 0.8.3 to fix) #56

Closed
benoit-pereira-da-silva opened this issue Jan 12, 2020 · 15 comments

Comments

@benoit-pereira-da-silva
Copy link

benoit-pereira-da-silva commented Jan 12, 2020

What version of the package are you using?

0.9.1

What are you trying to do?

We use successfully Certmagic in production since 10 months.

		certmagic.Default.Agreed = true
		certmagic.Default.Email = s.Email
		certmagic.Default.DisableHTTPChallenge = true

		cfg := certmagic.NewDefault()
		tlsConfig := cfg.TLSConfig()
		err := cfg.ManageSync(s.Hosts)
		if err != nil {
			return err
		}

What steps did you take?

We just have updated to certmagic 0.9.0 (crashes happened regularly)
With certmagic 0.9.1 panic is systematic
So we have rolled back to 0.8.3 that is stable.

2020/01/12 08:44:21 [INFO][cache:0xc000189720] Started certificate maintenance routine
2020/01/12 08:44:22 [INFO][0.tplst.com] Renew certificate
2020/01/12 08:44:22 [INFO][0.tplst.com] Renew: Waiting on rate limiter...
2020/01/12 08:44:22 [INFO][0.tplst.com] Renew: Done waiting
2020/01/12 08:44:22 [INFO] [0.tplst.com] acme: Trying renewal with 710 hours remaining
2020/01/12 08:44:22 [INFO] [0.tplst.com] acme: Obtaining bundled SAN certificate
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x7667a5]

goroutine 1 [running]:
time.(*Timer).Stop(...)
	/usr/local/go/src/time/sleep.go:74
github.com/cenkalti/backoff/v3.(*defaultTimer).Stop(0xc00059c120)
	/home/tplst/go/pkg/mod/github.com/cenkalti/backoff/v3@v3.2.1/timer.go:32 +0x25
github.com/cenkalti/backoff/v3.RetryNotifyWithTimer.func1(0xc1e9a0, 0xc00059c120)
	/home/tplst/go/pkg/mod/github.com/cenkalti/backoff/v3@v3.2.1/retry.go:45 +0x31
github.com/cenkalti/backoff/v3.RetryNotifyWithTimer(0xc000498420, 0x7f1daeca15e8, 0xc0004f7cc0, 0x0, 0xc1e9a0, 0xc00059c120, 0x0, 0x0)
	/home/tplst/go/pkg/mod/github.com/cenkalti/backoff/v3@v3.2.1/retry.go:53 +0x34d
github.com/cenkalti/backoff/v3.RetryNotify(...)
	/home/tplst/go/pkg/mod/github.com/cenkalti/backoff/v3@v3.2.1/retry.go:31
github.com/cenkalti/backoff/v3.Retry(...)
	/home/tplst/go/pkg/mod/github.com/cenkalti/backoff/v3@v3.2.1/retry.go:25
github.com/go-acme/lego/v3/acme/api.(*Core).retrievablePost(0xc0003e6000, 0xc0006f6500, 0x33, 0xc000730e40, 0x36, 0x40, 0xa09320, 0xc0003b63c0, 0xc000144610, 0xc0003b6460, ...)
	/home/tplst/go/pkg/mod/github.com/go-acme/lego/v3@v3.3.0/acme/api/api.go:107 +0x210
github.com/go-acme/lego/v3/acme/api.(*Core).post(0xc0003e6000, 0xc0006f6500, 0x33, 0xaebde0, 0xc0003b6460, 0xa09320, 0xc0003b63c0, 0x1, 0x200000003, 0xc000000180)
	/home/tplst/go/pkg/mod/github.com/go-acme/lego/v3@v3.3.0/acme/api/api.go:70 +0xf5
github.com/go-acme/lego/v3/acme/api.(*OrderService).New(0xc0003e60c0, 0xc0006db9c0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/tplst/go/pkg/mod/github.com/go-acme/lego/v3@v3.3.0/acme/api/order.go:22 +0x240
github.com/go-acme/lego/v3/certificate.(*Certifier).Obtain(0xc0006e5200, 0xc0006db9b0, 0x1, 0x1, 0x1, 0xabacc0, 0xc000738e40, 0x0, 0xc000086958, 0x0, ...)
	/home/tplst/go/pkg/mod/github.com/go-acme/lego/v3@v3.3.0/certificate/certificates.go:102 +0x1ec
github.com/go-acme/lego/v3/certificate.(*Certifier).Renew(0xc0006e5200, 0xc00072e250, 0xb, 0xc000086180, 0x53, 0xc0000861e0, 0x53, 0xc00074a000, 0xe3, 0x2e3, ...)
	/home/tplst/go/pkg/mod/github.com/go-acme/lego/v3@v3.3.0/certificate/certificates.go:384 +0x427
github.com/mholt/certmagic.(*acmeClient).tryRenew(0xc0006e5230, 0xc00072e250, 0xb, 0xc000086180, 0x53, 0xc0000861e0, 0x53, 0xc00074a000, 0xe3, 0x2e3, ...)
	/home/tplst/go/pkg/mod/github.com/mholt/certmagic@v0.9.1/client.go:419 +0x80
github.com/mholt/certmagic.(*acmeClient).Renew(0xc0006e5230, 0xc214a0, 0xc000026158, 0x7fff3ff07516, 0xb, 0x0, 0x0)
	/home/tplst/go/pkg/mod/github.com/mholt/certmagic@v0.9.1/client.go:394 +0x624
github.com/mholt/certmagic.(*Config).RenewCert(0xc0003e60f0, 0xc214a0, 0xc000026158, 0x7fff3ff07516, 0xb, 0xc0000b2400, 0x20f, 0x40f)
	/home/tplst/go/pkg/mod/github.com/mholt/certmagic@v0.9.1/config.go:478 +0x183
github.com/mholt/certmagic.(*Config).manageOne(0xc0003e60f0, 0xc214a0, 0xc000026158, 0x7fff3ff07516, 0xb, 0x2, 0x8)
	/home/tplst/go/pkg/mod/github.com/mholt/certmagic@v0.9.1/config.go:426 +0x4f1
github.com/mholt/certmagic.(*Config).manageAll(0xc0003e60f0, 0x0, 0x0, 0xc0000ab320, 0x12, 0x12, 0x7f1daeced000, 0x0, 0xc000459180)
	/home/tplst/go/pkg/mod/github.com/mholt/certmagic@v0.9.1/config.go:394 +0x1d1
github.com/mholt/certmagic.(*Config).ManageSync(...)
	/home/tplst/go/pkg/mod/github.com/mholt/certmagic@v0.9.1/config.go:320
gitlab.tplst.com/theplaylist/tplst/pkg/com.(*HttpServer).Serve(0xc000499c48, 0xc139c0, 0xc000093a00, 0x0, 0xc0000ab320)
	/home/tplst/src/pkg/com/http_server.go:30 +0xd5
gitlab.tplst.com/theplaylist/tplst/pkg/cmds.(*Tplst).start(0xc000093a00, 0xc000093a00, 0x7fff3ff0750f)
	/home/tplst/src/pkg/cmds/tplst.go:158 +0xbe7
gitlab.tplst.com/theplaylist/tplst/pkg/cmds.NewTplst(...)
	/home/tplst/src/pkg/cmds/tplst.go:60
main.main()
	/home/tplst/src/tplst.go:85 +0x45e

Is "github.com/cenkalti/backoff" really required?

@mholt
Copy link
Member

mholt commented Jan 12, 2020

Woah, that looks like a bug in lego. I do not recognize this code. Will look at it more on Monday...

@mholt
Copy link
Member

mholt commented Jan 12, 2020

Pinging @ldez as the stack trace points to something in lego. He might have a better idea.

@benoit-pereira-da-silva
Copy link
Author

Matt, thanks for replying so fast. You can't recognize this code because its ours! I just wanted to underline the context. In fact We call certmagic.NewDefault() then cfg.ManageSync(s.Hosts) s.Hosts contains 10 domain names.

@mholt
Copy link
Member

mholt commented Jan 12, 2020 via email

@ldez
Copy link

ldez commented Jan 12, 2020

Hello,

The panic comes from cenkalti/backoff.

lego and certmagic use v3.0.0 of cenkalti/backoff

You are using v3.2.1 of cenkalti/backoff

An issue has been already reported in cenkalti/backoff: cenkalti/backoff#87

More information:

So the panic is not related to lego or certmagic, I recommend to use <v3.2.0 of cenkalti/backoff.

@benoit-pereira-da-silva
Copy link
Author

benoit-pereira-da-silva commented Jan 12, 2020

Hi @ldez,

Thanks for the explanation. We have rolled back to certmagic 0.8.3 successfully without side effects. This issue can cause serious problems to people using caddy, certmagic or lego, ... I'm sure that @mholt will fix the dependencies soon.

I recommend reading @rsc's "Our Software Dependency Problem" posted one year ago, but each days more relevant.

@ldez
Copy link

ldez commented Jan 12, 2020

You don't need to rollback certmagic, you just need to downgrade cenkalti/backoff to v3.0.0

Note that caddy use also the v3.0.0
https://github.com/caddyserver/caddy/blob/4b68de84181938381f604064336b0342e389c551/go.sum#L36

@mholt or me (I'm the main maintainer of lego) don't have to fix dependencies because we are using the good version: v3.0.0.

The problem comes only if:

  • you are not using go module
  • or you have set a explicit version (v3.2.1) to cenkalti/backoff
  • or you are using another lib that use v3.2.1 of cenkalti/backoff

I recommend reading Minimal Version Selection

But to avoid this issue for someone that update the cenkalti/backoff dependency in the future, I made a PR on lego. go-acme/lego#1043

@benoit-pereira-da-silva
Copy link
Author

@ldez sure we could have downgraded backoff to v3.0.0.

But we prefer to use consistent and well tested releases when possible. We have been comparing changes between v0.8.3 vs v0.91 and decided to roll back our production servers.

Our CI did not detect this issue, because testing SSL certificate management is not that easy to test. Certmagic is important for us and we prefer to wait for @mholt's fix.

NB: good news for caddy's users!

@ldez
Copy link

ldez commented Jan 12, 2020

There is nothing to fix in Certmagic.

https://github.com/mholt/certmagic/blob/7311b4680c76370b489f847c901e29eac18c9eb8/go.sum#L45

The problem comes only if:

  • you are not using go module
  • or you have set a explicit version (v3.2.1) to cenkalti/backoff
  • or you are using another lib that use v3.2.1 of cenkalti/backoff

I recommend reading Minimal Version Selection

@benoit-pereira-da-silva
Copy link
Author

@ldez we are using go modules and did not have set an explicit version.
I m investigating to understand what happened.

@benoit-pereira-da-silva
Copy link
Author

@ldez thanks for that clarification. After controlling the go.sum i can confirm that neither lego nor certmagic are responsible of that panic. I'm closing the issue.

@mholt
Copy link
Member

mholt commented Jan 12, 2020

Thank you both for investigating while I was asleep 😅 -- ldez especially, that was very helpful! That's what I get for reading stack traces at 2 in the morning. 😴

To clarify one thing though:

Neither CertMagic nor Caddy use the cenkalti/backoff library. You'll notice it does not appear in our go.mod file. It appears in go.sum however because one of our dependencies uses it, so it needs to appear in the checksum database for the tooling to help make sure the code hasn't been tampered with before compiling.

Hope that is helpful! Thanks again to both of you for your cooperation!

@benoit-pereira-da-silva
Copy link
Author

@mholt i'm sorry to have stressed you so late/early in the eve/mor/ning. Some of our servers have crashed due to an involuntary update of cenkalti/backoff. I should have investigate a little bit more before opening an issue, but i sincerely thought it could affect seriously your ecosystem.
We rely only on 3 dependencies one is certmagic. The stack trace made it the ideal suspect!

@ldez
Copy link

ldez commented Jan 12, 2020

To be precise:

The go.mod file reference only direct dependencies (and indirect dependencies if a direct dependency don't use go module)

The go.sum file reference direct and indirect dependencies (transitive dependencies).
This file contains the reference to all the libraries versions used by all the dependencies (this is why the go.sum can reference several version of the same dependency).
The go.sum is not used as a lock for dependencies (i.e. it's not used by a project to find the version of an direct or indirect dependency) but the go.sum allows to known the minimal version (highest version) of an indirect dependency that will be used by a client of a library.

CertMagic and Caddy don't use directly the cenkalti/backoff library, but they use it because they use lego. It's not an optional dependency and the highest version in the go.sum is the version used during the compilation.

@mholt
Copy link
Member

mholt commented Jan 12, 2020

@benoit-pereira-da-silva Thanks for checking in anyway, it was my choice to check my notifications in the middle of the night in this case.

@ldez I believe that's correct, thank you for explaining that in detail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants