Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

100% CPU after upgrade to consul 1.5.2 #6120

Closed
42wim opened this issue Jul 12, 2019 · 3 comments
Closed

100% CPU after upgrade to consul 1.5.2 #6120

42wim opened this issue Jul 12, 2019 · 3 comments
Labels
type/bug Feature does not function as expected waiting-reply Waiting on response from Original Poster or another individual in the thread

Comments

@42wim
Copy link
Contributor

42wim commented Jul 12, 2019

After upgrading to 1.5.2 our consul agents where taking all available CPU.

I traced the issue to PR #5468

We specify our IPv6 recursor on startup with the flag -recursor [2001:db8:85a3:8d3:1319:8a2e:370:7348] but without a port (which should use the default port 53)

This PR introduced a tight loop on that port 53 check in:

consul/agent/dns.go

Lines 271 to 275 in a82e6a7

START:
_, _, err := net.SplitHostPort(recursor)
if ae, ok := err.(*net.AddrError); ok && ae.Err == "missing port in address" {
recursor = ipaddr.FormatAddressPort(recursor, 53)
goto START
which results in an every increasing bracket adding [[[2001:db8:85a3:8d3:1319:8a2e:370:7348]]:53]:53 etc and taking all the CPU

According to golang/go#20470 the "[]" are only allowed when specifying a port with an ipv6 address.

But the -recursor 2001:db8:85a3:8d3:1319:8a2e:370:7348 fails with too many colons in address

So a possible fix would be to check for that error as below:

START:
	_, _, err := net.SplitHostPort(recursor)
	if ae, ok := err.(*net.AddrError); ok && (ae.Err == "missing port in address" ||  ae.Err == "too many colons in address") {
		recursor = ipaddr.FormatAddressPort(recursor, 53)
		goto START

ping @pierresouchay

Simple workaround for this bug is specifying the port -recursor [2001:db8:85a3:8d3:1319:8a2e:370:7348]:53

@banks banks added type/bug Feature does not function as expected good first issue A well-defined bug or improvement with sufficient context which should be approachable for new contr labels Jul 12, 2019
@banks
Copy link
Member

banks commented Jul 12, 2019

Thanks @42wim for the great debugging tips too.

You're proposed fix sounds like a good one to me. If you'd like to contribute it as a PR with a test that would be much appreciated. But if not we will take a look.

@42wim
Copy link
Contributor Author

42wim commented Aug 5, 2019

Made PR for this in #6128

@pearkes pearkes added waiting-reply Waiting on response from Original Poster or another individual in the thread waiting-pr-merge labels Sep 19, 2019
@freddygv freddygv removed the good first issue A well-defined bug or improvement with sufficient context which should be approachable for new contr label Oct 4, 2019
@stale
Copy link

stale bot commented Nov 3, 2019

Hey there, This issue has been automatically closed because there hasn't been any activity for at least 90 days. If you are still experiencing problems, or still have questions, feel free to open a new one 👍

@stale stale bot closed this as completed Nov 3, 2019
@hanshasselberg hanshasselberg reopened this Feb 3, 2020
42wim added a commit to 42wim/consul that referenced this issue Feb 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Feature does not function as expected waiting-reply Waiting on response from Original Poster or another individual in the thread
Projects
None yet
Development

No branches or pull requests

5 participants