Use correct domain root-servers.net for DNS probes #1163

Merged
merged 1 commit into nmstate:main on Apr 10, 2023

Conversation

andreaskaris
Contributor

@andreaskaris andreaskaris commented Apr 3, 2023

Is this a BUG FIX or a FEATURE ?:

Uncomment only one, leave it on its own line:

/kind bug

/kind enhancement

Closes #1164

What this PR does / why we need it:

Special notes for your reviewer:

Release note:
NONE


Fix for https://issues.redhat.com/browse/OCPBUGS-11272

At the time of this writing (both in OpenShift QE and from my laptop), an NS lookup for root-server.net returns nothing:

[akaris@linux test]$ cat main.go
// You can edit this code!
// Click here and start typing.
package main

import (
	"fmt"
	"net"
)

func main() {
	for _, i := range []string{"www.google.com", "a.root-servers.net", "root-servers.net", "root-server.net"} {
		ns, err := net.LookupNS(i)
		fmt.Println(i, "ns", ns, "err", err)
	}
}
[akaris@linux test]$ go run .
www.google.com ns [] err lookup www.google.com on 127.0.0.53:53: no such host
a.root-servers.net ns [] err lookup a.root-servers.net on 127.0.0.53:53: no such host
root-servers.net ns [0xc000180200 0xc000180210 0xc000180230 0xc000180240 0xc000180250 0xc000180260 0xc000180270 0xc000180280 0xc000180290 0xc0001802a0 0xc0001802b0 0xc0001802c0 0xc0001802d0] err
root-server.net ns [] err lookup root-server.net on 127.0.0.53:53: no such host
[akaris@linux test]$

The correct domain is root-servers.net (plural) anyway: https://www.iana.org/domains/root/servers

Signed-off-by: Andreas Karis <ak.karis@gmail.com>
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Apr 3, 2023
@kubevirt-bot kubevirt-bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 3, 2023
@kubevirt-bot
Collaborator

Hi @andreaskaris. Thanks for your PR.

I'm waiting for a nmstate member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubevirt-bot kubevirt-bot added size/XS release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Apr 3, 2023
@k8scoder192

k8scoder192 commented Apr 3, 2023

Why do I get "Can't find root-servers.net" (same for root-server.net)?

nslookup root-servers.net 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
*** Can't find root-servers.net: No answer

However, root-servers.org, a.root-servers.net, b.root-servers.net, etc. work (see below):

Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   b.root-servers.net
Address: 199.9.14.201
Name:   b.root-servers.net
Address: 2001:500:200::b

Personally, I would prefer not to hardcode values. Make this configurable for the user (and allow them to disable the DNS check if they so wish).

I ran into this very issue in #1164.

@cybertron @qinqon @andreaskaris

@qinqon
Member

qinqon commented Apr 4, 2023

/retest

@qinqon
Member

qinqon commented Apr 4, 2023

The DNS probe is automatically disabled if it is not working before apply, so we are safe here.

As a follow-up, we can add a field to the NMState CR to specify the endpoint for the DNS probe.

@qinqon
Member

qinqon commented Apr 4, 2023

This is weird; I am sure this was working before. Did they change the name?

pkg/probe/probes.go
@k8scoder192

@qinqon

The DNS probe is automatically disabled if it is not working before apply, so we are safe here.

As a follow-up, we can add a field to the NMState CR to specify the endpoint for the DNS probe.

Thanks for the response and for planning to add a field to specify the endpoint in a future release (hopefully soon :)). For this particular PR, I am glad to see the change is now "root-servers.org"; that's the correct one.

Regarding the DNS probe being auto-disabled before apply: I'm not sure this is working, otherwise I wouldn't be seeing DNS probe failures (it would be skipped, but it's not). See #1164.

@qinqon
Member

qinqon commented Apr 5, 2023

Can you retest it? Maybe you were running it when root-server.net was down or the like. The DNS probe, like the gateway probe, is supposed to be deactivated if it does not pass before apply.

You should see "WARNING not selecting dns probe" in your logs.

@qinqon
Member

qinqon commented Apr 5, 2023

/retest

golang hiccup

fatal: unable to access 'https://github.com/shazow/go-diff/': OpenSSL SSL_connect: Connection reset by peer in connection to github.com:443 

@qinqon
Member

qinqon commented Apr 5, 2023

/lgtm
/approve

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 5, 2023
@kubevirt-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qinqon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 5, 2023
@qinqon
Member

qinqon commented Apr 5, 2023

/retest

/home/prow/go/pkg/mod/k8s.io/apimachinery@v0.23.0/pkg/api/resource/amount.go:23:2: unrecognized import path "gopkg.in/inf.v0": reading https://gopkg.in/inf.v0?go-get=1: 502 Bad Gateway
	server response: Cannot obtain refs from GitHub: cannot talk to GitHub: Get https://github.com/go-inf/inf.git/info/refs?service=git-upload-pack: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
/home/prow/go/pkg/mod/sigs.k8s.io/structured-merge-diff/v4@v4.1.2/value/value.go:26:2: unrecognized import path "gopkg.in/yaml.v2": reading https://gopkg.in/yaml.v2?go-get=1: 502 Bad Gateway
	server response: Cannot obtain refs from GitHub: cannot talk to GitHub: Get https://github.com/go-yaml/yaml.git/info/refs?service=git-upload-pack: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
/home/prow/go/pkg/mod/sigs.k8s.io/controller-tools@v0.8.0/pkg/schemapatcher/gen.go:24:2: unrecognized import path "gopkg.in/yaml.v3": reading https://gopkg.in/yaml.v3?go-get=1: 502 Bad Gateway
	server response: Cannot obtain refs from GitHub: cannot talk to GitHub: Get https://github.com/go-yaml/yaml.git/info/refs?service=git-upload-pack: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

@andreaskaris
Contributor Author

andreaskaris commented Apr 5, 2023

I'm pretty sure it's root-servers.net; you need to run an NS lookup, not a normal lookup:

dig ns root-servers.net

This is weird; I am sure this was working before. Did they change the name?

I've got no clue why this was working before, but the official domain, afaict, was always root-servers.net.

$ dig ns root-servers.net

; <<>> DiG 9.18.12 <<>> ns root-servers.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18560
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 27

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;root-servers.net.		IN	NS

;; ANSWER SECTION:
root-servers.net.	43571	IN	NS	l.root-servers.net.
root-servers.net.	43571	IN	NS	g.root-servers.net.
root-servers.net.	43571	IN	NS	c.root-servers.net.
root-servers.net.	43571	IN	NS	k.root-servers.net.
root-servers.net.	43571	IN	NS	m.root-servers.net.
root-servers.net.	43571	IN	NS	d.root-servers.net.
root-servers.net.	43571	IN	NS	b.root-servers.net.
root-servers.net.	43571	IN	NS	e.root-servers.net.
root-servers.net.	43571	IN	NS	f.root-servers.net.
root-servers.net.	43571	IN	NS	i.root-servers.net.
root-servers.net.	43571	IN	NS	h.root-servers.net.
root-servers.net.	43571	IN	NS	j.root-servers.net.
root-servers.net.	43571	IN	NS	a.root-servers.net.
[akaris@linux iphelpers (improve-iterate-for-assignment-separate-commits2)]$ whois root-servers.net
[Querying whois.verisign-grs.com]
[Redirected to whois.networksolutions.com]
[Querying whois.networksolutions.com]
[whois.networksolutions.com]
Domain Name: ROOT-SERVERS.NET
Registry Domain ID: 2751247_DOMAIN_NET-VRSN
Registrar WHOIS Server: whois.networksolutions.com
Registrar URL: http://networksolutions.com
Updated Date: 2021-05-11T17:43:59Z
Creation Date: 1995-07-04T04:00:00Z
Registrar Registration Expiration Date: 2024-07-03T04:00:00Z
Registrar: Network Solutions, LLC
Registrar IANA ID: 2
Reseller: 
Domain Status: serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited
Domain Status: serverTransferProhibited https://icann.org/epp#serverTransferProhibited
Domain Status: serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited
Registry Registrant ID: 
Registrant Name: VERISIGN INC.
Registrant Organization: VERISIGN INC.
Registrant Street: 12061 BLUEMONT WAY
Registrant City: RESTON
Registrant State/Province: VA
Registrant Postal Code: 20190-5684
Registrant Country: US
Registrant Phone: +1.7039481212
Registrant Phone Ext: 
Registrant Fax: +1.7039483670
Registrant Fax Ext: 
(...)

From what I see, it's also still an issue:

[akaris@linux dig]$ cat main.go 
package main

import (
	"context"
	"fmt"
	"net"
)

func main() {
	r := &net.Resolver{
		// PreferGo: true,
	}
	lookup, err := r.LookupNS(context.TODO(), "root-servers.net")
	fmt.Println(lookup, err)

	lookup, err = r.LookupNS(context.TODO(), "root-server.net")
	fmt.Println(lookup, err)
}
[akaris@linux dig]$ go run .
[0xc000098000 0xc000098010 0xc000098030 0xc000098040 0xc000098050 0xc000098060 0xc000098070 0xc000098080 0xc000098090 0xc0000980a0 0xc0000980b0 0xc0000980c0 0xc0000980d0] <nil>
[] lookup root-server.net on 127.0.0.53:53: no such host

From what I saw in my tests, the DNS probe is run on startup, and when it fails it is auto-disabled. I dunno what happens if stuff is already running and then things go down.

@andreaskaris
Contributor Author

/retest

@kubevirt-bot
Collaborator

@andreaskaris: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@qinqon
Member

qinqon commented Apr 5, 2023

/retest

centos 9 stream issues

@qinqon
Member

qinqon commented Apr 5, 2023

dig ns root-servers.net

Ahh right, it was not a normal lookup. Maybe we were having a false positive here and the "dns" probe was always disabled, since root-server.net does not exist.

/hold

Can you change it back to .net?

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 5, 2023
@qinqon
Member

qinqon commented Apr 5, 2023

/ok-to-test

@kubevirt-bot kubevirt-bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 5, 2023
@qinqon
Member

qinqon commented Apr 5, 2023

/hold cancel

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 5, 2023
@qinqon
Member

qinqon commented Apr 5, 2023

/retest

Pretty bad day

go: downloading github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
/home/prow/go/pkg/mod/sigs.k8s.io/controller-tools@v0.8.0/pkg/loader/loader.go:30:2: golang.org/x/tools@v0.1.6-0.20210820212750-d4cc65f0b2ff: invalid version: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in /home/prow/go/pkg/mod/cache/vcs/7d9b3b49b55db5b40e68a94007f21a05905d3fda866f685220de88f9c9bad98a: exit status 128:
	error: RPC failed; curl 56 OpenSSL SSL_read: Connection reset by peer, errno 104
	error: 11590 bytes of body are still expected
	fetch-pack: unexpected disconnect while reading sideband packet
	fatal: early EOF
	fatal: fetch-pack: invalid index-pack output

@andreaskaris
Contributor Author

/retest-failed

@andreaskaris
Contributor Author

/retest-required

@andreaskaris
Contributor Author

/retest

@k8scoder192

I'm pretty sure it's root-servers.net, you need to run an NS lookup, not a normal lookup:

dig ns root-servers.net

This is weird I am sure this was working before, did the change the name?

I got no clue why this was working before, but the official domain afaict was always root-servers.net

$ dig ns root-servers.net

; <<>> DiG 9.18.12 <<>> ns root-servers.net
up root-server.net on 127.0.0.53:53: no such host

TRUNCATED OUTPUT

From what I saw in my tests, the DNS probe is run on startup, and when it fails it is auto-disabled. I dunno what happens if stuff is already running and then things go down.


nslookup fails with root-servers.net. It passes with root-servers.org. Here is my output:

vanilla@LAPTOP-HGR0D2DR:~$ nslookup root-server.net     <------
;; connection timed out; no servers could be reached



vanilla@LAPTOP-HGR0D2DR:~$ nslookup root-servers.net    <------
Non-authoritative answer:
*** Can't find root-servers.net: No answer


vanilla@LAPTOP-HGR0D2DR:~$ nslookup root-servers.org     <------
Non-authoritative answer:
Name:   root-servers.org
Address: 193.0.11.23
Name:   root-servers.org
Address: 2001:67c:2e8:25::c100:b17

Also, regarding the comment that it will skip the DNS lookup if it fails at the start: I know the code is there, but it's not working.
I even completely removed k8s-nmstate from my cluster, reinstalled v0.74, and it still attempted the DNS lookup (obviously with the wrong domain, "root-server.net").

@andreaskaris @qinqon

@andreaskaris
Contributor Author

andreaskaris commented Apr 5, 2023

@k8scoder192 You are querying A records. The Go code queries the NS records for root-servers.net. Again, that's the correct domain, as it holds the name servers for the *.root-servers.net A entries.

[akaris@linux ~]$ nslookup -type=ns root-servers.net
Server:		127.0.0.53
Address:	127.0.0.53#53

Non-authoritative answer:
root-servers.net	nameserver = b.root-servers.net.
root-servers.net	nameserver = k.root-servers.net.
root-servers.net	nameserver = l.root-servers.net.
root-servers.net	nameserver = c.root-servers.net.
root-servers.net	nameserver = j.root-servers.net.
root-servers.net	nameserver = i.root-servers.net.
root-servers.net	nameserver = g.root-servers.net.
root-servers.net	nameserver = e.root-servers.net.
root-servers.net	nameserver = a.root-servers.net.
root-servers.net	nameserver = h.root-servers.net.
root-servers.net	nameserver = d.root-servers.net.
root-servers.net	nameserver = m.root-servers.net.
root-servers.net	nameserver = f.root-servers.net.

Authoritative answers can be found from:
f.root-servers.net	internet address = 192.5.5.241
g.root-servers.net	internet address = 192.112.36.4
e.root-servers.net	internet address = 192.203.230.10
e.root-servers.net	has AAAA address 2001:500:a8::e
k.root-servers.net	internet address = 193.0.14.129
a.root-servers.net	has AAAA address 2001:503:ba3e::2:30
k.root-servers.net	has AAAA address 2001:7fd::1
m.root-servers.net	internet address = 202.12.27.33
h.root-servers.net	has AAAA address 2001:500:1::53
b.root-servers.net	internet address = 199.9.14.201
b.root-servers.net	has AAAA address 2001:500:200::b
f.root-servers.net	has AAAA address 2001:500:2f::f

As I stated earlier (and as you can see in the code snippet), the Go code does an NS record lookup.
A lookup for A records will fail, as there are none on root-servers.net itself, but we don't care about that.
As I said earlier, that matches IANA's documentation: https://www.iana.org/domains/root/servers
And if you google for it, you'll also find that it's the correct domain.

root-servers.org is the website at https://root-servers.org/, but we care about the root entry of the DNS server list and not about the website:

[akaris@linux ~]$ dig ns root-servers.org

; <<>> DiG 9.18.12 <<>> ns root-servers.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21044
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 9

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;root-servers.org.		IN	NS

;; ANSWER SECTION:
root-servers.org.	3546	IN	NS	ns-ext.isc.org.
root-servers.org.	3546	IN	NS	ns.maxgigapop.net.
root-servers.org.	3546	IN	NS	sec2.authdns.ripe.net.
root-servers.org.	3546	IN	NS	jokulsa.asbyrgi.net.
root-servers.org.	3546	IN	NS	nnp.netnod.se.
root-servers.org.	3546	IN	NS	a.icann-servers.net.
[akaris@linux ~]$ whois root-servers.org
[Querying whois.pir.org]
[whois.pir.org]
Domain Name: root-servers.org
Registry Domain ID: daa17d4699c343d487770cd5c877ab88-LROR
Registrar WHOIS Server: http://whois.joker.com
Registrar URL: http://www.joker.com
Updated Date: 2022-09-26T07:52:31Z
Creation Date: 1998-11-12T05:00:00Z
Registry Expiry Date: 2031-11-11T05:00:00Z
Registrar: CSL Computer Service Langenbach GmbH d/b/a joker.com a German GmbH
Registrar IANA ID: 113
Registrar Abuse Contact Email: abuse@joker.com
Registrar Abuse Contact Phone: +49.21186767447

@andreaskaris
Contributor Author

/retest

@andreaskaris
Contributor Author

andreaskaris commented Apr 5, 2023

Also the comment that it will skip DNS lookup if it fails at the start. I know the code is there but it's not working.
I even completely removed k8s-nmstate from my cluster. Reinstalled v0.74 and it still attempted DNS lookup (obviously with the wrong domain "root-server.net")

Yes, but what I saw is that it'll throw an error and then happily continue with the probe disabled.
I had someone in our QE run a verification for something else, and yes, there are lots of error messages like this on startup:

{"level":"error","ts":"2023-04-03T03:24:06.657Z","logger":"probe","msg":"failed checking DNS connectivity","error":"[lookup root-server.net on 10.0.0.2:53: no such host]","stacktrace":"github.com/nmstate/kubernetes-nmstate/pkg/probe.runDNS\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:239\ngithub.com/nmstate/kubernetes-nmstate/pkg/probe.dnsCondition.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:206\nk8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:220\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:233\nk8s.io/apimachinery/pkg/util/wait.WaitForWithContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:660\nk8s.io/apimachinery/pkg/util/wait.poll\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:594\nk8s.io/apimachinery/pkg/util/wait.PollImmediateWithContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:526\nk8s.io/apimachinery/pkg/util/wait.PollImmediate\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:512\ngithub.com/nmstate/kubernetes-nmstate/pkg/probe.Select\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/probe/probes.go:261\ngithub.com/nmstate/kubernetes-nmstate/pkg/helper.ApplyDesiredState\n\t/go/src/github.com/openshift/kubernetes-nmstate/pkg/helper/client.go:166\ngithub.com/nmstate/kubernetes-nmstate/controllers/handler.(*NodeNetworkConfigurationPolicyReconciler).Reconcile\n\t/go/src/github.com/openshift/kubernetes-nmstate/controllers/handler/nodenetworkconfigurationpolicy_controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234"}

But eventually it shows:

{"level":"info","ts":"2023-04-03T03:24:06.657Z","logger":"probe","msg":"WARNING not selecting dns probe"}

And then works by ignoring the DNS probe in the future. But I tested this with downstream OpenShift so I can't tell if there's a difference upstream

@k8scoder192

Agreed, the handler pod will try a few times, then fail DNS and move on to ping. The problem is the amount of time until it moves on to the next probe (ping): it's more than 5 minutes. If you are talking about something else, I apologize; I'm specifically referring to the handler pod when looking at the logs. My code watches event streams, and 5+ minutes is a very long wait before it sees:

status:
  conditions:
    reason: SuccessfullyConfigured
    status: "True"
    type: Available

I want to thank you for your feedback and looking into all of this. Much appreciated.

@andreaskaris
Contributor Author

andreaskaris commented Apr 5, 2023

Ah, so it's not that the container crashes or anything, but that it takes a long time to start up due to the failed (and skipped) DNS probe. Yeah, then we're on the same page :-) I guess some follow-up is indeed needed here, but (if CI actually lets me) I just wanted to push a quick fix for the issue at hand.

@andreaskaris
Contributor Author

/retest

@qinqon
Member

qinqon commented Apr 10, 2023

/retest

The CentOS 9 Stream RPM issue has been fixed upstream, but it looks like we have an e2e test issue on main.

@kubevirt-bot kubevirt-bot merged commit 1bcf612 into nmstate:main Apr 10, 2023
@k8scoder192

@qinqon @andreaskaris great to see this was merged. Do you know when a release will be created?

cybertron pushed a commit to cybertron/kubernetes-nmstate that referenced this pull request Oct 27, 2023
Signed-off-by: Andreas Karis <ak.karis@gmail.com>
(cherry picked from commit 1bcf612)
openshift-ci bot added a commit to openshift/kubernetes-nmstate that referenced this pull request Oct 31, 2023
[4.12] OCPBUGS-22480: Use correct domain root-servers.net for DNS probes (nmstate#1163)
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/XS
Development

Successfully merging this pull request may close these issues.

dns probe failed, need to disable dns check
4 participants