dns not work on windows10 #2095

Closed
ytfrank opened this issue Sep 18, 2021 · 16 comments

Comments

@ytfrank

ytfrank commented Sep 18, 2021

The k8s service name still fails to resolve after "telepresence connect" runs successfully:

io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: Search domain query failed. Original hostname: 'redis-cs.svc.cluster.local' failed to resolve 'redis-cs.svc.cluster.local' after 7 queries
at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:718)

Meanwhile, "telnet redis-cs.svc.cluster.local 6379" works from the command line.

Windows10 64bit
telepresence v2.4.4-nightly-5cae5bfc(api v3)

@thallgren
Member

@ytfrank Can you please run telepresence loglevel debug, reproduce the issue, and then provide the resulting daemon.log file? On a Windows box, you'll find the log files under %USERPROFILE%\AppData\Local\telepresence\logs.
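
As a rough illustration of where to look, the sketch below builds the expected daemon.log path and filters log lines for a hostname. The Windows location is the one given above; the macOS location is taken from a path quoted later in this thread. The function names `daemon_log_path` and `matching_lines` are hypothetical helpers, not part of Telepresence.

```python
from pathlib import PurePosixPath, PureWindowsPath


def daemon_log_path(system: str, home: str):
    """Return the expected daemon.log location for an OS family.

    `system` is "Windows" or "Darwin" (the values platform.system() reports);
    `home` is the user's home directory (%USERPROFILE% on Windows).
    Both locations are taken from this thread, not from Telepresence itself.
    """
    if system == "Windows":
        return (PureWindowsPath(home) / "AppData" / "Local"
                / "telepresence" / "logs" / "daemon.log")
    if system == "Darwin":
        return (PurePosixPath(home) / "Library" / "Logs"
                / "telepresence" / "daemon.log")
    raise ValueError(f"no known log location for {system!r}")


def matching_lines(lines, needle):
    """Keep only the log lines that mention `needle` (e.g. a hostname)."""
    return [line for line in lines if needle in line]


if __name__ == "__main__":
    print(daemon_log_path("Windows", r"C:\Users\me"))
```

On a real machine, `platform.system()` and `pathlib.Path.home()` could supply the two arguments.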

@ytfrank
Author

ytfrank commented Sep 22, 2021

@thallgren
Sorry for the late reply!
You can search for "redis-cs.svc.cluster.local" in the daemon.log.

Thank you very much!

daemon.log

@ytfrank
Author

ytfrank commented Sep 22, 2021

@thallgren
Some additional info:
The cluster's DNS server (kube-dns) is listening on 10.96.0.10:53:
kube-dns ClusterIP 10.96.0.10 53/UDP,53/TCP,9153/TCP 336d k8s-app=kube-dns

image

@thallgren
Member

The name redis-cs.svc.cluster.local is missing a namespace component. Names are usually <name>.<namespace>.svc.cluster.local. This is just <name>.svc.cluster.local.

The logs show that the DNS-query is sent over to your cluster and that the cluster fails to find it.

From what machine do you run the telnet command that works?
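
The naming rule above can be checked mechanically. This is a minimal sketch, assuming the default cluster domain `cluster.local`; the helper `split_service_name` is hypothetical, and `test82` is the namespace that appears later in this thread.

```python
def split_service_name(host: str, cluster_domain: str = "cluster.local"):
    """Split a service hostname into (name, namespace).

    Expects <name>.<namespace>.svc.<cluster_domain>; namespace is None
    when the hostname is only <name>.svc.<cluster_domain>.
    """
    suffix = ".svc." + cluster_domain
    host = host.rstrip(".")  # tolerate a fully qualified trailing dot
    if not host.endswith(suffix):
        raise ValueError(f"{host!r} is not under svc.{cluster_domain}")
    labels = host[: -len(suffix)].split(".")
    if not all(labels) or len(labels) > 2:
        raise ValueError(f"unexpected labels in {host!r}")
    if len(labels) == 1:
        return labels[0], None  # namespace component is missing
    return labels[0], labels[1]


if __name__ == "__main__":
    print(split_service_name("redis-cs.svc.cluster.local"))
    print(split_service_name("redis-cs.test82.svc.cluster.local"))
```

A `None` namespace flags exactly the kind of name questioned above.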

@thallgren
Member

Another thing I notice is that the queries are for "AAAA" records (IPv6 address) and "CNAME" records (alias from one name to another) only. There's never a query for an "A" record (IPv4 address). I would assume that telnet queries (and successfully resolves) the "A" record.

@ytfrank
Author

ytfrank commented Sep 22, 2021

@thallgren
Sorry for the confusion...
Yeah, the full name is "redis-cs.testXXX.svc.cluster.local", where testXXX is the name of the namespace, which I globally replaced in the log file for security reasons.
The command "telnet redis-cs.testXXX.svc.cluster.local 6379" works on the same Windows machine that hits the DNS issue when starting the Tomcat process.

And how can you tell that there is no query for an "A" record? By inspecting the code or the log?
Thanks a lot!

@ytfrank
Author

ytfrank commented Sep 22, 2021

@thallgren

OK, I saw the types in types.go. Meanwhile, DNS works well on Mac, and I only see types 1 and 28 in the Mac log file:

6451 2021-09-22 19:33:06.2220 debug daemon/server-dns/Server : LookupHost "redis-cs.test82.svc.cluster.local"
6452 2021-09-22 19:33:06.2641 debug daemon/server-dns/Server : QUERY[1] redis-cs.test82.svc.cluster.local. -> 10.104.12.83
6453 2021-09-22 19:33:06.5517 debug daemon/server-dns/Server : QUERY[28] redis-cs.test82.svc.cluster.local. -> EMPTY
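
To make the numeric types easier to read, here is a small sketch that maps them to record names (1 = A, 5 = CNAME, 28 = AAAA, the standard DNS type codes) and summarizes which types were queried per name. The regular expression is an assumption based only on the lines quoted in this thread, where the tag appears both as `QUERY[n]` and `QTYPE[n]`; the real log format may contain other variants.

```python
import re

# Standard DNS record type codes that show up in this thread.
RECORD_TYPES = {1: "A", 5: "CNAME", 28: "AAAA"}

# Assumed line shape: "... QUERY[1] <name> -> <answer>" or "... QTYPE[5] ...".
LINE_RE = re.compile(r"(?:QUERY|QTYPE)\[(\d+)\]\s+(\S+)\s+->\s+(.+)$")


def summarize_queries(lines):
    """Map each queried name to the set of record types seen for it."""
    seen = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # e.g. LookupHost or TUN reader lines
        qtype, name, _answer = match.groups()
        label = RECORD_TYPES.get(int(qtype), f"TYPE{qtype}")
        seen.setdefault(name, set()).add(label)
    return seen


mac_lines = [
    '6451 2021-09-22 19:33:06.2220 debug daemon/server-dns/Server : LookupHost "redis-cs.test82.svc.cluster.local"',
    "6452 2021-09-22 19:33:06.2641 debug daemon/server-dns/Server : QUERY[1] redis-cs.test82.svc.cluster.local. -> 10.104.12.83",
    "6453 2021-09-22 19:33:06.5517 debug daemon/server-dns/Server : QUERY[28] redis-cs.test82.svc.cluster.local. -> EMPTY",
]

summary = summarize_queries(mac_lines)
for name, types in summary.items():
    print(name, sorted(types), "A queried:", "A" in types)
```

Running the same function over a Windows daemon.log would show at a glance whether any A query reached the daemon.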

@thallgren
Member

Where does the query originate from?

@ytfrank
Author

ytfrank commented Sep 22, 2021

It's from the log file of Mac:
/Users/apple/Library/Logs/telepresence/daemon.log.

Here is the Telepresence DNS documentation, but I cannot find how it works on Windows.
https://www.telepresence.io/docs/latest/reference/routing/#dns-resolution

daemon.log

@thallgren
Member

thallgren commented Sep 23, 2021

Sorry, I was unclear. I want to know what process it is that makes the original request to "redis-cs.svc.cluster.local" on Windows.
Two reasons for that:

  1. On Windows, the name is "redis-cs.svc.cluster.local". It lacks the namespace. On Mac it contains the namespace "test82".
  2. On Windows, there's no query for the A-record.

I don't think it's likely that Telepresence is the cause of those differences.

@ytfrank
Author

ytfrank commented Sep 23, 2021

@thallgren
Sorry for the confusion!
Actually, it's the same Java process on Windows and Mac, and both of them send requests to "redis-cs.test82.svc.cluster.local". "test82" was removed from the Windows daemon log.
Yeah, the only difference is that there is no query for the A record (only QTYPE[5] and QTYPE[28]) on Windows 10 when starting the Java process in IntelliJ IDEA.

@thallgren
Member

Are you saying that Telepresence removed "test82" from the DNS query? I really can't see that happening.

@ytfrank
Author

ytfrank commented Sep 23, 2021

No, I removed "test82" manually before uploading the log file. Otherwise the original info is the same.
Below is the original info in the log file on Windows:

2021-09-22 13:56:48.7621 debug daemon/server-dns/Server : LookupHost "redis-cs.test82.svc.cluster.local"
2021-09-22 13:56:49.2292 debug daemon/server-router/TUN reader : -- POOL udp 10.96.0.0:52444 -> 10.96.0.10:53, count now is 14
2021-09-22 13:56:49.6545 debug daemon/server-dns/Server : QTYPE[5] redis-cs.test82.svc.cluster.local. -> EMPTY
2021-09-22 13:56:49.6552 debug daemon/server-router/TUN reader : <- DNS udp 10.96.0.10:53 -> 10.96.0.0:53871, len 51
2021-09-22 13:56:49.6557 debug daemon/server-router/TUN writer : -> TUN udp 10.96.0.10:53 -> 10.96.0.0:53871
2021-09-22 13:56:49.6896 debug daemon/server-router/TUN reader : -> DNS udp 10.96.0.0:53871 -> 10.96.0.10:53, len 80
2021-09-22 13:56:49.6907 debug daemon/server-dns/Server : LookupHost "redis-cs.test82.svc.cluster.local.staff.com.cn"
2021-09-22 13:56:49.7448 debug daemon/server-router/TUN reader : -- POOL udp 10.96.0.0:51258 -> 10.96.0.10:53, count now is 13
2021-09-22 13:56:49.9472 debug daemon/server-dns/Server : QTYPE[28] redis-cs.test82.svc.cluster.local.staff.com.cn. -> NOT FOUND

Maybe DNS works differently depending on the OS. Does Telepresence work well on Windows 10 on your side?
Has anyone encountered this issue before?
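
The second LookupHost in the log above appends "staff.com.cn" to the original name, which looks like a DNS search domain being tried after the plain name returned nothing. The sketch below shows the conventional resolv.conf-style expansion order. It is an assumption that the client on this machine follows that convention: the `ndots` threshold and the helper `candidate_names` are illustrative, and neither netty's nor Windows' actual resolver logic is reproduced here.

```python
def candidate_names(host, search_domains, ndots=1):
    """List the fully qualified names a stub resolver would try, in order.

    Convention assumed (resolv.conf style): a name ending in "." is
    absolute and tried alone; a name with at least `ndots` dots is tried
    as-is first and then with each search domain appended; otherwise the
    search domains are tried first.
    """
    if host.endswith("."):
        return [host]
    absolute = host + "."
    expanded = [f"{host}.{domain.strip('.')}." for domain in search_domains]
    if host.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


if __name__ == "__main__":
    # "staff.com.cn" is inferred from the second LookupHost line in the log.
    for name in candidate_names("redis-cs.test82.svc.cluster.local",
                                ["staff.com.cn"]):
        print(name)
```

Under these assumptions, the order matches the log: the cluster name is queried first, then the same name with the search domain appended.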

@thallgren
Member

Telepresence works fine on my laptop and it also passes our CI-tests on Windows.

I honestly don't think this is a Telepresence issue. I can't see how either Telepresence or the OS would change the DNS queries from type A to type CNAME. That must be something that the app is doing differently on Windows.

@ytfrank
Author

ytfrank commented Sep 23, 2021

Yeah, I think so. I will try another process on Windows 10.
Thanks very much!

@thallgren
Member

Closing, as we agree that this isn't a Telepresence problem. Feel free to reopen if you find anything new that points to Telepresence.
