
ostree pull seems to do a lot of dns lookups #894

Open
alexlarsson opened this issue May 30, 2017 · 10 comments
alexlarsson (Member) commented May 30, 2017

I've seen a lot of reports of DNS issues during flatpak install. It seems that sometimes one of the DNS servers for gnome.org has a hiccup, which tends to abort the entire install.

Example user report:

[ironman:~] bratner $ flatpak install --user http://flatpak.pitivi.org/pitivi.flatpakref
This application depends on runtimes from:
  http://sdk.gnome.org/repo/
Configure this as new remote 'gnome' [y/n]: y
Installing: org.pitivi.Pitivi/x86_64/stable
Required runtime for org.pitivi.Pitivi/x86_64/stable (org.gnome.Platform/x86_64/3.22) is not installed, searching...
Found in remote gnome, do you want to install it? [y/n]: y
Installing: org.gnome.Platform/x86_64/3.22 from gnome
[#=                  ] Downloading: 0 bytes/183.1 MB (0 bytes/s)               
error: While pulling runtime/org.gnome.Platform/x86_64/3.22 from remote gnome: Error resolving 'sdk.gnome.org': Name or service not known
[ironman:~] bratner $ ping sdk.gnome.org
PING sdk.gnome.org (209.132.180.169) 56(84) bytes of data.
64 bytes from sdk.gnome.org (209.132.180.169): icmp_seq=1 ttl=47 time=235 ms
64 bytes from sdk.gnome.org (209.132.180.169): icmp_seq=2 ttl=47 time=236 ms
^C
--- sdk.gnome.org ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 235.710/235.972/236.235/0.552 ms
[ironman:~] bratner $ dig sdk.gnome.org

; <<>> DiG 9.10.3-P4-Ubuntu <<>> sdk.gnome.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1259
;; flags: qr ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;sdk.gnome.org.			IN	A

;; ADDITIONAL SECTION:
sdk.gnome.org.		772	IN	A	209.132.180.169

;; Query time: 1 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue May 30 11:37:50 IDT 2017
;; MSG SIZE  rcvd: 58

[ironman:~] bratner $ cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
search Home
[ironman:~] bratner $ sudo vi /etc/resolv.conf 
[ironman:~] bratner $ flatpak install --user http://flatpak.pitivi.org/pitivi.flatpakref
Installing: org.pitivi.Pitivi/x86_64/stable
Required runtime for org.pitivi.Pitivi/x86_64/stable (org.gnome.Platform/x86_64/3.22) is not installed, searching...
Found in remote gnome, do you want to install it? [y/n]: y
Installing: org.gnome.Platform/x86_64/3.22 from gnome
[####################] 9 delta parts, 73 loose fetched; 178829 KiB transferred in 68 seconds
Installing: org.gnome.Platform.Locale/x86_64/3.22 from gnome
[####################] 3 metadata, 1 content objects fetched; 13 KiB transferred in 4 seconds
Installing: org.pitivi.Pitivi/x86_64/stable from org.pitivi.Pitivi-1-origin
[####################] 6 delta parts, 34 loose fetched; 117805 KiB transferred in 35 seconds
[ironman:~] bratner $ cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 8.8.8.8
nameserver 127.0.1.1
search Home

I wonder if we're doing something wrong with regard to name resolution. Shouldn't glibc cache the result of the first lookup and reuse it for the rest of the pull? Or do we need to resolve once ourselves and then manually specify the IP for each request in the pull?
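To illustrate the second option, here's a minimal Python sketch (not ostree's actual C code) of a per-pull resolver cache: each hostname is resolved at most once per pull operation, so a transient DNS hiccup mid-pull can't abort an otherwise-working transfer. The hostname and address are the ones from the report above; `fake_resolve` is an injected stand-in so the example runs without network access.

```python
import socket

class PullResolver:
    """Illustrative sketch only (not ostree code): resolve each hostname
    at most once per pull operation."""

    def __init__(self, resolve=socket.gethostbyname):
        self._resolve = resolve   # injectable for testing
        self._cache = {}

    def lookup(self, host):
        if host not in self._cache:
            self._cache[host] = self._resolve(host)
        return self._cache[host]

# An injected fake resolver shows the caching without touching the network:
calls = []
def fake_resolve(host):
    calls.append(host)
    return "209.132.180.169"   # address from the report above

resolver = PullResolver(resolve=fake_resolve)
for _ in range(100):           # e.g. one lookup per fetched object
    resolver.lookup("sdk.gnome.org")
print(len(calls))              # → 1: only the first fetch hit "DNS"
```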

@dustymabe (Contributor)

+1. I've definitely seen issues where a failure to resolve a name in the middle of a pull results in the whole pull being aborted.

@cgwalters (Member)

I don't believe glibc does any caching unless nscd is enabled, and nscd has lots of issues. I suggest NetworkManager with dns=dnsmasq.
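For readers unfamiliar with that option: it's a one-line change in /etc/NetworkManager/NetworkManager.conf, after which NetworkManager should spawn a local caching dnsmasq instance and point resolv.conf at 127.0.0.1:

```ini
# /etc/NetworkManager/NetworkManager.conf
[main]
dns=dnsmasq
```

A restart of NetworkManager is needed for the change to take effect.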

@cgwalters (Member)

There's also systemd-resolved.

@alexlarsson (Member, Author)

@cgwalters Don't you think it makes sense to do a DNS resolve only once per pull operation, though? That way you'd always hit the same mirror instance for the entire operation, and also avoid a lot of DNS requests.

@alexlarsson (Member, Author)

@cgwalters I mean, in the non-static-delta case, are we really doing one DNS resolve operation for each object?

@cgwalters (Member)

Well, HTTP keepalives should really obviate most DNS issues except for initial setup. It looks like pitivi.org does keepalives.

I'm uncertain about caching DNS in ostree explicitly... it feels like that's more libsoup/libcurl's or the system's job. And actually, one thing we likely want to enable is for higher-level software like gnome-software to use GNetworkMonitor to dynamically watch for repos to become available. Doing something like that would end up implicitly caching DNS at a higher level.

@alexlarsson (Member, Author)

That all sounds fine in theory, but people show up all the time with these dns issues.

@cgwalters (Member)

I'm not saying there's no problem, but... oh, hm, interesting: Firefox caches DNS. (I was about to argue that one really wants a system-wide cache for web browsing etc.; AIUI Windows, for example, has one.)

One thing here is that we currently use multiple connections, so even with keepalives, at least with libsoup a pull may still fail if DNS transiently fails mid-pull for a server we already have an open connection to.

@alexlarsson (Member, Author)

I think what happens is that gnome.org has multiple DNS servers, but generally only one is borked. So, if we resolve once, we'll either fail the entire pull or succeed at it; whereas without caching, we're pretty much guaranteed to hit the borked one at some point.

Also, I'm not sure we want a generic cache, but rather something that is part of OtPullData, i.e. one resolve per pull operation.

@alexlarsson (Member, Author)

Additionally, it just seems safer, in terms of things like round-robin DNS mirroring, to use the same server for the entire pull operation.
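To make that concrete, a hedged sketch (Python, illustrative only, not how OtPullData works today): resolve the host once at the start of the pull, then aim every request in that pull at the single resolved address while keeping the Host header, so round-robin DNS can't hand different requests to different mirrors. The injected resolver and IP are just the values from the report above.

```python
import socket
import urllib.request

def pull_requests_for(host, paths, resolve=socket.gethostbyname):
    """Sketch only (not ostree code): resolve the repo host once, then
    build every request in the pull against that single address, keeping
    the Host header so name-based virtual hosting still works."""
    addr = resolve(host)   # one resolve for the whole pull operation
    return [
        urllib.request.Request(f"http://{addr}{path}",
                               headers={"Host": host})
        for path in paths
    ]

# Injected resolver, using the address from the report above:
reqs = pull_requests_for("sdk.gnome.org",
                         ["/repo/config", "/repo/summary"],
                         resolve=lambda h: "209.132.180.169")
print(reqs[0].full_url)   # → http://209.132.180.169/repo/config
```

(For HTTPS, connecting by bare IP breaks SNI and certificate validation; a real implementation would pin the address while still connecting by name, along the lines of libcurl's CURLOPT_RESOLVE.)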

3 participants