Cache updates #1957
Comments
andrewdavidwong added the enhancement, C: core, and P: major labels on May 5, 2016
andrewdavidwong changed the title from Cache package updates to Cache updates on May 5, 2016
taradiddles commented May 5, 2016
It's indeed a common problem when deploying fedora VMs/containers, or with server farms. Debian has apt-cacher(ng) but fedora doesn't have anything similar.
Solutions that came up:
- squid as a caching proxy (as noted by andrew). Need to check cache expiry time, max object size, etc., in order to minimize cache misses. Similar notes: https://www.berrange.com/posts/2015/12/09/setting-up-a-local-caching-proxy-for-fedora-yum-repositories/
- pulp ; http://www.pulpproject.org/ - too heavy?
- cobbler ; https://fedorahosted.org/cobbler/ - From the doc: "Cobbler can also optionally help with [...] yum package mirroring infrastructure". But it looks specific to fedora and maybe too heavy too.
- guru labs automatic mirror ; https://www.gurulabs.com/goodies/guru-guides/YUM-automatic-local-mirror/ - Needs a web server and, without tweaks to the yum repo files, has to be authoritative for well-known fedora mirror domains. Other than that, looks like a rather "clean" solution.
- mrepo ; http://dag.wiee.rs/home-made/mrepo/ - Oldish but may get the work done.
Anyway, instead of having specific tools for each distro it would be wiser to have a generic solution.
So, all in all, the squid solution may be the best one, with the cache miss rate being something to investigate.
marmarek (Member) commented May 5, 2016
Actually apt-cacher-ng works for Fedora too :)
Maybe we can simply use it instead of tinyproxy as the updates proxy?
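For illustration, clients would keep talking to the same well-known proxy address either way. A minimal sketch, assuming apt-cacher-ng were made to listen where tinyproxy does today (the address is the Qubes updates-proxy address quoted later in this thread; the apt file name is illustrative):

    # /etc/dnf/dnf.conf or /etc/yum.conf in a Fedora TemplateVM
    proxy=http://10.137.255.254:8082/

    # /etc/apt/apt.conf.d/01qubes-proxy in a Debian TemplateVM (illustrative file name)
    Acquire::http::Proxy "http://10.137.255.254:8082/";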
taradiddles commented May 5, 2016
apt-cacher-ng works on fedora for mirroring debian stuff, but does it really work for mirroring (d)rpms/metadata downloaded with yum/dnf?
From the doc [1]: "6.3 Fedora Core - Attempts to add apt-cacher-ng support ended up in pain and the author lost any motivation in further research on this subject."
[1] https://www.unix-ag.uni-kl.de/~bloch/acng/html/distinstructions.html#hints-fccore
marmarek (Member) commented May 5, 2016
Yes, I've seen this. But in practice it works. The only problem is dynamic mirror selection - it may make caching difficult (when a different mirror is selected each time).
adrelanos (Member) commented May 5, 2016
Marek Marczykowski-Górecki:
> Actually apt-cacher works for Fedora too :)
> Maybe we can simply use it instead of tinyproxy as update proxy?
Can it also let through non-apt traffic? Specifically I am wondering about tb-updater.
marmarek (Member) commented May 5, 2016
> Can it also let through non-apt traffic? Specifically I am wondering about tb-updater.
That's an interesting question - if you have an apt-cacher-ng instance handy, it's worth a try. Anyway, it has quite a flexible configuration, so it's probably doable.
adrelanos (Member) commented May 6, 2016
I don't think there is a generic solution that works well enough for both Debian and Fedora based at the same time. Why do we need a generic all-in-one solution anyhow? Here is what I suggest:
- Let's keep tinyproxy as is. As fallback. And for misc traffic. (tb-updater, user custom stuff and what not.)
- Let's install apt-cacher-ng and the fedora caching proxy by default in the UpdateVM.
- Let's configure Debian based VMs to use apt-cacher-ng.
- Let's configure Fedora based VMs to use the fedora caching proxy.
What do you think?
>> Can it also let through non-apt traffic? Specifically I am wondering about tb-updater.
> That's an interesting question - if you have an apt-cacher-ng instance handy, it's worth a try. Anyway it has quite flexible configuration, so probably doable.
I've read all the config, and tried; it does not seem possible, but never mind, as per my above suggestion.
marmarek (Member) commented May 7, 2016
It will require more resources (memory), somewhat wasted when one uses for example only Debian templates. But maybe it is possible to activate those services on demand (socket activation comes to my mind). It would be even easier for a qrexec-based updates proxy.
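For illustration, socket activation might look roughly like the following, a minimal sketch assuming a hypothetical apt-cacher-ng.socket unit (3142 is only apt-cacher-ng's customary default port, and the daemon would have to accept a socket passed in by systemd):

    # apt-cacher-ng.socket (hypothetical): systemd starts the service on the first connection
    [Unit]
    Description=Socket activation for the update cache

    [Socket]
    ListenStream=3142

    [Install]
    WantedBy=sockets.target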
taradiddles commented May 7, 2016
> Why do we need a generic all at once solution anyhow
I'm all for a 100% caching success rate with a specific mechanism for each distro, but do Qubes developers/contributors have time to develop/support that feature?
If yes, that's cool; otherwise, a solution like squid would be easy to implement, and since it's distro agnostic it would help not only the supported distros (fedora, debian, arch?), but also other distributions that users install in HVMs (even windows, then). The problems/unknowns with squid are the cache miss rate, the cache disk usage needed to minimize those misses, and the use of different mirrors by yum (although I find that I usually connect to the same one).
qjoo commented May 7, 2016
I'm using a polipo proxy => tor to cache updates. I also modified the repo configuration to use one specific update server instead of dynamically selecting it. I'm planning to document my setup and will post a link here.
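A setup along those lines might look roughly like this polipo configuration (a sketch under stated assumptions: the Tor SocksPort and cache path below are common defaults, not necessarily the values from the setup described above):

    # /etc/polipo/config (illustrative)
    proxyAddress = 127.0.0.1
    proxyPort = 8123
    socksParentProxy = "127.0.0.1:9050"   # forward upstream fetches through Tor
    socksProxyType = socks5
    diskCacheRoot = "/var/cache/polipo/"  # keep fetched packages on disk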
kalkin (Member) commented May 7, 2016
Just wanted to throw in https://github.com/yevmel/squid-rpm-cache. I planned to set up a dedicated squid VM and use the above mentioned config/plugin to cache rpms, but never found the time for it.
> The problems/unknowns with squid are the cache miss rate, the cache disk usage needed to minimize those misses, and the use of different mirrors by yum (although I find that I usually connect to the same one).
Currently I just use my NAS, which has a "normal" squid running as a caching proxy. I have an ansible script which generates my templates. In the templates I replaced the metalink parameter with a baseurl pointing to the nearest Fedora mirror, in /etc/yum.repos.d/fedora.repo. In /etc/yum.conf I set the proxy option to my NAS proxy, and allowed TemplateVMs to connect to it.
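That kind of template tweak could look roughly like this (a sketch only; the mirror URL and proxy address are placeholders, not the values from the setup described above):

    # /etc/yum.repos.d/fedora.repo: pin one mirror instead of using the metalink
    [fedora]
    name=Fedora $releasever - $basearch
    #metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch
    baseurl=http://mirror.example.org/fedora/linux/releases/$releasever/Everything/$basearch/os/
    enabled=1
    gpgcheck=1

    # /etc/yum.conf: send downloads through the caching proxy on the NAS
    proxy=http://nas.example.lan:3128/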
marmarek (Member) commented May 7, 2016
My experience with squid is horrible in terms of resources (RAM, I/O usage) for small setups. Looks like overkill for just downloading updates for a few templates from time to time.
adrelanos (Member) commented May 9, 2016
I don't like saying this, but we should also consider making this an additional, non-default option, or even wontfix. I like apt-cacher-ng very much and use it myself. However, introducing it by default into Qubes would lead to new issues, with more users having trouble upgrading due to the added technical complexity. There are corner cases where apt-cacher-ng introduces new issues, such as Hash Sum mismatch errors during apt-get update.
taradiddles commented May 10, 2016
FWIW I have squid installed on an embedded router (RB450G) for a 25+ people office and it's been running for literally ages without any problem. There's strict bandwidth control (delay pools), which is usually the biggest offender in terms of resources, but squid's memory usage has constantly been < 20 MB and its highest CPU usage < 6%. Granted, the office's uplink speed is low - in the megabits/s range - but the resources available to the UpdateVM are in another league compared to the embedded stuff, and the setup - caching only - is not fancy.
tl;dr: squid is not as bad as it used to be years ago.
The issues you mention reinforce my concern that it would be too time-consuming for Qubes devs to support distro-specific solutions. A simple generic solution, even if not optimal, is still better than nothing at all, rather than "wontfix".
Plus, users kalkin and qjoo seem to have working solutions, why not try those?
Just my 2c - not pushing for anything, you guys are doing great work!
andrewdavidwong (Member) commented May 10, 2016
At the very least, we should provide some documentation (or suggestions or pointers in the documentation) regarding something like @taradiddles's router solution. Qubes users are more likely than the average Linux user to have multiple machines (in this case, virtual) downloading exactly the same updates.
Rudd-O commented May 10, 2016
Looks like what you want is Squid with an adaptive disk cache size (for storing packages in the volatile /var/cache/squid directory), configured with no memory cache. Since the config file can be in a different place and the unit file can be overridden to point at a Qubes-specific config file, it may work very well for this purpose. Squid is goddamn good these days, and it supports regex-based filters (plus you can block methods other than GET, and it can proxy-cache FTP sites).
OTOH, it's always a security-footprint issue to run a larger codebase for a cache. Also, Squid caching can be ineffective if multiple VMs download files from different mirrors (remember that the decision of which mirror to use is left practically at random to the VM calling on the Squid proxy to do its job).
For those reasons, it may be wise to investigate solutions that do a better job of proxy caching, using a content-addressable store or matching file names.
Rudd-O commented May 10, 2016
Perhaps a custom Go-based (to prevent security vulns) cache that can listen for requests using the net/http package, and proxy them to the VMs? This has the potential to be a very efficient solution too, as a Go program would have a minuscule memory footprint.
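As a rough illustration of the idea, here is a minimal sketch of such a cache built on net/http (illustrative only, not a proposed implementation: it keeps whole responses in memory, handles only plain-HTTP proxy-style GET requests, and the listen address simply reuses the updates-proxy port mentioned elsewhere in this thread):

    package main

    import (
        "io"
        "log"
        "net/http"
        "sync"
    )

    // cache is a trivial in-memory store; a real updates cache would persist
    // packages to disk and key them by file name rather than full URL.
    type cache struct {
        mu      sync.Mutex
        entries map[string][]byte
    }

    func (c *cache) get(key string) ([]byte, bool) {
        c.mu.Lock()
        defer c.mu.Unlock()
        b, ok := c.entries[key]
        return b, ok
    }

    func (c *cache) put(key string, b []byte) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.entries[key] = b
    }

    func main() {
        store := &cache{entries: make(map[string][]byte)}

        handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            if r.Method != http.MethodGet {
                http.Error(w, "only GET is supported", http.StatusMethodNotAllowed)
                return
            }
            // For proxy-style requests the client sends the absolute upstream URL.
            key := r.URL.String()
            if b, ok := store.get(key); ok {
                w.Write(b) // cache hit
                return
            }
            resp, err := http.Get(key) // cache miss: fetch from the mirror
            if err != nil {
                http.Error(w, err.Error(), http.StatusBadGateway)
                return
            }
            defer resp.Body.Close()
            b, err := io.ReadAll(resp.Body)
            if err != nil {
                http.Error(w, err.Error(), http.StatusBadGateway)
                return
            }
            store.put(key, b)
            w.Write(b)
        })

        log.Fatal(http.ListenAndServe("127.0.0.1:8082", handler))
    }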
@Rudd-O Have a look at this https://github.com/mojaves/yumreproxyd
Rudd-O commented May 12, 2016
Looking. Note we need something like that for Debian as well.
Rudd-O commented May 12, 2016
The code is not idiomatic Go and there are some warts there that I would fix before including it anywhere. Just as a small example, at https://github.com/mojaves/yumreproxyd/blob/master/yumreproxy/yumreproxy.go#L33 you can see he is using a nil value as a sort of bool. That is not correct -- the return type should be (bool, struct).
Rudd-O commented May 12, 2016
https://github.com/mojaves/yumreproxyd/blob/master/yumreproxy/yumreproxy.go#L73 <- also problematic. Leaving path sanitization as a TODO is not what you want in secure software.
But the BIGGEST problem is that the program appears not to care about concurrency at all. Saving into the cache and serving from the cache can race, and no locking is performed, nor are channels being used there. Big fat red flag. The right way to do that is by communicating with the Cache aspect of the application through channels -- send a request to the Cache and await the response; if the file is not available, download it, send it to the Cache for storage, and await the response.
Also, all content types returned are application/rpm. That's wrong in many cases.
BUT, that only means the project can be extended or rewritten, and it should not be very difficult to do so.
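For illustration, the channel-based pattern being described could look roughly like this minimal sketch (names and keys are illustrative, and it is not code from yumreproxyd):

    package main

    import "fmt"

    // lookup asks the cache goroutine for a key; a nil reply means "not cached".
    type lookup struct {
        key   string
        reply chan []byte
    }

    // store hands a downloaded file to the cache goroutine for safekeeping.
    type store struct {
        key  string
        body []byte
    }

    // runCache owns the map, so all access is serialized through channels
    // and no explicit locking is needed.
    func runCache(lookups <-chan lookup, stores <-chan store) {
        entries := make(map[string][]byte)
        for {
            select {
            case l := <-lookups:
                l.reply <- entries[l.key]
            case s := <-stores:
                entries[s.key] = s.body
            }
        }
    }

    func main() {
        lookups := make(chan lookup)
        stores := make(chan store)
        go runCache(lookups, stores)

        // A request handler would first ask the cache, and only download on a miss.
        stores <- store{key: "/Packages/example.rpm", body: []byte("rpm bytes")}

        reply := make(chan []byte)
        lookups <- lookup{key: "/Packages/example.rpm", reply: reply}
        fmt.Printf("served %d cached bytes\n", len(<-reply))
    }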
rustybird commented Jun 6, 2016
I just uploaded the Squid-based https://github.com/rustybird/qubes-updates-cache (posted to qubes-devel too)
rustybird commented Jun 8, 2016
The latest commit (-57 lines, woo) reworks qubes-updates-cache to act as a drop-in replacement for qubes-updates-proxy. No changes to the client templates are needed at all now.
marmarek (Member) commented Jun 8, 2016
How much memory does it use? I.e. is it a good idea to have it instead of tinyproxy by default, or give the user a choice?
taradiddles commented Jun 9, 2016
FWIW I had a similar setup running after my last post, the difference being that I used/tweaked the store_id program mentioned by @kalkin in an earlier post [1]. But there were many cache misses; a quick look at the log showed that different mirrors would send different MIME types for the same rpm (or repo) file, so that might be the culprit. Other tasks piled up and I didn't have time to work on it.
@marmarek: after boot, memory = ~30 MB (as far as you can trust ps). But I guess the question is more about long-term use, after squid has cached many objects. Rusty used 'cache_mem=0', so there shouldn't be a huge difference in memory usage, but he might have more statistics.
@rustybird: tinyproxy's configuration is quite locked down; maybe it would be a good idea to do the same with squid's? I'm also not sure it is a good idea to mess with the cache ids for files other than rpm/repo (and deb/...).
For instance, stuff like:
acl localnet src 10.137.0.0/16
acl http_ports port 80
acl SSL_ports port 443
acl CONNECT method CONNECT
http_access deny to_localhost
http_access deny CONNECT !SSL_ports
http_access allow http_ports
http_access allow SSL_ports
http_access deny all
# that one was from https://github.com/yevmel/squid-rpm-cache
# have to understand why that's changed
#refresh_pattern . 0 20% 4320
# 3 month 12 month
refresh_pattern . 129600 33% 525600
# cache only specific files types
acl rpm_files urlpath_regex \/Packages\/.*\.rpm
acl repodata_files urlpath_regex \/repodata\/.*\.(|sqlite\.xz|xml(\.[xg]z)?)
cache allow rpm_files
cache allow repodata_files
cache deny all
rustybird commented Jun 9, 2016
> How much memory does it use?
With DefaultMemoryAccounting=yes in /etc/systemd/system.conf, the following values were observed in /sys/fs/cgroup/memory/system.slice/qubes-updates-cache.service/memory.memsw.max_usage_in_bytes:
- Squid first started, created new cache dir = 41 MiB
- Upgraded a new clone of qubes-template-fedora-23-3.0.4-201601120722 (~450 packages) = 202 MiB
- Upgraded another new clone of the same template, ~100% cache hits = still 202 MiB
- Squid restarted, using the filled cache dir = 16 MiB
That's already with the latest commit, which sets memory_pools off in the Squid config to allow the system to reclaim unused memory. But apparently Squid doesn't free() aggressively enough yet for our purposes.
> But there were many cache misses; a quick look at the log showed that different mirrors would send different MIME types for the same rpm (or repo) file, so that might be the culprit.
Yes, that seems to happen sometimes, probably because .drpm is a relatively young file extension. Is it possible to make Squid ignore the MIME type header?
> tinyproxy's configuration is quite locked down; maybe it would be a good idea to do the same with squid's?
Definitely. IIRC Whonix also wants some sort of magic string from the proxy port? Paging @adrelanos :)
> I'm also not sure it is a good idea to mess with the cache ids for files other than rpm/repo (and deb/...).
So far I haven't seen the regexes in https://github.com/rustybird/qubes-updates-cache/blob/master/usr/lib/qubes/updates-cache-dedup#L6-L7 match anything other than metadata and packages. Files aren't listed explicitly because that's such a hassle to maintain for all compression formats and package types, e.g. Debian source packages didn't work with qubes-updates-proxy when tinyproxy still used filters.
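For context, the deduplication relies on Squid's Store-ID interface: a store_id_program helper maps each request URL to a mirror-independent cache key, so the same package fetched from different mirrors hits the same cache object. A rough sketch of such a helper (illustrative only, not the actual updates-cache-dedup script, and it ignores helper concurrency/channel IDs):

    #!/bin/bash
    # Read one URL per line from Squid; answer with a mirror-independent key
    # for rpm downloads, or ERR to keep the original URL as the cache key.
    while read -r url rest; do
        case "$url" in
            *.rpm)
                echo "OK store-id=http://rpm.dedup.invalid/${url##*/}" ;;
            *)
                echo "ERR" ;;
        esac
    done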
rustybird commented Jun 9, 2016
> IIRC Whonix also wants some sort of magic string from the proxy port? Paging @adrelanos :)
Sorry, never mind, I literally found something with grep -r magic /etc/tinyproxy. Will check that out.
adrelanos (Member) commented Jun 9, 2016
Ivan:
> tinyproxy's configuration is quite locked down; maybe it would be a good idea to do the same with squid's?
Tinyproxy's configuration was relaxed some time ago. There was a ticket and discussion. In short: locking down tinyproxy does not improve actual security. Users who explicitly configure their applications to use the updates proxy should be free to do so.
rustybird commented Jun 9, 2016
> Tinyproxy's configuration was relaxed some time ago. There was a ticket and discussion. In short: locking down tinyproxy does not improve actual security. Users who explicitly configure their applications to use the updates proxy should be free to do so.
There's the "Squid Manager" though, which I've restricted access to in commit rustybird/qubes-updates-cache@0da1dcd -- along with a basic sanity check that requests are coming from 10.137.*.
Also, a paragraph on how to use qubes-updates-cache with Whonix at the moment: https://github.com/rustybird/qubes-updates-cache/blob/3b9d5e153f89b551e9b38f82928cbc7c9c2f7ba3/README#L32-L35 (works nicely BTW, tons of cache hits across Debian / Whonix GW / Whonix WS)
adrelanos (Member) commented Jun 10, 2016
I have just now finished documenting the Qubes-Whonix torified updates proxy:
https://www.whonix.org/wiki/Dev/Qubes#Torified_Updates_Proxy
In essence, Whonix TemplateVMs get the output of UWT_DEV_PASSTHROUGH="1" curl --silent --connect-timeout 10 "http://10.137.255.254:8082/" and grep it for <meta name="application-name" content="tor proxy"/>. If that matches, the test is considered successful.
Of course qubes-updates-cache's squid should only include the magic string if it is actually torified, i.e. running inside sys-whonix.
Do you know if it is possible to conditionally inject this magic string? If not, we need to modify the Qubes-Whonix torified updates check to do something supported by squid.
I am wondering if any whonix-gw-firewall modifications will be required. Current tinyproxy rules:
https://github.com/Whonix/whonix-gw-firewall/blob/724a0fc0546c83555a008cd1b7b03c048519121a/usr/bin/whonix_firewall#L310-L328
Does squid support outgoing proxy settings? Can squid be configured to use a Tor SocksPort?
marmarek (Member) commented Jun 10, 2016
> Do you know if it is possible to conditionally inject this magic string? If not, we need to modify the Qubes-Whonix torified updates check to do something supported by squid.
AFAIR in the case of tinyproxy it is placed in the default error page. Squid should allow the same.
rustybird commented Jun 10, 2016
> Does squid support outgoing proxy settings? Can squid be configured to use a Tor SocksPort?
Haven't found anything about outgoing HTTP proxies. Semi-official SOCKS support can be compiled in via libsocks, which Debian doesn't seem to do, but ...
> I am wondering if any whonix-gw-firewall modifications will be required. Current tinyproxy rules:
> https://github.com/Whonix/whonix-gw-firewall/blob/724a0fc0546c83555a008cd1b7b03c048519121a/usr/bin/whonix_firewall#L310-L328
... I think you'd only need to change --uid-owner tinyproxy to --uid-owner squid.
> AFAIR in the case of tinyproxy it is placed in the default error page. Squid should allow the same.
Yes, the relevant file to modify is /usr/share/squid-langpack/templates/ERR_INVALID_URL from the Debian package squid-langpack.
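As a rough sketch of what that could mean in practice (illustrative only: whether the Whonix check actually hits ERR_INVALID_URL, and how to make this conditional on running inside sys-whonix, would still need to be worked out):

    # Append the Whonix check string to Squid's default error template, so the
    # test against http://10.137.255.254:8082/ finds the expected meta tag.
    echo '<meta name="application-name" content="tor proxy"/>' \
        >> /usr/share/squid-langpack/templates/ERR_INVALID_URL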
rustybird commented Jun 10, 2016
All the security implications of using qubes-updates-cache I could think of:
https://github.com/rustybird/qubes-updates-cache/blob/master/README#L8
Edit: Hmm, regarding (1) it's really the same with qubes-updates-proxy. Not sure why I always (wrongly) thought of that as circuit-isolated per client VM...
rustybird commented Jun 19, 2016
Some news:
- Made qubes-updates-cache work on Debian, incl. Whonix gateways, pending PRs Whonix/whonix-firewall#1 and Whonix/qubes-whonix#2
- Ordered the systemd service after bind-dirs.sh/rc.local and made it simpler and more reliable, which should fix the TCP_SWAPFAIL_MISS cache corruption
- Switched to an asynchronous cache backend, reducing memory consumption to ~130 MiB after one Fedora template upgrade. Still have to find out why adding new objects leaks memory at all
- Rewrote the URL rewriting script in pure bash with no child processes. In addition to dedup it now also transparently upgrades some hosts from HTTP to HTTPS: {ftp,yum,deb}.qubes-os.org, www.whonix.org, deb.torproject.org, dl.google.com, mirrors.kernel.org
rustybird referenced this issue Jun 19, 2016: HTTPS versions of yum.qubes-os.org and deb.qubes-os.org broken #2082 (closed)
adrelanos (Member) commented Jul 3, 2016
> Made qubes-updates-cache work on Debian, incl. Whonix gateways, pending PRs Whonix/whonix-firewall#1 and Whonix/qubes-whonix#2
This is done btw.
qjoo commented Jul 3, 2016
I use polipo as a caching proxy between template VMs and the Tor SOCKS port. It has SOCKS support and might be more lightweight than squid?
rustybird commented Jul 4, 2016
>> Made qubes-updates-cache work on Debian, incl. Whonix gateways, pending PRs Whonix/whonix-firewall#1 and Whonix/qubes-whonix#2
> This is done btw.
Looks like whonix-gw-firewall needs a version bump, and qubes-whonix 5.3-1 hasn't been uploaded yet?
> I use polipo as a caching proxy between template VMs and the Tor SOCKS port. It has SOCKS support and might be more lightweight than squid?
It doesn't seem to support deduplication or (transparent) rewriting of URLs :(
adrelanos (Member) commented Jul 5, 2016
Rusty Bird:
>>> Made qubes-updates-cache work on Debian, incl. Whonix gateways, pending PRs Whonix/whonix-firewall#1 and Whonix/qubes-whonix#2
>> This is done btw.
> Looks like whonix-gw-firewall needs a version bump, and qubes-whonix 5.3-1 hasn't been uploaded yet?
Yes. The usual ETA to reach Whonix stable users is the next release, Whonix 14.
adrelanos (Member) commented Jul 26, 2016
- I'll release a qubes-whonix package with your qubes-updates-cache changes soon. (Currently in the developers repository; contains some other fixes.)
- Shouldn't writing to /etc, i.e. /etc/systemd/system/multi-user.target.wants/qubes-updates-cache.service, better be avoided, and the standard distribution default systemd folder /lib/systemd/system be used?
- Where is the code for qubes-updates-cache.service?
marmarek (Member) commented Jul 26, 2016
> Where is the code for qubes-updates-cache.service?
> Shouldn't writing to /etc, i.e. /etc/systemd/system/multi-user.target.wants/qubes-updates-cache.service, better be avoided, and the standard distribution default systemd folder /lib/systemd/system be used?
The standard way is to create such a symlink in a post-installation script (preferably using presets). But since the service is controlled by qvm-service, it may indeed be a good idea to provide the symlink in the package. In such a case it should live in /lib/systemd/system and be a relative one.
rustybird commented Jul 26, 2016
Short update:
- The TCP_SWAPFAIL_MISS cache corruption still happens. It looks like Squid 3 just cannot deal with unclean shutdowns.
- Squid 4 fixes the memory leak! But the latest beta (4.0.12) is still too crashy to use.
> But since the service is controlled by qvm-service, it may indeed be a good idea to provide the symlink in the package. In such a case it should live in /lib/systemd/system and be a relative one.
It's currently created in /etc, as if qubes-updates-cache.service were listed in https://github.com/QubesOS/qubes-core-agent-linux/blob/master/vm-systemd/75-qubes-vm.preset just like qubes-updates-proxy.service.
But I'll have to move at least the actual qubes-updates-cache.service to $(pkg-config --variable=systemdsystemunitdir systemd) anyway; installing it to /usr/lib/systemd/system is wrong for Debian. Then the symlink could be moved there, too.
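For illustration, the packaging step being discussed might look roughly like this in an install rule (a sketch only: the unit name comes from this thread, but the exact packaging layout is hypothetical):

    # Resolve the distro's systemd unit directory (/lib/systemd/system on Debian,
    # /usr/lib/systemd/system on Fedora) and ship a relative enablement symlink.
    unitdir=$(pkg-config --variable=systemdsystemunitdir systemd)
    install -D -m 0644 qubes-updates-cache.service "$DESTDIR$unitdir/qubes-updates-cache.service"
    mkdir -p "$DESTDIR$unitdir/multi-user.target.wants"
    ln -s ../qubes-updates-cache.service "$DESTDIR$unitdir/multi-user.target.wants/qubes-updates-cache.service"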
Rudd-O commented Aug 4, 2016
I would really like to urge folks towards developing a custom cache solution using the very mature Go libraries that exist for HTTP and proxying. It would be memory-safe (no pointer bullshit), it would be far smaller than trying to shoehorn Squid into this role, and it would be trivial to provide a proper solution that caches requested file names based on content.
adrelanos (Member) commented Aug 14, 2016
> Looks like whonix-gw-firewall needs a version bump, and qubes-whonix 5.3-1 hasn't been uploaded yet?
> I'll release a qubes-whonix package with your qubes-updates-cache changes soon. (Currently in the developers repository; contains some other fixes.)
It's been in the Whonix jessie (stable) repository for a few days now. (And if you reinstall Qubes-Whonix 13 from the qubes-templates-community repository, it is also included.)
andrewdavidwong added this to the Release 4.0 milestone Dec 24, 2016
rustybird commented Mar 18, 2017
The latest qubes-updates-cache has many new rewriting rules that transparently upgrade repository URLs to HTTPS, and optionally to .onion (#2576).
Current coverage:
| Repository | HTTPS | HTTP .onion |
|---|---|---|
| yum.Qubes | upgrade | upgrade to v3 |
| deb.Qubes | upgrade | upgrade to v3 |
| Whonix | upgrade | upgrade to v3 |
| Debian | upgrade | upgrade to v2 |
| Debian Security | upgrade | upgrade to v2 |
| Fedora | upgrade | - |
| RPM Fusion | upgrade | - |
| Tor Project | upgrade | upgrade to v2 |
| | upgrade | - |
| Fedora-Cisco | uncached | - |
| Adobe | - | - |
andrewdavidwong commented May 5, 2016
It's common for users to have multiple TemplateVMs that download many of the same packages when being individually updated. Caching these packages (e.g., in the UpdateVM) would allow us to download a package only once, then make it available to all the TemplateVMs which need it (and perhaps even to dom0), thereby saving bandwidth.
This has come up on the mailing lists several times over the years:
Here's a blog post about setting up a squid caching proxy for DNF updates on baremetal Fedora: