This repository has been archived by the owner on Jul 4, 2023. It is now read-only.

Apache urls and mirrors #37945

Closed
vszakats opened this issue Mar 21, 2015 · 9 comments

@vszakats
Contributor

After recent updates SSL/TLS is automatically enforced for both Apache homepages and urls/mirrors.
So far so good. But the url is often set to a mirror selection page like the one below:

url "https://www.apache.org/dyn/closer.cgi?path=product/.../product-1.0.0.tar.gz"

The problem is that every mirror option is plaintext HTTP (some FTP even), so the actual download will not be SSL/TLS protected.

Luckily, Apache does have an SSL/TLS download location, which is sometimes manually added as a mirror:

mirror "https://archive.apache.org/dist/product/.../product-1.0.0.tar.gz"

Because this mirror is available for all Apache hosted packages, it would be nice if CurlApacheMirrorDownloadStrategy could automatically consider it as a secondary download location (aka mirror), even though it is missing [1] from the Apache mirror JSON file downloaded from the selection page.

Something like this could be added to Library/Homebrew/download_strategy.rb / CurlApacheMirrorDownloadStrategy / _fetch:

@mirror = 'https://archive.apache.org/dist/' + mirrors.fetch('path_info')

Then, an audit rule could be added to drop the explicit 'https://archive.apache.org/dist/*' (and other) mirror lines from Apache formulae, and one to enforce the official mirror selection page as the url.

Does that make any sense?

[1] Though, two Apache hosted backup entries are listed instead in the JSON, both of which support SSL/TLS but fail to match with the site certificate due to the nested subdomains, so they can only be accessed in plaintext: http://www.eu.apache.org/dist/, http://www.us.apache.org/dist/.
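The audit rule suggested above could look roughly like the following. This is only a minimal sketch, not Homebrew's actual audit code: the method name `audit_apache_urls`, the two regex constants, and the message wording are all hypothetical, and the sample URLs use a made-up `product` path.

```ruby
# Hypothetical patterns; neither constant exists in Homebrew.
APACHE_CLOSER  = %r{\Ahttps://www\.apache\.org/dyn/closer\.cgi\?path=}
APACHE_ARCHIVE = %r{\Ahttps?://archive\.apache\.org/dist/}

# Returns a list of problem strings for a formula's url and mirror list.
def audit_apache_urls(url, mirrors)
  problems = []

  # Enforce the official mirror selection page as the primary url.
  if url.match?(APACHE_ARCHIVE)
    problems << "url should be the Apache mirror selection page, not #{url}"
  end

  # Explicit archive mirrors are redundant once the download strategy
  # adds archive.apache.org implicitly.
  if url.match?(APACHE_CLOSER)
    mirrors.each do |mirror|
      next unless mirror.match?(APACHE_ARCHIVE)
      problems << "redundant mirror: #{mirror}"
    end
  end

  problems
end

# Example with made-up paths:
puts audit_apache_urls(
  "https://www.apache.org/dyn/closer.cgi?path=product/product-1.0.0.tar.gz",
  ["https://archive.apache.org/dist/product/product-1.0.0.tar.gz"]
)
```

The check on the mirrors only fires when the url is already the selection page, since the implicit archive mirror would only be added in that case.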

@MikeMcQuaid
Member

I think I understand here but it would be great if you could try and make a PR (or @bfontaine if he is interested).

@DomT4
Member

DomT4 commented Mar 31, 2015

Do the current mirror url sets geolocate? I presumed that was why we're using the geolocation url rather than the secured mirror: for download speed, ease of use, etc. for users outside of Apache's server locations. Not overly attached to the geolocation though if the archive works as well for everyone.

@vszakats
Contributor Author

The archive.apache.org address is served from multiple IP addresses, but I wouldn't think it has much geolocation-based routing, and no HTTP redirection for sure. I think you're right with your assumptions. In my opening post I wasn't proposing to switch the primary urls to the archive.apache.org domain (even though from a pure security standpoint this would be the best), but instead to automatically use it as an implicit mirror in case the official url is inaccessible. If speed is OK (it is from Europe) and the Apache Software Foundation is OK with it as well, it'd indeed be much better and simpler to use only archive.apache.org.

@bfontaine
Contributor

I’m pretty busy right now and won’t be able to look at that until this weekend, so feel free to make a PR in the meantime.

@jacknagel
Contributor

Like this?

diff --git a/Library/Homebrew/download_strategy.rb b/Library/Homebrew/download_strategy.rb
index 72a64bd..589147c 100644
--- a/Library/Homebrew/download_strategy.rb
+++ b/Library/Homebrew/download_strategy.rb
@@ -343,7 +343,9 @@ def _fetch
     @tried_apache_mirror = true

     mirrors = Utils::JSON.load(apache_mirrors)
-    @url = mirrors.fetch('preferred') + mirrors.fetch('path_info')
+    path_info = mirrors.fetch("path_info")
+    @url = mirrors.fetch('preferred') + path_info
+    @mirrors |= %W[https://archive.apache.org/dist/#{path_info}]

     ohai "Best Mirror #{@url}"
     super
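For anyone who wants to try the idea outside Homebrew, here is a minimal standalone sketch of the same logic. It assumes the mirror selection page returns JSON with `preferred` and `path_info` keys, as the diff above implies; the method name `resolve_apache_urls`, the constant `ARCHIVE_BASE`, and the sample mirror host and path are made up for illustration.

```ruby
require "json"

# Apache's SSL/TLS download location, used as an implicit fallback mirror.
ARCHIVE_BASE = "https://archive.apache.org/dist/"

# Hypothetical helper: takes the JSON text from the mirror selection page
# and any explicitly listed mirrors, and returns [primary_url, mirrors].
def resolve_apache_urls(closer_json, mirrors = [])
  data = JSON.parse(closer_json)
  path_info = data.fetch("path_info")
  url = data.fetch("preferred") + path_info
  # Array#| is a set union, so an archive mirror that the formula
  # already lists explicitly is not added a second time.
  [url, mirrors | [ARCHIVE_BASE + path_info]]
end

# Made-up sample response; a real one would come from the selection page.
sample_json = <<~EOS
  {"preferred": "http://mirror.example.org/apache/",
   "path_info": "product/product-1.0.0.tar.gz"}
EOS

url, mirrors = resolve_apache_urls(sample_json)
puts "Best Mirror #{url}"
puts "Fallback mirrors: #{mirrors.join(', ')}"
```

The union operator mirrors the `@mirrors |=` in the diff, which keeps formulae that still carry an explicit archive.apache.org mirror line from trying the same location twice.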

@vszakats
Contributor Author

vszakats commented Apr 7, 2015

@jacknagel Yes, it's exactly what I had in mind. Thanks a lot.

@DomT4
Member

DomT4 commented Apr 7, 2015

> it's exactly what I had in mind. Thanks a lot.

Apologies that I misunderstood you further up. This looks like a neat idea.

I agree that it'd be nice at some point to use links that are pure SSL/TLS rather than SSL/TLS > plaintext > plaintext download, but I presume Apache nudge people towards the geolocating mirrors rather than the archive deliberately. It'd be cool to get a nod from them beforehand if we ever decided to switch the two over permanently.

@vszakats
Contributor Author

vszakats commented Apr 7, 2015

No probs at all @DomT4.

This Apache page clearly states their stance on the issue: https://www.apache.org/dev/mirrors.html

Specifically [as of 2015-04-07]:

  • The mirrors should be used by default.
  • www.apache.org/dist/ and www.eu.apache.org/dist/ may be used only as fallback/backup mirrors.

There is also useful information about the distinction between https://www.apache.org/dist/ and https://archive.apache.org/dist/.

@vszakats
Contributor Author

vszakats commented Apr 9, 2015

Having created a PR out of it (thanks, @jacknagel), I'm closing this one.

@vszakats vszakats closed this as completed Apr 9, 2015
@Homebrew Homebrew locked and limited conversation to collaborators Jul 10, 2016