More CDNs #270

Open
kurtextrem opened this Issue · 12 comments

3 participants

@kurtextrem

See http://www.cdnperf.com/. I'll add more as I find them.

@gorhill

Yes, but I need to look at the requests. If a CDN serves things that are unlikely to be pulled from anywhere else, then we can't cache them: the cache would quickly fill up and evict stuff that may be more useful, like the resources shared by many sites.

I have observed that a CDN will often serve stuff specific to a single site, so we really need to ensure that the usefulness of local mirroring is not crippled by such requests.

@kurtextrem

I'd say you should definitely add the Microsoft CDNs. They're as commonly used as Google's or jQuery's own. Blocking requests to a Russian search engine (Yandex) won't hurt either :)

@gorhill

I will share insights about how I proceeded to include the ones already there. I rely completely on the reference benchmark.

So far I have extended the benchmark to include a wider variety of sites, and eliminated redundancies.

Once I have the results (spreadsheet of the latest run, from the pre-revision benchmark though), using out-of-the-box filter lists, I look at the uBlock columns and see which 3rd parties are commonly pulled by various 1st parties. Since the ordering is 3rd-party-then-1st-party, it's easy to spot ubiquitous 3rd parties.

Now this covers a lot of high-traffic top sites, and I expect large data-mining 3rd parties to show up in there. Once I spot what appears to be a high-traffic 3rd party, I visit the various sites depending on it and look at the requests, to extract, if any, the requests which are commonly made by the 1st-party sites. I can't include the site-specific ones; that would accomplish nothing. Another mechanism will be needed to address those (referrer neutering or something).

So to reiterate: I can't include anything without closely looking at the details of the requests, or this will "break" local mirroring.
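
For illustration only, the "spot ubiquitous 3rd parties" step could be sketched roughly as below. This is not the actual benchmark tooling: the function name `rank_third_parties`, the `min_first_parties` threshold, and the sample rows are all assumptions, and the input is assumed to be a list of (3rd-party, 1st-party) hostname pairs exported from the results spreadsheet.

```python
from collections import defaultdict

def rank_third_parties(rows, min_first_parties=2):
    """Rank 3rd-party hosts by how many distinct 1st-party sites pull from them.

    rows: iterable of (third_party, first_party) hostname pairs
          (hypothetical export format, not the real spreadsheet layout).
    """
    seen = defaultdict(set)
    for third_party, first_party in rows:
        seen[third_party].add(first_party)
    # Most widely shared first; ties broken alphabetically for stable output.
    ranked = sorted(seen.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(host, len(sites)) for host, sites in ranked
            if len(sites) >= min_first_parties]

# Made-up sample data, for illustration only.
rows = [
    ("ajax.googleapis.com", "example-a.com"),
    ("ajax.googleapis.com", "example-b.com"),
    ("ajax.googleapis.com", "example-c.com"),
    ("platform.twitter.com", "example-a.com"),
    ("platform.twitter.com", "example-c.com"),
    ("single-site-cdn.example", "example-b.com"),
]
print(rank_third_parties(rows))
```

A host that shows up high in such a ranking is only a candidate: the individual requests still have to be inspected by hand before anything is mirrored.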

@kurtextrem

@gorhill Yeah, that's true. However, Microsoft is one of the popular CDNs. You should add it. :) Ah, and you spoke of a 5 MB limit. You could introduce the optional permission "unlimitedStorage" and ask the user to grant it when activating the experimental setting.

@gorhill

The permission unlimitedStorage is already in effect. Whatever is stored needs to be loaded at launch, and a large amount of data will translate into longer load time and CPU churning. So I will reiterate: development through benchmarking. Mirroring is a curse if whatever is stored is used only once then flushed.

Your suggestion to "add the Microsoft CDNs" is the complete opposite of my methodology: What exactly is "Microsoft CDNs"? What sites pull from it? What do the net requests look like? Immutable and shared across different sites, or no clear pattern and very specific to the client sites?

I didn't see Microsoft in my benchmark, except for one single instance of aspectcdn.com as a 3rd party of msn.com. Nothing else. It is not very useful to mirror a CDN which caters only to its own related entities. As said, make a detailed case of what and why. I will not just throw in stuff based solely on "add the Microsoft CDNs"; this will be based on hard data.

ajax.googleapis.com, fonts.google.api, cdnjs.cloudflare.com et al. were included based on looking closely at the net requests following the results of the benchmark. platform.twitter.com/widgets.js was added because this one single resource is pulled from a large number of unrelated sites.

I will stick to this methodology, or else local mirroring can easily turn into a liability if it is not carefully fed good data.

@kurtextrem

My bad, I hadn't looked at the manifest file. I meant Microsoft's jQuery CDN. I can't prove its popularity, but the jQuery team always includes it in their changelog posts. I assume some pages are using it.

@gorhill

As said, I need specific sites using it, so that I can inspect how the net requests are crafted. I can't throw guesswork in there: too much potential for things going wrong. The CDN could require that a query parameter in the URL changes with every request (not uncommon, e.g. &t=), and mirroring these would be a complete waste. I still don't even know the host name to expect for "Microsoft's jQuery CDN".
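
To make the concern about changing query parameters concrete, here is a minimal sketch of deriving a mirror key with volatile parameters dropped. It is an illustration under assumptions, not how the extension actually does it: `mirror_key`, the `VOLATILE_PARAMS` set, and the example URLs are hypothetical, and dropping a parameter is only safe when the response does not depend on it.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: "t" is a cache-busting parameter on the inspected CDN.
VOLATILE_PARAMS = {"t"}

def mirror_key(url, volatile=VOLATILE_PARAMS):
    """Return the URL with volatile query parameters and fragment removed."""
    parts = urlsplit(url)
    kept = [(k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in volatile]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

# Hypothetical requests that differ only by the cache-busting value.
a = mirror_key("https://cdn.example.com/lib.js?v=1.2&t=1111")
b = mirror_key("https://cdn.example.com/lib.js?v=1.2&t=2222")
print(a == b)
```

Without this kind of normalization, each of the two requests would be stored as a separate entry that is never reused.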

@kurtextrem

Oh, I thought you got it already as you mentioned it: http://www.asp.net/ajax/cdn#Using_jQuery_from_the_CDN_21 (aspnetcdn)

@gorhill

Thanks for the link. It does indeed look pretty straightforward to add the Microsoft CDN: this detailed reference page dispels the worries I had, and from it I can craft a solid regexp to filter out the unimportant parts of the request URL.

@gorhill added a commit that referenced this issue: this addresses #270 (c81c859)

@sanilunlu

As stated above, there may be some more CDNs to add:

cdn.jsdelivr.net
  ^cdn\.jsdelivr\.net\/
yastatic.net
  ^yastatic\.net\/

Example libs:
https://yastatic.net/angularjs/1.2.23/angular.min.js
https://cdn.jsdelivr.net/underscorejs/1.6.0/underscore-min.js
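
A quick way to sanity-check the two proposed patterns against the example libs is sketched below. It assumes the patterns are applied to the URL with its scheme removed (they start with `^` followed by the host name); that assumption, and the helper names `strip_scheme` and `matching_cdn`, are mine and not taken from the extension.

```python
import re

# The two patterns proposed above, keyed by host name.
PATTERNS = {
    "cdn.jsdelivr.net": re.compile(r"^cdn\.jsdelivr\.net\/"),
    "yastatic.net": re.compile(r"^yastatic\.net\/"),
}

def strip_scheme(url):
    """Drop a leading 'scheme://' so that '^host/' patterns can match."""
    return re.sub(r"^[a-z][a-z0-9+.-]*://", "", url, flags=re.IGNORECASE)

def matching_cdn(url):
    """Return the name of the first pattern matching the URL, else None."""
    target = strip_scheme(url)
    for name, pattern in PATTERNS.items():
        if pattern.search(target):
            return name
    return None

for url in (
    "https://yastatic.net/angularjs/1.2.23/angular.min.js",
    "https://cdn.jsdelivr.net/underscorejs/1.6.0/underscore-min.js",
):
    print(url, "->", matching_cdn(url))
```

Matching the host is only the first step; whether the paths under it are immutable and shared across sites still needs to be checked per CDN.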

@sanilunlu

Adobe's font library (Typekit) is used by some sites:

use.typekit.net
    ^use\.typekit\.net\/