
Add other domains owned by admiral? #4

Open
KeenRivals opened this issue Aug 12, 2017 · 82 comments


@KeenRivals commented Aug 12, 2017

Many other domains were found that are owned by Admiral and point to the same IP as #1. There's a list at https://pgl.yoyo.org/adservers/admiral-domains.txt

@tofof (Contributor) commented Aug 12, 2017

That list is woefully incomplete. See my comment here for some analysis. Click the reverse-domain links I provide (e.g. ipv4info for functionalclam.com) and you can see how deep this rabbit hole goes, and that's just for a single IP.

Sample Admiral domains not in that list:
btez8.xyz
innocentwax.com
completecabbage.com
4jnzhl0d0.com
h78xb.pw

It's trivial to observe hundreds of Admiral domains; they probably number in the thousands.

@mvasilkov commented Aug 12, 2017

Let's kill the whole Admiral thing with fire!

@paulgb (Owner) commented Aug 12, 2017

I've merged #8, which adds more Admiral domains. If there are still some missing, I'm happy to add them as they are discovered.

@mirague commented Aug 12, 2017

I support adding all domains pointing to the same content; it's likely all of these domains would eventually have found their way into EasyList anyway, though that might be less likely now.

@JamyDev commented Aug 12, 2017

Maybe block that IP too in anticipation of more domains being added?

@tofof (Contributor) commented Aug 13, 2017

#8 is still not even close.

It has...

  • Some but not all - e.g. breezybath.com - of the domains currently on 104.155.48.223 (the same ip that unknowntray.com is served from)
  • Some but not all - e.g. axiomaticalley.com - of the domains currently at 146.148.6.205
  • Some but not all - e.g. btez8.xyz - of domains currently at 35.186.249.84

It's missing...

... I think that demonstrates my point. (Yes, there are many duplicates once you start putting all of these together).

If someone actually wants to make a serious attempt, which hasn't happened yet, just walk the related domains on a tool (like threatcrowd) that lets you do so easily.

It's trivial to verify these even while you're still learning the naming patterns, since they all serve up the same image. But you really do have to do the verification: bannersnack.com, for example, is NOT an Admiral domain, even though it was once hosted alongside them.

Some starting points that I haven't already exhausted above include...
tzwaw.pw
0D7DK.XYZ
pz37t.xyz
3jsbf5.xyz - beware, there's at least one domain (apstylebook.com) that'd be a false positive.
4jnzhl0d0.com
82o9v830.com
familiarfloor.com

The biggest problem is that they use Google/Amazon hosting, so you can't trivially blacklist everything that resolves into their IP space, and tools like ipv4info, threatcrowd, alienvault, and tcpiputils all have incomplete datasets. You really need multiple people using different toolsets walking the same space to root all of these out.

@lol768 commented Aug 13, 2017

@tofof Given the ones that I have seen seem to use valid SSL certs from Let's Encrypt, do you think crawling the CT logs is a viable way of checking for these?

@anon182739 commented Aug 14, 2017

What about doing it the other way around? As you said, it's trivial to verify a domain. What about checking all suspicious domains? If one is found, it's sent to some central server to be added to the list. Another way would be to just blacklist the IPs. This would have some false positives, but if it became a default filter for uBlock/Adblock it would have the effect of forcing them to clean up their IP ranges. But really, how often do you have a legitimate reason to load random .js from a .xyz domain?

@lol768
Let's Encrypt issues hundreds of thousands of certificates each day, but if you can get the first round of filtering right it's feasible; after that you can use DNS to filter out more. (A rough sketch follows this comment.)

@mvasilkov
I don't think that's feasible. Maybe by sending abuse notices to the registrars, since they care more about quantity than quality.
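
A minimal sketch of that CT-log idea, assuming the third-party certstream client library and its public websocket feed (both assumptions beyond this thread). Anything it prints is only a candidate that still needs the content checks discussed below, and note tofof's later caveat that a hard TLD filter would miss e.g. .us domains:

import certstream

WATCHED_TLDS = (".com", ".xyz", ".pw")  # TLDs suggested above; Admiral also uses others (e.g. .us)

def on_cert(message, context):
    # certstream delivers one message per Certificate Transparency log entry
    if message.get("message_type") != "certificate_update":
        return
    for name in message["data"]["leaf_cert"]["all_domains"]:
        if name.startswith("*."):
            name = name[2:]
        if name.endswith(WATCHED_TLDS):
            print(name)  # candidate only; still needs verification

certstream.listen_for_events(on_cert, url="wss://certstream.calidog.io/")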

@anon182739 commented Aug 14, 2017

Also, playing cat-and-mouse with domains is not really a problem. It costs at least $1 to register one, and much less to find and block it.

@paulgb (Owner) commented Aug 14, 2017

Actually, @lol768's suggestion sounds like a great idea for keeping track of new registrations from this company. It could be automated as a nightly script.

@paulgb (Owner) commented Aug 14, 2017

That's just a way of narrowing down the search space; we'd still verify that the domain behaved like the others.

I'd like to keep this to a pure domain blacklist, as opposed to running code on the client, for a few reasons: 1. it's more portable to existing blocking extensions, 2. it's more performant, and 3. I'm more comfortable with the legal defensibility (I am confident that a passive blacklist can never be illegal; I don't want to speculate about the legality of a more active client-side approach)

@anon182739 commented Aug 14, 2017

@zymase
You can just check the image dimensions, or the length of the body. This works for me:
import requests

def isAdmiralDomain(domain):
    try:
        response = requests.get('https://' + domain)
    except Exception:
        return False
    # Admiral landing pages observed in this thread return a 179-byte body.
    if len(response.text) == 179:
        return True
    else:
        return False

Otherwise, how on earth could such a list be put together except by a try-and-verify approach? It's just not possible any other way. We are facing networks as dirty as botnets.

Start off with the LE cert list. Filter away anything that doesn't end with .com, .xyz, or .pw. Issue DNS queries for everything. If the WHOIS isn't protected, remove it. If it isn't registered through the registrars they use, remove it. Then use Tor/proxies (it's kind of counterproductive for them to block IPs) for the final verification.
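
A hedged sketch of just the registrar step from that pipeline, using the third-party python-whois package (an assumption; any WHOIS client would do). The expected registrars (eNom for .com, Namecheap for .xyz/.pw) come from a later comment in this thread:

import whois  # python-whois package (assumed); any WHOIS client works

EXPECTED_REGISTRARS = ("enom", "namecheap")  # per a later comment in this thread

def plausible_registrar(domain):
    # Keep a candidate only if its registrar matches the ones Admiral is known to use.
    try:
        record = whois.whois(domain)
    except Exception:
        return False
    registrar = str(record.registrar or "").lower()
    return any(name in registrar for name in EXPECTED_REGISTRARS)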

Why not literally treat them as bots? I don't have any, but SMTP accounts are allegedly quite cheap. What if you just sent bulk abuse notices to the registrars and accused them of being C2 servers for some botnet? Registrars generally don't care whether the notices are valid; they only care about the quantity of notices they're getting.

@paulgb

That's just a way of narrowing down the search space; we'd still verify that the domain behaved like the others.
And this job can be automated easily: as long as a domain can be identified once, it can go on the list forever.

@anon182739 commented Aug 14, 2017

GitHub stripped away the formatting, so you need to add some newlines and indentation.
The images are identical right now.
CRC32: 8db019c1
MD5: 681e062bb33b9ba28f3427e7283c81a8
SHA1: 3fcf7e14e92043a00926d340d45778b618bc87a9
SHA2: 32afacb9285649aa4af43ea03e7cd9a522aa3e6d0554a2dabe308fac4531be5f
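
If those fingerprints stay current, verifying a downloaded landing-page image is a few lines. A minimal sketch (how you fetch the image bytes is up to you):

import hashlib
import zlib

KNOWN = {
    "crc32": "8db019c1",
    "md5": "681e062bb33b9ba28f3427e7283c81a8",
    "sha256": "32afacb9285649aa4af43ea03e7cd9a522aa3e6d0554a2dabe308fac4531be5f",
}

def matches_admiral_image(image_bytes):
    # Compare a downloaded image against the fingerprints listed above.
    crc32 = format(zlib.crc32(image_bytes) & 0xFFFFFFFF, "08x")
    md5 = hashlib.md5(image_bytes).hexdigest()
    sha256 = hashlib.sha256(image_bytes).hexdigest()
    return (crc32, md5, sha256) == (KNOWN["crc32"], KNOWN["md5"], KNOWN["sha256"])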

@anon182739 commented Aug 14, 2017

Another identifying mark is the robots.txt, which is unlikely to change:
User-agent: *
Disallow: /

The 404 page seems nonstandard; it's plain text rather than HTML:
404 page not found
Content-Type: "text/plain; charset=utf-8"

The registrar is eNom for the .com domains and Namecheap for the .xyz and .pw domains.

They all use the same four nameservers. You should be able to enumerate from that; it's an uncommon combination.
NS-1212.AWSDNS-23.ORG
205.251.196.188
NS-1627.AWSDNS-11.CO.UK
205.251.198.91
NS-305.AWSDNS-38.COM
205.251.193.49
NS-697.AWSDNS-23.NET
205.251.194.185
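
A hedged sketch of that nameserver check, assuming the third-party dnspython package (resolve() is the dnspython 2.x call; older versions use query()). Treat a match as a strong hint, not proof:

import dns.resolver  # dnspython (assumed)

ADMIRAL_NS = {
    "ns-1212.awsdns-23.org.",
    "ns-1627.awsdns-11.co.uk.",
    "ns-305.awsdns-38.com.",
    "ns-697.awsdns-23.net.",
}

def has_admiral_nameservers(domain):
    # Heuristic: flag domains whose NS set exactly matches the four servers above.
    try:
        answer = dns.resolver.resolve(domain, "NS")
    except Exception:
        return False
    nameservers = {rr.target.to_text().lower() for rr in answer}
    return nameservers == ADMIRAL_NS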

@anon182739 commented Aug 14, 2017

@zymase
You want to block the domain names. You can get a new IP for almost nothing; getting a new domain name costs some money. The picture's name changes, but the site's structure is the same.

@anon182739 commented Aug 14, 2017

Also worth noting: you can query as many Admiral domains as you want through Tor. If they start blocking IPs, then their ads won't work anymore.

@anon182739 commented Aug 14, 2017

@zymase
I don't understand what you're trying to say. You can look at a domain and see if it has that picture. If it does, it also has the script we want to block.

The domains don't resolve to the picture. The domains resolve to Google IPs, which then serve that picture. The domains all point to different IPs.

The purpose is to block the script they use; the easiest way to do that is to block the Admiral domains so they can't serve the script.

@tofof (Contributor) commented Aug 14, 2017

@anon182739 wrote:

Filter away anything that doesn't end with .com, .xyz, or .pw.

I wouldn't use such a filter. First of all, Admiral has at least one .us domain that I've already prominently mentioned in this thread; it's literally the first domain I name as 'missing'. Second, there's no reason to think they won't expand to other TLDs. Nine months ago, no one had spotted any .xyz Admiral domains - I believe they were only using .com and .pw at that time.

@tofof (Contributor) commented Aug 14, 2017

@zymase
You've stated that you don't understand and are not a coder. It's acceptable to be interested, to use the emoji-response features, etc., but please don't clutter a single-issue thread with philosophical meanderings, well-wishings, tortuous analogies, and otherwise "laying the obvious," whatever that means.

To address your final point:
No, there is no reason to think that there must be a common resource, accessible to the public, that would identify all such domains.
If instead you mean that there must be a reason for these domains: yes, the reason is that Admiral owns them and happens, for now, to serve the same content from all of them. They could just as easily serve nothing but a 403 or a 204, or just blackhole connections.

The list's criteria are already stated: it is a list of domains whose owners have misused DMCA takedowns to attempt removal from other lists. This issue suggests that affiliated domains, owned by the same company and used for the same purpose, be included alongside the single domain named in a DMCA takedown thus far. The criterion for inclusion being proposed, then, is similarly obvious: Admiral-owned domains that appear to serve the same (lack of) content and presumably host the scripts used to serve advertising on affiliated websites.

Quite contrary to your assertion, such a list, if built, will be built exactly the same way all other advertising-blocking lists are built: on the finds of participants reporting "I found another one". The starting points are found when an Admiral-protected website (e.g. thewindowsclub.com) uses scripts hosted on an Admiral server to display its contents.

I have already outlined the best possible way to find more Admiral domains given a starting domain: by using tools meant for that, i.e. tools that identify spatially- and temporally-related domains.

@paulgb (Owner) commented Aug 14, 2017

@tofof Thanks for the analysis you've been doing. I have only been skimming this conversation while working on providing all the blacklist formats people want, but now that that's done I want to take a real stab at automating some of this.
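
For anyone unfamiliar with those formats, a single blocked domain looks roughly like this in the common list syntaxes (illustrative lines using functionalclam.com from earlier in the thread, not excerpts from the generated lists):

hosts file:                    0.0.0.0 functionalclam.com
Adblock Plus / uBlock Origin:  ||functionalclam.com^
dnsmasq:                       address=/functionalclam.com/0.0.0.0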

@anon182739 commented Aug 14, 2017

@zymase

We're not going to quest the whole web, domain after domain to see the ones which point to that picture, right?

You can narrow it down enough so you don't need to check the entire web.
@tofof
What's wrong with scraping DNS/cert lists? They could easily make sure that each domain sits on its own IP so the domains don't taint each other in reverse lookups, but it's non-trivial for them to make the domains harder to verify.
@paulgb
I'm already working on it; I've managed to hack together a Python script that does the job. Should I post it here, or is 'security by obscurity' better?

It turns out they only have 159 domains, apparently; all the "different starting points" were somehow interlinked.

https://pastebin.com/6mPnXBiR

@paulgb (Owner) commented Aug 14, 2017

Great stuff! Is this from the CT log, or just from grabbing the IPs already found?

Let's keep the script apart from this, but if you're not a paying GitHub user I can create a private repo and add you to it so we can collaborate.

paulgb added a commit that referenced this issue Aug 14, 2017

@anon182739 commented Aug 14, 2017

This naming scheme is interesting.
If you visit any Admiral domain (for example http://abandonedclover.com or http://abruptroad.com) you get the same image:
http://abandonedclover.com/6f044848f5e9030b6fd409a7e153defd6d8c4e58fb082a44da549ed3e421f9755aedb08132895be1e0d578e7
But each time you refresh the page, you get a new URL:
http://abandonedclover.com/f1e5b5d86bcceb851312e5cc5f7bce26bb10ab951c152cb63a5068954caaa20d196ab3a042d306f561b71c22
You can use the same URL multiple times, and across different domains. I really wonder how this works. Is it a signature of some sort?
@paulgb
This is from scraping one domain (hfc195b.com) and recursively querying the results from threatcrowd (a rough sketch of that kind of walk follows below). It seems to cover all of the "starting points" listed, though, except for 0d7dk.xyz, pz37t.xyz, and 3jsbf5.xyz, which weren't reachable. So this should be all of their domains.
Sure, or I can send it in a PM if you want. It's not of much use now though.
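
A rough sketch of that kind of recursive walk, assuming ThreatCrowd's public v2 JSON endpoints and field names (an assumption; check the current API docs) and respecting a roughly one-request-per-10-seconds limit. Every domain it returns still needs the landing-page check before being listed:

import time
import requests

TC = "https://www.threatcrowd.org/searchApi/v2"

def walk(seed_domain, max_domains=500):
    # Breadth-first walk: domain -> its historical IPs -> other domains on those IPs.
    seen_domains, seen_ips = {seed_domain}, set()
    queue = [seed_domain]
    while queue and len(seen_domains) < max_domains:
        domain = queue.pop(0)
        report = requests.get(f"{TC}/domain/report/", params={"domain": domain}).json()
        time.sleep(10)  # be polite; ThreatCrowd rate-limits aggressively
        for res in report.get("resolutions", []):
            ip = res.get("ip_address")
            if not ip or ip in seen_ips:
                continue
            seen_ips.add(ip)
            ip_report = requests.get(f"{TC}/ip/report/", params={"ip": ip}).json()
            time.sleep(10)
            for entry in ip_report.get("resolutions", []):
                candidate = (entry.get("domain") or "").strip().lower()
                if candidate and candidate not in seen_domains:
                    seen_domains.add(candidate)
                    queue.append(candidate)
    return seen_domains

# e.g. walk("hfc195b.com") -- the seed domain mentioned above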

@paulgb (Owner) commented Aug 15, 2017

Sure, a PM works for me. My email is paulgb@gmail.com

Cheers.

@anon182739 commented Aug 15, 2017

Oh, you can't send GitHub PMs anymore, apparently.
Gmail filters anonymous e-mail addresses since they're used for spam. If you already have a paid GitHub account, it's probably easier to do it that way.

@paulgb (Owner) commented Aug 15, 2017

Ok, I created a repo.

@anon182739 commented Aug 15, 2017

I can't see anything. Where do I get the notice?

@anon182739 commented Aug 15, 2017

It also has other interesting stuff and gives away some information about how it's structured internally. owlsr.us is used as a gateway; it might have been registered specifically for that purpose.

Wordpress Plugin
Easy install. No JS code required. Proxy requests to minimize risk of adblocker intervention.

Custom Integration
Proxy Requests through your own domain to minimize risk of adblocker intervention.

Also, there are lots of API endpoints we can use to identify domains.
http://staging.owlsr.us/js?p=asd
http://owlsr.us/record

@tofof (Contributor) commented Aug 15, 2017

@anon182739: thanks again for your script; it's very helpful. I've been playing with it for a bit now, and I notice a couple of things.

Obvious statement: unfortunately, your script is limited by the threatcrowd data.
Less obvious: there are Admiral domains that threatcrowd completely lacks.

For example:
https://otx.alienvault.com/indicator/hostname/2znp09oa.com shows profitrumour.com among the related domains, which is an Admiral domain.

But unfortunately, threatcrowd's picture of that domain is a big zilch.
Edit: I (foolishly) retyped the URL and missed the 'u', but even with the correct spelling it's still a big zilch.

@anon182739 commented Aug 15, 2017

Yes, it's a shame. Are there any services that list every domain using a given nameserver? They all share the same four nameservers.

@anon182739 commented Aug 15, 2017

https://otx.alienvault.com/otxapi/indicator/hostname/whois/2znp09oa.com
Here, just parse this JSON and use the results as starting domains for the script; it doesn't matter that they're not in threatcrowd (one domain per line, no http:// prefix).
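
A hedged sketch of pulling those starting domains out of that OTX response in Python. The only assumption about the (undocumented) JSON layout is that related entries carry a "domain" field, which is also what the shell one-liner in a later comment relies on:

import requests

def otx_related_domains(hostname):
    # Collect every "domain" value anywhere in the OTX whois JSON for `hostname`.
    url = f"https://otx.alienvault.com/otxapi/indicator/hostname/whois/{hostname}"
    data = requests.get(url).json()
    found = set()

    def collect(node):
        if isinstance(node, dict):
            value = node.get("domain")
            if isinstance(value, str):
                found.add(value.strip().lower())
            for child in node.values():
                collect(child)
        elif isinstance(node, list):
            for child in node:
                collect(child)

    collect(data)
    return found

# e.g. otx_related_domains("2znp09oa.com") -> seed list for the walker script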

@tofof (Contributor) commented Aug 15, 2017

Yeah, I was just looking at that myself; it looks like it's possible to walk that API's space as well. I'll see about expanding the script if you don't beat me to it; I'm not going to get to it before tomorrow evening at the earliest.

@anon182739 commented Aug 15, 2017

@tofof No need for walking, since you can do queries on the DNS nameservers.
# e.g., applied to the OTX JSON linked above:
curl -s 'https://otx.alienvault.com/otxapi/indicator/hostname/whois/2znp09oa.com' \
  | grep -P '"domain": "[^"]+' --only-matching | cut -c 12-

@anon182739 commented Aug 15, 2017

I strongly believe these are the only ones active in the wild:
https://pastebin.com/Bu2gFH9J
I ran a script that created 600 accounts and got the script URLs; these are the only ones in the list. The least common one (jadeitite.com) is present 8 times, the most common one (82o9v830.com) is present 20 times.

@paulgb (Owner) commented Aug 15, 2017

One thing on the topic of unminifying code, etc.: in order for this project to be bulletproof should it end up in court, I can't include any links obtained that way. Since a judge is unlikely to be technically adept enough to understand nuance in this area, I'd rather keep a good distance from anything that could be made to sound like reverse engineering to someone non-technical.

The network-based approaches are defensible though, so as long as we stick to that route we'll be fine.

@anon182739 commented Aug 15, 2017

Clean-room reverse engineering is legal. So, by analogy, you should be able to share domains you got from observing the script's behavior, but not ones you got by reverse engineering it.

If you're worried about the legal aspects, keep in mind that the CFAA is very broad. Bulk registering accounts could be illegal in theory, depending on the definition of "authorization".
(a) Whoever—
(2) intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains—
(C) information from any protected computer;

@paulgb (Owner) commented Aug 15, 2017

It's not just the CFAA that I want to steer clear of; it's also DMCA 1201, which was the basis of Admiral's takedown against EasyList.

In general I guess my stance is: given that there are many ways of obtaining the list, we should do it in the way that is least likely to be misunderstood by a judge.

@anon182739 commented Aug 15, 2017

Are you worried about takedowns or legal responsibility?
If the former, use a git provider based outside of the US (Bitbucket is Australian, Launchpad is British, OSDN is Japanese, OW2 Consortium is French, and a self-hosted GitLab onion is in the jurisdiction of whatever anonymous proxy fronts it).
If the latter, use a throwaway account and Tor.

@paulgb (Owner) commented Aug 15, 2017

But that misses the goal of the project entirely! I'm not trying to evade any laws; quite the opposite. I'm trying to show that Admiral's interpretation of the law is incorrect, that it wouldn't hold up in court, and that they know that.

@anon182739 commented Aug 15, 2017

If you're not violating any laws, avoiding takedowns is just a convenience thing. DMCA takedown notices are for copyright infringement; circumvention of technical protections as defined in 17 USC § 1201 isn't copyright infringement.

@tofof (Contributor) commented Aug 15, 2017

I agree with @paulgb that Admiral's interpretation of the DMCA - that an item in a list constitutes a 'copyright circumvention mechanism' - is incorrect, and should be challenged. I further agree that the DMCA takedown process is not an appropriate remedy should a circumvention mechanism actually exist - it's instead for direct infringement. I understood that to be the primary reason for the creation of this project: to invite such a takedown, challenge it, and disprove this legal theory.

Expanding the list to include related Admiral domains that appear to function identically to the DMCA-takedown'd one seems in line with that goal. Particularly for a list created by crawling publicly accessible networks and observing the content that Admiral willingly serves up (its landing page image).

@anon182739 seems focused on the feasibility of maintaining such a list in a hostile environment; throwaways, tor, non-US providers all work toward that goal. That's a potentially valuable process too, but quite in contradiction with the stated goals of this project.

@paulgb (Owner) commented Aug 15, 2017

I agree with that interpretation, @anon182739, but Admiral's stance is that it does, and that's how we got here.

@tofof exactly

@tofof (Contributor) commented Aug 15, 2017

On the topic of reversing minified code: as far as I understand, none of that has been done in generating any of the list to this point. The list and addendum that @anon182739 has linked to are built from walking the recorded observations on threatcrowd: starting from a known Admiral domain, examining what IPs it was hosted on and what other domains were hosted on those IPs, recursing, and then examining each domain's current public home page to see if it identifies itself as part of the Admiral protection scheme.

Note that the content that Admiral domains serve (attached below) explicitly covers such a use: "if you arrived here on accident and are not looking for information about this domain, feel free to hit back in your browser or close the tab." In other words, when we purposefully visit the domain and are exactly looking for information about it, that image is meant to be our answer. It spells out what type of content ("Javascript, HTML, CSS, video and images") is served and for what nominal purpose ("to control access to copyrighted content ... and understand how visitors are accessing their copyrighted content") it does so. Whether this is an honest summary of Admiral's use of these domains is another matter.

My mentions of reversing minified code were with respect to observing the behavior of the Admiral scripts themselves, and in a historical context where I was examining those scripts in support of an entirely different project (Reek's Anti-Adblock), which I linked to in my first post here. Also note that minification is not for the purpose of obscuring or defeating an observer; if it were, you wouldn't leave function names like "hasDisabledAdBlocker" intact. Minification simply reduces the size of the file that needs to be transmitted, so that content loads faster.

Public-facing content Admiral willingly serves to all visitors of its domains:

@tofof (Contributor) commented Aug 15, 2017

[moved to its own post for clarity and importance]

At any rate, no knowledge or use of Admiral's scripts was involved in creating the list and addendum thus far. In fact, I would say that the algorithm that produced the list and addendum thus far is superior in that respect to the unknown provenance of the items from pull request #8. I would actually recommend that @paulgb replace that list with the ones derived here, and will shortly create a pull request to do so that he can merge if he agrees.

@anon182739 commented Aug 15, 2017

@tofof
No, the addendum (https://pastebin.com/CYvL1GyJ and https://pastebin.com/Bu2gFH9J) isn't from threatcrowd; it's from using a script to register accounts and get the script domain they use.

I ran a script that created 600 accounts and got the script URLs; these are the only ones in the list. The least common one (jadeitite.com) is present 8 times, the most common one (82o9v830.com) is present 20 times.

@anon182739 commented Aug 15, 2017

@anon182739 seems focused on the feasibility of maintaining such a list in a hostile environment; throwaways, tor, non-US providers all work toward that goal. That's a potentially valuable process too, but quite in contradiction with the stated goals of this project.

That's true. I was linked here from elsewhere and didn't read the project goals. I'm mostly interested in the technical side of things and in whatever actions end up producing a more complete list.

@anon182739 commented Aug 16, 2017

Is there any other place where Admiral is being discussed? This seems to be the only active GitHub issue about it; are there any active threads elsewhere?
https://github.com/anon182739/admiraljs - 206 Admiral JS files from different domains, might be useful. They're all identical save for the domain name referenced, so a regex should block them.
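
A small sketch of checking that "identical save for the domain name" claim against two locally saved copies of the script (the file names here are hypothetical; the domains are ones mentioned earlier in this thread):

import hashlib
import re

def normalized_digest(path, domain):
    # Mask the file's own domain, then hash, so payloads can be compared across domains.
    text = open(path, encoding="utf-8").read()
    masked = re.sub(re.escape(domain), "MASKED_DOMAIN", text)
    return hashlib.sha256(masked.encode("utf-8")).hexdigest()

# a = normalized_digest("functionalclam.com.js", "functionalclam.com")
# b = normalized_digest("abandonedclover.com.js", "abandonedclover.com")
# a == b  -> True if the two payloads differ only in the embedded domain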

@unicorntaco commented Sep 8, 2018

It's trivial to observe hundreds of Admiral domains; they probably number in the thousands.

If someone actually wants to make a serious attempt,

Couldn't this higher-level principle be applied?

This exceedingly eccentric blog suggests corralling the evil via ASN blocking. Guilt by association seems like a grand idea, but implementing it is not in my wheelhouse.

Nano Adblocker, a fork of uBlock Origin, seems to have some more userscript-type powers.

@tofof @paulgb @anon182739
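
For what it's worth, mapping an observed IP to its origin ASN is straightforward; a hedged sketch using the third-party ipwhois package (an assumption). As noted earlier in the thread, though, these domains sit in Google/Amazon space, so blocking whole ASNs would over-block badly:

from ipwhois import IPWhois  # ipwhois package (assumed)

def asn_for(ip):
    # RDAP lookup; returns the autonomous system number and its description.
    result = IPWhois(ip).lookup_rdap(depth=0)
    return result.get("asn"), result.get("asn_description")

# e.g. asn_for("104.155.48.223")  -- an Admiral-hosting IP cited earlier in this thread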

@TNW9imKLC3fv commented Feb 23, 2019

This was released on 2018-11-01 and updated the day after: https://github.com/jkrejcha/AdmiraList - but it hasn't been updated since then.
This is currently being updated: jerryn70/GoodbyeAds#6 - though again, it's another solution that relies on a manually updated list of domains, which the spam companies seem to be deliberately bypassing by registering a new domain every day. The ASN blocking might be worth looking into, or something that automatically queries the "private" WHOIS databases, along the lines of Sci-Hub.

(Using NoScript, I saw Issuu using a creepy-sounding domain, "shallowsmile.com", and found this thread by doing a DuckDuckGo search for that domain in quotation marks to force an exact match.)

@anon182739 commented Apr 10, 2019

There is no need to block any ASNs. It is sufficient to register a few hundred accounts with fake information and see which domains are being offered.
